Commit Graph

22 Commits

Author SHA1 Message Date
Matthew Auld
83dcf232cc drm/xe: prevent potential UAF in pf_provision_vf_ggtt()
The node pointer can point to already-freed memory if we hit the path
with an already allocated node. We later dereference that pointer with:

	xe_gt_assert(gt, !xe_ggtt_node_allocated(node));

which is a potential UAF. Fix this by not stashing the ptr for node.
Since it is likely a bad idea to leave config->ggtt_region pointing
to a stale ptr, also set that to NULL by calling
pf_release_vf_config_ggtt() instead of pf_release_ggtt().

Fixes: 34e804220f ("drm/xe: Make xe_ggtt_node struct independent")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240828104341.180111-2-matthew.auld@intel.com
(cherry picked from commit 89076b5a8b)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-12 12:29:30 -05:00
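
The stale-pointer pattern described in this commit can be illustrated with a small user-space sketch. This is not the driver code: `struct node`, `struct config`, `release_region()` and `provision()` are hypothetical stand-ins, and plain `malloc`/`free` replace the GGTT node helpers. The sketch only shows why reading the owner's pointer after the release is safe while a local copy taken before the release would not be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the real xe structs. */
struct node { bool allocated; };
struct config { struct node *region; };

/* Free the region and clear the owner's pointer so nothing stale remains. */
static void release_region(struct config *cfg)
{
	free(cfg->region);
	cfg->region = NULL;
}

/* Re-provision: release any previous region, then allocate a fresh one. */
static int provision(struct config *cfg)
{
	/* Deliberately no local copy of cfg->region taken before the release. */
	if (cfg->region)
		release_region(cfg);

	/* Safe: this reads the owner's pointer, which is NULL after release. */
	assert(cfg->region == NULL);

	cfg->region = calloc(1, sizeof(*cfg->region));
	if (!cfg->region)
		return -1;
	cfg->region->allocated = true;
	return 0;
}
```

Calling `provision()` twice on the same `struct config` exercises the release path; a version that cached `cfg->region` in a local variable before the release and then inspected that local would read freed memory.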
Michal Wajdeczko
da6ec74339 drm/xe/pf: Reset thresholds when releasing a VF config
As part of the VF config release, we should reset all parameters,
including thresholds, to always start with the clean VF config.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240830132100.1704-3-michal.wajdeczko@intel.com
2024-09-02 20:51:03 +02:00
Michal Wajdeczko
a1498ab229 drm/xe/pf: Add thresholds to the VF KLV config
We push threshold KLVs to the GuC immediately during threshold
provisioning, but those configs are lost during a GT reset. Include
the threshold KLVs when encoding the full VF config buffer to make
sure the GuC receives all of the config KLVs.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240830132100.1704-2-michal.wajdeczko@intel.com
2024-09-02 20:51:02 +02:00
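
A rough sketch of what "including threshold KLVs in the full config buffer" could look like follows. Everything here is an assumption for illustration: the header layout (key in the upper 16 bits, value length in dwords in the lower 16 bits), the key numbers, the number of thresholds, and the names `klv_header()`, `encode_klv32()` and `encode_config()` are hypothetical and are not taken from the driver or the GuC ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key values and threshold count, for illustration only. */
#define KEY_EXEC_QUANTUM   0x0001
#define KEY_THRESHOLD_BASE 0x0100
#define NUM_THRESHOLDS     2

struct vf_config {
	uint32_t exec_quantum;
	uint32_t thresholds[NUM_THRESHOLDS];
};

/* Assumed header layout: key in bits 31:16, value length (dwords) in 15:0. */
static uint32_t klv_header(uint16_t key, uint16_t len_dw)
{
	return ((uint32_t)key << 16) | len_dw;
}

/* Append one KLV with a 32-bit value; returns the dwords written (2). */
static size_t encode_klv32(uint32_t *buf, uint16_t key, uint32_t value)
{
	buf[0] = klv_header(key, 1);
	buf[1] = value;
	return 2;
}

/* Encode the whole config, thresholds included, so a replay loses nothing. */
static size_t encode_config(const struct vf_config *cfg, uint32_t *buf,
			    size_t cap_dw)
{
	size_t n = 0;

	assert(cap_dw >= 2 * (1 + NUM_THRESHOLDS));
	n += encode_klv32(buf + n, KEY_EXEC_QUANTUM, cfg->exec_quantum);
	for (size_t i = 0; i < NUM_THRESHOLDS; i++)
		n += encode_klv32(buf + n, (uint16_t)(KEY_THRESHOLD_BASE + i),
				  cfg->thresholds[i]);
	return n;
}
```

The design point is that the buffer replayed after a reset is built from the stored config alone, so any parameter that is only pushed at provisioning time and never encoded here would be lost.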
Michal Wajdeczko
16ba2b28df drm/xe/pf: Add function to sanitize VF resources
On current platforms it is the PF driver's responsibility to clear
some of the VF's resources during a VF FLR. Add a simple function
that clears the configured VF resources (GGTT, LMEM). We will
start using this function soon.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240828210809.1528-2-michal.wajdeczko@intel.com
2024-08-30 10:51:06 +02:00
Nathan Chancellor
ff9c674d11 drm/xe: Fix total initialization in xe_ggtt_print_holes()
Clang warns (or errors with CONFIG_DRM_WERROR or CONFIG_WERROR):

  drivers/gpu/drm/xe/xe_ggtt.c:810:3: error: variable 'total' is uninitialized when used here [-Werror,-Wuninitialized]
    810 |                 total += hole_size;
        |                 ^~~~~
  drivers/gpu/drm/xe/xe_ggtt.c:798:11: note: initialize the variable 'total' to silence this warning
    798 |         u64 total;
        |                  ^
        |                   = 0
  1 error generated.

Move the zero initialization of total from
xe_gt_sriov_pf_config_print_available_ggtt() to xe_ggtt_print_holes() to
resolve the warning.

Fixes: 136367290e ("drm/xe: Introduce xe_ggtt_print_holes")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240823-drm-xe-fix-total-in-xe_ggtt_print_holes-v1-1-12b02d079327@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-08-24 06:11:26 -07:00
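
The fix above moves the zero initialization next to the accumulation. A minimal user-space sketch of that shape, assuming a hypothetical `struct hole` with an exclusive `end` and a hypothetical helper `total_hole_size()` (neither is the driver's actual interface):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hole description: [start, end), end exclusive. */
struct hole {
	uint64_t start;
	uint64_t end;
};

/* Sum the sizes of all holes; the accumulator is initialized here,
 * in the function that uses "+=", not in its caller. */
static uint64_t total_hole_size(const struct hole *holes, size_t count)
{
	uint64_t total = 0;

	for (size_t i = 0; i < count; i++) {
		assert(holes[i].end >= holes[i].start);
		total += holes[i].end - holes[i].start;
	}
	return total;
}
```

Keeping the initializer in the same scope as the `+=` is what lets the compiler prove the variable is defined before use.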
Rodrigo Vivi
34e804220f drm/xe: Make xe_ggtt_node struct independent
In some rare cases, the drm_mm node cannot be removed synchronously
due to runtime PM conditions. In this situation, the node removal will
be delegated to a workqueue that will be able to wake up the device
before removing the node.

However, in this situation, the lifetime of the xe_ggtt_node cannot
be restricted to the lifetime of the parent object. So, this patch
introduces the infrastructure so the xe_ggtt_node struct can be
allocated in advance and freed when needed.

Having the ggtt backpointer also ensures that the init function
is always called before any attempt to insert or reserve the node
in the GGTT.

v2: s/xe_ggtt_node_force_fini/xe_ggtt_node_fini and use it
    internally (Brost)
v3: - Use GFP_NOFS for node allocation (CI)
    - Avoid ggtt argument, now that we have it inside the node (Lucas)
    - Fix some missed fini cases (CI)
v4: - Fix SRIOV critical case where config->ggtt_region was
      lost (Michal)
    - Avoid ggtt argument also on removal (missed case on v3) (Michal)
    - Remove useless checks (Michal)
    - Return 0 instead of negative errno on a u32 addr. (Michal)
    - s/xe_ggtt_assign/xe_ggtt_node_assign for coherence, while we
      are touching it (Michal)
v5: - Fix VFs' ggtt_balloon

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-11-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:45 -04:00
Rodrigo Vivi
136367290e drm/xe: Introduce xe_ggtt_print_holes
Introduce a new xe_ggtt_print_holes helper that serves the SR-IOV
need and completes the goal of limiting drm_mm access to xe_ggtt.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-9-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:45 -04:00
Rodrigo Vivi
1144e0dff5 drm/xe: Introduce xe_ggtt_largest_hole
Introduce a new xe_ggtt_largest_hole helper that serves the SR-IOV
need and continues toward the goal of limiting drm_mm access to xe_ggtt.

v2: Fix a typo (Michal)

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-8-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:45 -04:00
Rodrigo Vivi
8b5ccc9743 drm/xe: Limit drm_mm_node_allocated access to xe_ggtt_node
Continue with the encapsulation of drm_mm_node inside xe_ggtt.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-7-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:44 -04:00
Rodrigo Vivi
0567f18e07 drm/xe: Rename xe_ggtt_node related functions
Bring some consistency and prepare for more xe_ggtt_node related
functions to be introduced.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-6-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:44 -04:00
Rodrigo Vivi
6062ea9398 drm/xe: Encapsulate drm_mm_node inside xe_ggtt_node
The xe_ggtt component uses drm_mm to manage the GGTT.
The drm_mm_node is just a node inside drm_mm, but in Xe we use it
only in the GGTT context. So, this patch encapsulates the drm_mm_node
in a new xe_ggtt struct.

This is the first step towards limiting all drm_mm access to
xe_ggtt. The ultimate goal is to have better control of
node insertion and removal, so the removal can be delegated
to a delayed workqueue.

v2: Fix includes and typos (Michal and Brost)

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-5-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-22 14:00:44 -04:00
Michal Wajdeczko
5bdacb0907 drm/xe/pf: Fix VF config validation on multi-GT platforms
When validating the VF config on the media GT, we may wrongly report
that the VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (both GGTT and LMEM are
tile-level resources, not GT-level ones).

This causes VF auto-provisioning to be skipped on the media GT and,
as a result, prevents the VF from successfully initializing that GT.

Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.

Fixes: 234670cea9 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
2024-08-09 10:37:52 +02:00
Michal Wajdeczko
25ec7e809c drm/xe: Add NEEDS_2M BO flag
In addition to the NEEDS_64K BO flag, add a similar one to force 2 MiB
alignment of the buffer objects. Explicitly use this flag during
VF LMEM provisioning, as the LMTT uses 2 MiB pages and one day we may
drop the requirement of allocating pinned objects as contiguous.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715180538.1418-3-michal.wajdeczko@intel.com
2024-07-22 12:53:06 +02:00
Michal Wajdeczko
4c3fe5eae4 drm/xe/pf: Limit fair VF LMEM provisioning
Due to the current design of the BO and VRAM manager, any object
with the XE_BO_FLAG_PINNED flag, which the PF driver uses during VF
LMEM provisioning, is created with the TTM_PL_FLAG_CONTIGUOUS
flag, which may cause VRAM fragmentation that prevents subsequent
allocations of larger objects, like fair VF LMEM provisioning.

To avoid such failures, round down the fair VF LMEM provisioning size
to a power of two, to compensate for what xe_ttm_vram_mgr is
doing to achieve contiguous allocations.

Fixes: ac6598aed1 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-12 13:45:56 -07:00
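
The rounding step can be sketched in portable C as below. The helper name `round_down_pow2()` is hypothetical and written out by hand for illustration; it is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that is <= n; n must be non-zero. */
static uint64_t round_down_pow2(uint64_t n)
{
	uint64_t p = 1;

	assert(n != 0);
	/* Comparing against n / 2 avoids overflow when doubling p. */
	while (p <= n / 2)
		p *= 2;
	return p;
}
```

For example, if a fair share works out to 3 GiB per VF, rounding down gives 2 GiB, a size that a contiguous allocator working in power-of-two blocks can satisfy without needing a larger block than the share itself.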
Michal Wajdeczko
411220808c drm/xe/pf: Restart VFs provisioning after GT reset
Any prior configurations pushed to the GuC are lost when the GT
is reset. Push again all non-empty VF configurations to the GuC
as part of the GuC reset procedure.

This will also help restore early manual provisioning, when the
PF was in the meantime suspended and then resumed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-3-michal.wajdeczko@intel.com
2024-07-01 19:43:52 +02:00
Michal Wajdeczko
234670cea9 drm/xe/pf: Skip fair VFs provisioning if already provisioned
Our debugfs allows viewing and changing VFs' provisioning configs.

If we attempt to experiment with VFs provisioning before enabling
them, this early config will affect fair provisioning calculations,
and will also be overwritten, which is undesirable behavior.

To improve this, check if the VFs configs are empty (unprovisioned)
before starting the fair provisioning procedure.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-2-michal.wajdeczko@intel.com
2024-07-01 19:43:50 +02:00
Michal Wajdeczko
b321cb83a3 drm/xe/pf: Assert LMEM provisioning is done only on DGFX
The Local Memory (aka VRAM) is only available on DGFX platforms.
We shouldn't attempt to provision VFs with LMEM or attempt to
update the LMTT on non-DGFX platforms. Add missing asserts that
would enforce that and fix release code that could crash on iGFX
due to uninitialized LMTT.

Fixes: c063cce7df ("drm/xe/pf: Update the LMTT when freeing VF GT config")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607153155.1592-1-michal.wajdeczko@intel.com
2024-06-10 12:14:23 +02:00
Michal Wajdeczko
c063cce7df drm/xe/pf: Update the LMTT when freeing VF GT config
The LMTT must be updated whenever we change the VF LMEM configuration.
We missed that step when freeing the whole VF GT config, which could
result in stale PTE in LMTT or LMTT PT object leaks. Fix that.

Fixes: ac6598aed1 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240527115408.1064-1-michal.wajdeczko@intel.com
2024-05-31 14:44:09 +02:00
Michal Wajdeczko
629df234bf drm/xe/pf: Introduce functions to configure VF thresholds
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
Add functions to allow configuration of these thresholds per VF.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-5-michal.wajdeczko@intel.com
2024-05-16 18:04:42 +02:00
Michal Wajdeczko
49f853c78e drm/xe/pf: Clamp maximum execution quantum to 100s
GuC is silently clamping values of the execution quantum and
preemption timeout KLVs to 100s. Perform explicit clamping on the
driver side as later there is no way to read back values used by
the firmware and we shouldn't mislead the user about actual values
being used when we print them in dmesg or debugfs.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419123543.270-3-michal.wajdeczko@intel.com
2024-04-24 15:32:26 +02:00
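
A minimal sketch of driver-side clamping, assuming the execution quantum is expressed in milliseconds so that 100 s is 100000. The constant `MAX_EXEC_QUANTUM_MS`, the struct `vf_sched` and the helper `set_exec_quantum()` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limit: 100 s expressed in milliseconds (hypothetical name). */
#define MAX_EXEC_QUANTUM_MS 100000u

/* Hypothetical per-VF scheduling state kept by the driver. */
struct vf_sched {
	uint32_t exec_quantum_ms;
};

/* Store the clamped value so later reports match what firmware will use. */
static uint32_t set_exec_quantum(struct vf_sched *s, uint32_t requested_ms)
{
	s->exec_quantum_ms = requested_ms > MAX_EXEC_QUANTUM_MS ?
			     MAX_EXEC_QUANTUM_MS : requested_ms;
	assert(s->exec_quantum_ms <= MAX_EXEC_QUANTUM_MS);
	return s->exec_quantum_ms;
}
```

Because the clamped value is what gets stored, anything printed later from the stored state reflects the effective setting rather than the original request.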
Michal Wajdeczko
d3b80dc7aa drm/xe/pf: Fix xe_gt_sriov_pf_config_print_available_ggtt()
This function uses the internal helper pf_get_spare_ggtt(), which
expects the PF's master mutex to be locked. Fix that.

Fixes: ac6598aed1 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419141000.314-1-michal.wajdeczko@intel.com
2024-04-22 19:50:47 +02:00
Michal Wajdeczko
ac6598aed1 drm/xe/pf: Add support to configure SR-IOV VFs
To run correctly, each Virtual Function must be provisioned with
some chunk of shared hardware or firmware resources (like GGTT,
device memory, GuC doorbell IDs, GuC context IDs) and scheduling
parameters (execution quantum or preemption timeout).

All resources assigned to VFs must be excluded from the PF driver
use and may require some additional preparation steps (like setup
of the LMTT or update of the GGTT PTE). Those provisioning details
must be then sent to the GuC firmware as most of those details
will be shared later with the VF drivers during their boot.

Add basic functions to provision VFs with all hardware resources
or scheduling parameters. We will use them shortly in upcoming
patches either in manual provisioning over debugfs, exposed to the
advanced users, or automatic provisioning done by PF driver during
VFs enabling.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415173937.1287-7-michal.wajdeczko@intel.com
2024-04-16 12:37:36 +02:00