bo->size is redundant because the base GEM object already has a size
field with the same value. Drop bo->size and use the base GEM object’s
size instead. While at it, introduce xe_bo_size() to abstract the BO
size.
v2:
- Fix typo in kernel doc (Ashutosh)
- Fix kunit (CI)
- Fix line wrap (Checkpatch)
v3:
- Fix sriov build (CI)
v4:
- Fix display build (CI)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://lore.kernel.org/r/20250625144128.2827577-1-matthew.brost@intel.com
On old Intel platforms, the size of the GSM (i.e., the stolen memory
that holds the GGTT page table entries) could vary, so the driver needed
to read the actual size from the PCI config space. However from Xe_HP
onward, the GSM is now always guaranteed to be exactly 8MB (which
translates to a 4GB GGTT address space); this is always true regardless
of what the platform's much larger PPGTT address space is.
The bspec doesn't document the PCI config space as being a valid way to
query the size of the GSM after Xe_LP platforms, although so far it
still seems to be giving us proper values for Xe_HP, Xe2, and Xe3.
However we suspect that the config space will stop providing correct
values on some upcoming platforms, so we should stop relying on it.
Instead just use the hardcoded 8MB value as documented elsewhere in the
bspec.
Bspec: 49636, 67090, 50589
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20250605225352.2333981-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
We have only one GGTT for all IOV functions, with each VF having assigned
a range of addresses for its use. After migration, a VF can receive a
different range of addresses than it had initially.
This implements shifting GGTT addresses within drm_mm nodes, so that
VMAs stay valid after migration. This will make the driver use new
addresses when accessing GGTT from the moment the shifting ends.
By taking the ggtt->lock for the period of VMA fixups, this change
also adds constraint on that mutex. Any locks used during the recovery
cannot ever wait for hardware response - because after migration,
the hardware will not do anything until fixups are finished.
v2: Moved some functs to xe_ggtt.c; moved shift computation to just
after querying; improved documentation; switched some warns to asserts;
skipping fixups when GGTT shift eq 0; iterating through tiles (Michal)
v3: Updated kerneldocs, removed unused funct, properly allocate
balloning nodes if non existent
v4: Re-used ballooning functions from VF init, used bool in place of
standard error codes
v5: Renamed one function
v6: Subject tag change, several kerneldocs updated, some functions
renamed, some moved, added several asserts, shuffled declarations
of variables, revealed more detail in high level functions
v7: Fixed typos, added `_locked` suffix to some functs, improved
readability of asserts, removed unneeded conditional
v8: Moved one function, removed implementation detail from kerneldoc,
added asserts
v9: Code shuffling without much change, and one param rename
v10: Minor error path change, added printing the shift via debugfs
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512114018.361843-3-tomasz.lis@intel.com
The balloon nodes, which are used to fill areas of GGTT inaccessible
for a specific VF, were allocated and inserted into GGTT within one
function. To be able to re-use that insertion code during VF
migration recovery, we need to split it.
This patch separates allocation (init/fini functs) from the insertion
of balloons (balloon/deballoon functs). Locks are also moved to ensure
calls from post-migration recovery worker will not cause a deadlock.
v2: Moved declarations to proper header
v3: Rephrased description, introduced "_locked" versions of some
functs, more lockdep checks, some functions renamed, altered error
handling, added missing kerneldocs.
v4: Suffixed more functs with `_locked`, moved lockdep asserts,
fixed finalization in error path, added asserts
v5: Renamed another few functs, used xe_ggtt_node_allocated(),
moved lockdep back again to avoid null dereference, added
asserts, improved comments
v6: Changed params of cleanup_ggtt()
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512114018.361843-2-tomasz.lis@intel.com
The kernel fault injection infrastructure is used to test proper error
handling during probe. The return code of the functions using
ALLOW_ERROR_INJECTION() can be conditionnally modified at runtime by
tuning some debugfs entries. This requires CONFIG_FUNCTION_ERROR_INJECTION
(among others).
One way to use fault injection at probe time by making each of those
functions fail one at a time is:
FAILTYPE=fail_function
DEVICE="0000:00:08.0" # depends on the system
ERRNO=-12 # -ENOMEM, can depend on the function
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
modprobe xe
echo $DEVICE > /sys/bus/pci/drivers/xe/unbind
grep -oP "^.* \[xe\]" /sys/kernel/debug/$FAILTYPE/injectable | \
cut -d ' ' -f 1 | while read -r FUNCTION ; do
echo "Injecting fault in $FUNCTION"
echo "" > /sys/kernel/debug/$FAILTYPE/inject
echo $FUNCTION > /sys/kernel/debug/$FAILTYPE/inject
printf %#x $ERRNO > /sys/kernel/debug/$FAILTYPE/$FUNCTION/retval
echo $DEVICE > /sys/bus/pci/drivers/xe/bind
done
rmmod xe
It will also be integrated into IGT for systematic execution by CI.
v2: Wrappers are not needed in the cases covered by this patch, so
remove them and use ALLOW_ERROR_INJECTION() directly.
v3: Document the use of fault injection at probe time in xe_pci_probe
and refer to it where ALLOW_ERROR_INJECTION() is used.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240927151207.399354-1-francois.dugast@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Clang warns (or errors with CONFIG_DRM_WERROR or CONFIG_WERROR):
drivers/gpu/drm/xe/xe_ggtt.c:810:3: error: variable 'total' is uninitialized when used here [-Werror,-Wuninitialized]
810 | total += hole_size;
| ^~~~~
drivers/gpu/drm/xe/xe_ggtt.c:798:11: note: initialize the variable 'total' to silence this warning
798 | u64 total;
| ^
| = 0
1 error generated.
Move the zero initialization of total from
xe_gt_sriov_pf_config_print_available_ggtt() to xe_ggtt_print_holes() to
resolve the warning.
Fixes: 136367290e ("drm/xe: Introduce xe_ggtt_print_holes")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240823-drm-xe-fix-total-in-xe_ggtt_print_holes-v1-1-12b02d079327@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Defer the ggtt node removal to a thread if runtime_pm is not active.
The ggtt node removal can be called from multiple places, including
places where we cannot protect with outer callers and places we are
within other locks. So, try to grab the runtime reference if the
device is already active, otherwise defer the removal to a separate
thread from where we are sure we can wake the device up.
v2: - use xe wq instead of system wq (Matt and CI)
- Avoid GFP_KERNEL to be future proof since this removal can
be called from outside our drivers and we don't want to block
if atomic is needed. (Brost)
v3: amend forgot chunk declaring xe_device.
v4: Use a xe_ggtt_region to encapsulate the node and remova info,
wihtout the need for any memory allocation at runtime.
v5: Actually fill the delayed_removal.invalidate (Brost)
v6: - Ensure that ggtt_region is not freed before work finishes (Auld)
- Own wq to ensures that the queued works are flushed before
ggtt_fini (Brost)
v7: also free ggtt_region on early !bound return (Auld)
v8: Address the null deref (CI)
v9: Based on the new xe_ggtt_node for the proper care of the lifetime
of the object.
v10: Redo the lost v5 change. (Brost)
v11: Simplify the invalidate_on_remove (Lucas)
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-12-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In some rare cases, the drm_mm node cannot be removed synchronously
due to runtime PM conditions. In this situation, the node removal will
be delegated to a workqueue that will be able to wake up the device
before removing the node.
However, in this situation, the lifetime of the xe_ggtt_node cannot
be restricted to the lifetime of the parent object. So, this patch
introduces the infrastructure so the xe_ggtt_node struct can be
allocated in advance and freed when needed.
By having the ggtt backpointer, it also ensure that the init function
is always called before any attempt to insert or reserve the node
in the GGTT.
v2: s/xe_ggtt_node_force_fini/xe_ggtt_node_fini and use it
internaly (Brost)
v3: - Use GF_NOFS for node allocation (CI)
- Avoid ggtt argument, now that we have it inside the node (Lucas)
- Fix some missed fini cases (CI)
v4: - Fix SRIOV critical case where config->ggtt_region was
lost (Michal)
- Avoid ggtt argument also on removal (missed case on v3) (Michal)
- Remove useless checks (Michal)
- Return 0 instead of negative errno on a u32 addr. (Michal)
- s/xe_ggtt_assign/xe_ggtt_node_assign for coherence, while we
are touching it (Michal)
v5: - Fix VFs' ggtt_balloon
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-11-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The xe_ggtt component uses drm_mm to manage the GGTT.
The drm_mm_node is just a node inside drm_mm, but in Xe we use that
only in the GGTT context. So, this patch encapsulates the drm_mm_node
into a xe_ggtt's new struct.
This is the first step towards limiting all the drm_mm access
through xe_ggtt. The ultimate goal is to have a better control of
the node insertion and removal, so the removal can be delegated
to a delayed workqueue.
v2: Fix includes and typos (Michal and Brost)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-5-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This WA requires us to limit media GT frequency requests to a certain
cap value during driver load. Freq limits are restored after load
completes, so perf will not be affected during normal operations.
During normal driver operation, this WA requires dummy writes to media
offset 0x380D8C after every ~63 GGTT writes. This will ensure completion
of the LMEM writes originating from Gunit.
During driver unload(before FLR), the WA requires that we set requested
frequency to the cap value again.
v3: Do not use WA number in function name. Call WA wrapper from xe_device.
Rename some variables, check for locks in the correct function (Rodrigo).
Ensure reset path is also covered for this WA.
v4: Fix BAT failure
v5: Add a function pointer for ggtt_ops (Michal W)
v6: Fix name collision and use static function (Rodrigo)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240620224928.3986377-2-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Let's simply convert all the current callers towards direct
xe_pm_runtime access and remove this extra layer of indirection.
No functional change is expected with this patch since
xe_mem_access_get was already using the xe_pm_runtime_get_noresume
at this point.
v2: Convert all the current callers instead of a big refactor
at once.
v3: - Rebased
- Squashed the GSC/HDCP
- Added a new case: sriov_pf_policy
- Improved commit message to highlight that
there's no functional change in this patch.
Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v2
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418143049.43231-1-rodrigo.vivi@intel.com
xe_pm_init is the very last thing during the xe_pci_probe(),
hence these protections are useless from the point of view
of ensuring that the device is awake.
Let's remove it so we continue towards the goal of killing
xe_device_mem_access.
v2: Adding more cases
v3: Provide a separate fix for xe_tile_init_noalloc return (Matt)
Adding a new case where display HDCP init calls which
are also called at display probe time.
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-5-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
VF's drivers can't modify GGTT PTEs except the range explicitly
assigned by the PF driver. To allow hardware enforcement of this
requirement, each GGTT PTE has a field with the VF number that
identifies which VF can modify that particular GGTT PTE entry.
Only PF driver can modify this field and PF driver shall do that
before VF drivers will be loaded. Add function to prepare PTEs.
Since it will be used only by the PF driver, make it available
only for CONFIG_PCI_IOV=y.
Bspec: 45015, 52395
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415173937.1287-3-michal.wajdeczko@intel.com
The flags stored in the BO grew over time without following
much a naming pattern. First of all, get rid of the _BIT suffix that was
banned from everywhere else due to the guideline in
drivers/gpu/drm/i915/i915_reg.h that xe kind of follows:
Define bits using ``REG_BIT(N)``. Do **not** add ``_BIT`` suffix to the name.
Here the flags aren't for a register, but it's good practice to keep it
consistent.
Second divergence on names is the use or not of "CREATE". This is
because most of the flags are passed to xe_bo_create*() family of
functions, changing its behavior. However, since the flags are also
stored in the bo itself and checked elsewhere in the code, it seems
better to just omit the CREATE part.
With those 2 guidelines, all the flags are given the form
XE_BO_FLAG_<FLAG_NAME> with the following commands:
git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i \
-e "s/XE_BO_\([_A-Z0-9]*\)_BIT/XE_BO_\1/g" \
-e 's/XE_BO_CREATE_/XE_BO_FLAG_/g'
git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i -r \
-e 's/XE_BO_(DEFER_BACKING|SCANOUT|FIXED_PLACEMENT|PAGETABLE|NEEDS_CPU_ACCESS|NEEDS_UC|INTERNAL_TEST|INTERNAL_64K|GGTT_INVALIDATE)/XE_BO_FLAG_\1/g'
And then the defines in drivers/gpu/drm/xe/xe_bo.h are adjusted to
follow the coding style.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>