There is already a BUILD_BUG_ON() check to make sure the size follow the
number of steering types. Also make sure the right index is being used
for each steering type.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The function pointer is already present as match_func, inside
struct xe_rtp_rule and handled as so instead of inside rtp_regval as
originally thought out when this was written.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Now that we issue TLB invalidations on unbinds and rebind from execs we
no longer need to issue TLB invalidations from the ring operations.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
If we add an TLB invalidation fence for rebinds issued from execs we
should be able to drop the TLB invalidation from the ring operations.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Only the GuC should be issuing TLB invalidations if it is enabled. Part
of this patch is sanitize the device on driver unload to ensure we do
not send GuC based TLB invalidations during driver unload.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Endless fences are not good, add a TDR to cleanup any invalidation
fences which have not received an invalidation message within a timeout
period.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
This gets tricky as we can't do the TLB invalidation until the unbind
operation is done on the hardware and we can't signal the unbind as
complete until the TLB invalidation is done. To work around this we
create an unbind fence which does a TLB invalidation after unbind is
done on the hardware, signals on TLB invalidation completion, and this
fence is installed in the BO dma-resv slot and installed in out-syncs
for the unbind operation.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We can't currently do this due to TLB invalidation done handler
expecting the seqno being received in-order, with the fast-path a TLB
invalidation done could pass one being processed in the slow-path in an
extreme corner case. Remove TLB invalidation done from the fast-path for
now and in a follow up reenable this once the TLB invalidation done
handler can deal with out of order seqno.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Scratch page is in VRAM, and therefore requires 64K GTT layout. In GGTT
world this just means having 16 consecutive entries, with 64K GTT
alignment for the GTT address of the first entry (also matching physical
alignment). However to keep things simple just dump it into system
memory, like we already do for ppGTT. While we are here, also give it
known default value.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Spec says we need to use 64K VRAM pages for GGTT on platforms like DG2.
In GGTT this just means aligning the GTT address to 64K and ensuring
that we have 16 consecutive entries each pointing to the respective 4K
entry. We already ensure we have 64K pages underneath, so it's just a
case of forcing the GTT alignment.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On DG2 when running the xe_vm IGT, the kernel generates loads of CAT
errors and GT resets (sometimes at least). On small-bar systems seems
to trigger a lot more easily (maybe due to difference in allocation
strategy). Appears to be related to scratch, since we seem to use the
64K TLB hint on scratch entries, even though the scratch page is a 4K
vram page. Bumping the scratch page size and physical alignment seems
to fix it. Or at least we no longer hit:
[ 148.872683] xe 0000:03:00.0: [drm] Engine memory cat error: guc_id=0
[ 148.872701] xe 0000:03:00.0: [drm] Engine memory cat error: guc_id=0
[ 148.875108] WARNING: CPU: 0 PID: 953 at drivers/gpu/drm/xe/xe_guc_submit.c:797
However to keep things simple, so we don't have to deal with 64K TLB
hints, just move the scratch page into system memory on platforms that
require 64K VRAM pages.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On DGFX this blows up if can call this with a system memory object:
XE_BUG_ON(!mem_type_is_vram(place->mem_type) && place->mem_type != XE_PL_STOLEN);
If we consider dpt it looks like we can already in theory hit this, if
we run out of vram and stolen vram. It at least seems reasonable to
allow calling this on any object which supports CPU access.
Note this also changes the behaviour with stolen VRAM and suspend, such
that we no longer attempt to migrate stolen objects into system memory.
However nothing in stolen should ever need to be restored (same on
integrated), so should be fine. Also on small-bar systems the stolen
portion is pretty much always non-CPU accessible, and currently pinned
objects use plain memcpy when being moved, which doesn't play nicely.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The comparison with < 0 suggests that the memory device access
should be signed to handle underflow. This makes it work more reliably.
As a result, the max refcount is now S32_MAX instead.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
At present the interrupts are enabled while initializing the last GT.
But this is incorrect for a Multi-GT platform, as root GT initialization
will fail with interrupt disabled. Interrupts are required for
the GuC submission triggered during initialization.
Enable the interrupt during the root GT initialization.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Some of the tests may benefit from running with ARCH=um, forgoing any
additional setup on the CI build side. Add min config for that.
Tested with:
./tools/testing/kunit/kunit.py build \
--kunitconfig drivers/gpu/drm/xe/.kunitconfig \
--jobs $(nproc) \
--build_dir build_kunit
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
mem_type field was added in commit d8b52a02cb ("drm/xe: Implement
stolen memory.") to designate the TTM memory type for that mgr. Add
kernel-doc with its description.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Updates for v6.8:
Core:
- Add support for SDM670, SM8650
- Handle the CFG interconnect to fix the obscure hangs / timeouts
on register write
- Kconfig fix for QMP dependency
- DT schema fixes
DPU:
- Add support for SDM670, SM8650
- Enable SmartDMA on SM8350 and SM8450
- Correct UBWC settings for SC8280XP
- Fix catalog settings for SC8180X
- Actually make use of the version to switch between QSEED3/3LITE/4
scalers
- Use devres-managed and drm-managed allocations where appropriate
- misc other fixes
- Enabled YUV writeback on SC7280, SM8250
- Enabled writeback on SM8350, SM8450
- CRC fix when encoder is selected as the input source
- other misc fixes
MDP4:
- Use devres-managed and drm-managed allocations where appropriate
- flush vblank event on CRTC disable
MDP5:
- Use devres-managed and drm-managed allocations where appropriate
DP:
- Add support for SM8650
- Enable PM runtime support
- Merge msm-specific debugfs dir with the generic one
- Described DisplayPort on SM8150 in DeviceTree bindings
- Moved dp_display_get_next_bridge() to probe()
DSI:
- Add support for SM8650
- Enable PM runtime support
GPU/GEM:
- demote userspace triggerable warnings to debug
- add GEM object metadata UAPI
- move GPU devcoredumps to GPU device
- fix hangcheck to skip retired submits
- expose UBWC config to userspace
- fix a680 chip-id
- drm_exec conversion
- drm/ci: remove rebase-merge directory (to unblock CI)
[airlied: fix drm_exec/amd interaction]
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
Idle bo's PTE needs to be re-created when resetting VM state machine.
Set idle bo's vm_bo as moved to mark it as invalid.
Fixes: 55bf196f60 ("drm/amdgpu: reset VM when an error is detected")
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Driver incorrectly checks if pointer variable OutBpp is null instead of
if the value being pointed to is zero.
[How]
Dereference OutBpp before checking for a value of zero.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Josip Pavic <josip.pavic@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>