Catch up with -rc2 and fixing namespace conflict issue caused by
commit cdd30ebb1b ("module: Convert symbol namespace to string literal")
and commit 0c45e76fcc ("drm/xe/vsec: Support BMG devices")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Pull drm updates from Dave Airlie:
"There's a lot of rework, the panic helper support is being added to
more drivers, v3d gets support for HW superpages, scheduler
documentation, drm client and video aperture reworks, some new
MAINTAINERS added, amdgpu has the usual lots of IP refactors, Intel
has some Pantherlake enablement and xe is getting some SRIOV bits, but
just lots of stuff everywhere.
core:
- split DSC helpers from DP helpers
- clang build fixes for drm/mm test
- drop simple pipeline support for gem vram
- document submission error signaling
- move drm_rect to drm core module from kms helper
- add default client setup to most drivers
- move to video aperture helpers instead of drm ones
tests:
- new framebuffer tests
ttm:
- remove swapped and pinned BOs from TTM lru
panic:
- fix uninit spinlock
- add ABGR2101010 support
bridge:
- add TI TDP158 support
- use standard PM OPS
dma-fence:
- use read_trylock instead of read_lock to help lockdep
scheduler:
- add errno to sched start to report different errors
- add locking to drm_sched_entity_modify_sched
- improve documentation
xe:
- add drm_line_printer
- lots of refactoring
- Enable Xe2 + PES disaggregation
- add new ARL PCI ID
- SRIOV development work
- fix exec unnecessary implicit fence
- define and parse OA sync props
- forcewake refactoring
i915:
- Enable BMG/LNL ultra joiner
- Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+
- use DSB for plane/color mgmt
- Arrow lake PCI IDs
- lots of i915/xe display refactoring
- enable PXP GuC autoteardown
- Pantherlake (PTL) Xe3 LPD display enablement
- Allow fastset HDR infoframe changes
- write DP source OUI for non-eDP sinks
- share PCI IDs between i915 and xe
amdgpu:
- SDMA queue reset support
- SMU 13.0.6, JPEG 4.0.3 updates
- Initial runtime repartitioning support
- rework IP structs for multiple IP instances
- Fetch EDID from _DDC if available
- SMU13 zero rpm user control
- lots of fixes/cleanups
amdkfd:
- Increase event FIFO size
- add topology cap flag for per queue reset
msm:
- DPU:
- SA8775P support
- (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support
- Enable large framebuffer support
- Drop MSM8998 and SDM845
- DP:
- SA8775P support
- GPU:
- a7xx preemption support
- Adreno A663 support
ast:
- warn about unsupported TX chips
ivpu:
- add coredump
- add pantherlake support
rockchip:
- 4K@60Hz display enablement
- generate pll programming tables
panthor:
- add timestamp query API
- add realtime group priority
- add fdinfo support
etnaviv:
- improve handling of DMA address limits
- improve GPU hangcheck
exynos:
- Decon Exynos7870 support
mediatek:
- add OF graph support
omap:
- locking fixes
bochs:
- convert to gem/shmem from simpledrm
v3d:
- support big/super pages
- add gemfs
vc4:
- BCM2712 support refactoring
- add YUV444 format support
udmabuf:
- folio related fixes
nouveau:
- add panic support on nv50+"
* tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel: (1583 commits)
drm/xe/guc: Fix dereference before NULL check
drm/amd: Fix initialization mistake for NBIO 7.7.0
Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"
drm/amd/display: Fix failure to read vram info due to static BP_RESULT
drm/amdgpu: enable GTT fallback handling for dGPUs only
drm/amd/amdgpu: limit single process inside MES
drm/fourcc: add AMD_FMT_MOD_TILE_GFX9_4K_D_X
drm/amdgpu/mes12: correct kiq unmap latency
drm/amdgpu: Support vcn and jpeg error info parsing
drm/amd : Update MES API header file for v11 & v12
drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling
drm/amdkfd: change kfd process kref count at creation
drm/amdgpu: Cleanup shift coding style
drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data
drm/amdgpu: Implement virt req_ras_err_count
drm/amdgpu: VF Query RAS Caps from Host if supported
drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry
drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
drm/amd/display: 3.2.309
drm/amd/display: Adjust VSDB parser for replay feature
...
Add a subtest that tries to allocate twice the amount of
buffer object memory available, write data to it and then read
all the data back verifying data integrity.
In order to be able to do this on systems that
have no or not enough swap-space available, allocate some memory
as purgeable, and introduce a function to purge such memory from
the TTM swap_notify path.
this test is intended to add test coverage to the current
bo swap path and upcoming shrinking path.
The test has previously been part of the xe bo shrinker series.
v2:
- Skip test if the execution time is expected to be too long.
- Minor code cleanups.
v3:
- Print random seed. (Matthew Auld)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240909085654.5064-1-thomas.hellstrom@linux.intel.com
Now that rtp has OR rules, it's not needed to extend it to process OOB
WAs. Previously if an entry had no name, it was considered as "a set of
rules OR'ed with the last named entry".
Instead of generating new entries, add OR rules. The syntax for
xe_wa_oob.rules remains the same, with xe_gen_wa_oob generating the
slightly different table. Object sizes delta are negligible, but having
just one logic makes it easier to maintain:
add/remove: 0/0 grow/shrink: 1/2 up/down: 160/-269 (-109)
Function old new delta
__compound_literal 6104 6264 +160
xe_wa_dump 1839 1810 -29
oob_was 816 576 -240
Total: Before=17257, After=17148, chg -0.63%
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The OOB WAs use xe_rtp_process(), without passing an sr to save result
of the actions since there are none. They are also executed in a gt-only
context, making it harder to share the implementation. Thus, introduce a
new set of tests to check these RTP entries. The only check that can be
done is if the entry was marked as active.
Before commit fd6797ec50 ("drm/xe/rtp: Fix off-by-one when processing
rules") several of these tests were failing: the processing of OR'ed
entries would make the subsequent entry to be inadvertently enabled.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
There is no point to run those tests on VFs devices as they can't
access any of the MOCS registers. Skip testing on the VF device.
[ ] =================== xe_mocs (1 subtest) ====================
[ ] ================ xe_live_mocs_kernel_kunit ================
[ ] [PASSED] 0000:4d:00.0
[ ] [SKIPPED] 0000:4d:00.1
[ ] ============ [PASSED] xe_live_mocs_kernel_kunit ============
[ ] ===================== [PASSED] xe_mocs =====================
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240720142528.530-8-michal.wajdeczko@intel.com
Instead of iterating over available Xe devices within a testcase,
without being able to distinguish potential failures from different
devices on system with many Xe devices, introduce helpers that will
allow to treat each Xe device as a parameter for the testcase like:
static void bar(struct kunit *test)
{
struct xe_device *xe = test->priv;
...
}
struct kunit_case foo_live_tests[] = {
KUNIT_CASE_PARAM(bar, xe_pci_live_device_gen_param),
{}
};
struct kunit_suite foo_suite = {
.name = "foo_live",
.test_cases = foo_live_tests,
.init = xe_kunit_helper_xe_device_live_test_init,
};
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240720142528.530-3-michal.wajdeczko@intel.com
Xe2+ has unified compression (exactly one compression mode/format),
where compression is now controlled via PAT at PTE level.
This simplifies KMD operations, as it can now decompress freely
without concern for the buffer's original compression format—unlike DG2,
which had multiple compression formats and thus required copying the
raw CCS state during VRAM eviction. In addition mixed VRAM and system
memory buffers were not supported with compression enabled.
On Xe2 dGPU compression is still only supported with VRAM, however we
can now support compression with VRAM and system memory buffers,
with GPU access being seamless underneath. So long as when doing
VRAM -> system memory the KMD uses compressed -> uncompressed,
to decompress it. This also allows CPU access to such buffers,
assuming that userspace first decompress the corresponding
pages being accessed.
If the pages are already in system memory then KMD would have already
decompressed them. When restoring such buffers with sysmem -> VRAM
the KMD can't easily know which pages were originally compressed,
so we always use uncompressed -> uncompressed here.
With this it also means we can drop all the raw CCS handling on such
platforms (including needing to allocate extra CCS storage).
In order to support this we now need to have two different identity
mappings for compressed and uncompressed VRAM.
In this patch, we set up the additional identity map for the VRAM with
compressed pat_index. We then select the appropriate mapping during
migration/clear. During eviction (vram->sysmem), we use the mapping
from compressed -> uncompressed. During restore (sysmem->vram), we need
the mapping from uncompressed -> uncompressed.
Therefore, we need to have two different mappings for compressed and
uncompressed vram. We set up an additional identity map for the vram
with compressed pat_index.
We then select the appropriate mapping during migration/clear.
v2: Formatting nits, Updated code to match recent changes in
xe_migrate_prepare_vm(). (Matt)
v3: Move identity map loop to a helper function. (Matt Brost)
v4: Split helper function in different patch, and
add asserts and nits. (Matt Brost)
v5: Convert the 2 bool arguments of pte_update_size to flags
argument (Matt Brost)
v6: Formatting nits (Matt Brost)
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b00db5c7267e54260cb6183ba24b15c1e6ae52a3.1721250309.git.akshata.jahagirdar@intel.com
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.
But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-5-michal.wajdeczko@intel.com
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.
But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-4-michal.wajdeczko@intel.com
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.
But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-3-michal.wajdeczko@intel.com
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.
But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-2-michal.wajdeczko@intel.com
It's not very obvious what the difference is between the 'size' and
'n_entries' fields of the MOCS structure. Rename both fields slightly
and add some comments explaining that one is the documentation-defined
table size, while the other is the number of entries that can be
programmed into the hardware (and the documented table size can
potentially be smaller than the number of hardware entries).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240627203741.2042752-4-matthew.d.roper@intel.com
Some workarounds started to depend on different set of conditions where
the action should be applied if any of them match. See e.g.
commit 24d0d98af1 ("drm/xe/xe2lpm: Fixup Wa_14020756599"). Add
XE_RTP_MATCH_OR that allows to implement a logical OR for the rules.
Normal precedence applies:
r1, r2, OR, r3
means
(r1 AND r2) OR r3
The check is shortcut as soon as a set of conditions match.
v2: Do not match on empty number of rules-other-than-OR evaluated
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240618050044.324454-4-lucas.demarchi@intel.com
Any kunit doing any memory access should get their own runtime_pm
outer references since they don't use the standard driver API
entries. In special this dma_buf from the same driver.
Found by pre-merge CI on adding WARN calls for unprotected
inner callers:
<6> [318.639739] # xe_dma_buf_kunit: running xe_test_dmabuf_import_same_driver
<4> [318.639957] ------------[ cut here ]------------
<4> [318.639967] xe 0000:4d:00.0: Missing outer runtime PM protection
<4> [318.640049] WARNING: CPU: 117 PID: 3832 at drivers/gpu/drm/xe/xe_pm.c:533 xe_pm_runtime_get_noresume+0x48/0x60 [xe]
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-10-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>