linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-20 07:43:57 -04:00

Author	SHA1	Message	Date
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/\(alloc_objs(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	939faf71cf	Merge tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel Pull drm updates from Dave Airlie: "Highlights: - amdgpu support for lots of new IP blocks which means newer GPUs - xe has a lot of SR-IOV and SVM improvements - lots of intel display refactoring across i915/xe - msm has more support for gen8 platforms - Given up on kgdb/kms integration, it's too hard on modern hw core: - drop kgdb support - replace system workqueue with percpu - account for property blobs in memcg - MAINTAINERS updates for xe + buddy rust: - Fix documentation for Registration constructors - Use pin_init::zeroed() for fops initialization - Annotate DRM helpers with __rust_helper - Improve safety documentation for gem::Object::new() - Update AlwaysRefCounted imports - mm: Prevent integer overflow in page_align() atomic: - add drm_device pointer to drm_private_obj - introduce gamma/degamma LUT size check buddy: - fix free_trees memory leak - prevent BUG_ON bridge: - introduce drm_bridge_unplug/enter/exit - add connector argument to .hpd_notify - lots of recounting conversions - convert rockchip inno hdmi to bridge - lontium-lt9611uxc: switch to HDMI audio helpers - dw-hdmi-qp: add support for HPD-less setups - Algoltek AG6311 support panels: - edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H - st75751: add SPI support - Sitronix ST7920, Samsung LTL106HL02 - LG LH546WF1-ED01, HannStar HSD156J - BOE NV130WUM-T08 - Innolux G150XGE-L05 - Anbernic RG-DS dma-buf: - improve sg_table debugging - add tracepoints - call clear_page instead of memset - start to introduce cgroup memory accounting in heaps - remove sysfs stats dma-fence: - add new helpers dp: - mst: avoid oob access with vcpi=0 hdmi: - limit infoframes exposure to userspace gem: - reduce page table overhead with THP - fix leak in drm_gem_get_unmapped_area gpuvm: - API sanitation for rust bindings sched: - introduce new helpers panic: - report invalid panic modes - add kunit tests i915/xe display: - Expose sharpness only if num_scalers is >= 2 - Add initial Xe3P_LPD for NVL - BMG FBC support - Add MTL+ platforms to support dpll framework _ fix DIMM_S DRM decoding on ICL - Return to using AUX interrupts - PSR/Panel replay refactoring - use consolidation HDMI tables - Xe3_LPD CD2X dividier changes xe: - vfio: add vfio_pci for intel GPU - multi queue support - dynamic pagemaps and multi-device SVM - expose temp attribs in hwmon - NO_COMPRESSION bo flag - expose MERT OA unit - sysfs survivability refactor - SRIOV PF: add MERT support - enable SR-IOV VF migration - Enable I2C/NVM on Crescent Island - Xe3p page reclaimation support - introduce SRIOV scheduler groups - add SoC remappt support in system controller - insert compiler barriers in GuC code - define NVL GuC firmware - handle GT resume failure - fix drm scheduler layering violations - enable GSC loading and PXP for PTL - disable GuC Power DCC strategy on PTL - unregister drm device on probe error i915: - move to kernel standard fault injection - bump recommended GuC version for DG2 and MTL amdgpu: - SMUIO 15.x, PSP 15.x support - IH 6.1.1/7.1 support - MMHUB 3.4/4.2 support - GC 11.5.4/12.1 support - SDMA 6.1.4/7.1/7.11.4 support - JPEG 5.3 support - UserQ updates - GC 9 gfx queue reset support - TTM memory ops parallelization - convert legacy logging to new helpers - DC analog fixes amdkfd: - GC 11.5.4/12.1 suppport - SDMA 6.1.4/7.1 support - per context support - increase kfd process hash table - Reserved SDMA rework radeon: - convert legacy logging to new helpers - use devm for i2c adapters msm: - GPU - Document a612/RGMU dt bindings - UBWC 6.0 support (for A840 / Kaanapali) - a225 support - DPU: - Switch to use virtual planes by default - Fix DSI CMD panels on DPU 3.x - Rewrite format handling to remove intermediate representation - Fix watchdog on DPU 8.x+ - Fix TE / Vsync source setting on DPU 8.x+ - Add 3D_Mux on SC7280 - Kaanapali platform support - Fix UBWC register programming - Make RM reserve DSPP-enabled mixers for CRTCs with LMs - Gamma correction support - DP: - Enable support for eDP 1.4+ link rate tables - Fix MDSS1 DP indices on SA8775P, making them to work - Fix msm_dp_ctrl_config_msa() to work with LLVM 20 - DSI: - Document QCS8300 as compatible with SA8775P - Kaanapali platform support - DSI PHY: - switch to divider_determine_rate() - MDP5: - Drop support for MSM8998, SDM660 and SDM630 (switch over to DPU) - MDSS: - Kaanapali platform support - Fixed UBWC register programming nova-core: - Prepare for Turing support. This includes parsing and handling Turing-specific firmware headers and sections as well as a Turing Falcon HAL implementation - Get rid of the Result<impl PinInit<T, E>> anti-pattern - Relocate initializer-specific code into the appropriate initializer - Use CStr::from_bytes_until_nul() to remove custom helpers - Improve handling of unexpected firmware values - Clean up redundant debug prints - Replace c_str!() with native Rust C-string literals - Update nova-core task list nova: - Align GEM object size to system page size tyr: - Use generated uAPI bindings for GpuInfo - Replace manual sleeps with read_poll_timeout() - Replace c_str!() with native Rust C-string literals - Suppress warnings for unread fields - Fix incorrect register name in print statement nouveau: - fix big page table support races in PTE management - improve reclocking on tegra 186+ amdxdna: - fix suspend race conditions - improve handling of zero tail pointers - fix cu_idx overwritten during command setup - enable hardware context priority - remove NPU2 support - update message buffer allocation requirements - update firmware version check ast: - support imported cursor buffers - big endian fixes etnaviv: - add PPU flop reset support imagination: - add AM62P support - introduce hw version checks ivpu: - implement warm boot flow panfrost: - add bo sync ioctl - add GPU_PM_RT support for RZ/G3E SoC panthor: - add bo sync ioctl - enable timestamp propagation - scheduler robustness improvements - VM termination fixes - huge page support rockchip: - RK3368 HDMI Support - get rid of atomic_check fixups - RK3506 support - RK3576/RK3588 improved HPD handling rz-du: - RZ/V2H(P) MIPI-DSI Support v3d: - fix DMA segment size - convert to new logging helpers mediatek: - move DP training to hotplug thread - convert logging to new helpers - add support for HS speed DSI - Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support atmel-hlcdc: - switch to drmm resource - support nomodeset - use newer helpers hisilicon: - fix various DP bugs renesas: - fix kernel panic on reboot exynos: - fix vidi_connection_ioctl using wrong device - fix vidi_connection deref user ptr - fix concurrency regression with vidi_context vkms: - add configfs support for display configuration * tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel: (1610 commits) drm/xe/pm: Disable D3Cold for BMG only on specific platforms drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early drm/xe: Fix kerneldoc for xe_migrate_exec_queue drm/xe/query: Fix topology query pointer advance drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header drm/xe/guc: Fix CFI violation in debugfs access. accel/amdxdna: Move RPM resume into job run function accel/amdxdna: Fix incorrect DPM level after suspend/resume nouveau/vmm: start tracking if the LPT PTE is valid. (v6) nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2) nouveau/vmm: rewrite pte tracker using a struct and bitfields. accel/amdxdna: Fix incorrect error code returned for failed chain command accel/amdxdna: Remove hardware context status drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe() drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc drm/i915/display: fix the pixel normalization handling for xe3p_lpd drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free drm/exynos: vidi: fix to avoid directly dereferencing user pointer drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl() ...	2026-02-11 12:55:44 -08:00
Shuicheng Lin	c73a8917b3	drm/xe: Skip address copy for sync-only execs For parallel exec queues, xe_exec_ioctl() copied the batch buffer address array from userspace without checking num_batch_buffer. If user creates a sync-only exec that doesn't use the address field, the exec will fail with -EFAULT. Add num_batch_buffer check to skip the copy, and the exec could be executed successfully. Here is the sync-only exec: struct drm_xe_exec exec = { .extensions = 0, .exec_queue_id = qid, .num_syncs = 1, .syncs = (uintptr_t)&sync, .address = 0, /* ignored for sync-only / .num_batch_buffer = 0, / sync-only */ }; Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260122214053.3189366-2-shuicheng.lin@intel.com (cherry picked from commit `4761791c1e`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-01-26 14:50:32 +01:00
Matt Roper	8367585154	drm/xe: Cleanup unused header includes clangd reports many "unused header" warnings throughout the Xe driver. Start working to clean this up by removing unnecessary includes in our .c files and/or replacing them with explicit includes of other headers that were previously being included indirectly. By far the most common offender here was unnecessary inclusion of xe_gt.h. That likely originates from the early days of xe.ko when xe_mmio did not exist and all register accesses, including those unrelated to GTs, were done with GT functions. There's still a lot of additional #include cleanup that can be done in the headers themselves; that will come as a followup series. v2: - Squash the 79-patch series down to a single patch. (MattB) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260115032803.4067824-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-01-15 07:05:04 -08:00
Shuicheng Lin	8e46130400	drm/xe: Limit num_syncs to prevent oversized allocations The exec and vm_bind ioctl allow userspace to specify an arbitrary num_syncs value. Without bounds checking, a very large num_syncs can force an excessively large allocation, leading to kernel warnings from the page allocator as below. Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request exceeding this limit. " ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124 ... Call Trace: <TASK> alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348 __do_kmalloc_node mm/slub.c:4364 [inline] __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388 kmalloc_noprof include/linux/slab.h:909 [inline] kmalloc_array_noprof include/linux/slab.h:948 [inline] xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl fs/ioctl.c:584 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... " v2: Add "Reported-by" and Cc stable kernels. v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh) v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt) v5: Do the check at the top of the exec func. (Matt) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: <stable@vger.kernel.org> # v6.12+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Carl Zhang <carl.zhang@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Ivan Briano <ivan.briano@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com (cherry picked from commit `b07bac9bd7`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-12-18 18:10:34 +01:00
Matthew Brost	4ac9048d05	drm/xe: Wait on in-syncs when swicthing to dma-fence mode If a dma-fence submission has in-fences and pagefault queues are running work, there is little incentive to kick the pagefault queues off the hardware until the dma-fence submission is ready to run. Therefore, wait on the in-fences of the dma-fence submission before removing the pagefault queues from the hardware. v2: - Fix kernel doc (CI) - Don't wait under lock (Thomas) - Make wait interruptable Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251212182847.1683222-6-matthew.brost@intel.com	2025-12-15 14:02:54 -08:00
Shuicheng Lin	b07bac9bd7	drm/xe: Limit num_syncs to prevent oversized allocations The exec and vm_bind ioctl allow userspace to specify an arbitrary num_syncs value. Without bounds checking, a very large num_syncs can force an excessively large allocation, leading to kernel warnings from the page allocator as below. Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request exceeding this limit. " ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124 ... Call Trace: <TASK> alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348 __do_kmalloc_node mm/slub.c:4364 [inline] __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388 kmalloc_noprof include/linux/slab.h:909 [inline] kmalloc_array_noprof include/linux/slab.h:948 [inline] xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl fs/ioctl.c:584 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... " v2: Add "Reported-by" and Cc stable kernels. v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh) v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt) v5: Do the check at the top of the exec func. (Matt) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: <stable@vger.kernel.org> # v6.12+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Carl Zhang <carl.zhang@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Ivan Briano <ivan.briano@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com	2025-12-15 13:33:56 -08:00
Matthew Brost	1a2cf01e1c	drm/xe: Remove last fence dependency check from binds and execs Eliminate redundant last fence dependency checks in exec and bind jobs, as they are now equivalent to xe_exec_queue_is_idle. Simplify the code by removing this dead logic. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-7-matthew.brost@intel.com	2025-11-04 08:21:18 -08:00
Matthew Brost	adda4e855a	drm/xe: Enforce correct user fence signaling order using Prevent application hangs caused by out-of-order fence signaling when user fences are attached. Use drm_syncobj (via dma-fence-chain) to guarantee that each user fence signals in order, regardless of the signaling order of the attached fences. Ensure user fence writebacks to user space occur in the correct sequence. v7: - Skip drm_syncbj create of error (CI) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com	2025-11-04 08:20:46 -08:00
Shuicheng Lin	4a7fe36a12	drm/xe: Limit number of jobs per exec queue Add a limit to the number of jobs that can be queued in a single exec queue to avoid potential resource exhaustion. A new field `job_cnt` is introduced in `struct xe_exec_queue` to track the number of active DRM jobs, along with a maximum limit `XE_MAX_JOB_COUNT_PER_EXEC_QUEUE` set to 1000. If the job count exceeds this threshold, `xe_exec_ioctl()` now returns `-EAGAIN` to signal that the caller should retry later. A trace event is added to track when the limit is reached: "xe_exec_queue_reach_max_job_count: dev=0000:03:00.0, job count exceeded the maximum limit (1000) per exec queue. engine_class=0x3, logical_mask=0x1, guc_id=2" v3: add assert in xe_exec_queue_destroy that q->job_cnt is zero. (Matt) v2 (Matt): - add log to trace the limit is hit. - Change max count from 0x1000 to 1000. - Use atomic_t for job_cnt. Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251027202118.3339905-2-shuicheng.lin@intel.com	2025-10-28 18:46:19 -07:00
Sanjay Yadav	dd5d11b657	drm/xe: Fix spelling and typos across Xe driver files Corrected various spelling mistakes and typos in multiple files under the Xe directory. These fixes improve clarity and maintain consistency in documentation. v2 - Replaced all instances of "XE" with "Xe" where it referred to the driver name - of -> for - Typical -> Typically v3 - Revert "Xe" to "XE" for macro prefix reference Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251023121453.1182035-2-sanjay.kumar.yadav@intel.com	2025-10-27 13:00:11 +00:00
Matthew Brost	f6375fb3aa	drm/xe: Track LR jobs in DRM scheduler pending list VF migration requires jobs to remain pending so they can be replayed after the VF comes back. Previously, LR job fences were intentionally signaled immediately after submission to avoid the risk of exporting them, as these fences do not naturally signal in a timely manner and could break dma-fence contracts. A side effect of this approach was that LR jobs were never added to the DRM scheduler’s pending list, preventing them from being tracked for later resubmission. We now avoid signaling LR job fences and ensure they are never exported; Xe already guards against exporting these internal fences. With that guarantee in place, we can safely track LR jobs in the scheduler’s pending list so they are eligible for resubmission during VF post-migration recovery (and similar recovery paths). An added benefit is that LR queues now gain the DRM scheduler’s built-in flow control over ring usage rather than rejecting new jobs in the exec IOCTL if the ring is full. v2: - Ensure DRM scheduler TDR doesn't run for LR jobs - Stack variable for killed_or_banned_or_wedged v4: - Clarify commit message (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-5-matthew.brost@intel.com	2025-10-09 03:22:19 -07:00
Thomas Hellström	f73f6dd312	drm/xe/pm: Add lockdep annotation for the pm_block completion Similar to how we annotate dma-fences, add lockep annotation to the pm_block completion to ensure we don't wait for it while holding locks that are needed in the pm notifier or in the device suspend / resume callbacks. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250918142848.21807-3-thomas.hellstrom@linux.intel.com	2025-09-23 14:42:58 +02:00
Thomas Hellström	8f25e5abcb	drm/xe: Convert existing drm_exec transactions for exhaustive eviction Convert existing drm_exec transactions, like GT pagefault validation, non-LR exec() IOCTL and the rebind worker to support exhaustive eviction using the xe_validation_guard(). v2: - Adapt to signature change in xe_validation_guard() (Matt Brost) - Avoid gotos from within xe_validation_guard() (Matt Brost) - Check error return from xe_validation_guard() v3: - Rebase on gpu_madvise() Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1 Link: https://lore.kernel.org/r/20250908101246.65025-6-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:00 +02:00
Thomas Hellström	0131514f97	drm/xe: Pass down drm_exec context to validation We want all validation (potential backing store allocation) to be part of a drm_exec transaction. Therefore add a drm_exec pointer argument to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches will deal with making all (or nearly all) calls to these functions part of a drm_exec transaction. In the meantime, define special values of the drm_exec pointer: XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction has not been done yet. XE_VALIDATION_UNSUPPORTED: Some Middle-layers (dma-buf) doesn't allow the drm_exec context to be passed down to map_attachment where validation takes place. XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive eviction isn't crucial and the ROI of converting those is very small. For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also a lockdep check that a drm_exec transaction can indeed start at the location where the macro is expanded. This is to encourage developers to take this into consideration early in the code development process. v2: - Fix xe_vm_set_validation_exec() imbalance. Add an assert that hopefully catches future instances of this (Matt Brost) v3: - Extend to psmi_alloc_object Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v3 Link: https://lore.kernel.org/r/20250908101246.65025-2-thomas.hellstrom@linux.intel.com	2025-09-10 09:15:52 +02:00
Thomas Hellström	599334572a	drm/xe: Block exec and rebind worker while evicting for suspend / hibernate When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or get stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker waiting for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: `c6a4d46ec1` ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com	2025-09-05 17:03:35 +02:00
Matthew Auld	9e97874148	drm/xe/userptr: replace xe_hmm with gpusvm Goal here is cut over to gpusvm and remove xe_hmm, relying instead on common code. The core facilities we need are get_pages(), unmap_pages() and free_pages() for a given useptr range, plus a vm level notifier lock, which is now provided by gpusvm. v2: - Reuse the same SVM vm struct we use for full SVM, that way we can use the same lock (Matt B & Himal) v3: - Re-use svm_init/fini for userptr. v4: - Allow building xe without userptr if we are missing DRM_GPUSVM config. (Matt B) - Always make .read_only match xe_vma_read_only() for the ctx. (Dafna) v5: - Fix missing conversion with CONFIG_DRM_XE_USERPTR_INVAL_INJECT v6: - Convert the new user in xe_vm_madise. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Dafna Hirschfeld <dafna.hirschfeld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-17-matthew.auld@intel.com	2025-09-05 11:45:47 +01:00
Harish Chegondi	aef87a5fdb	drm/xe: Use copy_from_user() instead of __copy_from_user() copy_from_user() has more checks and is more safer than __copy_from_user() Suggested-by: Kees Cook <kees@kernel.org> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com	2025-05-07 09:27:40 -07:00
Daniele Ceraolo Spurio	41a97c4a12	drm/xe/pxp/uapi: Add API to mark a BO as using PXP The driver needs to know if a BO is encrypted with PXP to enable the display decryption at flip time. Furthermore, we want to keep track of the status of the encryption and reject any operation that involves a BO that is encrypted using an old key. There are two points in time where such checks can kick in: 1 - at VM bind time, all operations except for unmapping will be rejected if the key used to encrypt the BO is no longer valid. This check is opt-in via a new VM_BIND flag, to avoid a scenario where a malicious app purposely shares an invalid BO with a non-PXP aware app (such as a compositor). If the VM_BIND was failed, the compositor would be unable to display anything at all. Allowing the bind to go through means that output still works, it just displays garbage data within the bounds of the illegal BO. 2 - at job submission time, if the queue is marked as using PXP, all objects bound to the VM will be checked and the submission will be rejected if any of them was encrypted with a key that is no longer valid. Note that there is no risk of leaking the encrypted data if a user does not opt-in to those checks; the only consequence is that the user will not realize that the encryption key is changed and that the data is no longer valid. v2: Better commnnts and descriptions (John), rebase v3: Properly return the result of key_assign up the stack, do not use xe_bo in display headers (Jani) v4: improve key_instance variable documentation (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-11-daniele.ceraolospurio@intel.com	2025-02-03 11:51:23 -08:00
Nitin Gote	75fd04f276	drm/xe: Fix all typos in xe Fix all typos in files of xe, reported by codespell tool. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2025-01-09 17:58:09 +01:00
Matthew Brost	9e7aacd840	drm/xe: Ensure all locks released in exec IOCTL In couple of places the wrong error handling goto was used to release locks. Fix these to ensure all locks dropped on exec IOCTL errors. Cc: Francois Dugast <francois.dugast@intel.com> Fixes: `d16ef1a18e` ("drm/xe/exec: Switch hw engine group execution mode upon job submission") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106224944.30130-1-matthew.brost@intel.com	2024-11-07 14:23:08 -08:00
Matthew Brost	7d1a4258e6	drm/xe: Drop VM dma-resv lock on xe_sync_in_fence_get failure in exec IOCTL Upon failure all locks need to be dropped before returning to the user. Fixes: `58480c1c91` ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com	2024-11-05 11:16:42 -08:00
Matthew Brost	07064a200b	drm/xe: Fix possible exec queue leak in exec IOCTL In a couple of places after an exec queue is looked up the exec IOCTL returns on input errors without dropping the exec queue ref. Fix this ensuring the exec queue ref is dropped on input error. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com	2024-11-05 11:16:35 -08:00
Matthew Brost	b8b1163248	drm/xe: Use bookkeep slots for external BO's in exec IOCTL Fix external BO's dma-resv usage in exec IOCTL using bookkeep slots rather than write slots. This leaves syncing to user space rather than the KMD blindly enforcing write semantics on every external BO. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Kenneth Graunke <kenneth.w.graunke@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reported-by: Simona Vetter <simona.vetter@ffwll.ch> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2673 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240911152622.903058-1-matthew.brost@intel.com	2024-10-11 15:59:11 -07:00
Jani Nikula	87d8ecf015	drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h> include/drm/xe_drm.h does not exist. Prefer the explicit uapi include. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-28 15:17:54 -04:00
Francois Dugast	d16ef1a18e	drm/xe/exec: Switch hw engine group execution mode upon job submission If the job about to be submitted is a dma-fence job, update the current execution mode of the hw engine group. This triggers an immediate suspend of the exec queues running faulting long-running jobs. If the job about to be submitted is a long-running job, kick a new worker used to resume the exec queues running faulting long-running jobs once the dma-fence jobs have completed. v2: Kick the resume worker from exec IOCTL, switch to unordered workqueue, destroy it after use (Matt Brost) v3: Do not resume if no exec queue was suspended (Matt Brost) v4: Squash commits (Matt Brost) v5: Do not kick the worker when xe_vm_in_preempt_fence_mode (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-10-francois.dugast@intel.com	2024-08-17 18:31:57 -07:00
Ashutosh Dixit	43a6faa6d9	drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup Increment num_syncs after xe_sync_entry_parse() is successful to ensure the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly. v2: Use the same pattern as that in xe_vm.c (Matt Brost) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com	2024-07-12 07:51:53 -07:00
Francois Dugast	de8390b101	drm/xe/sched_job: Promote xe_sched_job_add_deps() Move it out of the xe_migrate compilation unit so it can be re-used in other places. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240614094433.775866-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-06-14 16:51:43 -04:00
Matthew Brost	4468d0488e	drm/xe: Drop EXEC_QUEUE_FLAG_BANNED Clean up laying violation of setting q->flags EXEC_QUEUE_FLAG_BANNED bit in GuC backend. Move banned to GuC owned bit and report banned status to upper layers via reset_status vfunc. This is a slight change in behavior as reset_status returns true if wedged or killed bits set too, but in all of these cases submission to queue is no longer allowed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240604184700.1946918-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-06-07 12:16:36 -04:00
Thomas Hellström	7ee7dd6f30	drm/xe: Move vma rebinding to the drm_exec locking loop Rebinding might allocate page-table bos, causing evictions. To support blocking locking during these evictions, perform the rebinding in the drm_exec locking loop. Also Reserve fence slots where actually needed rather than trying to predict how many fence slots will be needed over a complete wound-wait transaction. v2: - Remove a leftover call to xe_vm_rebind() (Matt Brost) - Add a helper function xe_vm_validate_rebind() (Matt Brost) v3: - Add comments and squash with previous patch (Matt Brost) Fixes: `24f947d58f` ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects") Fixes: `29f424eb87` ("drm/xe/exec: move fence reservation") Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-5-thomas.hellstrom@linux.intel.com	2024-03-28 08:39:30 +01:00
Thomas Hellström	5a091aff50	drm/xe: Rework rebinding Instead of handling the vm's rebind fence separately, which is error prone if they are not strictly ordered, attach rebind fences as kernel fences to the vm's resv. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-3-thomas.hellstrom@linux.intel.com	2024-03-28 08:39:28 +01:00
Nirmoy Das	5dffaa1bb9	drm/xe: Create a helper function to init job's user fence Refactor xe_sync_entry_signal so it doesn't have to modify xe_sched_job struct instead create a new helper function to set user fence values for a job. v2: Move the sync type check to xe_sched_job_init_user_fence(Lucas) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240321161142.4954-1-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-26 15:40:19 -07:00
José Roberto de Souza	58480c1c91	drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC Doing a XE_EXEC with num_batch_buffer == 0 makes signals passed as argument to be signaled when the last real XE_EXEC is completed. But to do that it was first pinning all VMAs in drm_gpuvm_exec_lock(), this patch remove this pinning as it is not required. This change also help Mesa implementing memory over-commiting recovery as it needs to unbind not needed VMAs when the whole VM can't fit in GPU memory but it can only do the unbiding when the last XE_EXEC is completed. So with this change Mesa can get the signal it want without getting out-of-memory errors. Fixes: `eb9702ad29` ("drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs") Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Co-developed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313171318.121066-1-jose.souza@intel.com	2024-03-13 15:46:26 -07:00
Himal Prasad Ghimiray	023f5c8e90	drm/xe/xe_exec : In xe_exec_ioctl remove deadcode At label err_unlock_list the condition write_label will never be true. Remove the deadcode line for write_label true. Reported by static analyzer. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313150545.2830408-3-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-03-13 16:00:26 -04:00
Matthew Brost	d1df9bfbf6	drm/xe: Only allow 1 ufence per exec / bind IOCTL The way exec ufences are coded only 1 ufence per IOCTL will be signaled. It is possible to fix this but for current use cases 1 ufence per IOCTL is sufficient. Enforce a limit of 1 ufence per IOCTL (both exec and bind to be uniform). v2: - Add fixes tag (Thomas) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Mika Kahola <mika.kahola@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124234413.1640825-1-matthew.brost@intel.com	2024-01-30 15:51:04 -08:00
Lucas De Marchi	be3382ecdf	Merge drm/drm-next into drm-xe-next Sync to v6.8-rc1. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-01-22 15:10:10 -08:00
Matthew Brost	56c253daab	drm/xe: Fix exec IOCTL long running exec queue ring full condition The intent is to return -EWOULDBLOCK to the user if a long running exec queue is full during the exec IOCTL. -EWOULDBLOCK aliases to -EAGAIN which results in the exec IOCTL doing a retry loop. Fix this by ensuring the retry loop is broken when returning -EWOULDBLOCK. Fixes: `8ae8a2e8dd` ("drm/xe: Long running job update") Reported-by: Sai Gowtham Ch <sai.gowtham.ch@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> (cherry picked from commit `97d0047cbb`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-01-15 15:36:37 +01:00
Matthew Brost	97d0047cbb	drm/xe: Fix exec IOCTL long running exec queue ring full condition The intent is to return -EWOULDBLOCK to the user if a long running exec queue is full during the exec IOCTL. -EWOULDBLOCK aliases to -EAGAIN which results in the exec IOCTL doing a retry loop. Fix this by ensuring the retry loop is broken when returning -EWOULDBLOCK. Fixes: `8ae8a2e8dd` ("drm/xe: Long running job update") Reported-by: Sai Gowtham Ch <sai.gowtham.ch@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com>	2024-01-09 06:55:14 -08:00
Matthew Auld	f4e8ab468f	drm/xe/exec: reserve fence slot for CPU bind Looks possible to switch from CPU binding to GPU binding mid exec, and if that happens for the same dma-resv we might use two fence slots, once for the dummy fence, and another for the actual GPU bind. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/698 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2024-01-09 14:27:10 +00:00
Matthew Auld	29f424eb87	drm/xe/exec: move fence reservation We currently assume that we can upfront know exactly how many fence slots we will need at the start of the exec, however the TTM bo_validate can itself consume numerous fence slots, and due to how the dma_resv_reserve_fences() works it only ensures that at least that many fence slots are available. With this it is quite possible that TTM steals some of the fence slots and then when it comes time to do the vma binding and final exec stage we are lacking enough fence slots, leading to some nasty BUG_ON(). A simple fix is to reserve our own fences later, after the validate stage. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/698 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2024-01-09 14:26:58 +00:00
Dave Airlie	d219702902	Merge tag 'drm-xe-next-2023-12-21-pr1-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Introduce a new DRM driver for Intel GPUs Xe, is a new driver for Intel GPUs that supports both integrated and discrete platforms. The experimental support starts with Tiger Lake. i915 will continue be the main production driver for the platforms up to Meteor Lake and Alchemist. Then the goal is to make this Intel Xe driver the primary driver for Lunar Lake and newer platforms. It uses most, if not all, of the key drm concepts, in special: TTM, drm-scheduler, drm-exec, drm-gpuvm/gpuva and others. Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface change]. From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com	2023-12-22 10:36:21 +10:00
Matthew Brost	eb9702ad29	drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs The idea being out-syncs can signal indicating all previous operations on the bind queue are complete. An example use case of this would be support for implementing vkQueueWaitIdle easily. All in-syncs are waited on before signaling out-syncs. This is implemented by forming a composite software fence of in-syncs and installing this fence in the out-syncs and exec queue last fence slot. The last fence must be added as a dependency for jobs on user exec queues as it is possible for the last fence to be a composite software fence (unordered, ioctl with zero bb or binds) rather than hardware fence (ordered, previous job on queue). Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:09 -05:00
Matthew Brost	53bf60f6d8	drm/xe: Use a flags field instead of bools for sync parse Use a flags field instead of severval bools for sync parse as it is easier to read and less bug prone. v2: Pull in header change from subsequent patch Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:09 -05:00
Thomas Hellström	24f947d58f	drm/xe: Use DRM GPUVM helpers for external- and evicted objects Adapt to the DRM_GPUVM helpers moving removing a lot of complicated driver-specific code. For now this uses fine-grained locking for the evict list and external object list, which may incur a slight performance penalty in some situations. v2: - Don't lock all bos and validate on LR exec submissions (Matthew Brost) - Add some kerneldoc Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231212100144.6833-2-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:08 -05:00
Thomas Hellström	fdb6a05383	drm/xe: Internally change the compute_mode and no_dma_fence mode naming The name "compute_mode" can be confusing since compute uses either this mode or fault_mode to achieve the long-running semantics, and compute_mode can, moving forward, enable fault_mode under the hood to work around hardware limitations. Also the name no_dma_fence_mode really refers to what we elsewhere call long-running mode and the mode contrary to what its name suggests allows dma-fences as in-fences. So in an attempt to be more consistent, rename no_dma_fence_mode -> lr_mode compute_mode -> preempt_fence_mode And adjust flags so that preempt_fence_mode sets XE_VM_FLAG_LR_MODE fault_mode sets XE_VM_FLAG_LR_MODE \| XE_VM_FLAG_FAULT_MODE v2: - Fix a typo in the commit message (Oak Zeng) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127123349.23698-1-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:58 -05:00
Matthew Brost	f3e9b1f434	drm/xe: Remove async worker and rework sync binds Async worker is gone. All jobs and memory allocations done in IOCTL to align with dma fencing rules. Async vs. sync now means when do bind operations complete relative to the IOCTL. Async completes when out-syncs signal while sync completes when the IOCTL returns. In-syncs and out-syncs are only allowed in async mode. If memory allocations fail in the job creation step the VM is killed. This is temporary, eventually a proper unwind will be done and VM will be usable. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:17 -05:00
Francois Dugast	c73acc1eeb	drm/xe: Use Xe assert macros instead of XE_WARN_ON macro The XE_WARN_ON macro maps to WARN_ON which is not justified in many cases where only a simple debug check is needed. Replace the use of the XE_WARN_ON macro with the new xe_assert macros which relies on drm_*. This takes a struct drm_device argument, which is one of the main changes in this commit. The other main change is that the condition is reversed, as with XE_WARN_ON a message is displayed if the condition is true, whereas with xe_assert it is if the condition is false. v2: - Rebase - Keep WARN splats in xe_wopcm.c (Matt Roper) v3: - Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:41:08 -05:00
Matthew Brost	30278e2996	drm/xe: Fix fence reservation accouting Both execs and the preempt rebind worker can issue rebinds. Rebinds require a fence, per tile, inserted into dma-resv slots of the VM and BO (if external). The fence reservation accouting did not take into account the number of fences required for rebinds, fix this. v2: Rebase Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by: Christopher Snowhill <kode54@gmail.com> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/518 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:41:07 -05:00
Thomas Hellström	d490ecf577	drm/xe: Rework xe_exec and the VM rebind worker to use the drm_exec helper Replace the calls to ttm_eu_reserve_buffers() by using the drm_exec helper instead. Also make sure the locking loop covers any calls to xe_bo_validate() / ttm_bo_validate() so that these function calls may easily benefit from being called from within an unsealed locking transaction and may thus perform blocking dma_resv locks in the future. For the unlock we remove an assert that the vm->rebind_list is empty when locks are released. Since if the error path is hit with a partly locked list, that assert may no longer hold true we chose to remove it. v3: - Don't accept duplicate bo locks in the rebind worker. v5: - Loop over drm_exec objects in reverse when unlocking. v6: - We can't keep the WW ticket when retrying validation on OOM. Fix. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-5-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:41:07 -05:00

1 2

66 Commits