linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-21 00:04:01 -04:00

Author	SHA1	Message	Date
Thomas Hellström	333b890633	drm/xe/userptr: Unmap userptrs in the mmu notifier If userptr pages are freed after a call to the xe mmu notifier, the device will not be blocked out from theoretically accessing these pages unless they are also unmapped from the iommu, and this violates some aspects of the iommu-imposed security. Ensure that userptrs are unmapped in the mmu notifier to mitigate this. A naive attempt would try to free the sg table, but the sg table itself may be accessed by a concurrent bind operation, so settle for only unmapping. v3: - Update lockdep asserts. - Fix a typo (Matthew Auld) Fixes: `81e058a3e7` ("drm/xe: Introduce helper to populate userptr") Cc: Oak Zeng <oak.zeng@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-4-thomas.hellstrom@linux.intel.com (cherry picked from commit `ba767b9d01`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-03-05 14:25:27 -05:00
Thomas Hellström	84211b1c0d	drm/xe: Fix fault mode invalidation with unbind Fix fault mode invalidation racing with unbind leading to the PTE zapping potentially traversing an invalid page-table tree. Do this by holding the notifier lock across PTE zapping. This might transfer any contention waiting on the notifier seqlock read side to the notifier lock read side, but that shouldn't be a major problem. At the same time get rid of the open-coded invalidation in the bind code by relying on the notifier even when the vma bind is not yet committed. Finally let userptr invalidation call a dedicated xe_vm function performing a full invalidation. Fixes: `e8babb280b` ("drm/xe: Convert multiple bind ops into single job") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-4-thomas.hellstrom@linux.intel.com (cherry picked from commit `100a5b8dad`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-03-05 14:25:03 -05:00
Thomas Hellström	e775e2a060	drm/xe/vm: Validate userptr during gpu vma prefetching If a userptr vma subject to prefetching was already invalidated or invalidated during the prefetch operation, the operation would repeatedly return -EAGAIN which would typically cause an infinite loop. Validate the userptr to ensure this doesn't happen. v2: - Don't fallthrough from UNMAP to PREFETCH (Matthew Brost) Fixes: `5bd24e7882` ("drm/xe/vm: Subclass userptr vmas") Fixes: `617eebb9c4` ("drm/xe: Fix array of binds") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-2-thomas.hellstrom@linux.intel.com (cherry picked from commit `03c346d4d0`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-03-05 14:24:48 -05:00
Matthew Auld	a9f4fa3a7e	drm/xe/userptr: fix EFAULT handling Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal. However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT. This explains an internal user report of hitting: [ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe] [ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738690] Call Trace: [ 191.738692] <TASK> [ 191.738694] ? show_regs+0x69/0x80 [ 191.738698] ? __warn+0x93/0x1a0 [ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738759] ? report_bug+0x18f/0x1a0 [ 191.738764] ? handle_bug+0x63/0xa0 [ 191.738767] ? exc_invalid_op+0x19/0x70 [ 191.738770] ? asm_exc_invalid_op+0x1b/0x20 [ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738834] ? ret_from_fork_asm+0x1a/0x30 [ 191.738849] bind_op_prepare+0x105/0x7b0 [xe] [ 191.738906] ? dma_resv_reserve_fences+0x301/0x380 [ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe] [ 191.738966] ? kmemleak_alloc+0x4b/0x80 [ 191.738973] ops_execute+0x188/0x9d0 [xe] [ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe] [ 191.739098] ? trace_hardirqs_on+0x4d/0x60 [ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe] Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report. v2 (MattB): - Move earlier. v3 (MattB): - Update the commit message to make it clear that this indeed fixes the issue. Fixes: `521db22a1d` ("drm/xe: Invalidate userptr VMA on page pin fault") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-5-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `6b93cb9891`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-02-24 15:39:27 -05:00
Matthew Auld	e043dc16c2	drm/xe/userptr: restore invalidation list on error On error restore anything still on the pin_list back to the invalidation list on error. For the actual pin, so long as the vma is tracked on either list it should get picked up on the next pin, however it looks possible for the vma to get nuked but still be present on this per vm pin_list leading to corruption. An alternative might be then to instead just remove the link when destroying the vma. v2: - Also add some asserts. - Keep the overzealous locking so that we are consistent with the docs; updating the docs and related bits will be done as a follow up. Fixes: `ed2bdf3b26` ("drm/xe/vm: Subclass userptr vmas") Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `4e37e92892`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-02-24 15:39:21 -05:00
Dave Airlie	0dc853865a	Merge tag 'drm-xe-next-2025-01-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - SRIOV VF: Avoid reading inaccessible registers (Jakub, Marcin) - Introduce RPa frequency information (Rodrigo) - Remove unnecessary force wakes on SLPC code (Vinay) - Fix all typos in xe (Nitin) - Adding steering info support for GuC register lists (Jesus) - Remove unused xe_pciids.h harder, add missing PCI ID (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z4E0tlTAA6MZ7PF2@intel.com	2025-01-13 11:07:40 +10:00
Nitin Gote	75fd04f276	drm/xe: Fix all typos in xe Fix all typos in files of xe, reported by codespell tool. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2025-01-09 17:58:09 +01:00
Dave Airlie	bdecb30d57	Merge tag 'drm-xe-next-2024-12-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Make OA buffer size configurable (Sai) Display Changes (including i915): - Fix ttm_bo_access() usage (Auld) - Power request asserting/deasserting for Xe3lpd (Mika) - One Type-C conversion towards struct intel_display (Mika) Driver Changes: - GuC capture related fixes (Everest, Zhanjun) - Move old workaround to OOB infra (Lucas) - Compute mode change refactoring (Bala) - Add ufence and g2h flushes for LNL Hybrid timeouts (Nirmoy) - Avoid unnecessary OOM kills (Thomas) - Restore system memory GGTT mappings (Brost) - Fix build error for XE_IOCTL_DBG macro (Gyeyoung) - Documentation updates and fixes (Lucas, Randy) - A few exec IOCTL fixes (Brost) - Fix potential GGTT allocation leak (Michal) - Fix races on fdinfo (Lucas) - SRIOV VF: Post-migration recovery worker basis (Tomasz) - GuC Communication fixes and improvements (Michal, John, Tomasz, Auld, Jonathan) - SRIOV PF: Add support for VF scheduling priority - Trace improvements (Lucas, Auld, Oak) - Hibernation on igpu fixes and improvements (Auld) - GT oriented logs/asserts improvements (Michal) - Take job list lock in xe_sched_first_pending_job (Nirmoy) - GSC: Improve SW proxy error checking and logging (Daniele) - GuC crash notifications & drop default log verbosity (John) - Fix races on fdinfo (Lucas) - Fix runtime_pm handling in OA (Ashutosh) - Allow fault injection in vm create and vm bind IOCTLs (Francois) - TLB invalidation fixes (Nirmoy, Daniele) - Devcoredump Improvements, doc and fixes (Brost, Lucas, Zhanjun, John) - Wake up waiters after setting ufence->signalled (Nirmoy) - Mark preempt fence workqueue as reclaim (Brost) - Trivial header/flags cleanups (Lucas) - VRAM drop 2G block restriction (Auld) - Drop useless d3cold allowed message (Brost) - SRIOV PF: Drop 2GiB limit of fair LMEM allocation (Michal) - Add another PTL PCI ID (Atwood) - Allow bo mapping on multiple ggtts (Niranjana) - Add support for GuC-to-GuC communication (John) - Update xe2_graphics name string (Roper) - VRAM: fix lpfn check (Auld) - Ad Xe3 workaround (Apoorva) - Migrate fixes (Auld) - Fix non-contiguous VRAM BO access (Brost) - Log throttle reasons (Raag) - Enable PMT support for BMG (Michael) - IRQ related fixes and improvements (Ilia) - Avoid evicting object of the same vm in none fault mode (Oak) - Fix in tests (Nirmoy) - Fix ERR_PTR handling (Mirsad) - Some reg_sr/whitelist fixes and refactors (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmdaHkMACgkQ+mJfZA7r # E8o+twf/XYZTk4O3qQ+yNL3PDQT0NIKjH8mEnmu4udyIw/sYhQe6ji+uh1YutK8Y # 41IQc06qQogTj36bqSwbjThw5asMfRh2sNR/p1uOy7RGUnN25FuYSXEgOeDWi/Ec # xrZE1TKPotFGeGI09KJmzjzMq94cgv97Pxma+5m8BjVsvzXQSzEJ2r9cC6ruSfNT # O5Jq5nqxHSkWUbKCxPnixSlGnH4jbsuiqS1E1pnH+u6ijxsfhOJj686wLn2FRkiw # 6FhXmJBrd8AZ0Q2E7h3UswE5O88I0ALDc58OINAzD1GMyzvZj2vB1pXgj5uNr0/x # Ku4cxu1jprsi+FLUdKAdYpxRBRanow== # =3Ou7 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 12 Dec 2024 09:20:35 AEST # gpg: using RSA key 6D207068EEDD65091C2CE2A3FA625F640EEB13CA # gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>" [unknown] # gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z1ofx-fExLQKV_e4@intel.com	2024-12-13 10:19:44 +10:00
Oak Zeng	774b5fa509	drm/xe: Avoid evicting object of the same vm in none fault mode BO validation during vm_bind could trigger memory eviction when system runs under memory pressure. Right now we blindly evict BOs of all VMs. This scheme has a problem when system runs in none recoverable page fault mode: even though the vm_bind could be successful by evicting BOs, the later the rebinding of the evicted BOs would fail. So it is better to report an out-of- memory failure at vm_bind time than at time of rebinding where xekmd currently doesn't have a good mechanism to report error to user space. This patch implemented a scheme to only evict objects of other VMs during vm_bind time. Object of the same VM will skip eviction. If we failed to find enough memory for vm_bind, we report error to user space at vm_bind time. This scheme is not needed for recoverable page fault mode under what we can dynamically fault-in pages on demand. v1: Use xe_vm_in_preempt_fence_mode instead of stack variable (Thomas) Signed-off-by: Oak Zeng <oak.zeng@intel.com> Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241203021929.1919730-1-oak.zeng@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-12-06 10:54:34 -05:00
Matthew Auld	125a66a572	drm/xe/display: fix ttm_bo_access() usage ttm_bo_access() returns the size on success, account for that otherwise the caller incorrectly thinks this is an error in intel_atomic_prepare_plane_clear_colors(). v2 (Thomas) - Make sure we check for the partial copy case. Also since this api is easy to get wrong, wrap the whole thing in a new helper to hide the details and then convert the existing users over. Fixes: `b6308aaa24` ("drm/xe/display: Update intel_bo_read_from_page to use ttm_bo_access") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3661 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241202170102.88893-2-matthew.auld@intel.com	2024-12-04 10:28:33 +00:00
Matthew Brost	5f7bec831f	drm/xe: Use ttm_bo_access in xe_vm_snapshot_capture_delayed Non-contiguous mapping of BO in VRAM doesn't work, use ttm_bo_access instead. v2: - Fix error handling Fixes: `0eb2a18a8f` ("drm/xe: Implement VM snapshot support for BO's and userptr") Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126174615.2665852-7-matthew.brost@intel.com	2024-11-27 16:38:54 -08:00
Christian König	0811cc0baf	drm/xe: drop unused component dependencies XE switched over to drm_exec quite some time ago. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241114153020.6209-7-christian.koenig@amd.com	2024-11-19 14:12:02 +01:00
Francois Dugast	9d42476f71	drm/xe: Allow fault injection in vm create and vm bind IOCTLs Use fault injection infrastructure to allow specific functions to be configured over debugfs for failing during the execution of xe_vm_create_ioctl() and xe_vm_bind_ioctl(). This allows more thorough testing from user space by going through code paths for error handling and unwinding which cannot be reached by simply injecting errors in IOCTL arguments. This can help increase code robustness. v2: Add xe_pt_update_ops_{prepare,run} (Matthew Brost) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241113162212.2154103-1-francois.dugast@intel.com Signed-off-by: Francois Dugast <francois.dugast@intel.com>	2024-11-14 11:48:13 +01:00
Thomas Hellström	1a7b71805a	drm/xe: Don't unnecessarily invoke the OOM killer on multiple binds Multiple single-ioctl binds can be split up into multiple bind ioctls, reducing the memory required to hold the bind array. So rather than allowing the OOM killer to be invoked, return -ENOMEM or -ENOBUFS to user-space to take corrective action. v2: - Add __GFP_NOWARN to __GFP_RETRY_MAYFAIL allocations to avoid spamming the kernel log if a recoverable allocation fails. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2701 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031153732.164995-3-thomas.hellstrom@linux.intel.com	2024-11-01 16:59:48 +01:00
Matthew Auld	cfcbc0520d	drm/xe: fix unbalanced rpm put() with fence_fini() Currently we can call fence_fini() twice if something goes wrong when sending the GuC CT for the tlb request, since we signal the fence and return an error, leading to the caller also calling fini() on the error path in the case of stack version of the flow, which leads to an extra rpm put() which might later cause device to enter suspend when it shouldn't. It looks like we can just drop the fini() call since the fence signaller side will already call this for us. There are known mysterious splats with device going to sleep even with an rpm ref, and this could be one candidate. v2 (Matt B): - Prefer warning if we detect double fini() Fixes: `0a382f9bc5` ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-3-matthew.auld@intel.com	2024-10-10 09:15:58 +01:00
Matthew Auld	dcfd397132	drm/xe/vm: move xa_alloc to prevent UAF Evil user can guess the next id of the vm before the ioctl completes and then call vm destroy ioctl to trigger UAF since create ioctl is still referencing the same vm. Move the xa_alloc all the way to the end to prevent this. v2: - Rebase Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925071426.144015-3-matthew.auld@intel.com	2024-09-27 09:28:58 +01:00
Matthew Brost	fe4f5d4b66	drm/xe: Clean up VM / exec queue file lock usage. Both the VM / exec queue file lock protect the lookup and reference to the object, nothing more. These locks are not intended anything else underneath them. XA have their own locking too, so no need to take the VM / exec queue file lock aside from when doing a lookup and reference get. Add some kernel doc to make this clear and cleanup a few typos too. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240921011712.2681510-1-matthew.brost@intel.com	2024-09-24 09:03:42 -07:00
Matthew Brost	1378c633a3	drm/xe: Convert to USM lock to rwsem Remove contention from GPU fault path for ASID->VM lookup. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240918054436.1971839-1-matthew.brost@intel.com	2024-09-18 08:46:24 -07:00
Dave Airlie	2ef8d63da8	Merge tag 'drm-xe-next-2024-09-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Cross-subsystem Changes: - Split dma fence array creation into alloc and arm (Matthew Brost) Driver Changes: - Move kernel_lrc to execlist backend (Ilia) - Fix type width for pcode coommand (Karthik) - Make xe_drm.h include unambiguous (Jani) - Fixes and debug improvements for GSC load (Daniele) - Track resources and VF state by PF (Michal Wajdeczko) - Fix memory leak on error path (Nirmoy) - Cleanup header includes (Matt Roper) - Move pcode logic to tile scope (Matt Roper) - Move hwmon logic to device scope (Matt Roper) - Fix media TLB invalidation (Matthew Brost) - Threshold config fixes for PF (Michal Wajdeczko) - Remove extra "[drm]" from logs (Michal Wajdeczko) - Add missing runtime ref (Rodrigo Vivi) - Fix circular locking on runtime suspend (Rodrigo Vivi) - Fix rpm in TTM swapout path (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/eirx5vdvoflbbqlrzi5cip6bpu3zjojm2pxseufu3rlq4pp6xv@eytjvhizfyu6	2024-09-10 13:18:00 +10:00
Dave Airlie	6d0ebb3904	Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Cross-driver (xe-core) Changes: - Require BMG scanout buffers to be 64k physically aligned (Maarten) Core (drm) Changes: - Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka) Driver Changes: - General cleanup and more work moving towards intel_display isolation (Jani) - New display workaround (Suraj) - Use correct cp_irq_count on HDCP (Suraj) - eDP PSR fix when CRC is enabled (Jouni) - Fix DP MST state after a sink reset (Imre) - Fix Arrow Lake GSC firmware version (John) - Use chained DSBs for LUT programming (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZtCC0lJ0Zf3MoSdW@intel.com	2024-08-30 13:41:32 +10:00
Dave Airlie	8bdb468dd7	Merge tag 'drm-xe-next-2024-08-28' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unsused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec_queue_kill for PXP handling (Daniele) - Fix devm/drmm issues (Daniele, Matthew Brost) - Fix tile and ggtt fini sequences (Matthew Brost) - Fix crashes when probing without firmware in place (Daniele, Matthew Brost) - Use xe_managed for kernel BOs (Daniele, Matthew Brost) - Future-proof dss_per_group calculation by using hwconfig (Matt Roper) - Use reserved copy engine for user binds on faulting devices (Matthew Brost) - Allow mixing dma-fence jobs and long-running faulting jobs (Francois) - Cleanup redundant arg when creating use BO (Nirmoy) - Prevent UAF around preempt fence (Auld) - Fix display suspend/resume (Maarten) - Use vma_pages() helper (Thorsten) - Calculate pagefault queue size (Stuart, Matthew Auld) - Fix missing pagefault wq destroy (Stuart) - Fix lifetime handling of HW fence ctx (Matthew Brost) - Fix order destroy order for jobs (Matthew Brost) - Fix TLB invalidation for media GT (Matthew Brost) - Document GGTT (Rodrigo Vivi) - Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi) - Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod) - Drop unrequired NULL checks (Apoorva, Himal) - Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström) - Support "nomodeset" kernel command-line option (Thomas Zimmermann) - Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/wd42jsh4i3q5zlrmi2cljejohdsrqc6hvtxf76lbxsp3ibrgmz@y54fa7wwxgsd	2024-08-30 13:41:06 +10:00
Jani Nikula	87d8ecf015	drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h> include/drm/xe_drm.h does not exist. Prefer the explicit uapi include. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-28 15:17:54 -04:00
Maarten Lankhorst	c66f4711f7	drm/xe: Align all VRAM scanout buffers to 64k physical pages when needed. For CCS formats on affected platforms, CCS can be used freely, but display engine requires a multiple of 64k physical pages. No other changes are needed. At the BO creation time we don't know if the BO will be used for CCS or not. If the scanout flag is set, and the BO is a multiple of 64k, we take the safe route and force the physical alignment of 64k pages. If the BO is not a multiple of 64k, or the scanout flag was not set at BO creation, we reject it for usage as CCS in display. The physical pages are likely not aligned correctly, and this will cause corruption when used as FB. The scanout flag and size being a multiple of 64k are used together to enforce 64k physical placement. VM_BIND is completely unaffected, mappings to a VM can still be aligned to 4k, just like for normal buffers. Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240826170117.327709-3-maarten.lankhorst@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-27 18:15:20 -04:00
Daniel Vetter	4461e9e5c3	Merge v6.11-rc5 into drm-next amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might as well catch up with a backmerge and handle them all. Plus both misc and intel maintainers asked for a backmerge anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2024-08-27 14:09:45 +02:00
Matthew Brost	77cc3f6c58	drm/xe: Invalidate media_gt TLBs Testing on LNL has shown media TLBs need to be invalidated via the GuC, update xe_vm_invalidate_vma appropriately. v2: Fix 2 tile case v3: Include missing local change Fixes: `3330361543` ("drm/xe/lnl: Add LNL platform definition") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820160129.986889-1-matthew.brost@intel.com	2024-08-21 08:53:50 -07:00
Francois Dugast	4099cfda9d	drm/xe/device: Remove unused xe_device::usm::num_vm_in_* Those counters were used to keep track of the numbers VMs in fault mode and in non-fault mode, to determine if the whole device was in fault mode or not. This is no longer needed so remove those variables and their usages. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-12-francois.dugast@intel.com	2024-08-17 18:31:59 -07:00
Francois Dugast	226d92e49a	drm/xe/vm: Remove restriction that all VMs must be faulting if one is With this restriction, all VMs on the device must be faulting VMs if there is already one faulting VM, in which case the device is considered in fault mode. This prevents for example an application from running 3D jobs for the compositor while submitting a SVM compute job on the same device. Now that mutual exclusion of faulting LR jobs and dma fence jobs is ensured on the hw engine group, remove this restriction to allow running faulting and non-faulting VMs on the same device. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-11-francois.dugast@intel.com	2024-08-17 18:31:58 -07:00
Matthew Brost	852856e3b6	drm/xe: Use reserved copy engine for user binds on faulting devices User binds map to engines with can fault, faults depend on user binds completion, thus we can deadlock. Avoid this by using reserved copy engine for user binds on faulting devices. While we are here, normalize bind queue creation with a helper. v2: - Pass in extensions to bind queue creation (CI) v3: - s/resevered/reserved (Lucas) - Fix NULL hwe check (Jonathan) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com	2024-08-17 14:12:19 -07:00
Lucas De Marchi	ed7171ff9f	Merge drm/drm-next into drm-xe-next Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for the display side. This resolves the current conflict for the enable_display module parameter and allows further pending refactors. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-16 10:33:54 -07:00
Daniele Ceraolo Spurio	b0ee81dac3	drm/xe: Make exec_queue_kill safe to call twice An upcoming PXP patch will kill queues at runtime when a PXP invalidation event occurs, so we need exec_queue_kill to be safe to call multiple times. v2: Add documentation (Matt B) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814205654.1716586-1-daniele.ceraolospurio@intel.com	2024-08-15 13:08:31 -07:00
Matthew Brost	f002702290	drm/xe: Hold a PM ref when GT TLB invalidations are inflight Avoid GT TLB invalidation timeouts by holding a PM ref when invalidations are inflight. v2: - Drop PM ref before signaling fence (CI) v3: - Move invalidation_fence_signal helper in tlb timeout to previous patch (Matthew Auld) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-4-matthew.brost@intel.com (cherry picked from commit `0a382f9bc5`) Requires: 46209ce5287b ("drm/xe: Add xe_gt_tlb_invalidation_fence_init helper") Requires: 0e414ab036e0 ("drm/xe: Drop xe_gt_tlb_invalidation_wait") Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-15 09:44:22 -04:00
Matthew Brost	58bfe66744	drm/xe: Drop xe_gt_tlb_invalidation_wait Having two methods to wait on GT TLB invalidations is not ideal. Remove xe_gt_tlb_invalidation_wait and only use GT TLB invalidation fences. In addition to two methods being less than ideal, once GT TLB invalidations are coalesced the seqno cannot be assigned during xe_gt_tlb_invalidation_ggtt/range. Thus xe_gt_tlb_invalidation_wait would not have a seqno to wait one. A fence however can be armed and later signaled. v3: - Add explaination about coalescing to commit message v4: - Don't put dma fence if defined on stack (CI) v5: - Initialize ret to zero (CI) v6: - Use invalidation_fence_signal helper in tlb timeout (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-3-matthew.brost@intel.com (cherry picked from commit `61ac035361`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-15 09:44:22 -04:00
Umesh Nerlige Ramappa	6309f9b1fc	drm/xe: Take a ref to xe file when user creates a VM Take a reference to xef when user creates the VM and put the reference when user destroys the VM. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-4-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `a2387e6949`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-15 09:44:21 -04:00
Himal Prasad Ghimiray	4b498d1961	drm/xe: Remove unused xe parameter Remove the xe parameter from the pde_encode_pat_index and pte_encode_pat_index functions, as it is no longer used. Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240813104419.2958046-1-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-13 11:05:07 -07:00
Daniel Vetter	91dae758bd	Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK intterupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box	2024-08-08 18:58:46 +02:00
Thomas Zimmermann	0e8655b4e8	Merge drm/drm-next into drm-misc-next Backmerging to get a late RC of v6.10 before moving into v6.11. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-07-29 09:35:54 +02:00
Matthew Brost	c8a31ff619	drm/xe: Return -ENOBUFS if a kmalloc fails which is tied to an array of binds The size of an array of binds is directly tied to several kmalloc in the KMD, thus making these kmalloc more likely to fail. Return -ENOBUFS in the case of these failures. The expected UMD behavior upon returning -ENOBUFS is to split an array of binds into a series of single binds. v2: - Resend for CI v3: - Resend for CI Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240723011702.1684013-1-matthew.brost@intel.com	2024-07-23 08:03:57 -07:00
Matthew Brost	0a382f9bc5	drm/xe: Hold a PM ref when GT TLB invalidations are inflight Avoid GT TLB invalidation timeouts by holding a PM ref when invalidations are inflight. v2: - Drop PM ref before signaling fence (CI) v3: - Move invalidation_fence_signal helper in tlb timeout to previous patch (Matthew Auld) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-4-matthew.brost@intel.com	2024-07-19 19:45:32 -07:00
Matthew Brost	61ac035361	drm/xe: Drop xe_gt_tlb_invalidation_wait Having two methods to wait on GT TLB invalidations is not ideal. Remove xe_gt_tlb_invalidation_wait and only use GT TLB invalidation fences. In addition to two methods being less than ideal, once GT TLB invalidations are coalesced the seqno cannot be assigned during xe_gt_tlb_invalidation_ggtt/range. Thus xe_gt_tlb_invalidation_wait would not have a seqno to wait one. A fence however can be armed and later signaled. v3: - Add explaination about coalescing to commit message v4: - Don't put dma fence if defined on stack (CI) v5: - Initialize ret to zero (CI) v6: - Use invalidation_fence_signal helper in tlb timeout (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-3-matthew.brost@intel.com	2024-07-19 19:45:31 -07:00
Umesh Nerlige Ramappa	a2387e6949	drm/xe: Take a ref to xe file when user creates a VM Take a reference to xef when user creates the VM and put the reference when user destroys the VM. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-4-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-07-19 00:32:36 -07:00
Thomas Hellström	4c44f89c5d	drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist moves To address the problem with hitches moving when bulk move sublists are lru-bumped, register the list cursors with the ttm_lru_bulk_move structure when traversing its list, and when lru-bumping the list, move the cursor hitch to the tail. This also means it's mandatory for drivers to call ttm_lru_bulk_move_init() and ttm_lru_bulk_move_fini() when initializing and finalizing the bulk move structure, so add those calls to the amdgpu- and xe driver. Compared to v1 this is slightly more code but less fragile and hopefully easier to understand. Changes in previous series: - Completely rework the functionality - Avoid a NULL pointer dereference assigning manager->mem_type - Remove some leftover code causing build problems v2: - For hitch bulk tail moves, store the mem_type in the cursor instead of with the manager. v3: - Remove leftover mem_type member from change in v2. v6: - Add some lockdep asserts (Matthew Brost) - Avoid NULL pointer dereference (Matthew Brost) - No need to check bo->resource before dereferencing bo->bulk_move (Matthew Brost) Cc: Christian König <christian.koenig@amd.com> Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <dri-devel@lists.freedesktop.org> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-5-thomas.hellstrom@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2024-07-09 12:39:33 +02:00
Matthew Brost	04e9c0ce19	drm/xe: Add VM bind IOCTL error injection Add VM bind IOCTL error injection which steals MSB of the bind flags field which if set injects errors at various points in the VM bind IOCTL. Intended to validate error paths. Enabled by CONFIG_DRM_XE_DEBUG. v4: - Change define layout (Jonathan) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-8-matthew.brost@intel.com	2024-07-03 22:28:07 -07:00
Matthew Brost	282e6f846d	drm/xe: Update VM trace events The trace events have changed moving to a single job per VM bind IOCTL, update the trace events align with old behavior as much as possible. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-6-matthew.brost@intel.com	2024-07-03 22:28:06 -07:00
Matthew Brost	e8babb280b	drm/xe: Convert multiple bind ops into single job This aligns with the uAPI of an array of binds or single bind that results in multiple GPUVA ops to be considered a single atomic operations. The design is roughly: - xe_vma_ops is a list of xe_vma_op (GPUVA op) - each xe_vma_op resolves to 0-3 PT ops - xe_vma_ops creates a single job - if at any point during binding a failure occurs, xe_vma_ops contains the information necessary unwind the PT and VMA (GPUVA) state v2: - add missing dma-resv slot reservation (CI, testing) v4: - Fix TLB invalidation (Paulo) - Add missing xe_sched_job_last_fence_add/test_dep check (Inspection) v5: - Invert i, j usage (Matthew Auld) - Add helper to test and add job dep (Matthew Auld) - Return on anything but -ETIME for cpu bind (Matthew Auld) - Return -ENOBUFS if suballoc of BB fails due to size (Matthew Auld) - s/do/Do (Matthew Auld) - Add missing comma (Matthew Auld) - Do not assign return value to xe_range_fence_insert (Matthew Auld) v6: - s/0x1ff/MAX_PTE_PER_SDI (Matthew Auld, CI) - Check to large of SA in Xe to avoid triggering WARN (Matthew Auld) - Fix checkpatch issues v7: - Rebase - Support more than 510 PTEs updates in a bind job (Paulo, mesa testing) v8: - Rebase Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-5-matthew.brost@intel.com	2024-07-03 22:28:04 -07:00
Matthew Brost	2e524668c4	drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops Each xe_vma_op resolves to 0-3 pt_ops. Add storage for the pt_ops to xe_vma_ops which is dynamically allocated based the number and types of xe_vma_op in the xe_vma_ops list. Allocation only implemented in this patch. This will help with converting xe_vma_ops (multiple xe_vma_op) in a atomic update unit. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-3-matthew.brost@intel.com	2024-07-03 22:27:00 -07:00
Matthew Brost	627c961d67	drm/xe: Add timeout to preempt fences To adhere to dma fencing rules that fences must signal within a reasonable amount of time, add a 5 second timeout to preempt fences. If this timeout occurs, kill the associated VM as this fatal to the VM. v2: - Add comment for smp_wmb (Checkpatch) - Fix kernel doc typo (Inspection) - Add comment for killed check (Niranjana) v3: - Drop smp_wmb (Matthew Auld) - Don't take vm->lock in preempt fence worker (Matthew Auld) - Drop RB given changes to patch v4: - Add WRITE/READ_ONCE (Niranjana) - Don't export xe_vm_kill (Niranjana) Cc: Matthew Auld <matthew.auld@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626004137.4060806-1-matthew.brost@intel.com	2024-07-03 15:27:50 -07:00
Matthew Brost	33991ae8f4	drm/xe: Simplify locking in new_vma Rather than acquiring and dropping the VM / BO dma-resv around xe_vma_create and do the same thing upon adding preempt fences or an error, hold these locks through the entire new_vma() function. v2: - Rebase (CI) Cc: Fei Yang <fei.yang@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618003859.3239239-1-matthew.brost@intel.com	2024-06-20 15:33:48 -07:00
Francois Dugast	731e46c032	drm/xe/exec_queue: Rename xe_exec_queue::compute to xe_exec_queue::lr The properties of this struct are used in long running context so make that clear by renaming it to lr, in alignment with the rest of the code. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240613170348.723245-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-06-17 16:46:52 -04:00
Radhakrishna Sripada	e46d3f813a	drm/xe/trace: Extract bo, vm, vma traces xe_trace.h is starting to get over crowded. Move the traces related to bo, vm, vma's to its own file. v2: Update year in License(Gustavo) Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Suggested-by: Jani Nikula <jani.nikula@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240607182943.3572524-2-radhakrishna.sripada@intel.com	2024-06-12 09:25:07 -07:00
Thorsten Blum	b3181f4332	drm/xe/vm: Simplify if condition The if condition !A \|\| A && B can be simplified to !A \|\| B. Fixes the following Coccinelle/coccicheck warning reported by excluded_middle.cocci: WARNING !A \|\| A && B is equivalent to !A \|\| B Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240603180005.191578-1-thorsten.blum@toblux.com	2024-06-05 09:35:53 -07:00

1 2 3 4 5 ...

264 Commits