Commit Graph

123 Commits

Author SHA1 Message Date
Moti Haimovski
d186e51b0e drm/xe/vm: bugfix in xe_vm_create_ioctl
Fix xe_vm_create_ioctl routine not freeing the vm-id allocated to it
when the function fails.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122102424.4008095-1-mhaimovski@habana.ai
(cherry picked from commit f6bf0424ca)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-24 11:13:41 +01:00
Dan Carpenter
ec32f4f1be drm/xe: unlock on error path in xe_vm_add_compute_exec_queue()
Drop the "&vm->lock" before returning.

Fixes: 24f947d58f ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
(cherry picked from commit cf46019e85)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:36:59 +01:00
Thomas Hellström
457f443983 drm/xe/vm: Fix an error path
If using the VM_BIND_OP_UNMAP_ALL without any bound vmas for the
vm, we will end up dereferencing an uninitialized variable and leak a
bo lock. Fix this.

v2:
- Updated commit message (Lucas De Marchi)

Reported-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Closes: https://lore.kernel.org/intel-xe/jrwua7ckbiozfcaodx4gg2h4taiuxs53j5zlpf3qzvyhyiyl2d@pbs3plurokrj/
Suggested-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Fixes: b06d47be7c ("drm/xe: Port Xe to GPUVA")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222175904.16732-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 9d0c1c5618)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:36:39 +01:00
Matthew Brost
75cbe49f9e drm/xe: Fix UBSAN splat in add_preempt_fences()
add_preempt_fences() calls dma_resv_reserve_fences() with num_fences ==
0 resulting in the below UBSAN splat. Short circuit add_preempt_fences()
if num_fences == 0.

[   58.652241] ================================================================================
[   58.660736] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[   58.667281] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[   58.674539] CPU: 2 PID: 1170 Comm: xe_gpgpu_fill Not tainted 6.6.0-rc3-guc+ #630
[   58.674545] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.3243.A01.2006102133 06/10/2020
[   58.674547] Call Trace:
[   58.674548]  <TASK>
[   58.674550]  dump_stack_lvl+0x92/0xb0
[   58.674555]  __ubsan_handle_shift_out_of_bounds+0x15a/0x300
[   58.674559]  ? rcu_is_watching+0x12/0x60
[   58.674564]  ? software_resume+0x141/0x210
[   58.674575]  ? new_vma+0x44b/0x600 [xe]
[   58.674606]  dma_resv_reserve_fences.cold+0x40/0x66
[   58.674612]  new_vma+0x4b3/0x600 [xe]
[   58.674638]  xe_vm_bind_ioctl+0xffd/0x1e00 [xe]
[   58.674663]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   58.674680]  drm_ioctl_kernel+0xc1/0x170
[   58.674686]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   58.674703]  drm_ioctl+0x247/0x4c0
[   58.674709]  ? find_held_lock+0x2b/0x80
[   58.674716]  __x64_sys_ioctl+0x8c/0xb0
[   58.674720]  do_syscall_64+0x3c/0x90
[   58.674723]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[   58.674727] RIP: 0033:0x7fce4bd1aaff
[   58.674730] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[   58.674731] RSP: 002b:00007ffc57434050 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   58.674734] RAX: ffffffffffffffda RBX: 00007ffc574340e0 RCX: 00007fce4bd1aaff
[   58.674736] RDX: 00007ffc574340e0 RSI: 0000000040886445 RDI: 0000000000000003
[   58.674737] RBP: 0000000040886445 R08: 0000000000000002 R09: 00007ffc574341b0
[   58.674739] R10: 000055de43eb3780 R11: 0000000000000246 R12: 00007ffc574340e0
[   58.674740] R13: 0000000000000003 R14: 00007ffc574341b0 R15: 0000000000000001
[   58.674747]  </TASK>
[   58.674748] ================================================================================

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231215230203.719244-1-matthew.brost@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-26 12:51:41 -05:00
Dave Airlie
d219702902 Merge tag 'drm-xe-next-2023-12-21-pr1-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Introduce a new DRM driver for Intel GPUs

Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms. The experimental support starts with Tiger Lake.
i915 will continue be the main production driver for the platforms
up to Meteor Lake and Alchemist. Then the goal is to make this Intel
Xe driver the primary driver for Lunar Lake and newer platforms.

It uses most, if not all, of the key drm concepts, in special: TTM,
drm-scheduler, drm-exec, drm-gpuvm/gpuva and others.

Signed-off-by: Dave Airlie <airlied@redhat.com>

[airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface
change].

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com
2023-12-22 10:36:21 +10:00
Matthew Brost
d3d767396a drm/xe/uapi: Remove sync binds
Remove concept of async vs sync VM bind queues, rather make all binds
async.

The following bits have dropped from the uAPI:
DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC
DRM_XE_ENGINE_CLASS_VM_BIND_SYNC
DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT
DRM_XE_VM_BIND_FLAG_ASYNC

To implement sync binds the UMD is expected to use the out-fence
interface.

v2: Send correct version
v3: Drop drm_xe_syncs

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:59 -05:00
Matthew Brost
eb9702ad29 drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs
The idea being out-syncs can signal indicating all previous operations
on the bind queue are complete. An example use case of this would be
support for implementing vkQueueWaitIdle easily.

All in-syncs are waited on before signaling out-syncs. This is
implemented by forming a composite software fence of in-syncs and
installing this fence in the out-syncs and exec queue last fence slot.

The last fence must be added as a dependency for jobs on user exec
queues as it is possible for the last fence to be a composite software
fence (unordered, ioctl with zero bb or binds) rather than hardware
fence (ordered, previous job on queue).

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:09 -05:00
Matthew Brost
53bf60f6d8 drm/xe: Use a flags field instead of bools for sync parse
Use a flags field instead of severval bools for sync parse as it is
easier to read and less bug prone.

v2: Pull in header change from subsequent patch

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:09 -05:00
Matthew Brost
3b97e3b265 drm/xe: Use a flags field instead of bools for VMA create
Use a flags field instead of severval bools for VMA create as it is
easier to read and less bug prone.

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:09 -05:00
Thomas Hellström
35705e32b1 drm/xe: Use DRM_GPUVM_RESV_PROTECTED for gpuvm
Use DRM_GPUVM_RESV_PROTECTED to use corse-grained locking for the
evict and external object list.
Since we are already holding the relevant RESV locks, for now at least,
we don't need the fine-grained locking.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231212100144.6833-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:08 -05:00
Thomas Hellström
24f947d58f drm/xe: Use DRM GPUVM helpers for external- and evicted objects
Adapt to the DRM_GPUVM helpers moving removing a lot of complicated
driver-specific code.

For now this uses fine-grained locking for the evict list and external
object list, which may incur a slight performance penalty in some
situations.

v2:
- Don't lock all bos and validate on LR exec submissions (Matthew Brost)
- Add some kerneldoc

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231212100144.6833-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:08 -05:00
Thomas Hellström
06951c2ee7 drm/xe: Use NULL PTEs as scratch PTEs
Currently scratch PTEs are write-enabled and points to a single scratch
page. This has the side effect that buggy applications with out-of-bounds
memory accesses may not notice the bad access since what's written may
be read back.

Instead use NULL PTEs as scratch PTEs. These always return 0 when reading,
and writing has no effect. As a slight benefit, we can also use huge NULL
PTEs.

One drawback pointed out is that debugging may be hampered since previously
when inspecting the content of the scratch page, it might be possible to
detect writes to out-of-bound addresses and possibly also
from where the out-of-bounds address originated. However since the scratch
page-table structure is kept, it will be easy to add back the single
RW-enabled scratch page under a debug define if needed.

Also update the kerneldoc accordingly and move the function to create the
scratch page-tables from xe_pt.c to xe_pt.h since it is accessing
vm structure internals and this also makes it possible to make it static.

v2:
- Don't try to encode scratch PTEs larger than 1GiB.
- Move xe_pt_create_scratch(), Update kerneldoc.
v3:
- Rebase.

Cc: Brian Welty <brian.welty@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> #for general direction.
Reviewed-by: Brian Welty <brian.welty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231209151843.7903-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:28 -05:00
Thomas Hellström
e84d716dd4 drm/xe: Restrict huge PTEs to 1GiB
Add a define for the highest level for which we can encode a huge PTE,
and use it for page-table building. Also update an assert that checks that
we don't try to encode for larger sizes.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231209151843.7903-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:28 -05:00
Mika Kuoppala
06d5ae9057 drm/xe/vm: Avoid asid lookup if none allocated
The destroy path can and will get called for incomplete
vm objects on error paths, where the asid is not yet allocated.
This leads to lookup fail and assert triggered.

Fix this by not asserting of asid existence if vm never
got assigned one.

Cc: Ohad Sharabi <osharabi@habana.ai>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:28 -05:00
Lucas De Marchi
5a92da34dd drm/xe: Rename info.supports_* to info.has_*
Rename supports_mmio_ext and supports_usm to use a has_ prefix so the
flags are grouped together. This settles on just one variant for
positive info matching ("has_") and one for negative ("skip_").

Also make sure the has_* flags are grouped together in xe_pci.c.

Reviewed-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231205145235.2114761-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:27 -05:00
Thomas Hellström
9329f06672 drm/xe/uapi: Use LR abbrev for long-running vms
Currently we're using "compute mode" for long running VMs using
preempt-fences for memory management, and "fault mode" for long
running VMs using page faults.

Change this to use the terminology "long-running" abbreviated as LR for
long-running VMs. These VMs can then either be in preempt-fence mode or
fault mode. The user can force fault mode at creation time, but otherwise
the driver can choose to use fault- or preempt-fence mode for long-running
vms depending on the device capabilities. Initially unless fault-mode is
specified, the driver uses preempt-fence mode.

v2:
- Fix commit message wording and the documentation around
  CREATE_FLAG_LR_MODE and CREATE_FLAG_FAULT_MODE

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:20 -05:00
Rodrigo Vivi
7a56bd0cfb drm/xe/uapi: Fix various struct padding for 64b alignment
Let's respect Documentation/process/botching-up-ioctls.rst
and add the proper padding for a 64b alignment with all as
well as all the required checks and settings for the pads
and the reserved entries.

v2: Fix remaining holes and double check with pahole (Jose)
    Ensure with pahole that both 32b and 64b have exact same
    layout (Thomas)
    Do not set query's pad and reserved bits to zero since it
    is redundant and already done by kzalloc (Matt)

v3: Fix alignment after rebase (José Roberto de Souza)

v4: Fix pad check (Francois Dugast)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:45:19 -05:00
Rodrigo Vivi
cad4a0d6af drm/xe/uapi: Kill tile_mask
It is currently unused, so by the rules it cannot go upstream.
Also there was the desire to convert that to align with the
engine_class_instance selection, but the consensus on that one
is to remain with the global gt_id. So we are keeping the gt_id
there, not converting to a generic sched_group and also killing
this tile_mask and only using the default behavior of 0 that is
to create a mapping / page_table entry on every tile, similar
to what i915.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:45:19 -05:00
Matthew Auld
e1fbc4f18d drm/xe/uapi: support pat_index selection with vm_bind
Allow userspace to directly control the pat_index for a given vm
binding. This should allow directly controlling the coherency, caching
behaviour, compression and potentially other stuff in the future for the
ppGTT binding.

The exact meaning behind the pat_index is very platform specific (see
BSpec or PRMs) but effectively maps to some predefined memory
attributes. From the KMD pov we only care about the coherency that is
provided by the pat_index, which falls into either NONE, 1WAY or 2WAY.
The vm_bind coherency mode for the given pat_index needs to be at least
1way coherent when using cpu_caching with DRM_XE_GEM_CPU_CACHING_WB. For
platforms that lack the explicit coherency mode attribute, we treat
UC/WT/WC as NONE and WB as AT_LEAST_1WAY.

For userptr mappings we lack a corresponding gem object, so the expected
coherency mode is instead implicit and must fall into either 1WAY or
2WAY. Trying to use NONE will be rejected by the kernel. For imported
dma-buf (from a different device) the coherency mode is also implicit
and must also be either 1WAY or 2WAY.

v2:
  - Undefined coh_mode(pat_index) can now be treated as programmer
    error. (Matt Roper)
  - We now allow gem_create.coh_mode <= coh_mode(pat_index), rather than
    having to match exactly. This ensures imported dma-buf can always
    just use 1way (or even 2way), now that we also bundle 1way/2way into
    at_least_1way. We still require 1way/2way for external dma-buf, but
    the policy can now be the same for self-import, if desired.
  - Use u16 for pat_index in uapi. u32 is massive overkill. (José)
  - Move as much of the pat_index validation as we can into
    vm_bind_ioctl_check_args. (José)
v3 (Matt Roper):
  - Split the pte_encode() refactoring into separate patch.
v4:
  - Rebase
v5:
  - Check for and reject !coh_mode which would indicate hw reserved
    pat_index on xe2.
v6:
  - Rebase on removal of coh_mode from uapi. We just need to reject
    cpu_caching=wb + pat_index with coh_none.

Testcase: igt@xe_pat
Bspec: 45101, 44235 #xe
Bspec: 70552, 71582, 59400 #xe2
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Zhengguo Xu <zhengguo.xu@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Zhengguo Xu <zhengguo.xu@intel.com>
Acked-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:07 -05:00
Thomas Hellström
fdb6a05383 drm/xe: Internally change the compute_mode and no_dma_fence mode naming
The name "compute_mode" can be confusing since compute uses either this
mode or fault_mode to achieve the long-running semantics, and compute_mode
can, moving forward, enable fault_mode under the hood to work around
hardware limitations.

Also the name no_dma_fence_mode really refers to what we elsewhere call
long-running mode and the mode contrary to what its name suggests allows
dma-fences as in-fences.

So in an attempt to be more consistent, rename
no_dma_fence_mode -> lr_mode
compute_mode      -> preempt_fence_mode

And adjust flags so that

preempt_fence_mode sets XE_VM_FLAG_LR_MODE
fault_mode sets XE_VM_FLAG_LR_MODE | XE_VM_FLAG_FAULT_MODE

v2:
- Fix a typo in the commit message (Oak Zeng)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231127123349.23698-1-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:58 -05:00
Thomas Hellström
d2f51c50b9 drm/xe/vm: Fix ASID XA usage
xa_alloc_cyclic() returns 1 on successful allocation, if wrapping occurs,
but the code incorrectly treats that as an error. Fix that.
Also, xa_alloc_cyclic() requires xa_init_flags(..., XA_FLAGS_ALLOC), so
fix that, and assuming we don't want a zero ASID, instead of using
XA_FLAGS_ALLOC1, adjust the xa limits at alloc_cyclic time.

v2:
- On CONFIG_DRM_XE_DEBUG, Initialize the cyclic ASID allocation in such a
  way that the next allocated ASID will be the maximum one, and the one
  following will cause an ASID wrap, (all to have CI test high ASIDs
  and ASID wraps).
v3:
- Stricter return value checking from xa_alloc_cyclic() (Matthew Auld)

Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231124153345.97385-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:57 -05:00
Matthew Brost
40709aa761 drm/xe: Only set xe_vma_op.map fields for GPUVA map operations
DRM_XE_VM_BIND_OP_MAP_* IOCTL operations can result in GPUVA unmap, remap,
or map operations in vm_bind_ioctl_ops_create. The xe_vma_op.map fields
are blindly set which is incorrect for GPUVA unmap or remap operations.
Fix this by only setting xe_vma_op.map for GPUVA map operations. Also
restructure a bit vm_bind_ioctl_ops_create to make the code a bit more
readable.

Reported-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:51 -05:00
Rodrigo Vivi
aaa115ffaa drm/xe/uapi: Be more specific about the vm_bind prefetch region
Let's bring a bit of clarity on this 'region' field that is
part of vm_bind operation struct. Rename and document to make
it more than obvious that it is a region instance and not a
mask and also that it should only be used with the prefetch
operation itself.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2023-12-21 11:44:38 -05:00
Francois Dugast
3ac4a7896d drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
Most constants defined in xe_drm.h which can be used for flags are
named DRM_XE_*_FLAG_*, which is helpful to identify them. Make this
systematic and add _FLAG where it was missing.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:33 -05:00
Francois Dugast
d5dc73dbd1 drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
Most constants defined in xe_drm.h use DRM_XE_ as prefix which is
helpful to identify the name space. Make this systematic and add
this prefix where it was missing.

v2:
- fix vertical alignment of define values
- remove double DRM_ in some variables (José Roberto de Souza)

v3: Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:33 -05:00
Brian Welty
04dfef5b41 drm/xe: Fix unbind of unaccessed VMA (fault mode)
In fault mode, page table binding is deferred until fault handler.
Thus vma->tile_present will be unset unless the VMA is accessed by GPU.

During a later unbind, the logic doesn't account for the fact that local
fence variable will be NULL in this case, leading to pass NULL into
dma_fence_add_callback() and causing few WARN_ONs to print to console.
The fix is already present in the code, just hoist the fence variable
computation to be done earlier.

Resolves warnings seen with igt@xe_exec_fault_mode@once-invalid-fault

Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:34 -05:00
Matthew Brost
81d11b9d66 drm/xe: Adjust tile_present mask when skipping rebinds
If a rebind is skipped the tile_present mask needs to be updated for the
newly created vma to properly reflect the state of the vma.

Reported-by: <christoph.manszewski@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:33 -05:00
Matthew Auld
bf6d941c06 drm/xe: fix pat[2] programming with 2M/1G pages
Bit 7 in the leaf node is normally programmed with pat[2], however with
2M/1G pages that same bit in the PDE/PDPE also toggles 2M/1G pages. For
2M/1G entries the pat[2] is rather moved to bit 12, which is now free
given that the address must be aligned to 2M or 1G, leaving bit 7 for
toggling 2M/1G pages.

Bspec: 59510, 45038
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:19 -05:00
Matthew Brost
e669f10cd3 drm/xe: Fix VM bind out-sync signaling ordering
A case existed where an out-sync of a later VM bind operation could
signal before a previous one if the later operation results in a NOP
(e.g. a unbind or prefetch to a VA range without any mappings). This
breaks the ordering rules, fix this. This patch also lays the groundwork
for users to pass in num_binds == 0 and out-syncs.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:18 -05:00
Matthew Brost
f3e9b1f434 drm/xe: Remove async worker and rework sync binds
Async worker is gone. All jobs and memory allocations done in IOCTL to
align with dma fencing rules.

Async vs. sync now means when do bind operations complete relative to
the IOCTL. Async completes when out-syncs signal while sync completes
when the IOCTL returns. In-syncs and out-syncs are only allowed in async
mode.

If memory allocations fail in the job creation step the VM is killed.
This is temporary, eventually a proper unwind will be done and VM will
be usable.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:17 -05:00
Matthew Brost
b21ae51dcf drm/xe/uapi: Kill DRM_XE_UFENCE_WAIT_VM_ERROR
This is not used nor does it align VM async document, kill this.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:17 -05:00
Rodrigo Vivi
7224788f67 drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extension
This extension is currently not used and it is not aligned with
the error handling on async VM_BIND. Let's remove it and along with
that, since it was the only extension for the vm_create, remove VM
extension entirely.

v2: rebase on top of the removal of drm_xe_ext_exec_queue_set_property

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21 11:43:17 -05:00
Ashutosh Dixit
5dc079d1a8 drm/xe/uapi: Use common drm_xe_ext_set_property extension
There really is no difference between 'struct drm_xe_ext_vm_set_property'
and 'struct drm_xe_ext_exec_queue_set_property', they are extensions which
specify a <property, value> pair. Replace the two extensions with a single
common 'struct drm_xe_ext_set_property' extension. The rationale is that
rather than have each XE module (including future modules) invent their own
property/value extensions, all XE modules use a common set_property
extension when possible.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21 11:43:17 -05:00
Matthew Brost
abce4e4b07 drm/xe: Rename exec_queue_kill_compute to xe_vm_remove_compute_exec_queue
Much better name and aligns with xe_vm_add_compute_exec_queue. As part
of the rename, move the implementation from xe_exec_queue.c to xe_vm.c.

Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:13 -05:00
Francois Dugast
78ddc872c6 drm/xe/vm: Remove VM_BIND_OP macro
This macro was necessary when bind operations were shifted but this
is no longer the case, so removing to simplify code.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21 11:43:10 -05:00
Francois Dugast
ea0640fc69 drm/xe/uapi: Separate VM_BIND's operation and flag
Use different members in the drm_xe_vm_bind_op for op and for flags as
it is done in other structures.

Type is left to u32 to leave enough room for future operations and flags.

v2: Remove the XE_VM_BIND_* flags shift (Rodrigo Vivi)

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/303
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21 11:43:10 -05:00
Matthew Auld
e814389ff1 drm/xe: directly use pat_index for pte_encode
In a future patch userspace will be able to directly set the pat_index
as part of vm_bind. To support this we need to get away from using
xe_cache_level in the low level routines and rather just use the
pat_index directly.

v2: Rebase
v3: Some missed conversions, also prefer tile_to_xe() (Niranjana)
v4: remove leftover const (Lucas)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:58 -05:00
Lucas De Marchi
5803bdc8ad drm/xe/xe2: Add one more bit to encode PAT to ppgtt entries
Xe2 adds one more bit to cover all the possible 32 entries. Although
those entries are not used by internal kernel code paths, it's expected
that userspace will make use of it.

Bspec: 59510, 67095
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231006182325.3617685-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:57 -05:00
Lucas De Marchi
285230832e drm/xe/vm: Prefer xe_assert() over XE_WARN_ON()
When xelp_pte_encode_addr() was added in commit 23c8495efe
("drm/xe/migrate: Do not hand-encode pte"), there was no xe pointer for
using xe_assert(). This is not the case anymore, so prefer it over
XE_WARN_ON().

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:46 -05:00
Paulo Zanoni
66aca8f04b drm/xe/vm: use list_last_entry() to fetch last_op
I would imagine that it's more efficient to fetch ops_list->prev than
to walk the whole list forward.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:09 -05:00
Paulo Zanoni
5f01a35b10 drm/xe/vm: print the correct 'keep' when printing gpuva ops
Unions are cool, until they aren't.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:09 -05:00
Matthew Brost
9a674bef6c drm/xe: Fix exec queue usage for unbinds
Passing in a NULL exec queue to __xe_pt_unbind_vma results in the
migrate exec queue being used. This is not the intent from the VM bind
IOCTL, rather a NULL exec queue should use default VM exec queue.

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:42:08 -05:00
Fei Yang
9be7925181 drm/xe: set PTE_AE for all platforms supporting it
Atomic access is supported by PVC, and became a common feature for all
platforms starting from Xe2. To enable that XE_VMA_ATOMIC_PTE_BIT needs
to be set, then pte encode will eventually set PTE_AE for devmem.

Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230928044335.1474903-2-fei.yang@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:21 -05:00
Lucas De Marchi
fcd75139cd drm/xe: Use pat_index to encode pde/pte
Change the xelp_pte_encode() and xelp_pde_encode() functions to use the
platform-dependent pat_index.  The same function can be used for all
platforms as they only need to encode the pat_index bits in the same
pte/pde layout. For platforms that don't have the most significant bit,
as long as they don't return a bogus index they should be fine.

v2: Use the same logic to encode pde as it's compatible with previous
    logic, it's more future proof and also fixes the cache setting for
    PVC (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-10-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:20 -05:00
Lucas De Marchi
23c8495efe drm/xe/migrate: Do not hand-encode pte
Instead of encoding the pte, call a new vfunc from xe_vm to handle that.
The encoding may not be the same on every platform, so keeping it in one
place helps to better support them.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:20 -05:00
Lucas De Marchi
0e5e77bd97 drm/xe: Use vfunc for pte/pde ppgtt encoding
Move the function to encode pte/pde to be vfuncs inside struct xe_vm.
This will allow to easily extend to platforms that don't have a
compatible encoding.

v2: Fix kunit build

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230927193902.2849159-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:19 -05:00
Tejas Upadhyay
2ff00c4f77 drm/xe: Track page table memory usage for client
Account page table memory usage in the owning client
memory usage stats.

V2:
  - Minor tweak to if (vm->pt_root[id]) check - Himal

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:15 -05:00
Tejas Upadhyay
9e4e9761e6 drm/xe: Record each drm client with its VM
Enable accounting of indirect client memory usage.

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:15 -05:00
Francois Dugast
c73acc1eeb drm/xe: Use Xe assert macros instead of XE_WARN_ON macro
The XE_WARN_ON macro maps to WARN_ON which is not justified
in many cases where only a simple debug check is needed.
Replace the use of the XE_WARN_ON macro with the new xe_assert
macros which relies on drm_*. This takes a struct drm_device
argument, which is one of the main changes in this commit. The
other main change is that the condition is reversed, as with
XE_WARN_ON a message is displayed if the condition is true,
whereas with xe_assert it is if the condition is false.

v2:
- Rebase
- Keep WARN splats in xe_wopcm.c (Matt Roper)

v3:
- Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:08 -05:00
Francois Dugast
5c0553cdc8 drm/xe: Replace XE_WARN_ON with drm_warn when just printing a string
Use the generic drm_warn instead of the driver-specific XE_WARN_ON
in cases where XE_WARN_ON is used to unconditionally print a debug
message.

v2: Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:07 -05:00