Commit Graph

22 Commits

Author SHA1 Message Date
Niranjana Vishwanathapura
a043fbab7a drm/xe/pvc: Use fast copy engines as migrate engine on PVC
Some copy hardware engine instances are faster than others on PVC.
Use a virtual engine of these plus the reserved instance for the migrate
engine on PVC. The idea being if a fast instance is available it will be
used and the throughput of kernel copies, clears, and pagefault
servicing will be higher.

v2: Use OOB WA, use all copy engines if no WA is required

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:28 -05:00
Oak Zeng
7f6c6e5085 drm/xe: Implement HW workaround 14016763929
To workaround a HW bug on DG2, driver is required to map the whole
ppgtt virtual address space before GPU workload submission. Thus
set the XE_VM_FLAG_SCRATCH_PAGE flag during vm create so the whole
address space is mapped to point to scratch page.

v1:
  - Move the workaround implementation from xe_vm_create to
    xe_vm_create_ioctl - Brian
  - Reorder error checking in xe_vm_create_ioctl - Jose
  - Implement WA only for DG2-G10 and DG2-G12

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:18 -05:00
Tejas Upadhyay
038ff941af drm/xe: Add sysfs entries for engines under its GT
Add engines sysfs directory under its GT and
create sub directory for all engine class
(note its not per instance) present on GT.

For example,
DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/
bcs/ ccs/

V9 :
   - Add missing drmm_add_action_or_reset
V8 :
   - Rebase
V7 :
   - Remove xe_gt.h from .h and include in .c - Matt
V6 :
   - Add kernel doc and arrange file in make file by alphabet - Matt
V5 :
   - replace xe_engine with xe_hw_engine - Matt
V4 :
   - Rebase to resolve conflicts - CI
V3 :
   - Move code in its own file
   - Rename API name
V2 :
   - Correct class mask logic - Himal
   - Remove extra parenthesis

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:21 -05:00
Francois Dugast
c22a4ed0c3 drm/xe: Rename xe_engine.[ch] to xe_exec_queue.[ch]
This is a preparation commit for a larger renaming of engine to exec queue.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00
Thomas Hellström
845f64bdbf drm/xe: Introduce a range-fence utility
Add generic utility to track range conflicts signaled by a dma-fence.
Tracking implemented via an interval tree. An example use case being
tracking conflicts for pending (un)binds from multiple bind engines. By
being generic ths idea would this could moved to the DRM level and used
in multiple drivers for similar problems.

v2: Make interval tree functions static (CI)
v3: Remove non-static cleanup function (CI)

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Anshuman Gupta
b2d756199b drm/xe/pm: Add vram_d3cold_threshold Sysfs
Add per pci device vram_d3cold_threshold Sysfs to
control the d3cold allowed knob.
Adding a d3cold structure embedded in xe_device to encapsulate
d3cold related stuff.

v2:
- Check total vram before initializing default threshold. [Riana]
- Add static scope to vram_d3cold_threshold DEVICE_ATTR. [Riana]
v3:
- Fixed cosmetics review comment. [Riana]
- Fixed CI Hook failures.
- Used drmm_mutex_init().
v4:
- Fixed kernel-doc warnings.
v5:
- Added doc explaining need for the device sysfs. [Rodrigo]
- Removed TODO comment.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-4-anshuman.gupta@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:35 -05:00
Tejas Upadhyay
e5a845fd8f drm/xe: Add sysfs entry for tile
We have recently introduced tile for each gpu,
so lets add sysfs entry per tile for userspace
to provide required information specific to tile.

V5:
  - define ktype as const
V4:
  - Reorder headers - Aravind
V3:
  - Make API to return void and add drm_warn - Aravind
V2:
  - Add logs in failure path

Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:06 -05:00
Riana Tauro
1c2097bbde drm/xe: add a new sysfs directory for gtidle properties
1) Add a new sysfs directory under devices/gt#/ called gtidle
   to contain idle properties of GT such as name, idle_status,
   idle_residency_ms

2) Remove forcewake calls for residency counter

v2:
    - abstract using function pointers (Anshuman)
    - remove forcewake calls for residency counter
    - use device_attr (Badal)
    - move rc functions to guc_pc
    - change name to gt_idle (Rodrigo)

v3:
    - return error for drmm_add_action_or_reset
    - replace file and functions with gt_idle prefix
      to gt_idle_sysfs (Himal)
    - use enum for gt idle state
    - move multiplier to gt idle and initialize (Anshuman)
    - correct doc annotation (Rodrigo)
    - remove return variable
    - use kobj_gt instead of new gtidle kobj
    - move residency_ms to gtidle file
    - retain xe_guc_pc prefix for functions in guc_rc file (Michal)

v4:
    - fix doc errors in xe_guc_pc file
    - change u64 to u32 for reading residency counter
    - keep gtidle states generic GT_IDLE_C[0/6] (Anshuman)

v5:
    - update commit message to include removal of
      forcewake calls (Anshuman)
    - return void from sysfs initialization function and add warnings
      (Andi)

v6:
    - remove extra lines (Anshuman)

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:00 -05:00
Thomas Hellström
9f8f93bee3 drm/xe: Emit a render cache flush after each rcs/ccs batch
We need to flush render caches before fence signalling, where we might
release the memory for reuse. We can't rely on userspace doing this,
so flush render caches after the batch, but before user fence- and
dma_fence signalling.

Copy the cache flush from i915, but omit PIPE_CONTROL_FLUSH_L3, since it
should be implied by the other flushes. Also omit
PIPE_CONTROL_TLB_INVALIDATE since there should be no apparent need to
invalidate TLB after batch completion.

v2:
- Update Makefile for OOB WA.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> #1
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:35:21 -05:00
Matt Roper
ad703e0637 drm/xe: Move GGTT from GT to tile
The GGTT exists at the tile level.  When a tile contains multiple GTs,
they share the same GGTT.

v2:
 - Include some changes that were mis-squashed into the VRAM patch.
   (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-9-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Lucas De Marchi
7d356b25b3 drm/xe/guc: Port Wa_22012773006 to xe_wa
Let xe_guc.c start using XE_WA() for workarounds, starting from a simple
one: Wa_22012773006. It's also changed to start with graphics version
12, since that is the first supported by xe.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-14-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:03 -05:00
Lucas De Marchi
9616e74b79 drm/xe: Add support for OOB workarounds
There are WAs that, due to their nature, cannot be applied from a
central place like xe_wa.c. Those are peppered around the rest of the
code, as needed. Now they have a new name:  "out-of-band workarounds".

These workarounds have their names and rules still grouped in xe_wa.c,
inside the xe_wa_oob array, which is generated at compile time by
xe_wa_oob.rules and the hostprog xe_gen_wa_oob. The code generation
guarantees that the header xe_wa_oob.h contains the IDs for the
workarounds that match the index in the table. This way the runtime
checks that are spread throughout the code are simple tests against the
bitmap saved during initialization.

v2: Fix prev_name tracking not working when it's empty, i.e. when there
    is more than 1 continuation rule.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-13-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:03 -05:00
Lucas De Marchi
464f2243c1 drm/xe: Include build directory
When doing out-of-tree builds with O= or KBUILD_OUTPUT=, it's important
to also add the directory where the target is saved. Otherwise any file
generated by the build system may not be available for other targets
depending on it.

The $(obj) is added automatically when building the entire kernel,
but it's not added when M=drivers/gpu/drm/xe is added.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-12-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:03 -05:00
Rodrigo Vivi
e799485044 drm/xe: Introduce the dev_coredump infrastructure.
The goal is to use devcoredump infrastructure to report error states
captured at the crash time.

The error state will contain useful information for GPU hang debug, such
as INSTDONE registers and the current buffers getting executed, as well
as any other information that helps user space and allow later replays of
the error.

The proposal here is to avoid a Xe only error_state like i915 and use
a standard dev_coredump infrastructure to expose the error state.

For our own case, the data is only useful if it is a snapshot of the
time when the GPU crash has happened, since we reset the GPU immediately
after and the registers might have changed. So the proposal here is to
have an internal snapshot to be printed out later.

Also, usually a subsequent GPU hang can be only a cause of the initial
one. So we only save the 'first' hang. The dev_coredump has a delayed
work queue where it remove the coredump and free all the data within a
few moments of the error. When that happens we also reset our capture
state and allow further snapshots.

Right now this infra only print out the time of the hang. More information
will be migrated here on subsequent work. Also, in order to organize the
dump better, the goal is to propose dev_coredump changes itself to allow
multiple files and different controls. But for now we start Xe usage of
it without any dependency on dev_coredump core changes.

v2: Add dma_fence annotation for capture that might happen during long
    running. (Thomas and Matt)
    Use xe->drm.primary->index on drm_info msg. (Jani)
v3: checkpatch fixes
v4: Fix building and locking issues found by Francois.
    Actually let's kill all of the locking in here. gt_reset serialization
    already guarantee that there will be only one capture at the same time.
    Also, the devcoredump has its own locking to protect the free and reads
    and drivers don't need to duplicate it.
    Besides this, the dma_fence locking was pushed to a following patch
    since it is not needed in this one.
    Fix a use after free identified by KASAN: Do not stash the faulty_engine
    since that will be freed somewhere else.
v5: Fix Uptime - ktime_get_boottime actually returns the Uptime. (Francois)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-19 18:33:51 -05:00
Lucas De Marchi
9a56502fe1 drm/xe: Move helper macros to separate header
The macros to handle the RTP tables are very scary, but shouldn't be
used outside of the header adding the infra. Move it to a separate
header and make sure it's only included when it can be.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-11-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:22 -05:00
Chang, Bruce
1a545ed74b drm/xe: fix pvc unload issue
Currently, unload pvc driver will generate a null dereference
and the call stack is as below.

[ 4850.618000] Call Trace:
[ 4850.620740]  <TASK>
[ 4850.623134]  ttm_bo_cleanup_memtype_use+0x3f/0x50 [ttm]
[ 4850.628661]  ttm_bo_release+0x154/0x2c0 [ttm]
[ 4850.633317]  ? drm_buddy_fini+0x62/0x80 [drm_buddy]
[ 4850.638487]  ? __kmem_cache_free+0x27d/0x2c0
[ 4850.643054]  ttm_bo_put+0x38/0x60 [ttm]
[ 4850.647190]  xe_gem_object_free+0x1f/0x30 [xe]
[ 4850.651945]  drm_gem_object_free+0x1e/0x30 [drm]
[ 4850.656904]  ggtt_fini_noalloc+0x9d/0xe0 [xe]
[ 4850.661574]  drm_managed_release+0xb5/0x150 [drm]
[ 4850.666617]  drm_dev_release+0x30/0x50 [drm]
[ 4850.671209]  devm_drm_dev_init_release+0x3c/0x60 [drm]

There are a couple issues, but the main one is due to TTM has only
one TTM_PL_TT region, but since pvc has 2 tiles and tries to setup
1 TTM_PL_TT each tile. The second will overwrite the first one.

During unload time, the first tile will reset the TTM_PL_TT manger
and when the second tile is trying to free Bo and it will generate
the null reference since the TTM manage is already got reset to 0.

The fix is to use one global TTM_PL_TT manager.

v2: make gtt mgr global and change the name to sys_mgr

Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:30 -05:00
Matt Roper
576c6380da drm/xe/pat: Move PAT setup to a dedicated file
PAT handling is growing in complexity and will continue to do so in
upcoming platforms.  Separate it out to a dedicated file to keep things
tidy.

The code is moved as-is here (aside from a few unused #define's that are
just dropped); further changes will come in future patches.

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230324210415.2434992-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:25 -05:00
Lucas De Marchi
8cb49012ac drm/xe: Do not spread i915_reg_defs.h include
Reduce the use of i915_reg_defs.h so it can be encapsulated in a single
place.

1) If it was being included by mistake, remove
2) If it was included for FIELD_GET()/FIELD_PREP()/GENMASK() and the
   like, just include <linux/bitfield.h>
3) If it was included to be able to define additional registers, move
   the registers to the relavant headers (regs/xe_regs.h or
   regs/xe_gt_regs.h)

v2:
  - Squash commit fixing i915_reg_defs.h include and with the one
    introducing regs/xe_reg_defs.h
  - Remove more cases of i915_reg_defs.h being used when all it was
    needed was linux/bitfield.h  (Matt Roper)
  - Move some  registers to the corresponding regs/*.h file (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo squashed here the removal of the i915 include]
2023-12-19 18:29:23 -05:00
Lucas De Marchi
5a4a8e8b3b drm/xe: Remove outdated build workaround
Use the more common "call cc-disable-warning" way to disable warnings.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:07 -05:00
Matthew Brost
a9351846d9 drm/xe: Break of TLB invalidation into its own file
TLB invalidation is used by more than USM (page faults) so break this
code out into its own file.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:45 -05:00
Maarten Lankhorst
d8b52a02cb drm/xe: Implement stolen memory.
This adds support for stolen memory, with the same allocator as
vram_mgr. This allows us to skip a whole lot of copy-paste,
by re-using parts of xe_ttm_vram_mgr.

The stolen memory may be bound using VM_BIND, so it performs like any
other memory region.

We should be able to map a stolen BO directly using the physical memory
location instead of through GGTT even on old platforms, but I don't know
what the effects are on coherency.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-12 14:06:00 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00