clangd reports many "unused header" warnings throughout the Xe driver.
Start working to clean this up by removing unnecessary includes in our
.c files and/or replacing them with explicit includes of other headers
that were previously being included indirectly.
By far the most common offender here was unnecessary inclusion of
xe_gt.h. That likely originates from the early days of xe.ko when
xe_mmio did not exist and all register accesses, including those
unrelated to GTs, were done with GT functions.
There's still a lot of additional #include cleanup that can be done in
the headers themselves; that will come as a followup series.
v2:
- Squash the 79-patch series down to a single patch. (MattB)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260115032803.4067824-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Add support for triggering and handling MERT TLB invalidation. After
LMTT updates, the MERT TLB invalidation is initiated to ensure memory
translations remain coherent.
Completion of the invalidation is signaled via MERT interrupt (bit 13 in
the GFX master interrupt register). Detect and handle this interrupt to
properly synchronize the invalidation flow.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251124190237.20503-4-lukasz.laguna@intel.com
Current gu2host handler registered as MSI-X vector 0 and as per bspec for
a msix vector 0 interrupt, the driver must check the legacy registers
190008(TILE_INT_REG), 190060h (GT INTR Identity Reg 0) and other registers
mentioned in "Interrupt Service Routine Pseudocode" otherwise it will block
the next interrupts. To overcome this issue replacing guc2host handler
with legacy xe_irq_handler.
Fixes: da889070be ("drm/xe/irq: Separate MSI and MSI-X flows")
Bspec: 62357
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patch.msgid.link/20251107083141.2080189-1-venkata.ramana.nayana@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Current implementation of compute walker has dependency on GPU/SW Stack
which requires SW/UMD to wait for event from KMD to indicate
PIPE_CONTROL interrupt was done. This created latency on SW stack.
This feature adds support to generate completion interrupt from GPGPU
walker which does not support MSIx and avoid software using Pipe control
drain/idle latency.
The only thing needed for the kernel driver to do here is to wakeup the
thread waiting on the ufence, which is already handled by the irq
handler. Before waiting on this event, the userspace side can opt-in to
this interrupt being generated by the HW by selecting the flag in the
POST_SYNC_DATA_2 substructure's dw0[3] of COMPUTE_WALKER_2 instruction.
Bspec: 62346, 74334
Suggested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: S A Muqthyar Ahmed <syed.abdul.muqthyar.ahmed@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-21-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
It's confusing to refer to some masks as the interrupt masks and others
as the fuse masks. Rename the fuse one to make it clearer.
Note that the most important role they play here is that the call
to xe_hw_engine_mask_per_class() will not only limit the engines
according to the fuses, but also by what is available in the specific
architecture - the latter is more important information to know what
interrupts should be enabled. Add a comment about that.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-17-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Display interrupt handling has no relation to GT(s) on the platforms
supported by the Xe driver. We only call xe_display_irq_postinstall
with the first tile's primary GT, so the single condition that uses the
GT pointer within the function always evaluates to true. Drop the
unnecessary parameter and the condition.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20251013200944.2499947-27-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Gfx device reports two classes of errors: uncorrectable and
correctable. Depending on the severity uncorrectable errors are further
classified Non-Fatal and Fatal.
Correctable and Non-Fatal errors: These errors are reported as MSI. Bits in
the Master Interrupt Register indicate the class of the error.
The source of the error is then read from the Device Error Source
Register.
Fatal errors: These are reported as PCIe errors
When a PCIe error is asserted, the OS will perform a SBR (Secondary
Bus reset) which causes the driver to reload. The error registers are
sticky and the values are maintained through SBR.
Add basic support to handle these errors.
Bspec: 50875, 53073, 53074, 53075, 53076
v2: Format commit message (Umesh)
v3: fix documentation (Stuart)
Cc: Stuart Summers <stuart.summers@intel.com>
Co-developed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/r/20250826063419.3022216-9-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The cleanup is done by devres in irq_uninstall.
Commit bbc9651fe9 ("drm/xe/irq: move irq_uninstall over to devm")
resolved the ordering issue where irq_uninstall (registered with drmm)
was called after pci_free_irq_vectors (registered with devm upon calling
pci_alloc_irq_vectors). This happened because drmm action list is
registered with devm very early in the init flow - before
pci_alloc_irq_vectors.
Now that irq_uninstall is registered with devm, it will be called before
pci_free_irq_vectors and we can remove xe_irq_shutdown.
Signed-off-by: Ilia Levi <illevi@habana.ai>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240606124705.822451-1-illevi@habana.ai
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Starting on Xe2, the GSCCS engine reset is a 2-step process. When the
driver or the GuC hits the GDRST register, the CS is immediately reset
and a success is reported, but the GSC shim continues its reset in the
background. While the shim reset is ongoing, the CS is able to accept
new context submission, but any commands that require the shim will
be stalled until the reset is completed. This means that we can keep
submitting to the GSCCS as long as we make sure that the preemption
timeout is big enough to cover any delay introduced by the reset; since
the GSC preempt timeout is not tunable at runtime, we only need to check
that the value set in kconfig is big enough (and increase it if it
isn't).
When the shim reset completes, a specific CS interrupt is triggered,
in response to which we need to check the GSCI_TIMER_STATUS register
to see if the reset was successful or not.
Note that the GSCI_TIMER_STATUS register is not power save/restored,
so it gets reset on MC6 entry. However, a reset failure stops MC6,
so in that scenario we're always guaranteed to find the correct value.
Since we can't check the register within interrupt context, the
existing GSC worker has been updated to handle it.
The expected action to take on ER failure is to trigger a driver FLR,
but we still don't support that, so for now we just print an error. A
comment has been added to the code to keep track of the FLR requirement.
v2: Add a check for the initial timeout value (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304145634.820684-1-daniele.ceraolospurio@intel.com
The GSC notifies us of a proxy request via the HECI2 interrupt. The
interrupt must be enabled both in the HECI layer and in our usual gt irq
programming; for the latter, the interrupt is enabled via the same enable
register as the GSC CS, but it does have its own mask register. When the
interrupt is received, we also need to de-assert it in both layers.
The handling of the proxy request is deferred to the same worker that we
use for GSC load. New flags have been added to distinguish between the
init case and the proxy interrupt.
v2: rename irq define, fix include ordering (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117182621.2653049-3-daniele.ceraolospurio@intel.com
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there.
We do this by recompiling i915/display code twice.
Now that i915 has been adapted to support the Xe build, we can add
the xe/display support.
This initial work is a collaboration of many people and unfortunately
this squashed patch won't fully honor the proper credits.
But let's try to add a few from the squashed patches:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Starting with Xe_LP+, GFX_MSTR_IRQ contains status bits that have W1C
behavior. If we do not properly reset them, we would miss delivery of
interrupts if a pending bit is set when enabling IRQs.
As an example, the display part of our probe routine contains paths
where we wait for vblank interrupts. If a display interrupt was already
pending when enabling IRQs, we would time out waiting for the vblank.
That in fact happened recently when modprobing Xe on a Lunar Lake with a
specific configuration; and that's how we found out we were missing this
step in the IRQ enabling logic.
Fix the issue by clearing GFX_MSTR_IRQ as part of the IRQ reset.
v2:
- Make resetting GFX_MSTR_IRQ be the last step to avoid bit
re-latching. (Ville)
v3:
- Swap nesting order: guard loop with the IP version check instead of
doing the check at each iteration. (Lucas)
v4:
- Add braces for the "if" statement guarding the loop to make the
compiler happy. (Gustavo)
BSpec: 50875, 54028, 62357
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230926221914.106843-2-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As a preparation for msix support, changing for new msi irq api
which supports both msi and msix.
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rebase fixes by Rodrigo]
The guc_submission_enabled() function is being used as a boolean toggle
for all firmwares and all related features, not just GuC submission. We
could add additional flags/functions to distinguish and allow different
use-cases (e.g. loading HuC but not using GuC submission), but given
that not using GuC is a debug-only scenario having a global switch for
all FWs is enough. However, we want to make it clear that this switch
turns off everything, so rename it to uc_enabled().
v2: rebase on s/XE_WARN_ON/xe_assert
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There are a set of engine group busyness counters provided by HW which are
perfect fit to be exposed via PMU perf events.
BSPEC: 46559, 46560, 46722, 46729, 52071, 71028
events can be listed using:
perf list
xe_0000_03_00.0/any-engine-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/copy-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/interrupts/ [Kernel PMU event]
xe_0000_03_00.0/media-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/render-group-busy-gt0/ [Kernel PMU event]
and can be read using:
perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
time counts unit events
1.001139062 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
2.003294678 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
3.005199582 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
4.007076497 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
5.008553068 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
6.010531563 43520 ns xe_0000_8c_00.0/render-group-busy-gt0/
7.012468029 44800 ns xe_0000_8c_00.0/render-group-busy-gt0/
8.013463515 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
9.015300183 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
10.017233010 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
10.971934120 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
The pmu base implementation is taken from i915.
v2:
Store last known value when device is awake return that while the GT is
suspended and then update the driver copy when read during awake.
v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.
v4: minor nits.
v5: take forcewake when accessing the OAG registers
v6:
1. drop engine_busyness_sample_type
2. update UAPI documentation
v7:
1. update UAPI documentation
2. drop MEDIA_GT specific change for media busyness counter.
Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>