Otherwise triggering sysfs multiple times without other submissions in
between only runs the shader once.
v2: add some comment
v3: re-add missing cast
v4: squash in semicolon fix
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b2ae7d492)
On x86_64 with gcc version 13.3.0, I compile kernel with:
make defconfig
./scripts/kconfig/merge_config.sh .config <(
echo CONFIG_COMPILE_TEST=y
)
make KCFLAGS="-fno-inline-functions -fno-inline-small-functions -fno-inline-functions-called-once"
Then I get a linker error:
ld: vmlinux.o: in function `pxp_fw_dependencies_completed':
kintel_pxp.c:(.text+0x95728f): undefined reference to `intel_pxp_gsccs_is_ready_for_sessions'
This is caused by not having a intel_pxp_gsccs_is_ready_for_sessions()
header stub for CONFIG_DRM_I915_PXP=n. Add it.
Signed-off-by: Chen Linxuan <chenlinxuan@uniontech.com>
Fixes: 99afb7cc8c ("drm/i915/pxp: Add ARB session creation and cleanup")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250415090616.2649889-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The is_vram() is checking the current placement, however if we consider
exported VRAM with dynamic dma-buf, it looks possible for the xe driver
to async evict the memory, notifying the importer, however importer does
not have to call unmap_attachment() immediately, but rather just as
"soon as possible", like when the dma-resv idles. Following from this we
would then pipeline the move, attaching the fence to the manager, and
then update the current placement. But when the unmap_attachment() runs
at some later point we might see that is_vram() is now false, and take
the complete wrong path when dma-unmapping the sg, leading to
explosions.
To fix this check if the sgl was mapping a struct page.
v2:
- The attachment can be mapped multiple times it seems, so we can't
really rely on encoding something in the attachment->priv. Instead
see if the page_link has an encoded struct page. For vram we expect
this to be NULL.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4563
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250410162716.159403-2-matthew.auld@intel.com
User is reporting what smells like notifier vs folio deadlock, where
migrate_pages_batch() on core kernel side is holding folio lock(s) and
then interacting with the mappings of it, however those mappings are
tied to some userptr, which means calling into the notifier callback and
grabbing the notifier lock. With perfect timing it looks possible that
the pages we pulled from the hmm fault can get sniped by
migrate_pages_batch() at the same time that we are holding the notifier
lock to mark the pages as accessed/dirty, but at this point we also want
to grab the folio locks(s) to mark them as dirty, but if they are
contended from notifier/migrate_pages_batch side then we deadlock since
folio lock won't be dropped until we drop the notifier lock.
Fortunately the mark_page_accessed/dirty is not really needed in the
first place it seems and should have already been done by hmm fault, so
just remove it.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4765
Fixes: 0a98219bcc ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.10+
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250414132539.26654-2-matthew.auld@intel.com
The TI k3-j721s2 platform does not allow us to use uncached memory
(which is what the driver currently does) without disabling cache snooping
on the AXI ACE-Lite interface, which would be too much of a performance
hit.
Given the platform is dma-coherent, we can simply force all
device-accessible memory allocations through the CPU cache. In fact, this
can be done whenever the dma_coherent attribute is present.
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Link: https://lore.kernel.org/r/20250410-sets-bxs-4-64-patch-v1-v6-15-eda620c5865f@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Extend interrupt handling logic to check for safety event IRQs, then clear
and handle them in the IRQ handler thread.
Safety events need to be checked and cleared with a different set of GPU
registers than those the IRQ handler has been using so far.
Only two safety events need to be handled on the host: FW fault (ECC error
correction or detection) and device watchdog timeout. Handling right now
simply consists of clearing any error and logging the event. If either of
these events results in an unrecoverable GPU or FW, the driver will
eventually attempt to recover from it e.g. via pvr_power_reset().
Note that Rogue GPUs may send interrupts to the host for all types of
safety events, not just the two above. For events not handled by the host,
clearing the associated interrupt is sufficient.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Link: https://lore.kernel.org/r/20250410-sets-bxs-4-64-patch-v1-v6-7-eda620c5865f@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Pass IRQF_ONESHOT flag to request_threaded_irq(), so that interrupts will
be masked by the kernel until the end of the threaded IRQ handler. Since
the calls to pvr_fw_irq_enable() and pvr_fw_irq_disable() are now
redundant, remove them.
Interrupts to the host from the soon-to-be-added RISC-V firmware
processors cannot be masked in hardware. This change allows us to continue
using the threaded handler in GPUs with a RISC-V firmware.
For simplicity, the same approach is taken for all firmware processors.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Link: https://lore.kernel.org/r/20250410-sets-bxs-4-64-patch-v1-v6-6-eda620c5865f@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
The first supported GPU only used a single power domain so this was
automatically handled by the device runtime.
In order to support multiple power domains, they must be enumerated from
devicetree and linked to both the GPU device and each other to ensure
correct power sequencing at start time.
For all Imagination Rogue GPUs, power domains are named "a", "b", etc. and
the sequence A->B->... is always valid for startup with the reverse true
for shutdown. Note this is not always the *only* valid sequence, but it's
simple and does not require special-casing for different GPU power
topologies.
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Link: https://lore.kernel.org/r/20250410-sets-bxs-4-64-patch-v1-v6-5-eda620c5865f@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Follow-on from the companion dt-bindings change ("dt-bindings: gpu: img:
More explicit compatible strings"), deprecating "img,img-axe" in favour of
the more explicit combination of "img,img-rogue" and "img,img-axe-1-16m".
Since all relevant details are interrogated from the device at runtime,
we can match on the generic "img,img-rogue" and avoid adding more entries
with NULL data members (barring hardware quirks).
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Link: https://lore.kernel.org/r/20250410-sets-bxs-4-64-patch-v1-v6-4-eda620c5865f@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
HDMI standard defines recommended N and CTS values for Audio Clock
Regeneration. Currently each driver implements those, frequently in
somewhat unique way. Provide a generic helper for getting those values
to be used by the HDMI drivers.
The helper is added to drm_hdmi_helper.c rather than drm_hdmi_audio.c
since HDMI drivers can be using this helper function even without
switching to DRM HDMI Audio helpers.
Note: currently this only handles the values per HDMI 1.4b Section 7.2
and HDMI 2.0 Section 9.2.1. Later the table can be expanded to
accommodate for Deep Color TMDS char rates per HDMI 1.4 Appendix D
and/or HDMI 2.0 / 2.1 Appendix C).
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20250408-drm-hdmi-acr-v2-1-dee7298ab1af@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Adding link rate and lane count information to i915_display_info makes it
easier and faster to access this data compared to checking kernel logs.
This is particularly beneficial for individuals who are not familiar with
i915 in the following scenarios:
* Debugging DP tunnel bandwidth usage in the Thunderbolt driver.
* During USB4 certification, it is necessary to know the link rate used by
the monitor to prove that the DP tunnel can handle required rates.
* In PHY CTS, when the connector probes are not mounted correctly,
some display lanes may not appear in the DP Oscilloscope, leading to CTS
failures.
This change provides validation teams with an easy way to identify and
troubleshoot issues.
v2: separate seq_printf line (Jani)
v3: separate output line (Jani)
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250409230214.963999-1-khaled.almahallawy@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Cross-subsystem Changes:
- Update GVT MAINTAINERS (Jani)
Driver Changes:
- Updates for xe3lpd display (Gustavo)
- Fix link training interrupted by HPD pulse (Imre)
- Watermark bound checks for DSC (Ankit)
- VRR Refactor and other fixes and improvements (Ankit)
- More conversions towards intel_display struct (Gustavo, Jani)
- Other clean-up patches towards a display separation (Jani)
- Maintain asciibetical order for HAS_* macros (Ankit)
- Fixes around probe/initialization (Janusz)
- Fix build and doc build issue (Yue, Rodrigo)
- DSI related fixes (Suraj, William, Jani)
- Improve DC6 entry counter (Mohammed)
- Fix xe2hpd memory type identification (Vivek)
- PSR related fixes and improvements (Animesh, Jouni)
- DP MST related fixes and improvements (Imre)
- Fix scanline_offset for LNL+/BMG+ (Ville)
- Some gvt related fixes and changes (Ville, Jani)
- Some PLL code adjustment (Ville)
- Display wa addition (Vinod)
- DRAM type logging (Lucas)
- Pimp the initial FB readout (Ville)
- Some sagv/bw cleanup (Ville)
- Remove i915_display_capabilities debugfs entry (Jani)
- Move PCH type to display caps debugfs entry (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/Z_kTqPX5Mjruq1pL@intel.com
The metadata saved in the ADS is read by GuC when it's initialized.
Saving the addresses to the LRCs when they are populated is too late as
GuC will keep using the old ones.
This was causing GuC to use the RCS LRC for any engine class. It's not a
big problem on a Linux-only scenario since the they are used by GuC only
on media engines when the watchdog is triggered. However, in a
virtualization scenario with Windows as the VF, it causes the wrong LRCs
to be loaded as the watchdog is used for all engines.
Fix it by letting guc_golden_lrc_init() initialize the metadata, like
other *_init() functions, and later guc_golden_lrc_populate() to copy
the LRCs to the right places. The former is called before the second GuC
load, while the latter is called after LRCs have been recorded.
Cc: Chee Yin Wong <chee.yin.wong@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org> # v6.11+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Chee Yin Wong <chee.yin.wong@intel.com>
Link: https://lore.kernel.org/r/20250409-fix-guc-ads-v1-1-494135f7a5d0@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Most of the PF's debugfs files (and their implementations) are
based on the GT hierarchy even if files are related to GGTT or
LMEM data, that are related to the tile.
While we could reach the tile data from any GT, to avoid potential
misuse, some functions allow to be used on the primary GT only,
and may use asserts to enforce that.
In our case, the following assert could be seen when reading the
/sys/kernel/debug/dri/0000:00:02.0/gt1/pf/ggtt_available
[ ] xe 0000:00:02.0: [drm] Assertion `!xe_gt_is_media_type(gt)` failed!
[ ] WARNING: CPU: 4 PID: 10609 at drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:379 pf_get_spare_ggtt+0x256/0x4e0 [xe]
[ ] RIP: 0010:pf_get_spare_ggtt+0x256/0x4e0 [xe]
[ ] Call Trace:
[ ] <TASK>
[ ] xe_gt_sriov_pf_config_print_available_ggtt+0xb7/0x480 [xe]
[ ] ? __memcg_slab_post_alloc_hook+0x12f/0x3f0
[ ] xe_gt_debugfs_simple_show+0x7b/0xb0 [xe]
[ ] ? __pfx___drm_printfn_seq_file+0x10/0x10
[ ] ? __pfx___drm_puts_seq_file+0x10/0x10
[ ] seq_read_iter+0x139/0x4e0
[ ] seq_read+0x11d/0x160
[ ] full_proxy_read+0x6b/0xb0
[ ] vfs_read+0xfa/0x390
Fix that by moving GGTT/LMEM debugfs attributes to separate lists
and register them only when applicable (on primary GT, on DGFX).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250411193030.1865-1-michal.wajdeczko@intel.com