Commit Graph

15 Commits

Author SHA1 Message Date
Jay Cornwall
05762d9c7d drm/amdkfd: gfx12.1 trap handler instruction fixup for VOP3PX
A trap may occur in the middle of VOP3PX instruction co-issue.
The PC would be restored incorrectly if left unmodified.

Identify this case by examining the instruction opcode and
rewind the PC 8 bytes if it occurs.
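
As a rough illustration of the fixup described above (not the trap handler's actual code, which is written in GPU assembly), the adjustment can be sketched as follows. The 8-byte rewind comes from the commit message; the function name and the boolean flag standing in for the opcode check are hypothetical.

```python
# Rewind distance stated in the commit message.
VOP3PX_REWIND_BYTES = 8


def fixup_saved_pc(pc: int, interrupted_vop3px_coissue: bool) -> int:
    """Return the PC that should be restored on trap exit.

    `interrupted_vop3px_coissue` is a hypothetical stand-in for the
    opcode inspection the handler performs; how that check is encoded
    is not described in the source.
    """
    if interrupted_vop3px_coissue:
        # The trap landed mid co-issue: step back so the whole
        # instruction is re-executed after the trap returns.
        return pc - VOP3PX_REWIND_BYTES
    return pc
```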

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
Cc: Shweta Khatri <shweta.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-28 16:21:21 -05:00
Lancelot Six
532e2c87d4 drm/amdkfd: Do not include VGPR MSBs in saved PC during save
The current trap handler uses the top bits of ttmp1 to store a copy of
sq_wave_mode.*vgpr_msb (except for src2_vgpr_msb).  This is so the
effective values in sq_wave_mode can be cleared to ensure correct
behavior of the trap handler.

When saving sq_wave_mode, the trap handler correctly rebuilds the
expected value (with *vgpr_msb restored), so the save area is correct.
However, the PC itself is copied from ttmp[0:1], which contains the
wave's PC as well as the saved MSBs.

The debugger reads the PC from the save area and is confused when non-0
values from VGPR_MSBs are present.

This patch fixes this by saving the PC in the save area's PC slot, not
the composite of the PC and VGPR_MSBs.  On restore, the VGPR_MSBs are
restored from sq_wave_mode.
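
The separation described above can be sketched as below. This is only a model of the idea: the PC width (`PC_BITS`) and the exact position of the saved MSBs in ttmp1 are assumptions for illustration and are not given in the commit message.

```python
# Assumed PC width in bits; hypothetical, not taken from the source.
PC_BITS = 48


def split_ttmp01(ttmp0: int, ttmp1: int, pc_bits: int = PC_BITS):
    """Split the ttmp[0:1] pair into (pc, saved_msbs).

    ttmp0 holds the low 32 bits and ttmp1 the high 32 bits. Bits above
    `pc_bits` are treated as the stashed *vgpr_msb copy (assumed layout).
    """
    composite = ((ttmp1 & 0xFFFFFFFF) << 32) | (ttmp0 & 0xFFFFFFFF)
    pc = composite & ((1 << pc_bits) - 1)   # value for the save area's PC slot
    saved_msbs = composite >> pc_bits       # kept out of the saved PC
    return pc, saved_msbs
```

Writing only `pc` to the save area keeps a debugger from seeing the stashed bits as part of the address.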

Signed-off-by: Lancelot Six <lancelot.six@amd.com>
Tested-by: Alexey Kondratiev <Alexey.Kondratiev@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:24:14 -05:00
Jay Cornwall
bbcad5a889 drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
- Leave DEP_MODE unchanged as it is ignored in the trap handler
- Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:21:51 -05:00
Jay Cornwall
29b703d7ad drm/amdkfd: gfx12.1 cluster barrier context save workaround
Trap cluster barrier may not serialize with user cluster barrier
under some circumstances. Add a check for pending user cluster
barrier complete.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Gang Ba <Gang.Ba@amd.com>
Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:36 -05:00
Jay Cornwall
ea89b305b6 drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
Scalar loads may arrive out-of-order with respect to KMCNT.
The affected code expects the two loads to arrive in-order.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:29 -05:00
Jay Cornwall
d9fc0bdf9c drm/amdkfd: Sync trap handler binary with source
Binary and source desynced during branch activity. Source merge
also introduced a compile error.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:17 -05:00
Dave Airlie
83dc0ba275 Merge tag 'amd-drm-next-6.20-2026-01-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.20-2026-01-09:

amdgpu:
- GPUVM updates
- Initial support for larger GPU address spaces
- Initial SMUIO 15.x support
- Documentation updates
- Initial PSP 15.x support
- Initial IH 7.1 support
- Initial IH 6.1.1 support
- SMU 13.0.12 updates
- RAS updates
- Initial MMHUB 3.4 support
- Initial MMHUB 4.2 support
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- HDMI fixes
- Panel replay improvements
- DML updates
- DC FP fixes
- Initial SDMA 6.1.4 support
- Initial SDMA 7.1 support
- Userq updates
- DC HPD refactor
- SwSMU cleanups and refactoring
- TTM memory ops parallelization
- DCN 3.5 fixes
- DP audio fixes
- Clang fixes
- Misc spelling fixes and cleanups
- Initial SDMA 7.11.4 support
- Convert legacy DRM logging helpers to new drm logging helpers
- Initial JPEG 5.3 support
- Add support for changing UMA size via the driver
- DC analog fixes
- GC 9 gfx queue reset support
- Initial SMU 15.x support

amdkfd:
- Reserved SDMA rework
- Refactor SPM
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- Initial SDMA 7.1 support
- Initial SDMA 6.1.4 support
- Increase the kfd process hash table
- Per context support
- Topology fixes

radeon:
- Convert legacy DRM logging helpers to new drm logging helpers
- Use devm for i2c adapters
- Variable sized array fix
- Misc cleanups

UAPI:
- KFD context support.  Proposed userspace:
  https://github.com/ROCm/rocm-systems/pull/1705
  https://github.com/ROCm/rocm-systems/pull/1701
- Add userq metadata queries for more queue types.  Proposed userspace:
  https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260109154713.3242957-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-01-15 14:49:33 +10:00
Jay Cornwall
ba80939fec drm/amdkfd: Apply VGPR bank state fixup on gfx12.1 trap exit
- Identify co-issue of S_SET_VGPR_MSB and VALU with banked VGPR
- Restore previous bank setting when exiting the trap

v2:
- Refine VOP3PX2 detection
- Improve load pipelining
- Fix a comment typo

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Jay Cornwall
1005ab86cf drm/amdkfd: Fix VGPR bank state save in gfx12.1 trap handler
S_SETREG_IMM32_B32 does not apply a mask to the MODE bank bits.
SRC2 is consequently unconditionally cleared during context save.

Use S_SETREG_B32 instead to preserve SRC2.
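
The difference between a field-masked register write and one that replaces a whole bit group can be modeled as below. This is a toy model under stated assumptions: the bit layout (an 8-bit bank group with SRC2 in its top two bits) and both function names are hypothetical, and the code does not reproduce the hardware's documented semantics.

```python
def setreg_masked(reg: int, value: int, offset: int, size: int) -> int:
    """Write `size` bits of `value` at `offset`, preserving all other bits."""
    mask = ((1 << size) - 1) << offset
    return (reg & ~mask) | ((value << offset) & mask)


def setreg_whole_group(reg: int, value: int, group_offset: int, group_size: int) -> int:
    """Toy model of a write that replaces the entire bit group,
    ignoring the narrower field width the caller intended."""
    return setreg_masked(reg, value, group_offset, group_size)


# Hypothetical layout: 8 bank bits at offset 0, SRC2 in the top two of them.
BANK_OFFSET = 0
BANK_GROUP_BITS = 8
SRC2_SHIFT = 6

mode = 0b10_110101  # SRC2 == 0b10, other bank bits set

# Intended behavior: clear only the low six bits, keep SRC2.
cleared_masked = setreg_masked(mode, 0, BANK_OFFSET, SRC2_SHIFT)
# Unmasked behavior: the whole group is replaced, so SRC2 is lost.
cleared_unmasked = setreg_whole_group(mode, 0, BANK_OFFSET, BANK_GROUP_BITS)
```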

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Mukul Joshi
258cc2b687 drm/amdkfd: Add back CWSR trap handler for GFX 12.1
CWSR Trap handler for GFX 12.1 was missed when merging changes
from 6.14 NPI branch to 6.16 NPI branch. This change adds back
the CWSR trap handler for GFX 12.1.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Jay Cornwall
b7851f8c66 drm/amdkfd: Trap handler support for expert scheduling mode
The trap may be entered with dependency checking disabled.
Wait for dependency counters and save/restore scheduling mode.

v2:

Use ttmp1 instead of ttmp11. ttmp11 is not zero-initialized.
While the trap handler does zero this field before use, a user-mode
second-level trap handler could not rely on this being zero when
using an older kernel mode driver.

v3:

Use ttmp11 primarily but copy to ttmp1 before jumping to the
second level trap handler. ttmp1 is inspectable by a debugger.
Unexpected bits in the unused space may regress existing software.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4238888794)
Cc: stable@vger.kernel.org
2025-12-08 15:22:45 -05:00
Jay Cornwall
4238888794 drm/amdkfd: Trap handler support for expert scheduling mode
The trap may be entered with dependency checking disabled.
Wait for dependency counters and save/restore scheduling mode.

v2:

Use ttmp1 instead of ttmp11. ttmp11 is not zero-initialized.
While the trap handler does zero this field before use, a user-mode
second-level trap handler could not rely on this being zero when
using an older kernel mode driver.

v3:

Use ttmp11 primarily but copy to ttmp1 before jumping to the
second level trap handler. ttmp1 is inspectable by a debugger.
Unexpected bits in the unused space may regress existing software.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 14:20:46 -05:00
Jay Cornwall
424648c383 drm/amdkfd: Fix instruction hazard in gfx12 trap handler
VALU instructions with SGPR source need wait states to avoid hazard
with SALU using different SGPR.

v2: Eliminate some hazards to reduce code explosion

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7e0459d453)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:28:34 -04:00
Lancelot SIX
d584198a6f drm/amdkfd: Ensure consistent barrier state saved in gfx12 trap handler
It is possible for some waves in a workgroup to finish their save
sequence before the group leader has had time to capture the workgroup
barrier state.  When this happens, having those waves exit do impact the
barrier state.  As a consequence, the state captured by the group leader
is invalid, and is eventually incorrectly restored.

This patch proposes to have all waves in a workgroup wait for each other
at the end of their save sequence (just before calling s_endpgm_saved).
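
The synchronization idea can be illustrated with host threads standing in for waves. This is only an analogy: the real handler uses GPU barrier instructions, and every name below is hypothetical. Wave 0 plays the group leader and records how many waves have already exited when it captures state; with the end-of-save barrier in place that count is always zero.

```python
import threading


def run_save(num_waves: int):
    """Simulate a workgroup save where no wave exits before all have saved."""
    end_of_save = threading.Barrier(num_waves)
    lock = threading.Lock()
    exited = []
    captured = {}

    def wave(wave_id: int) -> None:
        if wave_id == 0:
            # Leader captures the (simulated) barrier state before waiting.
            with lock:
                captured["exited_at_capture"] = len(exited)
        # Every wave waits here, so none can exit before the leader's capture.
        end_of_save.wait()
        with lock:
            exited.append(wave_id)

    threads = [threading.Thread(target=wave, args=(i,)) for i in range(num_waves)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return captured["exited_at_capture"], sorted(exited)
```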

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-02-12 19:47:15 -05:00
Jay Cornwall
62498e797a drm/amdkfd: Move gfx12 trap handler to separate file
gfx12 derivatives will have substantially different trap handler
implementations from gfx10/gfx11. Add a separate source file for
gfx12+ and remove unneeded conditional code.

No functional change.

v2: Revert copyright date to 2018, minor comment fixes

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00