Commit Graph

17039 Commits

Author SHA1 Message Date
Alex Sierra
f8692d2f9a drm/amd: include rrmt mode for mes_v12_1
Implement rrmt for misc read/write regs ops in mes_v12.
This covers LOCAL/REMOTE XCD and LOCAL/REMOTE AID.

v2: fix comments (Alex)

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 14:11:56 -05:00
Likun Gao
e5fc897b07 drm/amdgpu: skip SDMA autoload copy if not initialized
Skip SDMA firmware copy for rlc autoload if SDMA not enabled.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 14:11:49 -05:00
Likun Gao
3e22786128 drm/amdgpu: only set XCC 0 related reg for rlc autoload
For RLC autoload, only set XCC id 0 related register to trigger
rlc autoload, no need to trigger muti-times for muti-xcc.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Likun Gao
08ba5ba01f drm/amdgpu: update rlc autoload function
Update rlc autoload function for gfx v12.1.0
to support muti-XCC firmware autoload.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Le Ma
442903eb83 drm/amdgpu: remove hdp flush/invalidation completely for gfx12.1.0/sdma7.1.0
Remove the hdp operation and interfaces as the HDP hw does not exist.

v2: add checks to see if hdp funcs exists before do hdp flush/invalidation

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Likun Gao
6fb01a20da drm/amdgpu: Add gfx v12_1_0 to discovery list
Include gfx v12_1_0 in the discovery list for
gfx IP blocks

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Likun Gao
ad5f1ee0a9 drm/amdgpu: Add initial support for gfx v12_1
Add initial support for gfx ip block for gc v12_1.

V2: Remove some not applicable bit set.
V3: drop unused header (Alex)
v4: Fix type on copyright
v5: fix num_xcc handling (Alex)

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Zhu Lingshan
78996e1e98 amdkfd: record kfd context id into kfd process_info
This commit records the context id of the owner
kfd_process into a kfd process_info when
create it.

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:43 -05:00
Xiaogang Chen
6b6de6266f drm/amdgpu: Don't send warning when close drm obj if drm device has been unplug
During amdgpu_gem_object_close amdgpu driver cleans vm mapping for the closing
drm obj. If the correspondent adev has been unplug got error -ENODEV code. In
this case do not need send warning message.

Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:42 -05:00
Lijo Lazar
018fd6d7d9 drm/amdgpu: Make pre_asic_init optional
pre_asic_init is not required for all SOCs. Make it optional and remove
empty implementations.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:42 -05:00
Pierre-Eric Pelloux-Prayer
42cbb68ce8 drm/amdgpu: remove the ring param from ttm functions
With the removal of the direct_submit argument, the ring param
becomes useless: the jobs are always submitted to buffer_funcs_ring.

Some functions are getting an amdgpu_device argument since they
were getting it from the ring arg.

---
v4: remove adev param from amdgpu_ttm_map_buffer
---

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:40 -05:00
Likun Gao
fb61604a69 drm/amdgpu: create pm4 header for gc v12_1
Duplicated from nvd.h.
Update Release MEM and Acquire MEM pkt header.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:40 -05:00
Jack Xiao
d6d64bf975 drm/amdgpu: Add mes v12_1_0 to discovery list
Include mes v12_1_0 in the discovery list for
mes IP blocks.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Jack Xiao
e220edf2d6 drm/amdgpu/mes_v12_1: initial support for mes_v12_1
Duplicated and rename mes ip version name to v12_1_0.
Fix to access correct ring pipe by xcc_id.
Fix to access correct instance registers by xcc_id.
Fix to access correct index registers by grbm/xcc_id.

v2: rebase (Alex)
v3: fix sw_fini (Alex)

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Pierre-Eric Pelloux-Prayer
73aa1550df drm/amdgpu: remove direct_submit arg from amdgpu_copy_buffer
It was always false.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Lijo Lazar
7b0813f32a drm/amdgpu: Rename userq_mgr_xa to userq_xa
Rename since it is an xarray of userq pointers

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Lijo Lazar
dc21e39fd2 drm/amdgpu: Clean up userq helper functions
Remove userq manager from function signatures. Get the associated
manager from userq itself.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Likun Gao
d73e530290 drm/amdgpu: Set GC family for GC 12.1
Set GC family for GC 12.1.0.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Lijo Lazar
473f12f820 drm/amdgpu: Change user queue interface signatures
A userq is associated with its queue manager. Use that and make
the userqueue interfaces to operate on queue.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Likun Gao
af441be8b7 drm/amdgpu: add support for sdma v7_1
Add support for SDMA v7.1.0 ip block.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:39 -05:00
Mukul Joshi
1c85f12658 drm/amdgpu: Update Generate PTE/PDE SDMA packet for SDMA 7.1
Update the Generate PTE/PDE packet fields for SDMA 7.1.

v2: squash in mtype fix (Mukul)

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
Likun Gao
4ed5116aac drm/amdgpu: Add sdma v7_1_0 support
Add SDMA ip block for SDMA v7_1_0.

v2: squash in queue reset changes (Alex)
v3: squash in version fix (Hawking)
v4: squash in various fixes

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Acked-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
Likun Gao
268614ae41 drm/amdgpu: create pkt header for sdma v7_1
Duplicated from sdma_v6_0_0_pkt_open.h.
Update some pkt for sdma v7_1.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
Le Ma
f9d8e880b1 drm/amdgpu: add soc v1_0 common block for IP discovery
add soc v1_0 common block

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
Hawking Zhang
297b0cebbc drm/amdgpu: Add soc v1_0 support
v1_0 is a new generation ip block

v2: squash in doorbell changes (Alex)
v3: squash in xclk, reset placeholders, pcie r|wreg ext callbacks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
chong li
c71980a3fc drm/amdgpu: reduce the full gpu access time in amdgpu_device_init.
[Why]
function "devm_memremap_pages" in function "kgd2kfd_init_zone_device",
sometimes cost too much time.

[How]
move the function "kgd2kfd_init_zone_device"
after release full gpu access(amdgpu_virt_release_full_gpu).

v2:
improve the coding style.

Signed-off-by: chong li <chongli2@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:38 -05:00
Ed Maste
ecd3fdfbee drm/amd/amdgpu: Add missing newline in DRM_DEBUG_DRIVER message
This error message was emitted without a newline during bring-up on
FreeBSD.  Presumably the error doesn't occur on Linux so was not noticed
before.

Signed-off-by: Ed Maste <emaste@FreeBSD.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Likun Gao
e53833ac82 drm/amdgpu: Add gmc v12_1_0 to discovery list
Include gmc v12_1_0 in the discovery list for
gmc IP blocks.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Mukul Joshi
3a338bb114 drm/amdgpu: Enable PDE.C usage on GFX 12.1
On GFX 12.1, PDE.C is ignored if (PDE|PTE)_REQUEST_PHYSICAL
is not setup in the GCVM control register. Always set this
field to enable PDE.C usage.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Mukul Joshi
b8c27208b2 drm/amdgpu: Always set snoop bit in PDE on GFX 12.1
GFX 12.1 has the requirement to always set snoop bit in PDE
to maintain coherency.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Mukul Joshi
db29ddf650 drm/amdgpu: Add per-ASIC PTE init flag
On GFX12.1, default PTE setup needs an additional bit to be
set. Add PTE initialization flags to handle setup default PTE
on a per-ASIC basis.
While at it, fixup the coding style too.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Hawking Zhang
5056b75fed drm/amdgpu: Add gmc v12_1 gmc callbacks
Implement gmc v12_1 gmc callbacks

v2: revert temporary PDE MTYPE to UC setting

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Likun Gao
036c0e38bb drm/amdgpu: Add gmc v12_1 support
Add gmc support for gc version 12_1_0.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Hawking Zhang
e87bfaf55e drm/amdgpu: Add gfxhub v12_1 support
gfxhub v12_1 is a new generation ip

v2: squash in update to new IP headers
v3: squash in cast fix

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:35 -05:00
Likun Gao
4af2e15c9b drm/amdgpu: Add initial support for mmhub v4_2
Add initial support for mmhub v4_2_0.

v2: squash in cast fix

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
Alex Deucher
d09c6b98f8 drm/amdgpu: fix spelling in gmc9/10 code
onyl -> only

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
Asad Kamal
bd68a1404b drm/amdgpu/ras: Move ras data alloc before bad page check
In the rare event if eeprom has only invalid address entries,
allocation is skipped, this causes following NULL pointer issue
[  547.103445] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  547.118897] #PF: supervisor read access in kernel mode
[  547.130292] #PF: error_code(0x0000) - not-present page
[  547.141689] PGD 124757067 P4D 0
[  547.148842] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  547.158504] CPU: 49 PID: 8167 Comm: cat Tainted: G           OE      6.8.0-38-generic #38-Ubuntu
[  547.177998] Hardware name: Supermicro AS -8126GS-TNMR/H14DSG-OD, BIOS 1.7 09/12/2025
[  547.195178] RIP: 0010:amdgpu_ras_sysfs_badpages_read+0x2f2/0x5d0 [amdgpu]
[  547.210375] Code: e8 63 78 82 c0 45 31 d2 45 3b 75 08 48 8b 45 a0 73 44 44 89 f1 48 8b 7d 88 48 89 ca 48 c1 e2 05 48 29 ca 49 8b 4d 00 48 01 d1 <48> 83 79 10 00 74 17 49 63 f2 48 8b 49 08 41 83 c2 01 48 8d 34 76
[  547.252045] RSP: 0018:ffa0000067287ac0 EFLAGS: 00010246
[  547.263636] RAX: ff11000167c28130 RBX: ff11000127600000 RCX: 0000000000000000
[  547.279467] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff11000125b1c800
[  547.295298] RBP: ffa0000067287b50 R08: 0000000000000000 R09: 0000000000000000
[  547.311129] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  547.326959] R13: ff11000217b1de00 R14: 0000000000000000 R15: 0000000000000092
[  547.342790] FS:  0000746e59d14740(0000) GS:ff11017dfda80000(0000) knlGS:0000000000000000
[  547.360744] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  547.373489] CR2: 0000000000000010 CR3: 000000019585e001 CR4: 0000000000f71ef0
[  547.389321] PKRU: 55555554
[  547.395316] Call Trace:
[  547.400737]  <TASK>
[  547.405386]  ? show_regs+0x6d/0x80
[  547.412929]  ? __die+0x24/0x80
[  547.419697]  ? page_fault_oops+0x99/0x1b0
[  547.428588]  ? do_user_addr_fault+0x2ee/0x6b0
[  547.438249]  ? exc_page_fault+0x83/0x1b0
[  547.446949]  ? asm_exc_page_fault+0x27/0x30
[  547.456225]  ? amdgpu_ras_sysfs_badpages_read+0x2f2/0x5d0 [amdgpu]
[  547.470040]  ? mas_wr_modify+0xcd/0x140
[  547.478548]  sysfs_kf_bin_read+0x63/0xb0
[  547.487248]  kernfs_file_read_iter+0xa1/0x190
[  547.496909]  kernfs_fop_read_iter+0x25/0x40
[  547.506182]  vfs_read+0x255/0x390

This also result in space left assigned to negative values.
Moving data alloc call before bad page check resolves both the issue.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
Srinivasan Shanmugam
d8c2c6c33d drm/amdgpu: Map/Unmap MMIO_REMAP as BAR register window; add TTM sg helpers; wire dma-buf
MMIO_REMAP (HDP flush page) exposes a hardware MMIO register window via
a PCI BAR; there are no struct pages backing it (not normal RAM).  But
when one device shares memory with another through dma-buf, the receiver
still expects a delivery route—a list of DMA-able chunks—called an
sg_table. For the BAR window, we can’t (no pages!), so we instead create
a one-entry list that points directly to the BAR’s physical bus address
and tell DMA: “use this I/O span.” - A single, contiguous byte range on
the PCI bus (start DMA address + length)). That’s why we map it with
dma_map_resource() and set sg_set_page(..., NULL, ...). Perform DMA
reads/writes directly to that range so we build an sg_table from a BAR
physical span and map it with dma_map_resource().

This patch centralizes the BAR-I/O mapping in TTM and wires dma-buf to
it:

Add amdgpu_ttm_mmio_remap_alloc_sgt() /
amdgpu_ttm_mmio_remap_free_sgt(). They walk the TTM resource via
amdgpu_res_cursor, add the byte offset to adev->rmmio_remap.bus_addr,
build a one-entry sg_table with sg_set_page(NULL, …), and map/unmap it
with dma_map_resource().

In dma-buf map/unmap, if the BO is in AMDGPU_PL_MMIO_REMAP, call the new
helpers.

Single place for BAR-I/O handling: amdgpu_ttm.c in
amdgpu_ttm_mmio_remap_alloc_sgt() and ..._free_sgt().
No struct pages: sg_set_page(sg, NULL, cur.size, 0); inside
amdgpu_ttm_mmio_remap_alloc_sgt().
Minimal sg_table: sg_alloc_table(*sgt, 1, GFP_KERNEL); inside
amdgpu_ttm_mmio_remap_alloc_sgt().
Hooked into dma-buf: amdgpu_dma_buf_map()/unmap() in amdgpu_dma_buf.c
call these helpers for AMDGPU_PL_MMIO_REMAP.

v2: squash in fix for set/get tiling

Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
Srinivasan Shanmugam
de8955508b drm/amdgpu/ttm: Pin 4K MMIO_REMAP Singleton BO at Init v2
MMIO_REMAP (HDP flush page) is a hardware I/O window exposed via a PCI
BAR.  It must not migrate or be evicted.

Allocate a single 4 KB GEM BO in AMDGPU_GEM_DOMAIN_MMIO_REMAP during TTM
initialization when the hardware exposes a remap bus address and the
host page size is <= 4 KiB. Reserve the BO and pin it at the TTM level
so it remains fixed for its lifetime. No CPU mapping is established
here.

On teardown, reserve, unpin, and free the BO if present.

This prepares the object to be shared (e.g., via dma-buf) without
triggering placement changes or no CPU-access migration

v2: Added extra NULL checks

Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
YiPeng Chai
54c2d9739b drm/amd/ras: Compatible with legacy sriov host
If sriov host is legacy, the guest uniras will
be disabled.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:34 -05:00
YiPeng Chai
c174a6317b drm/amd/ras: Support sriov uniras to obtain cper data
Support sriov uniras to obtain cper data.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:33 -05:00
YiPeng Chai
60a300780d drm/amdgpu: Add virt command to send VF ras command
Add virt command and interface to send VF ras command.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:33 -05:00
Tao Zhou
f752e79d38 drm/amdgpu: fix the calculation of RAS bad page number
__amdgpu_ras_restore_bad_pages is responsible for the maintenance of bad
page number, drop the unnecessary bad page number update in the error
handling path of add_bad_pages.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:33 -05:00
Rodrigo Siqueira
b9befc9a21 drm/amdgpu: Expand kernel-doc in amdgpu_ring
Expand the kernel-doc about amdgpu_ring and add some tiny improvements.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:32 -05:00
Mukul Joshi
e06d194201 drm/amdgpu: Enable IH CAM on IH 7.1.0
Enable IH CAM to handle retry faults on IH 7.1.0.
Also increase the soft ring size.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:32 -05:00
Hawking Zhang
692c70f4d8 drm/amdgpu: Use ih v7_0 ip block for ih v7_1
ih v7_1 and ih v7_0 share the same ip block implementation

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:32 -05:00
Le Ma
41273a8ce3 drm/amdgpu: Set psp ip block and funcs for v15.0.8
Set psp ip block and funcs for MP0 15.0.8

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:32 -05:00
Hawking Zhang
0c24b83d82 drm/amdgpu: Upload a single sdma fw copy when using psp v15.0.8
driver only need to upload sdma firmware copy for
all sdma instances when using PSP v15.0.8

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:32 -05:00
Hawking Zhang
b02c22359b drm/amdgpu: Set skip_tmr to true for psp v15_0_8
psp v15_0_8 does not require tmr created by gpu driver

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:31 -05:00
Le Ma
c92bb14151 drm/amdgpu: Add psp v15.0.8 ip block v3
Add psp_v15_0_8.c for MPASP 15.0.8

v2: drop memory training intf as they are only
necessary for GDDR memory

v3: Implement psp_v15_0_8_get_fw_type (Feifei)

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08 13:56:31 -05:00