Tao Zhou
d9443ac4f9
drm/amdgpu: drop status query/reset for GCEA 9.4.3 and MMEA 1.8
...
PMFW will be responsible for them.
v2: remove query interfaces.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Shiwu Zhang
626121fce4
drm/amdgpu: update the xgmi ta interface header
...
Update the header file to the v20.00.00.13
v1: rename TA_COMMAND_XGMI__GET_GET_TOPOLOGY_INFO to
TA_COMMAND_XGMI__GET_TOPOLOGY_INFO
And also rename struct ta_xgmi_cmd_get_peer_link_info_output to
ta_xgmi_cmd_get_peer_link_info accordingly
v2: add structs to support xgmi GET_EXTEND_PEER_LINK command
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com >
Reviewed-by: Le Ma <le.ma@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Tao Zhou
8096df7664
drm/amdgpu: add set/get mca debug mode operations
...
Record the debug mode status in RAS.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Tao Zhou
21226f02d7
drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count
...
Simplify the code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Li Ma
9d7a965e22
drm/amdgpu: add clockgating support for NBIO v7.7.1
...
add clockgating support for NBIO ip 7.7.1
Signed-off-by: Li Ma <li.ma@amd.com >
Reviewed-by: Tim Huang <Tim.Huang@amd.com >
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Li Ma
fa9dd7a285
drm/amdgpu: fix missing stuff in NBIO v7.11
...
add get_clockgating_state, update_medium_grain_light_sleep and
update_medium_grain_clock_gating in nbio_v7_11_funcs
v1:
add missing funcs in nbio_v7_11.c
v2:
modify the if condition and add support for nbio v7.11 clockgating.
Signed-off-by: Li Ma <li.ma@amd.com >
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Stanley.Yang
66d64e4e03
drm/amdgpu: Enable RAS feature by default for APU
...
Enable RAS feature by default for aqua vanjaram on apu
platform.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Yang Wang
49c260bef3
drm/amdgpu: fix typo for amdgpu ras error data print
...
typo fix.
Fixes: 5b1270beb3 ("drm/amdgpu: add ras_err_info to identify RAS error source")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Candice Li <candice.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:28 -04:00
Bokun Zhang
017634a68d
drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P4
...
- In VCN 4 SRIOV code path, add code to enable RB decouple feature
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com >
Reviewed-by: Leo Liu <leo.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Bokun Zhang
eb9d6256b9
drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P3
...
- Update VCN header for RB decouple feature
- Add metadata struct, metadata will be placed after each RB
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com >
Reviewed-by: Leo Liu <leo.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Bokun Zhang
fc3136730b
drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P2
...
- Add function to check if RB decouple is enabled under SRIOV
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com >
Reviewed-by: Leo Liu <leo.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Bokun Zhang
97b2821643
drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P1
...
- Update SRIOV header with RB decouple flag
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com >
Reviewed-by: Leo Liu <leo.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Stanley.Yang
8a65661114
drm/amdgpu: Fix delete nodes that have been relesed
...
Fix deletion of nodes that have already been freed.
Fixes: 5b1270beb3 ("drm/amdgpu: add ras_err_info to identify RAS error source")
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com >
Reviewed-by: Yang Wang <kevinyang.wang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Hawking Zhang
f2176d7063
drm/amdgpu: Add UVD_VCPU_INT_EN2 to dpg sram
...
Add RAS-specific programming to dpg sram.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:27 -04:00
Hawking Zhang
9248462d7e
drm/amdgpu: Enable software RAS in vcn v4_0_3
...
Set VCN/JPEG RAS masks to enable software RAS for
VCN and JPEG.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:26 -04:00
Tao Zhou
472c5fb297
drm/amdgpu: define ras_reset_error_count function
...
Simplify the code architecture.
v2: reuse ras_reset_error_count in ras_reset_error_status.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:26 -04:00
Candice Li
afcf949cf3
drm/amdgpu: Log UE corrected by replay as correctable error
...
Support replay mode where UE could be converted to CE.
Signed-off-by: Candice Li <candice.li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-20 15:11:26 -04:00
Dave Airlie
d43c76c820
Merge tag 'drm-misc-fixes-2023-10-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
...
Short summary of fixes pull:
amdgpu:
- Disable AMD_CTX_PRIORITY_UNSET
bridge:
- ti-sn65dsi86: Fix device lifetime
edid:
- Add quirk for BenQ GW2765
ivpu:
- Extend address range for MMU mmap
nouveau:
- DP-connector fixes
- Documentation fixes
panel:
- Move AUX B116XW03 into panel-simple
scheduler:
- Eliminate DRM_SCHED_PRIORITY_UNSET
ttm:
- Fix possible NULL-ptr deref in cleanup
Signed-off-by: Dave Airlie <airlied@redhat.com >
From: Thomas Zimmermann <tzimmermann@suse.de >
Link: https://patchwork.freedesktop.org/patch/msgid/20231019114605.GA22540@linux-uq9g
2023-10-20 14:07:58 +10:00
Felix Kuehling
316baf09d3
drm/amdgpu: Reserve fences for VM update
...
In amdgpu_dma_buf_move_notify reserve fences for the page table updates
in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON
in dma_resv_add_fence when using SDMA for page table updates.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:56:57 -04:00
Felix Kuehling
51b79f3381
drm/amdgpu: Fix possible null pointer dereference
...
abo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 1802537820 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:56:50 -04:00
Felix Kuehling
207430b76a
drm/amdgpu: Reserve fences for VM update
...
In amdgpu_dma_buf_move_notify reserve fences for the page table updates
in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON
in dma_resv_add_fence when using SDMA for page table updates.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:52 -04:00
Felix Kuehling
e6f8588733
drm/amdgpu: Fix possible null pointer dereference
...
abo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 1802537820 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:52 -04:00
Stanley.Yang
b1338a8e71
drm/amdgpu: Workaround to skip kiq ring test during ras gpu recovery
...
This is a workaround: the kiq ring test fails in the suspend stage
during ras recovery.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:52 -04:00
Mario Limonciello
e56690bb37
drm/amd: Read IMU FW version from scratch register during hw_init
...
If the IMU version wasn't discovered from the header, such as when
the firmware was directly loaded by PSP, then there is no firmware
version to show to userspace from sysfs or IOCTL.
The IMU F/W stores the version in the first scratch register though,
so fetch it in these cases to let the driver export it.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
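The fallback described in e56690bb37 can be sketched roughly as below. This is only an illustration, not the driver's code: struct sketch_dev, its field names, and the convention that a version of 0 means "not discovered from the header" are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical device state; the field names are illustrative only. */
struct sketch_dev {
	uint32_t imu_fw_version; /* assumed: 0 means "not discovered from the header" */
	uint32_t scratch_reg0;   /* stand-in for the first IMU scratch register */
};

/*
 * Keep a header-derived version if one exists; otherwise fall back to
 * the value the firmware left in the scratch register.
 */
uint32_t sketch_resolve_imu_version(struct sketch_dev *dev)
{
	if (dev->imu_fw_version == 0)
		dev->imu_fw_version = dev->scratch_reg0;

	return dev->imu_fw_version;
}
```

A version that was already parsed from the header is left untouched, so the fallback only applies to the PSP-loaded case the commit message describes.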
Mario Limonciello
4916615fe9
drm/amd: Don't parse IMU ucode version if it won't be loaded
...
When the IMU ucode is loaded by the PSP, the version parsed from the
firmware that comes from Linux may differ. Rather than showing the wrong
data to kernel interface consumers, avoid populating it in this case.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
Mario Limonciello
d757dfd667
drm/amd: Move microcode init step to early_init()
...
The intention for early init is to find any missing microcode early
and fail the driver load if it's missing. Move this step to earlier
in driver init to match other IP blocks.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
Asad Kamal
d8c1925ba8
drm/amdgpu: update retry times for psp BL wait
...
Increase the retry time for the PSP BL wait to compensate
for the longer time needed to set the c2pmsg 35 ready bit
during mode1 with RAS.
Signed-off-by: Asad Kamal <asad.kamal@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
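The idea behind d8c1925ba8, a bounded poll on a ready bit whose retry budget has to be large enough for the slowest case, might look like the sketch below. Everything here is hypothetical: struct sketch_reg, SKETCH_READY_BIT, and both functions are stand-ins for illustration, and the real driver's register access, delays, and retry counts are not shown in the commit message.

```c
#include <stdint.h>

#define SKETCH_READY_BIT 0x80000000u /* hypothetical ready flag */

/* Hypothetical register model: reports ready after a number of reads. */
struct sketch_reg {
	unsigned int reads_until_ready;
};

uint32_t sketch_read_reg(struct sketch_reg *reg)
{
	if (reg->reads_until_ready > 0) {
		reg->reads_until_ready--;
		return 0;
	}
	return SKETCH_READY_BIT;
}

/*
 * Poll until the ready bit is set or the retry budget runs out.
 * Returns 0 on success and -1 on timeout.
 */
int sketch_wait_ready(struct sketch_reg *reg, unsigned int max_retries)
{
	unsigned int i;

	for (i = 0; i < max_retries; i++) {
		if (sketch_read_reg(reg) & SKETCH_READY_BIT)
			return 0;
		/* A real driver would sleep or delay between polls here. */
	}

	return -1;
}
```

With a budget that is too small the wait times out even though the hardware would have become ready shortly afterwards, which is the situation a larger retry count avoids.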
Alex Deucher
28ab9a02b6
drm/amdgpu/mes11: remove aggregated doorbell code
...
It's not enabled in hardware so the code is dead.
Remove it.
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
Asad Kamal
53dd920c1f
drm/amdgpu : Add hive ras recovery check
...
If one of the devices in the hive detects a
fatal error, a ras recovery reset message must be
sent to the PMFW of all devices in the hive.
For that, add a flag in the hive to indicate that
it is undergoing ras recovery.
Signed-off-by: Asad Kamal <asad.kamal@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:51 -04:00
Mangesh Gadre
2d955a06a5
Revert "drm/amdgpu: Program xcp_ctl registers as needed"
...
This reverts commit 0bdebfef3f .
XCP_CTL register is programmed by firmware and
register access is protected.
Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com >
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com >
Reviewed-by: Asad Kamal <asad.kamal@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:50 -04:00
Lang Yu
ab29ac57ad
drm/amdgpu/umsch: add suspend and resume callback
...
Add missing IP callbacks.
Signed-off-by: Lang Yu <Lang.Yu@amd.com >
Acked-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-19 18:26:50 -04:00
Christian König
6b18ef481f
drm/amdgpu: ignore duplicate BOs again
...
Looks like RADV is actually hitting this.
Signed-off-by: Christian König <christian.koenig@amd.com >
Fixes: ca6c1e210a ("drm/amdgpu: use the new drm_exec object for CS v3")
Acked-by: Alex Deucher <alexander.deucher@amd.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20231017121015.1336786-1-christian.koenig@amd.com
2023-10-19 13:19:44 +02:00
Dave Airlie
27442758e9
Merge tag 'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
...
amd-drm-next-6.7-2023-10-13:
amdgpu:
- DC replay fixes
- Misc code cleanups and spelling fixes
- Documentation updates
- RAS EEPROM Updates
- FRU EEPROM Updates
- IP discovery updates
- SR-IOV fixes
- RAS updates
- DC PQ fixes
- SMU 13.0.6 updates
- GC 11.5 Support
- NBIO 7.11 Support
- GMC 11 Updates
- Reset fixes
- SMU 11.5 Updates
- SMU 13.0 OD support
- Use flexible arrays for bo list handling
- W=1 Fixes
- SubVP fixes
- DPIA fixes
- DCN 3.5 Support
- Devcoredump fixes
- VPE 6.1 support
- VCN 4.0 Updates
- S/G display fixes
- DML fixes
- DML2 Support
- MST fixes
- VRR fixes
- Enable seamless boot in more cases
- Enable content type property for HDMI
- OLED fixes
- Rework and clean up GPUVM TLB flushing
- DC ODM fixes
- DP 2.x fixes
- AGP aperture fixes
- SDMA firmware loading cleanups
- Cyan Skillfish GPU clock counter fix
- GC 11 GART fix
- Cache GPU fault info for userspace queries
- DC cursor check fixes
- eDP fixes
- DC FP handling fixes
- Variable sized array fixes
- SMU 13.0.x fixes
- IB start and size alignment fixes for VCN
- SMU 14 Support
- Suspend and resume sequence rework
- vkms fix
amdkfd:
- GC 11 fixes
- GC 10 fixes
- Doorbell fixes
- CWSR fixes
- SVM fixes
- Clean up GC info enumeration
- Rework memory limit handling
- Coherent memory handling fixes
- Use partial migrations in GPU faults
- TLB flush fixes
- DMA unmap fixes
- GC 9.4.3 fixes
- SQ interrupt fix
- GTT mapping fix
- GC 11.5 Support
radeon:
- Misc code cleanups
- W=1 Fixes
- Fix possible buffer overflow
- Fix possible NULL pointer dereference
UAPI:
- Add EXT_COHERENT memory allocation flags. These allow for system scope atomics.
Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
- Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features
Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713
- Add INFO IOCTL interface to query GPU faults
Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298
Signed-off-by: Dave Airlie <airlied@redhat.com >
From: Alex Deucher <alexander.deucher@amd.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com
2023-10-18 16:08:07 +10:00
Luben Tuikov
fa8391ad68
gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET
...
Eliminate DRM_SCHED_PRIORITY_UNSET, value of -2, whose only user was
amdgpu. Furthermore, eliminate an index bug, in that when amdgpu boots, it
calls drm_sched_entity_init() with DRM_SCHED_PRIORITY_UNSET, which uses it to
index sched->sched_rq[].
Cc: Alex Deucher <Alexander.Deucher@amd.com >
Cc: Christian König <christian.koenig@amd.com >
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com >
Acked-by: Alex Deucher <Alexander.Deucher@amd.com >
Link: https://lore.kernel.org/r/20231017035656.8211-2-luben.tuikov@amd.com
2023-10-17 20:35:38 -04:00
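The index bug described in fa8391ad68 comes from using a negative sentinel as an array subscript. The sketch below only illustrates that hazard with hypothetical names (enum sketch_priority, struct sketch_sched, sketch_pick_rq); it is not the scheduler's code, and the commit itself removes the sentinel rather than adding a bounds check like this one.

```c
#include <stddef.h>

/* Hypothetical priority levels; not the kernel's actual enum. */
enum sketch_priority {
	SKETCH_PRIORITY_MIN = 0,
	SKETCH_PRIORITY_NORMAL,
	SKETCH_PRIORITY_HIGH,
	SKETCH_PRIORITY_COUNT,
	SKETCH_PRIORITY_UNSET = -2 /* sentinel with the same value as the removed one */
};

struct sketch_rq {
	int id;
};

struct sketch_sched {
	struct sketch_rq rq[SKETCH_PRIORITY_COUNT];
};

/*
 * Indexing rq[] with SKETCH_PRIORITY_UNSET would read two elements before
 * the start of the array. Reject anything outside the valid range instead.
 */
struct sketch_rq *sketch_pick_rq(struct sketch_sched *sched, int prio)
{
	if (prio < 0 || prio >= SKETCH_PRIORITY_COUNT)
		return NULL;

	return &sched->rq[prio];
}
```

Removing the sentinel from the enum, as the commit does, means no caller can pass such a value in the first place.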
Luben Tuikov
eab0261967
drm/amdgpu: Unset context priority is now invalid
...
A context priority value of AMD_CTX_PRIORITY_UNSET is now invalid--instead of
carrying it around and passing it to the Direct Rendering Manager--and it
becomes AMD_CTX_PRIORITY_NORMAL in amdgpu_ctx_ioctl(), the gateway to context
creation.
Cc: Alex Deucher <Alexander.Deucher@amd.com >
Cc: Christian König <christian.koenig@amd.com >
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com >
Acked-by: Alex Deucher <Alexander.Deucher@amd.com >
Link: https://lore.kernel.org/r/20231017035656.8211-1-luben.tuikov@amd.com
2023-10-17 20:35:38 -04:00
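The normalization described in eab0261967, replacing the "unset" value with the normal priority at the entry point instead of passing it further down, can be sketched as follows. The macro names and values and the helper sketch_normalize_ctx_priority are hypothetical and chosen for the example; the real constants live in the amdgpu UAPI header.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define SKETCH_CTX_PRIORITY_UNSET  (-2048)
#define SKETCH_CTX_PRIORITY_NORMAL 0

/*
 * Map the "unset" sentinel to the normal priority once, at the boundary,
 * so later code never has to handle the sentinel.
 */
int32_t sketch_normalize_ctx_priority(int32_t requested)
{
	if (requested == SKETCH_CTX_PRIORITY_UNSET)
		return SKETCH_CTX_PRIORITY_NORMAL;

	return requested;
}
```

All other priority values pass through unchanged; any range validation would be a separate step.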
Ma Ke
cd90511557
drm/amdgpu/vkms: fix a possible null pointer dereference
...
In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode()
is assigned to mode, which will lead to a NULL pointer dereference
on failure of drm_cvt_mode(). Add a check to avoid null pointer
dereference.
Signed-off-by: Ma Ke <make_ruc2021@163.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:36:25 -04:00
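The pattern fixed in cd90511557, checking the result of a function that may return NULL before using it, is sketched below in self-contained form. sketch_make_mode stands in for an allocator that can fail and is not drm_cvt_mode(); all names and the failure condition are assumptions made for the example.

```c
#include <stdlib.h>

struct sketch_mode {
	int hdisplay;
	int vdisplay;
};

/* Stand-in for an allocating helper that may fail and return NULL. */
struct sketch_mode *sketch_make_mode(int width, int height)
{
	struct sketch_mode *mode;

	if (width <= 0 || height <= 0)
		return NULL; /* simulated failure */

	mode = malloc(sizeof(*mode));
	if (!mode)
		return NULL;

	mode->hdisplay = width;
	mode->vdisplay = height;
	return mode;
}

/*
 * Returns the number of modes added (0 or 1). The result of
 * sketch_make_mode() is checked before it is dereferenced.
 */
int sketch_add_mode(int width, int height, int *pixels_out)
{
	struct sketch_mode *mode = sketch_make_mode(width, height);

	if (!mode)
		return 0;

	*pixels_out = mode->hdisplay * mode->vdisplay;
	free(mode);
	return 1;
}
```

On failure the caller simply sees that no mode was added, and the output parameter is left untouched.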
Yang Wang
3bba4bc6a0
drm/amdgpu: add RAS error info support for umc_v12_0
...
add RAS error info support for umc_v12_0.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:36:11 -04:00
Yang Wang
8736d17a7f
drm/amdgpu: add RAS error info support for mmhub_v1_8
...
add RAS error info support for mmhub_v1_8.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:36:03 -04:00
Yang Wang
156c2814c2
drm/amdgpu: add RAS error info support for gfx_v9_4_3
...
add RAS error info support for gfx_v9_4_3.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:35:55 -04:00
Yang Wang
dd401cd29a
drm/amdgpu: add RAS error info support for sdma_v4_4_2.
...
add RAS error info support for sdma_v4_4_2.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:35:45 -04:00
Yang Wang
5b1270beb3
drm/amdgpu: add ras_err_info to identify RAS error source
...
Introduce "ras_err_info" to better identify a RAS ERROR source.
NOTE:
For legacy chips, keep the original RAS error print format.
v1:
RAS errors may come from different dies during a RAS error query,
therefore, need a new data structure to identify the source of RAS ERROR.
v2:
- use new data structure 'amdgpu_smuio_mcm_config_info' instead of
ras_err_id (in v1 patch)
- refine ras error dump function name
- refine ras error dump log format
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:35:35 -04:00
Yifan Zhang
6a1c31c7a8
drm/amdgpu: flush the correct vmid tlb for specific pasid
...
flush the correct vmid tlb for specific pasid on gmc 11.
Fixes: 041a574388 ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:34:29 -04:00
Yang Wang
1a00cfab37
drm/amdgpu: make err_data structure built-in for ras_manager
...
(No effect outside the ras_mgr data structure)
Since a new member was added to the ras_err_data data structure,
it becomes unreasonable for the ras_mgr instance to contain this data,
because ras_mgr only uses two members of err_data: ue_count and ce_count.
This patch replaces err_data with built-in structure members,
keeping the code directly compatible.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:34:13 -04:00
Jesse Zhang
e341631f4a
drm/amdgpu: disable GFXOFF and PG during compute for GFX9
...
Temporary workaround to fix issues observed in some compute
applications when GFXOFF is enabled on GFX9.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com >
Acked-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:56 -04:00
Lang Yu
ef2354c70f
drm/amdgpu/umsch: fix missing stuff during rebase
...
These were missed during rebase.
Signed-off-by: Lang Yu <Lang.Yu@amd.com >
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:49 -04:00
Lang Yu
fb5b73acf7
drm/amdgpu/umsch: correct IP version format
...
FW uses IP_VERSION_MAJ_MIN_REV format.
Signed-off-by: Lang Yu <Lang.Yu@amd.com >
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:42 -04:00
Lang Yu
1c1f14a472
drm/amdgpu: don't use legacy invalidation on MMHUB v3.3
...
Legacy invalidation is not supported.
This was missed during rebase.
Signed-off-by: Lang Yu <Lang.Yu@amd.com >
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:29 -04:00
Lang Yu
4661482b9c
drm/amdgpu: correct NBIO v7.11 programing
...
v7.7 was used before; switch to v7.11 now.
Fix incorrect programming.
Signed-off-by: Lang Yu <Lang.Yu@amd.com >
Acked-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:21 -04:00
Xiaogang Chen
ffa88b0019
drm/amdgpu: Correctly use bo_va->ref_count in compute VMs
...
This is needed to correctly handle BOs imported into a compute VM from gfx.
Both kfd and gfx should use the same bo_va and set bo_va->ref_count correctly
when mapping the BOs into the same VM; otherwise we may trigger a kernel general
protection fault when iterating mappings over bo_va's valids or invalids list.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com >
Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com >
Acked-by: Christian König <christian.koenig@amd.com >
Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com >
Tested-by: Xiaogang Chen <Xiaogang.Chen@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:08 -04:00
Lijo Lazar
f20f3b0d6c
drm/amd/pm: Add P2S tables for SMU v13.0.6
...
Add P2S table load support on SMU v13.0.6 ASICs.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Yang Wang <kevinyang.wang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2023-10-13 11:33:01 -04:00