Commit Graph

31376 Commits

Author SHA1 Message Date
Eric Huang
2656e1ce78 drm/amdgpu: add reset sources in gpu reset context
reset source or reset cause is very useful info
for reset context, it will be used by events API.

Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Dillon Varone
3e538e4322 drm/amd/display: Add UCLK p-state support message to dcn401
[WHY&HOW]
Improves on the SMU interface to explicitly declare P-State support.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Alvin Lee
5e211d2cf2 drm/amd/display: Use current_state when checking old_pipe subvp type
[Description]
When checking the subvp type of the previous state we must
pass in current_state to the interface instead of context
otherwise we will get the wrong result.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Bob Zhou
8332f1aaf5 drm/amd/pm: add missing error handling in function smu_v13_0_6_allocate_dpm_context
Check return value to avoid null pointer dereference.

Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Bob Zhou
50151b7f1c drm/amd/pm: Fix the null pointer dereference for vega10_hwmgr
Check return value and conduct null pointer handling to avoid null pointer dereference.

Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Shane Xiao
45bd39fb3b drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10,
clear the bits before setting the new one.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Jesse Zhang
b5b561621d drm/amdkfd: remove dead code in kfd_create_vcrat_image_gpu
kfd_create_vcrat_image_gpu itself checks the avail_size at the start.
So the value of avail_size is at least VCRAT_SIZE_FOR_GPU(16384),
minus struct crat_header(40UL) and struct crat_subtype_compute(40UL) it cannot be less than 0.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00
Jesse Zhang
8178cfb0b4 drm/amdkfd: fix the kdf debugger issue
The expression caps | HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED
and  caps | HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED
are always 1/true regardless of the values of its operand.

Fixes: 9243240bed ("drm/amdkfd: enable single alu ops for gfx12")
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:00 -04:00
Jesse Zhang
46eb63ec8a drm/amdkfd: Comment out the unused variable use_static in pm_map_queues_v9
To fix the warning about unused value,
remove the use_static and use the parameter is_static directly.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:06:52 -04:00
Alvin Lee
e69d43356f drm/amd/display: Move fpo_in_use to stream_status
[Description]
Refactor code and move fpo_in_use into stream_status to avoid
unexpected changes to previous dc_state (i.e., current_state).
Since stream pointers are shared between current and new dc_states,
updating parameters of one stream will update the other as well
which causes unexpected behaviors (i.e., checking that fpo_in_use
isn't set in previous state and set in the new state is invalid).
To avoid incorrect updates to current_state, move the fpo_in_use flag
into dc_stream_status since stream_status is owned by dc and are not
shared between different dc_states.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:06:36 -04:00
Alvin Lee
4621e10e01 drm/amd/display: Only program P-State force if pipe config changed
[Description]
Today for MED update type we do not call update clocks. However, for FPO
the assumption is that update clocks should be called to disable P-State
switch before any HW programming since FPO in FW and driver are not
synchronized. This causes an issue where on a MED update, an FPO P-State
switch could be taking place, then driver forces P-State disallow in the below
code and prevents FPO from completing the sequence. In this case we add a check
to avoid re-programming (and thus re-setting) the P-State force register by
only reprogramming if the pipe was not previously Subvp or FPO. The assumption
is that the P-State force register should be programmed correctly the first
time SubVP / FPO was enabled, so there's no need to update / reset it if the
pipe config has never exited SubVP / FPO.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:06:28 -04:00
Joan Lee
2770b91588 drm/amd/display: Add retires when read DPCD
[why & how]
Sometimes read DPCD return fail while result not retrieved yet. Add
retries mechanism in Replay handle hpd irq to get real result.

Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joan Lee <joan.lee@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:06:18 -04:00
Nicholas Susanto
cc4d6ea0f2 drm/amd/display: Fix DML2 logic to set clk state to min
[Why]

When an eDP with high clock states is going into s0i3, stream_count is
0. This causes DML to not update the clks to the lowest state and
blocking us to enter s0i3 since eDP is out of vmin.

[How]

When stream_count is 0, set all the clocks to the lowest state.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:06:12 -04:00
Jay Cornwall
c5afb313e7 drm/amdkfd: Handle deallocated VPGRs in gfx11+ trap handler
A wavefront may deallocate its VGPRs at the end of a program while
waiting for memory transactions to complete. If it subsequently
receives a context save exception it will be unable to save,
since this requires VGPRs. In this case the trap handler should
terminate the wavefront.

Fixes intermittent VM faults under context switching load.

V2: Use S_ENDPGM instead of S_ENDPGM_SAVED for performance counters

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:57 -04:00
Chris Park
4002a6c55e drm/amd/display: Support new VA page table block size
[Why]
Page table definition increased up to 2MB.

[How]
Define new use case of page table for VA.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:47 -04:00
Joshua Aberback
e902dd7f3e drm/amd/display: workaround for oled eDP not lighting up on DCN401
[Why]
Currently there's an issue on DCN401 that prevents oled eDP panels from
being lit up that is still under investigation. To unblock dev work
while investigating, we can work around the issue by skipping toggling
the enablement of the backlight.

[How]
 - new debug bit that will skip touching backlight enable DPCD for oled

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:33 -04:00
Chun-LiangChang
1349db1581 drm/amd/display: Add params of set_abm_event for VB Scaling
[Why]
Add parameters for set_abm_event to enable varibright scaling.
VariBright Scaling is a feature to refer to system states like

1. Power mode
2. Battery Life percent
3. FullScreen video
4. Backlight slider

to adjust variBright strength to get low power or user experience.

[How]
Add parameters of set_abm_event for VB Scaling

Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Chun-LiangChang <chuchang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:24 -04:00
Joshua Aberback
e86e8798d3 drm/amd/display: Fix swapped dimension calculations
[Why]
The values calculated in optc1_get_otg_active_size are assigned to the
wrong output parameters, vertical blank is being used for horizontal size
and vice versa. This results in DPG test pattern looking wrong during
hardware init, as the DPG dimensions get assigned from this output, and
potentially other issues.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:14 -04:00
Dillon Varone
57c4982169 drm/amd/display: Wait for hardmins to complete on dcn401
[WHY&HOW]
When updating clocks via SMU, DAL needs to wait for requests to be fulfilled
before proceeding.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:05:05 -04:00
Wenjing Liu
7069484dbe drm/amd/display: turn on symclk for dio virtual stream in dpms sequence
[why]
In order to support glitchless display clock ramping for virtual stream,
we must
turn on symclk for stream encoder. The code will power on phy and enable
symclk
for dio encoder during virtual stream dpms sequence.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:04:58 -04:00
yi-lchen
975507d73c drm/amd/display: Keep VBios pixel rate div setting until next mode set
[why]
Vbios & Driver have difference pixel rate div policy.
When enabling fast boot & performing blank & unblank w/o timing setting,
pixel clock & pixel rate dividor are not match.
It would cause too high pixel reate and eDP would be black screen.

[How]
We would keep pixel rate div setting by Vbios until next timing setting.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: yi-lchen <yi-lchen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:04:50 -04:00
Alex Deucher
a9ebd10482 Revert "drm/amdgpu/gfx11: enable gfx pipe1 hardware support"
This reverts commit 6670142d25.

Pierre-Eric reported problems with this on his navi33.  Revert
for now until we understand what is going wrong.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Pierre-eric.Pelloux-prayer@amd.com
2024-06-05 11:03:45 -04:00
Alex Deucher
b592d01df6 drm/amdgpu: update gc_12_0_0 headers
Add some additional registers.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:03:10 -04:00
Jesse Zhang
dfe190aff8 drm/amdkfd: remove dead code in the function svm_range_get_pte_flags
The varible uncached  set false, the condition uncached cannot be true.
So remove the dead code, mapping flags will set the flag AMDGPU_VM_MTYPE_UC in else.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:59 -04:00
Shane Xiao
3928290102 drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_NV10
This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10,
clear the bits before setting the new one.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:43 -04:00
Yifan Zhang
301dfbfc84 drm/amdgpu: disable lane0 L1TLB and enable lane1 L1TLB
This patch to disable lane0 L1TLB and enable lane1 L1TLB.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:37 -04:00
Yifan Zhang
b7e2170b87 drm/amdgpu: init SAW registers for mmhub v3.3
This patch to configure mmhub3.3 SAW registers

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:30 -04:00
Tasos Sahanidis
98f9e5ea47 drm/amdgpu/pptable: Fix UBSAN array-index-out-of-bounds
Flexible arrays used [1] instead of []. Replace the former with the latter
to resolve multiple UBSAN warnings observed on boot with a BONAIRE card.

In addition, use the __counted_by attribute where possible to hint the
length of the arrays to the compiler and any sanitizers.

Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:24 -04:00
Colin Ian King
34a6aa4e12 drm/amd/display: Fix a handful of spelling mistakes
There are a few spelling mistakes in dml2_printf messages. Fix them.

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:14 -04:00
Sunil Khatri
a1a049bd59 drm/amdgpu: fix comments and error message for ipdump
Fix comments and error messages to rightly represent
the information.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:09 -04:00
Sunil Khatri
33837d62a4 drm/amdgpu: rename ip_dump_cp_queues to compute queues
Rename the variable ip_dump_cp_queues to ip_dump_compute_queue
as it represent compute queues.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:02:03 -04:00
Sunil Khatri
34b8d94b6c drm/amdgpu: add cp queue registers for gfx9 ipdump
Add gfx9 support of CP queue registers for all queues
to be used by devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:01:57 -04:00
Sunil Khatri
173ef9182a drm/amdgpu: add print support for gfx9 ipdump
Add support of gfx9 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:01:50 -04:00
Sunil Khatri
514dc965b2 drm/amdgpu: add gfx9 register support in ipdump
Add general registers of gfx9 in ipdump for
devcoredump support.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:01:28 -04:00
Srinivasan Shanmugam
745f7170db drm/amdgpu: Fix type mismatch in amdgpu_gfx_kiq_init_ring
This commit fixes a type mismatch in the amdgpu_gfx_kiq_init_ring
function triggered by the snprintf function expecting unsigned char
arguments due to the '%hhu' format specifier, but receiving int and u32
arguments.

The issue occurred because the arguments xcc_id, ring->me, ring->pipe,
and ring->queue were of type int and u32, not unsigned char. This led to
a type mismatch when these arguments were passed to snprintf.

To resolve this, the snprintf line was modified to cast these arguments
to unsigned char. This ensures that the arguments are of the correct
type for the '%hhu' format specifier and resolves the warning.

Fixes the below:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:4: warning: format
>> specifies type 'unsigned char' but the argument has type 'int'
>> [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                    ^~~~~~
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:12: warning: format
>> specifies type 'unsigned char' but the argument has type 'u32' (aka
>> 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                            ^~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:22: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                                      ^~~~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:34: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
                    xcc_id, ring->me, ring->pipe, ring->queue);
                                                  ^~~~~~~~~~~
   4 warnings generated.

Fixes: 0ea5544555 ("drm/amdgpu: Fix snprintf usage in amdgpu_gfx_kiq_init_ring")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405250446.XeaWe66u-lkp@intel.com/
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:58:43 -04:00
Jesse Zhang
c09d2eff81 drm/amdgu: fix Unintentional integer overflow for mall size
Potentially overflowing expression mall_size_per_umc * adev->gmc.num_umc with type unsigned int (32 bits, unsigned)
is evaluated using 32-bit arithmetic,and then used in a context that expects an expression of type u64 (64 bits, unsigned).

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:58:36 -04:00
Hawking Zhang
a474161e84 drm/amdgpu: Update programming for boot error reporting
AMDGPU_RAS_GPU_ERR_BOOT_STATUS field is no longer valid.
The polling sequence is also simplifed according to
the latest firmware change.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:58:26 -04:00
Alex Deucher
7d3b9668e6 drm/amdgpu/soc24: use common nbio callback to set remap offset
This fixes HDP flushes on systems with non-4K pages.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:58:10 -04:00
Tvrtko Ursulin
730ac57386 drm/amd/display: Convert some legacy DRM debug macros into appropriate categories
Currently when one enables driver debugging dmesg gets spammed, at I
suspect vblank rate, with messages like:

 [drm:amdgpu_dm_atomic_check [amdgpu]] MPO enablement requested on crtc:[00000000f073c3bb]

Fix if by converting some logging from deprecated and incorrect
DRM_DEBUG_DRIVER to drm_dbg_atomic. Plus some localized drive-by changes
to drm_dbg_kms.

By no means an exhaustive conversion but at least it allows turning on
driver debug selectively.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:58:01 -04:00
Hawking Zhang
473af28d3e drm/amdgpu: Estimate RAS reservation when report capacity v2
Add estimate of how much vram we need to reserve for RAS
when caculating the total available vram.

v2: apply the change to MP0 v13_0_2 and v13_0_14

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:57:53 -04:00
Tao Zhou
76bec2a031 drm/amdgpu: use u32 for buf size in __amdgpu_eeprom_xfer
And also make sure the value of msg[1].len should be in the range of u16.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:57:47 -04:00
Jay Cornwall
fda812ebe3 drm/amdkfd: gfx12 context save/restore trap handler fixes
Fix LDS size interpretation: 512 bytes (>= gfx12) vs 256 (< gfx12).

Ensure STATE_PRIV.BARRIER_COMPLETE cannot change after reading or
before writing. Other waves in the threadgroup may cause this field
to assert if they complete the barrier.

Do not overwrite EXCP_FLAG_PRIV.{SAVE_CONTEXT,HOST_TRAP} when
restoring this register. Both of these fields can assert while the
wavefront is running the trap handler.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:57:40 -04:00
David (Ming Qiang) Wu
813e7d4cd0 drm/amdgpu: drop some kernel messages in VCN code
We have messages when the VCN fails to initialize and
there is no need to report on success.
Also PSP loading FWs is the default for production.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:57:34 -04:00
Shane Xiao
eba791dc17 drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_GFX12
This patch changes the implementation of AMDGPU_PTE_MTYPE_GFX12,
clear the bits before setting the new one.
This fixed the potential issue that GFX12 setting memory to NC.

v2: Clear mtype field before setting the new one (Alex)
v3: Fix typo (Felix)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 10:57:24 -04:00
Dave Airlie
bb61cf46b6 Merge tag 'amd-drm-fixes-6.10-2024-05-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.10-2024-05-30:

amdgpu:
- RAS fix
- Fix colorspace property for MST connectors
- Fix for PCIe DPM
- Silence UBSAN warning
- GPUVM robustness fix
- Partition fix
- Drop deprecated I2C_CLASS_SPD

amdkfd:
- Revert unused changes for certain 11.0.3 devices
- Simplify APU VRAM handling

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240530202316.2246826-1-alexander.deucher@amd.com
2024-05-31 08:38:15 +10:00
Heiner Kallweit
67c7d4fa26 drm/amd/pm: remove deprecated I2C_CLASS_SPD support from newly added SMU_14_0_2
Support for I2C_CLASS_SPD  is currently being removed from the kernel.
Only remaining step is to remove the definition of I2C_CLASS_SPD.
Setting I2C_CLASS_SPD  in a driver is a no-op meanwhile, so remove it
here.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-29 17:08:08 -04:00
Rajneesh Bhardwaj
a9bc5a19e4 drm/amdgpu: Make CPX mode auto default in NPS4
On GFXIP9.4.3, make CPX mode as the default compute mode if the node is
setup in NPS4 memory partition mode. This change is only applicable for
dGPU, for APU, continue to use TPX mode.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-29 17:06:48 -04:00
Alex Deucher
1f327dfc84 drm/amdkfd: simplify APU VRAM handling
With commit 89773b8559
("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs")
big and small APU "VRAM" handling in KFD was unified.  Since AMD_IS_APU
is set for both big and small APUs, we can simplify the checks in
the code.

v2: clean up a few more places (Lang)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-29 17:06:15 -04:00
Alex Deucher
dd2b75fd9a Revert "drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices"
This reverts commit 28ebbb4981.

Revert this commit as apparently the LLVM code to take advantage of
this never landed.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Feifei Xu <feifei.xu@amd.com>
2024-05-29 17:06:04 -04:00
Jesse Zhang
a0cf36546c drm/amdgpu: fix dereference null return value for the function amdgpu_vm_pt_parent
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-29 17:03:20 -04:00