Commit Graph

31376 Commits

Author SHA1 Message Date
Yang Wang
a4fcb5f733 Revert "drm/amdgpu: change aca bank error lock type to spinlock"
This reverts commit f6bce954f4.

Revert this patch to modify lock type back to 'mutex' to avoid kernel
calltrace issue.

[  602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu]
[  602.668939] Call Trace:
[  602.668940]  <TASK>
[  602.668941]  dump_stack_lvl+0x4c/0x70
[  602.668945]  dump_stack+0x14/0x20
[  602.668946]  __schedule_bug+0x5a/0x70
[  602.668950]  __schedule+0x940/0xb30
[  602.668952]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668955]  ? hrtimer_reprogram+0x77/0xb0
[  602.668957]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668959]  ? hrtimer_start_range_ns+0x126/0x370
[  602.668961]  schedule+0x39/0xe0
[  602.668962]  schedule_hrtimeout_range_clock+0xb1/0x140
[  602.668964]  ? __pfx_hrtimer_wakeup+0x10/0x10
[  602.668966]  schedule_hrtimeout_range+0x17/0x20
[  602.668967]  usleep_range_state+0x69/0x90
[  602.668970]  psp_cmd_submit_buf+0x132/0x570 [amdgpu]
[  602.669066]  psp_ras_invoke+0x75/0x1a0 [amdgpu]
[  602.669156]  psp_ras_query_address+0x9c/0x120 [amdgpu]
[  602.669245]  umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu]
[  602.669337]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669339]  ? stack_depot_save+0x12/0x20
[  602.669342]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669343]  ? set_track_prepare+0x52/0x70
[  602.669346]  ? kmemleak_alloc+0x4f/0x90
[  602.669348]  ? __kmalloc_node+0x34b/0x450
[  602.669352]  amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu]
[  602.669438]  mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu]
[  602.669554]  mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu]
[  602.669655]  amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu]
[  602.669743]  ? kmemleak_free+0x36/0x60
[  602.669745]  ? kvfree+0x32/0x40
[  602.669747]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669749]  ? kfree+0x15d/0x2a0
[  602.669752]  amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu]
[  602.669839]  amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu]
[  602.669924]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669925]  ? __call_rcu_common.constprop.0+0xa6/0x2b0
[  602.669929]  amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu]
[  602.670014]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.670017]  amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu]
[  602.670103]  amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu]
[  602.670187]  ? srso_alias_return_thunk+0x5/0

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: YiPeng Chai <yipeng.chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:51:23 -04:00
Yang Wang
8c9ee18019 Revert "drm/amdgpu: change bank cache lock type to spinlock"
This reverts commit 258ed689bc

revert this patch to modify lock type back to 'mutex' to avoid kernel
calltrace issue.

[  602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu]
[  602.668939] Call Trace:
[  602.668940]  <TASK>
[  602.668941]  dump_stack_lvl+0x4c/0x70
[  602.668945]  dump_stack+0x14/0x20
[  602.668946]  __schedule_bug+0x5a/0x70
[  602.668950]  __schedule+0x940/0xb30
[  602.668952]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668955]  ? hrtimer_reprogram+0x77/0xb0
[  602.668957]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668959]  ? hrtimer_start_range_ns+0x126/0x370
[  602.668961]  schedule+0x39/0xe0
[  602.668962]  schedule_hrtimeout_range_clock+0xb1/0x140
[  602.668964]  ? __pfx_hrtimer_wakeup+0x10/0x10
[  602.668966]  schedule_hrtimeout_range+0x17/0x20
[  602.668967]  usleep_range_state+0x69/0x90
[  602.668970]  psp_cmd_submit_buf+0x132/0x570 [amdgpu]
[  602.669066]  psp_ras_invoke+0x75/0x1a0 [amdgpu]
[  602.669156]  psp_ras_query_address+0x9c/0x120 [amdgpu]
[  602.669245]  umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu]
[  602.669337]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669339]  ? stack_depot_save+0x12/0x20
[  602.669342]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669343]  ? set_track_prepare+0x52/0x70
[  602.669346]  ? kmemleak_alloc+0x4f/0x90
[  602.669348]  ? __kmalloc_node+0x34b/0x450
[  602.669352]  amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu]
[  602.669438]  mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu]
[  602.669554]  mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu]
[  602.669655]  amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu]
[  602.669743]  ? kmemleak_free+0x36/0x60
[  602.669745]  ? kvfree+0x32/0x40
[  602.669747]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669749]  ? kfree+0x15d/0x2a0
[  602.669752]  amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu]
[  602.669839]  amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu]
[  602.669924]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669925]  ? __call_rcu_common.constprop.0+0xa6/0x2b0
[  602.669929]  amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu]
[  602.670014]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.670017]  amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu]
[  602.670103]  amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu]
[  602.670187]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.670189]  ? __schedule+0x37d/0xb30
[  602.670191]  process_one_work+0x176/0x350
[  602.670194]  worker_thread+0x2f7/0x420
[  602.670197]  ?

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:50:31 -04:00
Alex Deucher
19797687e6 drm/amdgpu: remove amdgpu_mes_fence_wait_polling()
No longer used so remove it.

Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:50:24 -04:00
Alex Deucher
fffe347e14 drm/amdgpu: cleanup MES12 command submission
The approach of having a separate WB slot for each submission doesn't
really work well and for example breaks GPU reset.

Use a status query packet for the fence update instead since those
should always succeed we can use the fence of the original packet to
signal the state of the operation.

While at it cleanup the coding style.

Fixes: ade887c633 ("drm/amdgpu/mes12: Use a separate fence per transaction")
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:49:59 -04:00
Yang Wang
3af2c80ae2 drm/amdgpu: refine gfx10 firmware loading
refine gfx10 firmware loading

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:49:53 -04:00
Yang Wang
23fc94795b drm/amdgpu: refine gfx9 firmware loading
refine gfx9 firmware loading

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:49:28 -04:00
Christian König
de32462541 drm/amdgpu: cleanup MES11 command submission
The approach of having a separate WB slot for each submission doesn't
really work well and for example breaks GPU reset.

Use a status query packet for the fence update instead since those
should always succeed we can use the fence of the original packet to
signal the state of the operation.

While at it cleanup the coding style.

Fixes: eef016ba89 ("drm/amdgpu/mes11: Use a separate fence per transaction")
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:48:25 -04:00
Alex Deucher
6a6eda569b drm/amdgpu: fix UBSAN warning in kv_dpm.c
Adds bounds check for sumo_vid_mapping_entry.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3392
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:48:19 -04:00
Christian König
a6328c9c3d drm/amdgpu: fix using the reserved VMID with gang submit
We need to ensure that even when using a reserved VMID that the gang
members can still run in parallel.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:48:00 -04:00
Victor Lu
b32563859d drm/amdgpu: Do not wait for MP0_C2PMSG_33 IFWI init in SRIOV
SRIOV does not need to wait for IFWI init, and MP0_C2PMSG_33 is blocked
for VF access.

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:47:52 -04:00
Li Ma
4b5b855c24 drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1
[Why]
SMU firmware has not supported MALL PG.

[How]
Disable MALL PG and make it always on until SMU firmware is ready.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:47:27 -04:00
Mukul Joshi
4d14a74054 Revert "drm/amdgpu: Add missing locking for MES API calls"
This reverts commit 3612702852.

This is causing a BUG message during suspend.

[   61.603542] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
[   61.603550] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2028, name: kworker/u64:14
[   61.603553] preempt_count: 1, expected: 0
[   61.603555] RCU nest depth: 0, expected: 0
[   61.603557] Preemption disabled at:
[   61.603559] [<ffffffffc08a3261>] amdgpu_gfx_disable_kgq+0x61/0x160 [amdgpu]
[   61.603789] CPU: 9 PID: 2028 Comm: kworker/u64:14 Tainted: G        W          6.8.0+ #7
[   61.603795] Workqueue: events_unbound async_run_entry_fn
[   61.603801] Call Trace:
[   61.603803]  <TASK>
[   61.603806]  dump_stack_lvl+0x37/0x50
[   61.603811]  ? amdgpu_gfx_disable_kgq+0x61/0x160 [amdgpu]
[   61.604007]  dump_stack+0x10/0x20
[   61.604010]  __might_resched+0x16f/0x1d0
[   61.604016]  __might_sleep+0x43/0x70
[   61.604020]  mutex_lock+0x1f/0x60
[   61.604024]  amdgpu_mes_unmap_legacy_queue+0x6d/0x100 [amdgpu]
[   61.604226]  gfx11_kiq_unmap_queues+0x3dc/0x430 [amdgpu]
[   61.604422]  ? srso_alias_return_thunk+0x5/0xfbef5
[   61.604429]  amdgpu_gfx_disable_kgq+0x122/0x160 [amdgpu]
[   61.604621]  gfx_v11_0_hw_fini+0xda/0x100 [amdgpu]
[   61.604814]  gfx_v11_0_suspend+0xe/0x20 [amdgpu]
[   61.605008]  amdgpu_device_ip_suspend_phase2+0x135/0x1d0 [amdgpu]
[   61.605175]  amdgpu_device_suspend+0xec/0x180 [amdgpu]

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:39 -04:00
Aric Cyr
dcf5e17c05 drm/amd/display: 3.2.289
This version brings along the following:

- DCN401 fixes
- DPIA fixes
- DML21 fixes
- Misc Coverity fixes

Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:33 -04:00
Anthony Koo
27dcb8fb92 drm/amd/display: [FW Promotion] Release 0.0.222.0
- Add new condition for PSR exit due to ESD recovery
 - Add new VB scaling feature for ABM by interpolating between
   existing VB parameters, allowing driver to have fine grain
   scaled VB levels between 0 - 250

Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Anthony Koo <anthony.koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:27 -04:00
Alex Hung
db39d575ee drm/amd/display: Remove redundant null checks
The null checks for aconnector and aconnector->dc_link and
stream redundant as they were already dereferenced previously
as reported by Coverity; therefore the null checks are removed.

This fixes 4 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:21 -04:00
Alex Hung
a7b38c7852 drm/amd/display: Check UnboundedRequestEnabled's value
CalculateSwathAndDETConfiguration_params_st's UnboundedRequestEnabled
is a pointer (i.e. dml_bool_t *UnboundedRequestEnabled), and thus
if (p->UnboundedRequestEnabled) checks its address, not bool value.

This fixes 1 REVERSE_INULL issue reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:14 -04:00
Alex Hung
391c6fb490 drm/amd/display: Remove redundant checks for context
The null checks for context are redundant as it was already
dereferenced previously, as reported by Coverity; therefore
the null checks are removed.

This fixes 2 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:46:08 -04:00
Alex Hung
8a1708328c drm/amd/display: Remove redundant checks for opp
The null checks for opp are redundant as they were already
dereferenced previously, as reported by Coverity; therefore
the null checks are removed.

This fixes 2 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:54 -04:00
Alex Hung
14f293e044 drm/amd/display: Remove redundant null checks
The null checks are redundant as they were already dereferenced
previously, as reported by Coverity; therefore the null checks
are removed.

This fixes 7 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:46 -04:00
Ivan Lipski
9e6da7b70b drm/amd/display: Remove unused value set from 'min_hratio_fact' in dml
These portions of code are flagged as 'UNUSED_VALUE' by the
Coverity analysis since the assigned values of these vars
are never used in the code.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Ivan Lipski <ivlipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:39 -04:00
Alex Hung
f94a97117f drm/amd/display: Remove redundant checks for ctx->dc_bios
The null checks for ctx->dc_bios are redundant as it was already
dereferenced previously, as reported by Coverity; therefore the
null checks are removed.

This fixes 7 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:33 -04:00
Alex Hung
1a664dc0cf drm/amd/display: Remove redundant checks for res_pool->dccg
The null checks for res_pool->dccg are redundant as it was already
dereferenced previously, as reported by Coverity; therefore the
null checks are removed.

This fixes 6 REVERSE_INULL issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:27 -04:00
Rodrigo Siqueira
85fa228745 drm/amd/display: Improve warning log for get OPP for OTG master
If some part of the driver tries to call
resource_get_opp_heads_for_otg_master in a non-OTG master context, DC
will trigger a dmesg warning since this situation indicates that some
configuration associated with ODM slices might be wrong. This commit
adds an extra log to describe why the warning was triggered to make the
debugging more straightforward.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:21 -04:00
Rodrigo Siqueira
26ec3cca7b drm/amd/display: Fix warning caused by an attempt to configure a non-otg master
When booting the system with DCN401, the driver adds the following dmesg
warning:

WARNING: CPU: 8 PID: 175 at
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:1923
resource_get_opp_heads_for_otg_master+0x13/0x70 [amdgpu]

Modules linked in: amdgpu(+) hid_generic amdxcp i2c_algo_bit
drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy
drm_display_helper drm_kms_helper usbhid hid drm i2c_piix4 ahci igc
libahci video wmi

CPU: 8 PID: 175 Comm: systemd-udevd Not tainted 6.8.0-EXTRA-PROMO-MAY-29+ #66
Hardware name: ASUS System Product Name/TUF GAMING X570-PRO (WI-FI),
BIOS 4021 08/10/2021

RIP: 0010:resource_get_opp_heads_for_otg_master+0x13/0x70 [amdgpu]
Code: 8b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 0f 1f 44 00 00 55 48 83 bf f8 07 00 00 00 48 89 e5 74 0c <0f> 0b 31
f6 89 f0 5d e9 0c 65 01 e5 48 83 bf e0 07 00 00 00 75 ea

RSP: 0018:ffffa5f000816ed8 EFLAGS: 00010246
[...]
PKRU: 55555554
Call Trace:
 <TASK>
 ? show_regs+0x65/0x70
 ? __warn+0x85/0x160
 ? resource_get_opp_heads_for_otg_master+0x13/0x70 [amdgpu]
 ? report_bug+0x192/0x1c0
 ? handle_bug+0x44/0x90
 ? exc_invalid_op+0x18/0x70
[...]

This warning is triggered by a check in the function
resource_get_opp_heads_for_otg_master that validates if the request
operation is in a master OTG pipe; if not, the warning above is
displayed. In other words, another part of the code might be calling
this function in a non-OTG master pipe context, resulting in the log
message.

The reason the ASSERT was triggered is that the current state wasn't
updated after applying the context to the hardware. This means that the
update_dsc_for_odm_change might be called from a non-OTG-MASTER. To
prevent this, it's crucial to check if the current reference is pointing
to an OTG master before operate in the old OTG master reference. If it's
not, the function must set the old OTG reference to NULL and avoid
calling resource_get_opp_heads_for_otg_master before the context is
updated.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Co-developed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:45:02 -04:00
Alex Hung
b62ec97d55 drm/amd/display: Covert integers to double before divisions
Integer divisions result in loss of fractional and accuracy is lost
when assigned or compared with double. It is necessary to perform
double/integer instead or explicitly cast them to double.

This fixes 54 UNINTENDED_INTEGER_DIVISION issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:55 -04:00
Alex Hung
7cf24de30e drm/amd/display: Check pipe_ctx before it is used
resource_get_odm_slice_count and resource_get_otg_master_for_stream can
return null, and their returns must be checked before used.

This fixes 4 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:47 -04:00
Alex Hung
470f3760cf drm/amd/display: Check dc_stream_state before it is used
dc_state_get_stream_status dc_state_get_paired_subvp_stream and other
functions can return null, and therefore null must be checked before
status can be used.

This fixes 21 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:41 -04:00
Alvin Lee
c76f56f252 drm/amd/display: Make sure to reprogram ODM when resync fifo
Need to reconfigure ODM when resyncing FIFO because on OTG disable we
clear all ODM programming

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:33 -04:00
Rodrigo Siqueira
5af7571247 drm/amd/display: Fix NULL pointer dereference for DTN log in DCN401
When users run the command:

cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log

The following NULL pointer dereference happens:

[  +0.000003] BUG: kernel NULL pointer dereference, address: NULL
[  +0.000005] #PF: supervisor instruction fetch in kernel mode
[  +0.000002] #PF: error_code(0x0010) - not-present page
[  +0.000002] PGD 0 P4D 0
[  +0.000004] Oops: 0010 [#1] PREEMPT SMP NOPTI
[  +0.000003] RIP: 0010:0x0
[  +0.000008] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[...]
[  +0.000002] PKRU: 55555554
[  +0.000002] Call Trace:
[  +0.000002]  <TASK>
[  +0.000003]  ? show_regs+0x65/0x70
[  +0.000006]  ? __die+0x24/0x70
[  +0.000004]  ? page_fault_oops+0x160/0x470
[  +0.000006]  ? do_user_addr_fault+0x2b5/0x690
[  +0.000003]  ? prb_read_valid+0x1c/0x30
[  +0.000005]  ? exc_page_fault+0x8c/0x1a0
[  +0.000005]  ? asm_exc_page_fault+0x27/0x30
[  +0.000012]  dcn10_log_color_state+0xf9/0x510 [amdgpu]
[  +0.000306]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000003]  ? vsnprintf+0x2fb/0x600
[  +0.000009]  dcn10_log_hw_state+0xfd0/0xfe0 [amdgpu]
[  +0.000218]  ? __mod_memcg_lruvec_state+0xe8/0x170
[  +0.000008]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000002]  ? debug_smp_processor_id+0x17/0x20
[  +0.000003]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000002]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000002]  ? set_ptes.isra.0+0x2b/0x90
[  +0.000004]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000002]  ? _raw_spin_unlock+0x19/0x40
[  +0.000004]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000002]  ? do_anonymous_page+0x337/0x700
[  +0.000004]  dtn_log_read+0x82/0x120 [amdgpu]
[  +0.000207]  full_proxy_read+0x66/0x90
[  +0.000007]  vfs_read+0xb0/0x340
[  +0.000005]  ? __count_memcg_events+0x79/0xe0
[  +0.000002]  ? srso_alias_return_thunk+0x5/0xfbef5
[  +0.000003]  ? count_memcg_events.constprop.0+0x1e/0x40
[  +0.000003]  ? handle_mm_fault+0xb2/0x370
[  +0.000003]  ksys_read+0x6b/0xf0
[  +0.000004]  __x64_sys_read+0x19/0x20
[  +0.000003]  do_syscall_64+0x60/0x130
[  +0.000004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  +0.000003] RIP: 0033:0x7fdf32f147e2
[...]

This error happens when the color log tries to read the gamut remap
information from DCN401 which is not initialized in the dcn401_dpp_funcs
which leads to a null pointer dereference. This commit addresses this
issue by adding a proper guard to access the gamut_remap callback in
case the specific ASIC did not implement this function.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:23 -04:00
Sridevi Arvindekar
319d461551 drm/amd/display: mirror case cleanup for cursors
Mirror case unsupported for cursors. So, remove code for mirror case
with cursors.

Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Sridevi Arvindekar <sarvinde@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:16 -04:00
Alex Hung
0fd146067d drm/amd/display: Add null checker before access structs
Checks null pointer before accessing various structs.

This fixes 5 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19 12:44:02 -04:00
Alex Hung
c4d31653c0 drm/amd/display: Skip wbscl_set_scaler_filter if filter is null
Callers can pass null in filter (i.e. from returned from the function
wbscl_get_filter_coeffs_16p) and a null check is added to ensure that is
not the case.

This fixes 4 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Alex Hung
8b0ddf19cc drm/amd/display: Check BIOS images before it is used
BIOS images may fail to load and null checks are added before they are
used.

This fixes 6 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Alex Hung
8092aa3ab8 drm/amd/display: Add null checker before passing variables
Checks null pointer before passing variables to functions.

This fixes 3 NULL_RETURNS issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Alex Hung
143818fae0 drm/amd/display: Explicitly extend unsigned 16 bit to 64 bit
Coverity reports sign extention defects as below:

Suspicious implicit sign extension: mode->htotal with type u16 ... to
int (32 bits, signed), then sign-extended to type unsigned long
(64 bits, unsigned). If mode->htotal * mode->vtotal is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.

Cast it to unsigned long to avoid possible overflow.

This fixes 4 SIGN_EXTENSION issues reported by Coverity.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Sung Joon Kim
0057b36ac2 drm/amd/display: Send message to notify the DPIA host router bandwidth
[why]
Tell the system about the current host router bandwidth to be used to
measure and calculate the right voltage to be used.

[how]
Send SMU message of each DPIA host router bandwidth.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Dillon Varone
a157dcc521 drm/amd/display: Add null check to dml21_find_dc_pipes_for_plane
When a phantom stream is in the process of being deconstructed, there
could be pipes with no associated planes.  In that case, ignore the
phantom stream entirely when searching for associated pipes.

Cc: stable@vger.kernel.org
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Michael Strauss
37256027b4 drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIA
[WHY]
Empty SST TUs are illegal to transmit over a USB4 DP tunnel.
Current policy is to configure stream encoder to pack 2 pixels per pclk
even when ODM combine is not in use, allowing seamless dynamic ODM
reconfiguration. However, in extreme edge cases where average pixel
count per TU is less than 2, this can lead to unexpected empty TU
generation during compliance testing. For example, VIC 1 with a 1xHBR3
link configuration will average 1.98 pix/TU.

[HOW]
Calculate average pixel count per TU, and block 2 pixels per clock if
endpoint is a DPIA tunnel and pixel clock is low enough that we will
never require 2:1 ODM combine.

Cc: stable@vger.kernel.org # 6.6+
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Mounika Adhuri
2d62bb450e drm/amd/display: Refactor DCN3X into component folder
[why]
Move DCN3X files to unique component folder.

[how]
Create respective component folder in dc, move the DCN3X files into
corresponding new folders and made appropriate changes for compilation
in Makefiles.

Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Mounika Adhuri <moadhuri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Chris Park
3838c67365 drm/amd/display: On clock init, maintain DISPCLK freq
[Why]
On init if a display is connected, we need to maintain the DISPCLK
frequency Even though DPG_EN=1, the display still requires the correct
timing or it could cause audio corruption (if DISPCLK freq is reduced).

[How]
Read the current DISPCLK freq and request the same value to ensure the
timing is valid and unchanged.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Wenjing Liu
3a69c1702f drm/amd/display: fix minor coding errors where dml21 phase 5 uses wrong variables
There is a coding error which causes incorrect variables to be assigned
in DML21 phase 5.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:55 -04:00
Ivan Lipski
2094401053 drm/amd/display: Remove redundant condition in VBA 314 func
[WHY]
Coverity analysis this conditional code as DEADCODE.
The conditional statement is never true since
'MacroTileSizeBytes' is either 256 or 65536. Thus, the
code inside is the conditional statement is never reached.

[HOW]
Removed the conditional statement.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Ivan Lipski <ivlipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:54 -04:00
Ivan Lipski
9061707976 drm/amd/display: Remove redundant condition with DEADCODE
[WHY]
Coverity analysis flagged this condition as DEADCODE since the
variable 'req128_c' is always false, thus the condition is never
true.

[HOW]
Remove the condition.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Ivan Lipski <ivlipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:54 -04:00
Joshua Aberback
6b6d38c508 Revert "drm/amd/display: workaround for oled eDP not lighting up on DCN401"
This reverts commit e902dd7f3e.

A proper fix for this issue has been implemented in DMUB FW. So, no need
to keep the workaround.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Relja Vojvodic
8867ae8cfa drm/amd/display: Add dcn401 DIG fifo enable/disable
[Why]
Found while hotplugging MST daisy chain displays. Changing dispclk
during this sequence caused SMU hang due to DIG fifo not being disabled
correctly (caused by missing functions).

[How]
Adding disable/enable DIG fifo functions for dcn401

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Relja Vojvodic <relja.vojvodic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Dillon Varone
6184bd5750 drm/amd/display: Enable DCN401 idle optimizations by default
[WHY&HOW]
Re-enable idle optimizations by default.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Joshua Aberback
52971387a0 drm/amd/display: DCN401 full power down in HW init if any link enabled
[Why]
During HW init, certain operations the driver performs are invalid on
enabled hardware in an unknown state (for example, setting all clock
values to minimum when the GPU is actively driving a display). There is
already code present to call HWSS->power_down during init when any link
is enabled in HW, but that function pointer is unpopulated for most asics.
We want to enable this codepath for DCN401, as it resolves the issue with
being unable to drive certain display configs on adapter re-enable, and we
can restore boot optimizations.

[How]
 - add power_down HWSS function for DCN401
 - remove debug bit to disable boot optimizations for DCN401

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Yang Wang
3a3be8bb97 drm/amdgpu: refine gfx8 firmware loading
refine gfx8 firmware loading

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Tao Zhou
9d308e32a9 drm/amdkfd: add ASIC version check for the reset selection of RAS poison
GFX v9.4.3 uses mode1 reset, other ASICs choose mode2.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00
Tao Zhou
4280f60e8e drm/amdkfd: use mode1 reset for RAS poison consumption
Per firmware's requirement, replace mode2 with mode1.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:18:27 -04:00