Commit Graph

31376 Commits

Author SHA1 Message Date
Yunxiang Li
ba531117a8 drm/amdgpu: call flush_gpu_tlb directly in gfxhub enable
We are in reset here and already hold the reset_domain write-side lock,
so we can't use the flush TLB helper, which tries to take the read
side.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:59 -04:00
Yunxiang Li
c1f9d82b92 drm/amdgpu: use helper in amdgpu_gart_unbind
When the amdgpu_gart_invalidate_tlb helper was introduced, this part was
left out of the conversion. Avoid the code duplication here.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:59 -04:00
Yunxiang Li
4b0e76e4c1 drm/amdgpu: remove tlb flush in amdgpu_gtt_mgr_recover
At this point the GART is not set up, so there's no point in invalidating
the TLB here, and it could even be harmful.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Yunxiang Li
1802b042a3 drm/amdgpu/kfd: remove is_hws_hang and is_resetting
is_hws_hang and is_resetting serve pretty much the same purpose, and
both duplicate the work of the reset_domain lock; just check that
directly instead. This also eliminates a few bugs listed below and gets
rid of dqm->ops.pre_reset.

kfd_hws_hang did not need to avoid scheduling another reset. If the
on-going reset decided to skip GPU reset we have a bad time, otherwise
the extra reset will get cancelled anyway.

remove_queue_mes forgot to check the is_resetting flag, unlike the
pre-MES path unmap_queue_cpsch, so it did not block hw access during
reset correctly.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Yunxiang Li
5c0a1cdd17 drm/amdgpu: fix sriov host flr handler
We send back the ready to reset message before we stop anything. This is
wrong. Move it to when we are actually ready for the FLR to happen.

In the current state since we take tens of seconds to stop everything,
it is very likely that host would give up waiting and reset the GPU
before we send ready, so it would be the same as before. But this gets
rid of the hack with reset_domain locking and also lets us tell how slow
ready to reset actually is from the host. The ready to reset speed can
be improved later.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Yunxiang Li
b3948ad1ac drm/amdgpu: add skip_hw_access checks for sriov
Accessing registers via host is missing the check for skip_hw_access and
the lockdep check that comes with it.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Eric Huang
bac640ddb5 drm/amdgpu: add reset source in various cases
To fulfill the reset event description.

Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Eric Huang
7bed1df814 drm/amdgpu: fix NULL pointer in amdgpu_reset_get_desc
amdgpu_job_ring may return NULL, which causes a kernel NULL
pointer error. Use another way to print the ring name instead
of ring->name.

Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:58 -04:00
Aric Cyr
6218bd6b22 drm/amd/display: dc 3.2.287
This version brings the following changes:
- Add sequential ONO sequencing for DCN35
- Add new GPINT command definitions
- Reduce ODM slice count to initial new dc state only when needed
- Enable copying of bounding box data from VBIOS DMUB
- Guard reading 3DLUT registers for dcn32/dcn35

Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 16:15:28 -04:00
Sung Joon Kim
df86486d90 drm/amd/display: Fix DSC slice and delay calculations
[why]
There are other factors that determine the number
of DSC slices. The slice count should not be determined
in DML; instead, retrieve the value calculated by the driver.

[how]
Update the logic to determine DSC slice.
Make DSCDelay per display pipe.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:35:20 -04:00
Alex Hung
82b7cde3f2 drm/amd/display: Increase MAX_LINKS by 2
Two additional virtual links are created, so increase the size
of dc->links by two.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:35:10 -04:00
Nicholas Kazlauskas
470679ef33 drm/amd/display: Guard reading 3DLUT registers for dcn32/dcn35
[Why]
3DLUT is not part of the DPP on DCN32/DCN35 ASIC and these registers
now exist in MCM state.

[How]
Add guards when reading DPP state based on whether the register has a
valid offset.

Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:52 -04:00
Dillon Varone
06cd6d8f80 drm/amd/display: Various DML2 fixes for FAMS2
- Ensure SubVP stream settings match ODM policy
- Fix MALL size calculations when DCC is enabled
- Fail if any stream fails DRR policy check

Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:44 -04:00
Alvin Lee
cf58fdca00 drm/amd/display: Program DIG FE source select for DVI before PHY en
[Description]
In newer DCNs the programming of SYMCLK_FE_SRC_SEL depends on
the value of DIG_FE_SOURCE_SELECT. If DIG_FE_SOURCE_SELECT is not
already programmed at the time of PHY / DIG enable then the FW
sequence will program an incorrect SYMCLK source. Ensure that we
program DIG_FE_SOURCE_SELECT for all DIO scenarios (DVI in this
particular case) before going through the PHY / DIG enable sequence.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:31 -04:00
Jesse Zhang
839eb4bbbd drm/amd/pm: remove dead code in navi10_emit_clk_levels and navi10_print_clk_levels
Since the range of the variable i is 0 - 3,
execution cannot reach the default statement.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:20 -04:00
Jesse Zhang
27b500b77b drm/amdgpu: remove dead code in atom_get_src_int
align is computed as align = (attr >> 3) & 7, so its range is 0~7.
In the case of ATOM_ARG_IMM, the code therefore cannot reach the default case,
so there is no need for "break".

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:10 -04:00
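
For 27b500b77b above, the claim that align stays within 0~7 follows directly from the 3-bit mask. A minimal sketch that checks this exhaustively, assuming attr is an 8-bit value; the function names are illustrative and are not taken from atom.c:

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 3-bit alignment selector from an attribute byte. */
unsigned int attr_align(uint8_t attr)
{
	return (attr >> 3) & 7;
}

/* Walk every 8-bit attribute value and confirm align never exceeds 7. */
void check_align_range(void)
{
	for (unsigned int attr = 0; attr <= 0xff; attr++)
		assert(attr_align((uint8_t)attr) <= 7);
}
```

Because the mask keeps only three bits, a switch with cases 0 through 7 is exhaustive and its default branch is unreachable.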
ChunTao Tso
57a0d65bd1 drm/amd/display: Introduce deferred Replay coasting vtotal update
Add functions to defer updating of coasting vtotal after source refresh rate update.

Reviewed-by: Robin Chen <robin.chen@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: ChunTao Tso <chuntao.tso@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:34:02 -04:00
Srinivasan Shanmugam
38e6f715b0 drm/amd/display: Add NULL check for 'afb' before dereferencing in amdgpu_dm_plane_handle_cursor_update
This commit adds a null check for the 'afb' variable in the
amdgpu_dm_plane_handle_cursor_update function. Previously, 'afb' was
checked as if it could be null, but was used later in the code without a
null check. This could potentially lead to a null pointer dereference.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.c:1298 amdgpu_dm_plane_handle_cursor_update() error: we previously assumed 'afb' could be null (see line 1252)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:33:21 -04:00
Srinivasan Shanmugam
ce66ffd981 drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor
This commit adds a null check for the 'afb' variable in the
amdgpu_dm_update_cursor function. Previously, 'afb' was treated as
possibly null at line 8388, but was used later in the code without a
null check. This could potentially lead to a null pointer dereference.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8433 amdgpu_dm_update_cursor()
	error: we previously assumed 'afb' could be null (see line 8388)

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
    8379 static void amdgpu_dm_update_cursor(struct drm_plane *plane,
    8380                                     struct drm_plane_state *old_plane_state,
    8381                                     struct dc_stream_update *update)
    8382 {
    8383         struct amdgpu_device *adev = drm_to_adev(plane->dev);
    8384         struct amdgpu_framebuffer *afb = to_amdgpu_framebuffer(plane->state->fb);
    8385         struct drm_crtc *crtc = afb ? plane->state->crtc : old_plane_state->crtc;
                                         ^^^^^

    8386         struct dm_crtc_state *crtc_state = crtc ? to_dm_crtc_state(crtc->state) : NULL;
    8387         struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
    8388         uint64_t address = afb ? afb->address : 0;
                                    ^^^^^ Checks for NULL

    8389         struct dc_cursor_position position = {0};
    8390         struct dc_cursor_attributes attributes;
    8391         int ret;
    8392
    8393         if (!plane->state->fb && !old_plane_state->fb)
    8394                 return;
    8395
    8396         drm_dbg_atomic(plane->dev, "crtc_id=%d with size %d to %d\n",
    8397                        amdgpu_crtc->crtc_id, plane->state->crtc_w,
    8398                        plane->state->crtc_h);
    8399
    8400         ret = amdgpu_dm_plane_get_cursor_position(plane, crtc, &position);
    8401         if (ret)
    8402                 return;
    8403
    8404         if (!position.enable) {
    8405                 /* turn off cursor */
    8406                 if (crtc_state && crtc_state->stream) {
    8407                         dc_stream_set_cursor_position(crtc_state->stream,
    8408                                                       &position);
    8409                         update->cursor_position = &crtc_state->stream->cursor_position;
    8410                 }
    8411                 return;
    8412         }
    8413
    8414         amdgpu_crtc->cursor_width = plane->state->crtc_w;
    8415         amdgpu_crtc->cursor_height = plane->state->crtc_h;
    8416
    8417         memset(&attributes, 0, sizeof(attributes));
    8418         attributes.address.high_part = upper_32_bits(address);
    8419         attributes.address.low_part  = lower_32_bits(address);
    8420         attributes.width             = plane->state->crtc_w;
    8421         attributes.height            = plane->state->crtc_h;
    8422         attributes.color_format      = CURSOR_MODE_COLOR_PRE_MULTIPLIED_ALPHA;
    8423         attributes.rotation_angle    = 0;
    8424         attributes.attribute_flags.value = 0;
    8425
    8426         /* Enable cursor degamma ROM on DCN3+ for implicit sRGB degamma in DRM
    8427          * legacy gamma setup.
    8428          */
    8429         if (crtc_state->cm_is_degamma_srgb &&
    8430             adev->dm.dc->caps.color.dpp.gamma_corr)
    8431                 attributes.attribute_flags.bits.ENABLE_CURSOR_DEGAMMA = 1;
    8432
--> 8433         attributes.pitch = afb->base.pitches[0] / afb->base.format->cpp[0];
                                    ^^^^^                  ^^^^^
Unchecked dereferences

    8434
    8435         if (crtc_state->stream) {
    8436                 if (!dc_stream_set_cursor_attributes(crtc_state->stream,
    8437                                                      &attributes))
    8438                         DRM_ERROR("DC failed to set cursor attributes\n");
    8439
    8440                 update->cursor_attributes = &crtc_state->stream->cursor_attributes;
    8441
    8442                 if (!dc_stream_set_cursor_position(crtc_state->stream,
    8443                                                    &position))
    8444                         DRM_ERROR("DC failed to set cursor position\n");
    8445
    8446                 update->cursor_position = &crtc_state->stream->cursor_position;
    8447         }
    8448 }

Fixes: 66eba12a54 ("drm/amd/display: Do cursor programming with rest of pipe")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:24:22 -04:00
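
The pattern behind the two 'afb' fixes above can be sketched in isolation. This is only an illustration under stated assumptions: struct fb_sketch and cursor_pitch are hypothetical stand-ins, and the actual patches may guard the dereference differently (for example by skipping the assignment instead of returning 0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the framebuffer fields used at line 8433. */
struct fb_sketch {
	unsigned int pitch_bytes;	/* plays the role of base.pitches[0] */
	unsigned int cpp;		/* plays the role of base.format->cpp[0] */
};

/*
 * Return the cursor pitch in pixels, or 0 when there is no framebuffer.
 * The NULL test mirrors the check that line 8388 already applies to afb.
 */
unsigned int cursor_pitch(const struct fb_sketch *afb)
{
	if (!afb || afb->cpp == 0)
		return 0;
	return afb->pitch_bytes / afb->cpp;
}

/* Exercise both the valid and the NULL path. */
void cursor_pitch_selfcheck(void)
{
	struct fb_sketch fb = { .pitch_bytes = 256, .cpp = 4 };

	assert(cursor_pitch(&fb) == 64);
	assert(cursor_pitch(NULL) == 0);
}
```

The point is consistency: once one use of a pointer is guarded, every later dereference on the same path needs the same guard, which is what the smatch warning reports.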
Lewis Huang
06a498d9f5 drm/amd/display: Add monitor patch skip disable crtc during psr and ips1
[Why]
Some panels cannot handle the pseudo vblank set by OTG resync
when leaving PSR.

[How]
The monitor patch keeps otg_on when entering IPS1,
so we don't need to do an OTG resync on wake-up.

Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Lewis Huang <lewis.huang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:24:16 -04:00
Chiawen Huang
abb3f19cad drm/amd/display: add set ips disable
[How&Why]
Once IPS is active, access to all DCN resources
is not allowed.
A function is needed for 3rd parties to
turn IPS on/off.

Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Chiawen Huang <chiawen.huang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:24:10 -04:00
Dillon Varone
6172d39be2 drm/amd/display: Add recovery timeout to FAMS2
[WHY&HOW]
Add 5ms timeout to trigger recovery and force allow P-State in DMUB.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:24:03 -04:00
Dillon Varone
ba73d69a2c drm/amd/display: Force max clocks unconditionally when p-state is unsupported
[WHY&HOW]
UCLK and FCLK are updated together, so an FCLK update can also cause a UCLK
update to SMU. When this happens, the UCLK provided should be max if switching
is unsupported.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:23:50 -04:00
Wayne Lin
028383b64d drm/amd/display: Change the order of setting DP_IS_USB_C flag
[Why]
enc10->base.features.flags.bits.DP_IS_USB_C will be overwritten if we set it
before initializing enc10->base.features

[How]
Determine DP_IS_USB_C after enc10->base.features is initialized. Besides,
bp_cap_info.DP_IS_USB_C will never be set in get_connector_speed_cap_info().
Remove the redundant code.

Reviewed-by: Roman Li <roman.li@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:23:42 -04:00
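
The ordering problem described in 028383b64d above can be reproduced with a small sketch. All types and function names here are hypothetical stand-ins for enc10->base.features and its initialization, chosen only to show why the flag must be set after the struct is assigned:

```c
#include <assert.h>

/* Hypothetical stand-in for the encoder feature block. */
struct features_sketch {
	struct {
		unsigned int DP_IS_USB_C : 1;
	} flags;
	int max_lanes;
};

struct encoder_sketch {
	struct features_sketch features;
};

/* Buggy order: the flag is set, then the whole struct is overwritten. */
void init_flag_first(struct encoder_sketch *enc,
		     const struct features_sketch *defaults, int is_usb_c)
{
	enc->features.flags.DP_IS_USB_C = is_usb_c ? 1 : 0;
	enc->features = *defaults;	/* wipes the flag set above */
}

/* Fixed order: initialize the struct first, then set the flag. */
void init_flag_last(struct encoder_sketch *enc,
		    const struct features_sketch *defaults, int is_usb_c)
{
	enc->features = *defaults;
	enc->features.flags.DP_IS_USB_C = is_usb_c ? 1 : 0;
}

/* Show that only the second ordering keeps the flag. */
void flag_order_selfcheck(void)
{
	const struct features_sketch defaults = { .max_lanes = 4 };
	struct encoder_sketch enc = {0};

	init_flag_first(&enc, &defaults, 1);
	assert(enc.features.flags.DP_IS_USB_C == 0);

	init_flag_last(&enc, &defaults, 1);
	assert(enc.features.flags.DP_IS_USB_C == 1);
}
```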
Yihan Zhu
a878304276 drm/amd/display: bypass ODM before CRTC off
[WHY]
OPPs couldn't disconnect from the ODM, which caused the pending double buffer not to be latched due to a missing VUPDATE.

[HOW]
Move memory blanking before OTG turn-off to make sure the double buffer is latched correctly.

Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Yihan Zhu <yihan.zhu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:23:34 -04:00
Alex Deucher
91efe6de70 drm/amd/display/dcn401: use pre-allocated temp structure for bounding box
This mirrors what the driver does for older DCN generations.

Should fix:
[   26.924055] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
[   26.924060] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1022, name: modprobe
[   26.924063] preempt_count: 2, expected: 0
[   26.924064] RCU nest depth: 0, expected: 0
[   26.924066] Preemption disabled at:
[   26.924067] [<ffffffffc089e5e0>] dc_fpu_begin+0x30/0xd0 [amdgpu]
[   26.924322] CPU: 9 PID: 1022 Comm: modprobe Not tainted 6.8.0+ #20
[   26.924325] Hardware name: System manufacturer System Product Name/CROSSHAIR VI HERO, BIOS 6302 10/23/2018
[   26.924326] Call Trace:
[   26.924327]  <TASK>
[   26.924329]  dump_stack_lvl+0x37/0x50
[   26.924333]  ? dc_fpu_begin+0x30/0xd0 [amdgpu]
[   26.924589]  dump_stack+0x10/0x20
[   26.924592]  __might_resched+0x16a/0x1c0
[   26.924596]  __might_sleep+0x42/0x70
[   26.924598]  __kmalloc_node_track_caller+0x2ad/0x4b0
[   26.924601]  ? dm_helpers_allocate_gpu_mem+0x12/0x20 [amdgpu]
[   26.924855]  ? dcn401_update_bw_bounding_box+0x2a/0xf0 [amdgpu]
[   26.925122]  kmemdup+0x20/0x50
[   26.925124]  ? kernel_fpu_begin_mask+0x6b/0xe0
[   26.925127]  ? kmemdup+0x20/0x50
[   26.925129]  dcn401_update_bw_bounding_box+0x2a/0xf0 [amdgpu]
[   26.925393]  dc_create+0x311/0x670 [amdgpu]
[   26.925649]  amdgpu_dm_init+0x2aa/0x1fa0 [amdgpu]
[   26.925903]  ? irq_work_queue+0x38/0x50
[   26.925907]  ? vprintk_emit+0x1e7/0x270
[   26.925910]  ? dev_printk_emit+0x83/0xb0
[   26.925914]  ? amdgpu_device_rreg+0x17/0x20 [amdgpu]
[   26.926133]  dm_hw_init+0x14/0x30 [amdgpu]

v2: drop extra memcpy

Fixes: 669d6b078e ("drm/amd/display: avoid large on-stack structures")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: George Zhang <george.zhang@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: harry.wentland@amd.com
Cc: sunpeng.li@amd.com
Cc: Rodrigo.Siqueira@amd.com
2024-06-14 15:23:12 -04:00
Alex Deucher
afe9555e79 drm/amd/display: use pre-allocated temp structure for bounding box
This mirrors what the driver does for older DCN generations.

Should fix:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
[<ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
Workqueue: events_unbound async_run_entry_fn

v2: drop extra memcpy

Fixes: 88c61827ce ("drm/amd/display: dynamically allocate dml2_configuration_options structures")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: George Zhang <George.zhang@amd.com> (v1)
Suggested-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: George Zhang <george.zhang@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: harry.wentland@amd.com
Cc: sunpeng.li@amd.com
Cc: Rodrigo.Siqueira@amd.com
2024-06-14 15:22:33 -04:00
Frank Min
faa64f633c drm/amdgpu: add sdma 7.0 support for copy dcc buffer
1. Add dcc buffer flag for copy buffer
2. Add sdma 7.0 support for copying dcc buffer

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:22:14 -04:00
Likun Gao
7c85e97083 drm/amdgpu: support for DCC feature
Deal with AMDGPU_GEM_CREATE_GFX12_DCC to set DCC bit
when needed.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:21:51 -04:00
Alex Deucher
6b83b94a94 drm/amdgpu: add additional VM bits
Add additional VM PTE bits.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:20:56 -04:00
Thorsten Blum
4aa1f20251 drm/amd/display: Simplify if conditions
The if conditions !A || A && B can be simplified to !A || B.

Fixes the following Coccinelle/coccicheck warnings reported by
excluded_middle.cocci:

	WARNING !A || A && B is equivalent to !A || B
	WARNING !A || A && B is equivalent to !A || B
	WARNING !A || A && B is equivalent to !A || B

Compile-tested only.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:20:47 -04:00
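
The equivalence used in 4aa1f20251 above can be confirmed with a four-row truth table. A minimal sketch with illustrative function names (nothing here is taken from the patched files):

```c
#include <assert.h>
#include <stdbool.h>

/* Form flagged by excluded_middle.cocci. */
bool original_form(bool a, bool b)
{
	return !a || (a && b);
}

/*
 * Simplified form: the right operand is only evaluated when !a is false,
 * so a is already known to be true there and "a &&" adds nothing.
 */
bool simplified_form(bool a, bool b)
{
	return !a || b;
}

/* Compare both forms on all four input combinations. */
void check_forms_agree(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert(original_form(a, b) == simplified_form(a, b));
}
```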
Jack Chang
f1934de46f drm/amd/display: Extend PSRSU residency mode
Support multiple PSRSU residency measurement modes.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Jack Chang <jack.chang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:20:40 -04:00
Nicholas Kazlauskas
0db6657274 drm/amd/display: Add outbox notification support for HPD redetect
[Why]
HPD sense changes can occur during low power states and need to be
notified from firmware to driver. Upon notification the hotplug
redetection routines should execute.

[How]
Add support in DMUB srv and DMUB srv stat for receiving these
notifications. DM can hook them up and process the HPD redetection
once received.

Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14 15:20:20 -04:00
Maxime Ripard
14731a640e Merge drm/drm-fixes into drm-misc-fixes
Roll -rc3 and current drm/fixes in.

This will also unstick our for-next branch.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-06-14 09:55:46 +02:00
Dave Airlie
1ddaaa2440 Merge tag 'amd-drm-next-6.11-2024-06-07' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-06-07:

amdgpu:
- DCN 4.0.x support
- DCN 3.5 updates
- GC 12.0 support
- DP MST fixes
- Cursor fixes
- MES11 updates
- MMHUB 4.1 support
- DML2 Updates
- DCN 3.1.5 fixes
- IPS fixes
- Various code cleanups
- GMC 12.0 support
- SDMA 7.0 support
- SMU 13 updates
- SR-IOV fixes
- VCN 5.x fixes
- MES12 support
- SMU 14.x updates
- Devcoredump improvements
- Fixes for HDP flush on platforms with >4k pages
- GC 9.4.3 fixes
- RAS ACA updates
- Silence UBSAN flex array warnings
- MMHUB 3.3 updates

amdkfd:
- Contiguous VRAM allocations
- GC 12.0 support
- SDMA 7.0 support
- SR-IOV fixes

radeon:
- Backlight workaround for iMac
- Silence UBSAN flex array warnings

UAPI:
- GFX12 modifier and DCC support
  Proposed Mesa changes:
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29510
- KFD GFX ALU exceptions
  Proposed ROCdebugger changes:
  08c760622b
  944fe1c141
- KFD Contiguous VRAM allocation flag
  Proposed ROCr/HIP changes:
  f7b4a26991
  26e8530d05
  1d48f2a1ab

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607195900.902537-1-alexander.deucher@amd.com
2024-06-11 14:01:55 +10:00
Arunpravin Paneer Selvam
31849bf07e drm/amdgpu: Fix the BO release clear memory warning
This happens when amdgpu_bo_release_notify runs
before amdgpu_ttm_set_buffer_funcs_status sets the buffer
funcs to enabled.

Check that the buffer funcs are enabled before filling
the buffer memory.

v2:(Christian)
  - Apply it only for GEM buffers and since GEM buffers are only
    allocated/freed while the driver is loaded we never run into
    the issue to clear with buffer funcs disabled.

v3:(Mario)
  - drop the stable tag as this will presumably go into a
    -fixes PR for 6.10

Log snip:
*ERROR* Trying to clear memory with ring turned off.
RIP: 0010:amdgpu_bo_release_notify+0x201/0x220 [amdgpu]

Fixes: a68c7eaa7a ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Richard Gong <richard.gong@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240610180401.9540-1-Arunpravin.PaneerSelvam@amd.com
2024-06-10 23:46:33 +05:30
Tasos Sahanidis
c6c4dd5401 drm/amdgpu/pptable: Fix UBSAN array-index-out-of-bounds
Flexible arrays used [1] instead of []. Replace the former with the latter
to resolve multiple UBSAN warnings observed on boot with a BONAIRE card.

In addition, use the __counted_by attribute where possible to hint the
length of the arrays to the compiler and any sanitizers.

Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 13:43:34 -04:00
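
The change described in c6c4dd5401 above, replacing [1] with [] and annotating the length, can be sketched in plain C. This is an illustration under stated assumptions: the struct and function names are hypothetical, and COUNTED_BY is a local macro standing in for the kernel's __counted_by, expanding to nothing on compilers that lack the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the counted_by attribute where available, otherwise a no-op. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Old style: a one-element array standing in for a variable-length tail. */
struct table_old {
	unsigned char count;
	unsigned int entries[1];
};

/* New style: a C99 flexible array member annotated with its length. */
struct table_new {
	unsigned char count;
	unsigned int entries[] COUNTED_BY(count);
};

/* Allocate a table with room for count entries and fill it. */
struct table_new *table_new_alloc(unsigned char count)
{
	struct table_new *t;

	t = malloc(sizeof(*t) + count * sizeof(t->entries[0]));
	if (!t)
		return NULL;
	t->count = count;	/* set the length before touching entries */
	for (unsigned int i = 0; i < count; i++)
		t->entries[i] = i * 10;
	return t;
}

/* Index past the first element, which the [1] declaration would flag. */
void table_new_selfcheck(void)
{
	struct table_new *t = table_new_alloc(3);

	assert(t != NULL);
	assert(t->entries[2] == 20);
	free(t);
}
```

With the old declaration, a bounds sanitizer sees entries[2] as an access beyond a one-element array; with the flexible array member the declared bound comes from count instead.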
Mario Limonciello
267cace556 drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms
commit cd94d1b182 ("dm/amd/pm: Fix problems with reboot/shutdown for
some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues
that were reported since commit 31729e8c21 ("drm/amd/pm: fixes a
random hang in S4 for SMU v13.0.4/11") but caused issues for some
people.

Adjust the workaround flow to properly only apply in the S4 case:
-> For shutdown go through SMU_MSG_PrepareMp1ForUnload
-> For S4 go through SMU_MSG_GfxDeviceDriverReset and
   SMU_MSG_PrepareMp1ForUnload

Reported-and-tested-by: lectrode <electrodexsnet@gmail.com>
Closes: https://github.com/void-linux/void-packages/issues/50417
Cc: stable@vger.kernel.org
Fixes: cd94d1b182 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users")
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 13:41:56 -04:00
Tao Zhou
b95fa494d6 drm/amdgpu: add RAS is_rma flag
Set the flag to true if the number of bad pages reaches the threshold.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Srinivasan Shanmugam
15c2990e0f drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing
This commit adds null checks for the 'stream' and 'plane' variables in
the dcn30_apply_idle_power_optimizations function. These variables were
previously treated as possibly null at line 922, but they were used later in
the code without checking if they were null. This could potentially lead
to a null pointer dereference, which would cause a crash.

The null checks ensure that 'stream' and 'plane' are not null before
they are used, preventing potential crashes.

Fixes the below static smatch checker:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922)
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Jesse Zhang
17035a45f1 drm/amd/pm: remove dead code in si_convert_power_level_to_smc
Since gmc_pg is false, the code setting SISLANDS_SMC_MC_PG_EN in mcFlags cannot be reached.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Fangzhi Zuo
1ff6631bae drm/amd/display: Prevent IPX From Link Detect and Set Mode
IPX involvement has been proven to affect LT, causing link loss. Prevent
IPX from being enabled during the LT process; link detect and set mode are
the main procedures in which LT takes place.

Reviewed-by: Roman Li <roman.li@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Jesse Zhang
7f7f43f28e drm/amdkfd: remove logically dead code
idr_for_each_entry ensures that mem is not NULL during the loop,
so there is no need to check mem again.

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Lin.Cao
c8ad1bbbc2 drm/amdgpu: fix failure mapping legacy queue when FLR
The flag "mes.ring.sched.ready" is set to true after mes hw init and set
to false on mes hw fini to avoid duplicate initialization. But hw fini
is not called on function level reset, so mes hw init is skipped
during FLR, which in turn causes mapping the legacy queue to
fail. Setting this flag to false on post reset fixes this issue.

Signed-off-by: Lin.Cao <lincao12@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Daniel Sa
2874129903 drm/amd/display: Fetch Mall caps from DC
[Why]
When performing P-State switching with SubVP on 8k (downscaled to 4k),
corruption can be seen on the screen. MALL data was not being fetched
from DC, and the system thinks there is more MALL space than what is
actually available.

[How]
Read MALL size from dc caps.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Daniel Sa <daniel.sa@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Samson Tam
5d74be8c3a drm/amd/display: fix YUV video color corruption in DCN401
[Why]
Missing check causes sequence error which results in chroma
filter coefficients not being updated in certain modes
when we display YUV video in fullscreen. This results in
color corruption in video.

[How]
Add back chroma_coef_mode check in dscl_set_scl_filter
so that filter coefficients are calculated and updated when
we have YUV surface.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Samson Tam <samson.tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Eric Huang
dbe2c4c8ab drm/amdkfd: add reset cause in gpu pre-reset smi event
The reset cause was requested by a customer as additional
info for the gpu reset smi event.

v2: integrate reset sources suggested by Lijo Lazar

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Frank Min
3c7758beb2 drm/amdgpu: Update soc24_enum.h and soc21_enum.h
Update to latest changes.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Frank Min
978f5428c9 drm/amdgpu: Set PTE_IS_PTE bit for gfx12
Set PTE_IS_PTE bit while PRT is enabled on gfx12.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:14 -04:00
Relja Vojvodic
239612c376 drm/amd/display: Updated optc401_set_drr to use dcn401 functions
why:
optc401_set_drr was using an old optc3 function to update vtotal min and max,
causing crashes when disabling FAMS2.

how:
Updated dcn401 to point to the optc401 function for vtotal updates. This version of
the function has FAMS2 logic that allows for FAMS2 to be disabled.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Relja Vojvodic <relja.vojvodic@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05 11:25:13 -04:00