Commit Graph

29120 Commits

Author SHA1 Message Date
Sung Joon Kim
4ea7151f6b drm/amd/display: Modify SMU message logs
[why]
It's important to make sure SMU messages
are logged by default to improve debugging for
power optimization use cases.

[how]
Change logs to warnings when SMU message
returns non-success id.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:55 -04:00
Stanley.Yang
80285ae1ec drm/amdgpu: Fix potential null pointer derefernce
The amdgpu_ras_get_context may return NULL if device
not support ras feature, so add check before using.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:46 -04:00
Yifan Zhang
098c13079c drm/amd/display: enable S/G display for for recent APUs by default
With S/G display becomes stable, enable S/G display for recent APUs
by default rather than white list.

v2: explicitly disable sg on pre-CZ chips (Alex)
v3: add parens for every clause (Alex)

Co-authored-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:32 -04:00
Lijo Lazar
b3e73b5a8f Documentation/amdgpu: Add FRU attribute details
Add documentation for the newly added manufacturer and fru_id attributes
in sysfs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:25 -04:00
Lijo Lazar
ac6b1f275f drm/amdgpu: Add more FRU field information
Add support to read Manufacturer Name and FRU File Id fields. Also add
sysfs device attributes for external usage.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:17 -04:00
Lijo Lazar
8a2b51392a drm/amdgpu: Refactor FRU product information
Keep FRU related information together in a separate structure.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:08 -04:00
Yang Wang
be2e8aca06 drm/amdgpu: enable FRU device for SMU v13.0.6
v1:
enable GFX v9.4.3 FRU device to query board information.

v2:
use MP1 version to identify different asic

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:58 -04:00
Boyuan Zhang
6cb8e3ee3a drm/amdgpu: update ib start and size alignment
Update IB starting address alignment and size alignment with correct values
for decode and encode IPs.

Decode IB starting address alignment: 256 bytes
Decode IB size alignment: 64 bytes
Encode IB starting address alignment: 256 bytes
Encode IB size alignment: 4 bytes

Also bump amdgpu driver version for this update.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:39 -04:00
Gabe Teeger
647cf51519 drm/amd/display: add check in validate_only in dml2
[what]
does_configuration_meet_sw_policies check was not done in the
validate_only portion of dml2, so some unsupported modes were passing bw
validation, only to fail the same check later in validate_and_build. now
we add the check to validate_only.

Also add line in dcn35_resource to ensure that value set for
enable_windowed_mpo_odm gets passed to dml.

[why]
Immediate black screen during video playback at 4k144hz. The debugger
showed that we were failing validation in dml on every updateplanes().

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:28 -04:00
Daniel Miess
cbe069f5e6 drm/amd/display: Port replay vblank logic to DML2
Update DML2 with replay vblank logic found in DML1.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Daniel Miess <daniel.miess@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:21 -04:00
Saaem Rizvi
ba85d293a3 drm/amd/display: Modify Pipe Selection for Policy for ODM
[Why]
There are certain cases during a transition to ODM that might cause
corruption on the display. This occurs when we choose certain pipes in a
particular state.

[How]
We now will store the pipe indexes of the any pipes that might be
problematic to switch to during an ODM transition, and only use them as
a last resort.

Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Saaem Rizvi <syedsaaem.rizvi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:14 -04:00
Charlene Liu
0e56de91ed drm/amd/display: correct dml2 input and dlg_refclk
dc->dml2_options.use_native_pstate_optimization flag will make driver
use dcn32 legacy_svp_drr related tuning. Set this to false fixed the
stutter underflow issue also based on HW suggest disable ODM by default
and let DML choose it.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:06 -04:00
Sung Joon Kim
969fe903ee drm/amd/display: Fix Chroma Surface height/width initialization
[why]
Surface height/width for Chroma has another variable that it should be
intialized to, chroma_size. Fixing this will help pass DML2.0 validation
for YCbCr420 tests, DCHB006.109,129, DCHB014.011,012.

[how]
Assign SurfaceHeight/WidthC to chroma_size.height/width

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:59 -04:00
Taimur Hassan
9158920cc8 drm/amd/display: Move stereo timing check to helper
Rework dml2_map_dc_pipes to keep the logic clean.

Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Taimur Hassan <syed.hassan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:53 -04:00
Taimur Hassan
21eeb05114 drm/amd/display: Split pipe for stereo timings
[Why & How]
DML2 did not carry over DML1 logic that splits pipe for stero timings. Pipe
splitting is needed in this case to pass stereo tests.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Taimur Hassan <syed.hassan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:45 -04:00
Sung Joon Kim
1d93c4db4e drm/amd/display: Use fixed DET Buffer Size
[why]
Regression from DML1.0 where we use differen DET buffer sizes for each
pipe. From the spec, we need to use DET buffer size of 384 kb for each
pipe

[how]
Ensure to use 384 kb DET buffer sizes for each available pipe.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:27 -04:00
Sung Joon Kim
e47d7ca757 drm/amd/display: Handle multiple streams sourcing same surface
[why]
There are cases where more than 1 stream can be mapped to the same
surface. DML2.0 does not seem to handle these cases.

[how]
Make sure to account for the stream id when deriving the plane id. By
doing this, each plane id will be unique based on the stream id.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:19 -04:00
Charlene Liu
eb918cbba1 drm/amd/display: Add z8_marks in dml
Add z8 watermarks to struct for later ASIC use.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:50:12 -04:00
Qingqing Zhuo
115009d11c drm/amd/display: Add DCN35 DML2 support
Enable DML2 for DCN35.

Changes since V1:
- Remove hard coded values

Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:49:39 -04:00
Qingqing Zhuo
7966f319c6 drm/amd/display: Introduce DML2
DC is transitioning from DML to DML2, and this commit introduces all the
required changes for some of the already available ASICs and adds the
required code infra to support new ASICs under DML2. DML2 is also a
generated code that provides better mode verification and programming
models for software/hardware, and it enables a better way to create
validation tools. This version is more like a middle step to the
complete transition to the DML2 version.

Changes since V1:
- Alex: Fix typos

Changes since V2:
- Update DC includes

Changes since V3:
- Fix 32 bit compilation issues on x86

Changes since V4:
- Avoid compilation of DML2 on some not supported 32-bit architecture
- Update commit message

Co-developed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Co-developed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:51 -04:00
Rodrigo Siqueira
6e2c4941ce drm/amd/display: Move dml code under CONFIG_DRM_AMD_DC_FP guard
For some reason, the dml code is not guarded under CONFIG_DRM_AMD_DC_FP
in the Makefile. This commit moves the dml code under the DC_FP guard.

Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:45 -04:00
Rodrigo Siqueira
a2719f91a1 drm/amd/display: Move bw_fixed from DML folder
bw_fixed does not need any FPU operation, and it is used on DCE and DCN.
For this reason, this commit moves bw_fixed to the basic folder outside
DML.

Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:37 -04:00
Rodrigo Siqueira
13f9173af8 drm/amd/display: Move custom_float from DML folder
The custom_float file does not have any FPU operation, so it should be
inside DML. This commit moves the file to the basic folder.

Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:31 -04:00
Rodrigo Siqueira
d310d18bfc drm/amd/display: Move dce_calcs from DML folder
dce_calcs does not have FPU operations, and it is required for DCE and
DCN. Remove this file from the DML folder and add it to the basic folder
visible for DCE and DCN.

Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:22 -04:00
Alex Deucher
f8cd72728b drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)
When ROCm is active enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

v2: move to the swsmu code since we need both bits active in
    the workload mask.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:16 -04:00
Sebastian Andrzej Siewior
2091ac6903 drm/amd/display: Move the memory allocation out of dcn20_validate_bandwidth_fp().
dcn20_validate_bandwidth_fp() is invoked while FPU access has been
enabled. FPU access requires disabling preemption even on PREEMPT_RT.
It is not possible to allocate memory with disabled preemption even with
GFP_ATOMIC on PREEMPT_RT.

Move the memory allocation before FPU access is enabled.
To preserve previous "clean" state of "pipes" add a memset() before the
second invocation of dcn20_validate_bandwidth_internal() where the
variable is used.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:08 -04:00
Sebastian Andrzej Siewior
941e8036a4 drm/amd/display: Move the memory allocation out of dcn21_validate_bandwidth_fp().
dcn21_validate_bandwidth_fp() is invoked while FPU access has been
enabled. FPU access requires disabling preemption even on PREEMPT_RT.
It is not possible to allocate memory with disabled preemption even with
GFP_ATOMIC on PREEMPT_RT.

Move the memory allocation before FPU access is enabled.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217928
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:47:59 -04:00
Sebastian Andrzej Siewior
80364500c0 drm/amd/display: Add a warning if the FPU is used outside from task context.
Add a warning if the FPU is used from any context other than task
context. This is only precaution since the code is not able to be used
from softirq while the API allows it on x86 for instance.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:47:37 -04:00
Mario Limonciello
0f0e59075b drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:47:23 -04:00
Mario Limonciello
760efbca74 drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
For pptable structs that use flexible array sizes, use flexible arrays.

Suggested-by: Felix Held <felix.held@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:46:54 -04:00
Kees Cook
c8e7df374b drm/amdgpu: Annotate struct amdgpu_bo_list with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-hardening@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1]
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam
e0a3e7bf62 drm/amdgpu: Drop unnecessary return statements
There is no reason to call return at the end of function that
returns void.

Fixes the below:

WARNING: void function return statements are not generally useful

Thus remove such a statement in the affected functions.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Lijo Lazar
4798db85b7 Documentation/amdgpu: Add board info details
Add documentation for board info sysfs attribute.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Lijo Lazar
76da73f026 drm/amdgpu: Add sysfs attribute to get board info
Add a sysfs attribute which shows the board form factor like OAM or
CEM.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Lijo Lazar
b0a4553336 drm/amdgpu: Get package types for smuio v13.0
Add support to query package types supported in smuio v13.0 ASICs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Lijo Lazar
4365d2ed09 drm/amdgpu: Add more smuio v13.0.3 package types
Expand support to get other board types like OAM or CEM.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Sathishkumar S
cbad0dd13a drm/amdgpu: fix ip count query for xcp partitions
fix wrong ip count INFO on spatial partitions. update the query
to return the instance count corresponding to the partition id.

v2:
 initialize variables only when required to be (Christian)
 move variable declarations to the beginning of function (Christian)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Asad Kamal
c207c36544 drm/amd/pm: Remove set df cstate for SMUv13.0.6
Remove set df cstate as disallow df state is
not required for SMUv13.0.6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Lijo Lazar
28a3f49609 drm/amdgpu: Move package type enum to amdgpu_smuio
Move definition of package type to amdgpu_smuio header and add new
package types for CEM and OAM.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam
2b6b29f33f drm/amdgpu: Fix complex macros error
Fixes the below:

ERROR: Macros with complex values should be enclosed in parentheses

WARNING: macros should not use a trailing semicolon
+#define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter));

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Xiaogang Chen
dc427a473e drm/amdkfd: Use partial migrations in GPU page faults
This patch implements partial migration in gpu page fault according to migration
granularity(default 2MB) and not split svm range in cpu page fault handling.
A svm range may include pages from both system ram and vram of one gpu now.
These chagnes are expected to improve migration performance and reduce mmu
callback and TLB flush workloads.

Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:58:59 -04:00
Sebastian Andrzej Siewior
de5e73dc6b drm/amd/display: Simplify the per-CPU usage.
The fpu_recursion_depth counter is used to ensure that dc_fpu_begin()
can be invoked multiple times while the FPU-disable function itself is
only invoked once. Also the counter part (dc_fpu_end()) is ballanced
properly.

Instead of using the get_cpu_ptr() dance around the inc it is simpler to
increment the per-CPU variable directly. Also the per-CPU variable has
to be incremented and decremented on the same CPU. This is ensured by
the inner-part which disables preemption. This is kind of not obvious,
works and the preempt-counter is touched a few times for no reason.

Disable preemption before incrementing fpu_recursion_depth for the first
time. Keep preemption disabled until dc_fpu_end() where the counter is
decremented making it obvious that the preemption has to stay disabled
while the counter is non-zero.
Use simple inc/dec functions.
Remove the nested preempt_disable/enable functions which are now not
needed.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:50:50 -04:00
Sebastian Andrzej Siewior
9c77dcf6a5 drm/amd/display: Remove migrate_en/dis from dc_fpu_begin().
This is a revert of the commit mentioned below while it is not wrong, as
in the kernel will explode, having migrate_disable() here it is
complete waste of resources.

Additionally commit message is plain wrong the review tag does not make
it any better. The migrate_disable() interface has a fat comment
describing it and it includes the word "undesired" in the headline which
should tickle people to read it before using it.
Initially I assumed it is worded too harsh but now I beg to differ.

The reviewer of the original commit, even not understanding what
migrate_disable() does should ask the following:

- migrate_disable() is added only to the CONFIG_X86 block and it claims
  to protect fpu_recursion_depth. Why are the other the architectures
  excluded?

- migrate_disable() is added after fpu_recursion_depth was modified.
  Shouldn't it be added before the modification or referencing takes
  place?

Moving on.
Disabling preemption DOES prevent CPU migration. A task, that can not be
pushed away from the CPU by the scheduler (due to disabled preemption)
can not be pushed or migrated to another CPU.

Disabling migration DOES NOT ensure consistency of per-CPU variables. It
only ensures that the task acts always on the same per-CPU variable. The
task remains preemptible meaning multiple tasks can access the same
per-CPU variable. This in turn leads to inconsistency for the statement

                  *pcpu -= 1;

with two tasks on one CPU and a preemption point during the RMW
operation:

     Task A                           Task B
     read pcpu to reg  # 0
     inc reg           # 0 -> 1
                                      read pcpu to reg  # 0
                                      inc reg           # 0 -> 1
                                      write reg to pcpu # 1
     write reg to pcpu # 1

At the end pcpu reads 1 but should read 2 instead. Boom.

get_cpu_ptr() already contains a preempt_disable() statement. That means
that the per-CPU variable can only be referenced by a single task which
is currently running. The only inconsistency that can occur if the
variable is additionally accessed from an interrupt.

Remove migrate_disable/enable() from dc_fpu_begin/end().

Cc: Tianci Yin <tianci.yin@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Fixes: 0c316556d1 ("drm/amd/display: Disable migration to ensure consistency of per-CPU variable")
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:49:59 -04:00
Alex Deucher
7d3f1d76f3 drm/amdgpu: refine fault cache updates
Don't update the fault cache if status is 0.  In the multiple
fault case, subsequent faults will return a 0 status which is
useless for userspace and replaces the useful fault status, so
only update if status is non-0.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:49:51 -04:00
Alex Deucher
7a41ed8b59 drm/amdgpu: add new INFO ioctl query for the last GPU page fault
Add a interface to query the last GPU page fault for the process.
Useful for debugging context lost errors.

v2: split vmhub representation between kernel and userspace
v3: add locking when fetching fault info in INFO IOCTL

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238

Cc: samuel.pitoiset@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:49:39 -04:00
Wayne Lin
9031e0013f drm/amd/display: Fix mst hub unplug warning
[Why]
Unplug mst hub will cause warning. That's because
dm_helpers_construct_old_payload() is changed to be called after
payload removement from dc link.

In dm_helpers_construct_old_payload(), We refer to the vcpi in
payload allocation table of dc link to construct the old payload
and payload is no longer in the table when we call the function
now.

[How]
Refer to the mst_state to construct the number of time slot for old
payload now. Note that dm_helpers_construct_old_payload() is just
a quick workaround before and we are going to abandon it soon.

Fixes: 5aa1dfcdf0 ("drm/mst: Refactor the flow for payload allocation/removement")
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231005080405.169841-1-Wayne.Lin@amd.com
2023-10-05 12:04:02 -04:00
Kees Cook
ac8e62ab25 drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct ip_hw_instance.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-2-keescook@chromium.org
2023-10-05 11:29:09 +02:00
Kees Cook
a640e3c3a5 drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct smu10_voltage_dependency_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Xiaojian Du <Xiaojian.Du@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-1-keescook@chromium.org
2023-10-05 11:29:03 +02:00
Samson Tam
b206011bf0 drm/amd/display: apply edge-case DISPCLK WDIVIDER changes to master OTG pipes only
[Why]
The edge-case DISPCLK WDIVIDER changes call stream_enc functions.
But with MPC pipes, downstream pipes have null stream_enc and will
 cause crash.

[How]
Only call stream_enc functions for pipes that are OTG master.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Samson Tam <samson.tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 22:55:05 -04:00
Mario Limonciello
134b8c5d86 drm/amd: Fix detection of _PR3 on the PCIe root port
On some systems with Navi3x dGPU will attempt to use BACO for runtime
PM but fails to resume properly.  This is because on these systems
the root port goes into D3cold which is incompatible with BACO.

This happens because in this case dGPU is connected to a bridge between
root port which causes BOCO detection logic to fail.  Fix the intent of
the logic by looking at root port, not the immediate upstream bridge for
_PR3.

Cc: stable@vger.kernel.org
Suggested-by: Jun Ma <Jun.Ma2@amd.com>
Tested-by: David Perry <David.Perry@amd.com>
Fixes: b10c1c5b3a ("drm/amdgpu: add check for ACPI power resources")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 22:52:05 -04:00