When compiling allmodconfig (CONFIG_WERROR=y) with clang-19, see the
following errors:
.../display/dc/dml2/display_mode_core.c:6268:13: warning: stack frame size (3128) exceeds limit (3072) in 'dml_prefetch_check' [-Wframe-larger-than]
.../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:7236:13: warning: stack frame size (3256) exceeds limit (3072) in 'dml_core_mode_support' [-Wframe-larger-than]
Mark static functions called by dml_prefetch_check() and
dml_core_mode_support() noinline_for_stack to avoid them become huge
functions and thus exceed the frame size limit.
A way to reproduce:
$ git checkout next-20250107
$ mkdir build_dir
$ export PATH=/tmp/llvm-19.1.6-x86_64/bin:$PATH
$ make LLVM=1 O=build_dir allmodconfig
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
The way how it chose static functions to mark:
[0] Unset CONFIG_WERROR in build_dir/.config.
To get display_mode_core.o without errors.
[1] Get a function list called by dml_prefetch_check().
$ sed -n '6268,6711p' drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c \
| sed -n -r 's/.*\W(\w+)\(.*/\1/p' | sort -u >/tmp/syms
[2] Get the non-inline function list.
Objdump won't show the symbols if they are inline functions.
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
./scripts/checkstack.pl x86_64 0 | \
grep -f /tmp/syms | cut -d' ' -f2- >/tmp/orig
[3] Get the full function list.
Append "-fno-inline" to `CFLAGS_.../display_mode_core.o` in
drivers/gpu/drm/amd/display/dc/dml2/Makefile.
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
./scripts/checkstack.pl x86_64 0 | \
grep -f /tmp/syms | cut -d' ' -f2- >/tmp/noinline
[4] Get the inline function list.
If a symbol only in /tmp/noinline but not in /tmp/orig, it is a good
candidate to mark noinline.
$ diff /tmp/orig /tmp/noinline
Chosen functions and their stack sizes:
CalculateBandwidthAvailableForImmediateFlip [display_mode_core.o]:144
CalculateExtraLatency [display_mode_core.o]:176
CalculateTWait [display_mode_core.o]:64
CalculateVActiveBandwithSupport [display_mode_core.o]:112
set_calculate_prefetch_schedule_params [display_mode_core.o]:48
CheckGlobalPrefetchAdmissibility [dml2_core_dcn4_calcs.o]:544
calculate_bandwidth_available [dml2_core_dcn4_calcs.o]:320
calculate_vactive_det_fill_latency [dml2_core_dcn4_calcs.o]:272
CalculateDCFCLKDeepSleep [dml2_core_dcn4_calcs.o]:208
CalculateODMMode [dml2_core_dcn4_calcs.o]:208
CalculateOutputLink [dml2_core_dcn4_calcs.o]:176
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 2563391e57 ("drm/amd/display: DML2.1 resynchronization") blew
away the compiler warning fix from commit 2fde4fdddc
("drm/amd/display: Avoid -Wenum-float-conversion in
add_margin_and_round_to_dfs_grainularity()"), causing the warning to
reappear.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
183 | divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Apply the fix again to resolve the warning.
Re-apply again after commit be4e350931 ("drm/amd/display: DML21 Reintegration For Various Fixes")
This should be making its way back to the original DML trees this time. (Alex)
Fixes: be4e350931 ("drm/amd/display: DML21 Reintegration For Various Fixes")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3841
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
When effective bandwidth from the SoC is enough to perform SubVP
prefetchs, then DF throttling is not required.
[HOW]
Provide SMU the required clocks for which DF throttling is not required.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Current driver interface does not allow for flexibility in coexistence
of multiple interface versions, so add support for checking minor
interface revisions and providing appropriate programming.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why/How]
Mismatch between mode support and mode programming occurs.
Mode support would calculate higher row vblank than mode programming.
As a result, mode programming fails and hardware isn't properly programmed.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the nominal VBlank is too small, optimizing for stutter can cause
the prefetch bandwidth to increase drasticaly, resulting in higher
clock and power requirements. Only optimize if it is >3x the stutter
latency.
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
The phantom stream timing is copied from the main stream as most
parameters are identical, however some need to be recalculated.
Currently VBlank End is not recalculated and copied from the main
incorrectly.
[HOW]
Recalculate VBlank End for phantom stream timing.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Playing 1080p video on 4k60 timing uses UCLK DPM5 and mode support
determines that p-state switching is not supported.
[How]
Allow DML to increase latency as the last strategy so strategies such
as VBlank p-state switching may become possible
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Videos using YUV420 format may result in high power being used.
Disabling MPO may result in lower power usage.
Update interface that can be used to check power profile of a dc_state.
[How]
Allow pstate switching in VBlank as last entry in strategy candidates.
Add helper functions that can be used to determine power level:
-get power profile after a dc_state has undergone full validation
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Prefetch calculations did not guarantee that bandwidth required in
mode support was less than mode programming which can cause failures.
[HOW]
Fix bandwidth calculations to assume fixed times for OTO schedule,
and choose which schedule to use based on time to fetch pixel data.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The sparse tool complains as follows:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c:12:28: warning:
symbol 'core_dcn4_ip_caps_base' was not declared. Should it be static?
This symbol is not used outside of dcn35_hubp.c, so marks it static.
And do not want to change it, so mark it const.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The sparse tool complains as follows:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:6853:56: warning:
symbol 'core_dcn4_g6_temp_read_blackout_table' was not declared. Should it be static?
This symbol is not used outside of dml2_core_dcn4_calcs.c, so marks it static.
And not want to change it, so mark it const.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Based on power measurement result, in most cases when display clock
is higher than Vmin display clock, lowering display clock using
dynamic ODM will improve overall power consumption by 0 to 4 watts
even if we can't reach Vmin.
[how]
Allow vmin optimization applied even if dispclk can't reach Vmin.
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are a several statements with two following semicolons, replace
these with just one semicolon.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CalculateSwathAndDETConfiguration_params_st's UnboundedRequestEnabled is
a pointer (i.e. dml_bool_t *UnboundedRequestEnabled), and thus
p->UnboundedRequestEnabled checks its address, not bool value.
To check value, *p->UnboundedRequestEnabled is used instead.
This fixes 1 REVERSE_INULL issue reported by Coverity.
Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Functions get_per_method_common_meta and get_expanded_strategy_list can
return null and thus it is necessary to check their returned values
before dereferencing.
This fixes 3 NULL_RETURNS issues reported by Coverity.
Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The disable fams2 operation was reworked, but some of the old code
remained. This commit removes the disable_fams2_drr from the
dml2_stream_parameters.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use more accurate names to refer to the asic architecture.
dcn3 in DML actually refers to DCN32 and DCN321, so rename it to dcn32x
dcn4 refers to any DCN4x soc., and hence rename dcn4 to dcn4x
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Even if the mode is not supported dml2_check_mode_supported() would still return true.
This causes an unsupported mode to be programmed.
[How]
Check if the mode is supported or not and return the proper result.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
dml2_core_shared_mode_support and dml_core_mode_support access the third
element of dummy_boolean, i.e. hw_debug5 = &s->dummy_boolean[2], when
dummy_boolean has size of 2. Any assignment to hw_debug5 causes an
OVERRUN.
[HOW]
Increase dummy_boolean's array size to 3.
This fixes 2 OVERRUN issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
Current logic in mcache admissibility check has flaw if
calculated number of maches are 3 or more per surface,
so sometimes the check may pass when it should fail,
and sometimes may fail when it should pass, fix the
issue and also adding additional check to make sure that
required number of mcaches per surface cannot be
higher than number of pipes + 1, used on that surface.
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Nevenko Stupar <nevenko.stupar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHAT & HOW]
Variables used as denominators and maybe not assigned to other values,
should not be 0. Change their default to 1 so they are never 0.
This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Based on power measurement result, in most cases when display clock is higher
than Vmin display clock, lowering display clock using dynamic ODM will improve
overall power consumption by 0 to 4 watts even if we can't reach Vmin.
[how]
Allow vmin optimization applied even if dispclk can't reach Vmin.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>