Commit Graph

323 Commits

Author SHA1 Message Date
Mario Limonciello
2a1fe39a5b drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()
While aligning SMU11 with SMU13 implementation an assumption was made that
`dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
like in SMU13 but it isn't.

So restore some of the original logic and instead just check for
amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
values; erring on the side of performance.

Cc: stable@vger.kernel.org # 6.1+
Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
Fixes: e701156ccc ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 22:52:05 -04:00
Mario Limonciello
9366c2e87d drm/amd: Rename AMDGPU_PP_SENSOR_GPU_POWER
Use the clearer name `AMDGPU_PP_SENSOR_GPU_AVG_POWER` instead.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:08:30 -04:00
Mario Limonciello
47f1724db4 drm/amd: Introduce AMDGPU_PP_SENSOR_GPU_INPUT_POWER
Some GPUs have been overloading average power values and input power
values. To disambiguate these, introduce a new
`AMDGPU_PP_SENSOR_GPU_INPUT_POWER` and the GPUs that share input
power update to use this instead of average power.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2746
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:08:29 -04:00
Umio Yasuno
e4538bc78b drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7
Use the right metrics table version based on the firmware.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2720
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:07:41 -04:00
Ran Sun
8d066f2b5b drm/amd/pm: Clean up errors in arcturus_ppt.c
Fix the following errors reported by checkpatch:

ERROR: spaces required around that '=' (ctx:VxW)
ERROR: spaces required around that '>=' (ctx:WxV)

Signed-off-by: Ran Sun <sunran001@208suo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-27 14:47:41 -04:00
Ran Sun
1e3a58df21 drm/amd/pm: Clean up errors in navi10_ppt.c
Fix the following errors reported by checkpatch:

ERROR: open brace '{' following function definitions go on the next line
ERROR: space required before the open parenthesis '('
ERROR: space required after that ',' (ctx:VxV)
ERROR: spaces required around that '=' (ctx:VxW)

Signed-off-by: Ran Sun <sunran001@208suo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-27 14:47:38 -04:00
Wenyou Yang
41cec40bc9 drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics
To acquire the voltage and current info from gpu_metrics interface,
but gpu_metrics_v2_3 doesn't contain them, and to be backward compatible,
add new gpu_metrics_v2_4 structure.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Wenyou Yang <WenYou.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:47:27 -04:00
Alex Deucher
65ac2adfa0 drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
Use average gfxclock for consistency with other dGPUs.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 11:08:39 -04:00
Mario Limonciello
2d60ba1bf5 drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13
SMU13 overrides dynamic PCIe lane width and dynamic speed by when on
certain hosts. commit 38e4ced804 ("drm/amd/pm: conditionally disable
pcie lane switching for some sienna_cichlid SKUs") worked around this
issue by setting up certain SKUs to set up certain limits, but the same
fundamental problem with those hosts affects all SMU11 implmentations
as well, so align the SMU11 and SMU13 driver handling.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 12:22:31 -04:00
Yang Wang
62b73bd50d drm/amd/pm: fix smu i2c data read risk
the smu driver_table is used for all types of smu
tables data transcation (e.g: PPtable, Metrics, i2c, Ecc..).

it is necessary to hold this lock to avoiding data tampering
during the i2c read operation.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-10 09:02:37 -04:00
Evan Quan
b75efe88b2 drm/amd/pm: avoid unintentional shutdown due to temperature momentary fluctuation
An intentional delay is added on soft ctf triggered. Then there will
be a double check for the GPU temperature before taking further
action. This can avoid unintended shutdown due to temperature
momentary fluctuation.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-30 13:12:16 -04:00
Mingtong Bao
3ec61983aa drm/amd/pm: remove unneeded variable
fix the following coccicheck warning:

drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c:1657:14-18: Unneeded variable: "size".

Signed-off-by: Mingtong Bao <baomingtong001@208suo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-23 15:33:05 -04:00
Srinivasan Shanmugam
d50dc746ff drm/amdgpu: Fix memcpy() in sienna_cichlid_append_powerplay_table function.
Fixes the following gcc with W=1:

In file included from ./include/linux/string.h:253,
                 from ./include/linux/bitmap.h:11,
                 from ./include/linux/cpumask.h:12,
                 from ./arch/x86/include/asm/cpumask.h:5,
                 from ./arch/x86/include/asm/msr.h:11,
                 from ./arch/x86/include/asm/processor.h:22,
                 from ./arch/x86/include/asm/cpufeature.h:5,
                 from ./arch/x86/include/asm/thread_info.h:53,
                 from ./include/linux/thread_info.h:60,
                 from ./arch/x86/include/asm/preempt.h:7,
                 from ./include/linux/preempt.h:78,
                 from ./include/linux/spinlock.h:56,
                 from ./include/linux/mmzone.h:8,
                 from ./include/linux/gfp.h:7,
                 from ./include/linux/firmware.h:7,
                 from drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:26:
In function ‘fortify_memcpy_chk’,
    inlined from ‘sienna_cichlid_append_powerplay_table’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:444:2,
    inlined from ‘sienna_cichlid_setup_pptable’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:506:8,
    inlined from ‘sienna_cichlid_setup_pptable’ at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.c:494:12:
./include/linux/fortify-string.h:413:4: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
  413 |    __read_overflow2_field(q_size_field, size);
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

the compiler complains about the size calculation in the memcpy() -
"sizeof(*smc_dpm_table) - sizeof(smc_dpm_table->table_header)" is much
larger than what fits into table_member.

Hence, reuse 'smu_memcpy_trailing' for nv1x

Fixes: 7077b19a38 ("drm/amd/pm: use macro to get pptable members")
Suggested-by: Evan Quan <Evan.Quan@amd.com>
Cc: Evan Quan <Evan.Quan@amd.com>
Cc: Chengming Gui <Jack.Gui@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15 10:45:39 -04:00
Evan Quan
6ff5a1cff7 drm/amd/pm: conditionally disable pcie lane switching for some sienna_cichlid SKUs
Disable the pcie lane switching for some sienna_cichlid SKUs since it
might not work well on some platforms.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:34:33 -04:00
Tim Huang
63b9acdf06 drm/amd/pm: reverse mclk and fclk clocks levels for vangogh
This patch reverses the DPM clocks levels output of pp_dpm_mclk
and pp_dpm_fclk.

On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels that from
the DFPstateTable are given the reversed orders by PMFW. Like the
memory DPM clocks that are exposed by pp_dpm_mclk.

It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.

So we need to reverse them to expose the clocks levels from the
driver consistently.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:53:01 -04:00
Evan Quan
572773992e drm/amd/pm: fix possible power mode mismatch between driver and PMFW
PMFW may boots the ASIC with a different power mode from the system's
real one. Notify PMFW explicitly the power mode the system in. This
is needed only when ACDC switch via gpio is not supported.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:39:15 -04:00
Alex Deucher
3043d13fef drm/amd/pm: enable TEMP_DEPENDENT_VMIN for navi1x
May help stability with some navi1x boards.

Hopefully this helps with stability with multiple monitors
and would allow us to re-enable MPC_SPLIT_DYNAMIC in the
DC code for better power savings.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2196

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
2023-03-22 01:07:04 -04:00
Lee Jones
63d99a342a drm/amd/pm/swsmu/smu11/vangogh_ppt: Provide a couple of missing parameter descriptions
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:2381: warning: Function parameter or member 'residency' not described in 'vangogh_get_gfxoff_residency'
 drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:2399: warning: Function parameter or member 'entrycount' not described in 'vangogh_get_gfxoff_entrycount'

Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Li Ma <li.ma@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Błażej Szczygieł
49017304c0 drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
Always setup overdrive tables after resume. Preserve only some
user-defined settings in user_overdrive_table if they're set.

Copy restored user_overdrive_table into od_table to get correct
values.

On cold boot, BTC was triggered and GfxVfCurve was calibrated. We
got VfCurve settings (a). On resuming back, BTC will be triggered
again and GfxVfCurve will be recalibrated. VfCurve settings (b)
got may be different from those of cold boot.  So if we reuse
those VfCurve settings (a) got on cold boot on suspend, we can
run into discrepencies.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1897
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2276
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Błażej Szczygieł <mumei6102@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:14:55 -04:00
Perry Yuan
dc622367c5 drm/amdgpu: map new capped and uncapped mode power profiles for Vangogh
Capped and Uncapped workload types are supported, each workload type
has different performance thresholds and pstate conditions.

* capped mode is used by power centric workload
* uncapped mode is used by perf centric workload

Acked-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-07 14:22:39 -05:00
Kun Liu
aea9040c2d drm/amdgpu: fix no previous prototype warning
add static prefix for vangogh_set_apu_thermal_limit function

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202303010827.c2N0yBGT-lkp@intel.com
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-06 15:14:25 -05:00
Kun Liu
0c3c993643 drm/amdgpu: added a sysfs interface for thermal throttling
implement apu_thermal_cap r/w callback for vangogh

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Guchun Chen
424b3d7582 drm/amd/pm: downgrade log level upon SMU IF version mismatch
SMU IF version mismatch as a warning message exists widely
after asic production, however, due to this log level setting,
such mismatch warning will be caught by automation test like
IGT and reported as a fake error after checking. As such mismatch
does not break anything, to reduce confusion, downgrade it from
dev_warn to dev_info.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Lijo Lazar
18c4e319db drm/amd/pm: Allocate dummy table only if needed
Only Navi1x requires dummy read workaround. Allocate the table in VRAM
only for Navi1x.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-15 22:24:08 -05:00
Mario Limonciello
315d1716d6 drm/amd: Use amdgpu_ucode_* helpers for SMU
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.

The `amdgpu_ucode_release` helper is for symmetry on unloading.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-10 14:32:57 -05:00
Mario Limonciello
6b54496238 drm/amd: Convert SMUv11 microcode to use amdgpu_ucode_ip_version_decode
Remove the special casing from SMU v11 code. No intended functional
changes.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09 16:27:43 -05:00
Guchun Chen
86a3c691db drm/amd/pm/smu11: poll BACO status after RPM BACO exits
After executing BACO exit, driver needs to poll the status
to ensure FW has completed BACO exit sequence to prevent
timing issue.

v2: use usleep_range to replace msleep to fix checkpatch.pl warnings

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-29 11:03:35 -05:00
Guchun Chen
6dca7efe6e drm/amd/pm/smu11: BACO is supported when it's in BACO state
Return true early if ASIC is in BACO state already, no need
to talk to SMU. It can fix the issue that driver was not
calling BACO exit at all in runtime pm resume, and a timing
issue leading to a PCI AER error happened eventually.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-29 11:03:35 -05:00
Evan Quan
8ae5a38c8c drm/amd/pm: enable runpm support over BACO for SMU13.0.0
Enable SMU13.0.0 runpm support.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:35:16 -05:00
Guchun Chen
f42c01696e drm/amdgpu: disable BACO support on more cards
Otherwise, some unexpected PCIE AER errors will be observed
in runtime suspend/resume cycle.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:35:16 -05:00
Guchun Chen
407a5bdd55 drm/amdgpu: disable BACO on special BEIGE_GOBY card
Still avoid intermittent failure.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-09 17:41:41 -05:00
Evan Quan
3059cd8c5f drm/amd/pm: disable cstate feature for gpu reset scenario
Suggested by PMFW team and same as what did for gfxoff feature.
This can address some Mode1Reset failures observed on SMU13.0.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18 22:12:20 -04:00
Li Ma
0d6516efff drm/amd/pm:add new gpu_metrics_v2_3 to acquire average temperature info
Add new gpu_metrics_v2_3 to acquire average temperature info from SMU metrics. To acquire average temp info from gpu_metrics interface, but gpu_metrics_v2_2 only has members to show current temp info.
---
v1:
	Only add average_temperature_gfx in gpu_metrics_v2_3.
v2:
	Add average temp members for soc, core and l3 in gpu_metrics_v2_3 and put these new members at the end of gpu_metrics_v2_3. Add operation to read average temp info from metrics table.
v3:
	Merge v1 and v2 and rename the patch.
v4:
	Merge v3. Add firmware version judgment in vangogh_common_get_gpu_metrics to maintain backward compatibility and rename the patch. "return ret" on error scenario in smu_cmn_get_smc_version.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 12:38:41 -04:00
Guchun Chen
7bb9122829 drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards
To avoid hardware intermittent failures.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:22 -04:00
André Almeida
1ed5a845c7 drm/amd/pm: Implement GFXOFF's entry count and residency for vangogh
Implement functions to get and set GFXOFF's entry count and residency
for vangogh.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:17:31 -04:00
Victor Zhao
672c0218e3 drm/amdgpu: add mode2 reset for sienna_cichlid
To meet the requirement for multi container usecase which needs
a quicker reset and not causing VRAM lost, adding the Mode2
reset handler for sienna_cichlid.

v2: move skip mode2 flag part separately

v3: remove the use of asic_reset_res

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Evan Quan
0a2d922a56 drm/amd/pm: add missing ->fini_microcode interface for Sienna Cichlid
To avoid any potential memory leak.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:02:57 -04:00
lin cao
748262eb40 drm/amdgpu: Call trace info was found in dmesg when loading amdgpu
In the case of SRIOV, the register smnMp1_PMI_3_FIFO will get an invalid
value which will cause the "shift out of bound". In Ubuntu22.04, this
issue will be checked an related call trace will be reported in dmesg.

Signed-off-by: lin cao <lin.cao@amd.com>
Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:38:10 -04:00
Dave Airlie
60693e3a38 Merge tag 'amd-drm-next-5.20-2022-07-14' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-14:

amdgpu:
- DCN3.2 updates
- DC SubVP support
- DP MST fixes
- Audio fixes
- DC code cleanup
- SMU13 updates
- Adjust GART size on newer APUs for S/G display
- Soft reset for GFX 11
- Soft reset for SDMA 6
- Add gfxoff status query for vangogh
- Improve BO domain pinning
- Fix timestamps for cursor only commits
- MES fixes
- DCN 3.1.4 support
- Misc fixes
- Misc code cleanup

amdkfd:
- Simplify GPUVM validation
- Unified memory for CWSR save/restore area
- fix possible list corruption on queue failure

radeon:
- Fix bogus power of two warning

UAPI:
- Unified memory for CWSR save/restore area for KFD
  Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080952.html

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714214716.8203-1-alexander.deucher@amd.com
2022-07-15 15:07:26 +10:00
André Almeida
43195162fb drm/amd/pm: Implement get GFXOFF status for vangogh
Implement function to get current GFXOFF status for vangogh.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:18 -04:00
Yefim Barashkin
1e866f1fe5 drm/amd/pm: Prevent divide by zero
divide error: 0000 [#1] SMP PTI
CPU: 3 PID: 78925 Comm: tee Not tainted 5.15.50-1-lts #1
Hardware name: MSI MS-7A59/Z270 SLI PLUS (MS-7A59), BIOS 1.90 01/30/2018
RIP: 0010:smu_v11_0_set_fan_speed_rpm+0x11/0x110 [amdgpu]

Speed is user-configurable through a file.
I accidentally set it to zero, and the driver crashed.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Yefim Barashkin <mr.b34r@kolabnow.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:18 -04:00
Dave Airlie
344feb7ccf Merge tag 'amd-drm-next-5.20-2022-07-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-05:

amdgpu:
- Various spelling and grammer fixes
- Various eDP fixes
- Various DMCUB fixes
- VCN fixes
- GMC 11 fixes
- RAS fixes
- TMZ support for GC 10.3.7
- GPUVM TLB flush fixes
- SMU 13.0.x updates
- DCN 3.2 Support
- DCN 3.2.1 Support
- MES updates
- GFX11 modifiers support
- USB-C fixes
- MMHUB 3.0.1 support
- SDMA 6.0 doorbell fixes
- Initial devcoredump support
- Enable high priority gfx queue on asics which support it
- Enable GPU reset for SMU 13.0.4
- OLED display fixes
- MPO fixes
- DC frame size fixes
- ASPM support for PCIE 7.4/7.6
- GPU reset support for SMU 13.0.0
- GFX11 updates
- VCN JPEG fix
- BACO support for SMU 13.0.7
- VCN instance handling fix
- GFX8 GPUVM TLB flush fix
- GPU reset rework
- VCN 4.0.2 support
- GTT size fixes
- DP link training fixes
- LSDMA 6.0.1 support
- Various backlight fixes
- Color encoding fixes
- Backlight config cleanup
- VCN 4.x unified queue cleanup

amdkfd:
- MMU notifier fixes
- Updates for GC 10.3.6 and 10.3.7
- P2P DMA support using dma-buf
- Add available memory IOCTL
- SDMA 6.0.1 fix
- MES fixes
- HMM profiler support

radeon:
- License fix
- Backlight config cleanup

UAPI:
- Add available memory IOCTL to amdkfd
  Proposed userspace: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
- HMM profiler support for amdkfd
  Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080805.html

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220705212633.6037-1-alexander.deucher@amd.com
2022-07-12 11:07:32 +10:00
Darren Powell
543faf57ee amdgpu/pm: Fix incorrect variable for size of clocks array
[v2]
No Changes, added RB
 [v1]
Size of pp_clock_levels_with_latency is PP_MAX_CLOCK_LEVELS, not MAX_NUM_CLOCKS.
Both are currently defined as 16, modifying in case one value is modified in future
Changed code in both arcturus and aldabaran.

Also removed unneeded var count, and used min_t function

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22 16:55:12 -04:00
Linus Torvalds
d0e60d46bc Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:

 - bitmap: optimize bitmap_weight() usage, from me

 - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
   Carvalho Chehab

 - include/linux/find: Fix documentation, from Anna-Maria Behnsen

 - bitmap: fix conversion from/to fix-sized arrays, from me

 - bitmap: Fix return values to be unsigned, from Kees Cook

It has been in linux-next for at least a week with no problems.

* tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits)
  nodemask: Fix return values to be unsigned
  bitmap: Fix return values to be unsigned
  KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
  KVM: x86: hyper-v: fix type of valid_bank_mask
  ia64: cleanup remove_siblinginfo()
  drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
  KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate
  lib/bitmap: add test for bitmap_{from,to}_arr64
  lib: add bitmap_{from,to}_arr64
  lib/bitmap: extend comment for bitmap_(from,to)_arr32()
  include/linux/find: Fix documentation
  lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  ...
2022-06-04 14:04:27 -07:00
Alex Deucher
da1db031cd drm/amdgpu/swsmu: add SMU mailbox registers in SMU context
So we can eventaully use them in the common smu code for
accessing the SMU mailboxes without needing a lot of
per asic logic in the common code.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:45:00 -04:00
Alex Deucher
9d6b204176 drm/amdgpu: convert sienna_cichlid_populate_umd_state_clk() to use IP version
Rather than asic type.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:45:00 -04:00
Alex Deucher
ab9d97d6f9 drm/amdgpu: convert sienna_cichlid_get_default_config_table_settings() to IP version
Use IP version rather than asic type.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Yury Norov
525d651560 drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
The smu_v1X_0_set_allowed_mask() uses bitmap_copy() to convert
bitmap to 32-bit array. This may be wrong due to endiannes issues.
Fix it by switching to bitmap_{from,to}_arr32.

CC: Alexander Gordeev <agordeev@linux.ibm.com>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Claudio Imbrenda <imbrenda@linux.ibm.com>
CC: David Hildenbrand <david@redhat.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Janosch Frank <frankja@linux.ibm.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Sven Schnelle <svens@linux.ibm.com>
CC: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Evan Quan
396beb91a9 drm/amd/pm: correct the metrics version for SMU 11.0.11/12/13
Correct the metrics version used for SMU 11.0.11/12/13.
Fixes misreported GPU metrics (e.g., fan speed, etc.) depending
on which version of SMU firmware is loaded.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1925
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Sathishkumar S
d6810d7dfa drm/amd/pm: support ss metrics read for smu11
support reading smartshift apu and dgpu power for smu11 based asic

v2: add new version of SmuMetrics and make calculation more readable (Lijo)
v3: avoid calculations that result in -ve values and skip related checks
v4: use the current power limit on dGPU and exclude smu 11_0_7 (Lijo)
v5: remove redundant code (Lijo)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00