The prompts will contain pci address(segment/bus/port/function),
severity(warn or error) and some keywords(GPU, amdgpu). Also this
address the issue that pci bus retrieved by PCI_BUS_NUM(adev->pdev->devfn)
is wrong.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
'ctxid' is used to distinguish different events raised from SMC.
0x3 and 0x4 are for AC and DC power mode.
V2: update the way to retrieve the ctxid and change the log level
to debug
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On the ASIC powered down(in baco or system suspend),
the dpm_enabled will be set as false. Then all access
(e.g. df state setting issued on RAS error event) to
SMU will be blocked.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This sequence is recommended by PMFW team for the baco reset
with PMFW reloaded. And it seems able to address the random
failure seen on Arcturus.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This sequence is recommended by PMFW team for the baco reset
with PMFW reloaded. And it seems able to address the random
failure seen on Arcturus.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds a message lock to the smu_send_smc_msg* implementations to
protect against concurrent access to the mmu registers used to
communicate with the SMU
v2: Implement for smu_v12_0 as well
v3: Add mutex_init for message_lock
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The new interface reads the argument in the call to send the message, so
this is no longer needed, and shouldn't be used for concurrency safety
reasons.
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the responsibility for reading argument registers into the
smu_send_smc_msg* implementations, so that adding a message-sending lock
to protect the SMU registers will result in the lock still being held
when the argument is read.
v2: transition smu_v12_0, it's asics, and vega20
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously, the syfs functionality for restoring the default powerplay
table was sourcing it's information from the currently-staged powerplay
table.
This patch adds a step to cache the first overdrive table that we see on
boot, so that it can be used later to "restore" the powerplay table
v2: sqaush my original with Matt's fix
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.5.x
change the sw ctf setting to smu_v11_0_set_thermal_range()
since software_shutdown_temp shares the same definition and
name in all the smu11 project.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
By this, we can avoid to pass in the VRAM address on every table
transferring. That puts extra unnecessary traffics on SMU on
some cases(e.g. polling the amdgpu_pm_info sysfs interface).
V2: document what the driver table is for and how it works
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
So that we do not need to allocate a piece of VRAM for it. This
is a preparation for coming change which unifies the VRAM address
for all driver tables interaction with SMU.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Have every asic provide a callback for this rather than a mix
of generic and asic specific code.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop smu_send_smc_msg function from ASIC specify structure.
Reuse smu_send_smc_msg_with_param function for smu_send_smc_msg.
Set paramer to 0 for smu_send_msg function, otherwise it will send
with previous paramer value (Not a certain value).
Materialize msg type for smu send message function definition.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BACO - Bus Active, Chip Off
So we can use it for power savings rather than just reset.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This can fix the compile errors below:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function ‘smu_v11_0_baco_set_state’:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1674:27: error: implicit declaration of function ‘amdgpu_ras_get_context’ [-Werror=implicit-function-declaration]
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
^
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1674:27: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1692:19: error: dereferencing pointer to incomplete type ‘struct amdgpu_ras’
if (!ras || !ras->supported) {
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added bif doorbell interrupt setting and applied different
settings for BACO reset for RAS recovery.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
On Navi10, and presumably arcterus, updating pp_table via sysfs would
not re-scale the maximum possible power limit one can set. On navi10,
the SMU code ignored the power percentage overdrive setting entirely,
and would not allow you to exceed the default power limit at all.
[How]
Adding a function to the SMU interface to get the pptable version of the
default power limit allows ASIC-specific code to provide the correct
maximum-settable power limit for the current pptable.
v3: fix spelling (Alex)
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Before this patch, there was no way to use pp_od_clk_voltage on navi
[How]
Similar to the vega20 implementation, but using the common smc_v11_0
headers, implemented the pp_od_clk_voltage API for navi10's pptable
implementation
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add xgmi pstate setting on powerplay routine.
V2: split the change of is_support_sw_smu_xgmi into a separate patch
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With this cleanup, the APIs from amdgpu_smu.c will map to
ASIC specific ones directly. Those can be shared around
all SMU V11/V12 ASICs will be put in smu_v11_0.c and
smu_v12_0.c respectively.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Those swSMU APIs used internally are moved to smu_internal.h while
others are kept in amdgpu_smu.h.
V2: give a better name smu_internal.h for the place to hold
those internal APIs
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is a quick and low risk fix. Those APIs which
are exposed to other IPs or to support sysfs/hwmon
interfaces or DAL will have lock protection. Meanwhile
no lock protection is enforced for swSMU internal used
APIs. Future optimization is needed.
V2: strip the lock protection for all swSMU internal APIs
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
there are two paths for renoir dc access smu.
one dc access smu directly using bios smc
interface: set disply, dprefclk, etc.
another goes through pplib for get dpm clock
table and set watermmark.
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bug fix for pcie paramerers override on swsmu.
Below is a scenario to have this problem.
pptable definition on pcie dpm:
0 -> pcie gen speed:1, pcie lanes: *16
1 -> pcie gen speed:4, pcie lanes: *16
Then if we have a system only have the capbility:
pcie gen speed: 3, pcie lanes: *8,
we will override dpm 1 to pcie gen speed 3, pcie lanes *8.
But the code skips the dpm 0 configuration.
So the real pcie dpm parameters are:
0 -> pcie gen speed:1, pcie lanes: *16
1 -> pcie gen speed:3, pcie lanes: *8
Then the wrong pcie lanes will be toggled.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The APU soft freq range set by different way from DGPU, thus need implement
the function respectively base on each common SMU part.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix bellow patch issue:
drm/amd/powerplay: introduce smu table id type to handle the smu table
for each asic
----
"This patch introduces new smu table type, it's to handle the
different smu table
defines for each asic with the same smu ip."
before:
use smu->table_count to represent the actual table count in smc firmware
use actual table count to check smu function parameter about smu table
after:
use logic table count "SMU_TABLE_COUNT" to check function parameter
because table id already mapped in smu driver,
and smu function will use logic table id not actual table id to check func parameter.
v2: squash in warning fix
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>