Commit Graph

1009 Commits

Author SHA1 Message Date
Lijo Lazar
f1807682de drm/amd/pm: Fetch current power limit from FW
Power limit of SMUv13.0.6 SOCs can be updated by out-of-band ways. Fetch
the limit from firmware instead of using cached values.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
2024-01-25 15:46:41 -05:00
Kenneth Feng
3026995474 drm/amd/pm: update the power cap setting
update the power cap setting for smu_v13.0.0/smu_v13.0.7

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2356
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-01-25 15:42:53 -05:00
Ma Jun
ca1ffb174f drm/amdgpu/pm: Fix the power source flag error
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-01-25 15:41:06 -05:00
Yang Wang
fdaca31a76 drm/amd/pm: udpate smu v13.0.6 message permission
update smu v13.0.6 message to allow guest driver set gfx clock.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-25 15:32:54 -05:00
Linus Torvalds
e08b575815 Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
 "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
  thrown in.

  The amdgpu ones are just the usual couple of weeks of fixes. The xe
  ones are bunch of cleanups for the new xe driver, the fix you put in
  on the merge commit and the kconfig fix that was hiding the problem
  from me.

  amdgpu:
   - DSC fixes
   - DC resource pool fixes
   - OTG fix
   - DML2 fixes
   - Aux fix
   - GFX10 RLC firmware handling fix
   - Revert a broken workaround for SMU 13.0.2
   - DC writeback fix
   - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
     version

  amdkfd:
   - Fix dma-buf exports using GEM handles

  nouveau:
   - fix a unneeded WARN_ON triggering

  xe:
   - Fix for definition of wakeref_t
   - Fix for an error code aliasing
   - Fix for VM_UNBIND_ALL in the case there are no bound VMAs
   - Fixes for a number of __iomem address space mismatches reported by
     sparse
   - Fixes for the assignment of exec_queue priority
   - A Fix for skip_guc_pc not taking effect
   - Workaround for a build problem on GCC 11
   - A couple of fixes for error paths
   - Fix a Flat CCS compression metadata copy issue
   - Fix a misplace array bounds checking
   - Don't have display support depend on EXPERT (as discussed on IRC)"

* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
  nouveau/vmm: don't set addr on the fail path to avoid warning
  drm/amdgpu: Enable GFXOFF for Compute on GFX11
  drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
  drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
  drm/amdkfd: init drm_client with funcs hook
  drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
  drm/amdgpu: Fix the null pointer when load rlc firmware
  drm/amd/display: Align the returned error code with legacy DP
  drm/amd/display: Fix DML2 watermark calculation
  drm/amd/display: Clear OPTC mem select on disable
  drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
  drm/amd/display: Add logging resource checks
  drm/amd/display: Init link enc resources in dc_state only if res_pool presents
  drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
  drm/amd/display: Avoid enum conversion warning
  drm/amd/pm: Fix smuv13.0.6 current clock reporting
  drm/amd/pm: Add error log for smu v13.0.6 reset
  drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
  drm/amdgpu: drop exp hw support check for GC 9.4.3
  drm/amdgpu: move debug options init prior to amdgpu device init
  ...
2024-01-19 11:50:00 -08:00
Linus Torvalds
ed8d84530a Merge tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "This removes the currently unused CLASS_DDC support (controllers set
  the flag, but there is no client to use it).

  Also, CLASS_SPD support gets simplified to prepare removal in the
  future. Class based instantiation is not recommended these days
  anyhow.

  Furthermore, I2C core now creates a debugfs directory per I2C adapter.
  Current bus driver users were converted to use it.

  Finally, quite some driver updates. Standing out are patches for the
  wmt-driver which is refactored to support more variants.

  This is the rebased pull request where a large series for the
  designware driver was dropped"

* tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
  MAINTAINERS: use proper email for my I2C work
  i2c: stm32f7: add support for stm32mp25 soc
  i2c: stm32f7: perform I2C_ISR read once at beginning of event isr
  dt-bindings: i2c: document st,stm32mp25-i2c compatible
  i2c: stm32f7: simplify status messages in case of errors
  i2c: stm32f7: perform most of irq job in threaded handler
  i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq
  i2c: i801: Add lis3lv02d for Dell XPS 15 7590
  i2c: i801: Add lis3lv02d for Dell Precision 3540
  i2c: wmt: Reduce redundant: REG_CR setting
  i2c: wmt: Reduce redundant: function parameter
  i2c: wmt: Reduce redundant: clock mode setting
  i2c: wmt: Reduce redundant: wait event complete
  i2c: wmt: Reduce redundant: bus busy check
  i2c: mux: reg: Remove class-based device auto-detection support
  i2c: make i2c_bus_type const
  dt-bindings: at24: add ROHM BR24G04
  eeprom: at24: use of_match_ptr()
  i2c: cpm: Remove linux,i2c-index conversion from be32
  i2c: imx: Make SDA actually optional for bus recovering
  ...
2024-01-18 17:29:01 -08:00
Heiner Kallweit
f21682b362 drm/amd/pm: Remove I2C_CLASS_SPD support
I2C_CLASS_SPD was used to expose the EEPROM content to user space,
via the legacy eeprom driver. Now that this driver has been removed,
we can remove I2C_CLASS_SPD support. at24 driver with explicit
instantiation should be used instead.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2024-01-18 21:10:41 +01:00
Lijo Lazar
a992c90d8e drm/amd/pm: Fix smuv13.0.6 current clock reporting
When current clock is equal to max dpm level clock, the level is not
indicated correctly with *. Fix by comparing current clock against dpm
level value.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
2024-01-15 18:33:08 -05:00
Lijo Lazar
91739a897c drm/amd/pm: Add error log for smu v13.0.6 reset
For all mode-2 reset fail cases, add error log.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
2024-01-15 18:33:05 -05:00
Asad Kamal
25272bcf84 drm/amd/pm: Use gpu_metrics_v1_5 for SMUv13.0.6
Use gpu_metrics_v1_5 for SMUv13.0.6 to fill
gpu metric info

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-03 11:16:05 -05:00
Asad Kamal
a62503ca85 drm/amd/pm: Add gpu_metrics_v1_5
Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity
& pcie nak error counters

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-03 11:16:05 -05:00
Asad Kamal
9323b4bf6b drm/amd/pm: Update metric table for jpeg/vcn data
Update pmfw metric table to include vcn & jpeg
activity for smu_v_13_0_6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-03 11:16:05 -05:00
Asad Kamal
29bc46c4da drm/amd/pm: Use separate metric table for APU
Use separate metric table for APU and Non APU
systems for smu_v_13_0_6 to get metric data

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-03 11:16:05 -05:00
YiPeng Chai
6fe08f56db drm/amd/pm: smu v13_0_6 supports ecc info by default
smu v13_0_6 supports ecc info by default.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19 14:59:03 -05:00
YiPeng Chai
a8c77a121c drm/amdgpu: Add poison mode check error condition for umc v12_0
Add poison mode check error condition for umc v12_0.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19 14:59:03 -05:00
Li Ma
7046ca9c1b drm/amd/swsmu: remove duplicate definition of smu v14_0_0 driver if version
There is a repeated define of smu v14_0_0 driver if version, so delete
one in driver if header.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15 12:16:50 -05:00
Yang Li
804c49ef30 drm/amd/pm: Remove unneeded semicolon
./drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c:1418:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7743
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14 15:28:11 -05:00
Kenneth Feng
34dc227bf2 drm/amd/pm: add power save mode workload for smu 13.0.10
add power save mode workload for smu 13.0.10, so that in compute mode,
pmfw will add margin since some applications requres higher margin.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14 15:26:28 -05:00
Peyton Lee
a2f2f43f74 drm/amd/pm: support return vpe clock table
pm supports return vpe clock table and soc clock table

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14 15:25:59 -05:00
Hawking Zhang
058eb51912 drm/amdgpu: Switch to aca bank for xgmi pcs err cnt
Instead of software managed counters.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:28:47 -05:00
Evan Quan
cca850267d drm/amd/pm: enable Wifi RFI mitigation feature support for SMU13.0.7
Fulfill the SMU13.0.7 support for Wifi RFI mitigation feature.

--
v10->v11:
  - downgrade the prompt level on message failure(Lijo)
v13:
 - Fix the format issue (IIpo Jarvinen)
 - Remove duplicate code (IIpo Jarvinen)

Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Ma Jun
18df969b44 drm/amd/pm: enable Wifi RFI mitigation feature support for SMU13.0.0
Fulfill the SMU13.0.0 support for Wifi RFI mitigation feature.

--
v10->v11:
 - downgrade the prompt level on message failure(Lijo)
v13:
 - Fix the format issue (IIpo Jarvinen)
 - Move function smu_v13_0_0_set_wbrf_exclusion_ranges to
smu_v13_0.c as a generic code for later use (IIpo Jarvinen)

Co-developed-by: Evan Quan <quanliangl@hotmail.com>
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Evan Quan
71f69557cb drm/amd/pm: add flood detection for wbrf events
To protect PMFW from being overloaded.

Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Evan Quan
b8b39de646 drm/amd/pm: setup the framework to support Wifi RFI mitigation feature
With WBRF feature supported, as a driver responding to the frequencies,
amdgpu driver is able to do shadow pstate switching to mitigate possible
interference(between its (G-)DDR memory clocks and local radio module
frequency bands used by Wifi 6/6e/7).

--
v1->v2:
  - update the prompt for feature support(Lijo)
v8->v9:
  - update parameter document for smu_wbrf_event_handler(Simon)
v9->v10:
v10->v11:
 - correct the logics for wbrf range sorting(Lijo)
v13:
 - Fix the format issue (IIpo Jarvinen)

Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Evan Quan
296b29ce8a drm/amd/pm: update driver_if and ppsmc headers for coming wbrf feature
Add those data structures to support Wifi RFI mitigation feature.

Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Ma Jun
31e6af1ff7 drm/amd/pm: Remove redundant function members of pptable_funcs
Remove redundant functions members of pptable_funcs and change
the function type as static because they are not called by other
files.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:09:53 -05:00
Lijo Lazar
ed342a2e78 drm/amdgpu: Use the right method to get IP version
Replace direct usage of adev->ip_versions with amdgpu_ip_version.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:09:53 -05:00
Dave Airlie
c1ee197d64 Backmerge tag 'v6.7-rc5' into drm-next
Linux 6.7-rc5

Alex requested this for some amdkfd work relying on the symbols exports.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-12-12 11:32:33 +10:00
Yang Wang
37c57631c1 drm/amd/pm: support new mca smu error code decoding
support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-06 16:05:32 -05:00
Li Ma
78825df90d drm/amd/swsmu: update smu v14_0_0 driver if version and metrics table
Increment the driver if version and add new mems to the mertics table.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-06 16:05:32 -05:00
Dinghao Liu
7a88f23e76 drm/amd/pm: fix a memleak in aldebaran_tables_init
When kzalloc() for smu_table->ecc_table fails, we should free
the previously allocated resources to prevent memleak.

Fixes: edd7942085 ("drm/amd/pm: add message smu to get ecc_table v2")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 18:09:14 -05:00
Perry Yuan
8c4e9105b2 drm/amdgpu: optimize RLC powerdown notification on Vangogh
The smu needs to get the rlc power down message to sync the rlc state
with smu, the rlc state updating message need to be sent at while smu
begin suspend sequence , otherwise SMU will crash while RLC state is not
notified by driver, and rlc state probally changed after that
notification, so it needs to notify rlc state to smu at the end of the
suspend sequence in amdgpu_device_suspend() that can make sure the rlc
state  is correctly set to SMU.

[  101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
[  110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
[  110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[  110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[  110.884394] PM: suspend of devices aborted after 21213.620 msecs
[  110.884402] PM: start suspend of devices aborted after 21213.882 msecs
[  110.884405] PM: Some devices failed to suspend, or early wake event detected

Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 17:55:47 -05:00
Dinghao Liu
b0e5c88d8a drm/amd/pm: fix a memleak in aldebaran_tables_init
When kzalloc() for smu_table->ecc_table fails, we should free
the previously allocated resources to prevent memleak.

Fixes: edd7942085 ("drm/amd/pm: add message smu to get ecc_table v2")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:49:22 -05:00
Lijo Lazar
201761b5eb drm/amdgpu: Move mca debug mode decision to ras
Refactor code such that ras block decides the default mca debug mode,
and not swsmu block.

By default mca debug mode is set to false.

v2: squash in uninitialized value fix (Alex)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:49:01 -05:00
Perry Yuan
2e9b152325 drm/amdgpu: optimize RLC powerdown notification on Vangogh
The smu needs to get the rlc power down message to sync the rlc state
with smu, the rlc state updating message need to be sent at while smu
begin suspend sequence , otherwise SMU will crash while RLC state is not
notified by driver, and rlc state probally changed after that
notification, so it needs to notify rlc state to smu at the end of the
suspend sequence in amdgpu_device_suspend() that can make sure the rlc
state  is correctly set to SMU.

[  101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
[  110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
[  110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[  110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[  110.884394] PM: suspend of devices aborted after 21213.620 msecs
[  110.884402] PM: start suspend of devices aborted after 21213.882 msecs
[  110.884405] PM: Some devices failed to suspend, or early wake event detected

Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:48:58 -05:00
Lijo Lazar
f9a45b76a1 drm/amd/pm: Add pm metrics support to SMU v13.0.6
Add support to fetch PM metrics sample from SMU v13.0.6

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:23:54 -05:00
Lijo Lazar
12c2d3b5f5 drm/amd/pm: Add support to fetch pm metrics sample
Add API support to fetch a snapshot of power management metrics from PMFW.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:23:46 -05:00
Yang Wang
ef71bb4119 drm/amdgpu: correct mca ipid die/socket/addr decode
correct mca ipid die/socket/addr decode

v2: squash in fix from Yang

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 09:30:49 -05:00
Ma Jun
5ce8eccd53 drm/amd/pm: Make smu_v13_0_baco_set_armd3_sequence() static
smu_v13_0_baco_set_armd3_sequence is not used by other files, so
make it as static type.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 09:29:54 -05:00
Ma Jun
857c838c78 drm/amd/pm: Move some functions to smu_v13_0.c as generic code
Use generic functions and remove the duplicate code

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 09:29:54 -05:00
Ma Jun
fbbcb3f2b7 drm/amd/pm: Fix return value and drop redundant param
Fix the return value and drop redundant parameter
of get_asic_baco_capability function.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 09:29:53 -05:00
Lijo Lazar
0f21636462 drm/amd/pm: Don't send unload message for reset
No need to notify about unload during reset. Also remove the FW version
check.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 00:52:44 -05:00
Asad Kamal
e4d0be1824 drm/amd/pm: Fill pcie error counters for gpu v1_4
Fill PCIE error counters & instantaneous bandwidth
in gpu metrics v1_4 for smu v_13_0_6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 00:49:40 -05:00
Asad Kamal
786c355797 drm/amd/pm: Update metric table for smu v13_0_6
Update pmfw metric table to include pcie
instantaneous bandwidth & pcie error counters

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 00:48:58 -05:00
Yang Wang
76d2da18af drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support
add pcs xgmi ras error query support for smu v13.0.6.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09 17:02:59 -05:00
Yang Wang
61e0a98200 drm/amdgpu: disable smu v13.0.6 mca debug mode by default
disable mca debug mode for smu v13.0.6 by default.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09 17:02:08 -05:00
Yang Wang
07c1db7036 drm/amdgpu: refine smu v13.0.6 mca dump driver
refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function in amdgpu_mca_smu_funcs{}
- add new mca_bank_set{} structure to collect mca bank
- move enum mca_reg_idx into amdgpu_mca.h header
- add mca status register field decode macro

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09 17:01:51 -05:00
Yang Wang
bf13da6ae1 drm/amdgpu: correct smu v13.0.6 umc ras error check
correct smu v13.0.0 umc ras error check

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09 17:01:20 -05:00
Le Ma
5a2913aada drm/amd/pm: raise the deep sleep clock threshold for smu 13.0.6
The DS clock may exceed the limit as sclk dfll divider is 16
to target freq.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09 17:00:55 -05:00
Tim Huang
d78fa1c309 drm/amd/pm: not stop rlc for IMU enabled APUs when suspend
For IMU enabled APUs, after sending the PrepareMp1ForUnload message
to SMU in system_features_control, the RLC registers can't be touched.
The driver to stop the rlc in suspending is no longer required.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-07 12:03:31 -05:00