linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-30 12:32:31 -04:00

Author	SHA1	Message	Date
Yang Wang	e12603bf2c	drm/amd/pm: fix amdgpu_irq enabled counter unbalanced on smu v11.0 v1: - fix amdgpu_irq enabled counter unbalanced issue on smu_v11_0_disable_thermal_alert. v2: - re-enable smu thermal alert to make amdgpu irq counter balance for smu v11.0 if in runpm state [75582.361561] ------------[ cut here ]------------ [75582.361565] WARNING: CPU: 42 PID: 533 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:639 amdgpu_irq_put+0xd8/0xf0 [amdgpu] ... [75582.362211] Tainted: [E]=UNSIGNED_MODULE [75582.362214] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F14a 08/14/2020 [75582.362218] Workqueue: pm pm_runtime_work [75582.362225] RIP: 0010:amdgpu_irq_put+0xd8/0xf0 [amdgpu] [75582.362556] Code: 31 f6 31 ff e9 c9 bf cf c2 44 89 f2 4c 89 e6 4c 89 ef e8 db fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 a8 bf cf c2 <0f> 0b eb c3 b8 fe ff ff ff eb 97 e9 84 e8 8b 00 0f 1f 84 00 00 00 [75582.362560] RSP: 0018:ffffd50d51297b80 EFLAGS: 00010246 [75582.362564] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [75582.362568] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [75582.362570] RBP: ffffd50d51297ba0 R08: 0000000000000000 R09: 0000000000000000 [75582.362573] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72091d2008 [75582.362576] R13: ffff8e720af80000 R14: 0000000000000000 R15: ffff8e720af80000 [75582.362579] FS: 0000000000000000(0000) GS:ffff8e9158262000(0000) knlGS:0000000000000000 [75582.362582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [75582.362585] CR2: 000074869d040c14 CR3: 0000001e37a3e000 CR4: 00000000003506f0 [75582.362588] Call Trace: [75582.362591] <TASK> [75582.362597] smu_v11_0_disable_thermal_alert+0x17/0x30 [amdgpu] [75582.362983] smu_smc_hw_cleanup+0x79/0x4f0 [amdgpu] [75582.363375] smu_suspend+0x92/0x110 [amdgpu] [75582.363762] ? gfx_v10_0_hw_fini+0xd5/0x150 [amdgpu] [75582.364098] amdgpu_ip_block_suspend+0x27/0x80 [amdgpu] [75582.364377] ? 
timer_delete_sync+0x10/0x20 [75582.364384] amdgpu_device_ip_suspend_phase2+0x190/0x450 [amdgpu] [75582.364665] amdgpu_device_suspend+0x1ae/0x2f0 [amdgpu] [75582.364948] amdgpu_pmops_runtime_suspend+0xf3/0x1f0 [amdgpu] [75582.365230] pci_pm_runtime_suspend+0x6d/0x1f0 [75582.365237] ? __pfx_pci_pm_runtime_suspend+0x10/0x10 [75582.365242] __rpm_callback+0x4c/0x190 [75582.365246] ? srso_return_thunk+0x5/0x5f [75582.365252] ? srso_return_thunk+0x5/0x5f [75582.365256] ? ktime_get_mono_fast_ns+0x43/0xe0 [75582.365263] rpm_callback+0x6e/0x80 [75582.365267] rpm_suspend+0x124/0x5f0 [75582.365271] ? srso_return_thunk+0x5/0x5f [75582.365275] ? __schedule+0x439/0x15e0 [75582.365281] ? srso_return_thunk+0x5/0x5f [75582.365285] ? __queue_delayed_work+0xb8/0x180 [75582.365293] pm_runtime_work+0xc6/0xe0 [75582.365297] process_one_work+0x1a1/0x3f0 [75582.365303] worker_thread+0x2ba/0x3d0 [75582.365309] kthread+0x107/0x220 [75582.365313] ? __pfx_worker_thread+0x10/0x10 [75582.365318] ? __pfx_kthread+0x10/0x10 [75582.365323] ret_from_fork+0xa2/0x120 [75582.365328] ? __pfx_kthread+0x10/0x10 [75582.365332] ret_from_fork_asm+0x1a/0x30 [75582.365343] </TASK> [75582.365345] ---[ end trace 0000000000000000 ]--- [75582.365350] amdgpu 0000:05:00.0: amdgpu: Fail to disable thermal alert! [75582.365379] amdgpu 0000:05:00.0: amdgpu: suspend of IP block <smu> failed -22 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-11-24 12:34:31 -05:00
Alex Deucher	fd39b5a583	drm/amdgpu/smu: Handle S0ix for vangogh Fix the flows for S0ix. There is no need to stop rlc or reinitialize PMFW in S0ix. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reported-by: Antheas Kapenekakis <lkml@antheas.dev> Tested-by: Antheas Kapenekakis <lkml@antheas.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-11-04 11:53:22 -05:00
Asad Kamal	d80391dd03	drm/amdgpu: Remove invalidate and flush hdp macros Remove amdgpu_asic_flush_hdp & amdgpu_asic_invalidate_hdp functions and directly use the mapped ones Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-11-04 11:33:54 -05:00
Ilya Zlobintsev	cdfdec6f16	drm/amd/pm: Avoid writing nulls into `pp_od_clk_voltage` Calling `smu_cmn_get_sysfs_buf` aligns the offset used by `sysfs_emit_at` to the current page boundary, which was previously directly returned from the various `print_clk_levels` implementations to be added to the buffer position. Instead, only the relative offset showing how much was written to the buffer should be returned, regardless of how it was changed for alignment purposes. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Ilya Zlobintsev <ilya.zlobintsev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-10-20 18:25:17 -04:00
Mario Limonciello	5f4f49a41c	drm/amd: Stop overloading power limit with limit type When passed around internally the upper 8 bits of power limit include the limit type. This is non-obvious without digging into the nuances of each function. Instead pass the limit type as an argument to all applicable layers. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-10-13 14:14:35 -04:00
Mario Limonciello	000902683f	drm/amd: Adjust whitespace for vangogh_ppt A few changes have more whitespace than needed. Clean them up. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Tested-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-10-13 14:14:33 -04:00
Rodrigo Siqueira	13f785d37a	drm/amd/pm: Use devm_i2c_add_adapter() in the Sienna smu The I2C init for Sienna Cichlid uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Sienna Cichlid init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:36 -04:00
Rodrigo Siqueira	9058cb7775	drm/amd/pm: Use devm_i2c_add_adapter() in the Navi10 smu The I2C init for Navi10 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Navi10 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:32 -04:00
Rodrigo Siqueira	439158c475	drm/amd/pm: Use devm_i2c_add_adapter() in the Arcturus smu The I2C init for Arcturus uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Arcturus init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:28 -04:00
Lijo Lazar	2f3b1ccf83	drm/amd/pm: Use cached metrics data on arcturus Cached metrics data validity is 1ms on arcturus. It's not reasonable for any client to query gpu_metrics at a faster rate and constantly interrupt PMFW. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-08-04 14:41:03 -04:00
Kenneth Feng	1b92cb40b4	drm/amd/pm: revise the pcie dpm parameters revise the pcie dpm parameters Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-06-24 10:05:40 -04:00
Mario Limonciello	2d1ec1e955	drm/amd: Allow printing VanGogh OD SCLK levels without setting dpm to manual Several other ASICs allow printing OD SCLK levels without setting DPM control to manual. When OD is disabled it will show the range the hardware supports. When OD is enabled it will show what values have been programmed. Adjust VanGogh to work the same. Cc: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com> Reported-by: Vicki Pfau <vi@endrift.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250609031227.479079-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-06-18 12:19:19 -04:00
Kenneth Feng	1a18607c07	drm/amd/pm: override pcie dpm parameters only if it is necessary override pcie dpm parameters only if it is necessary Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-06-18 12:19:19 -04:00
Dr. David Alan Gilbert	4367ee3ed1	drm/amd/pm: Remove remainder of mode2_reset_is_support The previous patch removed smu_mode2_reset_is_support() which was the only function to call through the mode2_reset_is_support() method pointer. Remove the remaining functions that were assigned to it and the pointer itself. See discussion at: https://lore.kernel.org/all/DM4PR12MB5165D85BD85BC8FC8BF7A3B48E88A@DM4PR12MB5165.namprd12.prod.outlook.com/ Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-13 09:23:14 -04:00
Dr. David Alan Gilbert	2c599d66b9	drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by commit `46a301e14e` ("drm/amd/powerplay: drop unnecessary Navi1x specific APIs") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:46:21 -04:00
Lijo Lazar	3580440308	drm/amd/pm: Fix comment style Fix code comment style Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504271422.D6cqMlZ0-lkp@intel.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:15:03 -04:00
Denis Arefev	da7dc714a8	drm/amd/pm/smu11: Prevent division by zero The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `1e866f1fe5` ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:13 -04:00
Denis Arefev	7d641c2b83	drm/amd/pm: Prevent division by zero The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `b64625a303` ("drm/amd/pm: correct the address of Arcturus fan related registers") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2025-03-26 17:41:23 -04:00
Ying Li	ee9e64549f	drm/amd/pm: add support for IP version 11.5.2 This initializes drm/amd/pm version 11.5.2 Signed-off-by: YING LI <yingli12@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-12 21:05:49 -05:00
Mario Limonciello	ea5d493498	drm/amd: Add the capability to mark certain firmware as "required" Some of the firmware that is loaded by amdgpu is not actually required. For example the ISP firmware on some SoCs is optional, and if it's not present the ISP IP block just won't be initialized. The firmware loader core however will show a warning when this happens like this: ``` Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2 ``` To avoid confusion for non-required firmware, adjust the amd-ucode helper to take an extra argument indicating if the firmware is required or optional. On optional firmware use firmware_request_nowarn() instead of request_firmware() to avoid the warnings. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/amd-gfx/df71d375-7abd-4b32-97ce-15e57846eed8@amd.com/T/#t Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-10 10:26:51 -05:00
Boyuan Zhang	8aaf166703	drm/amd/pm: power up or down vcn by instance For smu ip with multiple vcn instances (smu 11/13/14), remove all the for loop in dpm_set_vcn_enable() functions. And use the instance argument to power up/down vcn for the given instance only, instead of powering up/down for all vcn instances. v2: remove all duplicated functions in v1. remove for-loop from each ip, and temporarily move to dpm_set_vcn_enable, in order to keep the exact same logic as before, until further separation in the next patch. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-10 10:26:47 -05:00
Alex Deucher	1443dd3c67	drm/amd/pm: fix and simplify workload handling smu->workload_mask is IP specific and should not be messed with in the common code. The mask bits vary across SMU versions. Move all handling of smu->workload_mask in to the backends and simplify the code. Store the user's preference in smu->power_profile_mode which will be reflected in sysfs. For internal driver profile switches for KFD or VCN, just update the workload mask so that the user's preference is retained. Remove all of the extra now unused workload related elements in the smu structure. v2: use refcounts for workload profiles v3: rework based on feedback from Lijo v4: fix the refcount on failure, drop backend mask v5: rework custom handling v6: handle failure cleanup with custom profile v7: Update documentation Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: stable@vger.kernel.org # 6.11.x	2024-12-02 18:36:15 -05:00
Alex Deucher	c3d06a3b6a	Revert "drm/amd/pm: correct the workload setting" This reverts commit `74e1006430`. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. This came back after a merge in 6.13-rc1, so revert again. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: `74e1006430` ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `44f392fbf6`)	2024-12-02 18:35:57 -05:00
Lijo Lazar	da868898cf	drm/amd/pm: Remove arcturus min power limit As per power team, there is no need to impose a lower bound on arcturus power limit. Any unreasonable limit set will result in frequent throttling. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-11-21 15:56:06 -05:00
Kenneth Feng	8cc438be5d	drm/amd/pm: correct the workload setting Correct the workload setting in order not to mix the setting with the end user. Update the workload mask accordingly. v2: changes as below: 1. the end user can not erase the workload from driver except default workload. 2. always shows the real highest priority workload to the end user. 3. the real workload mask is combined with driver workload mask and end user workload mask. v3: apply this to the other ASICs as well. v4: simplify the code v5: refine the code based on the review comments. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-04 12:06:23 -05:00
Boyuan Zhang	8b7f3529cd	drm/amd/pm: add inst to dpm_set_vcn_enable Add an instance parameter to the existing function dpm_set_vcn_enable() for future implementation. Re-write all pptable functions accordingly. v2: Remove duplicated dpm_set_vcn_enable() functions in v1. Instead, adding instance parameter to existing functions. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-04 11:39:49 -05:00
Tvrtko Ursulin	0880f58f96	drm/amd/pm: Vangogh: Fix kernel memory out of bounds write KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows: [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dump_stack_lvl+0x66/0x90 [ 33.861838] print_report+0xce/0x620 [ 33.861853] kasan_report+0xda/0x110 [ 33.862794] kasan_check_range+0xfd/0x1a0 [ 33.862799] __asan_memset+0x23/0x40 [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] dev_attr_show+0x43/0xc0 [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0 [ 33.867155] seq_read_iter+0x3f8/0x1140 [ 33.867173] vfs_read+0x76c/0xc50 [ 33.867198] ksys_read+0xfb/0x1d0 [ 33.867214] do_syscall_64+0x90/0x160 ... 
[ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasan_save_stack+0x33/0x50 [ 33.867364] kasan_save_track+0x17/0x60 [ 33.867367] __kasan_kmalloc+0x87/0x90 [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu] [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu] [ 33.869608] local_pci_probe+0xda/0x180 [ 33.869614] pci_device_probe+0x43f/0x6b0 Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block. Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit. The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write. v2: * Drop impossible v3_0 case. (Mario) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `41cec40bc9` ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Wenyou Yang <WenYou.Yang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-10-28 16:37:23 -04:00
Alex Deucher	3d73327b74	drm/amdgpu/swsmu: add automatic parameter to set_soft_freq_range On chips that support it, you can specify 0 and 0xffff for min and max and the PMFW will use that to determine the optimal min and max. This enables optimal performance when the user manually switches between performance levels using sysfs. Previously we'd set soft min/max which could limit performance. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-10-15 11:27:08 -04:00
Pierre-Eric Pelloux-Prayer	b5353c05ea	drm/amd/pm: remove dump_pptable functions They're not used. Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-10-01 17:41:13 -04:00
Lijo Lazar	5839d27d5b	drm/amdgpu: Use init level for pending_reset flag Drop pending_reset flag in gmc block. Instead use init level to determine which type of init is preferred - in this case MINIMAL. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-09-26 17:06:18 -04:00
Tobias Jakobi	17d30ed33c	drm/amdgpu/swsmu: fix SMU11 typos (memlk -> memclk) No functional changes. Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-08-13 10:27:01 -04:00
Yang Wang	3e92af6bf5	drm/amdgpu: refine pmfw/smu firmware loading refine pmfw/smu firmware loading Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-06-14 16:17:12 -04:00
Jesse Zhang	839eb4bbbd	drm/amd/pm: remove dead code in navi10_emit_clk_levels and navi10_print_clk_levels Since the range of the variable i is 0 - 3, execution cannot reach the default statement. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-06-14 15:34:20 -04:00
Lijo Lazar	9488d7affe	drm/amd/pm: Remove unused interface to set plpd Remove unused callback to set PLPD policy and its implementation from arcturus, aldebaran and SMUv13.0.6 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-20 16:20:26 -04:00
Lijo Lazar	5d6f66b542	drm/amd/pm: Add xgmi plpd to arcturus pm_policy On arcturus, allow changing xgmi plpd policy through 'pm_policy/xgmi_plpd' sysfs interface. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-17 17:40:39 -04:00
Jesse Zhang	ff284ecac3	drm/amd/pm: check the return of send smc msg for navi10 Setting the smu workload mask may fail, so check the return value. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-13 16:11:53 -04:00
Jesse Zhang	7f684a67f8	drm/amd/pm: check the return of send smc msg for sienna_cichild Setting the smu workload mask may fail, so check the return value. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-13 16:11:53 -04:00
Ma Jun	adb9de4dd2	drm/amdgpu/pm: Check input value for power profile setting on smu11, smu13 and smu14 Check the input value for CUSTOM profile mode setting on smu 11, smu13 and smu14. Otherwise we use uninitialized value of input[] Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-13 15:44:59 -04:00
Tim Huang	b2871de696	drm/amd/pm: fix uninitialized variable warnings for vangogh_ppt 1. Fix an issue where an uninitialized mask was used to get the ultimate frequency. 2. Check the return of smu_cmn_send_smc_msg_with_param to avoid using the uninitialized variable residency. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-08 15:17:04 -04:00
Jesse Zhang	c8c19ebf7c	drm/amd/pm: Fix negative array index read Avoid using negative values of clk_index as an index into the array pptable->DpmDescriptor. V2: fix clk_index return check (Tim Huang) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-30 10:03:56 -04:00
Dave Airlie	34633158b8	Merge tag 'amd-drm-next-6.10-2024-04-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.10-2024-04-13: amdgpu: - HDCP fixes - ODM fixes - RAS fixes - Devcoredump improvements - Misc code cleanups - Expose VCN activity via sysfs - SMU 13.0.x updates - Enable fast updates on DCN 3.1.4 - Add dclk and vclk reporting on additional devices - Add ACA RAS infrastructure - Implement TLB flush fence - EEPROM handling fixes - SMUIO 14.0.2 support - SMU 14.0.1 Updates - Sync page table freeing with TLB flushes - DML2 refactor - DC debug improvements - SR-IOV fixes - Suspend and Resume fixes - DCN 3.5.x Updates - Z8 fixes - UMSCH fixes - GPU reset fixes - HDP fix for second GFX pipe on GC 10.x - Enable secondary GFX pipe on GC 10.3 - Refactor and clean up BACO/BOCO/BAMACO handling - VCN partitioning fix - DC DWB fixes - VSC SDP fixes - DCN 3.1.6 fix - GC 11.5 fixes - Remove invalid TTM resource start check - DCN 1.0 fixes amdkfd: - MQD handling cleanup - Preemption handling fixes for XCDs - TLB flush fix for GC 9.4.2 - Properly clean up workqueue during module unload - Fix memory leak process create failure - Range check CP bad op exception targets to avoid reporting invalid exceptions to userspace radeon: - Misc code cleanups From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240413213708.3427038-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>	2024-04-17 15:48:59 +10:00
Ma Jun	5279a8506f	drm/amdgpu/pm: Check AMDGPU_RUNPM_BAMACO when setting baco state Check AMDGPU_RUNPM_BAMACO instead of amdgpu_runtime_pm when setting baco state. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-09 22:08:27 -04:00
Ma Jun	b2207dc698	drm/amdgpu/pm: Add support for MACO flag checking Add support for MACO flag checking. MACO mode only works if BACO is supported. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-09 22:07:59 -04:00
Ma Jun	1b19959427	drm/amdgpu/pm: Change the member function name in pp_hwmgr_func and pptable_funcs Use a unified and more explicit name get_bamaco_support to replace is_baco_support and get_asic_baco_capability Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-09 22:07:50 -04:00
Srinivasan Shanmugam	730dd50f84	drm/amdgpu: Fix truncation in smu_v11_0_init_microcode By reducing the size of ucode_prefix to 25 in the smu_v11_0_init_microcode function, we ensure that fw_name can accommodate the maximum possible string size. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c: In function ‘smu_v11_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c:110:54: warning: ‘.bin’ directive output may be truncated writing 4 bytes into a region of size between 0 and 29 [-Wformat-truncation=] 110 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); | ^~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c:110:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 36 110 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-22 15:56:28 -04:00
Linus Torvalds	7ee0490121	Merge tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu, it's xe and nouveau then a few scattered core fixes. core: - fix rounding in drm_fixp2int_round() bridge: - fix documentation for DRM_BRIDGE_OP_EDID sun4i: - fix 64-bit division on 32-bit architectures tests: - fix dependency on DRM_KMS_HELPER probe-helper: - never return negative values from .get_modes() plus driver fixes xe: - invalidate userptr vma on page pin fault - fail early on sysfs file creation error - skip VMA pinning on xe_exec if no batches nouveau: - clear bo resource bus after eviction - documentation fixes - don't check devinit disable on GSP amdgpu: - Freesync fixes - UAF IOCTL fixes - Fix mmhub client ID mapping - IH 7.0 fix - DML2 fixes - VCN 4.0.6 fix - GART bind fix - GPU reset fix - SR-IOV fix - OD table handling fixes - Fix TA handling on boards without display hardware - DML1 fix - ABM fix - eDP panel fix - DPPCLK fix - HDCP fix - Revert incorrect error case handling in ioremap - VPE fix - HDMI fixes - SDMA 4.4.2 fix - Other misc fixes amdkfd: - Fix duplicate BO handling in process restore" * tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits) drm/amdgpu/pm: Don't use OD table on Arcturus drm/amdgpu: drop setting buffer funcs in sdma442 drm/amd/display: Fix noise issue on HDMI AV mute drm/amd/display: Revert Remove pixle rate limit for subvp Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode" Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()" drm/amd/display: Add a dc_state NULL check in dc_state_release drm/amd/display: Return the correct HDCP error code drm/amd/display: Implement wait_for_odm_update_pending_complete drm/amd/display: Lock all enabled otg pipes even with no planes drm/amd/display: Amend coasting vtotal for replay low hz 
drm/amd/display: Fix idle check for shared firmware state drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane drm/amd/display: Init DPPCLK from SMU on dcn32 drm/amd/display: Add monitor patch for specific eDP drm/amd/display: Allow dirty rects to be sent to dmub when abm is active drm/amd/display: Override min required DCFCLK in dml1_validate drm/amdgpu: Bypass display ta if display hw is not available drm/amdgpu: correct the KGQ fallback message drm/amdgpu/pm: Check the validity of overdiver power limit ...	2024-03-21 19:04:31 -07:00
Xiaojian Du	2a88f1b5d0	drm/amdgpu: add VCN sensor value for Vangogh This will add the VCN sensor value for Vangogh. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-20 13:38:14 -04:00
Ma Jun	bc55c344b0	drm/amdgpu/pm: Don't use OD table on Arcturus OD is not supported on Arcturus, so the OD table should not be used. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-20 13:36:29 -04:00
Ma Jun	e17718251a	drm/amdgpu/pm: Check the validity of overdiver power limit Check the validity of the overdrive power limit before using it. Fixes: `7968e9748f` ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-03-20 13:12:57 -04:00
Ma Jun	08ae9ef829	drm/amdgpu/pm: Fix NULL pointer dereference when get power limit Because powerplay_table initialization is skipped in the SR-IOV case, we check for it and set default lower and upper OD values if powerplay_table is NULL. Fixes: `7968e9748f` ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yin Zhenguo <zhenguo.yin@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-03-20 13:12:57 -04:00

402 Commits