Commit Graph

16673 Commits

Author SHA1 Message Date
Lijo Lazar
d6fa802661 drm/amdgpu: Add vbios build number interface
Fetch VBIOS build number from atom rom image. Add a sysfs interface to
read the build number.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:40 -04:00
Jesse.Zhang
54d18bc600 drm/amdgpu/userq: add a detect and reset callback
Add a detect and reset callback and add the implementation
for mes.  The callback will detect all hung queues of a
particular ip type (e.g., GFX or compute or SDMA) and
reset them.

v2: increase reset counter and set fence force completion
v3: Removed userq_mutex in mes_userq_detect_and_reset since the driver holds it when calling

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:39 -04:00
Alex Deucher
94bd7bf2c9 drm/amdgpu: don't enable SMU on cyan skillfish
Cyan skillfish uses different SMU firmware.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:39 -04:00
Alex Deucher
fa819e3a7c drm/amdgpu: add support for cyan skillfish gpu_info
Some SOCs which are part of the cyan skillfish family
rely on an explicit firmware for IP discovery.  Add support
for the gpu_info firmware.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:39 -04:00
Alex Deucher
9e6a5cf1a2 drm/amdgpu: add support for cyan skillfish without IP discovery
For platforms without an IP discovery table.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Alex Deucher
e8529dbc75 drm/amdgpu: add ip offset support for cyan skillfish
For chips that don't have IP discovery tables.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Srinivasan Shanmugam
38ab33dbea drm/amdgpu: Fix function header names in amdgpu_connectors.c
Align the function headers for `amdgpu_max_hdmi_pixel_clock` and
`amdgpu_connector_dvi_mode_valid` with the function implementations so
they match the expected kdoc style.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1199: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Returns the maximum supported HDMI (TMDS) pixel clock in KHz.
drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1212: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Validates the given display mode on DVI and HDMI connectors.

Fixes: 585b2f685c ("drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Yifan Zhang
fa7c99f04f amd/amdkfd: correct mem limit calculation for small APUs
Current mem limit check leaks some GTT memory (reserved_for_pt
reserved_for_ras + adev->vram_pin_size) for small APUs.

Since carveout VRAM is tunable on APUs, there are three case
regarding the carveout VRAM size relative to GTT:

1. 0 < carveout < gtt
   apu_prefer_gtt = true, is_app_apu = false

2. carveout > gtt / 2
   apu_prefer_gtt = false, is_app_apu = false

3. 0 = carveout
   apu_prefer_gtt = true, is_app_apu = true

It doesn't make sense to check below limitation in case 1
(default case, small carveout) because the values in the below
expression are mixed with carveout and gtt.

adev->kfd.vram_used[xcp_id] + vram_needed >
    vram_size - reserved_for_pt - reserved_for_ras -
    atomic64_read(&adev->vram_pin_size)

gtt: kfd.vram_used, vram_needed, vram_size
carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size

In case 1, vram allocation will go to gtt domain, skip vram check
since ttm_mem_limit check already cover this allocation.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Alex Deucher
276e8beb2a drm/amdgpu/userq: add force completion helpers
Add support for forcing completion of userq fences.
This is needed for userq resets and asic resets so that we
can set the error on the fence and force completion.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Alex Deucher
c5da9e9c02 drm/amdgpu: add user queue reset source
Track resets from user queues.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Jesse.Zhang
724471254e drm/amdgpu/mes12: implement detect and reset callback
Implement support for the hung queue detect and reset
functionality.

v2: Always use AMDGPU_MES_SCHED_PIPE

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Jesse.Zhang
b28cfc8305 drm/amdgpu/mes11: implement detect and reset callback
Implement support for the hung queue detect and reset
functionality.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:38 -04:00
Jesse.Zhang
78e1222fbf drm/amdgpu/mes: add front end for detect and reset hung queue
Helper function to detect and reset hung queues.  MES will
return an array of doorbell indices of which queues are hung
and were optionally reset.

v2:  Clear the doorbell array before detection

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:38:31 -04:00
Jesse.Zhang
6abd725fdf drm/amd/amdgpu: Implement MES suspend/resume gang functionality for v12
This commit implements the actual MES (Micro Engine Scheduler) suspend
and resume gang operations for version 12 hardware. Previously these
functions were just stubs returning success.

v2: Always use AMDGPU_MES_SCHED_PIPE

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
2025-09-05 17:38:07 -04:00
Jesse.Zhang
2318336573 drm/amdgpu: Add preempt and restore callbacks to userq funcs
Add two new function pointers to struct amdgpu_userq_funcs:
- preempt: To handle preemption of user mode queues
- restore: To restore preempted user mode queues

These callbacks will allow the driver to properly manage queue
preemption and restoration when needed, such as during context
switching or priority changes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 17:37:50 -04:00
Sunil Khatri
878f33f390 drm/amdgpu: fix the formating for debugfs print
Fix the format of debugfs print in the mqd. Need to
add a colon so parser can parse it properly.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:07:02 -04:00
Alex Deucher
1e18746381 drm/amd: add more cyan skillfish PCI ids
Add additional PCI IDs to the cyan skillfish family.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:06:58 -04:00
Sunil Khatri
7e0fc7b2b7 drm/amdgpu: add more information in debugfs to pagetable dump
Add more information in the debugfs which is needed to dump
a pagetable correctly for userqueues where vmid is not known
in the kernel.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:06:52 -04:00
Xiang Liu
f320ed01cf drm/amdgpu: Correct info field of bad page threshold exceed CPER
Correct valid_bits and ms_chk_bits of section info field for bad page
threshold exceed CPER to match OOB's behavior.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:06:49 -04:00
Liao Yuanhong
086f66edf9 drm/amdgpu/vcn: Remove redundant ternary operators
For ternary operators in the form of "a ? true : false", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:06:34 -04:00
Liao Yuanhong
9502b09933 drm/amdgpu/jpeg: Remove redundant ternary operators
For ternary operators in the form of "a ? true : false", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:06:29 -04:00
Liao Yuanhong
8a4bc4508c drm/amdgpu/ih: Remove redundant ternary operators
For ternary operators in the form of "a ? false : true", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:00:15 -04:00
Liao Yuanhong
60df3e81d7 drm/amdgpu/gmc: Remove redundant ternary operators
For ternary operators in the form of "a ? false : true", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:00:13 -04:00
Liao Yuanhong
d261e744af drm/amdgpu/gfx: Remove redundant ternary operators
For ternary operators in the form of "a ? false : true", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:00:09 -04:00
Liao Yuanhong
7670daf65a drm/amdgpu/amdgpu_cper: Remove redundant ternary operators
For ternary operators in the form of "a ? false : true", if 'a' itself
returns a boolean result, the ternary operator can be omitted. Remove
redundant ternary operators to clean up the code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 16:00:04 -04:00
Colin Ian King
4320fd9e0d drm/amd/amdgpu: Fix a less than zero check on a uint32_t struct field
Currently the error check from the call to mes_v12_inv_tlb_convert_hub_id
is always false because a uint32_t struct field hub_id is being used to
to perform the less than zero error check. Fix this by using the int
variable ret to perform the check.

Fixes: 87e6505261 ("drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12")
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05 15:59:11 -04:00
Dave Airlie
6dc1d3c191 Merge tag 'drm-misc-next-2025-09-04' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.18:

Cross-subsystem Changes:

- Update a number of DT bindings for STM32MP25 Arm SoC

Core Changes:

gem:
- Simplify locking for GPUVM

panel-backlight-quirks:
- Add additional quirks for EDID, DMI, brightness

sched:
- Fix race condition in trace code
- Clean up

sysfb:
- Clean up

Driver Changes:

amdgpu:
- Give kernel jobs a unique id for better tracing

amdxdna:
- Improve error reporting

bridge:
- Improve ref counting on bridge management
- adv7511: Provide SPD and HDMI infoframes
- it6505: Replace crypto_shash with sha()
- synopsys: Add support for DW DPTX Controller plus DT bindings

gud:
- Replace simple-KMS pipe with regular atomic helpers

imagination:
- Improve power management
- Add support for TH1520 GPU
- Support Risc-V architectures

ivpu:
- Clean up

nouveau:
- Improve error reporting

panthor:
- Fail VM bind if BO has offset
- Clean up

rcar-du:
- Make number of lanes configurable

rockchip:
- Add support for RK3588 DPTX output

rocket:
- Use kfree() and sizeof() correctly
- Test DMA status
- Clean up

sitronix:
- st7571-i2c: Add support for inverted displays and 2-bit grayscale
- Clean up

stm:
- ltdc: Add support support for STM32MP257F-EV1 plus DT bindings

tidss:
- Convert to kernel's FIELD_ macros

v3d:
- Improve job management and locking

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250904090932.GA193997@linux.fritz.box
2025-09-05 11:49:01 +10:00
Colin Ian King
467e00b30d drm/amd/amdgpu: Fix missing error return on kzalloc failure
Currently the kzalloc failure check just sets reports the failure
and sets the variable ret to -ENOMEM, which is not checked later
for this specific error. Fix this by just returning -ENOMEM rather
than setting ret.

Fixes: 4fb9307154 ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1ee9d1a096)
2025-09-03 16:27:56 -04:00
Gustavo A. R. Silva
1e6d36e15b drm/amdgpu/amdkfd: Avoid a couple hundred -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Move the conflicting declarations to the end of the corresponding
structures. Notice that `struct dev_pagemap` is a flexible structure,
this is a structure that contains a flexible-array member.

struct dev_pagemap always has room for at least one range. amdgpu only
uses a single range. Therefore no change are needed to the allocation
of struct amdgpu_device.

Fix 283 of the following type of warnings:
    283 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:111:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 17:36:20 -04:00
Colin Ian King
1ee9d1a096 drm/amd/amdgpu: Fix missing error return on kzalloc failure
Currently the kzalloc failure check just sets reports the failure
and sets the variable ret to -ENOMEM, which is not checked later
for this specific error. Fix this by just returning -ENOMEM rather
than setting ret.

Fixes: 4fb9307154 ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:57:10 -04:00
Timur Kristóf
6df0768c0d drm/amd/pm: Remove wm_low and wm_high fields from amdgpu_crtc (v2)
These fields were only used by si_dpm and are not necessary
anymore. They also may have been incorrect because:
- wm_high was set to the LOW_WATERMARK field of watermark A.
- wm_low was not set on DCE 6 and was always zero.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:57:01 -04:00
Timur Kristóf
c661219cd7 drm/amdgpu: Power up UVD 3 for FW validation (v2)
Unlike later versions, UVD 3 has firmware validation.
For this to work, the UVD should be powered up correctly.

When DPM is enabled and the display clock is off,
the SMU may choose a power state which doesn't power
the UVD, which can result in failure to initialize UVD.

v2:
Add code comments to explain about the UVD power state
and how UVD clock is turned on/off.

Fixes: b38f3e80ec ("drm amdgpu: SI UVD v3_1 (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:54:03 -04:00
David Francis
4d82724f7f drm/amdgpu: Add mapping info option for GEM_OP ioctl
Add new GEM_OP_IOCTL option GET_MAPPING_INFO, which
returns a list of mappings associated with a given bo, along with
their positions and offsets.

Userspace for this and the previous change can be found at:
https://github.com/checkpoint-restore/criu/pull/2613

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:53:33 -04:00
David Francis
f9db1fc52c drm/amdgpu: Add ioctl to get all gem handles for a process
Add new ioctl DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES.

This ioctl returns a list of bos with their handles, sizes,
and flags and domains.

This ioctl is meant to be used during CRIU checkpoint and
provide information needed to reconstruct the bos
in CRIU restore.

Userspace for this and the next change can be found at
https://github.com/checkpoint-restore/criu/pull/2613

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:34:00 -04:00
David Francis
0317e0e224 drm/amdgpu: Allow more flags to be set on gem create.
The GEM create flag AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE
specifies that gem memory contains sensitive information and
should be cleared to prevent snooping.

The COHERENT and UNCACHED gem create flags enable memory
features related to sharing memory across devices.

For CRIU we need to re-create KFD BOs through the
GEM_CREATE IOCTL, so allow those KFD specific flags here as well.
This will also aid us in the future and allows to move
the KFD components over using the render node for allocations.

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02 15:34:00 -04:00
Dave Airlie
14579a6f18 Merge tag 'amd-drm-next-6.18-2025-08-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.18-2025-08-29:

amdgpu:
- Replay fixes
- RAS updates
- VCN SRAM load fixes
- EDID read fixes
- eDP ALPM support
- AUX fixes
- Documenation updates
- Rework how PTE flags are generated
- DCE6 fixes
- VCN devcoredump cleanup
- MMHUB client id fixes
- SR-IOV fixes
- VRR fixes
- VCN 5.0.1 RAS support
- Backlight fixes
- UserQ fixes
- Misc code cleanups
- SMU 13.0.12 updates
- Expanded PCIe DPC support
- Expanded VCN reset support
- SMU 13.0.x Updates
- VPE per queue reset support
- Cusor rotation fix
- DSC fixes
- GC 12 MES TLB invalidation update
- Cursor fixes
- Non-DC TMDS clock validation fix

amdkfd:
- debugfs fixes
- Misc code cleanups
- Page migration fixes
- Partition fixes
- SVM fixes

radeon:
- Misc code cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250829190848.1921648-1-alexander.deucher@amd.com
2025-09-02 09:35:54 +10:00
Pierre-Eric Pelloux-Prayer
256576ed68 drm/amdgpu: give each kernel job a unique id
Userspace jobs have drm_file.client_id as a unique identifier
as job's owners. For kernel jobs, we can allocate arbitrary
values - the risk of overlap with userspace ids is small (given
that it's a u64 value).
In the unlikely case the overlap happens, it'll only impact
trace events.

Since this ID is traced in the gpu_scheduler trace events, this
allows to determine the source of each job sent to the hardware.

To make grepping easier, the IDs are defined as they will appear
in the trace output.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://lore.kernel.org/r/20250604122827.2191-1-pierre-eric.pelloux-prayer@amd.com
2025-09-01 10:49:31 +05:30
Alex Deucher
71403f58b4 drm/amdgpu: drop hw access in non-DC audio fini
We already disable the audio pins in hw_fini so
there is no need to do it again in sw_fini.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4481
Cc: oushixiong <oushixiong1025@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5eeb16ca72)
Cc: stable@vger.kernel.org
2025-08-29 11:13:38 -04:00
Alex Deucher
5171848bdf drm/amdgpu/mes11: make MES_MISC_OP_CHANGE_CONFIG failure non-fatal
If the firmware is too old, just warn and return success.

Fixes: 27b7915147 ("drm/amdgpu/mes: keep enforce isolation up to date")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4414
Cc: shaoyun.Liu@amd.com
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9f28af76fa)
Cc: stable@vger.kernel.org
2025-08-29 11:12:44 -04:00
Jesse.Zhang
2d41a4bfee drm/amdgpu/sdma: bump firmware version checks for user queue support
Using the previous firmware could lead to problems with
PROTECTED_FENCE_SIGNAL commands, specifically causing register
conflicts between MCU_DBG0 and MCU_DBG1.

The updated firmware versions ensure proper alignment
and unification of the SDMA_SUBOP_PROTECTED_FENCE_SIGNAL value with SDMA 7.x,
resolving these hardware coordination issues

Fixes: e8cca30d8b ("drm/amdgpu/sdma6: add ucode version checks for userq support")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aab8b689ad)
Cc: stable@vger.kernel.org
2025-08-29 11:11:45 -04:00
Timur Kristóf
585b2f685c drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)
Update the legacy (non-DC) display code to respect the maximum
pixel clock for HDMI and DVI-D. Reject modes that would require
a higher pixel clock than can be supported.

Also update the maximum supported HDMI clock value depending on
the ASIC type.

For reference, see the DC code:
check max_hdmi_pixel_clock in dce*_resource.c

v2:
Fix maximum clocks for DVI-D and DVI/HDMI adapters.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:23:09 -04:00
Alex Deucher
5eeb16ca72 drm/amdgpu: drop hw access in non-DC audio fini
We already disable the audio pins in hw_fini so
there is no need to do it again in sw_fini.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4481
Cc: oushixiong <oushixiong1025@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:13:42 -04:00
Alex Deucher
9f28af76fa drm/amdgpu/mes11: make MES_MISC_OP_CHANGE_CONFIG failure non-fatal
If the firmware is too old, just warn and return success.

Fixes: 27b7915147 ("drm/amdgpu/mes: keep enforce isolation up to date")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4414
Cc: shaoyun.Liu@amd.com
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:12:00 -04:00
Lijo Lazar
9c0442286f drm/amdgpu: Check vcn state before profile switch
The patch uses power state of VCN instances for requesting video
profile.

In idle worker of a vcn instance, when there is no outstanding
submisssion or fence, the instance is put to power gated state. When
all instances are powered off that means video profile is no longer
required. A request is made to turn off video profile.

A job submission starts with begin_use of ring, and at that time
vcn instance state is changed to power on. Subsequently a check is
made for active video profile, and if not active, a request is made.

Fixes: 3b669df92c ("drm/amdgpu/vcn: adjust workload profile handling")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:09:33 -04:00
Mangesh Gadre
37551277df drm/amdgpu: Avoid vcn v5.0.1 poison irq call trace on sriov guest
Sriov guest side doesn't init ras feature hence the poison irq shouldn't
be put during hw fini

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:09:29 -04:00
Mangesh Gadre
01152c30ee drm/amdgpu: Avoid jpeg v5.0.1 poison irq call trace on sriov guest
Sriov guest side doesn't init ras feature hence the poison irq shouldn't
be put during hw fini

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:09:23 -04:00
Yang Wang
89c3503bc6 drm/amd/amdgpu: unified amdgpu ip block name
v1:
1. Unified amdgpu ip block name print with format
   "{ip_type}_v{major}_{minor}_{rev}"

2. Avoid IP block name conflicts for SMU/PSP ip block

v2:
Update IP block print format to keep legacy IP block name (Alex)
"{ip_type}_v{major}_{minor}_{rev} ({funcs->name})"

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:09:16 -04:00
Jesse.Zhang
aab8b689ad drm/amdgpu/sdma: bump firmware version checks for user queue support
Using the previous firmware could lead to problems with
PROTECTED_FENCE_SIGNAL commands, specifically causing register
conflicts between MCU_DBG0 and MCU_DBG1.

The updated firmware versions ensure proper alignment
and unification of the SDMA_SUBOP_PROTECTED_FENCE_SIGNAL value with SDMA 7.x,
resolving these hardware coordination issues

Fixes: e8cca30d8b ("drm/amdgpu/sdma6: add ucode version checks for userq support")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:07:24 -04:00
Xiang Liu
c8d6e90abe drm/amdgpu: Notify pmfw bad page threshold exceeded
Notify pmfw when bad page threshold is exceeded, no matter the module
parameter 'bad_page_threshold' is set or not.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:07:19 -04:00
David (Ming Qiang) Wu
b1d83546cf drm/amdgpu/vcn: add instance number to VCN version message
For multiple VCN instances case we get multiple lines of the same
message like below:

  amdgpu 0000:43:00.0: amdgpu: Found VCN firmware Version ENC: 1.24 DEC: 9 VEP: 0 Revision: 11
  amdgpu 0000:43:00.0: amdgpu: Found VCN firmware Version ENC: 1.24 DEC: 9 VEP: 0 Revision: 11

By adding instance number to the log message for multiple VCN instances,
each line will clearly indicate which VCN instance it refers to.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-29 10:07:10 -04:00