drm/amdgpu: fix hung reset queue array memory allocation

By design the MES will return an array result that is twice the number
of hung doorbells it can report.

i.e. if up to k reported doorbells are supported, then the
second half of the array, also of length k, holds the HQD information
(type/queue/pipe), where queue 1 corresponds to index 0 and k,
queue 2 corresponds to index 1 and k + 1, etc.

The driver will use the HQD info to target queue/pipe reset for
hardware scheduled user compute queues.
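
The layout described above can be sketched in standalone C. This is only an
illustration under stated assumptions, not the driver's code: the names
hung_entry and split_hung_result are hypothetical, and the HQD info word is
treated as an opaque value because its bit encoding is not shown in this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of one reported doorbell with its HQD info word. */
struct hung_entry {
	uint32_t doorbell;
	uint32_t hqd_info;	/* opaque here; encoding is not shown in this commit */
};

/*
 * Split a 2 * k element result array into (doorbell, HQD info) pairs.
 * Assumed layout: result[0 .. k-1] holds doorbells and
 * result[k .. 2k-1] holds the matching HQD info words.
 * Returns the number of pairs written, or -1 if a buffer is too small.
 */
static int split_hung_result(const uint32_t *result, size_t result_len,
			     size_t k, size_t hung_count,
			     struct hung_entry *out, size_t out_len)
{
	size_t i;

	if (result_len < 2 * k || hung_count > k || hung_count > out_len)
		return -1;

	for (i = 0; i < hung_count; i++) {
		out[i].doorbell = result[i];
		out[i].hqd_info = result[i + k];	/* second half of the array */
	}
	return (int)hung_count;
}
```

With k = 4 the result array needs 8 entries, which matches the change from
db_array[4] to db_array[8] in this commit.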

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim
2025-10-09 11:28:19 -04:00
committed by Alex Deucher
parent 8745ca5efb
commit 0ef930e1fa
5 changed files with 20 additions and 10 deletions

@@ -208,10 +208,10 @@ static int mes_userq_detect_and_reset(struct amdgpu_device *adev,
 	struct amdgpu_userq_mgr *uqm, *tmp;
 	unsigned int hung_db_num = 0;
 	int queue_id, r, i;
-	u32 db_array[4];
+	u32 db_array[8];
-	if (db_array_size > 4) {
-		dev_err(adev->dev, "DB array size (%d vs 4) too small\n",
+	if (db_array_size > 8) {
+		dev_err(adev->dev, "DB array size (%d vs 8) too small\n",
 			db_array_size);
 		return -EINVAL;
 	}