drm/amdgpu: trigger flr_work if reading pf2vf data failed

if reading pf2vf data failed 30 times continuously, it means something is
wrong. Need to trigger flr_work to recover the issue.

also use dev_err to print the error message to get which device has
issue and add warning message if waiting IDH_FLR_NOTIFICATION_CMPL
timeout.

Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Zhigang Luo
2024-02-29 16:04:35 -05:00
committed by Alex Deucher
parent dc5c3d48e9
commit ab66c83284
5 changed files with 41 additions and 10 deletions

View File

@@ -309,6 +309,8 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
timeout -= 10;
} while (timeout > 1);
dev_warn(adev->dev, "waiting IDH_FLR_NOTIFICATION_CMPL timeout\n");
flr_done:
atomic_set(&adev->reset_domain->in_gpu_reset, 0);
up_write(&adev->reset_domain->sem);