drm/amdgpu: Fix repeatly flr issue

Only for no job running test case need to do recover in flr notification. For having job in mirror list, then let guest driver to hit job timeout, and then do recover. Signed-off-by: jqdeng <Emily.Deng@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-18 14:53:58 -04:00 · 2020-08-07 17:31:19 +08:00
parent 5ce99853a6
commit 9a1cddd637
4 changed files with 32 additions and 2 deletions
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -289,7 +289,8 @@ flr_done:

 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
-		&& (adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT ||
+		&& (amdgpu_device_has_job_running(adev) ||
+		adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT ||
 		adev->gfx_timeout == MAX_SCHEDULE_TIMEOUT ||
 		adev->compute_timeout == MAX_SCHEDULE_TIMEOUT ||
 		adev->video_timeout == MAX_SCHEDULE_TIMEOUT))