drm/amdgpu: Unlocked unmap only clear page table leaves

SVM migration unmap pages from GPU and then update mapping to GPU to
recover page fault. Currently unmap clears the PDE entry for range
length >= huge page and free PTB bo, update mapping to alloc new PT bo.
There is race bug that the freed entry bo maybe still on the pt_free
list, reused when updating mapping and then freed, leave invalid PDE
entry and cause GPU page fault.

By setting the update to clear only one PDE entry or clear PTB, to
avoid unmap to free PTE bo. This fixes the race bug and improve the
unmap and map to GPU performance. Update mapping to huge page will
still free the PTB bo.

With this change, the vm->pt_freed list and work is not needed. Add
WARN_ON(unlocked) in amdgpu_vm_pt_free_dfs to catch if unmap to free the
PTB.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Philip Yang
2025-01-14 09:53:13 -05:00
committed by Alex Deucher
parent 1350dd3691
commit 23b645231e
3 changed files with 13 additions and 38 deletions

View File

@@ -374,10 +374,6 @@ struct amdgpu_vm {
/* BOs which are invalidated, has been updated in the PTs */
struct list_head done;
/* PT BOs scheduled to free and fill with zero if vm_resv is not hold */
struct list_head pt_freed;
struct work_struct pt_free_work;
/* contains the page directory */
struct amdgpu_vm_bo_base root;
struct dma_fence *last_update;