Disable verbose while getting p2p distance. With verbose, it shows
warning if ACS redirect is set between the devices. Adds noise
to dmesg logs when a few GPU devices are on the same platform.
Example log:
amdgpu 0000:34:00.0: ACS redirect is set between the client and provider (0000:31:00.0)
amdgpu 0000:34:00.0: to disable ACS redirect for this path, add the kernel parameter:
pci=disable_acs_redir=0000:30:00.0;0000:2e:00.0;0000:33:00.0;0000:2e:10.0
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For the asic using smu v13_0_2, there is the following
warning when uninstalling amdgpu:
amdgpu: ras disable gfx failed poison:1 ret:-22.
[Why]:
For the asic using smu v13_0_2, the psp .suspend and
mode1reset is called before amdgpu_ras_pre_fini during
amdgpu uninstall, it has disabled all ras features and
reset the psp. Since the psp is reset, calling
amdgpu_ras_disable_all_features in amdgpu_ras_pre_fini
to disable ras features will fail.
[How]:
If all ras features are disabled, amdgpu_ras_disable_all_features
will not be called to disable all ras features again.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor
during the suspend time. Furthermore, we cannot request a
mode 1 reset under SRIOV as VF. Therefore, we will skip it
as it is called in suspend_noirq() function.
- In the resume code path, we need to send REQ_GPU_INIT to the
hypervisor and also resume PSP IP block under SRIOV.
Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also
re-use gds_size for the queue size (unused for KFD). MES requires the
queue size in order to compute the actual wptr offset within the queue
RB since it increases monotonically for AQL queues.
v2: Make is_aql_queue assign clearer
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use DECLARE_DYNDBG_CLASSMAP across DRM:
- in .c files, since macro defines/initializes a record
- in drivers, $mod_{drv,drm,param}.c
ie where param setup is done, since a classmap is param related
- in drm/drm_print.c
since existing __drm_debug param is defined there,
and we ifdef it, and provide an elaborated alternative.
- in drm_*_helper modules:
dp/drm_dp - 1st item in makefile target
drivers/gpu/drm/drm_crtc_helper.c - random pick iirc.
Since these modules all use identical CLASSMAP declarations (ie: names
and .class_id's) they will all respond together to "class DRM_UT_*"
query-commands:
:#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control
NOTES:
This changes __drm_debug from int to ulong, so BIT() is usable on it.
DRM's enum drm_debug_category values need to sync with the index of
their respective class-names here. Then .class_id == category, and
dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...).
Though DRM needs consistent categories across all modules, thats not
generally needed; modules X and Y could define FOO differently (ie a
different NAME => class_id mapping), changes are made according to
each module's private class-map.
No callsites are actually selected by this patch, since none are
class'd yet.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220912052852.1123868-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
consecutive invalid PTEs as 0 is considered a fault. To prevent
this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
into a no-retry fault, doesn't work with TF enabled. As a result,
update invalid PTE flags settings which works for both TF enabled
and disabled case.
Fixes: 352e683b72 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Free page table BO from vm resv unlocked context generate below
warnings.
Add a pt_free_work in vm to free page table BO from vm->pt_freed list.
pass vm resv unlock status from page table update caller, and add vm_bo
entry to vm->pt_freed list and schedule the pt_free_work if calling with
vm resv unlocked.
WARNING: CPU: 12 PID: 3238 at
drivers/gpu/drm/ttm/ttm_bo.c:106 ttm_bo_set_bulk_move+0xa1/0xc0
Call Trace:
amdgpu_vm_pt_free+0x42/0xd0 [amdgpu]
amdgpu_vm_pt_free_dfs+0xb3/0xf0 [amdgpu]
amdgpu_vm_ptes_update+0x52d/0x850 [amdgpu]
amdgpu_vm_update_range+0x2a6/0x640 [amdgpu]
svm_range_unmap_from_gpus+0x110/0x300 [amdgpu]
svm_range_cpu_invalidate_pagetables+0x535/0x600 [amdgpu]
__mmu_notifier_invalidate_range_start+0x1cd/0x230
unmap_vmas+0x9d/0x140
unmap_region+0xa8/0x110
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This got lost somewhere along the way, This fixes
audio not working until set_property was called.
Signed-off-by: hongao <hongao@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The return value is no longer initialized before the loop because of
moving code around.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: c2b08e7a6d ("drm/amdgpu: move entity selection and job init earlier during CS")
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows submitting jobs as gang which needs to run on multiple engines at the
same time.
All members of the gang get the same implicit, explicit and VM dependencies. So
no gang member will start running until everything else is ready.
The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.
Each job is remembered individually as user of a buffer object, so there is no
joining of work at the end.
v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.
v5: fix job index calculation and adding IBs to jobs
v6: fix typo found by Alex
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows submitting jobs as gang which needs to run on multiple
engines at the same time.
Basic idea is that we have a global gang submit fence representing when the
gang leader is finally pushed to run on the hardware last.
Jobs submitted as gang are never re-submitted in case of a GPU reset since this
won't work and will just deadlock the hardware immediately again.
v2: fix logic inversion, improve documentation, fix rcu
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to what we did for VCN3 use the job instead of the parser
entity. Cleanup the coding style quite a bit as well.
v2: merge improved application check into this patch
v3: finally fix the check
v4: limit to the correct engine
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 250195ff74.
The job should now be initialized when we reach the parser functions.
v2: merge improved application check into this patch
v3: back to the original test, but use the right ring
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>