We need to reset the shadow state every time we submit an
IB and there needs to be a COND_EXEC packet after the
SET_Q_PREEMPTION_MODE packet for it to work properly, so
we should emit both of these packets regardless of whether
there is a job present or not.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add ring callback for gfx to update the CP firmware
with the new shadow information before we process the
IB.
v2: add implementation for new packet (Alex)
v3: add current FW version checks (Alex)
v4: only initialize shadow on first use
Only set IB_VMID when a valid shadow buffer is present
(Alex)
v5: Pass parameters rather than job to new ring callback (Alex)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for submitting the shadow update packet
when submitting an IB. Needed for MCBP on GFX11.
v2: update API for CSA (Alex)
v3: fix ordering; SET_Q_PREEMPTION_MODE most come before COND_EXEC
Add missing check for AMDGPU_CHUNK_ID_CP_GFX_SHADOW in
amdgpu_cs_pass1()
Only initialize shadow on first use
(Alex)
v4: Pass parameters rather than job to new ring callback (Alex)
v5: squash in change to call SET_Q_PREEMPTION_MODE/COND_EXEC
before RELEASE_MEM to complete the UMDs use of the shadow (Alex)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only set the supported flag if we have new enough CP FW.
v2: update to the final firmware versions
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The type of size is unsigned int, if size is 0x40000000, there will
be an integer overflow, size will be zero after size *= sizeof(uint32_t),
will cause uninitialized memory to be referenced later.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: hackyzh002 <hackyzh002@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix following checkpatch style errors in amdgpu_drv.c &
amdgpu_device.c
ERROR: exactly one space required after that #ifdef
ERROR: spaces required around that '+=' (ctx:WxV)
ERROR: space required before the open brace '{'
ERROR: spaces required around that '||' (ctx:VxE)
ERROR: space prohibited before that close parenthesis ')'
ERROR: space required before the open parenthesis '('
ERROR: space required before the open brace '{'
ERROR: code indent should use tabs where possible
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The following call trace is observed when removing the amdgpu driver, which
is caused by that BOs allocated for psp are not freed until removing.
[61811.450562] RIP: 0010:amddrm_buddy_fini.cold+0x29/0x47 [amddrm_buddy]
[61811.450577] Call Trace:
[61811.450577] <TASK>
[61811.450579] amdgpu_vram_mgr_fini+0x135/0x1c0 [amdgpu]
[61811.450728] amdgpu_ttm_fini+0x207/0x290 [amdgpu]
[61811.450870] amdgpu_bo_fini+0x27/0xa0 [amdgpu]
[61811.451012] gmc_v9_0_sw_fini+0x4a/0x60 [amdgpu]
[61811.451166] amdgpu_device_fini_sw+0x117/0x520 [amdgpu]
[61811.451306] amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[61811.451447] devm_drm_dev_init_release+0x4d/0x80 [drm]
[61811.451466] devm_action_release+0x15/0x20
[61811.451469] release_nodes+0x40/0xb0
[61811.451471] devres_release_all+0x9b/0xd0
[61811.451473] __device_release_driver+0x1bb/0x2a0
[61811.451476] driver_detach+0xf3/0x140
[61811.451479] bus_remove_driver+0x6c/0xf0
[61811.451481] driver_unregister+0x31/0x60
[61811.451483] pci_unregister_driver+0x40/0x90
[61811.451486] amdgpu_exit+0x15/0x447 [amdgpu]
For smu v13_0_2, if the GPU supports xgmi, refer to
commit f5c7e77970 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2"),
it will run gpu recover in AMDGPU_RESET_FOR_DEVICE_REMOVE mode when removing,
which makes all devices in hive list have hw reset but no resume except the
basic ip blocks, then other ip blocks will not call .hw_fini according to
ip_block.status.hw.
Since psp_free_shared_bufs just includes some software operations, so move
it to psp_sw_fini.
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
After gpu-reset, sometimes the driver fails to enable vblank irq,
causing flip_done timed out and the desktop freezed.
During gpu-reset, we disable and enable vblank irq in dm_suspend() and
dm_resume(). Later on in amdgpu_irq_gpu_reset_resume_helper(), we check
irqs' refcount and decide to enable or disable the irqs again.
However, we have 2 sets of API for controling vblank irq, one is
dm_vblank_get/put() and another is amdgpu_irq_get/put(). Each API has
its own refcount and flag to store the state of vblank irq, and they
are not synchronized.
In drm we use the first API to control vblank irq but in
amdgpu_irq_gpu_reset_resume_helper() we use the second set of API.
The failure happens when vblank irq was enabled by dm_vblank_get()
before gpu-reset, we have vblank->enabled true. However, during
gpu-reset, in amdgpu_irq_gpu_reset_resume_helper() vblank irq's state
checked from amdgpu_irq_update() is DISABLED. So finally it disables
vblank irq again. After gpu-reset, if there is a cursor plane commit,
the driver will try to enable vblank irq by calling drm_vblank_enable(),
but the vblank->enabled is still true, so it fails to turn on vblank
irq and causes flip_done can't be completed in vblank irq handler and
desktop become freezed.
[How]
Combining the 2 vblank control APIs by letting drm's API finally calls
amdgpu_irq's API, so the irq's refcount and state of both APIs can be
synchronized. Also add a check to prevent refcount from being less then
0 in amdgpu_irq_put().
v2:
- Add warning in amdgpu_irq_enable() if the irq is already disabled.
- Call dc_interrupt_set() in dm_set_vblank() to avoid refcount change
if it is in gpu-reset.
v3:
- Improve commit message and code comments.
Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
v1: To support multple XCD case (Le)
v2: unify naming style (Le)
v3: apply the changes to gc v11_0 (Hawking)
v4: apply the changes to gc SOC21 (Morris)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change those v9_4_3 interfaces which are exposed in gfx_v9_0.c.
For some active single-xcc emu models, the code path in
gfx_v9_0.c is better to keep reserved for a while.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Each XCD needs to be initialized respectively. The major changes are:
1. add iteration to do rlc/kiq/kcq init/fini for each xcd
2. load rlc/mec microcode to each xcd
3. add argument to specify xcc index in initialization functions
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1: Modify kiq_init/fini, mqd_sw_init/fini and
enable/disable_kcq to adapt to multi-die case.
Pass 0 as default to all asics with single xcc (Le)
v2: squash commits to avoid breaking the build (Le)
v3: unify naming style (Le)
v4: apply the changes to gc v11_0 (Hawking)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To allocate independent queue_bitmap for each XCD,
then the old bitmap policy can be continued to use
with a clear logic.
Use mec_bitmap[0] as default for all non-GC 9.4.3 IPs.
v2: squash commits to avoid breaking the build
v3: unify naming style
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1: more kiq instances are a available in SOC (Le)
v2: squash commits to avoid breaking the build (Le)
v3: make the conversion for gfx/mec v11_0 (Hawking)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
After gpu-reset, sometimes the driver fails to enable vblank irq,
causing flip_done timed out and the desktop freezed.
During gpu-reset, we disable and enable vblank irq in dm_suspend() and
dm_resume(). Later on in amdgpu_irq_gpu_reset_resume_helper(), we check
irqs' refcount and decide to enable or disable the irqs again.
However, we have 2 sets of API for controling vblank irq, one is
dm_vblank_get/put() and another is amdgpu_irq_get/put(). Each API has
its own refcount and flag to store the state of vblank irq, and they
are not synchronized.
In drm we use the first API to control vblank irq but in
amdgpu_irq_gpu_reset_resume_helper() we use the second set of API.
The failure happens when vblank irq was enabled by dm_vblank_get()
before gpu-reset, we have vblank->enabled true. However, during
gpu-reset, in amdgpu_irq_gpu_reset_resume_helper() vblank irq's state
checked from amdgpu_irq_update() is DISABLED. So finally it disables
vblank irq again. After gpu-reset, if there is a cursor plane commit,
the driver will try to enable vblank irq by calling drm_vblank_enable(),
but the vblank->enabled is still true, so it fails to turn on vblank
irq and causes flip_done can't be completed in vblank irq handler and
desktop become freezed.
[How]
Combining the 2 vblank control APIs by letting drm's API finally calls
amdgpu_irq's API, so the irq's refcount and state of both APIs can be
synchronized. Also add a check to prevent refcount from being less then
0 in amdgpu_irq_put().
v2:
- Add warning in amdgpu_irq_enable() if the irq is already disabled.
- Call dc_interrupt_set() in dm_set_vblank() to avoid refcount change
if it is in gpu-reset.
v3:
- Improve commit message and code comments.
Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To ensure it supports 192 IBs per submission, so we can keep a
simplified IB limit in the follow up patch without having to
look at IP or GPU version.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Function "amdgpu_irq_update()" called by "amdgpu_device_ip_late_init()" is an atomic context.
We shouldn't access registers through KIQ since "msleep()" may be called in "amdgpu_kiq_rreg()".
[HOW]
Move function "amdgpu_virt_release_full_gpu()" after function "amdgpu_device_ip_late_init()",
to ensure that registers be accessed through RLCG instead of KIQ.
Call Trace:
<TASK>
show_stack+0x52/0x69
dump_stack_lvl+0x49/0x6d
dump_stack+0x10/0x18
__schedule_bug.cold+0x4f/0x6b
__schedule+0x473/0x5d0
? __wake_up_klogd.part.0+0x40/0x70
? vprintk_emit+0xbe/0x1f0
schedule+0x68/0x110
schedule_timeout+0x87/0x160
? timer_migration_handler+0xa0/0xa0
msleep+0x2d/0x50
amdgpu_kiq_rreg+0x18d/0x1f0 [amdgpu]
amdgpu_device_rreg.part.0+0x59/0xd0 [amdgpu]
amdgpu_device_rreg+0x3a/0x50 [amdgpu]
amdgpu_sriov_rreg+0x3c/0xb0 [amdgpu]
gfx_v10_0_set_gfx_eop_interrupt_state.constprop.0+0x16c/0x190 [amdgpu]
gfx_v10_0_set_eop_interrupt_state+0xa5/0xb0 [amdgpu]
amdgpu_irq_update+0x53/0x80 [amdgpu]
amdgpu_irq_get+0x7c/0xb0 [amdgpu]
amdgpu_fence_driver_hw_init+0x58/0x90 [amdgpu]
amdgpu_device_init.cold+0x16b7/0x2022 [amdgpu]
Signed-off-by: Chong Li <chongli2@amd.com>
Reviewed-by: JingWen.Chen2@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add some basic definitions and structure member. Inscrease MAX_WB slots
to 1024 to support the increasing number of rings for multiple partitions.
v2: unify naming style
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It looks better to place this field in ring
structure. Also drop the repeated ring funcs definitions
if there's no difference except for vmhub field.
v2: rename the field to vm_hub like others (Le)
v3: apply the changes to new ip blocks (Hawking)
v4: fix vcn sw ring (Alex)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reserve the MOUDLE_FIRMWARE declaration of gc_11_0_*_mes.bin
to fix falling back to old mes bin on failure via autoload.
Fixes: 97998b893c ("drm/amd/amdgpu: introduce gc_*_mes_2.bin v2")
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
all the gc v9_4_3 registers fall in gc_rlcpdec address range
have different relative offsets and base_idx from the ones
defined in gc v9_0 ip headers. gc_v9_0_rlc_funcs can not be
reused anymore for gc v9_4_3
v2: drop unused handshake function (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to raven/raven2 maybe enable sclk slow down,
they cannot get clock count by the RLC at the auto level of dpm performance.
So switch to golden tsc register.
Suggested-by: shanshengwang <shansheng.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rework retry fault removal from the software filter by
storing an expired timestamp for a fault that is being removed.
When a new fault comes, and it matches an entry in the sw filter,
it will be added as a new fault only when its timestamp is greater
than the timestamp expiry of the fault in the sw filter.
This helps in avoiding stale faults being added back into the
filter and preventing legitimate faults from being handled.
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch enables the IH retry CAM on GFX9 series cards. This
retry filter is used to prevent sending lots of retry interrupts
in a short span of time and overflowing the IH ring buffer. This
will also help reduce CPU interrupt workload.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
All chips that support RAS also support IP discovery, so
use the IP versions rather than a mix of IP versions and
asic types. Checking the validity of the atom_ctx pointer
is not required as the vbios is already fetched at this
point.
v2: add comments to id asic types based on feedback from Luben
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>