Commit Graph

20785 Commits

Author SHA1 Message Date
Stanley.Yang
232d1d43b5 drm/amdgpu: fix disable ras feature failed when unload drvier v2
v2:
    still need call ras_disable_all_featrures to handle
    ras initilization failure case.

Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw,
so ras ta will unload before send ras disable command, ras dsiable operation
must before hw fini.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:22 -05:00
Lijo Lazar
85c1b9bd13 drm/amd/pm: Add warning for unexpected PG requests
v1: Ideally power gate/ungate requests shouldn't come when smu block is
uninitialized. Add a WARN message to check the origins if such a thing
ever happens.

v2: Use dev_WARN to log device info (Felix/Guchun).

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kevin Yang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:13 -05:00
Flora Cui
700de2c8aa drm/amdgpu: check atomic flag to differeniate with legacy path
since vkms support atomic KMS interface

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <aleander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:03 -05:00
Flora Cui
deefd07eed drm/amdgpu: fix vkms crtc settings
otherwise adev->mode_info.crtcs[] is NULL

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:57 -05:00
Flora Cui
4f7ee199d9 drm/amdgpu: cancel the correct hrtimer on exit
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:46 -05:00
Christophe JAILLET
f37668301e drm/amdkfd: Slighly optimize 'init_doorbell_bitmap()'
The 'doorbell_bitmap' bitmap has just been allocated. So we can use the
non-atomic '__set_bit()' function to save a few cycles as no concurrent
access can happen.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:37 -05:00
Christophe JAILLET
b9dd6fbd15 drm/amdkfd: Use bitmap_zalloc() when applicable
'doorbell_bitmap' and 'queue_slot_bitmap' are bitmaps. So use
'bitmap_zalloc()' to simplify code, improve the semantic and avoid some
open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:27 -05:00
Lv Ruyi
b7e7e6ca1f drm/amd/display: fix application of sizeof to pointer
Both of split and merge are pointers, not arrays.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:16 -05:00
Jane Jian
981b304546 drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID
[WHY]
for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature),
which will be mismatched with original vcn0 revision

[HOW]
add new version check for vcn0 disabled revision(3, 0, 192), typically modified under
sriov mode

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:04 -05:00
Jiapeng Chong
627d137aa0 drm/amd/display: Fix warning comparing pointer to 0
Fix the following coccicheck warning:

./drivers/gpu/drm/amd/display/dc/dml/dsc/rc_calc_fpu.c:96:14-15: WARNING
comparing pointer to 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:01:56 -05:00
Nicholas Kazlauskas
7089784873 drm/amdgpu/display: Only set vblank_disable_immediate when PSR is not enabled
[Why]
PSR currently relies on the kernel's delayed vblank on/off mechanism
as an implicit bufferring mechanism to prevent excessive entry/exit.

Without this delay the user experience is impacted since it can take
a few frames to enter/exit.

[How]
Only allow vblank disable immediate for DC when psr is not supported.

Leave a TODO indicating that this support should be extended in the
future to delay independent of the vblank interrupt.

Fixes: 92020e81dd ("drm/amdgpu/display: set vblank_disable_immediate for DC")

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:00:58 -05:00
Javier Martinez Canillas
6a2d2ddf2c drm: Move nomodeset kernel parameter to the DRM subsystem
The "nomodeset" kernel cmdline parameter is handled by the vgacon driver
but the exported vgacon_text_force() symbol is only used by DRM drivers.

It makes much more sense for the parameter logic to be in the subsystem
of the drivers that are making use of it.

Let's move the vgacon_text_force() function and related logic to the DRM
subsystem. While doing that, rename it to drm_firmware_drivers_only() and
make it return true if "nomodeset" was used and false otherwise. This is
a better description of the condition that the drivers are testing for.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-4-javierm@redhat.com
2021-11-27 13:52:22 +01:00
Javier Martinez Canillas
35f7775f81 drm: Don't print messages if drivers are disabled due nomodeset
The nomodeset kernel parameter handler already prints a message that the
DRM drivers will be disabled, so there's no need for drivers to do that.

Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-2-javierm@redhat.com
2021-11-27 13:52:16 +01:00
Alex Deucher
692cd92e66 drm/amd/display: update bios scratch when setting backlight
Update the bios scratch register when updating the backlight
level.  Some platforms apparently read this scratch register
and do additional operations in their hotkey handlers.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:14:36 -05:00
Alex Deucher
d5c7255dc7 drm/amdgpu/pm: fix powerplay OD interface
The overclocking interface currently appends data to a
string.  Revert back to using sprintf().

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-11-24 15:14:14 -05:00
Lijo Lazar
57961c4c18 drm/amdgpu: Skip ASPM programming on aldebaran
There is no need for additional programming, keep the default settings.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:14:03 -05:00
Yang Wang
fd08953b2d drm/amdgpu: fix byteorder error in amdgpu discovery
fix some byteorder issues about amdgpu discovery.
This will result in running errors on the big end system. (e.g:MIPS)

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:13:54 -05:00
Philip Yang
c4ef8a73bf drm/amdgpu: enable Navi retry fault wptr overflow
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done
while handling retry fault.

Remove VMC from IH storm client, enable ring1 write pointer
overflow, then IH will drop retry fault interrupts and be able to receive
other interrupts while driver is handling retry fault.

IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:13:10 -05:00
Philip Yang
8888e2fe9c drm/amdgpu: enable Navi 48-bit IH timestamp counter
By default this timestamp is 32 bit counter. It gets overflowed in
around 10 minutes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:12:55 -05:00
Philip Yang
6946be2443 drm/amdkfd: simplify drain retry fault
unmap range always increase atomic svms->drain_pagefaults to simplify
both parent range and child range unmap, page fault handle ignores the
retry fault if svms->drain_pagefaults is set to speed up interrupt
handling. svm_range_drain_retry_fault restart draining if another
range unmap from cpu.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:12:08 -05:00
Philip Yang
0cc53cb450 drm/amdkfd: handle VMA remove race
VMA may be removed before unmap notifier callback, and deferred list
work remove range, return success for this special case as we are
handling stale retry fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:11:20 -05:00
Philip Yang
cda0817b41 drm/amdkfd: process exit and retry fault race
kfd_process_wq_release drain retry fault to ensure no retry fault comes
after removing kfd process from the hash table, otherwise svm page fault
handler will fail to recover the fault and dump GPU vm fault log.

Refactor deferred list work to get_task_mm and take mmap write lock
to handle all ranges, and avoid mm is gone while inserting mmu notifier.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:10:32 -05:00
Philip Yang
4d62555f62 drm/amdgpu: IH process reset count when restart
Otherwise when IH process restart, count is zero, the loop will
not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS
interrupts.

Cc: stable@vger.kernel.org
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:46 -05:00
Alex Deucher
53af98c091 drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Renoir and newer gfx9 APUs have new TSC register that is
not part of the gfxoff tile, so it can be read without
needing to disable gfx off.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:17 -05:00
Alex Deucher
244ee39885 drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Apply the same check we do for dGPUs for APUs as well.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:09 -05:00
shaoyunl
271fd38ce5 drm/amdgpu: move kfd post_reset out of reset_sriov function
Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")

For sriov XGMI  configuration, the host driver will handle the hive reset,
so in guest side, the reset_sriov only be called once on one device. This will
make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already
been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov
function to make them balance .

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:08:58 -05:00
Yi-Ling Chen
2da8f0beec drm/amd/display: Fixed DSC would not PG after removing DSC stream
[WHY]
Due to pass the wrong parameter down to the enable_stream_gating(),
it would cause the DSC of the removing stream would not be PG.

[HOW]
To pass the correct parameter down th the enable_stream_gating().

Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Yi-Ling Chen <Yi-Ling.Chen2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:07:39 -05:00
Nicholas Kazlauskas
2276ee6d1b drm/amd/display: Reset link encoder assignments for GPU reset
[Why]
A warning appears in the log on GPU reset for
link_enc_cfg_link_encs_assign for the following condition:

ASSERT(state->res_ctx.link_enc_cfg_ctx.link_enc_assignments[i].valid == false);

This is not expected behavior and may result in link encoders being
incorrectly assigned.

[How]
The dc->current_state is backed up into dm->cached_dc_state before
we commit 0 streams.

DC will clear link encoder assignments on the real state but the
changes won't propagate over to the copy we made before the
0 streams commit.

DC expects that link encoder assignments are *not* valid
when committing a state, so as a workaround it needs to be cleared
before passing it back into DC.

Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:06:21 -05:00
Nicholas Kazlauskas
21431f70f6 drm/amd/display: Set plane update flags for all planes in reset
[Why]
We're only setting the flags on stream[0]'s planes so this logic fails
if we have more than one stream in the state.

This can cause a page flip timeout with multiple displays in the
configuration.

[How]
Index into the stream_status array using the stream index - it's a 1:1
mapping.

Fixes: cdaae8371a ("drm/amd/display: Handle GPU reset for DC block")

Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:05:44 -05:00
Nicholas Kazlauskas
6eff272dbe drm/amd/display: Fix DPIA outbox timeout after GPU reset
[Why]
The HW interrupt gets disabled after GPU reset so we don't receive
notifications for HPD or AUX from DMUB - leading to timeout and
black screen with (or without) DPIA links connected.

[How]
Re-enable the interrupt after GPU reset like we do for the other
DC interrupts.

Fixes: 81927e2808 ("drm/amd/display: Support for DMUB AUX")

Reviewed-by: Jude Shih <Jude.Shih@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:04:26 -05:00
xinhui pan
4eb6bb649f drm/amdgpu: Fix double free of dmabuf
amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in
amdgpu_gem_prime_export.
amdgpu_bo_destroy drop dmabuf reference increased in
amdgpu_gem_prime_import.

So remove this extra dma_buf_put to avoid double free.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:04:05 -05:00
Felix Kuehling
d3a21f7e35 drm/amdgpu: Fix MMIO HDP flush on SRIOV
Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset
to the fixed address of the VF register for hdp_v*_flush_hdp.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:02:25 -05:00
Alex Deucher
1f57925493 drm/amd/display: update bios scratch when setting backlight
Update the bios scratch register when updating the backlight
level.  Some platforms apparently read this scratch register
and do additional operations in their hotkey handlers.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:54 -05:00
Alex Deucher
081664ef3e drm/amdgpu/pm: fix powerplay OD interface
The overclocking interface currently appends data to a
string.  Revert back to using sprintf().

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Lijo Lazar
6ff53495ce drm/amdgpu: Skip ASPM programming on aldebaran
There is no need for additional programming, keep the default settings.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Yang Wang
cc7818d709 drm/amdgpu: fix byteorder error in amdgpu discovery
fix some byteorder issues about amdgpu discovery.
This will result in running errors on the big end system. (e.g:MIPS)

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
23eb49251b drm/amdgpu: enable Navi retry fault wptr overflow
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done
while handling retry fault.

Remove VMC from IH storm client, enable ring1 write pointer
overflow, then IH will drop retry fault interrupts and be able to receive
other interrupts while driver is handling retry fault.

IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
71ee9236ab drm/amdgpu: enable Navi 48-bit IH timestamp counter
By default this timestamp is 32 bit counter. It gets overflowed in
around 10 minutes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
2e4477282c drm/amdkfd: simplify drain retry fault
unmap range always increase atomic svms->drain_pagefaults to simplify
both parent range and child range unmap, page fault handle ignores the
retry fault if svms->drain_pagefaults is set to speed up interrupt
handling. svm_range_drain_retry_fault restart draining if another
range unmap from cpu.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
7ad153db58 drm/amdkfd: handle VMA remove race
VMA may be removed before unmap notifier callback, and deferred list
work remove range, return success for this special case as we are
handling stale retry fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
a0c55ecee1 drm/amdkfd: process exit and retry fault race
kfd_process_wq_release drain retry fault to ensure no retry fault comes
after removing kfd process from the hash table, otherwise svm page fault
handler will fail to recover the fault and dump GPU vm fault log.

Refactor deferred list work to get_task_mm and take mmap write lock
to handle all ranges, and avoid mm is gone while inserting mmu notifier.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang
514f4a99c7 drm/amdgpu: IH process reset count when restart
Otherwise when IH process restart, count is zero, the loop will
not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS
interrupts.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Surbhi Kakarya
3a50403f8b drm/amd/pm: add new fields for Sienna Cichlid.
Fill voltage fields in metrics table.

Signed-off-by: Surbhi Kakarya <Surbhi.Kakarya@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Luben Tuikov
e771d71d8d drm/amd/pm: Print the error on command submission
Print the error on command submission immediately after submitting to
the SMU. This is rate-limited. It helps to immediately know there was an
error on command submission, rather than leave it up to clients to report
the error, as sometimes they do not.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Luben Tuikov
dc78fea1e7 drm/amd/pm: Sienna: Print failed BTC
Add a print in sienna_cichlid_run_btc() to help debug and to mirror other
platforms, as no print is present in the caller, smu_smc_hw_setup().

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Luben Tuikov
ca4b32bb2d drm/amd/pm: Add debug prints
Add prints where there are none and none are printed in the callee.

Remove the word "previous" from comment and print to make it shorter and
avoid confusion in various prints.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Evan Quan
1223c15c78 drm/amdgpu: update the domain flags for dumb buffer creation
After switching to generic framebuffer framework, we rely on the
->dumb_create routine for frame buffer creation. However, the
different domain flags used are not optimal. Add the contiguous
flag to directly allocate the scanout BO as one linear buffer.

Fixes: 087451f372 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Ramesh Errabolu
37ba5bbc89 drm/amdgpu: Declare Unpin BO api as static
Fixes warning report from kernel test robot

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Alex Deucher
7b37c7f8f5 drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Renoir and newer gfx9 APUs have new TSC register that is
not part of the gfxoff tile, so it can be read without
needing to disable gfx off.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Alex Deucher
f75de84475 drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Apply the same check we do for dGPUs for APUs as well.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:52 -05:00