Commit Graph

12937 Commits

Author SHA1 Message Date
Felix Kuehling
b9274387bc drm/amdkfd: Don't trigger evictions unmapping dmabuf attachments
Don't move DMABuf attachments for PCIe P2P mappings to the SYSTEM domain
when unmapping. This avoids triggering eviction fences unnecessarily.
Instead do the move to SYSTEM and back to GTT when mapping these
attachments to ensure the SG table gets updated after evictions.

This may still trigger unnecessary evictions if user mode unmaps and
remaps the same BO. However, this is unlikely in real applications.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:33:59 -04:00
Jiapeng Chong
1ffbc89c30 drm/amdgpu: remove unneeded semicolon
No functional modification involved.

./drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c:146:2-3: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4871
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:33:50 -04:00
Le Ma
4667fbe2f7 drm/amdgpu: do gfxhub init for all XCDs
Each XCD needs to do gfxhub init

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:33:47 -04:00
Hamza Mahfooz
c0c2742890 drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini()
As made mention of in commit c56edea58c31 ("drm/amdgpu: fix
amdgpu_irq_put call trace in gmc_v10_0_hw_fini") and commit aa6ac247ed7d
("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini"). It
is meaningless to call amdgpu_irq_put() for gmc.ecc_irq. So, remove it
from gmc_v9_0_hw_fini().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522
Fixes: c8b5a95b57 ("drm/amdgpu: Fix desktop freezed after gpu-reset")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:33:03 -04:00
Dan Carpenter
f4409a2361 drm/amdgpu: unlock on error in gfx_v9_4_3_kiq_resume()
Smatch complains that we need to drop this lock before returning.

    drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1838 gfx_v9_4_3_kiq_resume()
    warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'.

Fixes: 8630112969 ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:32:27 -04:00
Dan Carpenter
7a6a2e59aa drm/amdgpu: unlock the correct lock in amdgpu_gfx_enable_kcq()
We changed which lock we are supposed to take but this error path
was accidentally over looked so it still drops the old lock.

Fixes: def799c659 ("drm/amdgpu: add multi-xcc support to amdgpu_gfx interfaces (v4)")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:31:46 -04:00
Alex Deucher
0e5f625157 drm/amdgpu: drop unused function
amdgpu_discovery_get_ip_version() has not been used since
commit c40bdfb2ff ("drm/amdgpu: fix incorrect VCN revision in SRIOV")
so drop it.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:31:31 -04:00
Alex Deucher
29551fd90e drm/amdgpu: drop invalid IP revision
This was already fixed and dropped in:
commit baf3f8f374 ("drm/amdgpu: handle SRIOV VCN revision parsing")
commit c40bdfb2ff ("drm/amdgpu: fix incorrect VCN revision in SRIOV")
But seems to have been accidently been left around in a merge.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:31:04 -04:00
Alex Deucher
1cfb4d6121 drm/amdgpu: put MQDs in VRAM
Reduces preemption latency.
Only enable this for gfx10 and 11 for now
to avoid changing behavior on gfx 8 and 9.

v2: move MES MQDs into VRAM as well (YuBiao)
v3: enable on gfx10, 11 only (Alex)
v4: minor style changes, document why gfx10/11 only (Alex)

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:30:58 -04:00
Guchun Chen
967a66396e drm/amdgpu: drop redundant sched job cleanup when cs is aborted
Once command submission failed due to userptr invalidation in
amdgpu_cs_submit, legacy code will perform cleanup of scheduler
job. However, it's not needed at all, as former commit has integrated
job cleanup stuff into amdgpu_job_free. Otherwise, because of double
free, a NULL pointer dereference will occur in such scenario.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2457
Fixes: f7d66fb2ea ("drm/amdgpu: cleanup scheduler job initialization v2")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:58 -04:00
Srinivasan Shanmugam
9e69018458 drm/amd/amdgpu: Fix errors & warnings in amdgpu _bios, _cs, _dma_buf, _fence.c
The following checkpatch errors & warning is removed.

ERROR: else should follow close brace '}'
ERROR: trailing statements should be on next line
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: Possible repeated word: 'Fences'
WARNING: Missing a blank line after declarations
WARNING: braces {} are not necessary for single statement blocks
WARNING: Comparisons should place the constant on the right side of the test
WARNING: printk() should include KERN_<LEVEL> facility level

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:51 -04:00
Yifan Zhang
0e768043bf drm/amdgpu: set gfx9 onwards APU atomics support to be true
APUs w/ gfx9 onwards doesn't reply on PCIe atomics, rather
it is internal path w/ native atomic support. Set have_atomics_support
to true.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Lang Yu <lang.yu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:48 -04:00
Alex Deucher
09d8a67912 drm/amdgpu/gfx11: always restore kcq/kgq MQDs
Always restore the MQD not just when we do a reset.
This allows us to move the MQD to VRAM if we want.

v2: always reset ring pointer as well (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:44 -04:00
Thong Thai
c6fa6fe9eb drm/amdgpu/nv: update VCN 3 max HEVC encoding resolution
Update the maximum resolution reported for HEVC encoding on VCN 3
devices to reflect its 8K encoding capability.

v2: Also update the max height for H.264 encoding to match spec.
(Ruijing)

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:39 -04:00
Alex Deucher
2dbaf83998 drm/amdgpu/gfx10: always restore kcq/kgq MQDs
Always restore the MQD not just when we do a reset.
This allows us to move the MQD to VRAM if we want.

v2: always reset ring pointer as well (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:07 -04:00
Alex Deucher
45b54a7dd3 drm/amdgpu/gfx9: always restore kcq MQDs
Always restore the MQD not just when we do a reset.
This allows us to move the MQD to VRAM if we want.

v2: always reset ring pointer as well (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:29:04 -04:00
Alex Deucher
42cdf6f687 drm/amdgpu/gfx8: always restore kcq MQDs
Always restore the MQD not just when we do a reset.
This allows us to move the MQD to VRAM if we want.

v2: always reset ring pointer as well (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:58 -04:00
Alex Deucher
b848fe65f8 drm/amdgpu/gfx11: drop unused variable
Just check the return value directly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:55 -04:00
Alex Deucher
d78e816a3d drm/amdgpu/gfx10: drop unused variable
Just check the return value directly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:50 -04:00
Alex Deucher
83033f72a4 drm/amdgpu/gfx11: use generic [en/dis]able_kgq() helpers
And remove the duplicate local variants.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:39 -04:00
Alex Deucher
f39c25357f drm/amdgpu/gfx10: use generic [en/dis]able_kgq() helpers
And remove the duplicate local variants.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:35 -04:00
Alex Deucher
1156e1a60f drm/amdgpu: add [en/dis]able_kgq() functions
To replace the IP specific variants which are largely
duplicate.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:28:28 -04:00
Srinivasan Shanmugam
f14c8c3e1f drm/amd/amdgpu: Fix style problems in amdgpu_psp.c
Fix the following checkpatch warnings & error in amdgpu_psp.c

WARNING: Comparisons should place the constant on the right side of the test
WARNING: braces {} are not necessary for single statement blocks
WARNING: please, no space before tabs
WARNING: braces {} are not necessary for single statement blocks
ERROR: that open brace { should be on the previous line

Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:27:08 -04:00
Alex Deucher
edacf33357 drm/amdgpu/gfx10: drop old bring up code
No longer used.  Remove it.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:27:06 -04:00
Alex Deucher
6ae869b9b6 drm/amdgpu/gfx11: drop old bring up code
No longer used.  Remove it.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:26:50 -04:00
Srinivasan Shanmugam
8fa7635058 drm/amd/amdgpu: Fix style problems in amdgpu_debugfs.c
Fix the following issues reported by checkpatch:

WARNING: please, no space before tabs
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: sizeof *rd should be sizeof(*rd)
WARNING: Missing a blank line after declarations
WARNING: sizeof rd->id should be sizeof(rd->id)
WARNING: static const char * array should probably be static const char * const
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
WARNING: Prefer seq_puts to seq_printf
ERROR: space prohibited after that open parenthesis '('

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:54 -04:00
YuBiao Wang
d446127107 drm/amdgpu: Enable mcbp under sriov by default
Enable mcbp under sriov by default. Asics with soc21 supports mcbp now
so we should set it enabled.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:50 -04:00
Xiaomeng Hou
04b3c34f5c drm/amdgpu: remove pasid_src field from IV entry
PASID_SRC is not actually present in the Interrupt Packet, the field is
taken as reserved bits now. So remove it from IV entry to avoid misuse.

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:47 -04:00
Srinivasan Shanmugam
a6f7baa387 drm/amd/amdgpu: Simplify switch case statements in amdgpu_connectors.c
Fix the following checkpatch errors:

ERROR: trailing statements should be on next line
ERROR: space required after that ',' (ctx:VxV)
ERROR: code indent should use tabs where possible

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:43 -04:00
Horace Chen
9f58341d63 drm/amdgpu: disable SDMA WPTR_POLL_ENABLE for SR-IOV
[Why]
This WPTR_POLL_ENABLE is a hardware contigious polling which will cause
FCLK and UCLK to keep on a high level.
Mostly its case can be covered by F32_WPTR_POLL_ENABLE which polls by
firmware.
So to save power, SR-IOV also needs to disable this bit

Signed-off-by: Horace Chen <horace.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:40 -04:00
Stanley.Yang
64e2e71737 drm/amdgpu: Add SDMA_UTCL1_WR_FIFO_SED field for sdma_v4_4_ras_field
Query sdma_utcl1_wr_fifo_sed fiel to detect UTCL1_WR_FIFO SED error counts

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:37 -04:00
Chia-I Wu
514987a5bc drm/amdgpu: add a missing lock for AMDGPU_SCHED
mgr->ctx_handles should be protected by mgr->lock.

v2: improve commit message
v3: add a Fixes tag

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: 52c6a62c64 ("drm/amdgpu: add interface for editing a foreign process's priority v3")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:25:02 -04:00
Mukul Joshi
27fb73a0e3 drm/amdkfd: Update KFD TTM mem limit
Use the helper function in TTM to get TTM memory
limit and set KFD's internal mem limit. This ensures
that KFD's TTM mem limit and actual TTM mem limit are
exactly same.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:24:52 -04:00
Mukul Joshi
f1f6f48a33 drm/amdgpu: Set GTT size equal to TTM mem limit
Use the helper function in TTM to get TTM mem limit and
set GTT size to be equal to TTL mem limit.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:24:46 -04:00
Guchun Chen
3c4f6507ab drm/amdgpu: mark gfx_v9_4_3_disable_gpa_mode() static
This was left global by accident, the corresponding functions for other hardware types are already static:

drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1072:6: error: no previous prototype for function 'gfx_v9_4_3_disable_gpa_mode' [-Werror,-Wmissing-prototypes]

Fixes: 8630112969 ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:24:11 -04:00
Guchun Chen
34305ac364 drm/amdgpu: check correct allocated mqd_backup object after alloc
Instead of the default one, check the right mqd_backup object.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Cc: Le Ma <le.ma@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:24:07 -04:00
Guchun Chen
7a4685cdfb drm/amdgpu: fix a build warning by a typo in amdgpu_gfx.c
This should be a typo when intruducing multi-xx support.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Cc: Le Ma <le.ma@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:24:02 -04:00
Horatio Zhang
c953cf0406 drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v10_0_hw_fini
The gmc.ecc_irq is enabled by firmware per IFWI setting,
and the host driver is not privileged to enable/disable
the interrupt. So, it is meaningless to use the amdgpu_irq_put
function in gmc_v10_0_hw_fini, which also leads to the call
trace.

[   82.340264] Call Trace:
[   82.340265]  <TASK>
[   82.340269]  gmc_v10_0_hw_fini+0x83/0xa0 [amdgpu]
[   82.340447]  gmc_v10_0_suspend+0xe/0x20 [amdgpu]
[   82.340623]  amdgpu_device_ip_suspend_phase2+0x127/0x1c0 [amdgpu]
[   82.340789]  amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
[   82.340955]  amdgpu_device_pre_asic_reset+0xdd/0x2b0 [amdgpu]
[   82.341122]  amdgpu_device_gpu_recover.cold+0x4dd/0xbb2 [amdgpu]
[   82.341359]  amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
[   82.341529]  process_one_work+0x21d/0x3f0
[   82.341535]  worker_thread+0x1fa/0x3c0
[   82.341538]  ? process_one_work+0x3f0/0x3f0
[   82.341540]  kthread+0xff/0x130
[   82.341544]  ? kthread_complete_and_exit+0x20/0x20
[   82.341547]  ret_from_fork+0x22/0x30

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:23:13 -04:00
Horatio Zhang
7e5b601008 drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini
The gmc.ecc_irq is enabled by firmware per IFWI setting,
and the host driver is not privileged to enable/disable
the interrupt. So, it is meaningless to use the amdgpu_irq_put
function in gmc_v11_0_hw_fini, which also leads to the call
trace.

[  102.980303] Call Trace:
[  102.980303]  <TASK>
[  102.980304]  gmc_v11_0_hw_fini+0x54/0x90 [amdgpu]
[  102.980357]  gmc_v11_0_suspend+0xe/0x20 [amdgpu]
[  102.980409]  amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
[  102.980459]  amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
[  102.980520]  amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
[  102.980573]  amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
[  102.980687]  amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
[  102.980740]  process_one_work+0x21f/0x3f0
[  102.980741]  worker_thread+0x200/0x3e0
[  102.980742]  ? process_one_work+0x3f0/0x3f0
[  102.980743]  kthread+0xfd/0x130
[  102.980743]  ? kthread_complete_and_exit+0x20/0x20
[  102.980744]  ret_from_fork+0x22/0x30

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:23:00 -04:00
Shane Xiao
1c312e816c drm/amdgpu: Enable doorbell selfring after resize FB BAR
[Why]
The selfring doorbell aperture will change when resize FB
BAR successfully during gmc sw init, we should reorder
the sequence of enabling doorbell selfring aperture.

[How]
Move enable_doorbell_selfring_aperture from *_common_hw_init
to *_common_late_init.

This fixes the potential issue that GPU ring its own
doorbell when this device is in translated mode when
iommu is on.

v2: Remove *_enable_doorbell_aperture functions (Christian)
v3: Add comments to note that why we need enable doorbell
    selfring late (Christian)

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Tested-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Christian K�nig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:22:55 -04:00
Srinivasan Shanmugam
b2edaac4f2 drm/amd/amdgpu: Fix style errors in amdgpu_display.c
Fix following checkpatch errors in amdgpu_display.c

ERROR: spaces required around that '=' (ctx:VxW)
ERROR: that open brace { should be on the previous line
ERROR: else should follow close brace '}'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:22:50 -04:00
lyndonli
59e9fff198 drm/amdgpu: Use the default reset when loading or reloading the driver
Below call trace and errors are observed when reloading
amdgpu driver with the module parameter reset_method=3.

It should do a default reset when loading or reloading the
driver, regardless of the module parameter reset_method.

v2: add comments inside and modify commit messages.

[  +2.180243] [drm] psp gfx command ID_LOAD_TOC(0x20) failed
and response status is (0x0)
[  +0.000011] [drm:psp_hw_start [amdgpu]] *ERROR* Failed to load toc
[  +0.000890] [drm:psp_hw_start [amdgpu]] *ERROR* PSP tmr init failed!
[  +0.020683] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to
clear memory with ring turned off.
[  +0.000003] RIP: 0010:amdgpu_bo_release_notify+0x1ef/0x210 [amdgpu]
[  +0.000004] Call Trace:
[  +0.000003]  <TASK>
[  +0.000008]  ttm_bo_release+0x2c4/0x330 [amdttm]
[  +0.000026]  amdttm_bo_put+0x3c/0x70 [amdttm]
[  +0.000020]  amdgpu_bo_free_kernel+0xe6/0x140 [amdgpu]
[  +0.000728]  psp_v11_0_ring_destroy+0x34/0x60 [amdgpu]
[  +0.000826]  psp_hw_init+0xe7/0x2f0 [amdgpu]
[  +0.000813]  amdgpu_device_fw_loading+0x1ad/0x2d0 [amdgpu]
[  +0.000731]  amdgpu_device_init.cold+0x108e/0x2002 [amdgpu]
[  +0.001071]  ? do_pci_enable_device+0xe1/0x110
[  +0.000011]  amdgpu_driver_load_kms+0x1a/0x160 [amdgpu]
[  +0.000729]  amdgpu_pci_probe+0x179/0x3a0 [amdgpu]

Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:21:42 -04:00
lyndonli
535f778610 drm/amdgpu: Fix mode2 reset for sienna cichlid
Before this change, sienna_cichlid_get_reset_handler will always
return NULL, although the module parameter reset_method is 3
when loading amdgpu driver.

Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:21:03 -04:00
Pierre-Eric Pelloux-Prayer
489763af89 drm/amdgpu: add new flag to AMDGPU_CTX_QUERY2
OpenGL EXT_robustness extension expects the driver to stop reporting
GUILTY_CONTEXT_RESET when the reset has completed and the GPU is ready
to accept submission again.

This commit adds a AMDGPU_CTX_QUERY2_FLAGS_RESET_IN_PROGRESS flag,
that let the UMD know that the reset is still not finished.

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:20:55 -04:00
Sukrut Bellary
609d830048 drm:amd:amdgpu: Fix missing bo unlock in failure path
smatch warning - inconsistent handling of buffer object reserve
and unreserve.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sukrut Bellary <sukrut.bellary@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:20:26 -04:00
YiPeng Chai
dac652220b drm/amdgpu: change reserved vram info print
The link object of mgr->reserved_pages is the blocks
variable in struct amdgpu_vram_reservation, not the
link variable in struct drm_buddy_block.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-06-07 17:01:17 -04:00
Chia-I Wu
b447b079cf drm/amdgpu: fix xclk freq on CHIP_STONEY
According to Alex, most APUs from that time seem to have the same issue
(vbios says 48Mhz, actual is 100Mhz).  I only have a CHIP_STONEY so I
limit the fixup to CHIP_STONEY

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-06-07 17:00:35 -04:00
Alex Deucher
73c12de8be Revert "drm/amdgpu: switch to golden tsc registers for raven/raven2"
This reverts commit f03eb1d26c.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: Jesse.Zhang@amd.com
Cc: michel@daenzer.net
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07 16:59:56 -04:00
Alex Deucher
d511f95938 Revert "drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according to revision id"
This reverts commit 9d2d1827af.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: Jesse.Zhang@amd.com
Cc: michel@daenzer.net
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07 16:59:48 -04:00
Alex Deucher
3b3ffd729e Revert "drm/amdgpu: change the reference clock for raven/raven2"
This reverts commit fbc24293ca.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: Jesse.Zhang@amd.com
Cc: michel@daenzer.net
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-07 16:59:41 -04:00