Commit Graph

283 Commits

Author SHA1 Message Date
Graham Sider
7eb0502ac0 drm/amdkfd: replace asic_family with asic_type
asic_family was a duplicate of asic_type, both of type amd_asic_type.
Replace all instances of device_info->asic_family with adev->asic_type
and remove asic_family from device_info.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:10:01 -05:00
Graham Sider
046e674b96 drm/amdkfd: convert misc checks to IP version checking
Switch to IP version checking instead of asic_type on various KFD
version checks.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:09:46 -05:00
Graham Sider
e4804a39ba drm/amdkfd: convert switches to IP version checking
Converts KFD switch statements to use IP version checking instead
of asic_type.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:09:36 -05:00
Graham Sider
02274fc0f6 drm/amdkfd: replace trivial funcs with direct access
These get funcs simply return an adev field. Replace funcs/calls with
direct field accesses instead.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:11 -05:00
Graham Sider
b5d1d755c1 drm/amdkfd: remove kgd_dev declaration and initialization
Completes removal of kgd_dev. Direct references to amdgpu_device objects
should now be used instead.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Graham Sider
56c5977eae drm/amdkfd: replace/remove remaining kgd_dev references
Remove get_amdgpu_device and other remaining kgd_dev references aside
from declaration/kfd struct entry and initialization.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
574c4183ef drm/amdkfd: replace kgd_dev in get amdgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes

Also replaces kfd_device_by_kgd with kfd_device_by_adev, now
searching via adev rather than kgd.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
6bfc7c7e17 drm/amdkfd: replace kgd_dev in various amgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_submit_ib
- amdgpu_amdkfd_set_compute_idle
- amdgpu_amdkfd_have_atomics_support
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_gpu_reset
- amdgpu_amdkfd_alloc_gtt_mem
- amdgpu_amdkfd_free_gtt_mem
- amdgpu_amdkfd_alloc_gws
- amdgpu_amdkfd_free_gws
- amdgpu_amdkfd_ras_poison_consumption_handler

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider
c6c5744638 drm/amdkfd: add amdgpu_device entry to kfd_dev
Patch series to remove kgd_dev struct and replace all instances with
amdgpu_device objects.

amdgpu_device needs to be declared in kgd_kfd_interface.h to be visible
to kfd2kgd_calls.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Graham Sider
cc22b92761 drm/amdkfd: update gfx target version for Renoir
Previously Renoir compiler gfx target version was forced to Raven.
Update driver side for completeness.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Alex Deucher
0f3d2b6804 drm/amdkfd: protect raven_device_info with KFD_SUPPORT_IOMMU_V2
raven_device_info is not used when KFD_SUPPORT_IOMMU_V2 is not
set.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher
18f12604f5 drm/amdkfd: protect hawaii_device_info with CONFIG_DRM_AMDGPU_CIK
hawaii_device_info is not used when CONFIG_DRM_AMDGPU_CIK is not
set.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Yifan Zhang
6f4b590aae drm/amdkfd: fix resume error when iommu disabled in Picasso
When IOMMU disabled in sbios and kfd in iommuv2 path,
IOMMU resume failure blocks system resume. Don't allow kfd to
use iommu v2 when iommu is disabled.

Reported-by: youling <youling257@gmail.com>
Tested-by: youling <youling257@gmail.com>
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:16:02 -04:00
Yifan Zhang
afd18180c0 drm/amdkfd: fix boot failure when iommu is disabled in Picasso.
When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2
init will fail. But this failure should not block amdgpu driver init.

Reported-by: youling <youling257@gmail.com>
Tested-by: youling <youling257@gmail.com>
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:15:52 -04:00
Yifan Zhang
499f4d38ec drm/amdkfd: remove redundant iommu cleanup code
kfd_resume doesn't involve iommu operation, remove
redundant iommu cleanup code.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:15 -04:00
Alex Deucher
c868d58442 drm/amdkfd: convert kfd_device.c to use GC IP version
rather than asic type.

v2: fix up CZ case

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
5b983db8c3 drm/amdkfd: clean up parameters in kgd2kfd_probe
We can get the pdev and asic type from the adev.  No need
to pass them explicitly.

v2: squash in build fix for !CONFIG_HSA_AMD from Anson

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Philip Yang
22f4f4faf3 drm/amdkfd: fix svm_migrate_fini warning
Device manager releases device-specific resources when a driver
disconnects from a device, devm_memunmap_pages and
devm_release_mem_region calls in svm_migrate_fini are redundant.

It causes below warning trace after patch "drm/amdgpu: Split
amdgpu_device_fini into early and late", so remove function
svm_migrate_fini.

BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718

WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
devm_release_action+0x51/0x60
Call Trace:
    ? memunmap_pages+0x360/0x360
    svm_migrate_fini+0x2d/0x60 [amdgpu]
    kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
    amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
    amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
    drm_dev_release+0x20/0x40 [drm]
    release_nodes+0x196/0x1e0
    device_release_driver_internal+0x104/0x1d0
    driver_detach+0x47/0x90
    bus_remove_driver+0x7a/0xd0
    pci_unregister_driver+0x3d/0x90
    amdgpu_exit+0x11/0x20 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:57 -04:00
James Zhu
f8846323d5 drm/amdkfd: separate kfd_iommu_resume from kfd_resume
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:46 -04:00
Felix Kuehling
e312af6c2a drm/amdkfd: make needs_pcie_atomics FW-version dependent
On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.

Move PCIe atomic detection from kgd2kfd_probe into kgd2kfd_device_init
because the MEC firmware is not loaded yet at the probe stage.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:56:50 -04:00
Graham Sider
9d6fa9c7ff drm/amdkfd: Expose GFXIP engine version to sysfs
Add u32 gfx_target_version field to kfd_node_properties and
kfd_device_info. Populate <asic>_device_info structs accordingly and
expose to sysfs.

This allows eliminating device-ID-based lookup tables in user mode for
future ASICs.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:18:00 -04:00
Tao Zhou
06e75b88e8 drm/amdkfd: enable cyan_skillfish KFD
Add KFD support for cyan_skillfish.

v2: whitespace fixes (Alex)

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Graham Sider
410e302ea5 drm/amdkfd: Update SMI throttle event bitmask
Update Arcturus/Aldebaran thermal throttle SMI event path to use
ASIC-independent throttler bits when logging.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Oak Zeng
4f942aaeb1 drm/amdkfd: Fix a concurrency issue during kfd recovery
start_cpsch and stop_cpsch can be called during kfd device
initialization or during gpu reset/recovery. So they can
run concurrently. Currently in start_cpsch and stop_cpsch,
pm_init and pm_uninit is not protected by the dpm lock.
Imagine such a case that user use packet manager's function
to submit a pm4 packet to hang hws (ie through command
cat /sys/class/kfd/kfd/topology/nodes/1/gpu_id | sudo tee
/sys/kernel/debug/kfd/hang_hws), while kfd device is under
device reset/recovery so packet manager can be not initialized.
There will be unpredictable protection fault in such case.

This patch moves pm_init/uninit inside the dpm lock and check
packet manager is initialized before using packet manager
function.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Oak Zeng
9af5379c85 drm/amdkfd: Renaming dqm->packets to dqm->packet_mgr
Renaming packets to packet_mgr to reflect the real meaning
of this variable.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Aaron Liu
bf9d4e88c2 drm/amdkfd: add yellow carp KFD support
This patch is to add GFX10 based Yellow Carp KFD support.
We will bypass IOMMU v2.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:09 -04:00
Thomas Zimmermann
304ba5dca4 Merge drm/drm-next into drm-misc-next
Backmerging from drm/drm-next to the patches for AMD devices
for v5.14.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-05-22 07:17:05 +02:00
Andrey Grodzovsky
e9669fb782 drm/amdgpu: Add early fini callback
Use it to call disply code dependent on device->drv_data
before it's set to NULL on device unplug

v5:
Move HW finilization into this callback to prevent MMIO accesses
post cpi remove.

v7:
Split kfd suspend from device exit to expdite HW related
stuff to amdgpu_pci_remove

v8:
Squash previous KFD commit into this commit to avoid compile break.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520032057.497334-1-andrey.grodzovsky@amd.com
2021-05-19 23:48:50 -04:00
Chengming Gui
c86eb51705 drm/amdkfd: add kfd2kgd funcs for beige_goby kfd support
Add the function pointer.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19 22:40:42 -04:00
Chengming Gui
5cf607cc35 drm/amdkfd: support beige_goby KFD
Add KFD support for beige_goby
v2: fix asic name typo
v3: squash in updates (Alex)
v4: squash in needs_atomics fix (Alex)

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19 22:40:40 -04:00
Zhigang Luo
cecd91b4f7 drm/amdkfd: Add Aldebaran virtualization support
update kfd_supported_devices to enable Aldebaran virtualization support

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:06:43 -04:00
Harish Kasiviswanathan
8baa6018b7 drm/amdkfd: Add Aldebaran gws support
v2: updated MEC FW version after validating gws with debugger

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-28 23:36:05 -04:00
Jonathan Kim
fd6a440ebc drm/amdkfd: add per-vmid-debug map_process_support
In order to support multi-process debugging, HWS PM4 packet
MAP_PROCESS requires an extension of 5 DWORDS to support targeting of
per-vmid SPI debug control registers as well as watch points per process.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-23 17:16:05 -04:00
Philip Yang
814ab9930c drm/amdkfd: register HMM device private zone
Register vram memory as MEMORY_DEVICE_PRIVATE type resource, to
allocate vram backing pages for page migration.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:47:54 -04:00
Amber Lin
158fc08d17 drm/amdkfd: Avoid null pointer in SMI event
Power Management IP is initialized/enabled before KFD init. When a
thermal throttling happens before kfd_smi_init is done, calling the KFD
SMI update function causes a stack dump by referring a NULL pointer (
smi_clients list). Check if kfd_init is completed before calling the
function.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:50:42 -04:00
Jonathan Kim
5073506c7e drm/amdkfd: add aldebaran kfd2kgd callbacks to kfd device (v2)
Create dedicated Aldebaran kfd2kgd callbacks to prepare
for new per-vmid register instructions for debug trap
setting functions and sending host traps.

v2: rebase (Alex)

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:28 -04:00
Jay Cornwall
0ef6845c8c drm/amdkfd: Add aldebaran trap handler support
Similar to arcturus, but ARCH/ACC VGPRs may now be split unevenly.
A new field in SQ_WAVE_GPR_ALLOC tracks the boundary between the two
sets of VGPRs.

Squash below patches:

drm/amdkfd: Use preprocessor for IP-specific trap handler code
drm/amdkfd: Fix VGPR restore race in gfx8/gfx9 trap handler
drm/amdkfd: Remove duplicated code in gfx9 trap handler
drm/amdkfd: Separate ARCH/ACC VGPR restore in trap handler
drm/amdkfd: Reverse order of ARCH/ACC VGPR restore in trap handler

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:24 -05:00
Yong Zhao
36e22d59dd drm/amdkfd: Add Aldebaran KFD support
Add initial KFD support.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:13 -05:00
Harish Kasiviswanathan
6cc980e3f5 drm/amdkfd: PCIe atomics required for gfx10
GFX10 CP firmware expects PCIe atomics support. Don't enumerate GFX10
devices on platforms (PCIe v2) that don't support PCIe atomics.

Currently, some of the applications like clinfo soft hangs on platforms
without PCIe atomics support.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-17 16:43:14 -05:00
Kent Russell
d95c368ab8 drm/amdkfd: Fix getting unique_id in topology
Since the unique_id is now obtained in amdgpu in smu_late_init,
topology misses getting the value during KFD device initialization.
To work around this, we use amdgpu_amdkfd_get_unique_id to get
the unique_id at read time. Due to this, we can remove unique_id from
the kfd_dev structure, since we only need it in the KFD node properties
struct

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-30 00:59:42 -04:00
Chengming Gui
8f72ce6421 drm/amdkfd: Add kfd2kgd_funcs for dimgrey_cavefish kfd support
Add KFD support.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12 14:01:25 -04:00
Chengming Gui
eb5a34d482 drm/amdkfd: Support dimgrey_cavefish KFD (v2)
Add KFD support for dimgrey cavefish.

v2: rebase (Alex)

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12 14:01:21 -04:00
Huang Rui
3a5e715de1 drm/amdkfd: add Van Gogh KFD support
This patch is to add GFX10 based APU Van Gogh KFD support. We will treat Van
Gogh as "dgpu" (bypass IOMMU v2).

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Alex Deucher
9b498efae2 drm/amdgpu: store noretry parameter per driver instance
This will allow us to have different defaults per asic
in a future patch.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:16 -04:00
Mukul Joshi
59d7115dae drm/amdkfd: Move process doorbell allocation into kfd device
Move doorbell allocation for a process into kfd device and
allocate doorbell space in each PDD during process creation.
Currently, KFD manages its own doorbell space but for some
devices, amdgpu would allocate the complete doorbell
space instead of leaving a chunk of doorbell space for KFD to
manage. In a system with mix of such devices, KFD would need
to request process doorbell space based on the type of device,
either from amdgpu or from its own doorbell space.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 12:25:02 -04:00
YueHaibing
2b3bbf2354 drm/amdkfd: Fix -Wunused-const-variable warning
If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: ‘raven_device_info’ defined but not used [-Wunused-const-variable=]
 static const struct kfd_device_info raven_device_info = {
                                     ^~~~~~~~~~~~~~~~~

As Huang Rui suggested, Raven already has the fallback path,
so it should be out of IOMMU v2 flag.

Suggested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Mukul Joshi
55977744f9 drm/amdkfd: Add GPU reset SMI event
Add support for reporting GPU reset events through SMI. KFD
would report both pre and post GPU reset events.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-31 14:40:03 -04:00
Felix Kuehling
332f6e1e98 drm/amdkfd: call amdgpu_amdkfd_get_hive_id directly
No need to use a function pointer because the implementation is not
ASIC-specific.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Felix Kuehling
817154c1a2 drm/amdkfd: call amdgpu_amdkfd_get_unique_id directly
No need to use a function pointer because the implementation is not
ASIC-specific. This fixes missing support due to a missing function
pointer on Arcturus.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Huang Rui
6127896f4a drm/amdkfd: implement the dGPU fallback path for apu (v6)
We still have a few iommu issues which need to address, so force raven
as "dgpu" path for the moment.

This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
or ACPI CRAT table not correct.

v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2.
v3: Align with existed thunk, don't change the way of raven, only renoir
    will use "dgpu" path by default.
v4: don't update global ignore_crat in the driver, and revise fallback
    function if CRAT is broken.
v5: refine acpi crat good but no iommu support case, and rename the
    title.
v6: fix the issue of dGPU initialized firstly, just modify the report
    value in the node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:17 -04:00