linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-25 10:02:31 -04:00

Author	SHA1	Message	Date
Xiaogang Chen	ca5d4db8db	drm/amdgpu: Use correct address to setup gart page table for vram access Use dst input parameter to setup gart page table entries instead of using fixed location. Fixes: `237d623ae6` ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)") Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-10 14:06:39 -05:00
YuBiao Wang	39c21b8111	drm/amdgpu: Skip loading SDMA_RS64 in VF VFs use the PF SDMA ucode and are unable to load SDMA_RS64. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Signed-off-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Reviewed-by: Gavin Wan <gavin.wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-10 14:06:33 -05:00
Peter Colberg	38a0f4cf8c	Revert duplicate "drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces" This reverts commit `22a36e660d` once, which was merged twice due to an incorrect backmerge resolution. Fixes: `ce0478b02e` ("Merge tag 'v6.18-rc6' into drm-next") Signed-off-by: Peter Colberg <pcolberg@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 15:18:13 -05:00
Mario Limonciello (AMD)	6a23e7b433	drm/amd: Clean up kfd node on surprise disconnect When an eGPU is unplugged the KFD topology should also be destroyed for that GPU. This never happens because the fini_sw callbacks never get to run. Run them manually before calling amdgpu_device_ip_fini_early() when a device has already been disconnected. This location is intentionally chosen to make sure that the kfd locking refcount doesn't get incremented unintentionally. Cc: kent.russell@amd.com Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33 Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:43:23 -05:00
Hawking Zhang	5e3f50fda2	drm/amdgpu: Extend psp_skip_tmr for bare-metal and sriov In SRIOV, guest drivers no longer setup/destory VMR starting from mp0 v11_0_7. In bare-metal, if boot-time TMR is enabled, some generation (e.g., mp0 v13_0_x) don’t need runtime TMR allocation but still require SETUP_TMR command with tmr address 0 for backward compatibility. some newer generations require neither SETUP_TMR nor DESTROY_TMR and will return errors if they are sent. Driver relies on boot_time_tmr and autoload_supported to handle these cases correctly. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:43:09 -05:00
Philip Yang	698fa62f56	drm/amdgpu: Add helper to alloc GART entries Add helper amdgpu_gtt_mgr_alloc/free_entries, define GART_ENTRY_WITHOUT_BO_COLOR color for GART node not allocated with GTT bo, then amdgpu_gtt_mgr_recover skip those mm_node. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:42:53 -05:00
Lu Yao	2f2a72de67	drm/amdgpu: fix drm panic null pointer when driver not support atomic When driver not support atomic, fb using plane->fb rather than plane->state->fb. Fixes: `fe151ed7af` ("drm/amdgpu: add generic display panic helper code") Signed-off-by: Lu Yao <yaolu@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:41:46 -05:00
Pratik Vishwakarma	c7fc0f3723	drm/amd: Enable SMU 15_0_0 support Add SMU 15_0_0 v2: rebase (Alex) v3: fix clang build (Alex) Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:41:42 -05:00
Pratik Vishwakarma	52cd9bc47f	drm/amd: Enable SMUIO 15_0_0 support Add SMUIO 15_0_0. Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:41:34 -05:00
Philip Yang	fc1366016a	drm/amdgpu: Fix gfx9 update PTE mtype flag Fix copy&paste error, that should have been an assignment instead of an or, otherwise MTYPE_UC 0x3 can not be updated to MTYPE_RW 0x1. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-08 11:40:45 -05:00
Mario Limonciello (AMD)	6b2989ac5e	Reapply "Revert "drm/amd: Skip power ungate during suspend for VPE"" Skipping power ungate exposed some scenarios that will fail like below: ``` amdgpu: Register(0) [regVPEC_QUEUE_RESET_REQ] failed to reach value 0x00000000 != 0x00000001n amdgpu 0000:c1:00.0: amdgpu: VPE queue reset failed ... amdgpu: [drm] ERROR wait_for_completion_timeout timeout! ``` The underlying s2idle issue that prompted this commit is going to be fixed in BIOS. This reverts commit `2a6c826cfe`. This was lost in the 6.19 merge so reapply it. Fixes: `2a6c826cfe` ("drm/amd: Skip power ungate during suspend for VPE") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Konstantin <answer2019@yandex.ru> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220812 Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3925683515`)	2026-01-07 17:24:16 -05:00
Perry Yuan	0de604d035	drm/amd/pm: Disable MMIO access during SMU Mode 1 reset During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily inaccessible via PCIe. Any attempt to access MMIO registers during this window (e.g., from interrupt handlers or other driver threads) can result in uncompleted PCIe transactions, leading to NMI panics or system hangs. To prevent this, set the `no_hw_access` flag to true immediately after triggering the reset. This signals other driver components to skip register accesses while the device is offline. A memory barrier `smp_mb()` is added to ensure the flag update is globally visible to all cores before the driver enters the sleep/wait state. Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `7edb503fe4`)	2026-01-07 17:24:10 -05:00
Alan Liu	72d7f45736	drm/amdgpu: Fix query for VPE block_type and ip_count [Why] Query for VPE block_type and ip_count is missing. [How] Add VPE case in ip_block_type and hw_ip_count query. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a6ea0a430a`) Cc: stable@vger.kernel.org	2026-01-05 17:34:48 -05:00
Pratap Nirujogi	7ed51e3a13	drm/amd/amdgpu: Fix SMU warning during isp suspend-resume ISP mfd child devices are using genpd and the system suspend-resume operations between genpd and amdgpu parent device which uses only runtime suspend-resume are not in sync. Linux power manager during suspend-resume resuming the genpd devices earlier than the amdgpu parent device. This is resulting in the below warning as SMU is in suspended state when genpd attempts to resume ISP. WARNING: CPU: 13 PID: 5435 at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:398 smu_dpm_set_power_gate+0x36f/0x380 [amdgpu] To fix this warning isp suspend-resume is handled as part of amdgpu parent device suspend-resume instead of genpd sequence. Each ISP MFD child device is marked as dev_pm_syscore_device to skip genpd suspend-resume and use pm_runtime_force api's to suspend-resume the devices when callbacks from amdgpu are received. Co-developed-by: Gjorgji Rosikopulos <grosikop@amd.com> Signed-off-by: Gjorgji Rosikopulos <grosikop@amd.com> Signed-off-by: Bin Du <bin.du@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0288a345f1`)	2026-01-05 17:29:51 -05:00
Alex Deucher	531b432609	drm/amdgpu: always backup and reemit fences If when we backup the ring contents for reemit before a ring reset, we skip jobs associated with the bad context, however, we need to make sure the fences are reemited as unprocessed submissions may depend on them. v2: clean up fence handling, make helpers static Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `155a748f14`)	2026-01-05 17:28:45 -05:00
Alex Deucher	9fc27cbabe	drm/amdgpu: don't reemit ring contents more than once If we cancel a bad job and reemit the ring contents, and we get another timeout, cancel everything rather than reemitting. The wptr markers are only relevant for the original emit. If we reemit, the wptr markers are no longer correct. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fb62a2067c`)	2026-01-05 17:28:45 -05:00
Julia Lawall	e9009c8b74	drm/amdgpu: update outdated comment The function amdgpu_amdkfd_gpuvm_import_dmabuf() was split into import_obj_create() and amdgpu_amdkfd_gpuvm_import_dmabuf_fd() in commit `0188006d7c` ("drm/amdkfd: Import DMABufs for interop through DRM"). import_obj_create() now does the allocation for the mem variable discussed in the comment. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Perry Yuan	7edb503fe4	drm/amd/pm: Disable MMIO access during SMU Mode 1 reset During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily inaccessible via PCIe. Any attempt to access MMIO registers during this window (e.g., from interrupt handlers or other driver threads) can result in uncompleted PCIe transactions, leading to NMI panics or system hangs. To prevent this, set the `no_hw_access` flag to true immediately after triggering the reset. This signals other driver components to skip register accesses while the device is offline. A memory barrier `smp_mb()` is added to ensure the flag update is globally visible to all cores before the driver enters the sleep/wait state. Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Srinivasan Shanmugam	bd8150a1b3	drm/amdgpu: Refactor amdgpu_gem_va_ioctl for Handling Last Fence Update and Timeline Management v4 This commit simplifies the amdgpu_gem_va_ioctl function, key updates include: - Moved the logic for managing the last update fence directly into amdgpu_gem_va_update_vm. - Introduced checks for the timeline point to enable conditional replacement or addition of fences. v2: Addressed review comments from Christian. v3: Updated comments (Christian). v4: The previous version selected the fence too early and did not manage its reference correctly, which could lead to stale or freed fences being used. This resulted in refcount underflows and could crash when updating GPU timelines. The fence is now chosen only after the VA mapping work is completed, and its reference is taken safely. After exporting it to the VM timeline syncobj, the driver always drops its local fence reference, ensuring balanced refcounting and avoiding use-after-free on dma_fence. Crash signature: [ 205.828135] refcount_t: underflow; use-after-free. [ 205.832963] WARNING: CPU: 30 PID: 7274 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 ... [ 206.074014] Call Trace: [ 206.076488] <TASK> [ 206.078608] amdgpu_gem_va_ioctl+0x6ea/0x740 [amdgpu] [ 206.084040] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu] [ 206.089994] drm_ioctl_kernel+0x86/0xe0 [drm] [ 206.094415] drm_ioctl+0x26e/0x520 [drm] [ 206.098424] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu] [ 206.104402] amdgpu_drm_ioctl+0x4b/0x80 [amdgpu] [ 206.109387] __x64_sys_ioctl+0x96/0xe0 [ 206.113156] do_syscall_64+0x66/0x2d0 ... [ 206.553351] BUG: unable to handle page fault for address: ffffffffc0dfde90 ... [ 206.553378] RIP: 0010:dma_fence_signal_timestamp_locked+0x39/0xe0 ... [ 206.553405] Call Trace: [ 206.553409] <IRQ> [ 206.553415] ? __pfx_drm_sched_fence_free_rcu+0x10/0x10 [gpu_sched] [ 206.553424] dma_fence_signal+0x30/0x60 [ 206.553427] drm_sched_job_done.isra.0+0x123/0x150 [gpu_sched] [ 206.553434] dma_fence_signal_timestamp_locked+0x6e/0xe0 [ 206.553437] dma_fence_signal+0x30/0x60 [ 206.553441] amdgpu_fence_process+0xd8/0x150 [amdgpu] [ 206.553854] sdma_v4_0_process_trap_irq+0x97/0xb0 [amdgpu] [ 206.554353] edac_mce_amd(E) ee1004(E) [ 206.554270] amdgpu_irq_dispatch+0x150/0x230 [amdgpu] [ 206.554702] amdgpu_ih_process+0x6a/0x180 [amdgpu] [ 206.555101] amdgpu_irq_handler+0x23/0x60 [amdgpu] [ 206.555500] __handle_irq_event_percpu+0x4a/0x1c0 [ 206.555506] handle_irq_event+0x38/0x80 [ 206.555509] handle_edge_irq+0x92/0x1e0 [ 206.555513] __common_interrupt+0x3e/0xb0 [ 206.555519] common_interrupt+0x80/0xa0 [ 206.555525] </IRQ> [ 206.555527] <TASK> ... [ 206.555650] RIP: 0010:dma_fence_signal_timestamp_locked+0x39/0xe0 ... [ 206.555667] Kernel panic - not syncing: Fatal exception in interrupt Link: https://patchwork.freedesktop.org/patch/654669/ Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Gangliang Xie	f9f281e839	drm/amdgpu: only check critical address when it is not reserved when an address is reserved already, no need to check if it is in critical or not, to save time Signed-off-by: Gangliang Xie <ganglxie@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Alan Liu	a6ea0a430a	drm/amdgpu: Fix query for VPE block_type and ip_count [Why] Query for VPE block_type and ip_count is missing. [How] Add VPE case in ip_block_type and hw_ip_count query. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Alex Deucher	5ab75f98fb	drm/amdgpu/gfx9: Implement KGQ ring reset GFX ring resets work differently on pre-GFX10 hardware since there is no MQD managed by the scheduler. For ring reset, you need issue the reset via CP_VMID_RESET via KIQ or MMIO and submit the following to the gfx ring to complete the reset: 1. EOP packet with EXEC bit set 2. WAIT_REG_MEM to wait for the fence 3. Clear CP_VMID_RESET to 0 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE 5. EOP packet with EXEC bit set 6. WAIT_REG_MEM to wait for the fence Once those commands have completed the reset should be complete and the ring can accept new packets. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Jiqian Chen <Jiqian.Chen@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 17:00:00 -05:00
Alex Deucher	9596097be4	drm/amdgpu/gfx9: rework pipeline sync packet sequence Replace WAIT_REG_MEM with EVENT_WRITE flushes for all shader types and ACQUIRE_MEM. That should accomplish the same thing and avoid having to wait on a fence preventing any issues with pipeline syncs during queue resets. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:59 -05:00
Alex Deucher	c8cf9ddc54	drm/amdgpu: avoid a warning in timedout job handler Only set an error on the fence if the fence is not signalled. We can end up with a warning if the per queue reset path signals the fence and sets an error as part of the reset, but fails to recover. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:59 -05:00
Pratap Nirujogi	0288a345f1	drm/amd/amdgpu: Fix SMU warning during isp suspend-resume ISP mfd child devices are using genpd and the system suspend-resume operations between genpd and amdgpu parent device which uses only runtime suspend-resume are not in sync. Linux power manager during suspend-resume resuming the genpd devices earlier than the amdgpu parent device. This is resulting in the below warning as SMU is in suspended state when genpd attempts to resume ISP. WARNING: CPU: 13 PID: 5435 at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:398 smu_dpm_set_power_gate+0x36f/0x380 [amdgpu] To fix this warning isp suspend-resume is handled as part of amdgpu parent device suspend-resume instead of genpd sequence. Each ISP MFD child device is marked as dev_pm_syscore_device to skip genpd suspend-resume and use pm_runtime_force api's to suspend-resume the devices when callbacks from amdgpu are received. Co-developed-by: Gjorgji Rosikopulos <grosikop@amd.com> Signed-off-by: Gjorgji Rosikopulos <grosikop@amd.com> Signed-off-by: Bin Du <bin.du@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:59 -05:00
Alex Deucher	b4ba5c9509	drm/amdgpu: use dma_fence_get_status() for adapter reset We need to check if the fence was signaled without an error as the per queue resets may have signalled the fence while attempting to reset the queue. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Yo-Jung Leo Lin (AMD)	5946dbe1c8	Documentation/amdgpu: Add UMA carveout details Add documentation for the uma/carveout_options and uma/carveout attributes in sysfs Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yo-Jung Leo Lin (AMD) <Leo.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Yo-Jung Leo Lin (AMD)	19ba61ac06	drm/amdgpu: add UMA allocation interfaces to sysfs Add a uma/ directory containing two sysfs files as interfaces to inspect or change UMA carveout size. These files are: - uma/carveout_options: a read-only file listing all the available UMA allocation options and their index. - uma/carveout: a file that is both readable and writable. On read, it shows the index of the current setting. Writing a valid index into this file allows users to change the UMA carveout size to that option on the next boot. Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Yo-Jung Leo Lin (AMD) <Leo.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Yo-Jung Leo Lin (AMD)	379a316063	drm/amdgpu: add UMA allocation setting helpers On some platforms, UMA allocation size can be set using the ATCS methods. Add helper functions to interact with this functionality. Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yo-Jung Leo Lin (AMD) <Leo.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Yo-Jung Leo Lin (AMD)	685b7113e0	drm/amdgpu: add helper to read UMA carveout info Currently, the available UMA allocation configs in the integrated system information table have not been parsed. Add a helper function to retrieve and store these configs. Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yo-Jung Leo Lin (AMD) <Leo.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Yo-Jung Leo Lin (AMD)	6f3b631e39	drm/amdgpu: parse UMA size-getting/setting bits in ATCS mask The capabilities of getting and setting VRAM carveout size are exposed in the ATCS mask. Parse and store these capabilities for future use. Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yo-Jung Leo Lin (AMD) <Leo.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Alex Deucher	155a748f14	drm/amdgpu: always backup and reemit fences If when we backup the ring contents for reemit before a ring reset, we skip jobs associated with the bad context, however, we need to make sure the fences are reemited as unprocessed submissions may depend on them. v2: clean up fence handling, make helpers static Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:58 -05:00
Alex Deucher	fb62a2067c	drm/amdgpu: don't reemit ring contents more than once If we cancel a bad job and reemit the ring contents, and we get another timeout, cancel everything rather than reemitting. The wptr markers are only relevant for the original emit. If we reemit, the wptr markers are no longer correct. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Le Ma	216779827f	drm/amdgpu: add helpers to access cross-die registers smn addr for soc v1_0 Encode die_id/socket_id for upper 32bits of soc v1_0 registers SMN address. v2: fix logical error caught by clang Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Bokun Zhang	0dd72af552	drm/amdgpu: RLC-G VF Register Access Interface - Implement Gfx v12.1 VFi interface under SRIOV - Redirect all RLCG interface access to new function after Gfx v12.1 v2: squash in register updates Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Likun Gao	87046288e8	drm/amdgpu: set aid_mask for soc v1 Set aid_mask via xcc_mask. v2: squash in follow up change Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Pratik Vishwakarma	9b24f63d82	drm/amdgpu: Enable support for PSP 15_0_0 Add support for PSP v 15.0.0. Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Alex Deucher	6ee1ee12ff	drm/amdgpu: add queue reset support for jpeg 5.3 Enable queue reset for JPEG 5.3. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Saleemkhan Jamadar	637fd8dedf	drm/amdgpu/discovery: add vcn and jpeg ip block Add VCN and jpeg IPs v5_3_0 blocks. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Saleemkhan Jamadar	4aeaf3cbfa	drm/amdgpu/jpeg: Add jpeg 5.3.0 support Add the Jpeg IP v5_3_0 code base. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Le Ma	a26198f122	drm/amdgpu: reserve umf hole size at vram high end for gfx v12.1 This region is reserved by firmware thus carve it out in driver. v2: set reserve size based on aid configuration. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:57 -05:00
Srinivasan Shanmugam	af26fa751c	drm/amdgpu: Use explicit VCN instance 0 in SR-IOV init vcn_v2_0_start_sriov() declares a local variable "i" initialized to zero and uses it only as the instance index in SOC15_REG_OFFSET(UVD, i, ...). The value is never changed and all other fields are taken from adev->vcn.inst[0], so this path only ever programs VCN instance 0. This triggered a Smatch: warn: iterator 'i' not incremented Replace the dummy iterator with an explicit instance index of 0 in SOC15_REG_OFFSET() calls. Fixes: `dd26858a9c` ("drm/amdgpu: implement initialization part on VCN2.0 for SRIOV") Reported by: Dan Carpenter <dan.carpenter@linaro.org> Cc: darlington Opara <darlington.opara@amd.com> Cc: Jinage Zhao <jiange.zhao@amd.com> Cc: Monk Liu <Monk.Liu@amd.com> Cc: Emily Deng <Emily.Deng@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Le Ma	56c0a9c33c	drm/amdgpu: enable CP interrupt for gfx v12_1 in frontdoor loading case Enable cp interrupt for event detection since GFX CGCG and LS has been enabled by firmware. v2: enable CP INT by merely checking fw_load_type Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Asad Kamal	864a8b2c1f	drm/amdgpu: Add sysfs up clean for gfx_v12_1 Add sysfs clean up for gfx_v12_1 during gfx fini sequence. This will prevent following crash while reloading driver 2645.490824] R13: 000055d0cb186330 R14: 000055d0cb185ed0 R15: 000055d0cb188f40 [ 2645.490825] </TASK> [ 2645.490836] amdgpu 0000:02:00.0: amdgpu: failed to create xcp sysfs files [ 2645.490937] amdgpu 0000:02:00.0: amdgpu: sw_init of IP block <gfx_v12_1> failed -17 [ 2645.491018] amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_init failed [ 2645.491098] amdgpu 0000:02:00.0: amdgpu: Fatal error during GPU init [ 2645.491547] amdgpu 0000:02:00.0: amdgpu: amdgpu: finishing device. [ 2648.549939] ------------[ cut here ]------------ [ 2648.549942] WARNING: CPU: 0 PID: 2459 at /tmp/amd.aIpOeG3c/amd/amdgpu/amdgpu_irq.c: Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Shaoyun Liu	d0c989a0aa	drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12_1 Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Mukul Joshi	3af6302d8c	drm/amdgpu: Update TCP Control register on GFX 12.1 Update TCP CNTL register to disable some features not supported on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Mukul Joshi	5a8c343d2e	drm/amdgpu: Cleanup gmc_v12_1 after 6.16 merge After the 6.16 merge, some changes not applicable to GFX 12.1 were added in the gmc_v12_1_get_vm_pte function. Additionally, add the case for MTYPE RW for GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Mukul Joshi	fab4099549	drm/amdgpu: Disable TCP Early Write Ack for GFX 12.1 Disable the TCP Early Write Ack feature on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Jonathan Kim	c14af4cc24	drm/amdkfd: fix partitioned gfx12 address watch enablement GFX 12 devices that support spatial partitioning should use the WREG32 per XCC macro when updating address watch settings, similar to GFX 9 devices that support spatial partitioning. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:56 -05:00
Likun Gao	acf07acfae	drm/amdgpu: skip gfxhub tlb flush if gfx is power off Skip for gfxhub tlb flush for gc v12_1 if gfx is not poweron. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-01-05 16:59:55 -05:00

... 2 3 4 5 6 ...

17069 Commits