linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-30 04:22:32 -04:00

Author	SHA1	Message	Date
Lizhi Hou	ca25834123	accel/amdxdna: Fix deadlock between context destroy and job timeout Hardware context destroy function holds dev_lock while waiting for all jobs to complete. The timeout job also needs to acquire dev_lock, this leads to a deadlock. Fix the issue by temporarily releasing dev_lock before waiting for all jobs to finish, and reacquiring it afterward. Fixes: `4fd6ca90fc` ("accel/amdxdna: Refactor hardware context destroy routine") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251107181050.1293125-1-lizhi.hou@amd.com	2025-11-13 09:10:43 -08:00
Lizhi Hou	6ff9385c07	accel/amdxdna: Clear mailbox interrupt register during channel creation The mailbox interrupt register is not always cleared when a mailbox channel is created. This can leave stale interrupt states from previous operations. Fix this by explicitly clearing the interrupt register in the mailbox channel creation function. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251107181115.1293158-1-lizhi.hou@amd.com	2025-11-13 08:36:08 -08:00
Lizhi Hou	1750e652a9	accel/amdxdna: Treat power-off failure as unrecoverable error Failing to set power off indicates an unrecoverable hardware or firmware error. Update the driver to treat such a failure as a fatal condition and stop further operations that depend on successful power state transition. This prevents undefined behavior when the hardware remains in an unexpected state after a failed power-off attempt. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251106180521.1095218-1-lizhi.hou@amd.com	2025-11-07 08:52:29 -08:00
Lizhi Hou	dea9f84776	accel/amdxdna: Fix dma_fence leak when job is canceled Currently, dma_fence_put(job->fence) is called in job notification callback. However, if a job is canceled, the notification callback is never invoked, leading to a memory leak. Move dma_fence_put(job->fence) to the job cleanup function to ensure the fence is always released. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251105194140.1004314-1-lizhi.hou@amd.com	2025-11-06 09:23:42 -08:00
Lizhi Hou	3a0ff7b98a	accel/amdxdna: Support preemption requests The driver checks the firmware version during initialization.If preemption is supported, the driver configures preemption accordingly and handles userspace preemption requests. Otherwise, the driver returns an error for userspace preemption requests. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104185340.897560-1-lizhi.hou@amd.com	2025-11-05 08:56:28 -08:00
Lizhi Hou	e568dc3e62	accel/amdxdna: Add IOCTL parameter for telemetry data Extend DRM_IOCTL_AMDXDNA_GET_INFO to include additional parameters that allow collection of telemetry data. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-3-lizhi.hou@amd.com	2025-11-04 09:04:21 -08:00
Lizhi Hou	1556c170d2	accel/amdxdna: Add IOCTL parameter for resource data Extend DRM_IOCTL_AMDXDNA_GET_INFO to include additional parameters that allow collection of resource data. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-2-lizhi.hou@amd.com	2025-11-04 09:03:11 -08:00
Lizhi Hou	c48f1f459e	accel/amdxdna: Add hardware specific attributes Add three hardware specific attributes to describe device capabilities: hwctx_limit: The maximum number of hardware context supported. max_tops: The maximum TOPS supported. curr_tops: The TOPS achievable with the current power and frequency configuration. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-1-lizhi.hou@amd.com	2025-11-04 09:01:44 -08:00
Lizhi Hou	71829d7f2f	accel/amdxdna: Use MSG_OP_CHAIN_EXEC_NPU when supported MSG_OP_CHAIN_EXEC_NPU is a unified mailbox message that replaces MSG_OP_CHAIN_EXEC_BUFFER_CF and MSG_OP_CHAIN_EXEC_DPU. Add driver logic to check firmware version, and if MSG_OP_CHAIN_EXEC_NPU is supported, uses it to submit firmware commands. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251031014700.2919349-1-lizhi.hou@amd.com	2025-11-03 09:20:39 -08:00
Jani Nikula	f6e8dc9edf	drm: include drm_print.h where needed There are a gazillion files that depend on drm_print.h being indirectly included via drm_buddy.h, drm_mm.h, or ttm/ttm_resource.h. In preparation for removing those includes, explicitly include drm_print.h where needed. Cc: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/5fe67395907be33eb5199ea6d540e29fddee71c8.1761734313.git.jani.nikula@intel.com	2025-10-31 10:34:52 +02:00
Lizhi Hou	6fb7f29888	accel/amdxdna: Fix incorrect command state for timed out job When a command times out, mark it as ERT_CMD_STATE_TIMEOUT. Any other commands that are canceled due to this timeout should be marked as ERT_CMD_STATE_ABORT. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251029193423.2430463-1-lizhi.hou@amd.com	2025-10-30 17:48:13 -07:00
Lizhi Hou	81233d5419	accel/amdxdna: Fix uninitialized return value In aie2_get_hwctx_status() and aie2_query_ctx_status_array(), the functions could return an uninitialized value in some cases. Update them to always return 0. The amount of valid results is indicated by the returned buffer_size, element_size, and num_element fields. Fixes: `2f509fe6a4` ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251024165503.1548131-1-lizhi.hou@amd.com	2025-10-24 13:06:06 -07:00
Lizhi Hou	41ee90230c	accel/amdxdna: Fix incorrect return value in aie2_hwctx_sync_debug_bo() When the driver issues the SYNC_DEBUG_BO command, it currently returns 0 even if the firmware fails to execute the command. Update the driver to return -EINVAL in this case to properly indicate the failure. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aPsadTBXunUSBByV@stanley.mountain/ Fixes: `7ea0468380` ("accel/amdxdna: Support firmware debug buffer") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251024162608.1544842-1-lizhi.hou@amd.com	2025-10-24 13:02:09 -07:00
Lizhi Hou	7ea0468380	accel/amdxdna: Support firmware debug buffer To collect firmware debug information, the userspace application allocates a AMDXDNA_BO_DEV buffer object through DRM_IOCTL_AMDXDNA_CREATE_BO. Then it associates the buffer with the hardware context through DRM_IOCTL_AMDXDNA_CONFIG_HWCTX which requests firmware to bind the buffer through a mailbox command. The firmware then writes the debug data into this buffer. The buffer can be mapped into userspace so that applications can retrieve and analyze the firmware debug information. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251016203016.819441-1-lizhi.hou@amd.com	2025-10-20 09:07:12 -07:00
Lizhi Hou	b291e4f1a4	accel/amdxdna: Support getting last hardware error Add new parameter DRM_AMDXDNA_HW_LAST_ASYNC_ERR to get array IOCTL. When hardware reports an error, the driver save the error information and timestamp. This new get array parameter retrieves the last error. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251014234119.628453-1-lizhi.hou@amd.com	2025-10-16 09:32:48 -07:00
Lizhi Hou	e00f78679f	accel/amdxdna: Resume power for creating and destroying hardware context When the hardware is powered down by auto-suspend, creating or destroying a hardware context without resuming power will fail. Call amdxdna_pm_resume_get() before requesting the hardware to create or destroy a hardware context. Fixes: `063db45183` ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251008045324.4171807-1-lizhi.hou@amd.com	2025-10-08 08:31:39 -07:00
Lizhi Hou	063db45183	accel/amdxdna: Enhance runtime power management Currently, pm_runtime_resume_and_get() is invoked in the driver's open callback, and pm_runtime_put_autosuspend() is called in the close callback. As a result, the device remains active whenever an application opens it, even if no I/O is performed, leading to unnecessary power consumption. Move the runtime PM calls to the AIE2 callbacks that actually interact with the hardware. The device will automatically suspend after 5 seconds of inactivity (no hardware accesses and no pending commands), and it will be resumed on the next hardware access. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250923152229.1303625-1-lizhi.hou@amd.com	2025-09-24 13:47:59 -07:00
Lizhi Hou	457f4393d0	accel/amdxdna: Call dma_buf_vmap_unlocked() for imported object In amdxdna_gem_obj_vmap(), calling dma_buf_vmap() triggers a kernel warning if LOCKDEP is enabled. So for imported object, use dma_buf_vmap_unlocked(). Then, use drm_gem_vmap() for other objects. The similar change applies to vunmap code. Fixes: `bd72d4acda` ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250916174842.234709-1-lizhi.hou@amd.com	2025-09-17 11:02:31 -07:00
Lizhi Hou	9e16c8bf9a	accel/amdxdna: Fix an integer overflow in aie2_query_ctx_status_array() The unpublished smatch static checker reported a warning. drivers/accel/amdxdna/aie2_pci.c:904 aie2_query_ctx_status_array() warn: potential user controlled sizeof overflow 'args->num_element * args->element_size' '1-u32max(user) * 1-u32max(user)' Even this will not cause a real issue, it is better to put a reasonable limitation for element_size and num_element. Add condition to make sure the input element_size <= 4K and num_element <= 1K. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aL56ZCLyl3tLQM1e@stanley.mountain/ Fixes: `2f509fe6a4` ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250909154531.3469979-1-lizhi.hou@amd.com	2025-09-11 11:31:24 -07:00
Lizhi Hou	2f509fe6a4	accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY Add interface for applications to get information array. The application provides a buffer pointer along with information type, maximum number of entries and maximum size of each entry. The buffer may also contain match conditions based on the information type. After the ioctl completes, the actual number of entries and entry size are returned. (see [1], used by driver runtime library) [1] https://github.com/amd/xdna-driver/blob/main/src/shim/host/platform_host.cpp#L337 Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250903053402.2103196-1-lizhi.hou@amd.com	2025-09-04 08:26:43 -07:00
Qianfeng Rong	24de3daf61	accel/amdxdna: Use int instead of u32 to store error codes Change the 'ret' variable from u32 to int to store -EINVAL. Storing the negative error codes in unsigned type, doesn't cause an issue at runtime but it's ugly as pants. Additionally, assigning -EINVAL to u32 ret (i.e., u32 ret = -EINVAL) may trigger a GCC warning when the -Wsign-conversion flag is enabled. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250828033917.113364-1-rongqianfeng@vivo.com	2025-08-29 08:59:26 -07:00
Lizhi Hou	6380b1ceba	accel/amdxdna: Fix incorrect type used for a local variable drivers/accel/amdxdna/aie2_pci.c:794:13: sparse: sparse: incorrect type in assignment (different address spaces) Fixes: `c8cea4371e` ("accel/amdxdna: Add a function to walk hardware contexts") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508230855.0b9efFl6-lkp@intel.com/ Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250826171951.801585-1-lizhi.hou@amd.com	2025-08-26 11:18:22 -07:00
Lizhi Hou	c8cea4371e	accel/amdxdna: Add a function to walk hardware contexts Walking hardware contexts created by a process is duplicated in multiple spots. Add a function, amdxdna_hwctx_walk(), and replace all spots. hwctx_srcu and dev_lock are good enough to protect hardware context list. Remove hwctx_lock. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250815171634.3417487-1-lizhi.hou@amd.com	2025-08-18 08:35:57 -07:00
Lizhi Hou	d2b48f2b30	accel/amdxdna: Unify pm and rpm suspend and resume callbacks The suspend and resume callbacks for pm and runtime pm should be same. During suspending, it needs to stop all hardware contexts first. And the hardware contexts will be restarted after the device is resumed. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250803191450.1568851-1-lizhi.hou@amd.com	2025-08-06 10:31:55 -07:00
Salah Triki	5982a539cd	accel/amdxdna: Delete pci_free_irq_vectors() The device is managed so pci_free_irq_vectors() is called automatically no need to do it manually. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Salah Triki <salah.triki@gmail.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/aHs8QAfUlFeNp7qL@pc	2025-07-22 09:19:04 -07:00
Lizhi Hou	bd72d4acda	accel/amdxdna: Support user space allocated buffer Enhance DRM_IOCTL_AMDXDNA_CREATE_BO to accept user space allocated buffer pointer. The buffer pages will be pinned in memory. Unless the CAP_IPC_LOCK is enabled for the application process, the total pinned memory can not beyond rlimit_memlock. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250716164414.112091-1-lizhi.hou@amd.com	2025-07-22 08:34:29 -07:00
Maíra Canal	0a5dc1b67e	drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET Among the scheduler's statuses, the only one that indicates an error is DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV signifies that the operation succeeded and the GPU is in a nominal state. However, to provide more information about the GPU's status, it is needed to convey more information than just "OK". Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has hung, but it has been successfully reset and is now in a nominal state again. Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>	2025-07-15 08:27:00 -03:00
Dave Airlie	9356b50af5	Merge tag 'drm-misc-next-2025-06-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: - Add Task Information for the wedge API Cross-subsystem Changes: Core Changes: - Fix warnings related to export.h - fbdev: Make CONFIG_FIRMWARE_EDID available on all architectures - fence: Fix UAF issues - format-helper: Improve tests Driver Changes: - ivpu: Add turbo flag, Add Wildcat Lake Support - rz-du: Improve MIPI-DSI Support - vmwgfx: fence improvement Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250619-perfect-industrious-whippet-8ed3db@houat	2025-06-20 11:34:09 +10:00
Dave Airlie	45215c589e	Merge tag 'drm-misc-next-2025-06-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic-helpers: Tune the enable / disable sequence - bridge: Add destroy hook - color management: Add helpers for hardware gamma LUT handling - HDMI: Add CEC handling, YUV420 output support - sched: tracing improvements Driver Changes: - hyperv: Move out of simple-kms, drm_panic support - i915: drm_panel_follower support - imx: Add IMX8qxq Display Controller Support - lima: Add Rockchip RK3528 GPU Support - nouveau: fence handling cleanup - panfrost: Add BO labeling, 64-bit registers access - qaic: Add RAS Support - rz-du: Add RZ/V2H(P) Support, MIPI-DSI DCS Support - sun4i: Add H616 Support - tidss: Add TI AM62L Support - vkms: YUV and R* formats support - bridges: - Switched to reference counted drm_bridge allocations - panels: - Switched to reference counted drm_panel allocations - Add support for fwnode-based panel lookup - himax-hx8394: Support for Huiling hl055fhv028c - ilitek-ili9881c: Support for 7" Raspberry Pi 720x1280 - panel-edp: Support for KDC KD116N3730A05, N160JCE-ELL CMN, - panel-simple: Support for AUO P238HAN01 - st7701: Support for Winstar wf40eswaa6mnn0 - visionox-rm69299: Support for rm69299-shift - New panels: Renesas R61307, Renesas R69328 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250612-coucal-of-impossible-cleaning-a5eecf@houat	2025-06-18 08:09:35 +10:00
Lizhi Hou	e252e3f348	accel/amdxdna: Revise device bo creation and free The device bo is allocated from the device heap memory. (a trunk of memory dedicated to device) Rename amdxdna_gem_insert_node_locked to amdxdna_gem_heap_alloc and move related sanity checks into it. Add amdxdna_gem_dev_obj_free and move device bo free code into it. Calculate the kernel virtual address of device bo by the device heap memory address and offset. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250616091418.2605476-1-lizhi.hou@amd.com	2025-06-16 22:29:09 -07:00
Thomas Zimmermann	c598d5eb9f	Merge drm/drm-next into drm-misc-next Backmerging to forward to v6.16-rc1 Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2025-06-11 09:01:34 +02:00
Lizhi Hou	779a0c9e06	accel/amdxdna: Fix incorrect PSP firmware size The incorrect PSP firmware size is used for initializing. It may cause error for newer version firmware. Fixes: `8c9ff1b181` ("accel/amdxdna: Add a new driver for AMD AI Engine") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250604143217.1386272-1-lizhi.hou@amd.com	2025-06-09 07:16:32 -07:00
Linus Torvalds	8477ab1430	Merge tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core: - Introduction of iommu-pages infrastructure to consolitate page-table allocation code among hardware drivers. This is ground-work for more generalization in the future - Remove IOMMU_DEV_FEAT_SVA and IOMMU_DEV_FEAT_IOPF feature flags - Convert virtio-iommu to domain_alloc_paging() - KConfig cleanups - Some small fixes for possible overflows and race conditions Intel VT-d driver: - Restore WO permissions on second-level paging entries - Use ida to manage domain id - Miscellaneous cleanups AMD-Vi: - Make sure notifiers finish running before module unload - Add support for HTRangeIgnore feature - Allow matching ACPI HID devices without matching UIDs ARM-SMMU: - SMMUv2: - Recognise the compatible string for SAR2130P MDSS in the Qualcomm driver, as this device requires an identity domain - Fix Adreno stall handling so that GPU debugging is more robust and doesn't e.g. result in deadlock - SMMUv3: - Fix ->attach_dev() error reporting for unrecognised domains - IO-pgtable: - Allow clients (notably, drivers that process requests from userspace) to silence warnings when mapping an already-mapped IOVA S390: - Add support for additional table regions Mediatek: - Add support for MT6893 MM IOMMU And some smaller fixes and improvements in various other drivers" * tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (75 commits) iommu/vt-d: Restore context entry setup order for aliased devices iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm iommu/arm-smmu-qcom: Make set_stall work when the device is on iommu/arm-smmu: Move handing of RESUME to the context fault handler iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500 iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() iommu: Clear the freelist after iommu_put_pages_list() iommu/vt-d: Change dmar_ats_supported() to return boolean iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit() iommu/vt-d: Replace spin_lock with mutex to protect domain ida iommu/vt-d: Use ida to manage domain id iommu/vt-d: Restore WO permissions on second-level paging entries iommu/amd: Allow matching ACPI HID devices without matching UIDs iommu: make inclusion of arm/arm-smmu-v3 directory conditional iommu: make inclusion of riscv directory conditional iommu: make inclusion of amd directory conditional iommu: make inclusion of intel directory conditional iommu: remove duplicate selection of DMAR_TABLE iommu/fsl_pamu: remove trailing space after \n iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible ...	2025-05-30 10:44:20 -07:00
Pierre-Eric Pelloux-Prayer	2956554823	drm/sched: Store the drm client_id in drm_sched_fence This will be used in a later commit to trace the drm client_id in some of the gpu_scheduler trace events. This requires changing all the users of drm_sched_job_init to add an extra parameter. The newly added drm_client_id field in the drm_sched_fence is a bit of a duplicate of the owner one. One suggestion I received was to merge those 2 fields - this can't be done right now as amdgpu uses some special values (AMDGPU_FENCE_OWNER_*) that can't really be translated into a client id. Christian is working on getting rid of those; when it's done we should be able to squash owner/drm_client_id together. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com	2025-05-28 16:15:58 +02:00
Lizhi Hou	c065e46395	accel/amdxdna: Support submit commands without arguments The latest userspace runtime allows generating commands which do not have any argument. Remove the corresponding check in driver IOCTL to enable this use case. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com	2025-05-08 09:58:21 -07:00
Jason Gunthorpe	7c8896dd4a	iommu: Remove IOMMU_DEV_FEAT_SVA None of the drivers implement anything here anymore, remove the dead code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:29 +02:00
Lizhi Hou	6c161732ea	accel/amdxdna: Fix incorrect size of ERT_START_NPU commands When multiple ERT_START_NPU commands are combined in one buffer, the buffer size calculation is incorrect. Also, the condition to make sure the buffer size is not beyond 4K is also fixed. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250409210013.10854-1-lizhi.hou@amd.com	2025-04-10 10:45:39 -07:00
Lizhi Hou	e486147c91	accel/amdxdna: Add BO import and export Add amdxdna_gem_prime_export() and amdxdna_gem_prime_import() for BO import and export. Register mmu notifier for imported BO as well. When MMU_NOTIFIER_UNMAP event is received, queue work to remove the notifier. The same BO could be mapped multiple times if it is exported and imported by an application. Use a link list to track VMAs the BO been mapped. v2: Rebased and call get_dma_buf() before dma_buf_attach() v3: Removed import_attach usage Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250325200105.2744079-1-lizhi.hou@amd.com	2025-03-28 11:14:06 -07:00
Boris Brezillon	a600794afe	accel/amdxdna: s/drm_gem_v[un]map_unlocked/drm_gem_v[un]map/ Commit `8f5c4871a0` ("drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function names") dropped the _unlocked suffix, but accel drivers were left behind. Fixes: `8f5c4871a0` ("drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function names") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Min Ma <min.ma@amd.com> Cc: Lizhi Hou <lizhi.hou@amd.com> Cc: Oded Gabbay <ogabbay@kernel.org> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250327104300.1982058-3-boris.brezillon@collabora.com	2025-03-27 14:33:35 +03:00
Lizhi Hou	cd740b873f	accel/amdxdna: Check interrupt register before mailbox_rx_worker exits There is a timeout failure been found during stress tests. If the firmware generates a mailbox response right after driver clears the mailbox channel interrupt register, the hardware will not generate an interrupt for the response. This causes the unexpected mailbox command timeout. To handle this failure, driver checks the interrupt register before exiting mailbox_rx_worker(). If there is a new response, driver goes back to process it. Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226161810.4188334-1-lizhi.hou@amd.com	2025-02-27 08:41:46 -06:00
Dave Airlie	fb51bf0255	Merge tag 'v6.14-rc4' into drm-next Backmerge Linux 6.14-rc4 at the request of tzimmermann so misc-next can base on rc4. Signed-off-by: Dave Airlie <airlied@redhat.com>	2025-02-25 17:36:09 +10:00
Su Hui	838c17fd07	accel/amdxdna: Add missing include linux/slab.h When compiling without CONFIG_IA32_EMULATION, there can be some errors: drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘mailbox_release_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:197:2: error: implicit declaration of function ‘kfree’. 197 \| kfree(mb_msg); \| ^~~~~ drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:418:11: error:implicit declaration of function ‘kzalloc’. 418 \| mb_msg = kzalloc(sizeof(*mb_msg) + pkg_size, GFP_KERNEL); \| ^~~~~~~ Add the missing include. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250211015354.3388171-1-suhui@nfschina.com	2025-02-19 09:42:17 -08:00
Thomas Zimmermann	8bd1a8e757	Merge drm/drm-next into drm-misc-next Backmerging to get bugfixes from v6.14-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2025-02-18 07:43:43 +01:00
Lizhi Hou	4fd6ca90fc	accel/amdxdna: Refactor hardware context destroy routine It is required by firmware to wait up to 2 seconds for pending commands before sending the destroy hardware context command. After 2 seconds wait, if there are still pending commands, driver needs to cancel them. So the context destroy steps need to be: 1. Stop drm scheduler. (drm_sched_entity_destroy) 2. Wait up to 2 seconds for pending commands. 3. Destroy hardware context and cancel the rest pending requests. 4. Wait all jobs associated with the hwctx are freed. 5. Free job resources. Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250124173536.148676-1-lizhi.hou@amd.com	2025-02-14 08:36:07 -07:00
Dave Airlie	0ed1356af8	Merge tag 'drm-misc-next-2025-02-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.15: UAPI Changes: fourcc: - Add modifiers for MediaTek tiled formats Cross-subsystem Changes: bus: - mhi: Enable image transfer via BHIe in PBL dma-buf: - Add fast-path for single-fence merging Core Changes: atomic helper: - Allow full modeset on connector changes - Clarify semantics of allow_modeset - Clarify semantics of drm_atomic_helper_check() buddy allocator: - Fix multi-root cleanup ci: - Update IGT display: - dp: Support Extendeds Wake Timeout - dp_mst: Fix RAD-to-string conversion panic: - Encode QR code according to Fido 2.2 probe helper: - Cleanups scheduler: - Cleanups ttm: - Refactor pool-allocation code - Cleanups Driver Changes: amdxdma: - Fix error handling - Cleanups ast: - Refactor detection of transmitter chips - Refactor support of VBIOS display-mode handling - astdp: Fix connection status; Filter unsupported display modes bridge: - adv7511: Report correct capabilities - it6505: Fix HDCP V compare - sn65dsi86: Fix device IDs - Cleanups i915: - Enable Extendeds Wake Timeout imagination: - Check job dependencies with DRM-sched helper ivpu: - Improve command-queue handling - Use workqueue for IRQ handling - Add suport for HW fault injection - Locking fixes - Cleanups mgag200: - Add support for G200eH5 chips msm: - dpu: Add concurrent writeback support for DPU 10.x+ nouveau: - Move drm_slave_encoder interface into driver - nvkm: Refactor GSP RPC omapdrm: - Cleanups panel: - Convert several panels to multi-style functions to improve error handling - edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3, LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry 116KHD024006, Lenovo T14s Gen6 Snapdragon - himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay kd110n11-51ie, Starry 2082109qfh040022-50e panthor: - Expose sizes of intenral BOs via fdinfo - Fix race between reset and suspend - Cleanups qaic: - Add support for AIC200 - Cleanups renesas: - Fix limits in DT bindings rockchip: - rk3576: Add HDMI support - vop2: Add new display modes on RK3588 HDMI0 up to 4K - Don't change HDMI reference clock rate - Fix DT bindings solomon: - Set SPI device table to silence warnings - Fix pixel and scanline encoding v3d: - Cleanups vc4: - Use drm_exec - Use dma-resv for wait-BO ioctl - Remove seqno infrastructure virtgpu: - Support partial mappings of GEM objects - Reserve VGA resources during initialization - Fix UAF in virtgpu_dma_buf_free_obj() - Add panic support vkms: - Switch to a managed modesetting pipeline - Add support for ARGB8888 xlnx: - Set correct DMA segment size - Fix error handling - Fix docs Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250212090625.GA24865@linux.fritz.box	2025-02-14 10:24:02 +10:00
Philipp Stanner	796a9f55a8	drm/sched: Use struct for drm_sched_init() params drm_sched_init() has a great many parameters and upcoming new functionality for the scheduler might add even more. Generally, the great number of parameters reduces readability and has already caused one missnaming, addressed in: commit `6f1cacf4eb` ("drm/nouveau: Improve variable name in nouveau_sched_init()"). Introduce a new struct for the scheduler init parameters and port all users. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org	2025-02-12 11:59:52 +01:00
Mario Limonciello	ecee4d0695	accel/amdxdna: Add MODULE_FIRMWARE() declarations Initramfs building tools such as dracut will look for a MODULE_FIRMWARE() declaration to determine which firmware to include in the initramfs when a driver is included in the initramfs. As amdxdna doesn't declare any firmware this causes the driver to fail to load with -ENOENT when in the initramfs. Add the missing declaration for possible firmware. Reported-by: Renjith Pananchikkal <Renjith.Pananchikkal@amd.com> Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com> Fixes: `8c9ff1b181` ("accel/amdxdna: Add a new driver for AMD AI Engine") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250204174031.3425762-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204174031.3425762-1-superm1@kernel.org	2025-02-04 13:05:48 -06:00
Lizhi Hou	b3dff598e7	accel/amdxdna: Declare sched_ops as static Fix sparse warning: symbol 'sched_ops' was not declared. Should it be static? Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113182617.1256094-2-lizhi.hou@amd.com	2025-01-13 14:21:53 -06:00
Lizhi Hou	412576293c	accel/amdxdna: Remove casting mailbox payload pointer The mailbox payload pointer is void __iomem . Casting it to u32 is incorrect and causes sparse warning. cast removes address space '__iomem' of expression Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501130921.ktqwsMLH-lkp@intel.com/ Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113182617.1256094-1-lizhi.hou@amd.com	2025-01-13 14:21:39 -06:00
Lizhi Hou	0c2768bf81	accel/amdxdna: Return error when setting clock failed for npu1 Due to miss returning error when setting clock, the smatch static checker reports warning: drivers/accel/amdxdna/aie2_smu.c:68 npu1_set_dpm() error: uninitialized symbol 'freq'. Fixes: `f4d7b8a6bc` ("accel/amdxdna: Enhance power management settings") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/202267d0-882e-4593-b58d-be9274592f9b@stanley.mountain/ Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250109194811.499505-1-lizhi.hou@amd.com	2025-01-09 13:53:22 -06:00

1 2

77 Commits