Commit Graph

1981 Commits

Author SHA1 Message Date
Srinivasan Shanmugam
429aec2bc0 drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
In kfd_ioctl_create_process(), the pointer 'p' is used before checking
if it is NULL.

The code accesses p->context_id before validating 'p'. This can lead
to a possible NULL pointer dereference.

Move the NULL check before using 'p' so that the pointer is validated
before access.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:3177 kfd_ioctl_create_process() warn: variable dereferenced before check 'p' (see line 3174)

Fixes: cc6b66d661 ("amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS")
Cc: Zhu Lingshan <lingshan.zhu@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 19d4149b22)
2026-03-24 13:54:19 -04:00
Philip Yang
2ce75a0b7e drm/amdkfd: Unreserve bo if queue update failed
Error handling path should unreserve bo then return failed.

Fixes: 305cd109b7 ("drm/amdkfd: Validate user queue update")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c24afed7de)
2026-03-11 14:02:45 -04:00
Kees Cook
189f164e57 Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch && !(file in "tools") && !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 08:26:33 -08:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
d4a292c5f8 Merge tag 'drm-next-2026-02-21' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "This is the fixes and cleanups for the end of the merge window, it's
  nearly all amdgpu, with some amdkfd, then a pagemap core fix, i915/xe
  display fixes, and some xe driver fixes.

  Nothing seems out of the ordinary, except amdgpu is a little more
  volume than usual.

  pagemap:
   - drm/pagemap: pass pagemap_addr by reference

  amdgpu:
   - DML 2.1 fixes
   - Panel replay fixes
   - Display writeback fixes
   - MES 11 old firmware compat fix
   - DC CRC improvements
   - DPIA fixes
   - XGMI fixes
   - ASPM fix
   - SMU feature bit handling fixes
   - DC LUT fixes
   - RAS fixes
   - Misc memory leak in error path fixes
   - SDMA queue reset fixes
   - PG handling fixes
   - 5 level GPUVM page table fix
   - SR-IOV fix
   - Queue reset fix
   - SMU 13.x fixes
   - DC resume lag fix
   - MPO fixes
   - DCN 3.6 fix
   - VSDB fixes
   - HWSS clean up
   - Replay fixes
   - DCE cursor fixes
   - DCN 3.5 SR DDR5 latency fixes
   - HPD fixes
   - Error path unwind fixes
   - SMU13/14 mode1 reset fixes
   - PSP 15 updates
   - SMU 15 updates
   - Sync fix in amdgpu_dma_buf_move_notify()
   - HAINAN fix
   - PSP 13.x fix
   - GPUVM locking fix
   - Fixes for DC analog support
   - DC FAMS fixes
   - DML 2.1 fixes
   - eDP fixes
   - Misc DC fixes
   - Fastboot fix
   - 3DLUT fixes
   - GPUVM fixes
   - 64bpp format fix
   - Fix for MacBooks with switchable gfx

  amdkfd:
   - Fix possible double deletion of validate list
   - Event setup fix
   - Device disconnect regression fix
   - APU GTT as VRAM fix
   - Fix piority inversion with MQDs
   - NULL check fix

  radeon:
   - HAINAN fix

  i915/xe display:
   - Regresion fix for HDR 4k displays (#15503)
   - Fixup for Dell XPS 13 7390 eDP rate limit
   - Memory leak fix on ACPI _DSM handling
   - Add missing slice count check during DP mode validation

  xe:
   - drm/xe: Prevent VFs from exposing the CCS mode sysfs file
   - SRIOV related fixes
   - PAT cache fix
   - MMIO read fix
   - W/a fixes
   - Adjust type of xe_modparam.force_vram_bar_size
   - Wedge mode fix
   - HWMon fix

* tag 'drm-next-2026-02-21' of https://gitlab.freedesktop.org/drm/kernel: (143 commits)
  drm/amd/display: Remove unneeded DAC link encoder register
  drm/amd/display: Enable DAC in DCE link encoder
  drm/amd/display: Set CRTC source for DAC using registers
  drm/amd/display: Initialize DAC in DCE link encoder using VBIOS
  drm/amd/display: Turn off DAC in DCE link encoder using VBIOS
  drm/amd/display: Don't call find_analog_engine() twice
  drm/amdgpu: fix 4-level paging if GMC supports 57-bit VA v2
  drm/amdgpu: keep vga memory on MacBooks with switchable graphics
  drm/amdgpu: Set atomics to true for xgmi
  drm/amdkfd: Check for NULL return values
  drm/amd/display: Use same max plane scaling limits for all 64 bpp formats
  drm/amdgpu: Set vmid0 PAGE_TABLE_DEPTH for GFX12.1
  drm/amdkfd: Disable MQD queue priority
  drm/amd/display: Remove conditional for shaper 3DLUT power-on
  drm/amd/display: Check return of shaper curve to HW format
  drm/amd/display: Correct logic check error for fastboot
  drm/amd/display: Skip eDP detection when no sink
  Revert "drm/amd/display: Add Gfx Base Case For Linear Tiling Handling"
  Revert "drm/amd/display: Correct hubp GfxVersion verification"
  Revert "drm/amd/display: Add Handling for gfxversion DcGfxBase"
  ...
2026-02-20 15:36:38 -08:00
Andrew Martin
09da66f139 drm/amdkfd: Check for NULL return values
This patch fixes issues when the code moves forward with a potential
NULL pointer, without checking.
Removed one redundant NULL check for a function parameter. This check
is already done in the only caller.

Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Andrew Martin
73463e26f7 drm/amdkfd: Disable MQD queue priority
This solves a priority inversion issue, caused by the language
runtime making high-priority queues wait for activity on
lower-priority queues.

Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-19 12:16:11 -05:00
Siwei He
8446c74737 drm/amdkfd: Fix APU to use GTT, not VRAM for MQD
Add a check in mqd_on_vram. If the device prefers GTT, it returns false

Fixes: d4a814f400 ("drm/amdkfd: Move gfx9.4.3 and gfx 9.5 MQD to HBM")
Signed-off-by: Siwei He <siwei.he@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:23:45 -05:00
Srinivasan Shanmugam
5a19302cab drm/amdkfd: Fix watch_id bounds checking in debug address watch v2
The address watch clear code receives watch_id as an unsigned value
(u32), but some helper functions were using a signed int and checked
bits by shifting with watch_id.

If a very large watch_id is passed from userspace, it can be converted
to a negative value.  This can cause invalid shifts and may access
memory outside the watch_points array.

drm/amdkfd: Fix watch_id bounds checking in debug address watch v2

Fix this by checking that watch_id is within MAX_WATCH_ADDRESSES before
using it.  Also use BIT(watch_id) to test and clear bits safely.

This keeps the behavior unchanged for valid watch IDs and avoids
undefined behavior for invalid ones.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:448
kfd_dbg_trap_clear_dev_address_watch() error: buffer overflow
'pdd->watch_points' 4 <= u32max user_rl='0-3,2147483648-u32max' uncapped

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c
    433 int kfd_dbg_trap_clear_dev_address_watch(struct kfd_process_device *pdd,
    434                                         uint32_t watch_id)
    435 {
    436         int r;
    437
    438         if (!kfd_dbg_owns_dev_watch_id(pdd, watch_id))

kfd_dbg_owns_dev_watch_id() doesn't check for negative values so if
watch_id is larger than INT_MAX it leads to a buffer overflow.
(Negative shifts are undefined).

    439                 return -EINVAL;
    440
    441         if (!pdd->dev->kfd->shared_resources.enable_mes) {
    442                 r = debug_lock_and_unmap(pdd->dev->dqm);
    443                 if (r)
    444                         return r;
    445         }
    446
    447         amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
--> 448         pdd->watch_points[watch_id] = pdd->dev->kfd2kgd->clear_address_watch(
    449                                                         pdd->dev->adev,
    450                                                         watch_id);

v2: (as per, Jonathan Kim)
 - Add early watch_id >= MAX_WATCH_ADDRESSES validation in the set path to
   match the clear path.
 - Drop the redundant bounds check in kfd_dbg_owns_dev_watch_id().

Fixes: e0f85f4690 ("drm/amdkfd: add debug set and clear address watch points operation")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-12 15:18:08 -05:00
Linus Torvalds
136114e0ab Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:

 - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
   disk space by teaching ocfs2 to reclaim suballocator block group
   space (Heming Zhao)

 - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
   ARRAY_END() macro and uses it in various places (Alejandro Colomar)

 - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
   the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
   page size (Pnina Feder)

 - "kallsyms: Prevent invalid access when showing module buildid" cleans
   up kallsyms code related to module buildid and fixes an invalid
   access crash when printing backtraces (Petr Mladek)

 - "Address page fault in ima_restore_measurement_list()" fixes a
   kexec-related crash that can occur when booting the second-stage
   kernel on x86 (Harshit Mogalapalli)

 - "kho: ABI headers and Documentation updates" updates the kexec
   handover ABI documentation (Mike Rapoport)

 - "Align atomic storage" adds the __aligned attribute to atomic_t and
   atomic64_t definitions to get natural alignment of both types on
   csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)

 - "kho: clean up page initialization logic" simplifies the page
   initialization logic in kho_restore_page() (Pratyush Yadav)

 - "Unload linux/kernel.h" moves several things out of kernel.h and into
   more appropriate places (Yury Norov)

 - "don't abuse task_struct.group_leader" removes the usage of
   ->group_leader when it is "obviously unnecessary" (Oleg Nesterov)

 - "list private v2 & luo flb" adds some infrastructure improvements to
   the live update orchestrator (Pasha Tatashin)

* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
  watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
  procfs: fix missing RCU protection when reading real_parent in do_task_stat()
  watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
  kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
  kho: fix doc for kho_restore_pages()
  tests/liveupdate: add in-kernel liveupdate test
  liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
  liveupdate: luo_file: Use private list
  list: add kunit test for private list primitives
  list: add primitives for private list manipulations
  delayacct: fix uapi timespec64 definition
  panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
  netclassid: use thread_group_leader(p) in update_classid_task()
  RDMA/umem: don't abuse current->group_leader
  drm/pan*: don't abuse current->group_leader
  drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
  drm/amdgpu: don't abuse current->group_leader
  android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
  android/binder: don't abuse current->group_leader
  kho: skip memoryless NUMA nodes when reserving scratch areas
  ...
2026-02-12 12:13:01 -08:00
Linus Torvalds
939faf71cf Merge tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "Highlights:
   - amdgpu support for lots of new IP blocks which means newer GPUs
   - xe has a lot of SR-IOV and SVM improvements
   - lots of intel display refactoring across i915/xe
   - msm has more support for gen8 platforms
   - Given up on kgdb/kms integration, it's too hard on modern hw

  core:
   - drop kgdb support
   - replace system workqueue with percpu
   - account for property blobs in memcg
   - MAINTAINERS updates for xe + buddy

  rust:
   - Fix documentation for Registration constructors
   - Use pin_init::zeroed() for fops initialization
   - Annotate DRM helpers with __rust_helper
   - Improve safety documentation for gem::Object::new()
   - Update AlwaysRefCounted imports
   - mm: Prevent integer overflow in page_align()

  atomic:
   - add drm_device pointer to drm_private_obj
   - introduce gamma/degamma LUT size check

  buddy:
   - fix free_trees memory leak
   - prevent BUG_ON

  bridge:
   - introduce drm_bridge_unplug/enter/exit
   - add connector argument to .hpd_notify
   - lots of recounting conversions
   - convert rockchip inno hdmi to bridge
   - lontium-lt9611uxc: switch to HDMI audio helpers
   - dw-hdmi-qp: add support for HPD-less setups
   - Algoltek AG6311 support

  panels:
   - edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
   - st75751: add SPI support
   - Sitronix ST7920, Samsung LTL106HL02
   - LG LH546WF1-ED01, HannStar HSD156J
   - BOE NV130WUM-T08
   - Innolux G150XGE-L05
   - Anbernic RG-DS

  dma-buf:
   - improve sg_table debugging
   - add tracepoints
   - call clear_page instead of memset
   - start to introduce cgroup memory accounting in heaps
   - remove sysfs stats

  dma-fence:
   - add new helpers

  dp:
   - mst: avoid oob access with vcpi=0

  hdmi:
   - limit infoframes exposure to userspace

  gem:
   - reduce page table overhead with THP
   - fix leak in drm_gem_get_unmapped_area

  gpuvm:
   - API sanitation for rust bindings

  sched:
   - introduce new helpers

  panic:
   - report invalid panic modes
   - add kunit tests

  i915/xe display:
   - Expose sharpness only if num_scalers is >= 2
   - Add initial Xe3P_LPD for NVL
   - BMG FBC support
   - Add MTL+ platforms to support dpll framework
   _ fix DIMM_S DRM decoding on ICL
   - Return to using AUX interrupts
   - PSR/Panel replay refactoring
   - use consolidation HDMI tables
   - Xe3_LPD CD2X dividier changes

  xe:
   - vfio: add vfio_pci for intel GPU
   - multi queue support
   - dynamic pagemaps and multi-device SVM
   - expose temp attribs in hwmon
   - NO_COMPRESSION bo flag
   - expose MERT OA unit
   - sysfs survivability refactor
   - SRIOV PF: add MERT support
   - enable SR-IOV VF migration
   - Enable I2C/NVM on Crescent Island
   - Xe3p page reclaimation support
   - introduce SRIOV scheduler groups
   - add SoC remappt support in system controller
   - insert compiler barriers in GuC code
   - define NVL GuC firmware
   - handle GT resume failure
   - fix drm scheduler layering violations
   - enable GSC loading and PXP for PTL
   - disable GuC Power DCC strategy on PTL
   - unregister drm device on probe error

  i915:
   - move to kernel standard fault injection
   - bump recommended GuC version for DG2 and MTL

  amdgpu:
   - SMUIO 15.x, PSP 15.x support
   - IH 6.1.1/7.1 support
   - MMHUB 3.4/4.2 support
   - GC 11.5.4/12.1 support
   - SDMA 6.1.4/7.1/7.11.4 support
   - JPEG 5.3 support
   - UserQ updates
   - GC 9 gfx queue reset support
   - TTM memory ops parallelization
   - convert legacy logging to new helpers
   - DC analog fixes

  amdkfd:
   - GC 11.5.4/12.1 suppport
   - SDMA 6.1.4/7.1 support
   - per context support
   - increase kfd process hash table
   - Reserved SDMA rework

  radeon:
   - convert legacy logging to new helpers
   - use devm for i2c adapters

  msm:
   - GPU
      - Document a612/RGMU dt bindings
      - UBWC 6.0 support (for A840 / Kaanapali)
      - a225 support
   - DPU:
      - Switch to use virtual planes by default
      - Fix DSI CMD panels on DPU 3.x
      - Rewrite format handling to remove intermediate representation
      - Fix watchdog on DPU 8.x+
      - Fix TE / Vsync source setting on DPU 8.x+
      - Add 3D_Mux on SC7280
      - Kaanapali platform support
      - Fix UBWC register programming
      - Make RM reserve DSPP-enabled mixers for CRTCs with LMs
      - Gamma correction support
   - DP:
      - Enable support for eDP 1.4+ link rate tables
      - Fix MDSS1 DP indices on SA8775P, making them to work
      - Fix msm_dp_ctrl_config_msa() to work with LLVM 20
   - DSI:
      - Document QCS8300 as compatible with SA8775P
      - Kaanapali platform support
   - DSI PHY:
      - switch to divider_determine_rate()
   - MDP5:
      - Drop support for MSM8998, SDM660 and SDM630 (switch over to DPU)
   -  MDSS:
      - Kaanapali platform support
      - Fixed UBWC register programming

  nova-core:
   - Prepare for Turing support. This includes parsing and handling
     Turing-specific firmware headers and sections as well as a Turing
     Falcon HAL implementation
   - Get rid of the Result<impl PinInit<T, E>> anti-pattern
   - Relocate initializer-specific code into the appropriate initializer
   - Use CStr::from_bytes_until_nul() to remove custom helpers
   - Improve handling of unexpected firmware values
   - Clean up redundant debug prints
   - Replace c_str!() with native Rust C-string literals
   - Update nova-core task list

  nova:
   - Align GEM object size to system page size

  tyr:
   - Use generated uAPI bindings for GpuInfo
   - Replace manual sleeps with read_poll_timeout()
   - Replace c_str!() with native Rust C-string literals
   - Suppress warnings for unread fields
   - Fix incorrect register name in print statement

  nouveau:
   - fix big page table support races in PTE management
   - improve reclocking on tegra 186+

  amdxdna:
   - fix suspend race conditions
   - improve handling of zero tail pointers
   - fix cu_idx overwritten during command setup
   - enable hardware context priority
   - remove NPU2 support
   - update message buffer allocation requirements
   - update firmware version check

  ast:
   - support imported cursor buffers
   - big endian fixes

  etnaviv:
   - add PPU flop reset support

  imagination:
   - add AM62P support
   - introduce hw version checks

  ivpu:
   - implement warm boot flow

  panfrost:
   - add bo sync ioctl
   - add GPU_PM_RT support for RZ/G3E SoC

  panthor:
   - add bo sync ioctl
   - enable timestamp propagation
   - scheduler robustness improvements
   - VM termination fixes
   - huge page support

  rockchip:
   - RK3368 HDMI Support
   - get rid of atomic_check fixups
   - RK3506 support
   - RK3576/RK3588 improved HPD handling

  rz-du:
   - RZ/V2H(P) MIPI-DSI Support

  v3d:
   - fix DMA segment size
   - convert to new logging helpers

  mediatek:
   - move DP training to hotplug thread
   - convert logging to new helpers
   - add support for HS speed DSI
   - Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support

  atmel-hlcdc:
   - switch to drmm resource
   - support nomodeset
   - use newer helpers

  hisilicon:
   - fix various DP bugs

  renesas:
   - fix kernel panic on reboot

  exynos:
   - fix vidi_connection_ioctl using wrong device
   - fix vidi_connection deref user ptr
   - fix concurrency regression with vidi_context

  vkms:
   - add configfs support for display configuration

* tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel: (1610 commits)
  drm/xe/pm: Disable D3Cold for BMG only on specific platforms
  drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
  drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
  drm/xe: Fix kerneldoc for xe_migrate_exec_queue
  drm/xe/query: Fix topology query pointer advance
  drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
  drm/xe/guc: Fix CFI violation in debugfs access.
  accel/amdxdna: Move RPM resume into job run function
  accel/amdxdna: Fix incorrect DPM level after suspend/resume
  nouveau/vmm: start tracking if the LPT PTE is valid. (v6)
  nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)
  nouveau/vmm: rewrite pte tracker using a struct and bitfields.
  accel/amdxdna: Fix incorrect error code returned for failed chain command
  accel/amdxdna: Remove hardware context status
  drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()
  drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc
  drm/i915/display: fix the pixel normalization handling for xe3p_lpd
  drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
  drm/exynos: vidi: fix to avoid directly dereferencing user pointer
  drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
  ...
2026-02-11 12:55:44 -08:00
Sunday Clement
8a70a26c9f drm/amdkfd: Fix out-of-bounds write in kfd_event_page_set()
The kfd_event_page_set() function writes KFD_SIGNAL_EVENT_LIMIT * 8
bytes via memset without checking the buffer size parameter. This allows
unprivileged userspace to trigger an out-of bounds kernel memory write
by passing a small buffer, leading to  potential privilege
escalation.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2026-02-05 17:20:07 -05:00
Andrew Martin
08b1eb621c drm/amdgpu: Ignored various return code
The return code of a non void function should not be ignored. In cases
where we do not care, the code needs to suppress it.

Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-02-03 16:46:31 -05:00
Oleg Nesterov
a87da7a9fa drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
Nowadays task->group_leader->mm != task->mm is only possible if a) task is
not a group leader and b) task->group_leader->mm == NULL because
task->group_leader has already exited using sys_exit().

I don't think that drm/amd tries to detect/nack this case.

Link: https://lkml.kernel.org/r/aXY_yLVHd63UlWtm@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Christan König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03 08:21:25 -08:00
Linus Torvalds
bcb6058a4b Merge tag 'mm-hotfixes-stable-2026-01-29-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "16 hotfixes.  9 are cc:stable, 12 are for MM.

  There's a patch series from Pratyush Yadav which fixes a few things in
  the new-in-6.19 LUO memfd code.

  Plus the usual shower of singletons - please see the changelogs for
  details"

* tag 'mm-hotfixes-stable-2026-01-29-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  vmcoreinfo: make hwerr_data visible for debugging
  mm/zone_device: reinitialize large zone device private folios
  mm/mm_init: don't cond_resched() in deferred_init_memmap_chunk() if called from deferred_grow_zone()
  mm/kfence: randomize the freelist on initialization
  kho: kho_preserve_vmalloc(): don't return 0 when ENOMEM
  kho: init alloc tags when restoring pages from reserved memory
  mm: memfd_luo: restore and free memfd_luo_ser on failure
  mm: memfd_luo: use memfd_alloc_file() instead of shmem_file_setup()
  memfd: export alloc_file()
  flex_proportions: make fprop_new_period() hardirq safe
  mailmap: add entry for Viacheslav Bocharov
  mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
  mm, swap: restore swap_space attr aviod kernel panic
  mm/kasan: fix KASAN poisoning in vrealloc()
  mm/shmem, swap: fix race of truncate and swap entry split
2026-01-29 11:09:13 -08:00
Lang Yu
2bddc36c12 drm/amdkfd: Use AMDGPU_MQD_SIZE_ALIGN in gfx11+ kfd mqd manager
MES is enabled by default from gfx11+, use AMDGPU_MQD_SIZE_ALIGN
unconditionally for gfx11+.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:02 -05:00
Lang Yu
3aca6f835b drm/amdkfd: Adjust parameter of allocate_mqd
Make allocate_mqd consistent with other callbacks.
Prepare for next patch to use mqd_manager->mqd_size.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:58 -05:00
Jay Cornwall
05762d9c7d drm/amdkfd: gfx12.1 trap handler instruction fixup for VOP3PX
A trap may occur in the middle of VOP3PX instruction co-issue.
The PC would be restored incorrectly if left unmodified.

Identify this case by examining the instruction opcode and
rewind the PC 8 bytes if it occurs.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
Cc: Shweta Khatri <shweta.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-28 16:21:21 -05:00
Jonathan Kim
e698127eb7 drm/amdkfd: add extended capabilities to device snapshot
Add additional capabilities reporting.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: James Zhu <james.zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27 18:13:28 -05:00
Matthew Brost
12b2285bf3 mm/zone_device: reinitialize large zone device private folios
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio.  This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio.  As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.

Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.

Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com
Fixes: d245f9b4ab ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26 19:03:48 -08:00
Yury Norov
afaf425071 drm/amdkfd: simplify svm_range_unmap_from_gpus()
The function calls bitmap_or() followed by for_each_set_bit().
Switch it to the dedicated for_each_or_bit() and drop the temporary
bitmap.

Signed-off-by: Yury Norov <ynorov@nvidia.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:24:19 -05:00
Lancelot Six
532e2c87d4 drm/amdkfd: Do not include VGPR MSBs in saved PC during save
The current trap handler uses the top bits of ttmp1 to store a copy of
sq_wave_mode.*vgpr_msb (except for src2_vgpr_msb).  This is so the
effective values in sq_wave_mode can be cleared to ensure correct
behavior of the trap handler.

When saving sq_wave_mode, the trap handler correctly rebuilds the
expected value (with *vgpr_msb restored), so the save area is correct.
However, the PC itself is copied from ttmp[0:1], which contains the
wave's PC as well as the saved MSBs.

The debugger reads the PC from the save area and is confused when non-0
values from VGPR_MSBs are present.

This patch fixes this by saving the PC in the save area's PC slot, not
the composite of the PC and VGPR_MSBs.  On restore, the VGPR_MSBs are
restored from sq_wave_mode.

Signed-off-by: Lancelot Six <lancelot.six@amd.com>
Tested-by: Alexey Kondratiev <Alexey.Kondratiev@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:24:14 -05:00
Jay Cornwall
bbcad5a889 drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
- Leave DEP_MODE unchanged as it is ignored in the trap handler
- Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:21:51 -05:00
Jay Cornwall
29b703d7ad drm/amdkfd: gfx12.1 cluster barrier context save workaround
Trap cluster barrier may not serialize with user cluster barrier
under some circumstances. Add a check for pending user cluster
barrier complete.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Gang Ba <Gang.Ba@amd.com>
Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:36 -05:00
Jay Cornwall
ea89b305b6 drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
Scalar loads may arrive out-of-order with respect to KMCNT.
The affected code expects the two loads to arrive in-order.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:29 -05:00
Jay Cornwall
d9fc0bdf9c drm/amdkfd: Sync trap handler binary with source
Binary and source desynced during branch activity. Source merge
also introduced compile error.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-21 14:18:17 -05:00
Jonathan Kim
b6aff8bb0c drm/amdkfd: fix gfx11 restrictions on debugging cooperative launch
Restrictions on debugging cooperative launch for GFX11 devices should
align to CWSR work around requirements.
i.e. devices without the need for the work around should not be subject
to such restrictions.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: James Zhu <james.zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 230ef3977d)
2026-01-20 21:50:12 -05:00
Jonathan Kim
230ef3977d drm/amdkfd: fix gfx11 restrictions on debugging cooperative launch
Restrictions on debugging cooperative launch for GFX11 devices should
align to CWSR work around requirements.
i.e. devices without the need for the work around should not be subject
to such restrictions.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: James Zhu <james.zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-20 17:17:42 -05:00
Philip Yang
d4a814f400 drm/amdkfd: Move gfx9.4.3 and gfx 9.5 MQD to HBM
To reduce queue switch latency further, move MQD to VRAM domain, CP
access MQD and control stack via FB aperture, this requires contiguous
pages.

After MQD is initialized, updated or restored, flush HDP to guarantee
the data is written to HBM and GPU cache is invalidated, then CP will
read the new MQD.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-20 17:15:46 -05:00
Dave Airlie
c098b1aa2f Merge tag 'amd-drm-next-6.20-2026-01-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.20-2026-01-16:

amdgpu:
- SR-IOV fixes
- Rework SMU mailbox handling
- Drop MMIO_REMAP domain
- UserQ fixes
- MES cleanups
- Panel Replay updates
- HDMI fixes
- Backlight fixes
- SMU 14.x fixes
- SMU 15 updates

amdkfd:
- Fix a memory leak
- Fixes for systems with non-4K pages
- LDS/Scratch cleanup
- MES process eviction fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260116202609.23107-1-alexander.deucher@amd.com
2026-01-19 06:54:46 +10:00
Dave Airlie
83dc0ba275 Merge tag 'amd-drm-next-6.20-2026-01-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.20-2026-01-09:

amdgpu:
- GPUVM updates
- Initial support for larger GPU address spaces
- Initial SMUIO 15.x support
- Documentation updates
- Initial PSP 15.x support
- Initial IH 7.1 support
- Initial IH 6.1.1 support
- SMU 13.0.12 updates
- RAS updates
- Initial MMHUB 3.4 support
- Initial MMHUB 4.2 support
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- HDMI fixes
- Panel replay improvements
- DML updates
- DC FP fixes
- Initial SDMA 6.1.4 support
- Initial SDMA 7.1 support
- Userq updates
- DC HPD refactor
- SwSMU cleanups and refactoring
- TTM memory ops parallelization
- DCN 3.5 fixes
- DP audio fixes
- Clang fixes
- Misc spelling fixes and cleanups
- Initial SDMA 7.11.4 support
- Convert legacy DRM logging helpers to new drm logging helpers
- Initial JPEG 5.3 support
- Add support for changing UMA size via the driver
- DC analog fixes
- GC 9 gfx queue reset support
- Initial SMU 15.x support

amdkfd:
- Reserved SDMA rework
- Refactor SPM
- Initial GC 12.1 support
- Initial GC 11.5.4 support
- Initial SDMA 7.1 support
- Initial SDMA 6.1.4 support
- Increase the kfd process hash table
- Per context support
- Topology fixes

radeon:
- Convert legacy DRM logging helpers to new drm logging helpers
- Use devm for i2c adapters
- Variable sized array fix
- Misc cleanups

UAPI:
- KFD context support.  Proposed userspace:
  https://github.com/ROCm/rocm-systems/pull/1705
  https://github.com/ROCm/rocm-systems/pull/1701
- Add userq metadata queries for more queue types.  Proposed userspace:
  https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260109154713.3242957-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-01-15 14:49:33 +10:00
Harish Kasiviswanathan
18dbcfb46f drm/amdkfd: No need to suspend whole MES to evict process
Each queue of the process is individually removed and there is not need
to suspend whole mes. Suspending mes stops kernel mode queues also
causing unnecessary timeouts when running mixed work loads

Fixes: 079ae5118e ("drm/amdkfd: fix suspend/resume all calls in mes based eviction path")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4765
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3fd20580b9)
2026-01-14 15:07:05 -05:00
Haoxiang Li
80614c5098 drm/amdkfd: fix a memory leak in device_queue_manager_init()
If dqm->ops.initialize() fails, add deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up to ensure proper function
visibility at the point of use.

Fixes: 11614c36bc ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7cccc8286)
Cc: stable@vger.kernel.org
2026-01-14 14:58:24 -05:00
Philip Yang
0cba5b27f1 drm/amdkfd: Add domain parameter to alloc kernel BO
To allocate kernel BO from VRAM domain for MQD in the following patch.
No functional change because kernel BO allocate all from GTT domain.

Rename amdgpu_amdkfd_alloc_gtt_mem to amdgpu_amdkfd_alloc_kernel_mem
Rename amdgpu_amdkfd_free_gtt_mem to amdgpu_amdkfd_free_kernel_mem
Rename mem_kfd_mem_obj gtt_mem to mem

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:59 -05:00
Harish Kasiviswanathan
3fd20580b9 drm/amdkfd: No need to suspend whole MES to evict process
Each queue of the process is individually removed and there is not need
to suspend whole mes. Suspending mes stops kernel mode queues also
causing unnecessary timeouts when running mixed work loads

Fixes: 079ae5118e ("drm/amdkfd: fix suspend/resume all calls in mes based eviction path")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4765
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:59 -05:00
Lang Yu
bcd600ab7f drm/amdkfd: Switch to using GC VERSION to decide LDS/Scratch base
Next generation GC IP with 4-level page table needs to use the
same LDS/Scratch base with 5-level page table, use GC VERSION
to decide is more appropriate.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:58 -05:00
Xiaogang Chen
6cca686dfc drm/amdkfd: kfd driver supports hot unplug/replug amdgpu devices
This patch allows kfd driver function correctly when AMD gpu devices got
unplug/replug at run time.

When an AMD gpu device got unplug kfd driver gracefully terminates existing
kfd processes after stops all queues by sending SIGBUS to user process. After
that user space can still use remaining AMD gpu devices. When all AMD gpu
devices at system got removed kfd driver will not response new requests.

Unplugged AMD gpu devices can be re-plugged. kfd driver will use added devices
to function as usual.

The purpose of this patch is having kfd driver behavior as expected during and
after AMD gpu devices unplug/replug at run time.

Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:49 -05:00
Donet Tom
6c16000166 drm/amdkfd: Fix GART PTE for non-4K pagesize in svm_migrate_gart_map()
In svm_migrate_gart_map(), while migrating GART mapping, the number of
bytes copied for the GART table only accounts for CPU pages. On non-4K
systems, each CPU page can contain multiple GPU pages, and the GART
requires one 8-byte PTE per GPU page. As a result, an incorrect size was
passed to the DMA, causing only a partial update of the GART table.

Fix this function to work correctly on non-4K page-size systems by
accounting for the number of GPU pages per CPU page when calculating the
number of bytes to be copied.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:49 -05:00
Donet Tom
fb361a520a drm/amdkfd: Fix SVM map/unmap address conversion for non-4k page sizes
SVM range size is tracked using the system page size. The range start and
end are aligned to system page-sized PFNs, so the total SVM range size
equals the total number of pages in the SVM range multiplied by the system
page size.

The SVM range map/unmap functions pass these system page-sized PFN numbers
to amdgpu_vm_update_range(), which expects PFNs based on the GPU page size
(4K). On non-4K page systems, this mismatch causes only part of the SVM
range to be mapped in the GPU page table, while the rest remains unmapped.
If the GPU accesses an unmapped address within the same range, it results
in a GPU page fault.

To fix this, the required conversion has been added in both
svm_range_map_to_gpu() and svm_range_unmap_from_gpu(), ensuring that all
pages in the SVM range are correctly mapped on non-4K systems.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:48 -05:00
Donet Tom
42ea9cf2f1 drm/amdkfd: Relax size checking during queue buffer get
HW-supported EOP buffer sizes are 4K and 32K. On systems that do not
use 4K pages, the minimum buffer object (BO) allocation size is
PAGE_SIZE (for example, 64K). During queue buffer acquisition, the driver
currently checks the allocated BO size against the supported EOP buffer
size. Since the allocated BO is larger than the expected size, this check
fails, preventing queue creation.

Relax the strict size validation and allow PAGE_SIZE-sized BOs to be used.
Only the required 4K region of the buffer will be used as the EOP buffer
and avoids queue creation failures on non-4K page systems.

Acked-by: Christian König <christian.koenig@amd.com>
Suggested-by: Philip Yang <yangp@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14 14:28:48 -05:00
Haoxiang Li
b7cccc8286 drm/amdkfd: fix a memory leak in device_queue_manager_init()
If dqm->ops.initialize() fails, add deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up to ensure proper function
visibility at the point of use.

Fixes: 11614c36bc ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-10 14:21:52 -05:00
Julia Lawall
95b36732fe drm/amdkfd: update outdated comment
The function acquire_packet_buffer() was renamed
kq_acquire_packet_buffer() by commit a5a4d68c93 ("drm/amdkfd:
Eliminate unnecessary kernel queue function pointers").  Update
the comment accordingly.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 17:00:00 -05:00
Jay Cornwall
ba80939fec drm/amdkfd: Apply VGPR bank state fixup on gfx12.1 trap exit
- Identify co-issue of S_SET_VGPR_MSB and VALU with banked VGPR
- Restore previous bank setting when exiting the trap

v2:
- Refine VOP3PX2 detection
- Improve load pipelining
- Fix a comment typo

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Jay Cornwall
1005ab86cf drm/amdkfd: Fix VGPR bank state save in gfx12.1 trap handler
S_SETREG_IMM32_B32 does not apply a mask to the MODE bank bits.
SRC2 is consequently unconditonally cleared during context save.

Use S_SETREG_B32 instead to preserve SRC2.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
David Yat Sin
c51bb53d5c drm/amdkfd: Add metadata ring buffer for compute
Add support for separate ring-buffer for metadata packets when using
compute queues. Userspace application allocate the metadata ring-buffer
and the queue ring-buffer with a single allocation. The metadata
ring-buffer starts after the queue ring-buffer.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Mukul Joshi
258cc2b687 drm/amdkfd: Add back CWSR trap handler for GFX 12.1
CWSR Trap handler for GFX 12.1 was missed when merging changes
from 6.14 NPI branch to 6.16 NPI branch. This change adds back
the CWSR trap handler for GFX 12.1.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Jonathan Kim
e418a8fdb9 drm/amdkfd: enable precise memory operations for gfx1250
Enable precise memory for GFX 1250.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00
Mukul Joshi
a0806e7fe7 drm/amdkfd: Implement CU Masking for GFX 12.1
Add CU masking implementation for GFX 12.1. Add a local
implementation for GFX 12.1 instead of using the generic
function defined in kfd_mqd_manager.c because of some
quirks in the way CU mask is handled on GFX 12.1.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05 16:59:56 -05:00