Commit Graph

9 Commits

Author SHA1 Message Date
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Melissa Wen
adefb2ccea drm/v3d: create a dedicated lock for dma fence
Don't mix dma fence lock with the active_job lock. Use fence_lock to
protect the dma fence used by drm scheduler when signalling a job
completion and queue_lock to protect concurrent access to active bin job
in OOM and stats collection for a given file priv. The issue was
uncovered when PREEMPT_RT on with a system freeze when opening multiple
Chromium tabs on Raspberry Pi 5.

Link: https://github.com/raspberrypi/linux/issues/7035
Fixes: fa6a20c874 ("drm/v3d: Address race-condition between per-fd GPU stats and fd release")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://lore.kernel.org/r/20250916172022.2779837-1-mwen@igalia.com
2025-09-30 14:28:14 -01:00
Maíra Canal
e9d8e02748 drm/v3d: Replace a global spinlock with a per-queue spinlock
Each V3D queue works independently and all the dependencies between the
jobs are handled through the DRM scheduler. Therefore, there is no need
to use one single lock for all queues. Using it, creates unnecessary
contention between different queues that can operate independently.

Replace the global spinlock with per-queue locks to improve parallelism
and reduce contention between different V3D queues (BIN, RENDER, TFU,
CSD). This allows independent queues to operate concurrently while
maintaining proper synchronization within each queue.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://lore.kernel.org/r/20250826-v3d-queue-lock-v3-3-979efc43e490@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-08-29 10:28:10 -03:00
Eric Anholt
d223f98f02 drm/v3d: Add support for compute shader dispatch.
The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run.  However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).

This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).

So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).

v2: Fix excessive allocation for the clean_job (reported by Dan
    Carpenter).  Keep refs on jobs until clean_job is finished, to
    avoid spurious MMU errors if the output BOs are freed by userspace
    before L2 cleaning is finished.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:10 -07:00
Eric Anholt
db176f6ba1 drm/v3d: Add missing fence timeline name for TFU.
We shouldn't be returning v3d-render for our new queue.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 83d5139982db ("drm/v3d: Add support for submitting jobs to the TFU.")
Link: https://patchwork.freedesktop.org/patch/msgid/20181201005759.28093-6-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>
2018-12-03 11:24:58 -08:00
Eric Anholt
e0d018119a drm/v3d: Remove unnecessary dma_fence_ops.
The dma-fence core as of commit 418cc6ca06 ("dma-fence: Make ->wait
callback optional") provides appropriate defaults for these methods.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-2-eric@anholt.net
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
2018-07-05 11:42:50 -07:00
Eric Anholt
14d1d19086 drm/v3d: Remove the bad signaled() implementation.
Since our seqno value comes from a counter associated with the GPU
ring, not the entity (aka client), they'll be completed out of order.
There's actually no need for this code at all, since we don't have
enable_signaling() and thus DMA_FENCE_SIGNALED_BIT will be set before
we could be called.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605190302.18279-2-eric@anholt.net
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-06-21 14:46:05 -07:00
Eric Anholt
57692c94dc drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+
This driver will be used to support Mesa on the Broadcom 7268 and 7278
platforms.

V3D 3.3 introduces an MMU, which means we no longer need CMA or vc4's
complicated CL/shader validation scheme.  This massively changes the
GEM behavior, so I've forked off to a new driver.

v2: Mark SUBMIT_CL as needing DRM_AUTH.  coccinelle fixes from kbuild
    test robot. Drop personal git link from MAINTAINERS.  Don't
    double-map dma-buf imported BOs.  Add kerneldoc about needing MMU
    eviction.  Drop prime vmap/unmap stubs.  Delay mmap offset setup
    to mmap time.  Use drm_dev_init instead of _alloc.  Use
    ktime_get() for wait_bo timeouts.  Drop drm_can_sleep() usage,
    since we don't modeset.  Switch page tables back to WC (debug
    change to coherent had slipped in).  Switch
    drm_gem_object_unreference_unlocked() to
    drm_gem_object_put_unlocked().  Simplify overflow mem handling by
    not sharing overflow mem between jobs.
v3: no changes
v4: align submit_cl to 64 bits (review by airlied), check zero flags in
    other ioctls.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v4)
Acked-by: Dave Airlie <airlied@linux.ie> (v3, requested submit_cl change)
Link: https://patchwork.freedesktop.org/patch/msgid/20180430181058.30181-3-eric@anholt.net
2018-05-03 16:26:30 -07:00