This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
On plaforms with an a7xx GPU not supporting IFPC, the ifpc_reglist
if still deferenced in a7xx_patch_pwrup_reglist() which causes
a kernel crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
...
pc : a6xx_hw_init+0x155c/0x1e4c [msm]
lr : a6xx_hw_init+0x9a8/0x1e4c [msm]
...
Call trace:
a6xx_hw_init+0x155c/0x1e4c [msm] (P)
msm_gpu_hw_init+0x58/0x88 [msm]
adreno_load_gpu+0x94/0x1fc [msm]
msm_open+0xe4/0xf4 [msm]
drm_file_alloc+0x1a0/0x2e4 [drm]
drm_client_init+0x7c/0x104 [drm]
drm_fbdev_client_setup+0x94/0xcf0 [drm_client_lib]
drm_client_setup+0xb4/0xd8 [drm_client_lib]
msm_drm_kms_post_init+0x2c/0x3c [msm]
msm_drm_init+0x1a4/0x228 [msm]
msm_drm_bind+0x30/0x3c [msm]
...
Check the validity of ifpc_reglist before deferencing the table
to setup the register values.
Fixes: a6a0157cc6 ("drm/msm/a6xx: Enable IFPC on Adreno X1-85")
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/688944/
Message-ID: <20251117-topic-sm8x50-fix-a6xx-non-ifpc-v1-1-e4473cbf5903@linaro.org>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
A8x is the next generation of Adreno GPUs, featuring a significant
hardware design change. A major update to the design is the introduction
of Slice architecture. Slices are sort of mini-GPUs within the GPU which
are more independent in processing Graphics and compute workloads. Also,
in addition to the BV and BR pipe we saw in A7x, CP has more concurrency
with additional pipes.
From a software interface perspective, these changes have a significant
impact on the KMD side. First, the GPU register space has been extensively
reorganized. Second, to avoid a register space explosion caused by the
new slice architecture and additional pipes, many registers are now
virtualized, instead of duplicated as in A7x. KMD must configure an
aperture register with the appropriate slice and pipe ID before accessing
these virtualized registers.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/689019/
Message-ID: <20251118-kaana-gpu-support-v4-14-86eeb8e93fb6@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Add support for Adreno 612 GPU found in SM6150/QCS615 chipsets.
A612 falls under ADRENO_6XX_GEN1 family and is a cut down version
of A615 GPU.
A612 has a new IP called Reduced Graphics Management Unit or RGMU
which is a small state machine which helps to toggle GX GDSC
(connected to CX rail) to implement IFPC feature. It doesn't support
any other features of a full fledged GMU like clock control, resource
voting to rpmh etc. So we need linux clock driver support like other
gmu-wrapper implementations to control gpu core clock and gpu GX gdsc.
This patch skips RGMU core initialization and act more like a
gmu-wrapper case.
Signed-off-by: Jie Zhang <quic_jiezh@quicinc.com>
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/686212/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
From the hangcheck handler, KMD checks a few registers in GX
domain to see if the GPU made any progress. But it cannot access
those registers when IFPC is enabled. Since HW based hang detection
is pretty decent, lets rely on it instead of these registers when
IFPC is enabled.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673378/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Even though the GX power domain is kept ON when there is a pending GPU
interrupt, there is a small window of potential race with GMU where it
may move the AHB fence to 'Drop' mode. Once the GMU sees the pending IRQ,
it will move back the fence state to ALLOW mode. Close this race window
by polling for AHB fence to ensure that it is in 'Allow' mode.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673377/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
There are some special registers which are accessible even when GX power
domain is collapsed during an IFPC sleep. Accessing these registers
wakes up GPU from power collapse and allow programming these registers
without additional handshake with GMU. This patch adds support for this
special register write sequence.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673368/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
With user managed VMs and multiple queues, it is in theory possible to
trigger map/unmap errors. These will (in a later patch) mark the VM as
unusable. But we want to tell the io-pgtable helpers not to spam the
log. In addition, in the unmap path, we don't want to bail early from
the unmap, to ensure we don't leave some dangling pages mapped.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Tested-by: Antonino Maniscalco <antomani103@gmail.com>
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/661520/
Now that we've realigned deletion and allocation, switch over to using
drm_gpuvm/drm_gpuva. This allows us to support multiple VMAs per BO per
VM, to allow mapping different parts of a single BO at different virtual
addresses, which is a key requirement for sparse/VM_BIND.
This prepares us for using drm_gpuvm to translate a batch of MAP/
MAP_NULL/UNMAP operations from userspace into a sequence of map/remap/
unmap steps for updating the page tables.
Since, unlike our prior vm/vma setup, with drm_gpuvm the vm_bo holds a
reference to the GEM object. To prevent reference loops causing us to
leak all GEM objects, we implicitly tear down the mapping when the GEM
handle is close or when the obj is unpinned. Which means the submit
needs to also hold a reference to the vm_bo, to prevent the VMA from
being torn down while the submit is in-flight.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Tested-by: Antonino Maniscalco <antomani103@gmail.com>
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/661479/
Calling this packet is necessary when we switch contexts because there
are various pieces of state used by userspace to synchronize between BR
and BV that are persistent across submits and we need to make sure that
they are in a "safe" state when switching contexts. Otherwise a
userspace submission in one context could cause another context to
function incorrectly and hang, effectively a denial of service (although
without leaking data). This was missed during initial a7xx bringup.
Fixes: af66706acc ("drm/msm/a6xx: Add skeleton A7xx support")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/654924/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
When things go wrong, the GPU is capable of quickly generating millions
of faulting translation requests per second. When that happens, in the
stall-on-fault model each access will stall until it wins the race to
signal the fault and then the RESUME register is written. This slows
processing page faults to a crawl as the GPU can generate faults much
faster than the CPU can acknowledge them. It also means that all
available resources in the SMMU are saturated waiting for the stalled
transactions, so that other transactions such as transactions generated
by the GMU, which shares translation resources with the GPU, cannot
proceed. This causes a GMU watchdog timeout, which leads to a failed
reset because GX cannot collapse when there is a transaction pending and
a permanently hung GPU.
On older platforms with qcom,smmu-v2, it seems that when one transaction
is stalled subsequent faulting transactions are terminated, which avoids
this problem, but the MMU-500 follows the spec here.
To work around these problems, disable stall-on-fault as soon as we get a
page fault until a cooldown period after pagefaults stop. This allows
the GMU some guaranteed time to continue working. We only use
stall-on-fault to halt the GPU while we collect a devcoredump and we
always terminate the transaction afterward, so it's fine to miss some
subsequent page faults. We also keep it disabled so long as the current
devcoredump hasn't been deleted, because in that case we likely won't
capture another one if there's a fault.
After this commit HFI messages still occasionally time out, because the
crashdump handler doesn't run fast enough to let the GMU resume, but the
driver seems to recover from it. This will probably go away after the
HFI timeout is increased.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/654891/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Really the only purpose of this was to limit the address space size to
4GB to avoid 32b rollover problems in 64b pointer math in older sqe fw.
So replace the address_space_size with a quirk limiting the address
space to 4GB. In all other cases, use the SMMU input address size (IAS)
to determine the address space size.
v2: Properly account for vm_start
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/649467/
Merge drm-misc-next to get commit Fixes: fec450ca15 ("drm/display:
hdmi: provide central data authority for ACR params").
Signed-off-by: Rob Clark <robdclark@chromium.org>
IB_SIZE is only b0..b19. Starting with a6xx gen3, additional fields
were added above the IB_SIZE. Accidentially setting them can cause
badness. Fix this by properly defining the CP_INDIRECT_BUFFER packet
and using the generated builder macro to ensure unintended bits are not
set.
v2: add missing type attribute for IB_BASE
v3: fix offset attribute in xml
Reported-by: Connor Abbott <cwabbott0@gmail.com>
Fixes: a83366ef19 ("drm/msm/a6xx: add A640/A650 to gpulist")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/643396/
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>