This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
Ensure that the DMA address of the framebuffer flush page is not larger
than its hardware register.
On GPUs older than Hopper, the register for the address can hold up to a
40-bit address (right-shifted by 8 so that it fits in the 32-bit
register), and on Hopper and later it can be 52 bits (64-bit register
where bits 52-63 must be zero).
Recently it was discovered that under certain conditions, the flush page
could be allocated outside this range. Although this bug was fixed, we
can ensure that any future changes to this code don't accidentally
generate an invalid page address.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patch.msgid.link/20251113230323.1271726-2-ttabi@nvidia.com
- also executes pre-DEVINIT, so early boot is able to DMA sysmem
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This enables BLCG optimization for kepler1. When using clockgating,
nvidia's firmware has a set of registers which are initially programmed
by the vbios with various engine delays and other mysterious settings
that are safe enough to bring up the GPU. However, the values used by
the vbios are more power hungry then they need to be, so the nvidia driver
writes it's own more optimized set of BLCG settings before enabling
CG_CTRL. This adds support for programming the optimized BLCG values
during engine/subdev init, which enables rather significant power
savings.
This introduces the nvkm_therm_clkgate_init() helper, which we use to
program the optimized BLCG settings before enabling clockgating with
nvkm_therm_clkgate_enable.
As well, this commit shares a lot more code with Fermi since BLCG is
mostly the same there as far as we can tell. In the future, it's likely
we'll reformat the clkgate_packs for kepler1 so that they share a list
of mmio packs with Fermi.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The 100c10 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default of 32 when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle)
So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GFxxx/GM1xx support the selection of 64/128KiB big pages globally.
GM2xx supports the same, as well as another mode where the page size
can be selected per-instance.
We default to 128KiB pages (With per-instance for GM200, but the current
code selects 128KiB there already) as the MMU code isn't currently able
to handle otherwise.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pretty much every subdev/engine is going to need access to nvkm_device
shortly to touch registers and/or output messages.
The odd placement of the includes is necessary to work around some
inter-dependencies that currently exist. This will be fixed later.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver. This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).
Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.
A comparison of objdump disassemblies proves no code changes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver. This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).
Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.
A comparison of objdump disassemblies proves no code changes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>