Commit Graph

27 Commits

Author SHA1 Message Date
Alexandre Courbot
7973858907 gpu: nova-core: convert PFB registers to kernel register macro
Convert all PFB registers to use the kernel's register macro and update
the code accordingly.

NV_PGSP_QUEUE_HEAD was somehow caught in the PFB section, so move it to
its own section and convert it as well.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260325-b4-nova-register-v4-4-bdf172f0f6ca@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-26 15:08:27 +09:00
Alexandre Courbot
4e7588dcb0 gpu: nova-core: convert PBUS registers to kernel register macro
Convert all PBUS registers to use the kernel's register macro and update
the code accordingly.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260325-b4-nova-register-v4-3-bdf172f0f6ca@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-26 15:08:27 +09:00
Danilo Krummrich
e21ad5e51c gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
Convert wpr_meta to use Coherent::init() and simplify the
initialization.  It also avoids a separate initialization of
GspFwWprMeta on the stack.

Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-7-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-23 22:36:00 +01:00
Eliot Courtney
a19457958c gpu: nova-core: gsp: add mutex locking to Cmdq
Wrap `Cmdq`'s mutable state in a new struct `CmdqInner` and wrap that in
a Mutex. This lets `Cmdq` methods take &self instead of &mut self, which
lets required commands be sent e.g. while unloading the driver.

The mutex is held over both send and receive in `send_command` to make
sure that it doesn't get the reply of some other command that could have
been sent just beforehand.

Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260318-cmdq-locking-v5-5-18b37e3f9069@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18 21:53:14 +09:00
Eliot Courtney
c3bd240f97 gpu: nova-core: gsp: add reply/no-reply info to CommandToGsp
Add type infrastructure to know what reply is expected from each
`CommandToGsp`. Uses a marker type `NoReply` which does not implement
`MessageFromGsp` to mark commands which don't expect a response.

Update `send_command` to wait for a reply and add `send_command_no_wait`
which sends a command that has no reply, without blocking.

This prepares for adding locking to the queue.

Tested-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260318-cmdq-locking-v5-3-18b37e3f9069@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18 21:53:14 +09:00
Danilo Krummrich
76bce7ac51 Merge tag 'v7.0-rc4' into drm-rust-next
We need the latest fixes from drm-rust-fixes in drm-rust-next as well to
build on top of.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-15 22:58:48 +01:00
John Hubbard
a247f8a107 gpu: nova-core: add FbRange.len() and use it in boot.rs
A tiny simplification: now that FbLayout uses its own specific FbRange
type, add an FbRange.len() method, and use that to (very slightly)
simplify the calculation of Frts::frts_size initialization.

Suggested-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260310021125.117855-3-jhubbard@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10 20:13:46 +09:00
Timur Tabi
50b3e0c7c8 gpu: nova-core: use the Generic Bootloader to boot FWSEC on Turing
On Turing and GA100, a new firmware image called the Generic Bootloader
(gen_bootloader) must be used to load FWSEC into Falcon memory.  The
driver loads the generic bootloader into Falcon IMEM, passes a
descriptor that points to FWSEC using DMEM, and then boots the generic
bootloader.  The bootloader will then load FWSEC into IMEM and boot it.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-12-8f0042c5d026@nvidia.com
2026-03-09 10:39:13 +09:00
Alexandre Courbot
bc9de9e1af gpu: nova-core: create falcon firmware DMA objects lazily
When DMA was the only loading option for falcon firmwares, we decided to
store them in DMA objects as soon as they were loaded from disk and
patch them in-place to avoid having to do an extra copy.

This decision complicates the PIO loading patch considerably, and
actually does not even stand on its own when put into perspective with
the fact that it requires 8 unsafe statements in the code that wouldn't
exist if we stored the firmware into a `KVVec` and copied it into a DMA
object at the last minute.

The cost of the copy is, as can be expected, imperceptible at runtime.
Thus, switch to a lazy DMA object creation model and simplify our code
a bit. This will also have the nice side-effect of being more fit for
PIO loading.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-1-8f0042c5d026@nvidia.com
[acourbot@nvidia.com: add TODO item to switch back to a coherent
allocation when it becomes convenient to do so.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-09 10:35:37 +09:00
Gary Guo
4da879a0d3 rust: dma: use pointer projection infra for dma_{read,write} macro
Current `dma_read!`, `dma_write!` macros also use a custom
`addr_of!()`-based implementation for projecting pointers, which has
soundness issue as it relies on absence of `Deref` implementation on types.
It also has a soundness issue where it does not protect against unaligned
fields (when `#[repr(packed)]` is used) so it can generate misaligned
accesses.

This commit migrates them to use the general pointer projection
infrastructure, which handles these cases correctly.

As part of migration, the macro is updated to have an improved surface
syntax. The current macro have

    dma_read!(a.b.c[d].e.f)

to mean `a.b.c` is a DMA coherent allocation and it should project into it
with `[d].e.f` and do a read, which is confusing as it makes the indexing
operator integral to the macro (so it will break if you have an array of
`CoherentAllocation`, for example).

This also is problematic as we would like to generalize
`CoherentAllocation` from just slices to arbitrary types.

Make the macro expects `dma_read!(path.to.dma, .path.inside.dma)` as the
canonical syntax. The index operator is no longer special and is just one
type of projection (in additional to field projection). Similarly, make
`dma_write!(path.to.dma, .path.inside.dma, value)` become the canonical
syntax for writing.

Another issue of the current macro is that it is always fallible. This
makes sense with existing design of `CoherentAllocation`, but once we
support fixed size arrays with `CoherentAllocation`, it is desirable to
have the ability to perform infallible indexing as well, e.g. doing a `[0]`
index of `[Foo; 2]` is okay and can be checked at build-time, so forcing
falliblity is non-ideal. To capture this, the macro is changed to use
`[idx]` as infallible projection and `[idx]?` as fallible index projection
(those syntax are part of the general projection infra). A benefit of this
is that while individual indexing operation may fail, the overall
read/write operation is not fallible.

Fixes: ad2907b4e3 ("rust: add dma coherent allocator abstraction")
Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260302164239.284084-4-gary@kernel.org
[ Capitalize safety comments; slightly improve wording in doc-comments.
  - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-07 23:06:20 +01:00
Gary Guo
8d1a65c2de gpu: nova-core: remove redundant .as_ref() for dev_* print
This is now handled by the macro itself.

Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260123175854.176735-7-gary@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-02-24 15:13:07 +01:00
Timur Tabi
ab2aad252f gpu: nova-core: add Falcon HAL method load_method()
Some GPUs do not support using DMA to transfer code/data from system
memory to Falcon memory, and instead must use programmed I/O (PIO).
Add a function to the Falcon HAL to indicate whether a given GPU's
Falcons support DMA for this purpose.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260122222848.2555890-10-ttabi@nvidia.com
[acourbot@nvidia.com: add short code to call into the HAL.]
[acourbot@nvidia.com: make `dma_load` private as per feedback.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-01-24 10:48:59 +09:00
Timur Tabi
654826aa4a gpu: nova-core: add missing newlines to several print strings
Although the dev_xx!() macro calls do not technically require terminating
newlines for the format strings, they should be added anyway to maintain
consistency, both within Rust code and with the C versions.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260107201647.2490140-2-ttabi@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-12 14:44:06 +01:00
John Hubbard
8d6a8e7922 gpu: nova-core: preserve error information in gpu_name()
Change gpu_name() to return a Result instead of an Option. This avoids
silently discarding error information when parsing the GPU name string
from the GSP.

Update the callsite to log a warning with the error details on failure,
rather than just displaying "invalid GPU name".

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260108005811.86014-2-jhubbard@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-12 14:11:04 +01:00
Danilo Krummrich
db22fbc15a gpu: nova-core: fw: get rid of redundant Result in GspFirmware::new()
In GspFirmware::new(), utilize pin_init_scope() to get rid of the Result
in the returned

	Result<impl PinInit<T, Error>>

which is unnecessarily redundant.

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Link: https://patch.msgid.link/20251218155239.25243-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-12-29 17:54:31 +01:00
Alistair Popple
13f85988d4 gpu: nova-core: gsp: Retrieve GSP static info to gather GPU information
After GSP initialization is complete, retrieve the static configuration
information from GSP-RM. This information includes GPU name, capabilities,
memory configuration, and other properties. On some GPU variants, it is
also required to do this for initialization to complete.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: properly abstract the command's bindings, add
relevant methods, make str_from_null_terminated return an Option, fix
size of GPU name array.]
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-14-joelagnelf@nvidia.com>
2025-11-15 21:54:18 +09:00
Alistair Popple
0e7d572b4b gpu: nova-core: gsp: Wait for gsp initialization to complete
This adds the GSP init done command to wait for GSP initialization
to complete. Once this command has been received the GSP is fully
operational and will respond properly to normal RPC commands.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: move new definitions to end of commands.rs, rename
to `wait_gsp_init_done` and remove timeout argument.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-13-joelagnelf@nvidia.com>
2025-11-15 21:54:18 +09:00
Joel Fernandes
6ddfc892a5 gpu: nova-core: Implement the GSP sequencer
Implement the GSP sequencer which culminates in INIT_DONE message being
received from the GSP indicating that the GSP has successfully booted.

This is just initial sequencer support, the actual commands will be
added in the next patches.

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: move GspSequencerInfo definition before its impl
blocks and rename it to GspSequence, adapt imports in sequencer.rs to
new formatting rules, remove `timeout` argument to harmonize with other
commands.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-8-joelagnelf@nvidia.com>
2025-11-15 21:05:50 +09:00
Alistair Popple
5949d419c1 gpu: nova-core: gsp: Boot GSP
Boot the GSP to the RISC-V active state. Completing the boot requires
running the CPU sequencer which will be added in a future commit.

Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-15-8ae4058e3c0e@nvidia.com>
2025-11-14 20:25:57 +09:00
Alistair Popple
19b0a6e7c2 gpu: nova-core: gsp: Add SetRegistry command
Add support for sending the SetRegistry command, which is critical to
GSP initialization.

The RM registry is serialized into a packed format and sent via the
command queue. For now only three parameters which are required to boot
GSP are hardcoded. In the future a kernel module parameter will be added
to enable other parameters to be added.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
[acourbot@nvidia.com: split into its own patch.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-12-8ae4058e3c0e@nvidia.com>
2025-11-14 20:25:57 +09:00
Alistair Popple
edcb134264 gpu: nova-core: gsp: Add SetSystemInfo command
Add support for sending the SetSystemInfo command, which provides
required hardware information to the GSP and is critical to its
initialization.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-11-8ae4058e3c0e@nvidia.com>
2025-11-14 20:25:57 +09:00
Alistair Popple
41235c40ed gpu: nova-core: gsp: Create wpr metadata
The GSP requires some pieces of metadata to boot. These are passed in a
struct which the GSP transfers via DMA. Create this struct and get a
handle to it for future use when booting the GSP.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-5-8ae4058e3c0e@nvidia.com>
2025-11-14 20:25:56 +09:00
Alexandre Courbot
7c01dc25f5 gpu: nova-core: compute layout of more framebuffer regions required for GSP
Compute more of the required FB layout information to boot the GSP
firmware.

This information is dependent on the firmware itself, so first we need
to import and abstract the required firmware bindings in the `nvfw`
module.

Then, a new FB HAL method is introduced in `fb::hal` that uses these
bindings and hardware information to compute the correct layout
information.

This information is then used in `fb` and the result made visible in
`FbLayout`.

These 3 things are grouped into the same patch to avoid lots of unused
warnings that would be tedious to work around. As they happen in
different files, they should not be too difficult to track separately.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-1-8ae4058e3c0e@nvidia.com>
2025-11-14 11:05:58 +09:00
John Hubbard
173c99b85a gpu: nova-core: apply the one "use" item per line policy
As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" in order to tell
rustfmt(1) to leave it alone.

This does that for the entire nova-core driver.

[1] https://docs.kernel.org/rust/coding-guidelines.html#imports

Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove imports already in prelude as pointed out
by Danilo.]
[acourbot@nvidia.com: remove a few unneeded trailing `//`.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251107021006.434109-1-jhubbard@nvidia.com>
2025-11-07 23:10:44 +09:00
Alexandre Courbot
a841614e60 gpu: nova-core: firmware: process and prepare the GSP firmware
The GSP firmware is a binary blob that is verified, loaded, and run by
the GSP bootloader. Its presentation is a bit peculiar as the GSP
bootloader expects to be given a DMA address to a 3-levels page table
mapping the GSP firmware at address 0 of its own address space.

Prepare such a structure containing the DMA-mapped firmware as well as
the DMA-mapped page tables, and a way to obtain the DMA handle of the
level 0 page table.

Then, move the GSP firmware instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.

As we are performing the required ELF section parsing and radix3 page
table building, remove these items from the TODO file.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-7-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2025-09-13 23:17:38 +09:00
Alexandre Courbot
3e5c9681bf gpu: nova-core: firmware: process Booter and patch its signature
The Booter signed firmware is an essential part of bringing up the GSP
on Turing and Ampere. It is loaded on the sec2 falcon core and is
responsible for loading and running the RISC-V GSP bootloader into the
GSP core.

Add support for parsing the Booter firmware loaded from userspace, patch
its signatures, and store it into a form that is ready to be loaded and
executed on the sec2 falcon.

Then, move the Booter instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.

We do not run Booter yet, as its own payload (the GSP bootloader and
firmware image) still need to be prepared.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-6-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2025-09-13 23:17:34 +09:00
Alexandre Courbot
e7c96980ea gpu: nova-core: move GSP boot code to its own module
Right now the GSP boot code is very incomplete and limited to running
FRTS, so having it in `Gpu::new` is not a big constraint.

However, this will change as we add more steps of the GSP boot process,
and not all GPU families follow the same procedure, so having these
steps in a dedicated method is the logical construct.

There is also the fact the GSP will require its own runtime data, and
while it won't immediately need to be pinned, we want to be ready for
the time where it will - most likely when it starts using mutexes.

Thus, add an empty `Gsp` type that is pinned inside `Gpu` and
initialized using a pin initializer. This sets the constraint we need to
observe from the start, and could spare us some costly refactoring down
the road.

Then, move the code related to GSP boot to the `gsp::boot` module, as
part of the `Gsp` implementation.

Doing so allows us to make `Gpu::new` return a fallible `impl PinInit`
instead of a `Result.` This is more idiomatic when working with pinned
objects, and sets up the pinned initialization pattern we want to
preserve as the code grows more complex.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-2-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2025-09-13 23:17:21 +09:00