712 Commits

Author SHA1 Message Date
Sean Christopherson
d2ea4ff1ce KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to
verify that the guest can read and write the registers, without hitting
e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a
register.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260310211841.2552361-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-12 17:31:53 +01:00
Paolo Bonzini
c52b534f26 selftests: kvm: extract common functionality out of smm_test.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-11 18:41:12 +01:00
Paolo Bonzini
1b13885edf Merge tag 'kvm-x86-apic-6.20' of https://github.com/kvm-x86/linux into HEAD
KVM x86 APIC-ish changes for 6.20

 - Fix a benign bug where KVM could use the wrong memslots (ignored SMM) when
   creating a vCPU-specific mapping of guest memory.

 - Clean up KVM's handling of marking mapped vCPU pages dirty.

 - Drop a pile of *ancient* sanity checks hidden behind KVM's unused
   ASSERT() macro, most of which could be trivially triggered by the guest
   and/or user, and all of which were useless.

 - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it
   more obvious what the weird parameter is used for, and to allow burying the
   RTC shenanigans behind CONFIG_KVM_IOAPIC=y.

 - Bury all of ioapic.h and KVM_IRQCHIP_KERNEL behind CONFIG_KVM_IOAPIC=y.

 - Add a regression test for recent APICv update fixes.

 - Rework KVM's handling of VMCS updates while L2 is active to temporarily
   switch to vmcs01 instead of deferring the update until the next nested
   VM-Exit.  The deferred updates approach directly contributed to several
   bugs, was proving to be a maintenance burden due to the difficulty in
   auditing the correctness of deferred updates, and was polluting
   "struct nested_vmx" with a growing pile of booleans.

 - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv()
   to consolidate the updates, and to co-locate SVI updates with the updates
   for KVM's own cache of ISR information.

 - Drop a dead function declaration.
2026-02-11 12:45:32 -05:00
Paolo Bonzini
54f15ebfc6 Merge tag 'kvm-riscv-6.20-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.20

- Fixes for issues discovered by KVM API fuzzing in
  kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(),
  and kvm_riscv_vcpu_aia_imsic_update()
- Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM
- Add riscv vm satp modes in KVM selftests
- Transparent huge page support for G-stage
- Adjust the number of available guest irq files based on
  MMIO register sizes in DeviceTree or ACPI
2026-02-11 12:45:00 -05:00
Paolo Bonzini
4215ee0d7b Merge tag 'kvm-x86-svm-6.20' of https://github.com/kvm-x86/linux into HEAD
KVM SVM changes for 6.20

 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure.

 - Add support for virtualizing ERAPS.  Note, correct virtualization of ERAPS
   relies on an upcoming, publicly announced change in the APM to reduce the
   set of conditions where hardware (i.e. KVM) *must* flush the RAP.

 - Ignore nSVM intercepts for instructions that are not supported according to
   L1's virtual CPU model.

 - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath
   for EPT Misconfig.

 - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM,
   and allow userspace to restore nested state with GIF=0.

 - Treat exit_code as an unsigned 64-bit value through all of KVM.

 - Add support for fetching SNP certificates from userspace.

 - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD
   or VMSAVE on behalf of L2.

 - Misc fixes and cleanups.
2026-02-09 18:51:37 +01:00
Paolo Bonzini
a0c468eda4 Merge tag 'kvm-x86-selftests-6.20' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.20

 - Add a regression test for TPR<=>CR8 synchronization and IRQ masking.

 - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support,
   and extend x86's infrastructure to support EPT and NPT (for L2 guests).

 - Extend several nested VMX tests to also cover nested SVM.

 - Add a selftest for nested VMLOAD/VMSAVE.

 - Rework the nested dirty log test, originally added as a regression test for
   PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage
   and to hopefully make the test easier to understand and maintain.
2026-02-09 18:38:54 +01:00
Wu Fei
39ad809dd2 KVM: riscv: selftests: Add riscv vm satp modes
The current VM modes cannot represent RISC-V guest modes precisely, so add
all 9 combinations of P(56,40,41) x V(57,48,39). Also detect the default VM
mode at runtime instead of hardcoding one, which might not be supported on
a specific machine.

Signed-off-by: Wu Fei <wu.fei9@sanechips.com.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251105151442.28767-1-wu.fei9@sanechips.com.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 19:05:23 +05:30
Sean Christopherson
a91cc48246 KVM: selftests: Test READ=>WRITE dirty logging behavior for shadow MMU
Update the nested dirty log test to validate KVM's handling of READ faults
when dirty logging is enabled.  Specifically, set the Dirty bit in the
guest PTEs used to map L2 GPAs, so that KVM will create writable SPTEs
when handling L2 read faults.  When handling read faults in the shadow MMU,
KVM opportunistically creates a writable SPTE if the mapping can be
writable *and* the gPTE is dirty (or doesn't support the Dirty bit), i.e.
if KVM doesn't need to intercept writes in order to emulate Dirty-bit
updates.
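
The condition described above can be written as a small predicate. This is only a sketch for illustration, not KVM's code: the function and parameter names are hypothetical, and the real shadow MMU takes more state into account.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the decision described above. The names are
 * illustrative and do not correspond to KVM's real helpers.
 */
static bool read_fault_creates_writable_spte(bool mapping_can_be_writable,
                                             bool gpte_has_dirty_bit,
                                             bool gpte_is_dirty)
{
    /* Writes must be intercepted while a Dirty-bit update is still pending. */
    bool needs_write_intercept = gpte_has_dirty_bit && !gpte_is_dirty;

    return mapping_can_be_writable && !needs_write_intercept;
}
```

With a dirty gPTE, a READ fault already yields a writable SPTE, which is why the test has to mark the page dirty at READ time for the later WRITE to be accounted for.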

To actually test the L2 READ=>WRITE sequence, e.g. without masking a false
pass by other test activity, route the READ=>WRITE and WRITE=>WRITE
sequences to separate L1 pages, and differentiate between "marked dirty
due to a WRITE access/fault" and "marked dirty due to creating a writable
SPTE for a READ access/fault".  The updated sequence exposes the bug fixed
by KVM commit 1f4e5fc83a ("KVM: x86: fix nested guest live migration
with PML") when the guest performs a READ=>WRITE sequence with dirty guest
PTEs.

Opportunistically tweak and rename the address macros, and add comments,
to make it more obvious what the test is doing.  E.g. NESTED_TEST_MEM1
vs. GUEST_TEST_MEM doesn't make it all that obvious that the test is
creating aliases in both the L2 GPA and GVA address spaces, but only when
L1 is using TDP to run L2.

Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260115172154.709024-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-16 07:48:54 -08:00
Fuad Tabba
e0a99a2b72 KVM: selftests: Fix typos and stale comments in kvm_util
Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`.

- Correct the argument description for `vcpu_args_set` in `kvm_util.h`,
  which incorrectly listed `vm` instead of `vcpu`.
- Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" ->
  "executing").
- Correct the return value description for `vm_vaddr_unused_gap` in
  `kvm_util.c` to match the implementation, which returns an address "at
  or above" `vaddr_min`, not "at or below".

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
de00d07321 KVM: selftests: Move page_align() to shared header
To avoid code duplication, move page_align() to the shared `kvm_util.h`
header file. Rename it to vm_page_align(), to make it clear that the
alignment is done with respect to the guest's base page size.
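
As a rough sketch of what aligning to the guest's base page size means, assuming the helper rounds an address up to the next page boundary (an assumption, the commit message does not spell out the rounding). The struct and function below are hypothetical stand-ins, not the selftests definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the selftests VM structure. */
struct demo_vm {
    uint64_t page_size; /* guest base page size, must be a power of two */
};

/* Round v up to a multiple of the guest's base page size. */
static uint64_t demo_vm_page_align(const struct demo_vm *vm, uint64_t v)
{
    assert(vm->page_size && !(vm->page_size & (vm->page_size - 1)));
    return (v + vm->page_size - 1) & ~(vm->page_size - 1);
}
```

Taking the VM as a parameter is what ties the result to the guest page size rather than the host's.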

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
7e03d07d03 KVM: arm64: selftests: Disable unused TTBR1_EL1 translations
KVM selftests map all guest code and data into the lower virtual address
range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...)
managed by TTBR1_EL1 is unused and uninitialized.

If a guest accesses the upper range, the MMU attempts a translation
table walk using uninitialized registers, leading to unpredictable
behavior.

Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1,
ensuring that any access to the upper range generates an immediate
Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to
ensure that tagged pointers in the upper range also deterministically
trigger a Translation Fault via EPD1.

Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in
`processor.h` to support this configuration. These are based on their
definitions in `arch/arm64/include/asm/pgtable-hwdef.h`.
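
A minimal sketch of the configuration, assuming the architectural TCR_EL1 layout in which EPD1 is bit 23 and TBI1 is bit 38. The macro names follow the commit message; the helper function is hypothetical and exists only to show how the bits combine.

```c
#include <assert.h>
#include <stdint.h>

#define TCR_EPD1_SHIFT  23
#define TCR_EPD1_MASK   (UINT64_C(1) << TCR_EPD1_SHIFT)
#define TCR_TBI1        (UINT64_C(1) << 38)

/* Hypothetical helper: disable TTBR1_EL1 walks and ignore the top byte. */
static uint64_t demo_tcr_disable_ttbr1(uint64_t tcr)
{
    return tcr | TCR_EPD1_MASK | TCR_TBI1;
}
```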

Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Yosry Ahmed
55058e3215 KVM: selftests: Add a selftests for nested VMLOAD/VMSAVE
Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1
intercepts for VMSAVE/VMLOAD always work regardless of
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK.

Then, more interestingly, it makes sure that when L1 does not intercept
VMLOAD/VMSAVE, they work as intended in L2. When
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from
L2 should interpret the GPA as an L2 GPA and translate it through the
NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1,
VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA.

To test this, put two VMCBs (0 and 1) in L1's physical address space,
and have a single L2 GPA where:
- L2 VMCB GPA == L1 VMCB(0) GPA
- L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1.

This setup allows detecting how the GPA is interpreted based on which L1
VMCB is actually accessed.

In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by
VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The
test userspace code then checks that the write was made to the correct
VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1),
and writes a new value to that VMCB. L2 then executes VMLOAD to load the
new value and makes sure it's reflected correctly in KERNEL_GS_BASE.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-14 14:09:10 -08:00
Sean Christopherson
d7507a94a0 KVM: SVM: Treat exit_code as an unsigned 64-bit value through all of KVM
Fix KVM's long-standing buggy handling of SVM's exit_code as a 32-bit
value.  Per the APM and Xen commit d1bd157fbc ("Big merge the HVM
full-virtualisation abstractions.") (which is arguably more trustworthy
than KVM), offset 0x70 is a single 64-bit value:

  070h 63:0 EXITCODE

Track exit_code as a single u64 to prevent reintroducing bugs where KVM
neglects to correctly set bits 63:32.
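
The class of bug being prevented can be illustrated with a small sketch. The value and helper names below are hypothetical and chosen only to show how a 32-bit view silently discards bits 63:32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only: a 64-bit exit code with bits set above bit 31. */
#define DEMO_EXIT_CODE  UINT64_C(0xffffffff00000400)

/* Buggy pattern: only the low 32 bits take part in the comparison. */
static bool demo_exit_code_matches_u32(uint64_t exit_code, uint32_t expected)
{
    return (uint32_t)exit_code == expected;
}

/* Fixed pattern: the full 64-bit value is compared. */
static bool demo_exit_code_matches_u64(uint64_t exit_code, uint64_t expected)
{
    return exit_code == expected;
}
```

Here the 32-bit comparison against 0x400 reports a match even though the upper half is non-zero, while the 64-bit comparison correctly rejects it.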

Fixes: 6aa8b732ca ("[PATCH] kvm: userspace interface")
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230211347.4099600-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-13 17:37:03 -08:00
Sean Christopherson
c3a9a27c79 KVM: selftests: Add a test to verify APICv updates (while L2 is active)
Add a test to verify KVM correctly handles a variety of edge cases related
to APICv updates, and in particular updates that are triggered while L2 is
actively running.

Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-13 17:35:31 -08:00
Sean Christopherson
e353850499 KVM: selftests: Rename vm_get_page_table_entry() to vm_get_pte()
Shorten the API to get a PTE as the "PTE" acronym is ubiquitous, and the
"page table entry" makes it unnecessarily difficult to quickly understand
what callers are doing.

No functional change intended.

Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:17 -08:00
Yosry Ahmed
251e4849a7 KVM: selftests: Set the user bit on nested NPT PTEs
According to the APM, NPT walks are treated as user accesses. In
preparation for supporting NPT mappings, set the 'user' bit on NPTs by
adding a mask of bits to always be set on PTEs in kvm_mmu.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:15 -08:00
Yosry Ahmed
753c0d5a50 KVM: selftests: Add support for nested NPTs
Implement nCR3 and NPT initialization functions, similar to the EPT
equivalents, and create common TDP helpers for enablement checking and
initialization. Enable NPT for nested guests by default if the TDP MMU
was initialized, similar to VMX.

Reuse the PTE masks from the main MMU in the NPT MMU, except for the C
and S bits related to confidential VMs.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-17-seanjc@google.com
[sean: apply Yosry's fixup for ncr3_gpa]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:14 -08:00
Sean Christopherson
07676c04bd KVM: selftests: Move TDP mapping functions outside of vmx.c
Now that the functions are no longer VMX-specific, move them to
processor.c. Do a minor comment tweak replacing 'EPT' with 'TDP'.

No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:13 -08:00
Yosry Ahmed
508d1cc3ca KVM: selftests: Reuse virt mapping functions for nested EPTs
Rework tdp_map() and friends to use __virt_pg_map() and drop the custom
EPT code in __tdp_pg_map() and tdp_create_pte().  The EPT code and
__virt_pg_map() are practically identical, the main differences are:
  - EPT uses the EPT struct overlay instead of the PTE masks.
  - EPT always assumes 4-level EPTs.

To reuse __virt_pg_map(), extend the PTE masks to work with EPT's RWX and
X-only capabilities, and provide a tdp_mmu_init() API so that EPT can pass
in the EPT PTE masks along with the root page level (which is currently
hardcoded to '4').

Don't reuse KVM's insane overloading of the USER bit for EPT_R as there's
no reason to multiplex bits in the selftests, e.g. selftests aren't trying
to shadow guest PTEs and thus don't care about funnelling protections into
a common permissions check.

Another benefit of reusing the code is having separate handling for
upper-level PTEs vs 4K PTEs, which avoids some quirks like setting the
large bit on a 4K PTE in the EPTs.

For all intents and purposes, no functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20251230230150.4150236-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:12 -08:00
Sean Christopherson
8296b16c0a KVM: selftests: Add a stage-2 MMU instance to kvm_vm
Add a stage-2 MMU instance so that architectures that support nested
virtualization (more specifically, nested stage-2 page tables) can create
and track stage-2 page tables for running L2 guests.  Plumb the structure
into common code to avoid cyclical dependencies, and to provide some line
of sight to having common APIs for creating stage-2 mappings.

As a bonus, putting the member in common code justifies using stage2_mmu
instead of tdp_mmu for x86.

Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:12 -08:00
Yosry Ahmed
e40e72fec0 KVM: selftests: Stop passing VMX metadata to TDP mapping functions
The root GPA is now retrieved from the nested MMU, stop passing VMX
metadata. This is in preparation for making these functions work for
NPTs as well.

Opportunistically drop tdp_pg_map() since it's unused.

No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:11 -08:00
Yosry Ahmed
f00f519ceb KVM: selftests: Use a TDP MMU to share EPT page tables between vCPUs
prepare_eptp() currently allocates new EPTs for each vCPU.  memstress has
its own hack to share the EPTs between vCPUs.  Currently, there is no
reason to have separate EPTs for each vCPU, and the complexity is
significant.  The only reason it doesn't matter now is because memstress
is the only user with multiple vCPUs.

Add vm_enable_ept() to allocate EPT page tables for an entire VM, and use
it everywhere to replace prepare_eptp().  Drop 'eptp' and 'eptp_hva' from
'struct vmx_pages' as they serve no purpose (e.g. the EPTP can be built
from the PGD), but keep 'eptp_gpa' so that the MMU structure doesn't need
to be passed in along with vmx_pages.  Dynamically allocate the TDP MMU
structure to avoid a cyclical dependency between kvm_util_arch.h and
kvm_util.h.

Remove the workaround in memstress to copy the EPT root between vCPUs
since that's now the default behavior.

Name the MMU tdp_mmu instead of e.g. nested_mmu or nested.mmu to avoid
recreating the same mess that KVM has with respect to "nested" MMUs, e.g.
does nested refer to the stage-2 page tables created by L1, or the stage-1
page tables created by L2?

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20251230230150.4150236-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:10 -08:00
Yosry Ahmed
6dd7075721 KVM: selftests: Move PTE bitmasks to kvm_mmu
Move the PTE bitmasks into kvm_mmu to parameterize them for virt mapping
functions. Introduce helpers to read/write different PTE bits given a
kvm_mmu.

Drop the 'global' bit definition as it's currently unused, but leave the
'user' bit as it will be used in coming changes. Opportunistically
rename 'large' to 'huge' as it's more consistent with the kernel naming.

Leave PHYSICAL_PAGE_MASK alone, it's fixed in all page table formats and
a lot of other macros depend on it. It's tempting to move all the other
macros to be per-struct instead, but it would be too much noise for
little benefit.

Keep c_bit and s_bit in vm->arch as they are used before the MMU is
initialized, through __vmcreate() -> vm_userspace_mem_region_add() ->
vm_mem_add() -> vm_arch_has_protected_memory().

No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
[sean: rename accessors to is_<adjective>_pte()]
Link: https://patch.msgid.link/20251230230150.4150236-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:10 -08:00
Sean Christopherson
3d0e7595e8 KVM: selftests: Add a "struct kvm_mmu_arch arch" member to kvm_mmu
Add an arch structure+field in "struct kvm_mmu" so that architectures can
track arch-specific information for a given MMU.

No functional change intended.

Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:09 -08:00
Sean Christopherson
11825209f5 KVM: selftests: Plumb "struct kvm_mmu" into x86's MMU APIs
In preparation for generalizing the x86 virt mapping APIs to work with
TDP (stage-2) page tables, plumb "struct kvm_mmu" into all of the helper
functions instead of operating on vm->mmu directly.

Opportunistically swap the order of the check in virt_get_pte() to first
assert that the parent is the PGD, and then check that the PTE is present,
as it makes more sense to check if the parent PTE is the PGD/root (i.e.
not a PTE) before checking that the PTE is PRESENT.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
[sean: rebase on common kvm_mmu structure, rewrite changelog]
Link: https://patch.msgid.link/20251230230150.4150236-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:08 -08:00
Sean Christopherson
9f073ac25b KVM: selftests: Add "struct kvm_mmu" to track a given MMU instance
Add a "struct kvm_mmu" to track a given MMU instance, e.g. a VM's stage-1
MMU versus a VM's stage-2 MMU, so that x86 can share MMU functionality for
both stage-1 and stage-2 MMUs, without creating the potential for subtle
bugs, e.g. due to consuming vm->pgtable_levels when operating on a stage-2
MMU.

Encapsulate the existing de facto MMU in "struct kvm_vm", e.g. instead of
burying the MMU details in "struct kvm_vm_arch", to avoid more #ifdefs in
____vm_create(), and in the hopes that other architectures can utilize the
formalized MMU structure if/when they too support stage-2 page tables.

No functional change intended.

Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:08 -08:00
Yosry Ahmed
60de423781 KVM: selftests: Rename nested TDP mapping functions
Rename the functions from nested_* to tdp_* to make their purpose
clearer.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:06 -08:00
Yosry Ahmed
97dfbdfea4 KVM: selftests: Stop passing a memslot to nested_map_memslot()
On x86, KVM selftests use memslot 0 for all the default regions used by
the test infrastructure. This is an implementation detail.
nested_map_memslot() is currently used to map the default regions by
explicitly passing slot 0, which leaks the library implementation into
the caller.

Rename the function to a very verbose
nested_identity_map_default_memslots() to reflect what it actually does.
Add an assertion that only memslot 0 is being used so that the
implementation does not change from under us.

No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:05 -08:00
Yosry Ahmed
69e81ed5e6 KVM: selftests: Make __vm_get_page_table_entry() static
The function is only used in processor.c, drop the declaration in
processor.h and make it static.

No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:02:04 -08:00
MJ Pooladkhay
7fe9f5366b KVM: selftests: Fix sign extension bug in get_desc64_base()
The function get_desc64_base() performs a series of bitwise left shifts on
fields of various sizes. More specifically, when performing '<< 24' on
'desc->base2' (which is a u8), 'base2' is promoted to a signed integer
before shifting.

In a scenario where base2 >= 0x80, the shift places a 1 into bit 31,
causing the 32-bit intermediate value to become negative. When this
result is cast to uint64_t or ORed into the return value, sign extension
occurs, corrupting the upper 32 bits of the address (base3).

Example:
Given:
  base0 = 0x5000
  base1 = 0xd6
  base2 = 0xf8
  base3 = 0xfffffe7c

Expected return: 0xfffffe7cf8d65000
Actual return:   0xfffffffff8d65000

Fix this by explicitly casting the fields to 'uint64_t' before shifting
to prevent sign extension.
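
The example above can be reproduced with a short sketch. The struct is a simplified stand-in that models only the base fields, and the function names are hypothetical. To avoid relying on a signed shift into the sign bit, the buggy variant emulates the negative `int` intermediate with an explicit conversion, which wraps on common compilers such as GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the descriptor; only the base fields are modeled. */
struct demo_desc64 {
    uint16_t base0;
    uint8_t base1;
    uint8_t base2;
    uint32_t base3;
};

/* Buggy: the base2 term is a negative 32-bit int whenever base2 >= 0x80. */
static uint64_t demo_get_base_buggy(const struct demo_desc64 *d)
{
    /* Emulates the int-typed result of 'd->base2 << 24'. */
    int32_t hi = (int32_t)((uint32_t)d->base2 << 24);

    /* Sign extension happens when the int expression is widened to 64 bits. */
    return ((uint64_t)d->base3 << 32) |
           (uint64_t)(d->base0 | (d->base1 << 16) | hi);
}

/* Fixed: every field is widened to uint64_t before shifting. */
static uint64_t demo_get_base_fixed(const struct demo_desc64 *d)
{
    return ((uint64_t)d->base3 << 32) | ((uint64_t)d->base2 << 24) |
           ((uint64_t)d->base1 << 16) | (uint64_t)d->base0;
}
```

With the field values from the example, the buggy variant yields 0xfffffffff8d65000 and the fixed variant yields 0xfffffe7cf8d65000.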

Signed-off-by: MJ Pooladkhay <mj@pooladkhay.com>
Link: https://patch.msgid.link/20251222174207.107331-1-mj@pooladkhay.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 12:00:56 -08:00
Maciej S. Szmigiero
0b28194c4c KVM: selftests: Test TPR / CR8 sync and interrupt masking
Add a few extra TPR / CR8 tests to x86's xapic_state_test to see if:
  * TPR is 0 on reset,
  * TPR, PPR and CR8 are equal inside the guest,
  * TPR and CR8 read equal by the host after a VMExit,
  * TPR borderline values set by the host correctly mask interrupts in the
    guest.

These hopefully will catch the most obvious cases of improper TPR sync or
interrupt masking.

Do these tests both in x2APIC and xAPIC modes.
The x2APIC mode uses SELF_IPI register to trigger interrupts to give it a
bit of exercise too.
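
For reference, the relationship being tested can be sketched as follows: CR8 holds the upper four bits of the TPR, and a fixed interrupt is held off when its priority class (vector bits 7:4) does not exceed the TPR's class. The helper names are hypothetical, and the masking check assumes no interrupt is in service, so that the PPR equals the TPR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR8[3:0] mirrors TPR[7:4]. */
static uint8_t demo_tpr_to_cr8(uint8_t tpr)
{
    return tpr >> 4;
}

static uint8_t demo_cr8_to_tpr(uint8_t cr8)
{
    return (uint8_t)((cr8 & 0xf) << 4);
}

/* Assumes nothing is in service, i.e. PPR == TPR. */
static bool demo_vector_masked_by_tpr(uint8_t vector, uint8_t tpr)
{
    return (vector >> 4) <= (tpr >> 4);
}
```

The "borderline" values mentioned above correspond to vectors on either side of a priority class boundary, e.g. 0x5f versus 0x60 with a TPR of 0x50.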

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
[sean: put code in separate test]
Link: https://patch.msgid.link/20251205224937.428122-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08 10:50:50 -08:00
Paolo Bonzini
f58e70cc31 Merge tag 'kvmarm-6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.19

 - Support for userspace handling of synchronous external aborts (SEAs),
   allowing the VMM to potentially handle the abort in a non-fatal
   manner.

 - Large rework of the VGIC's list register handling with the goal of
   supporting more active/pending IRQs than available list registers in
   hardware. In addition, the VGIC now supports EOImode==1 style
   deactivations for IRQs which may occur on a separate vCPU than the
   one that acked the IRQ.

 - Support for FEAT_XNX (user / privileged execute permissions) and
   FEAT_HAF (hardware update to the Access Flag) in the software page
   table walkers and shadow MMU.

 - Allow page table destruction to reschedule, fixing long need_resched
   latencies observed when destroying a large VM.

 - Minor fixes to KVM and selftests
2025-12-02 18:36:26 +01:00
Paolo Bonzini
8040280405 Merge tag 'loongarch-kvm-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.19

1. Get VM PMU capability from HW GCFG register.
2. Add AVEC basic support.
3. Use 64-bit register definition for EIOINTC.
4. Add KVM timer test cases for tools/selftests.
2025-12-02 18:34:22 +01:00
Oliver Upton
3eef0c83c3 Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next
* kvm-arm64/nv-xnx-haf: (22 commits)
  : Support for FEAT_XNX and FEAT_HAF in nested
  :
  : Add support for a couple of MMU-related features that weren't
  : implemented by KVM's software page table walk:
  :
  :  - FEAT_XNX: Allows the hypervisor to describe execute permissions
  :    separately for EL0 and EL1
  :
  :  - FEAT_HAF: Hardware update of the Access Flag, which in the context of
  :    nested means software walkers must also set the Access Flag.
  :
  : The series also adds some basic support for testing KVM's emulation of
  : the AT instruction, including the implementation detail that AT sets the
  : Access Flag in KVM.
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2
  ...

Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01 00:47:41 -08:00
Oliver Upton
938309b028 Merge branch 'kvm-arm64/vgic-lr-overflow' into kvmarm/next
* kvm-arm64/vgic-lr-overflow: (50 commits)
  : Support for VGIC LR overflows, courtesy of Marc Zyngier
  :
  : Address deficiencies in KVM's GIC emulation when a vCPU has more active
  : IRQs than can be represented in the VGIC list registers. Sort the AP
  : list to prioritize inactive and pending IRQs, potentially spilling
  : active IRQs outside of the LRs.
  :
  : Handle deactivation of IRQs outside of the LRs for both EOImode=0/1,
  : which involves special consideration for SPIs being deactivated from a
  : different vCPU than the one that acked it.
  KVM: arm64: Convert ICH_HCR_EL2_TDIR cap to EARLY_LOCAL_CPU_FEATURE
  KVM: arm64: selftests: vgic_irq: Add timer deactivation test
  KVM: arm64: selftests: vgic_irq: Add Group-0 enable test
  KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test
  KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order
  KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation
  KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts
  KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt
  KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper
  KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default
  KVM: arm64: selftests: gic_v3: Add irq group setting helper
  KVM: arm64: GICv2: Always trap GICV_DIR register
  KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps
  KVM: arm64: GICv2: Handle LR overflow when EOImode==0
  KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
  KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive
  KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
  KVM: arm64: GICv3: Avoid broadcast kick on CPUs lacking TDIR
  KVM: arm64: GICv3: Handle in-LR deactivation when possible
  KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation
  ...

Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01 00:47:32 -08:00
Oliver Upton
66f1888583 KVM: arm64: selftests: Add test for AT emulation
Add a basic test for AT emulation in the EL2&0 and EL1&0 translation
regimes.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01 00:44:02 -08:00
Bibo Mao
df41742343 KVM: LoongArch: selftests: Add timer interrupt test case
Add a timer test case based on the common arch_timer code. The timer
interrupt is tested in both one-shot and periodic modes.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-28 14:49:44 +08:00
Bibo Mao
d84fe2f30b KVM: LoongArch: selftests: Add exception handler register interface
Add an interface for registering interrupt and exception handlers. When
an exception occurs, execute the registered handler if one exists,
otherwise report an error.
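
The dispatch rule described above (run the registered handler if one exists, otherwise report an error) can be sketched as a plain table of function pointers. This is a minimal illustration, not the selftest's code: every name here (`demo_ex_regs`, `demo_install_handler`, `demo_route_exception`, the table size of 64) is hypothetical, and the register frame is reduced to two fields.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_NR_VECTORS 64	/* hypothetical table size */

/* Hypothetical, heavily reduced register frame handed to handlers. */
struct demo_ex_regs {
	unsigned long pc;
	unsigned long cause;
};

typedef void (*demo_handler_fn)(struct demo_ex_regs *regs);

/* One slot per vector; NULL means "no handler registered". */
static demo_handler_fn demo_handlers[DEMO_NR_VECTORS];

/* Register (or, with fn == NULL, clear) the handler for one vector. */
static int demo_install_handler(unsigned int vector, demo_handler_fn fn)
{
	if (vector >= DEMO_NR_VECTORS)
		return -1;
	demo_handlers[vector] = fn;
	return 0;
}

/* Run the registered handler if there is one, else report an error. */
static int demo_route_exception(unsigned int vector, struct demo_ex_regs *regs)
{
	if (vector < DEMO_NR_VECTORS && demo_handlers[vector]) {
		demo_handlers[vector](regs);
		return 0;
	}
	fprintf(stderr, "unhandled exception, vector %u\n", vector);
	return -1;
}

/* Example handler: step over a faulting 4-byte instruction. */
static void demo_skip_insn(struct demo_ex_regs *regs)
{
	regs->pc += 4;
}
```

A test would install `demo_skip_insn` for the vector it expects to hit and leave every other slot empty, so that an unexpected exception is reported rather than silently ignored.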

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-27 11:00:18 +08:00
Bibo Mao
1c5d3a1eab KVM: LoongArch: selftests: Add basic interfaces
Add basic function interfaces such as CSR register accessors and APIs to
enable or disable local IRQs.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-27 11:00:18 +08:00
Bibo Mao
985a96983b KVM: LoongArch: selftests: Add system registers save/restore on exception
When the system returns from an exception via the ertn instruction, the
PC comes from LOONGARCH_CSR_ERA and CSR.CRMD comes from LOONGARCH_CSR_PRMD.

Save CSR.ERA and CSR.PRMD to the stack on exception entry and restore
them from the stack on exit, so that exception handlers can modify them
in the future.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-27 11:00:18 +08:00
Paolo Bonzini
b0bf3d67a7 Merge tag 'kvm-x86-selftests-6.19' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.19:

 - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM.

 - Forcefully override ARCH from x86_64 to x86 to play nice with specifying
   ARCH=x86_64 on the command line.

 - Extend a bunch of nested VMX tests to validate nested SVM as well.

 - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to
   verify KVM can save/restore nested VMX state when L1 is using 5-level
   paging, but L2 is not.

 - Clean up the guest paging code in anticipation of sharing the core logic for
   nested EPT and nested NPT.
2025-11-26 09:35:40 +01:00
Marc Zyngier
a1650de7c1 KVM: arm64: selftests: gic_v3: Add irq group setting helper
Being able to set the group of an interrupt is pretty useful.
Add such a helper.

Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://msgid.link/20251120172540.2267180-41-maz@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24 14:29:14 -08:00
Yosry Ahmed
d2e50389ab KVM: selftests: Make sure vm->vpages_mapped is always up-to-date
Call paths leading to __virt_pg_map() are currently:
(a) virt_pg_map() -> virt_arch_pg_map() -> __virt_pg_map()
(b) virt_map_level() -> __virt_pg_map()

For (a), calls to virt_pg_map() from kvm_util.c make sure they update
vm->vpages_mapped, but other callers do not. Move the sparsebit_set()
call into virt_pg_map() to make sure all callers are captured.

For (b), call sparsebit_set_num() from virt_map_level().

It's tempting to have a single call inside __virt_pg_map(), however:
- The call path in (a) is not x86-specific, while (b) is. Moving the
  call into __virt_pg_map() would require doing something similar for
  other archs implementing virt_pg_map().

- Future changes will reuse __virt_pg_map() for nested PTEs, which should
  not update vm->vpages_mapped, i.e. a triple underscore version that does
  not update vm->vpages_mapped would need to be provided.
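
The split described above, where the public entry points do the bookkeeping and the low-level mapper stays free of it, can be sketched with a toy model. This is only an illustration under simplifying assumptions: `demo_vm` tracks at most 64 pages in two plain bitmasks instead of page tables and a sparsebit, and all `demo_*` names are hypothetical.

```c
#include <stdint.h>

/* Toy VM: bit N of each mask describes virtual page N (N < 64). */
struct demo_vm {
	uint64_t pte_present;	/* page N has a PTE installed */
	uint64_t vpages_mapped;	/* page N is recorded as in use */
};

/* Low-level mapper: installs the PTE only, no bookkeeping. */
static void demo_low_level_map(struct demo_vm *vm, unsigned int page)
{
	vm->pte_present |= UINT64_C(1) << page;
}

/* Path (a): the single-page entry point records the page itself. */
static void demo_map_page(struct demo_vm *vm, unsigned int page)
{
	demo_low_level_map(vm, page);
	vm->vpages_mapped |= UINT64_C(1) << page;
}

/* Path (b): the range entry point records every page it maps. */
static void demo_map_range(struct demo_vm *vm, unsigned int first,
			   unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		demo_low_level_map(vm, first + i);
		vm->vpages_mapped |= UINT64_C(1) << (first + i);
	}
}
```

Because `demo_low_level_map()` never touches `vpages_mapped`, a later caller that must not be tracked (the nested-PTE case mentioned above) can use it directly without needing yet another variant.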

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251021074736.1324328-12-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-21 10:17:05 -08:00
Jim Mattson
ec5806639e KVM: selftests: Change VM_MODE_PXXV48_4K to VM_MODE_PXXVYY_4K
Use 57-bit addresses with 5-level paging on hardware that supports
LA57. Continue to use 48-bit addresses with 4-level paging on hardware
that doesn't support LA57.
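
The 48-bit and 57-bit figures follow from x86-64 paging arithmetic: a 4 KiB page contributes 12 offset bits and each paging level indexes 9 more. A small sketch of that calculation, with hypothetical `demo_*` names and the LA57 check reduced to a boolean parameter (a real test would probe the CPU):

```c
#include <stdbool.h>

#define DEMO_PAGE_SHIFT		12	/* 4 KiB pages */
#define DEMO_BITS_PER_LEVEL	9	/* 512 entries per table */

/* Virtual address width for a given number of paging levels. */
static unsigned int demo_va_bits(unsigned int levels)
{
	return DEMO_PAGE_SHIFT + DEMO_BITS_PER_LEVEL * levels;
}

/* Use 5-level paging when LA57 is available, else 4-level. */
static unsigned int demo_pick_va_bits(bool has_la57)
{
	return demo_va_bits(has_la57 ? 5 : 4);
}
```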

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Link: https://patch.msgid.link/20251028225827.2269128-4-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20 16:19:59 -08:00
Yosry Ahmed
ff736dba47 KVM: selftests: Remove the unused argument to prepare_eptp()
eptp_memslot is unused, remove it. No functional change intended.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251021074736.1324328-10-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20 16:19:57 -08:00
Maximilian Dittgen
85f329df29 KVM: selftests: SYNC after guest ITS setup in vgic_lpi_stress
vgic_lpi_stress sends MAPTI and MAPC commands during guest GIC setup to
map interrupt events to ITT entries and collection IDs to
redistributors, respectively.

We have no guarantee that the ITS will finish handling these mapping
commands before the selftest calls KVM_SIGNAL_MSI to inject LPIs to the
guest. If LPIs are injected before ITS mapping completes, the ITS cannot
properly pass the interrupt on to the redistributor.

Fix by adding a SYNC command to the selftests ITS library, then calling
SYNC after ITS mapping to ensure mapping completes before signal_lpi()
writes to GITS_TRANSLATER.

Signed-off-by: Maximilian Dittgen <mdittgen@amazon.de>
Link: https://msgid.link/20251119135744.68552-2-mdittgen@amazon.de
Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-19 12:38:59 -08:00
Sean Christopherson
83e0e12219 KVM: selftests: Rename "guest_paddr" variables to "gpa"
Rename "guest_paddr" variables in vm_userspace_mem_region_add() and
vm_mem_add() to KVM's de facto standard "gpa", both for consistency and
to shorten line lengths.

Opportunistically fix the indentation of the
vm_userspace_mem_region_add() declaration.

Link: https://patch.msgid.link/20251007223625.369939-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-03 12:54:21 -08:00
Shivank Garg
e698e89b3e KVM: selftests: Add helpers to probe for NUMA support, and multi-node systems
Add NUMA helpers to probe for support/availability and to check if the
test is running on a multi-node system.  The APIs will be used to verify
guest_memfd NUMA support.

Signed-off-by: Shivank Garg <shivankg@amd.com>
[sean: land helpers in numaif.h, add comments, tweak names]
Link: https://lore.kernel.org/r/20251016172853.52451-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-10-20 06:30:44 -07:00
Sean Christopherson
fe7baebb99 KVM: selftests: Use proper uAPI headers to pick up mempolicy.h definitions
Drop KVM's re-definitions of MPOL_xxx flags in numaif.h as they are
defined by the already-included, kernel-provided mempolicy.h.  The only
reason the duplicate definitions don't cause compiler warnings is because
they are identical, but only on x86-64!  The syscall numbers in particular
are subtly x86_64-specific, i.e. will cause problems if/when numaif.h is
used outside of x86.
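
The reason identical duplicates go unnoticed is a C preprocessor rule: a macro may be defined again as long as the replacement is the same. A minimal sketch, using a made-up `DEMO_MPOL_F_NODE` macro rather than the real uAPI names:

```c
/* Stand-in for the definition provided by the kernel's uAPI header. */
#define DEMO_MPOL_F_NODE	(1 << 0)

/*
 * Stand-in for the local copy.  Identical replacement text, so the
 * preprocessor accepts it without a diagnostic.  Changing the value
 * (e.g. to (1 << 1)) would make compilers report a redefinition.
 */
#define DEMO_MPOL_F_NODE	(1 << 0)

static int demo_flag_value(void)
{
	return DEMO_MPOL_F_NODE;
}
```

The duplicate only stays silent while both copies agree, which is exactly why values that differ between architectures turn a harmless copy into a latent bug.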

Opportunistically clean up the file comment as the license information is
covered by the SPDX header, the path is superfluous, and as above the
comment about the contents is flat out wrong.

Fixes: 346b59f220 ("KVM: selftests: Add missing header file needed by xAPIC IPI tests")
Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>
Link: https://lore.kernel.org/r/20251016172853.52451-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-10-20 06:30:44 -07:00
Sean Christopherson
2189d78269 KVM: selftests: Add additional equivalents to libnuma APIs in KVM's numaif.h
Add APIs for all syscalls defined in the kernel's mm/mempolicy.c to match
those that would be provided by linking to libnuma.  Opportunistically use
the recently introduced KVM_SYSCALL_DEFINE() builders to take care of the
boilerplate, and to fix a flaw where the two existing wrappers would
generate multiple symbols if numaif.h were to be included multiple times.
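
As a rough illustration of what a syscall-wrapper builder does, the macro below stamps out a `static inline` function around `syscall(2)`. It is a sketch under stated assumptions, not the selftests' `KVM_SYSCALL_DEFINE()`: `DEMO_SYSCALL_DEFINE0` is a hypothetical zero-argument builder, it assumes Linux with glibc, and it wraps `getpid`/`getppid` instead of the mempolicy syscalls so that it runs anywhere.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical builder: expands to a static inline wrapper named
 * demo_<name>() that invokes the raw syscall SYS_<name>.
 */
#define DEMO_SYSCALL_DEFINE0(name)		\
static inline long demo_##name(void)		\
{						\
	return syscall(SYS_##name);		\
}

DEMO_SYSCALL_DEFINE0(getpid)
DEMO_SYSCALL_DEFINE0(getppid)
```

Giving the generated functions internal linkage via `static inline` is one common way to keep header-defined wrappers from producing clashing external symbols. Whether the real builders take exactly this approach is not stated in the message above.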

Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>
Link: https://lore.kernel.org/r/20251016172853.52451-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-10-20 06:30:43 -07:00