linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-19 07:13:56 -04:00

Author	SHA1	Message	Date
Paolo Bonzini	b1195183ed	Merge tag 'kvm-s390-next-7.0-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - gmap rewrite: completely new memory management for kvm/s390 - vSIE improvement - maintainership change for s390 vfio-pci - small quality of life improvement for protected guests	2026-02-11 18:52:27 +01:00
Paolo Bonzini	1b13885edf	Merge tag 'kvm-x86-apic-6.20' of https://github.com/kvm-x86/linux into HEAD KVM x86 APIC-ish changes for 6.20 - Fix a benign bug where KVM could use the wrong memslots (ignored SMM) when creating a vCPU-specific mapping of guest memory. - Clean up KVM's handling of marking mapped vCPU pages dirty. - Drop a pile of ancient sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless. - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow burying the RTC shenanigans behind CONFIG_KVM_IOAPIC=y. - Bury all of ioapic.h and KVM_IRQCHIP_KERNEL behind CONFIG_KVM_IOAPIC=y. - Add a regression test for recent APICv update fixes. - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans. - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information. - Drop a dead function declaration.	2026-02-11 12:45:32 -05:00
Paolo Bonzini	4215ee0d7b	Merge tag 'kvm-x86-svm-6.20' of https://github.com/kvm-x86/linux into HEAD KVM SVM changes for 6.20 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups.	2026-02-09 18:51:37 +01:00
Paolo Bonzini	a0c468eda4	Merge tag 'kvm-x86-selftests-6.20' of https://github.com/kvm-x86/linux into HEAD KVM selftests changes for 6.20 - Add a regression test for TPR<=>CR8 synchronization and IRQ masking. - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests). - Extend several nested VMX tests to also cover nested SVM. - Add a selftest for nested VMLOAD/VMSAVE. - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain.	2026-02-09 18:38:54 +01:00
Paolo Bonzini	5490063269	Merge tag 'kvmarm-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.0 - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes.	2026-02-09 18:18:19 +01:00
Paolo Bonzini	c14f646638	Merge tag 'loongarch-kvm-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.20 1. Add more CPUCFG mask bits. 2. Improve feature detection. 3. Add FPU/LBT delay load support. 4. Set default return value in KVM IO bus ops. 5. Add paravirt preempt feature support. 6. Add KVM steal time test case for tools/selftests.	2026-02-09 18:17:01 +01:00
Bibo Mao	2d94a3f708	KVM: LoongArch: selftests: Add steal time test case LoongArch KVM supports steal time accounting now, here add steal time test case on LoongArch. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-06 09:28:01 +08:00
Claudio Imbrenda	52940a34a8	KVM: s390: selftests: Add selftest for the KVM_S390_KEYOP ioctl This test allows to test the various storage key handling functions. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2026-02-04 17:00:10 +01:00
Zhiquan Li	e396a74222	KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures Some distributions (such as Ubuntu) configure GCC so that _FORTIFY_SOURCE is automatically enabled at -O1 or above. This results in some fortified version of definitions of standard library functions are included. While linker resolves the symbols, the fortified versions might override the definitions in lib/string_override.c and reference to those PLT entries in GLIBC. This is not a problem for the code in host, but it is a disaster for the guest code. E.g., if build and run x86/nested_emulation_test on Ubuntu 24.04 will encounter a L1 #PF due to memset() reference to __memset_chk@plt. The option -fno-builtin-memset is not helpful here, because those fortified versions are not built-in but some definitions which are included by header, they are for different intentions. In order to eliminate the unpredictable behaviors may vary depending on the linker and platform, add the "-U_FORTIFY_SOURCE" into CFLAGS to prevent from introducing the fortified definitions. Signed-off-by: Zhiquan Li <zhiquan_li@163.com> Link: https://patch.msgid.link/20260122053551.548229-1-zhiquan_li@163.com Fixes: `6b6f71484b` ("KVM: selftests: Implement memcmp(), memcpy(), and memset() for guest use") Cc: stable@vger.kernel.org [sean: tag for stable] Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-23 08:38:31 -08:00
Marc Zyngier	b638a9d0f8	KVM: arm64: selftests: Add a test for FEAT_IDST Add a very basic test checking that FEAT_IDST actually works for the {GMID,SMIDR,CSSIDR2}_EL1 registers. Link: https://patch.msgid.link/20260108173233.2911955-10-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-01-15 11:58:57 +00:00
Yosry Ahmed	55058e3215	KVM: selftests: Add a selftests for nested VMLOAD/VMSAVE Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1 intercepts for VMSAVE/VMLOAD always work regardless of VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK. Then, more interestingly, it makes sure that when L1 does not intercept VMLOAD/VMSAVE, they work as intended in L2. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from L2 should interpret the GPA as an L2 GPA and translate it through the NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1, VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA. To test this, put two VMCBs (0 and 1) in L1's physical address space, and have a single L2 GPA where: - L2 VMCB GPA == L1 VMCB(0) GPA - L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1. This setup allows detecting how the GPA is interpreted based on which L1 VMCB is actually accessed. In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The test userspace code then checks that the write was made to the correct VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1), and writes a new value to that VMCB. L2 then executes VMLOAD to load the new value and makes sure it's reflected correctly in KERNERL_GS_BASE. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-14 14:09:10 -08:00
Sean Christopherson	c3a9a27c79	KVM: selftests: Add a test to verify APICv updates (while L2 is active) Add a test to verify KVM correctly handles a variety of edge cases related to APICv updates, and in particular updates that are triggered while L2 is actively running. Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-13 17:35:31 -08:00
Yosry Ahmed	ca2eccb953	KVM: selftests: Extend vmx_set_nested_state_test to cover SVM Add test cases for the validation checks in svm_set_nested_state(), and allow the test to run with SVM as well as VMX. The SVM test also makes sure that KVM_SET_NESTED_STATE accepts GIF being set or cleared if EFER.SVME is cleared, verifying a recently fixed bug where GIF was incorrectly expected to always be set when EFER.SVME is cleared. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251121204803.991707-5-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-08 12:54:19 -08:00
Yosry Ahmed	6794d916f8	KVM: selftests: Extend vmx_dirty_log_test to cover SVM Generalize the code in vmx_dirty_log_test.c by adding SVM-specific L1 code, doing some renaming (e.g. EPT -> TDP), and having setup code for both SVM and VMX in test_dirty_log(). Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-08 12:02:16 -08:00
Maciej S. Szmigiero	0b28194c4c	KVM: selftests: Test TPR / CR8 sync and interrupt masking Add a few extra TPR / CR8 tests to x86's xapic_state_test to see if: * TPR is 0 on reset, * TPR, PPR and CR8 are equal inside the guest, * TPR and CR8 read equal by the host after a VMExit * TPR borderline values set by the host correctly mask interrupts in the guest. These hopefully will catch the most obvious cases of improper TPR sync or interrupt masking. Do these tests both in x2APIC and xAPIC modes. The x2APIC mode uses SELF_IPI register to trigger interrupts to give it a bit of exercise too. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Acked-by: Naveen N Rao (AMD) <naveen@kernel.org> [sean: put code in separate test] Link: https://patch.msgid.link/20251205224937.428122-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-01-08 10:50:50 -08:00
Paolo Bonzini	e0c26d47de	Merge tag 'kvm-s390-next-6.19-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - SCA rework - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups	2025-12-02 18:58:47 +01:00
Paolo Bonzini	f58e70cc31	Merge tag 'kvmarm-6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.19 - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner. - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ. - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU. - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM. - Minor fixes to KVM and selftests	2025-12-02 18:36:26 +01:00
Paolo Bonzini	8040280405	Merge tag 'loongarch-kvm-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.19 1. Get VM PMU capability from HW GCFG register. 2. Add AVEC basic support. 3. Use 64-bit register definition for EIOINTC. 4. Add KVM timer test cases for tools/selftests.	2025-12-02 18:34:22 +01:00
Oliver Upton	3eef0c83c3	Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next * kvm-arm64/nv-xnx-haf: (22 commits) : Support for FEAT_XNX and FEAT_HAF in nested : : Add support for a couple of MMU-related features that weren't : implemented by KVM's software page table walk: : : - FEAT_XNX: Allows the hypervisor to describe execute permissions : separately for EL0 and EL1 : : - FEAT_HAF: Hardware update of the Access Flag, which in the context of : nested means software walkers must also set the Access Flag. : : The series also adds some basic support for testing KVM's emulation of : the AT instruction, including the implementation detail that AT sets the : Access Flag in KVM. KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2 ... Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:47:41 -08:00
Oliver Upton	66f1888583	KVM: arm64: selftests: Add test for AT emulation Add a basic test for AT emulation in the EL2&0 and EL1&0 translation regimes. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:44:02 -08:00
Bibo Mao	df41742343	KVM: LoongArch: selftests: Add timer interrupt test case Add timer test case based on common arch_timer code, timer interrupt with one-shot and period mode is tested. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2025-11-28 14:49:44 +08:00
Janosch Frank	8e8678e740	KVM: s390: Add capability that forwards operation exceptions Setting KVM_CAP_S390_USER_OPEREXEC will forward all operation exceptions to user space. This also includes the 0x0000 instructions managed by KVM_CAP_S390_USER_INSTR0. It's helpful if user space wants to emulate instructions which do not (yet) have an opcode. While we're at it refine the documentation for KVM_CAP_S390_USER_INSTR0. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>	2025-11-21 10:26:03 +01:00
Jim Mattson	6a8818de21	KVM: selftests: Add a VMX test for LA57 nested state Add a selftest that verifies KVM's ability to save and restore nested state when the L1 guest is using 5-level paging and the L2 guest is using 4-level paging. Specifically, canonicality tests of the VMCS12 host-state fields should accept 57-bit virtual addresses. Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://patch.msgid.link/20251028225827.2269128-5-jmattson@google.com [sean: rename to vmx_nested_la57_state_test to prep nested_<test> namespace] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-20 16:21:52 -08:00
Yosry Ahmed	3c40777f0e	KVM: selftests: Extend vmx_tsc_adjust_test to cover SVM Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-8-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-20 16:19:56 -08:00
Yosry Ahmed	4d256d00e4	KVM: selftests: Move nested invalid CR3 check to its own test vmx_tsc_adjust_test currently verifies that a nested VMLAUNCH fails with an invalid CR3. This is irrelevant to TSC scaling, move it to a standalone test. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-6-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-20 16:19:54 -08:00
Yosry Ahmed	e6bcdd2122	KVM: selftests: Extend vmx_nested_tsc_scaling_test to cover SVM Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-5-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-20 16:19:54 -08:00
Yosry Ahmed	0a9eb2afa1	KVM: selftests: Extend vmx_close_while_nested_test to cover SVM Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-4-yosry.ahmed@linux.dev [sean: rename to "nested_close_kvm_test" to provide nested_* sorting] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-20 16:19:53 -08:00
Jiaqi Yan	feee9ef7ac	KVM: selftests: Test for KVM_EXIT_ARM_SEA Test how KVM handles guest SEA when APEI is unable to claim it, and KVM_CAP_ARM_SEA_TO_USER is enabled. The behavior is triggered by consuming recoverable memory error (UER) injected via EINJ. The test asserts two major things: 1. KVM returns to userspace with KVM_EXIT_ARM_SEA exit reason, and has provided expected fault information, e.g. esr, flags, gva, gpa. 2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting SEA to guest and KVM injects expected SEA into the VCPU. Tested on a data center server running Siryn AmpereOne processor that has RAS support. Several things to notice before attempting to run this selftest: - The test relies on EINJ support in both firmware and kernel to inject UER. Otherwise the test will be skipped. - The under-test platform's APEI should be unable to claim the SEA. Otherwise the test will be skipped. - Some platform doesn't support notrigger in EINJ, which may cause APEI and GHES to offline the memory before guest can consume injected UER, and making test unable to trigger SEA. Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Link: https://msgid.link/20251013185903.1372553-3-jiaqiyan@google.com Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-11-12 01:27:16 -08:00
Paolo Bonzini	12abeb81c8	Merge tag 'kvm-x86-cet-6.18' of https://github.com/kvm-x86/linux into HEAD KVM x86 CET virtualization support for 6.18 Add support for virtualizing Control-flow Enforcement Technology (CET) on Intel (Shadow Stacks and Indirect Branch Tracking) and AMD (Shadow Stacks). CET is comprised of two distinct features, Shadow Stacks (SHSTK) and Indirect Branch Tracking (IBT), that can be utilized by software to help provide Control-flow integrity (CFI). SHSTK defends against backward-edge attacks (a.k.a. Return-oriented programming (ROP)), while IBT defends against forward-edge attacks (a.k.a. similarly CALL/JMP-oriented programming (COP/JOP)). Attackers commonly use ROP and COP/JOP methodologies to redirect the control- flow to unauthorized targets in order to execute small snippets of code, a.k.a. gadgets, of the attackers choice. By chaining together several gadgets, an attacker can perform arbitrary operations and circumvent the system's defenses. SHSTK defends against backward-edge attacks, which execute gadgets by modifying the stack to branch to the attacker's target via RET, by providing a second stack that is used exclusively to track control transfer operations. The shadow stack is separate from the data/normal stack, and can be enabled independently in user and kernel mode. When SHSTK is is enabled, CALL instructions push the return address on both the data and shadow stack. RET then pops the return address from both stacks and compares the addresses. If the return addresses from the two stacks do not match, the CPU generates a Control Protection (#CP) exception. IBT defends against backward-edge attacks, which branch to gadgets by executing indirect CALL and JMP instructions with attacker controlled register or memory state, by requiring the target of indirect branches to start with a special marker instruction, ENDBRANCH. If an indirect branch is executed and the next instruction is not an ENDBRANCH, the CPU generates a #CP. Note, ENDBRANCH behaves as a NOP if IBT is disabled or unsupported. From a virtualization perspective, CET presents several problems. While SHSTK and IBT have two layers of enabling, a global control in the form of a CR4 bit, and a per-feature control in user and kernel (supervisor) MSRs (U_CET and S_CET respectively), the {S,U}_CET MSRs can be context switched via XSAVES/XRSTORS. Practically speaking, intercepting and emulating XSAVES/XRSTORS is not a viable option due to complexity, and outright disallowing use of XSTATE to context switch SHSTK/IBT state would render the features unusable to most guests. To limit the overall complexity without sacrificing performance or usability, simply ignore the potential virtualization hole, but ensure that all paths in KVM treat SHSTK/IBT as usable by the guest if the feature is supported in hardware, and the guest has access to at least one of SHSTK or IBT. I.e. allow userspace to advertise one of SHSTK or IBT if both are supported in hardware, even though doing so would allow a misbehaving guest to use the unadvertised feature. Fully emulating SHSTK and IBT would also require significant complexity, e.g. to track and update branch state for IBT, and shadow stack state for SHSTK. Given that emulating large swaths of the guest code stream isn't necessary on modern CPUs, punt on emulating instructions that meaningful impact or consume SHSTK or IBT. However, instead of doing nothing, explicitly reject emulation of such instructions so that KVM's emulator can't be abused to circumvent CET. Disable support for SHSTK and IBT if KVM is configured such that emulation of arbitrary guest instructions may be required, specifically if Unrestricted Guest (Intel only) is disabled, or if KVM will emulate a guest.MAXPHYADDR that is smaller than host.MAXPHYADDR. Lastly disable SHSTK support if shadow paging is enabled, as the protections for the shadow stack are novel (shadow stacks require Writable=0,Dirty=1, so that they can't be directly modified by software), i.e. would require non-trivial support in the Shadow MMU. Note, AMD CPUs currently only support SHSTK. Explicitly disable IBT support so that KVM doesn't over-advertise if AMD CPUs add IBT, and virtualizing IBT in SVM requires KVM modifications.	2025-09-30 13:37:14 -04:00
Paolo Bonzini	924ccf1d09	Merge tag 'kvm-riscv-6.18-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.18 - Added SBI FWFT extension for Guest/VM with misaligned delegation and pointer masking PMLEN features - Added ONE_REG interface for SBI FWFT extension - Added Zicbop and bfloat16 extensions for Guest/VM - Enabled more common KVM selftests for RISC-V such as access_tracking_perf_test, dirty_log_perf_test, memslot_modification_stress_test, memslot_perf_test, mmu_stress_test, and rseq_test - Added SBI v3.0 PMU enhancements in KVM and perf driver	2025-09-30 13:23:36 -04:00
Paolo Bonzini	924ebaefce	Merge tag 'kvmarm-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.18 - Add support for FF-A 1.2 as the secure memory conduit for pKVM, allowing more registers to be used as part of the message payload. - Change the way pKVM allocates its VM handles, making sure that the privileged hypervisor is never tricked into using uninitialised data. - Speed up MMIO range registration by avoiding unnecessary RCU synchronisation, which results in VMs starting much quicker. - Add the dump of the instruction stream when panic-ing in the EL2 payload, just like the rest of the kernel has always done. This will hopefully help debugging non-VHE setups. - Add 52bit PA support to the stage-1 page-table walker, and make use of it to populate the fault level reported to the guest on failing to translate a stage-1 walk. - Add NV support to the GICv3-on-GICv5 emulation code, ensuring feature parity for guests, irrespective of the host platform. - Fix some really ugly architecture problems when dealing with debug in a nested VM. This has some bad performance impacts, but is at least correct. - Add enough infrastructure to be able to disable EL2 features and give effective values to the EL2 control registers. This then allows a bunch of features to be turned off, which helps cross-host migration. - Large rework of the selftest infrastructure to allow most tests to transparently run at EL2. This is the first step towards enabling NV testing. - Various fixes and improvements all over the map, including one BE fix, just in time for the removal of the feature.	2025-09-30 13:23:28 -04:00
Paolo Bonzini	8cbb0df294	Merge tag 'kvmarm-fixes-6.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.17, round #3 - Invalidate nested MMUs upon freeing the PGD to avoid WARNs when visiting from an MMU notifier - Fixes to the TLB match process and TLB invalidation range for managing the VCNR pseudo-TLB - Prevent SPE from erroneously profiling guests due to UNKNOWN reset values in PMSCR_EL1 - Fix save/restore of host MDCR_EL2 to account for eagerly programming at vcpu_load() on VHE systems - Correct lock ordering when dealing with VGIC LPIs, avoiding scenarios where an xarray's spinlock was nested with a raw spinlock - Permit stage-2 read permission aborts which are possible in the case of NV depending on the guest hypervisor's stage-2 translation - Call raw_spin_unlock() instead of the internal spinlock API - Fix parameter ordering when assigning VBAR_EL1 [Pull into kvm/master to fix conflicts. - Paolo]	2025-09-30 13:23:06 -04:00
Oliver Upton	f677b0efa9	KVM: arm64: selftests: Add basic test for running in VHE EL2 Add an embarrassingly simple selftest for sanity checking KVM's VHE EL2 and test that the ID register bits are consistent with HCR_EL2.E2H being RES1. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-09-24 19:23:32 +01:00
Sean Christopherson	9c38ddb3df	KVM: selftests: Add an MSR test to exercise guest/host and read/write Add a selftest to verify reads and writes to various MSRs, from both the guest and host, and expect success/failure based on whether or not the vCPU supports the MSR according to supported CPUID. Note, this test is extremely similar to KVM-Unit-Test's "msr" test, but provides more coverage with respect to host accesses, and will be extended to provide addition testing of CPUID-based features, save/restore lists, and KVM_{G,S}ET_ONE_REG, all which are extremely difficult to validate in KUT. If kvm.ignore_msrs=true, skip the unsupported and reserved testcases as KVM's ABI is a mess; what exactly is supposed to be ignored, and when, varies wildly. Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20250919223258.1604852-46-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 09:51:56 -07:00
Quan Zhou	dbe3d1d160	KVM: riscv: selftests: Add common supported test cases Some common KVM test cases are supported on riscv now as following: access_tracking_perf_test dirty_log_perf_test memslot_modification_stress_test memslot_perf_test mmu_stress_test rseq_test Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Signed-off-by: Dong Yang <dayss1224@gmail.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/c447f18115b27562cd65863645e41a5ef89bd37b.1756710918.git.dayss1224@gmail.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-09-16 10:53:58 +05:30
Paolo Bonzini	42a0305ab1	Merge tag 'kvmarm-fixes-6.17-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.17, take #2 - Correctly handle 'invariant' system registers for protected VMs - Improved handling of VNCR data aborts, including external aborts - Fixes for handling of FEAT_RAS for NV guests, providing a sane fault context during SEA injection and preventing the use of RASv1p1 fault injection hardware - Ensure that page table destruction when a VM is destroyed gives an opportunity to reschedule - Large fix to KVM's infrastructure for managing guest context loaded on the CPU, addressing issues where the output of AT emulation doesn't get reflected to the guest - Fix AT S12 emulation to actually perform stage-2 translation when necessary - Avoid attempting vLPI irqbypass when GICv4 has been explicitly disabled for a VM - Minor KVM + selftest fixes	2025-08-29 12:57:31 -04:00
Fuad Tabba	692f6ecf38	KVM: selftests: Do not use hardcoded page sizes in guest_memfd test Update the guest_memfd_test selftest to use getpagesize() instead of hardcoded 4KB page size values. Using hardcoded page sizes can cause test failures on architectures or systems configured with larger page sizes, such as arm64 with 64KB pages. By dynamically querying the system's page size, the test becomes more portable and robust across different environments. Additionally, build the guest_memfd_test selftest for arm64. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shivank Garg <shivankg@amd.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Suggested-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250729225455.670324-23-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-08-27 04:37:03 -04:00
Marc Zyngier	85acc29f90	KVM: arm64: selftest: Add standalone test checking for KVM's own UUID Tinkering with UUIDs is a perilious task, and the KVM UUID gets broken at times. In order to spot this early enough, add a selftest that will shout if the expected value isn't found. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20250721130558.50823-1-jackabt.amazon@gmail.com Link: https://lore.kernel.org/r/20250806171341.1521210-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-08-08 01:30:16 -07:00
Paolo Bonzini	314b40b3b6	Merge tag 'kvmarm-6.17' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.17, round #1 - Host driver for GICv5, the next generation interrupt controller for arm64, including support for interrupt routing, MSIs, interrupt translation and wired interrupts. - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on GICv5 hardware, leveraging the legacy VGIC interface. - Userspace control of the 'nASSGIcap' GICv3 feature, allowing userspace to disable support for SGIs w/o an active state on hardware that previously advertised it unconditionally. - Map supporting endpoints with cacheable memory attributes on systems with FEAT_S2FWB and DIC where KVM no longer needs to perform cache maintenance on the address range. - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest hypervisor to inject external aborts into an L2 VM and take traps of masked external aborts to the hypervisor. - Convert more system register sanitization to the config-driven implementation. - Fixes to the visibility of EL2 registers, namely making VGICv3 system registers accessible through the VGIC device instead of the ONE_REG vCPU ioctls. - Various cleanups and minor fixes.	2025-07-29 12:27:40 -04:00
Paolo Bonzini	1a14928e2e	Merge tag 'kvm-x86-misc-6.17' of https://github.com/kvm-x86/linux into HEAD KVM x86 misc changes for 6.17 - Prevert the host's DEBUGCTL.FREEZE_IN_SMM (Intel only) when running the guest. Failure to honor FREEZE_IN_SMM can bleed host state into the guest. - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter (Intel only) to prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF. - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the vCPU's CPUID model. - Rework the MSR interception code so that the SVM and VMX APIs are more or less identical. - Recalculate all MSR intercepts from the "source" on MSR filter changes, and drop the dedicated "shadow" bitmaps (and their awful "max" size defines). - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the nested SVM MSRPM offsets tracker can't handle an MSR. - Advertise support for LKGS (Load Kernel GS base), a new instruction that's loosely related to FRED, but is supported and enumerated independently. - Fix a user-triggerable WARN that syzkaller found by stuffing INIT_RECEIVED, a.k.a. WFS, and then putting the vCPU into VMX Root Mode (post-VMXON). Use the same approach KVM uses for dealing with "impossible" emulation when running a !URG guest, and simply wait until KVM_RUN to detect that the vCPU has architecturally impossible state. - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of APERF/MPERF reads, so that a "properly" configured VM can "virtualize" APERF/MPERF (with many caveats). - Reject KVM_SET_TSC_KHZ if vCPUs have been created, as changing the "default" frequency is unsupported for VMs with a "secure" TSC, and there's no known use case for changing the default frequency for other VM types.	2025-07-29 08:36:43 -04:00
Jim Mattson	df98ce784a	KVM: selftests: Test behavior of KVM_X86_DISABLE_EXITS_APERFMPERF For a VCPU thread pinned to a single LPU, verify that interleaved host and guest reads of IA32_[AM]PERF return strictly increasing values when APERFMPERF exiting is disabled. Run the test in both L1 and L2 to verify that KVM passes through the APERF and MPERF MSRs when L1 doesn't want to intercept them (or any MSRs). Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20250530185239.2335185-4-jmattson@google.com Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250626001225.744268-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-07-09 09:33:41 -07:00
Oliver Upton	2858ea3083	KVM: arm64: selftests: Add basic SError injection test Add tests for SError injection considering KVM is more directly involved in delivery: - Pending SErrors are taken at the first CSE after SErrors are unmasked - Pending SErrors aren't taken and remain pending if SErrors are masked - Unmasked SErrors are taken immediately when injected (implementation detail) Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250708172532.1699409-25-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-07-08 11:36:36 -07:00
Sean Christopherson	7e9b231c40	KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements Add a selftest to verify that eventfd+irqfd bindings are globally unique, i.e. that KVM doesn't allow multiple irqfds to bind to a single eventfd, even across VMs. Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250522235223.3178519-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-06-23 09:51:01 -07:00
Paolo Bonzini	4e02d4f973	Merge tag 'kvm-x86-svm-6.16' of https://github.com/kvm-x86/linux into HEAD KVM SVM changes for 6.16: - Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to fix a race between AP destroy and VMRUN. - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM. - Add support for ALLOWED_SEV_FEATURES. - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y. - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features that utilize those bits. - Don't account temporary allocations in sev_send_update_data(). - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.	2025-05-27 12:15:49 -04:00
Paolo Bonzini	3e0797f6dd	Merge tag 'kvm-x86-selftests-6.16' of https://github.com/kvm-x86/linux into HEAD KVM selftests changes for 6.16: - Add support for SNP to the various SEV selftests. - Add a selftest to verify fastops instructions via forced emulation. - Add MGLRU support to the access tracking perf test.	2025-05-27 12:15:26 -04:00
Paolo Bonzini	4d526b02df	Merge tag 'kvmarm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.16 * New features: - Add large stage-2 mapping support for non-protected pKVM guests, clawing back some performance. - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and protected modes. - Enable nested virtualisation support on systems that support it (yes, it has been a long time coming), though it is disabled by default. * Improvements, fixes and cleanups: - Large rework of the way KVM tracks architecture features and links them with the effects of control bits. This ensures correctness of emulation (the data is automatically extracted from the published JSON files), and helps dealing with the evolution of the architecture. - Significant changes to the way pKVM tracks ownership of pages, avoiding page table walks by storing the state in the hypervisor's vmemmap. This in turn enables the THP support described above. - New selftest checking the pKVM ownership transition rules - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests even if the host didn't have it. - Fixes for the address translation emulation, which happened to be rather buggy in some specific contexts. - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N from the number of counters exposed to a guest and addressing a number of issues in the process. - Add a new selftest for the SVE host state being corrupted by a guest. - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2, ensuring that the window for interrupts is slightly bigger, and avoiding a pretty bad erratum on the AmpereOne HW. - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from a pretty bad case of TLB corruption unless accesses to HCR_EL2 are heavily synchronised. - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables in a human-friendly fashion. - and the usual random cleanups.	2025-05-26 16:19:46 -04:00
Bibo Mao	a867688c8c	KVM: selftests: Add supported test cases for LoongArch Some common KVM test cases are supported on LoongArch now as following: coalesced_io_test demand_paging_test dirty_log_perf_test dirty_log_test guest_print_test hardware_disable_test kvm_binary_stats_test kvm_create_max_vcpus kvm_page_table_test memslot_modification_stress_test memslot_perf_test set_memory_region_test And other test cases are not supported by LoongArch such as rseq_test, since it is not supported on LoongArch physical machine either. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2025-05-20 20:20:26 +08:00
Nikunj A Dadhania	72df72e1c6	KVM: selftests: Add test to verify KVM_CAP_X86_BUS_LOCK_EXIT Add a test case to verify x86's bus lock exit functionality, which is now supported on both Intel and AMD. Trigger bus lock exits by performing a split-lock access, i.e. an atomic access that splits two cache lines. Verify that the correct number of bus lock exits are generated, and that the counter is incremented correctly and at the appropriate time based on the underlying architecture. Generate bus locks in both L1 and L2 (if nested virtualization is enabled), as SVM's functionality in particular requires non-trivial logic to do the right thing when running nested VMs. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Co-developed-by: Manali Shukla <manali.shukla@amd.com> Signed-off-by: Manali Shukla <manali.shukla@amd.com> Link: https://lore.kernel.org/r/20250502050346.14274-6-manali.shukla@amd.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-05-19 11:05:19 -07:00
James Houghton	d166453ebd	KVM: selftests: access_tracking_perf_test: Use MGLRU for access tracking Use MGLRU's debugfs interface to do access tracking instead of page_idle. The logic to use the page_idle bitmap is left in, as it is useful for kernels that do not have MGLRU built in. When MGLRU is enabled, page_idle will report pages as still idle even after being accessed, as MGLRU doesn't necessarily clear the Idle folio flag when accessing an idle page, so the test will not attempt to use page_idle if MGLRU is enabled but otherwise not usable. Aging pages with MGLRU is much faster than marking pages as idle with page_idle. Co-developed-by: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: James Houghton <jthoughton@google.com> Link: https://lore.kernel.org/r/20250508184649.2576210-8-jthoughton@google.com [sean: print parsed features, not raw string] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-05-16 12:58:21 -07:00
James Houghton	b11fcb51e2	KVM: selftests: Build and link selftests/cgroup/lib into KVM selftests libcgroup.o is built separately from KVM selftests and cgroup selftests, so different compiler flags used by the different selftests will not conflict with each other. Signed-off-by: James Houghton <jthoughton@google.com> Link: https://lore.kernel.org/r/20250508184649.2576210-7-jthoughton@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-05-16 11:45:16 -07:00

1 2

57 Commits