linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-02 13:32:40 -04:00

Author	SHA1	Message	Date
Linus Torvalds	3efc57369a	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull x86 kvm updates from Paolo Bonzini: "x86: - KVM currently invalidates the entirety of the page tables, not just those for the memslot being touched, when a memslot is moved or deleted. This does not traditionally have particularly noticeable overhead, but Intel's TDX will require the guest to re-accept private pages if they are dropped from the secure EPT, which is a non starter. Actually, the only reason why this is not already being done is a bug which was never fully investigated and caused VM instability with assigned GeForce GPUs, so allow userspace to opt into the new behavior. - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10 functionality that is on the horizon) - Rework common MSR handling code to suppress errors on userspace accesses to unsupported-but-advertised MSRs This will allow removing (almost?) all of KVM's exemptions for userspace access to MSRs that shouldn't exist based on the vCPU model (the actual cleanup is non-trivial future work) - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the 64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv) stores the entire 64-bit value at the ICR offset - Fix a bug where KVM would fail to exit to userspace if one was triggered by a fastpath exit handler - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when there's already a pending wake event at the time of the exit - Fix a WARN caused by RSM entering a nested guest from SMM with invalid guest state, by forcing the vCPU out of guest mode prior to signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the nested guest) - Overhaul the "unprotect and retry" logic to more precisely identify cases where retrying is actually helpful, and to harden all retry paths against putting the guest into an infinite retry loop - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in the shadow MMU - Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for adding multi generation LRU support in KVM - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled, i.e. when the CPU has already flushed the RSB - Trace the per-CPU host save area as a VMCB pointer to improve readability and cleanup the retrieval of the SEV-ES host save area - Remove unnecessary accounting of temporary nested VMCB related allocations - Set FINAL/PAGE in the page fault error code for EPT violations if and only if the GVA is valid. If the GVA is NOT valid, there is no guest-side page table walk and so stuffing paging related metadata is nonsensical - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of emulating posted interrupt delivery to L2 - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures - Harden eVMCS loading against an impossible NULL pointer deref (really truly should be impossible) - Minor SGX fix and a cleanup - Misc cleanups Generic: - Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed - Enable virtualization when KVM is loaded, not right before the first VM is created Together with the previous change, this simplifies a lot the logic of the callbacks, because their very existence implies virtualization is enabled - Fix a bug that results in KVM prematurely exiting to userspace for coalesced MMIO/PIO in many cases, clean up the related code, and add a testcase - Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_ the gpa+len crosses a page boundary, which thankfully is guaranteed to not happen in the current code base. Add WARNs in more helpers that read/write guest memory to detect similar bugs Selftests: - Fix a goof that caused some Hyper-V tests to be skipped when run on bare metal, i.e. NOT in a VM - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest - Explicitly include one-off assets in .gitignore. Past Sean was completely wrong about not being able to detect missing .gitignore entries - Verify userspace single-stepping works when KVM happens to handle a VM-Exit in its fastpath - Misc cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) Documentation: KVM: fix warning in "make htmldocs" s390: Enable KVM_S390_UCONTROL config in debug_defconfig selftests: kvm: s390: Add VM run test case KVM: SVM: let alternatives handle the cases when RSB filling is required KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range() KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps() KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range() KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure() KVM: x86: Update retry protection fields when forcing retry on emulation failure KVM: x86: Apply retry protection to "unprotect on failure" path KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn ...	2024-09-28 09:20:14 -07:00
Linus Torvalds	f8ffbc365f	Merge tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull 'struct fd' updates from Al Viro: "Just the 'struct fd' layout change, with conversion to accessor helpers" * tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: add struct fd constructors, get rid of __to_fd() struct fd: representation change introduce fd_file(), convert all accessors to it.	2024-09-23 09:35:36 -07:00
Paolo Bonzini	3f8df62852	Merge tag 'kvm-x86-vmx-6.12' of https://github.com/kvm-x86/linux into HEAD KVM VMX changes for 6.12: - Set FINAL/PAGE in the page fault error code for EPT Violations if and only if the GVA is valid. If the GVA is NOT valid, there is no guest-side page table walk and so stuffing paging related metadata is nonsensical. - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of emulating posted interrupt delivery to L2. - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures. - Harden eVMCS loading against an impossible NULL pointer deref (really truly should be impossible). - Minor SGX fix and a cleanup.	2024-09-17 12:41:23 -04:00
Paolo Bonzini	55e6f8f29d	Merge tag 'kvm-x86-svm-6.12' of https://github.com/kvm-x86/linux into HEAD KVM SVM changes for 6.12: - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled, i.e. when the CPU has already flushed the RSB. - Trace the per-CPU host save area as a VMCB pointer to improve readability and cleanup the retrieval of the SEV-ES host save area. - Remove unnecessary accounting of temporary nested VMCB related allocations.	2024-09-17 12:41:13 -04:00
Paolo Bonzini	43d97b2ebd	Merge tag 'kvm-x86-pat_vmx_msrs-6.12' of https://github.com/kvm-x86/linux into HEAD KVM VMX and x86 PAT MSR macro cleanup for 6.12: - Add common defines for the x86 architectural memory types, i.e. the types that are shared across PAT, MTRRs, VMCSes, and EPTPs. - Clean up the various VMX MSR macros to make the code self-documenting (inasmuch as possible), and to make it less painful to add new macros.	2024-09-17 12:40:39 -04:00
Paolo Bonzini	5d55a052e3	Merge tag 'kvm-x86-mmu-6.12' of https://github.com/kvm-x86/linux into HEAD KVM x86 MMU changes for 6.12: - Overhaul the "unprotect and retry" logic to more precisely identify cases where retrying is actually helpful, and to harden all retry paths against putting the guest into an infinite retry loop. - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in the shadow MMU. - Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for adding MGLRU support in KVM. - Misc cleanups	2024-09-17 12:39:53 -04:00
Paolo Bonzini	41786cc5ea	Merge tag 'kvm-x86-misc-6.12' of https://github.com/kvm-x86/linux into HEAD KVM x86 misc changes for 6.12 - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10 functionality that is on the horizon). - Rework common MSR handling code to suppress errors on userspace accesses to unsupported-but-advertised MSRs. This will allow removing (almost?) all of KVM's exemptions for userspace access to MSRs that shouldn't exist based on the vCPU model (the actual cleanup is non-trivial future work). - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the 64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv) stores the entire 64-bit value a the ICR offset. - Fix a bug where KVM would fail to exit to userspace if one was triggered by a fastpath exit handler. - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when there's already a pending wake event at the time of the exit. - Finally fix the RSM vs. nested VM-Enter WARN by forcing the vCPU out of guest mode prior to signalling SHUTDOWN (architecturally, the SHUTDOWN is supposed to hit L1, not L2).	2024-09-17 11:38:23 -04:00
Paolo Bonzini	c09dd2bb57	Merge branch 'kvm-redo-enable-virt' into HEAD Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed. The primary motivation for this series is to simplify dealing with enabling virtualization for Intel's TDX, which needs to enable virtualization when kvm-intel.ko is loaded, i.e. long before the first VM is created. That said, this is a nice cleanup on its own. By registering the callbacks on-demand, the callbacks themselves don't need to check kvm_usage_count, because their very existence implies a non-zero count. Patch 1 (re)adds a dedicated lock for kvm_usage_count. This avoids a lock ordering issue between cpus_read_lock() and kvm_lock. The lock ordering issue still exist in very rare cases, and will be fixed for good by switching vm_list to an (S)RCU-protected list. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-09-17 11:38:20 -04:00
Paolo Bonzini	55f50b2f86	Merge branch 'kvm-memslot-zap-quirk' into HEAD Today whenever a memslot is moved or deleted, KVM invalidates the entire page tables and generates fresh ones based on the new memslot layout. This behavior traditionally was kept because of a bug which was never fully investigated and caused VM instability with assigned GeForce GPUs. It generally does not have a huge overhead, because the old MMU is able to reuse cached page tables and the new one is more scalabale and can resolve EPT violations/nested page faults in parallel, but it has worse performance if the guest frequently deletes and adds small memslots, and it's entirely not viable for TDX. This is because TDX requires re-accepting of private pages after page dropping. For non-TDX VMs, this series therefore introduces the KVM_X86_QUIRK_SLOT_ZAP_ALL quirk, enabling users to control the behavior of memslot zapping when a memslot is moved/deleted. The quirk is turned on by default, leading to the zapping of all SPTEs when a memslot is moved/deleted; users however have the option to turn off the quirk, which limits the zapping only to those SPTEs hat lie within the range of memslot being moved/deleted. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-09-17 11:38:19 -04:00
Linus Torvalds	d42f7708e2	Merge tag 'for-linus-6.11' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fix from Paolo Bonzini: "Do not always honor guest PAT on CPUs that support self-snoop. This triggers an issue in the bochsdrm driver, which used ioremap() instead of ioremap_wc() to map the video RAM. The revert lets video RAM use the WB memory type instead of the slower UC memory type" * tag 'for-linus-6.11' of git://git.kernel.org/pub/scm/virt/kvm/kvm: Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop"	2024-09-15 09:35:50 +02:00
Paolo Bonzini	9d70f3fec1	Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop" This reverts commit `377b2f359d`. This caused a regression with the bochsdrm driver, which used ioremap() instead of ioremap_wc() to map the video RAM. After the commit, the WB memory type is used without the IGNORE_PAT, resulting in the slower UC memory type. In fact, UC is slow enough to basically cause guests to not boot... but only on new processors such as Sapphire Rapids and Cascade Lake. Coffee Lake for example works properly, though that might also be an effect of being on a larger, more NUMA system. The driver has been fixed but that does not help older guests. Until we figure out whether Cascade Lake and newer processors are working as intended, revert the commit. Long term we might add a quirk, but the details depend on whether the processors are working as intended: for example if they are, the quirk might reference bochs-compatible devices, e.g. in the name and documentation, so that userspace can disable the quirk by default and only leave it enabled if such a device is being exposed to the guest. If instead this is actually a bug in CLX+, then the actions we need to take are different and depend on the actual cause of the bug. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-09-15 02:49:33 -04:00
Amit Shah	4440337af4	KVM: SVM: let alternatives handle the cases when RSB filling is required Remove superfluous RSB filling after a VMEXIT when the CPU already has flushed the RSB after a VMEXIT when AutoIBRS is enabled. The initial implementation for adding RETPOLINES added an ALTERNATIVES implementation for filling the RSB after a VMEXIT in commit `117cc7a908` ("x86/retpoline: Fill return stack buffer on vmexit"). Later, X86_FEATURE_RSB_VMEXIT was added in commit `9756bba284` ("x86/speculation: Fill RSB on vmexit for IBRS") to handle stuffing the RSB if RETPOLINE=y or KERNEL_IBRS=y, i.e. to also stuff the RSB if the kernel is configured to do IBRS mitigations on entry/exit. The AutoIBRS (on AMD) feature implementation added in commit `e7862eda30` ("x86/cpu: Support AMD Automatic IBRS") used the already-implemented logic for EIBRS in spectre_v2_determine_rsb_fill_type_on_vmexit() -- but did not update the code at VMEXIT to act on the mode selected in that function -- resulting in VMEXITs continuing to clear the RSB when RETPOLINES are enabled, despite the presence of AutoIBRS. Signed-off-by: Amit Shah <amit.shah@amd.com> Link: https://lore.kernel.org/r/20240807123531.69677-1-amit@kernel.org [sean: massage changeloge, drop comment about AMD not needing RSB_VMEXIT_LITE] Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-10 10:27:53 -07:00
Sean Christopherson	f300948251	KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid Set PFERR_GUEST_{FINAL,PAGE}_MASK based on EPT_VIOLATION_GVA_TRANSLATED if and only if EPT_VIOLATION_GVA_IS_VALID is also set in exit qualification. Per the SDM, bit 8 (EPT_VIOLATION_GVA_TRANSLATED) is valid if and only if bit 7 (EPT_VIOLATION_GVA_IS_VALID) is set, and is '0' if bit 7 is '0'. Bit 7 (a.k.a. EPT_VIOLATION_GVA_IS_VALID) Set if the guest linear-address field is valid. The guest linear-address field is valid for all EPT violations except those resulting from an attempt to load the guest PDPTEs as part of the execution of the MOV CR instruction and those due to trace-address pre-translation Bit 8 (a.k.a. EPT_VIOLATION_GVA_TRANSLATED) If bit 7 is 1: • Set if the access causing the EPT violation is to a guest-physical address that is the translation of a linear address. • Clear if the access causing the EPT violation is to a paging-structure entry as part of a page walk or the update of an accessed or dirty bit. Reserved if bit 7 is 0 (cleared to 0). Failure to guard the logic on GVA_IS_VALID results in KVM marking the page fault as PFERR_GUEST_PAGE_MASK when there is no known GVA, which can put the vCPU into an infinite loop due to kvm_mmu_page_fault() getting false positive on its PFERR_NESTED_GUEST_PAGE logic (though only because that logic is also buggy/flawed). In practice, this is largely a non-issue because so GVA_IS_VALID is almost always set. However, when TDX comes along, GVA_IS_VALID will never be set, as the TDX Module deliberately clears bits 12:7 in exit qualification, e.g. so that the faulting virtual address and other metadata that aren't practically useful for the hypervisor aren't leaked to the untrusted host. When exit is due to EPT violation, bits 12-7 of the exit qualification are cleared to 0. Fixes: `eebed24389` ("kvm: nVMX: Add support for fast unprotection of nested guest page tables") Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:33:22 -07:00
Sean Christopherson	9a5bff7f5e	KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent Use KVM_PAGES_PER_HPAGE() instead of open coding equivalent logic that is anything but obvious. No functional change intended, and verified by compiling with the below assertions: BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_4K)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_4K)); BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_2M)); BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_1G)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_1G)); Link: https://lore.kernel.org/r/20240809194335.1726916-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:08 -07:00
Sean Christopherson	7645829145	KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals Replace all of the open coded '1' literals used to mark a PTE list as having many/multiple entries with a proper define. It's hard enough to read the code with one magic bit, and a future patch to support "locking" a single rmap will add another. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:07 -07:00
Sean Christopherson	7aac9dc680	KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range() Fold mmu_spte_age() into its sole caller now that aging and testing for young SPTEs is handled in a common location, i.e. doesn't require more helpers. Opportunistically remove the use of mmu_spte_get_lockless(), as mmu_lock is held (for write!), and marking SPTEs for access tracking outside of mmu_lock is unsafe (at least, as written). I.e. using the lockless accessor is quite misleading. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:06 -07:00
Sean Christopherson	c17f150000	KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper Rework kvm_handle_gfn_range() into an aging-specic helper, kvm_rmap_age_gfn_range(). In addition to purging a bunch of unnecessary boilerplate code, this sets the stage for aging rmap SPTEs outside of mmu_lock. Note, there's a small functional change, as kvm_test_age_gfn() will now return immediately if a young SPTE is found, whereas previously KVM would continue iterating over other levels. Link: https://lore.kernel.org/r/20240809194335.1726916-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:05 -07:00
Sean Christopherson	548f87f667	KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed Convert kvm_unmap_gfn_range(), which is the helper that zaps rmap SPTEs in response to an mmu_notifier invalidation, to use __kvm_rmap_zap_gfn_range() and feed in range->may_block. In other words, honor NEED_RESCHED by way of cond_resched() when zapping rmaps. This fixes a long-standing issue where KVM could process an absurd number of rmap entries without ever yielding, e.g. if an mmu_notifier fired on a PUD (or larger) range. Opportunistically rename __kvm_zap_rmap() to kvm_zap_rmap(), and drop the old kvm_zap_rmap(). Ideally, the shuffling would be done in a different patch, but that just makes the compiler unhappy, e.g. arch/x86/kvm/mmu/mmu.c:1462:13: error: ‘kvm_zap_rmap’ defined but not used Reported-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240809194335.1726916-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:04 -07:00
Sean Christopherson	dd9eaad744	KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot Add a dedicated helper to walk and zap rmaps for a given memslot so that the code can be shared between KVM-initiated zaps and mmu_notifier invalidations. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:03 -07:00
Sean Christopherson	5b1fb116e1	KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps() Add a @can_yield param to __walk_slot_rmaps() to control whether or not dropping mmu_lock and conditionally rescheduling is allowed. This will allow using __walk_slot_rmaps() and thus cond_resched() to handle mmu_notifier invalidations, which usually allow blocking/yielding, but not when invoked by the OOM killer. Link: https://lore.kernel.org/r/20240809194335.1726916-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:02 -07:00
Sean Christopherson	0a37fffda1	KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range() Move walk_slot_rmaps() and friends up near for_each_slot_rmap_range() so that the walkers can be used to handle mmu_notifier invalidations, and so that similar function has some amount of locality in code. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:22:01 -07:00
Sean Christopherson	98a69b96ca	KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn WARN if KVM gets an MMIO cache hit on a RET_PF_WRITE_PROTECTED fault, as KVM should return RET_PF_WRITE_PROTECTED if and only if there is a memslot, and creating a memslot is supposed to invalidate the MMIO cache by virtue of changing the memslot generation. Keep the code around mainly to provide a convenient location to document why emulated MMIO should be impossible. Suggested-by: Yuan Yao <yuan.yao@linux.intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-23-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:36 -07:00
Sean Christopherson	d859b16161	KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list Explicitly query the list of to-be-zapped shadow pages when checking to see if unprotecting a gfn for retry has succeeded, i.e. if KVM should retry the faulting instruction. Add a comment to explain why the list needs to be checked before zapping, which is the primary motivation for this change. No functional change intended. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-22-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:35 -07:00
Sean Christopherson	6b3dcabc10	KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version Fold kvm_mmu_unprotect_page() into kvm_mmu_unprotect_gfn_and_retry() now that all other direct usage is gone. No functional change intended. Link: https://lore.kernel.org/r/20240831001538.336683-21-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:34 -07:00
Sean Christopherson	2876624e1a	KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure() Rename reexecute_instruction() to kvm_unprotect_and_retry_on_failure() to make the intent and purpose of the helper much more obvious. No functional change intended. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-20-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:33 -07:00
Sean Christopherson	4df685664b	KVM: x86: Update retry protection fields when forcing retry on emulation failure When retrying the faulting instruction after emulation failure, refresh the infinite loop protection fields even if no shadow pages were zapped, i.e. avoid hitting an infinite loop even when retrying the instruction as a last-ditch effort to avoid terminating the guest. Link: https://lore.kernel.org/r/20240831001538.336683-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:32 -07:00
Sean Christopherson	dabc4ff70c	KVM: x86: Apply retry protection to "unprotect on failure" path Use kvm_mmu_unprotect_gfn_and_retry() in reexecute_instruction() to pick up protection against infinite loops, e.g. if KVM somehow manages to encounter an unsupported instruction and unprotecting the gfn doesn't allow the vCPU to make forward progress. Other than that, the retry-on- failure logic is a functionally equivalent, open coded version of kvm_mmu_unprotect_gfn_and_retry(). Note, the emulation failure path still isn't fully protected, as KVM won't update the retry protection fields if no shadow pages are zapped (but this change is still a step forward). That flaw will be addressed in a future patch. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:32 -07:00
Sean Christopherson	19ab2c8be0	KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn Don't bother unprotecting the target gfn if EMULTYPE_WRITE_PF_TO_SP is set, as KVM will simply report the emulation failure to userspace. This will allow converting reexecute_instruction() to use kvm_mmu_unprotect_gfn_instead_retry() instead of kvm_mmu_unprotect_page(). Link: https://lore.kernel.org/r/20240831001538.336683-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:31 -07:00
Sean Christopherson	6205257395	KVM: x86: Remove manual pfn lookup when retrying #PF after failed emulation Drop the manual pfn look when retrying an instruction that KVM failed to emulation in response to a #PF due to a write-protected gfn. Now that KVM sets EMULTYPE_ALLOW_RETRY_PF if and only if the page fault hit a write- protected gfn, i.e. if and only if there's a writable memslot, there's no need to redo the lookup to avoid retrying an instruction that failed on emulated MMIO (no slot, or a write to a read-only slot). I.e. KVM will never attempt to retry an instruction that failed on emulated MMIO, whereas that was not the case prior to the introduction of RET_PF_WRITE_PROTECTED. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:30 -07:00
Sean Christopherson	b299c273c0	KVM: x86/mmu: Move event re-injection unprotect+retry into common path Move the event re-injection unprotect+retry logic into kvm_mmu_write_protect_fault(), i.e. unprotect and retry if and only if the #PF actually hit a write-protected gfn. Note, there is a small possibility that the gfn was unprotected by a different tasking between hitting the #PF and acquiring mmu_lock, but in that case, KVM will resume the guest immediately anyways because KVM will treat the fault as spurious. As a bonus, unprotecting _after_ handling the page fault also addresses the case where the installing a SPTE to handle fault encounters a shadowed PTE, i.e. creates a read-only SPTE. Opportunstically add a comment explaining what on earth the intent of the code is, as based on the changelog from commit `577bdc4966` ("KVM: Avoid instruction emulation when event delivery is pending"). Link: https://lore.kernel.org/r/20240831001538.336683-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:29 -07:00
Sean Christopherson	29e495bdf8	KVM: x86/mmu: Always walk guest PTEs with WRITE access when unprotecting When getting a gpa from a gva to unprotect the associated gfn when an event is awating reinjection, walk the guest PTEs for WRITE as there's no point in unprotecting the gfn if the guest is unable to write the page, i.e. if write-protection can't trigger emulation. Note, the entire flow should be guarded on the access being a write, and even better should be conditioned on actually triggering a write-protect fault. This will be addressed in a future commit. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:28 -07:00
Sean Christopherson	b7e948898e	KVM: x86/mmu: Don't try to unprotect an INVALID_GPA If getting the gpa for a gva fails, e.g. because the gva isn't mapped in the guest page tables, don't try to unprotect the invalid gfn. This is mostly a performance fix (avoids unnecessarily taking mmu_lock), as for_each_gfn_valid_sp_with_gptes() won't explode on garbage input, it's simply pointless. Link: https://lore.kernel.org/r/20240831001538.336683-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:27 -07:00
Sean Christopherson	2df354e37c	KVM: x86: Fold retry_instruction() into x86_emulate_instruction() Now that retry_instruction() is reasonably tiny, fold it into its sole caller, x86_emulate_instruction(). In addition to getting rid of the absurdly confusing retry_instruction() name, handling the retry in x86_emulate_instruction() pairs it back up with the code that resets last_retry_{eip,address}. No functional change intended. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:26 -07:00
Sean Christopherson	41e6e367d5	KVM: x86: Move EMULTYPE_ALLOW_RETRY_PF to x86_emulate_instruction() Move the sanity checks for EMULTYPE_ALLOW_RETRY_PF to the top of x86_emulate_instruction(). In addition to deduplicating a small amount of code, this makes the connection between EMULTYPE_ALLOW_RETRY_PF and EMULTYPE_PF even more explicit, and will allow dropping retry_instruction() entirely. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:25 -07:00
Sean Christopherson	dfaae8447c	KVM: x86/mmu: Try "unprotect for retry" iff there are indirect SPs Try to unprotect shadow pages if and only if indirect_shadow_pages is non- zero, i.e. iff there is at least one protected such shadow page. Pre- checking indirect_shadow_pages avoids taking mmu_lock for write when the gfn is write-protected by a third party, i.e. not for KVM shadow paging, and in the extremely unlikely case that a different task has already unprotected the last shadow page. Link: https://lore.kernel.org/r/20240831001538.336683-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:24 -07:00
Sean Christopherson	01dd4d3192	KVM: x86/mmu: Apply retry protection to "fast nTDP unprotect" path Move the anti-infinite-loop protection provided by last_retry_{eip,addr} into kvm_mmu_write_protect_fault() so that it guards unprotect+retry that never hits the emulator, as well as reexecute_instruction(), which is the last ditch "might as well try it" logic that kicks in when emulation fails on an instruction that faulted on a write-protected gfn. Add a new helper, kvm_mmu_unprotect_gfn_and_retry(), to set the retry fields and deduplicate other code (with more to come). Link: https://lore.kernel.org/r/20240831001538.336683-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:23 -07:00
Sean Christopherson	9c19129e53	KVM: x86: Store gpa as gpa_t, not unsigned long, when unprotecting for retry Store the gpa used to unprotect the faulting gfn for retry as a gpa_t, not an unsigned long. This fixes a bug where 32-bit KVM would unprotect and retry the wrong gfn if the gpa had bits 63:32!=0. In practice, this bug is functionally benign, as unprotecting the wrong gfn is purely a performance issue (thanks to the anti-infinite-loop logic). And of course, almost no one runs 32-bit KVM these days. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:22 -07:00
Sean Christopherson	019f3f84a4	KVM: x86: Get RIP from vCPU state when storing it to last_retry_eip Read RIP from vCPU state instead of pulling it from the emulation context when filling last_retry_eip, which is part of the anti-infinite-loop protection used when unprotecting and retrying instructions that hit a write-protected gfn. This will allow reusing the anti-infinite-loop protection in flows that never make it into the emulator. No functional change intended, as ctxt->eip is set to kvm_rip_read() in init_emulate_ctxt(), and EMULTYPE_PF emulation is mutually exclusive with EMULTYPE_NO_DECODE and EMULTYPE_SKIP, i.e. always goes through x86_decode_emulated_instruction() and hasn't advanced ctxt->eip (yet). Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:21 -07:00
Sean Christopherson	c1edcc41c3	KVM: x86: Retry to-be-emulated insn in "slow" unprotect path iff sp is zapped Resume the guest and thus skip emulation of a non-PTE-writing instruction if and only if unprotecting the gfn actually zapped at least one shadow page. If the gfn is write-protected for some reason other than shadow paging, attempting to unprotect the gfn will effectively fail, and thus retrying the instruction is all but guaranteed to be pointless. This bug has existed for a long time, but was effectively fudged around by the retry RIP+address anti-loop detection. Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:21 -07:00
Sean Christopherson	2fb2b7877b	KVM: x86/mmu: Skip emulation on page fault iff 1+ SPs were unprotected When doing "fast unprotection" of nested TDP page tables, skip emulation if and only if at least one gfn was unprotected, i.e. continue with emulation if simply resuming is likely to hit the same fault and risk putting the vCPU into an infinite loop. Note, it's entirely possible to get a false negative, e.g. if a different vCPU faults on the same gfn and unprotects the gfn first, but that's a relatively rare edge case, and emulating is still functionally ok, i.e. saving a few cycles by avoiding emulation isn't worth the risk of putting the vCPU into an infinite loop. Opportunistically rewrite the relevant comment to document in gory detail exactly what scenario the "fast unprotect" logic is handling. Fixes: `147277540b` ("kvm: svm: Add support for additional SVM NPF error codes") Cc: Yuan Yao <yuan.yao@intel.com> Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:20 -07:00
Sean Christopherson	989a84c93f	KVM: x86/mmu: Trigger unprotect logic only on write-protection page faults Trigger KVM's various "unprotect gfn" paths if and only if the page fault was a write to a write-protected gfn. To do so, add a new page fault return code, RET_PF_WRITE_PROTECTED, to explicitly and precisely track such page faults. If a page fault requires emulation for any MMIO (or any reason besides write-protection), trying to unprotect the gfn is pointless and risks putting the vCPU into an infinite loop. E.g. KVM will put the vCPU into an infinite loop if the vCPU manages to trigger MMIO on a page table walk. Fixes: `147277540b` ("kvm: svm: Add support for additional SVM NPF error codes") Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:19 -07:00
Sean Christopherson	4ececec19a	KVM: x86/mmu: Replace PFERR_NESTED_GUEST_PAGE with a more descriptive helper Drop the globally visible PFERR_NESTED_GUEST_PAGE and replace it with a more appropriately named is_write_to_guest_page_table(). The macro name is misleading, because while all nNPT walks match PAGE\|WRITE\|PRESENT, the reverse is not true. No functional change intended. Link: https://lore.kernel.org/r/20240831001538.336683-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:16:18 -07:00
Sean Christopherson	3dde46a21a	KVM: nVMX: Assert that vcpu->mutex is held when accessing secondary VMCSes Add lockdep assertions in get_vmcs12() and get_shadow_vmcs12() to verify the vCPU's mutex is held, as the returned VMCS objects are dynamically allocated/freed when nested VMX is turned on/off, i.e. accessing vmcs12 structures without holding vcpu->mutex is susceptible to use-after-free. Waive the assertion if the VM is being destroyed, as KVM currently forces a nested VM-Exit when freeing the vCPU. If/when that wart is fixed, the assertion can/should be converted to an unqualified lockdep assertion. See also https://lore.kernel.org/all/Zsd0TqCeY3B5Sb5b@google.com. Link: https://lore.kernel.org/r/20240906043413.1049633-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:15:03 -07:00
Sean Christopherson	1ed0f119c5	KVM: nVMX: Explicitly invalidate posted_intr_nv if PI is disabled at VM-Enter Explicitly invalidate posted_intr_nv when emulating nested VM-Enter and posted interrupts are disabled to make it clear that posted_intr_nv is valid if and only if nested posted interrupts are enabled, and as a cheap way to harden against KVM bugs. KVM initializes posted_intr_nv to -1 at vCPU creation and resets it to -1 when unloading vmcs12 and/or leaving nested mode, i.e. this is not a bug fix (or at least, it's not intended to be a bug fix). Note, tracking nested.posted_intr_nv as a u16 subtly adds a measure of safety, as it prevents unintentionally matching KVM's informal "no IRQ" vector of -1, stored as a signed int. Because a u16 can be always be represented as a signed int, the effective "invalid" value of posted_intr_nv, 65535, will be preserved as-is when comparing against an int, i.e. will be zero-extended, not sign-extended, and thus won't get a false positive if KVM is buggy and compares posted_intr_nv against -1. Opportunistically add a comment in vmx_deliver_nested_posted_interrupt() to call out that it must check vmx->nested.posted_intr_nv, not the vector in vmcs12, which is presumably the _entire_ reason nested.posted_intr_nv exists. E.g. vmcs12 is a KVM-controlled snapshot, so there are no TOCTOU races to worry about, the only potential badness is if the vCPU leaves nested and frees vmcs12 between the sender checking is_guest_mode() and dereferencing the vmcs12 pointer. Link: https://lore.kernel.org/r/20240906043413.1049633-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:15:02 -07:00
Sean Christopherson	aa9477966a	KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt() Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt() now that nVMX essentially open codes kvm_get_apic_interrupt() in order to correctly emulate nested posted interrupts. Opportunistically stop exporting kvm_cpu_get_interrupt(), as the aforementioned nVMX flow was the only user in vendor code. Link: https://lore.kernel.org/r/20240906043413.1049633-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:15:01 -07:00
Sean Christopherson	6e0b456547	KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit injection When synthensizing a nested VM-Exit due to an external interrupt, pend a nested posted interrupt if the external interrupt vector matches L2's PI notification vector, i.e. if the interrupt is a PI notification for L2. This fixes a bug where KVM will incorrectly inject VM-Exit instead of processing nested posted interrupt when IPI virtualization is enabled. Per the SDM, detection of the notification vector doesn't occur until the interrupt is acknowledge and deliver to the CPU core. If the external-interrupt exiting VM-execution control is 1, any unmasked external interrupt causes a VM exit (see Section 26.2). If the "process posted interrupts" VM-execution control is also 1, this behavior is changed and the processor handles an external interrupt as follows: 1. The local APIC is acknowledged; this provides the processor core with an interrupt vector, called here the physical vector. 2. If the physical vector equals the posted-interrupt notification vector, the logical processor continues to the next step. Otherwise, a VM exit occurs as it would normally due to an external interrupt; the vector is saved in the VM-exit interruption-information field. For the most part, KVM has avoided problems because a PI NV for L2 that arrives will L2 is active will be processed by hardware, and KVM checks for a pending notification vector during nested VM-Enter. Thus, to hit the bug, the PI NV interrupt needs to sneak its way into L1's vIRR while L2 is active. Without IPI virtualization, the scenario is practically impossible to hit, modulo L1 doing weird things (see below), as the ordering between vmx_deliver_posted_interrupt() and nested VM-Enter effectively guarantees that either the sender will see the vCPU as being in_guest_mode(), or the receiver will see the interrupt in its vIRR. With IPI virtualization, introduced by commit `d588bb9be1` ("KVM: VMX: enable IPI virtualization"), the sending CPU effectively implements a rough equivalent of vmx_deliver_posted_interrupt(), sans the nested PI NV check. If the target vCPU has a valid PID, the CPU will send a PI NV interrupt based on _L1's_ PID, as the sender's because IPIv table points at L1 PIDs. PIR := 32 bytes at PID_ADDR; // under lock PIR[V] := 1; store PIR at PID_ADDR; // release lock NotifyInfo := 8 bytes at PID_ADDR + 32; // under lock IF NotifyInfo.ON = 0 AND NotifyInfo.SN = 0; THEN NotifyInfo.ON := 1; SendNotify := 1; ELSE SendNotify := 0; FI; store NotifyInfo at PID_ADDR + 32; // release lock IF SendNotify = 1; THEN send an IPI specified by NotifyInfo.NDST and NotifyInfo.NV; FI; As a result, the target vCPU ends up receiving an interrupt on KVM's POSTED_INTR_VECTOR while L2 is running, with an interrupt in L1's PIR for L2's nested PI NV. The POSTED_INTR_VECTOR interrupt triggers a VM-Exit from L2 to L0, KVM moves the interrupt from L1's PIR to vIRR, triggers a KVM_REQ_EVENT prior to re-entry to L2, and calls vmx_check_nested_events(), effectively bypassing all of KVM's "early" checks on nested PI NV. Without IPI virtualization, the bug can likely be hit only if L1 programs an assigned device to _post_ an interrupt to L2's notification vector, by way of L1's PID.PIR. Doing so would allow the interrupt to get into L1's vIRR without KVM checking vmcs12's NV. Which is architecturally allowed, but unlikely behavior for a hypervisor. Cc: Zeng Guang <guang.zeng@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20240906043413.1049633-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:15:00 -07:00
Sean Christopherson	8c23670f2b	KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no IRQ In the should-be-impossible scenario that kvm_cpu_get_interrupt() doesn't return a valid vector after checking kvm_cpu_has_interrupt(), skip VM-Exit injection to reduce the probability of crashing/confusing L1. Now that KVM gets the IRQ _before_ calling nested_vmx_vmexit(), squashing the VM-Exit injection is trivial since there are no actions that need to be undone. Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20240906043413.1049633-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:14:59 -07:00
Sean Christopherson	363010e1dd	KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site Move the logic to get the to-be-acknowledge IRQ for a nested VM-Exit from nested_vmx_vmexit() to vmx_check_nested_events(), which is subtly the one and only path where KVM invokes nested_vmx_vmexit() with EXIT_REASON_EXTERNAL_INTERRUPT. A future fix will perform a last-minute check on L2's nested posted interrupt notification vector, just before injecting a nested VM-Exit. To handle that scenario correctly, KVM needs to get the interrupt _before_ injecting VM-Exit, as simply querying the highest priority interrupt, via kvm_cpu_has_interrupt(), would result in TOCTOU bug, as a new, higher priority interrupt could arrive between kvm_cpu_has_interrupt() and kvm_cpu_get_interrupt(). Unfortunately, simply moving the call to kvm_cpu_get_interrupt() doesn't suffice, as a VMWRITE to GUEST_INTERRUPT_STATUS.SVI is hiding in kvm_get_apic_interrupt(), and acknowledging the interrupt before nested VM-Exit would cause the VMWRITE to hit vmcs02 instead of vmcs01. Open code a rough equivalent to kvm_cpu_get_interrupt() so that the IRQ is acknowledged after emulating VM-Exit, taking care to avoid the TOCTOU issue described above. Opportunistically convert the WARN_ON() to a WARN_ON_ONCE(). If KVM has a bug that results in a false positive from kvm_cpu_has_interrupt(), spamming dmesg won't help the situation. Note, nested_vmx_reflect_vmexit() can never reflect external interrupts as they are always "wanted" by L0. Link: https://lore.kernel.org/r/20240906043413.1049633-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:14:58 -07:00
Sean Christopherson	a194a3a13c	KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API Split the "ack" phase, i.e. the movement of an interrupt from IRR=>ISR, out of kvm_get_apic_interrupt() and into a separate API so that nested VMX can acknowledge a specific interrupt _after_ emulating a VM-Exit from L2 to L1. To correctly emulate nested posted interrupts while APICv is active, KVM must: 1. find the highest pending interrupt. 2. check if that IRQ is L2's notification vector 3. emulate VM-Exit if the IRQ is NOT the notification vector 4. ACK the IRQ in L1 _after_ VM-Exit When APICv is active, the process of moving the IRQ from the IRR to the ISR also requires a VMWRITE to update vmcs01.GUEST_INTERRUPT_STATUS.SVI, and so acknowledging the interrupt before switching to vmcs01 would result in marking the IRQ as in-service in the wrong VMCS. KVM currently fudges around this issue by doing kvm_get_apic_interrupt() smack dab in the middle of emulating VM-Exit, but that hack doesn't play nice with nested posted interrupts, as notification vector IRQs don't trigger a VM-Exit in the first place. Cc: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20240906043413.1049633-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:14:57 -07:00
Kai Huang	7efb4d8a39	KVM: VMX: Also clear SGX EDECCSSA in KVM CPU caps when SGX is disabled When SGX EDECCSSA support was added to KVM in commit `16a7fe3728` ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest"), it forgot to clear the X86_FEATURE_SGX_EDECCSSA bit in KVM CPU caps when KVM SGX is disabled. Fix it. Fixes: `16a7fe3728` ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest") Signed-off-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20240905120837.579102-1-kai.huang@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-09-09 20:13:55 -07:00

1 2 3 4 5 ...

10236 Commits