linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-05 23:05:25 -04:00

Author	SHA1	Message	Date
Thomas Huth	24a295e4ef	x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with UAPI headers that rather should use __ASSEMBLER__ instead. So let's standardize on the __ASSEMBLER__ macro that is provided by the compilers now. This is mostly a mechanical patch (done with a simple "sed -i" statement), with some manual tweaks in <asm/frame.h>, <asm/hw_irq.h> and <asm/setup.h> that mentioned this macro in comments with some missing underscores. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250314071013.1575167-38-thuth@redhat.com	2025-03-19 11:47:30 +01:00
Thomas Huth	8a141be323	x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers __ASSEMBLY__ is only defined by the Makefile of the kernel, so this is not really useful for UAPI headers (unless the userspace Makefile defines it, too). Let's switch to __ASSEMBLER__ which gets set automatically by the compiler when compiling assembly code. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Brian Gerst <brgerst@gmail.com> Link: https://lore.kernel.org/r/20250310104256.123527-1-thuth@redhat.com	2025-03-19 11:30:53 +01:00
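The guard pattern these two commits standardize on can be sketched as follows. This is a minimal illustration, not code from the patches: `DEMO_SHARED_H`, `DEMO_FLAG_ENABLE` and `demo_flags_valid()` are hypothetical names. It relies only on the fact stated above, that GCC and Clang define `__ASSEMBLER__` themselves when preprocessing assembly.

```c
/*
 * Sketch of a header shared between C and assembly sources.
 * All DEMO_* / demo_* names are hypothetical.
 */
#ifndef DEMO_SHARED_H
#define DEMO_SHARED_H

/* Plain constants are usable from both .c and .S files. */
#define DEMO_FLAG_ENABLE 0x1

#ifndef __ASSEMBLER__
/*
 * C-only part: skipped when the compiler driver preprocesses
 * assembly, because the compiler defines __ASSEMBLER__ in that case.
 */
static inline int demo_flags_valid(unsigned int flags)
{
	return (flags & ~DEMO_FLAG_ENABLE) == 0;
}
#endif /* !__ASSEMBLER__ */

#endif /* DEMO_SHARED_H */
```

Because the macro comes from the compiler rather than from a Makefile, a userspace project including such a header needs no extra `-D` flag.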
Uros Bizjak	faa6f77b0d	x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions According to: https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html the usage of asm pseudo directives in the asm template can confuse the compiler into wrongly estimating the size of the generated code. The LOCK_PREFIX macro expands to several asm pseudo directives, so its usage in atomic locking insns causes instruction length estimates to fail significantly (the specially instrumented compiler reports the estimated length of these asm templates to be 6 instructions long). This incorrect estimate further causes suboptimal inlining decisions, suboptimal instruction scheduling and suboptimal code block alignments for functions that use these locking primitives. Use asm_inline instead: https://gcc.gnu.org/pipermail/gcc-patches/2018-December/512349.html which is a feature that makes GCC pretend some inline assembler code is tiny (while it would think it is huge), instead of just asm. For code size estimation, the size of the asm is then taken as the minimum size of one instruction, ignoring how many instructions the compiler thinks it is. 
bloat-o-meter reports the following code size increase (x86_64 defconfig, gcc-14.2.1): add/remove: 82/283 grow/shrink: 870/372 up/down: 76272/-43618 (32654) Total: Before=22770320, After=22802974, chg +0.14% with top grows (>500 bytes): Function old new delta ---------------------------------------------------------------- copy_process 6465 10191 +3726 balance_dirty_pages_ratelimited_flags 237 2949 +2712 icl_plane_update_noarm 5800 7969 +2169 samsung_input_mapping 3375 5170 +1795 ext4_do_update_inode.isra - 1526 +1526 __schedule 2416 3472 +1056 __i915_vma_resource_unhold - 946 +946 sched_mm_cid_after_execve 175 1097 +922 __do_sys_membarrier - 862 +862 filemap_fault 2666 3462 +796 nl80211_send_wiphy 11185 11874 +689 samsung_input_mapping.cold 900 1500 +600 virtio_gpu_queue_fenced_ctrl_buffer 839 1410 +571 ilk_update_pipe_csc 1201 1735 +534 enable_step - 525 +525 icl_color_commit_noarm 1334 1847 +513 tg3_read_bc_ver - 501 +501 and top shrinks (>500 bytes): Function old new delta ---------------------------------------------------------------- nl80211_send_iftype_data 580 - -580 samsung_gamepad_input_mapping.isra.cold 604 - -604 virtio_gpu_queue_ctrl_sgs 724 - -724 tg3_get_invariants 9218 8376 -842 __i915_vma_resource_unhold.part 899 - -899 ext4_mark_iloc_dirty 1735 106 -1629 samsung_gamepad_input_mapping.isra 2046 - -2046 icl_program_input_csc 2203 - -2203 copy_mm 2242 - -2242 balance_dirty_pages 2657 - -2657 These code size changes can be grouped into 4 groups: a) some functions now include once-called functions in full or in part. 
These are: Function old new delta ---------------------------------------------------------------- copy_process 6465 10191 +3726 balance_dirty_pages_ratelimited_flags 237 2949 +2712 icl_plane_update_noarm 5800 7969 +2169 samsung_input_mapping 3375 5170 +1795 ext4_do_update_inode.isra - 1526 +1526 that now include: Function old new delta ---------------------------------------------------------------- copy_mm 2242 - -2242 balance_dirty_pages 2657 - -2657 icl_program_input_csc 2203 - -2203 samsung_gamepad_input_mapping.isra 2046 - -2046 ext4_mark_iloc_dirty 1735 106 -1629 b) ISRA [interprocedural scalar replacement of aggregates, interprocedural pass that removes unused function return values (turning functions returning a value which is never used into void functions) and removes unused function parameters. It can also replace an aggregate parameter by a set of other parameters representing part of the original, turning those passed by reference into new ones which pass the value directly.] Top grows and shrinks of this group are listed below: Function old new delta ---------------------------------------------------------------- ext4_do_update_inode.isra - 1526 +1526 nfs4_begin_drain_session.isra - 249 +249 nfs4_end_drain_session.isra - 168 +168 __guc_action_register_multi_lrc_v70.isra 335 500 +165 __i915_gem_free_objects.isra - 144 +144 ... 
membarrier_register_private_expedited.isra 108 - -108 syncobj_eventfd_entry_func.isra 445 314 -131 __ext4_sb_bread_gfp.isra 140 - -140 class_preempt_notrace_destructor.isra 145 - -145 p9_fid_put.isra 151 - -151 __mm_cid_try_get.isra 238 - -238 membarrier_global_expedited.isra 294 - -294 mm_cid_get.isra 295 - -295 samsung_gamepad_input_mapping.isra.cold 604 - -604 samsung_gamepad_input_mapping.isra 2046 - -2046 c) different split points of hot/cold split that just move code around: Top grows and shrinks of this group are listed below: Function old new delta ---------------------------------------------------------------- samsung_input_mapping.cold 900 1500 +600 __i915_request_reset.cold 311 389 +78 nfs_update_inode.cold 77 153 +76 __do_sys_swapon.cold 404 455 +51 copy_process.cold - 45 +45 tg3_get_invariants.cold 73 115 +42 ... hibernate.cold 671 643 -28 copy_mm.cold 31 - -31 software_resume.cold 249 207 -42 io_poll_wake.cold 106 54 -52 samsung_gamepad_input_mapping.isra.cold 604 - -604 d) full inline of small functions with locking insn (~150 cases). These bring in most of the code size increase because the removed function code is now inlined in multiple places. E.g.: 0000000000a50e10 <release_devnum>: a50e10: 48 63 07 movslq (%rdi),%rax a50e13: 85 c0 test %eax,%eax a50e15: 7e 10 jle a50e27 <release_devnum+0x17> a50e17: 48 8b 4f 50 mov 0x50(%rdi),%rcx a50e1b: f0 48 0f b3 41 50 lock btr %rax,0x50(%rcx) a50e21: c7 07 ff ff ff ff movl $0xffffffff,(%rdi) a50e27: e9 00 00 00 00 jmp a50e2c <release_devnum+0x1c> a50e28: R_X86_64_PLT32 __x86_return_thunk-0x4 a50e2c: 0f 1f 40 00 nopl 0x0(%rax) is now fully inlined into the caller function. This is desirable due to the per function overhead of CPU bug mitigations like retpolines. 
FTR a) with -Os (where generated code size really matters) x86_64 defconfig object file decreases by 24.388 kbytes, representing 0.1% code size decrease: text data bss dec hex filename 23883860 4617284 814212 29315356 1bf511c vmlinux-old.o 23859472 4615404 814212 29289088 1beea80 vmlinux-new.o FTR b) clang recognizes "asm inline", but there was no difference in code sizes: text data bss dec hex filename 27577163 4503078 807732 32887973 1f5d4a5 vmlinux-clang-patched.o 27577181 4503078 807732 32887991 1f5d4b7 vmlinux-clang-unpatched.o The performance impact of the patch was assessed by recompiling fedora-41 6.13.5 kernel and running lmbench with old and new kernel. The most noticeable improvements were: Process fork+exit: 270.0952 microseconds Process fork+execve: 2620.3333 microseconds Process fork+/bin/sh -c: 6781.0000 microseconds File /usr/tmp/XXX write bandwidth: 1780350 KB/sec Pagefaults on /usr/tmp/XXX: 0.3875 microseconds to: Process fork+exit: 298.6842 microseconds Process fork+execve: 1662.7500 microseconds Process fork+/bin/sh -c: 2127.6667 microseconds File /usr/tmp/XXX write bandwidth: 1950077 KB/sec Pagefaults on /usr/tmp/XXX: 0.1958 microseconds and from: Socket bandwidth using localhost 0.000001 2.52 MB/sec 0.000064 163.02 MB/sec 0.000128 321.70 MB/sec 0.000256 630.06 MB/sec 0.000512 1207.07 MB/sec 0.001024 2004.06 MB/sec 0.001437 2475.43 MB/sec 10.000000 5817.34 MB/sec Avg xfer: 3.2KB, 41.8KB in 1.2230 millisecs, 34.15 MB/sec AF_UNIX sock stream bandwidth: 9850.01 MB/sec Pipe bandwidth: 4631.28 MB/sec to: Socket bandwidth using localhost 0.000001 3.13 MB/sec 0.000064 187.08 MB/sec 0.000128 324.12 MB/sec 0.000256 618.51 MB/sec 0.000512 1137.13 MB/sec 0.001024 1962.95 MB/sec 0.001437 2458.27 MB/sec 10.000000 6168.08 MB/sec Avg xfer: 3.2KB, 41.8KB in 1.0060 millisecs, 41.52 MB/sec AF_UNIX sock stream bandwidth: 9921.68 MB/sec Pipe bandwidth: 4649.96 MB/sec [ mingo: Prettified the changelog a bit. 
] Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20250309170955.48919-1-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
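The `asm inline` qualifier discussed in this commit can be illustrated with a small userspace sketch. It is not the kernel's `asm_inline()` or `LOCK_PREFIX` implementation: `demo_asm_inline` and `demo_atomic_inc()` are hypothetical names, the code assumes a compiler that accepts the `inline` asm qualifier (GCC 9 or later, or a recent Clang), and the non-x86 branch is only a portability fallback added for the example.

```c
#include <stdint.h>

/* Assumption: the compiler accepts the "inline" qualifier on asm statements. */
#define demo_asm_inline __asm__ __inline__

/* Hypothetical helper: atomically increment *counter. */
static inline void demo_atomic_inc(int32_t *counter)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * "inline" tells the compiler to treat this asm as the minimum
	 * size of one instruction when estimating code size, however
	 * long the template text looks.
	 */
	demo_asm_inline volatile("lock incl %[ctr]"
				 : [ctr] "+m" (*counter)
				 :
				 : "memory", "cc");
#else
	/* Fallback for other architectures (not part of the commit). */
	__atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
#endif
}
```

The qualifier changes only the compiler's size estimate for inlining and scheduling heuristics; the emitted instruction is the same as with a plain `asm`.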
Uros Bizjak	f685a96bfd	x86/asm: Use asm_inline() instead of asm() in clwb() Use asm_inline() to instruct the compiler that the size of asm() is the minimum size of one instruction, ignoring how many instructions the compiler thinks it is. ALTERNATIVE macro that expands to several pseudo directives causes instruction length estimate to count more than 20 instructions. bloat-o-meter reports slight increase of the code size for x86_64 defconfig object file, compiled with gcc-14.2: add/remove: 0/2 grow/shrink: 3/0 up/down: 190/-59 (131) Function old new delta __copy_user_flushcache 166 247 +81 __memcpy_flushcache 369 437 +68 arch_wb_cache_pmem 6 47 +41 __pfx_clean_cache_range 16 - -16 clean_cache_range 43 - -43 Total: Before=22807167, After=22807298, chg +0.00% The compiler now inlines and removes the clean_cache_range() function. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250313102715.333142-2-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
Uros Bizjak	5328663245	x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h> Current minimum required version of binutils is 2.25, which supports CLFLUSHOPT and CLWB instruction mnemonics. Replace the byte-wise specification of CLFLUSHOPT and CLWB with these proper mnemonics. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250313102715.333142-1-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
Uros Bizjak	21fe251484	x86/hweight: Use asm_inline() instead of asm() Use asm_inline() to instruct the compiler that the size of asm() is the minimum size of one instruction, ignoring how many instructions the compiler thinks it is. ALTERNATIVE macro that expands to several pseudo directives causes instruction length estimate to count more than 20 instructions. bloat-o-meter reports slight reduction of the code size for x86_64 defconfig object file, compiled with gcc-14.2: add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171) Total: Before=22734393, After=22734222, chg -0.00% where 29 instances of code blocks involving POPCNT now get inlined, resulting in the removal of several functions: format_is_yuv_semiplanar.part.isra 41 - -41 cdclk_divider 69 - -69 intel_joiner_adjust_timings 140 - -140 nl80211_send_wowlan_tcp_caps 369 - -369 nl80211_send_iftype_data 579 - -579 __do_sys_pidfd_send_signal 809 - -809 One noticeable change is: pcpu_page_first_chunk 1075 1060 -15 Where the compiler now inlines 4 more instances of POPCNT insns, but still manages to compile to a function with smaller code size. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250312123905.149298-3-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
Uros Bizjak	194a613088	x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm() Use ASM_CALL_CONSTRAINT to prevent inline asm() that includes call instruction from being scheduled before the frame pointer gets set up by the containing function. This unconstrained scheduling might cause objtool to print a "call without frame pointer save/setup" warning. Current versions of compilers don't seem to trigger this condition, but without this constraint there's nothing to prevent the compiler from scheduling the insn in front of frame creation. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250312123905.149298-2-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
Uros Bizjak	72899899e4	x86/hweight: Use named operands in inline asm() No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250312123905.149298-1-ubizjak@gmail.com	2025-03-19 11:26:58 +01:00
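For readers unfamiliar with named operands, the following sketch shows the syntax on a trivial addition rather than on the hweight code itself. `demo_add_u32()` is a hypothetical helper, and the non-x86 branch is a plain C fallback added only so the example builds elsewhere.

```c
/* Hypothetical helper: add two 32-bit values using inline asm. */
static inline unsigned int demo_add_u32(unsigned int a, unsigned int b)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * Named operands: %[dst] and %[src] refer to the bracketed names
	 * in the constraint lists. The positional form of the same
	 * template would be "addl %1, %0".
	 */
	__asm__("addl %[src], %[dst]"
		: [dst] "+r" (a)
		: [src] "r" (b)
		: "cc");
	return a;
#else
	/* Fallback for other architectures. */
	return a + b;
#endif
}
```

Names keep the template readable when operands are added or reordered, since no `%0`/`%1` indices need renumbering.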
Ingo Molnar	91d5451d97	x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP The __ref_stack_chk_guard symbol doesn't exist on UP: <stdin>:4:15: error: ‘__ref_stack_chk_guard’ undeclared here (not in a function) Fix the #ifdef around the entry.S export. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250123190747.745588-8-brgerst@gmail.com	2025-03-19 11:26:58 +01:00
Ard Biesheuvel	3f5dbafc2d	x86/head/64: Avoid Clang < 17 stack protector in startup code Clang versions before 17 will not honour -fdirect-access-external-data for the load of the stack cookie emitted into each function's prologue and epilogue, and will emit a GOT based reference instead, e.g., 4c 8b 2d 00 00 00 00 mov 0x0(%rip),%r13 18a: R_X86_64_REX_GOTPCRELX __ref_stack_chk_guard-0x4 65 49 8b 45 00 mov %gs:0x0(%r13),%rax This is inefficient, but at least, the linker will usually follow the rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA instruction. This is still suboptimal, as the per-CPU load could use a RIP-relative reference directly, but at least it gets rid of the first load from memory. However, Boris reports that in some cases, when using distro builds of Clang/LLD 15, the first load gets relaxed into 49 c7 c6 20 c0 55 86 mov $0xffffffff8655c020,%r14 ffffffff8373bf0f: R_X86_64_32S __ref_stack_chk_guard 65 49 8b 06 mov %gs:(%r14),%rax instead, which is fine in principle, as MOV may be cheaper than LEA on some micro-architectures. However, such absolute references assume that the variable in question can be accessed via the kernel virtual mapping, and this is not guaranteed for the startup code residing in .head.text. This is therefore a true positive, that was caught using the recently introduced relocs check for absolute references in the startup code: Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text Work around the issue by disabling the stack protector in the startup code for Clang versions older than 17. Fixes: `80d47defdd` ("x86/stackprotector/64: Convert to normal per-CPU variable") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250312102740.602870-2-ardb+git@google.com	2025-03-19 11:26:49 +01:00
Uros Bizjak	a9deda6959	x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> Merge common x86_32 and x86_64 code in crash_setup_regs() using macros from <asm/asm.h>. The compiled object files before and after the patch are unchanged. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250306145227.55819-1-ubizjak@gmail.com	2025-03-19 11:26:24 +01:00
Kirill A. Shutemov	bd72baff22	x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro Add an assembly macro to refer to runtime constants. It hides linker magic and makes assembly more readable. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250304153342.2016569-1-kirill.shutemov@linux.intel.com	2025-03-19 11:26:24 +01:00
Sohil Mehta	fadb6f569b	x86/cpu/intel: Limit the non-architectural constant_tsc model checks X86_FEATURE_CONSTANT_TSC is a Linux-defined, synthesized feature flag. It is used across several vendors. Intel CPUs will set the feature when the architectural CPUID.80000007.EDX[1] bit is set. There are also some Intel CPUs that have the X86_FEATURE_CONSTANT_TSC behavior but don't enumerate it with the architectural bit. Those currently have a model range check. Today, virtually all of the CPUs that have the CPUID bit also match the "model >= 0x0e" check. This is confusing. Instead of an open-ended check, pick some models (INTEL_IVYBRIDGE and P4_WILLAMETTE) as the end of goofy CPUs that should enumerate the bit but don't. These models are a relatively arbitrary but conservative choice. This makes it obvious that later CPUs (like Family 18+) no longer need to synthesize X86_FEATURE_CONSTANT_TSC. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20250219184133.816753-14-sohil.mehta@intel.com	2025-03-19 11:19:56 +01:00
Sohil Mehta	05d234d3c7	x86/mm/pat: Replace Intel x86_model checks with VFM ones Introduce markers and names for some Family 6 and Family 15 models and replace x86_model checks with VFM ones. Since the VFM checks are closed ended and only applicable to Intel, get rid of the explicit Intel vendor check as well. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250219184133.816753-13-sohil.mehta@intel.com	2025-03-19 11:19:53 +01:00
Sohil Mehta	15b7ddcf66	x86/cpu/intel: Fix fast string initialization for extended Families X86_FEATURE_REP_GOOD is a linux defined feature flag to track whether fast string operations should be used for copy_page(). It is also used as a second alternative for clear_page() if enhanced fast string operations (ERMS) are not available. X86_FEATURE_ERMS is an Intel-specific hardware-defined feature flag that tracks hardware support for Enhanced Fast strings. It is used to track whether Fast strings should be used for similar memory copy and memory clearing operations. On top of these, there is a FAST_STRING enable bit in the IA32_MISC_ENABLE MSR. It is typically controlled by the BIOS to provide a hint to the hardware and the OS on whether fast string operations are preferred. Commit: `161ec53c70` ("x86, mem, intel: Initialize Enhanced REP MOVSB/STOSB") introduced a mechanism to honor the BIOS preference for fast string operations and clear the above feature flags if needed. Unfortunately, the current initialization code for Intel to set and clear these bits is confusing at best and likely incorrect. X86_FEATURE_REP_GOOD is cleared in early_init_intel() if MISC_ENABLE.FAST_STRING is 0. But it gets set later on unconditionally for all Family 6 processors in init_intel(). This not only overrides the BIOS preference but also contradicts the earlier check. Fix this by combining the related checks and always relying on the BIOS provided preference for fast string operations. This simplification makes sure the upcoming Intel Family 18 and 19 models are covered as well. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250219184133.816753-12-sohil.mehta@intel.com	2025-03-19 11:19:51 +01:00
Sohil Mehta	7a2ad75274	x86/smpboot: Fix INIT delay assignment for extended Intel Families Some old crusty CPUs need an extra delay that slows down booting. See the comment above 'init_udelay' for details. Newer CPUs don't need the delay. Right now, for Intel, Family 6 and only Family 6 skips the delay. That leaves out both the Family 15 (Pentium 4s) and brand new Family 18/19 models. The omission of Family 15 (Pentium 4s) seems like an oversight and 18/19 do not need the delay. Skip the delay on all Intel processors Family 6 and beyond. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250219184133.816753-11-sohil.mehta@intel.com	2025-03-19 11:19:50 +01:00
Sohil Mehta	58d1c1fd03	x86/smpboot: Remove confusing quirk usage in INIT delay Very old multiprocessor systems required a 10 msec delay between asserting and de-asserting INIT but modern processors do not require this delay. Over time the usage of the "quirk" wording while setting the INIT delay has become misleading. The code comments suggest that modern processors need to be quirked, which clears the default init_udelay of 10 msec, while legacy processors don't need the quirk and continue to use the default init_udelay. With a lot more modern processors, the wording should be inverted if at all needed. Instead, simplify the comments and the code by getting rid of "quirk" usage altogether and clarifying the following: - Old legacy processors -> Set the "legacy" 10 msec delay - Modern processors -> Do not set any delay No functional change. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250219184133.816753-10-sohil.mehta@intel.com	2025-03-19 11:19:48 +01:00
Sohil Mehta	337959860d	x86/acpi/cstate: Improve Intel Family model checks Update the Intel Family checks to consistently use Family 15 instead of Family 0xF. Also, get rid of one of the last usages of x86_model by using the new VFM checks. Update the incorrect comment since the check has changed since the initial commit: `ee1ca48fae` ("ACPI: Disable ARB_DISABLE on platforms where it is not needed") The two changes were: - `3e2ada5867` ("ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regression") removed the P4 - Family 15. - `03a05ed115` ("ACPI: Use the ARB_DISABLE for the CPU which model id is less than 0x0f.") got rid of CORE_YONAH - Family 6, model E. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-9-sohil.mehta@intel.com	2025-03-19 11:19:46 +01:00
Sohil Mehta	eb1ac33305	x86/cpu/intel: Replace Family 5 model checks with VFM ones Introduce names for some Family 5 models and convert some of the checks to be VFM based. Also, to keep the file sorted by family, move Family 5 to the top of the header file. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-8-sohil.mehta@intel.com	2025-03-19 11:19:44 +01:00
Sohil Mehta	fc866f2472	x86/cpu/intel: Replace Family 15 checks with VFM ones Introduce names for some old Pentium 4 models and replace the x86_model checks with VFM ones. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-7-sohil.mehta@intel.com	2025-03-19 11:19:43 +01:00
Sohil Mehta	eaa472f76d	x86/cpu/intel: Replace early Family 6 checks with VFM ones Introduce names for some old Pentium models and replace the x86_model checks with VFM ones. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-6-sohil.mehta@intel.com	2025-03-19 11:19:41 +01:00
Sohil Mehta	a8cb451458	x86/mtrr: Modify a x86_model check to an Intel VFM check Simplify one of the last few Intel x86_model checks in arch/x86 by substituting it with a VFM one. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-5-sohil.mehta@intel.com	2025-03-19 11:19:40 +01:00
Sohil Mehta	7e6b0a2e41	x86/microcode: Update the Intel processor flag scan check The Family model check to read the processor flag MSR is misleading and potentially incorrect. It doesn't consider Family while comparing the model number. The original check did have a Family number but it got lost/moved during refactoring. intel_collect_cpu_info() is called through multiple paths such as early initialization, CPU hotplug as well as IFS image load. Some of these flows would be error prone due to the ambiguous check. Correct the processor flag scan check to use a Family number and update it to a VFM based one to make it more readable. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-4-sohil.mehta@intel.com	2025-03-19 11:19:38 +01:00
Sohil Mehta	7e67f36172	x86/cpu/intel: Fix the MOVSL alignment preference for extended Families The alignment preference for 32-bit MOVSL based bulk memory move has been 8-byte for a long time. However this preference is only set for Family 6 and 15 processors. Use the same preference for upcoming Family numbers 18 and 19. Also, use a simpler VFM based check instead of switching based on Family numbers. Refresh the comment to reflect the new check. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250219184133.816753-3-sohil.mehta@intel.com	2025-03-19 11:19:31 +01:00
Sohil Mehta	680d9b2a56	x86/apic: Fix 32-bit APIC initialization for extended Intel Families APIC detection is currently limited to a few specific Families and will not match the upcoming Families >=18. Extend the check to include all Families 6 or greater. Also convert it to a VFM check to make it simpler. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-2-sohil.mehta@intel.com	2025-03-19 11:19:29 +01:00
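The idea behind the VFM (vendor/family/model) checks used throughout this series can be sketched as follows. This is an illustration under stated assumptions, not the kernel's definitions: the `DEMO_*` macros, the bit layout (vendor in bits 16 and up, family in bits 8 to 15, model in bits 0 to 7), the vendor value 0 for Intel and both helper functions are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing of vendor, family and model into one integer. */
#define DEMO_VFM_MAKE(vendor, family, model) \
	(((uint32_t)(vendor) << 16) | ((uint32_t)(family) << 8) | (uint32_t)(model))

#define DEMO_VENDOR_INTEL	0
#define DEMO_INTEL_FAM6_FIRST	DEMO_VFM_MAKE(DEMO_VENDOR_INTEL, 6, 0x01)

/*
 * Open-ended check: because family sits above model in the packed
 * value, one comparison covers Family 6, 15, 18, 19 and later.
 * Assumes the caller already knows the CPU is an Intel one.
 */
static inline bool demo_family6_or_later(uint32_t vfm)
{
	return vfm >= DEMO_INTEL_FAM6_FIRST;
}

/* Old style: an explicit family list that silently misses new families. */
static inline bool demo_old_family_list(unsigned int family)
{
	return family == 6 || family == 15;
}
```

With this layout a Family 18 part passes `demo_family6_or_later()` but not `demo_old_family_list()`, which is the kind of gap the "extended Families" fixes in this series describe.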
Ingo Molnar	a46f322661	x86/cpuid: Use u32 instead of uint32_t in <asm/cpuid/api.h> Use u32 instead of uint32_t in hypervisor_cpuid_base(). Yes, uint32_t is used in Xen code et al, but this is a core x86 architecture header and we should standardize on the type that is being used overwhelmingly in related x86 architecture code. The two types are the same so there should be no build warnings. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250317221824.3738853-6-mingo@kernel.org	2025-03-19 11:19:28 +01:00
Ingo Molnar	cfb4fc5f08	x86/cpuid: Standardize on u32 in <asm/cpuid/api.h> Convert all uses of 'unsigned int' to 'u32' in <asm/cpuid/api.h>. This is how a lot of the call sites are doing it, and the two types are equivalent in the C sense - but 'u32' better expresses that these are expressions of an immutable hardware ABI. Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Xin Li (Intel) <xin@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250317221824.3738853-5-mingo@kernel.org	2025-03-19 11:19:26 +01:00
Ingo Molnar	fb99ed1e00	x86/cpuid: Clean up <asm/cpuid/api.h> - Include <asm/cpuid/types.h> first, as is customary. This also has the side effect of build-testing the header dependency assumptions in the types header. - No newline necessary after the SPDX line - Newline necessary after inline function definitions - Rename native_cpuid_reg() to NATIVE_CPUID_REG(): it's a CPP macro, whose name we capitalize in such cases. - Prettify the CONFIG_PARAVIRT_XXL inclusion block a bit - Standardize register references in comments to EAX/EBX/ECX/etc., from the hodgepodge of references. - s/cpus/CPUs because why add noise to common acronyms? Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250317221824.3738853-4-mingo@kernel.org	2025-03-19 11:19:25 +01:00
Ingo Molnar	04a1007004	x86/cpuid: Clean up <asm/cpuid/types.h> - We have 0x0d, 0x9 and 0x1d as literals for the CPUID_LEAF definitions, pick a single, consistent style of 0xZZ literals. - Likewise, harmonize the style of the 'struct cpuid_regs' list of registers with that of 'enum cpuid_regs_idx'. Because while computers don't care about unnecessary visual noise, humans do. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250317221824.3738853-3-mingo@kernel.org	2025-03-19 11:19:23 +01:00
Ahmed S. Darwish	adc574269b	x86/cpuid: Refactor <asm/cpuid.h> In preparation for future commits where CPUID headers will be expanded, refactor the CPUID header <asm/cpuid.h> into: asm/cpuid/ ├── api.h └── types.h Move the CPUID data structures into <asm/cpuid/types.h> and the access APIs into <asm/cpuid/api.h>. Let <asm/cpuid.h> be just an include of <asm/cpuid/api.h> so that existing call sites do not break. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250317221824.3738853-2-mingo@kernel.org	2025-03-19 11:19:22 +01:00
Brian Gerst	82070bc042	x86/syscall/32: Add comment to conditional Add a CONFIG_X86_FRED comment, since this conditional is nested. Suggested-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-8-brgerst@gmail.com	2025-03-19 11:19:20 +01:00
Brian Gerst	6049395522	x86/syscall: Remove stray semicolons No functional change. Suggested-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-7-brgerst@gmail.com	2025-03-19 11:19:18 +01:00
Brian Gerst	9a93e29f16	x86/syscall: Move sys_ni_syscall() Move sys_ni_syscall() to kernel/process.c, and remove the now empty entry/common.c No functional changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-6-brgerst@gmail.com	2025-03-19 11:19:17 +01:00
Brian Gerst	21832247f2	x86/syscall/x32: Move x32 syscall table Since commit: `2e958a8a51` ("x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*") the ABI prefix for x32 syscalls is the same as native 64-bit syscalls. Move the x32 syscall table to syscall_64.c No functional changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-5-brgerst@gmail.com	2025-03-19 11:19:15 +01:00
Brian Gerst	01dfb48054	x86/syscall/64: Move 64-bit syscall dispatch code Move the 64-bit syscall dispatch code to syscall_64.c. No functional changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-4-brgerst@gmail.com	2025-03-19 11:19:04 +01:00
Brian Gerst	b634b02e2b	x86/syscall/32: Move 32-bit syscall dispatch code Move the 32-bit syscall dispatch code to syscall_32.c. No functional changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-3-brgerst@gmail.com	2025-03-19 11:19:01 +01:00
Brian Gerst	1ab7b5ed44	x86/xen: Move Xen upcall handler Move the upcall handler to Xen-specific files. No functional changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250314151220.862768-2-brgerst@gmail.com	2025-03-19 11:18:58 +01:00
Mario Limonciello	4476e7f814	x86/amd_node: Add a smn_read_register() helper Some of the ACP drivers will poll registers through SMN using read_poll_timeout() which requires returning the result of the register read as the argument. Add a helper to do just that. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250217231747.1656228-2-superm1@kernel.org	2025-03-19 11:18:48 +01:00
Mario Limonciello	9c19cc1f5f	x86/amd_node: Add support for debugfs access to SMN registers There are certain registers on AMD Zen systems that can only be accessed through SMN. Introduce a new interface that provides debugfs files for accessing SMN. As this introduces the capability for userspace to manipulate the hardware in unpredictable ways, taint the kernel when writing. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-3-b5cc997e471b@amd.com	2025-03-19 11:18:33 +01:00
Mario Limonciello	8351845307	x86/amd_node: Add SMN offsets to exclusive region access Offsets 0x60 and 0x64 are used internally by kernel drivers that call the amd_smn_read() and amd_smn_write() functions. If userspace accesses the regions at the same time as the kernel it may cause malfunctions in drivers using the offsets. Add these offsets to the exclusions so that the kernel is tainted if a non locked down userspace tries to access them. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-2-b5cc997e471b@amd.com	2025-03-19 11:18:23 +01:00
Yazen Ghannam	8a3dc0f7c4	x86/amd_node, platform/x86/amd/hsmp: Have HSMP use SMN through AMD_NODE The HSMP interface is just an SMN interface with different offsets. Define an HSMP wrapper in the SMN code and have the HSMP platform driver use that rather than a local solution. Also, remove the "root" member from AMD_NB, since there are no more users of it. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Carlos Bilbao <carlos.bilbao@kernel.org> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-1-b5cc997e471b@amd.com	2025-03-19 11:18:05 +01:00
Thorsten Blum	ad5a3a8f41	x86/mtrr: Use str_enabled_disabled() helper in print_mtrr_state() Remove hard-coded strings by using the str_enabled_disabled() helper function. Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/20250117144900.171684-2-thorsten.blum%40linux.dev	2025-03-19 11:17:56 +01:00
Vitaly Kuznetsov	d55f31e290	x86/entry: Add __init to ia32_emulation_override_cmdline() ia32_emulation_override_cmdline() is an early_param() arg and these are only needed at boot time. In fact, all other early_param() functions in arch/x86 seem to have '__init' annotation and ia32_emulation_override_cmdline() is the only exception. Fixes: `a11e097504` ("x86: Make IA32_EMULATION boot time configurable") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/all/20241210151650.1746022-1-vkuznets%40redhat.com	2025-03-19 11:17:37 +01:00
Sohil Mehta	07e4a6eec2	x86/cpufeatures: Warn about unmet CPU feature dependencies Currently, the cpuid_deps[] table is only exercised when a particular feature is explicitly disabled and clear_cpu_cap() is called. However, some of these listed dependencies might already be missing during boot. These types of errors shouldn't generally happen in production environments, but they could sometimes sneak through, especially when VMs and Kconfigs are in the mix. Also, the kernel might introduce artificial dependencies between unrelated features, such as making LAM depend on LASS. Unexpected failures can occur when the kernel tries to use such features. Add a simple boot-time scan of the cpuid_deps[] table to detect the missing dependencies. One option is to disable all of such features during boot, but that may cause regressions in existing systems. For now, just warn about the missing dependencies to create awareness. As a trade-off between spamming the kernel log and keeping track of all the features that have been warned about, only warn about the first missing dependency. Any subsequent unmet dependency will only be logged after the first one has been resolved. Features are typically represented through unsigned integers within the kernel, though some of them have user-friendly names if they are exposed via /proc/cpuinfo. Show the friendlier name if available, otherwise display the X86_FEATURE_* numerals to make it easier to identify the feature. Suggested-by: Tony Luck <tony.luck@intel.com> Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250313201608.3304135-1-sohil.mehta@intel.com	2025-03-19 11:17:31 +01:00
Pawan Gupta	722fa0dba7	x86/rfds: Exclude P-only parts from the RFDS affected list The affected CPU table (cpu_vuln_blacklist) marks Alderlake and Raptorlake P-only parts affected by RFDS. This is not true because only E-cores are affected by RFDS. With the current family/model matching it is not possible to differentiate the unaffected parts, as the affected and unaffected hybrid variants have the same model number. Add a cpu-type match as well for such parts so as to exclude P-only parts being marked as affected. Note, family/model and cpu-type enumeration could be inaccurate in virtualized environments. In a guest affected status is decided by RFDS_NO and RFDS_CLEAR bits exposed by VMMs. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-5-e8514dcaaff2@linux.intel.com	2025-03-19 11:17:23 +01:00
Pawan Gupta	adf2de5e8d	x86/cpu: Update x86_match_cpu() to also use cpu-type Non-hybrid CPU variants that share the same Family/Model could be differentiated by their cpu-type. x86_match_cpu() currently does not use cpu-type for CPU matching. Dave Hansen suggested using the conditions below to match CPU-type: 1. If CPU_TYPE_ANY (the wildcard), then matched 2. If hybrid, then matched 3. If !hybrid, look at the boot CPU and compare the cpu-type to determine if it is a match. This special case for hybrid systems allows a more compact vulnerability list. Imagine that "Haswell" CPUs might or might not be hybrid and that only Atom cores are vulnerable to Meltdown. That means there are three possibilities: 1. P-core only 2. Atom only 3. Atom + P-core (aka. hybrid) One might be tempted to code up the vulnerability list like this: MATCH( HASWELL, X86_FEATURE_HYBRID, MELTDOWN) MATCH_TYPE(HASWELL, ATOM, MELTDOWN) Logically, this matches #2 and #3. But that's a little silly. You would only ask for the "ATOM" match in cases where there WERE hybrid cores in play. You shouldn't have to _also_ ask for hybrid cores explicitly. In short, assume that processors that enumerate Hybrid==1 have a vulnerable core type. Update x86_match_cpu() to also match cpu-type. Also treat hybrid systems as special, and match them to any cpu-type. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-4-e8514dcaaff2@linux.intel.com	2025-03-19 11:17:11 +01:00
Pawan Gupta	00d7fc04b7	x86/cpu: Add cpu_type to struct x86_cpu_id In addition to matching vendor/family/model/feature, for hybrid variants it is required to also match cpu-type. For example, some CPU vulnerabilities like RFDS only affect a specific cpu-type. To be able to also match CPUs based on their type, add a new field "type" to struct x86_cpu_id which is used by the CPU-matching tables. Introduce X86_CPU_TYPE_ANY for the cases that don't care about the cpu-type. [ bp: Massage commit message. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-3-e8514dcaaff2@linux.intel.com	2025-03-19 11:17:03 +01:00
Pawan Gupta	c3390406ad	x86/cpu: Shorten CPU matching macro To add cpu-type to the existing CPU matching infrastructure, the base macro X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE would need _CPU_TYPE appended to its name. This makes an already long name longer, and somewhat incomprehensible. To avoid this, rename the base macro to X86_MATCH_CPU. The macro name doesn't need to explicitly tell everything that it matches. The arguments to the macro already hint at that. For consistency, use this base macro to define X86_MATCH_VFM and friends. Remove unused X86_MATCH_VENDOR_FAM_MODEL_FEATURE while at it. [ bp: Massage commit message. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-2-e8514dcaaff2@linux.intel.com	2025-03-19 11:16:46 +01:00
Pawan Gupta	7b9b54e23a	x86/cpu: Fix the description of X86_MATCH_VFM_STEPS() The comment needs to reflect an implementation change. No functional change. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-1-e8514dcaaff2@linux.intel.com	2025-03-19 11:16:33 +01:00
Xin Li (Intel)	da414d34b5	x86/cpufeatures: Use AWK to generate {REQUIRED|DISABLED}_MASK_BIT_SET in <asm/cpufeaturemasks.h> Generate the {REQUIRED|DISABLED}_MASK_BIT_SET macros in the newly added AWK script that generates <asm/cpufeaturemasks.h>. Suggested-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Brian Gerst <brgerst@gmail.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250228082338.73859-6-xin@zytor.com	2025-03-19 11:15:12 +01:00
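
The three cpu-type matching conditions listed in commit adf2de5e8d ("x86/cpu: Update x86_match_cpu() to also use cpu-type") can be illustrated with a small userspace C sketch. This is not the kernel's implementation: every `sketch_*` / `SKETCH_*` name, the enum values, and the struct layout are hypothetical stand-ins chosen for illustration, and only the order of the three checks is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cpu-type values; the kernel's real constants differ. */
enum sketch_cpu_type {
	SKETCH_CPU_TYPE_ANY = 0,	/* wildcard: table entry does not care */
	SKETCH_CPU_TYPE_ATOM,
	SKETCH_CPU_TYPE_CORE,
};

/* Hypothetical, minimal view of the running system. */
struct sketch_cpu {
	bool hybrid;				/* system enumerates Hybrid==1 */
	enum sketch_cpu_type boot_type;		/* cpu-type of the boot CPU */
};

/*
 * Apply the three conditions in the order the commit message gives:
 *   1. wildcard in the table entry  -> match
 *   2. hybrid system                -> match (any cpu-type)
 *   3. otherwise compare against the boot CPU's cpu-type
 */
bool sketch_cpu_type_matches(enum sketch_cpu_type wanted,
			     const struct sketch_cpu *cpu)
{
	if (wanted == SKETCH_CPU_TYPE_ANY)
		return true;
	if (cpu->hybrid)
		return true;
	return cpu->boot_type == wanted;
}

/* Walk through the commit message's "P-core only / Atom only / hybrid" cases. */
void sketch_self_check(void)
{
	struct sketch_cpu p_only    = { .hybrid = false, .boot_type = SKETCH_CPU_TYPE_CORE };
	struct sketch_cpu atom_only = { .hybrid = false, .boot_type = SKETCH_CPU_TYPE_ATOM };
	struct sketch_cpu hybrid    = { .hybrid = true,  .boot_type = SKETCH_CPU_TYPE_CORE };

	/* An entry that asks for ATOM skips P-only parts... */
	assert(!sketch_cpu_type_matches(SKETCH_CPU_TYPE_ATOM, &p_only));
	/* ...but matches Atom-only and hybrid parts. */
	assert(sketch_cpu_type_matches(SKETCH_CPU_TYPE_ATOM, &atom_only));
	assert(sketch_cpu_type_matches(SKETCH_CPU_TYPE_ATOM, &hybrid));
	/* The wildcard matches everything. */
	assert(sketch_cpu_type_matches(SKETCH_CPU_TYPE_ANY, &p_only));
}
```

Under these assumptions, a single table entry that asks for the Atom type covers both the "Atom only" and the "Atom + P-core" cases, which is the compaction the commit message argues for; no separate hybrid entry is needed.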

1337551 Commits