Commit Graph

10651 Commits

Author SHA1 Message Date
Linus Torvalds
7393febcb1 Merge tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "Mutexes:

   - Add killable flavor to guard definitions (Davidlohr Bueso)

   - Remove the list_head from struct mutex (Matthew Wilcox)

   - Rename mutex_init_lockep() (Davidlohr Bueso)

  rwsems:

   - Remove the list_head from struct rw_semaphore and
     replace it with a single pointer (Matthew Wilcox)

   - Fix logic error in rwsem_del_waiter() (Andrei Vagin)

  Semaphores:

   - Remove the list_head from struct semaphore (Matthew Wilcox)

  Jump labels:

   - Use ATOMIC_INIT() for initialization of .enabled (Thomas Weißschuh)

   - Remove workaround for old compilers in initializations
     (Thomas Weißschuh)

  Lock context analysis changes and improvements:

   - Add context analysis for rwsems (Peter Zijlstra)

   - Fix rwlock and spinlock lock context annotations (Bart Van Assche)

   - Fix rwlock support in <linux/spinlock_up.h> (Bart Van Assche)

   - Add lock context annotations in the spinlock implementation
     (Bart Van Assche)

   - signal: Fix the lock_task_sighand() annotation (Bart Van Assche)

   - ww-mutex: Fix the ww_acquire_ctx function annotations
     (Bart Van Assche)

   - Add lock context support in do_raw_{read,write}_trylock()
     (Bart Van Assche)

   - arm64, compiler-context-analysis: Permit alias analysis through
     __READ_ONCE() with CONFIG_LTO=y (Marco Elver)

   - Add __cond_releases() (Peter Zijlstra)

   - Add context analysis for mutexes (Peter Zijlstra)

   - Add context analysis for rtmutexes (Peter Zijlstra)

   - Convert futexes to compiler context analysis (Peter Zijlstra)

  Rust integration updates:

   - Add atomic fetch_sub() implementation (Andreas Hindborg)

   - Refactor various rust_helper_ methods for expansion (Boqun Feng)

   - Add Atomic<*{mut,const} T> support (Boqun Feng)

   - Add atomic operation helpers over raw pointers (Boqun Feng)

   - Add performance-optimal Flag type for atomic booleans, to avoid
     slow byte-sized RMWs on architectures that don't support them.
     (FUJITA Tomonori)

   - Misc cleanups and fixes (Andreas Hindborg, Boqun Feng, FUJITA
     Tomonori)

  LTO support updates:

   - arm64: Optimize __READ_ONCE() with CONFIG_LTO=y (Marco Elver)

   - compiler: Simplify generic RELOC_HIDE() (Marco Elver)

  Miscellaneous fixes and cleanups by Peter Zijlstra, Randy Dunlap,
  Thomas Weißschuh, Davidlohr Bueso and Mikhail Gavrilov"

* tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  compiler: Simplify generic RELOC_HIDE()
  locking: Add lock context annotations in the spinlock implementation
  locking: Add lock context support in do_raw_{read,write}_trylock()
  locking: Fix rwlock support in <linux/spinlock_up.h>
  lockdep: Raise default stack trace limits when KASAN is enabled
  cleanup: Optimize guards
  jump_label: remove workaround for old compilers in initializations
  jump_label: use ATOMIC_INIT() for initialization of .enabled
  futex: Convert to compiler context analysis
  locking/rwsem: Fix logic error in rwsem_del_waiter()
  locking/rwsem: Add context analysis
  locking/rtmutex: Add context analysis
  locking/mutex: Add context analysis
  compiler-context-analysys: Add __cond_releases()
  locking/mutex: Remove the list_head from struct mutex
  locking/semaphore: Remove the list_head from struct semaphore
  locking/rwsem: Remove the list_head from struct rw_semaphore
  rust: atomic: Update a safety comment in impl of `fetch_add()`
  rust: sync: atomic: Update documentation for `fetch_add()`
  rust: sync: atomic: Add fetch_sub()
  ...
2026-04-14 12:36:25 -07:00
Linus Torvalds
f21f7b5162 Merge tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso updates from Thomas Gleixner:

 - Make the handling of compat functions consistent and more robust

 - Rework the underlying data store so that it is dynamically allocated,
   which allows the conversion of the last holdout SPARC64 to the
   generic VDSO implementation

 - Rework the SPARC64 VDSO to utilize the generic implementation

 - Mop up the left overs of the non-generic VDSO support in the core
   code

 - Expand the VDSO selftest and make them more robust

 - Allow time namespaces to be enabled independently of the generic VDSO
   support, which was not possible before due to SPARC64 not using it

 - Various cleanups and improvements in the related code

* tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  timens: Use task_lock guard in timens_get*()
  timens: Use mutex guard in proc_timens_set_offset()
  timens: Simplify some calls to put_time_ns()
  timens: Add a __free() wrapper for put_time_ns()
  timens: Remove dependency on the vDSO
  vdso/timens: Move functions to new file
  selftests: vDSO: vdso_test_correctness: Add a test for time()
  selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
  selftests: vDSO: vdso_test_correctness: Handle different tv_usec types
  selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
  selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
  Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
  random: vDSO: Remove ifdeffery
  random: vDSO: Trim vDSO includes
  vdso/datapage: Trim down unnecessary includes
  vdso/datapage: Remove inclusion of gettimeofday.h
  vdso/helpers: Explicitly include vdso/processor.h
  vdso/gettimeofday: Add explicit includes
  random: vDSO: Add explicit includes
  MIPS: vdso: Explicitly include asm/vdso/vdso.h
  ...
2026-04-14 10:53:44 -07:00
Linus Torvalds
c1fe867b5b Merge tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:

 - A rework of the hrtimer subsystem to reduce the overhead for
   frequently armed timers, especially the hrtick scheduler timer:

     - Better timer locality decision

     - Simplification of the evaluation of the first expiry time by
       keeping track of the neighbor timers in the RB-tree by providing
       a RB-tree variant with neighbor links. That avoids walking the
       RB-tree on removal to find the next expiry time, but even more
       important allows to quickly evaluate whether a timer which is
       rearmed changes the position in the RB-tree with the modified
       expiry time or not. If not, the dequeue/enqueue sequence which
       both can end up in rebalancing can be completely avoided.

     - Deferred reprogramming of the underlying clock event device. This
       optimizes for the situation where a hrtimer callback sets the
       need resched bit. In that case the code attempts to defer the
       re-programming of the clock event device up to the point where
       the scheduler has picked the next task and has the next hrtick
       timer armed. In case that there is no immediate reschedule or
       soft interrupts have to be handled before reaching the reschedule
       point in the interrupt entry code the clock event is reprogrammed
       in one of those code paths to prevent that the timer becomes
       stale.

     - Support for clocksource coupled clockevents

       The TSC deadline timer is coupled to the TSC. The next event is
       programmed in TSC time. Currently this is done by converting the
       CLOCK_MONOTONIC based expiry value into a relative timeout,
       converting it into TSC ticks, reading the TSC adding the delta
       ticks and writing the deadline MSR.

       As the timekeeping core has the conversion factors for the TSC
       already, the whole back and forth conversion can be completely
       avoided. The timekeeping core calculates the reverse conversion
       factors from nanoseconds to TSC ticks and utilizes the base
       timestamps of TSC and CLOCK_MONOTONIC which are updated once per
       tick. This allows a direct conversion into the TSC deadline value
       without reading the time and as a bonus keeps the deadline
       conversion in sync with the TSC conversion factors, which are
       updated by adjtimex() on systems with NTP/PTP enabled.

     - Allow inlining of the clocksource read and clockevent write
       functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.

   With all those enhancements in place a hrtick enabled scheduler
   provides the same performance as without hrtick. But also other
   hrtimer users obviously benefit from these optimizations.

 - Robustness improvements and cleanups of historical sins in the
   hrtimer and timekeeping code.

 - Rewrite of the clocksource watchdog.

   The clocksource watchdog code has over time reached the state of an
   impenetrable maze of duct tape and staples. The original design,
   which was made in the context of systems far smaller than today, is
   based on the assumption that the to be monitored clocksource (TSC)
   can be trivially compared against a known to be stable clocksource
   (HPET/ACPI-PM timer).

   Over the years this rather naive approach turned out to have major
   flaws. Long delays between the watchdog invocations can cause wrap
   arounds of the reference clocksource. The access to the reference
   clocksource degrades on large multi-sockets systems dure to
   interconnect congestion. This has been addressed with various
   heuristics which degraded the accuracy of the watchdog to the point
   that it fails to detect actual TSC problems on older hardware which
   exposes slow inter CPU drifts due to firmware manipulating the TSC to
   hide SMI time.

   The rewrite addresses this by:

     - Restricting the validation against the reference clocksource to
       the boot CPU which is usually closest to the legacy block which
       contains the reference clocksource (HPET/ACPI-PM).

     - Do a round robin validation betwen the boot CPU and the other
       CPUs based only on the TSC with an algorithm similar to the TSC
       synchronization code during CPU hotplug.

     - Being more leniant versus remote timeouts

 - The usual tiny fixes, cleanups and enhancements all over the place

* tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  alarmtimer: Access timerqueue node under lock in suspend
  hrtimer: Fix incorrect #endif comment for BITS_PER_LONG check
  posix-timers: Fix stale function name in comment
  timers: Get this_cpu once while clearing the idle state
  clocksource: Rewrite watchdog code completely
  clocksource: Don't use non-continuous clocksources as watchdog
  x86/tsc: Handle CLOCK_SOURCE_VALID_FOR_HRES correctly
  MIPS: Don't select CLOCKSOURCE_WATCHDOG
  parisc: Remove unused clocksource flags
  hrtimer: Add a helper to retrieve a hrtimer from its timerqueue node
  hrtimer: Remove trailing comma after HRTIMER_MAX_CLOCK_BASES
  hrtimer: Mark index and clockid of clock base as const
  hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry event
  hrtimer: Drop spurious space in 'enum hrtimer_base_type'
  hrtimer: Don't zero-initialize ret in hrtimer_nanosleep()
  hrtimer: Remove hrtimer_get_expires_ns()
  timekeeping: Mark offsets array as const
  timekeeping/auxclock: Consistently use raw timekeeper for tk_setup_internals()
  timer_list: Print offset as signed integer
  tracing: Use explicit array size instead of sentinel elements in symbol printing
  ...
2026-04-14 10:27:07 -07:00
Linus Torvalds
2ad332b0e2 Merge tag 'core-debugobjects-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects update from Thomas Gleixner:
 "A trivial update for debugobjects to drop a pointless likely() around
  IS_ERR_OR_NULL()"

* tag 'core-debugobjects-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Drop likely() around !IS_ERR_OR_NULL()
2026-04-14 09:48:39 -07:00
Linus Torvalds
a970ed1881 Merge tag 'bitmap-for-v7.1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:

 - new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury)

 - drop unused __find_nth_andnot_bit() (Yury)

 - new tests and test improvements (Andy, Akinobu, Yury)

 - fixes for count_zeroes API (Yury)

 - cleanup bitmap_print_to_pagebuf() mess (Yury)

 - documentation updates (Andy, Kai, Kit).

* tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits)
  bitops: Update kernel-doc for sign_extendXX()
  powerpc/xive: simplify xive_spapr_debug_show()
  thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf()
  coresight: don't use bitmap_print_to_pagebuf()
  lib/prime_numbers: drop temporary buffer in dump_primes()
  drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or()
  ice: use bitmap_empty() in ice_vf_has_no_qs_ena
  ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx()
  bitmap: introduce bitmap_weighted_xor()
  bitmap: add test_zero_nbits()
  bitmap: exclude nbits == 0 cases from bitmap test
  bitmap: test bitmap_weight() for more
  asm-generic/bitops: Fix a comment typo in instrumented-atomic.h
  bitops: fix kernel-doc parameter name for parity8()
  lib: count_zeros: unify count_{leading,trailing}_zeros()
  lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  lib: crypto: fix comments for count_leading_zeros()
  x86/topology: use bitmap_weight_from()
  bitmap: add bitmap_weight_from()
  lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit()
  ...
2026-04-14 08:55:18 -07:00
Linus Torvalds
d142ab35ee Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:

 - Several improvements related to crc_kunit, to align with the standard
   KUnit conventions and make it easier for developers and CI systems to
   run this test suite

 - Add an arm64-optimized implementation of CRC64-NVME

 - Remove unused code for big endian arm64

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crc: arm64: Simplify intrinsics implementation
  lib/crc: arm64: Use existing macros for kernel-mode FPU cflags
  lib/crc: arm64: Drop unnecessary chunking logic from crc64
  lib/crc: arm64: Assume a little-endian kernel
  lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
  lib/crc: arm64: Drop check for CONFIG_KERNEL_MODE_NEON
  crypto: crc32c - Remove another outdated comment
  crypto: crc32c - Remove more outdated usage information
  kunit: configs: Enable all CRC tests in all_tests.config
  lib/crc: tests: Add a .kunitconfig file
  lib/crc: tests: Add CRC_ENABLE_ALL_FOR_KUNIT
  lib/crc: tests: Make crc_kunit test only the enabled CRC variants
2026-04-13 17:36:04 -07:00
Linus Torvalds
370c388319 Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:

 - Migrate more hash algorithms from the traditional crypto subsystem to
   lib/crypto/

   Like the algorithms migrated earlier (e.g. SHA-*), this simplifies
   the implementations, improves performance, enables further
   simplifications in calling code, and solves various other issues:

     - AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC)

         - Support these algorithms in lib/crypto/ using the AES library
           and the existing arm64 assembly code

         - Reimplement the traditional crypto API's "cmac(aes)",
           "xcbc(aes)", and "cbcmac(aes)" on top of the library

         - Convert mac80211 to use the AES-CMAC library. Note: several
           other subsystems can use it too and will be converted later

         - Drop the broken, nonstandard, and likely unused support for
           "xcbc(aes)" with key lengths other than 128 bits

         - Enable optimizations by default

     - GHASH

         - Migrate the standalone GHASH code into lib/crypto/

         - Integrate the GHASH code more closely with the very similar
           POLYVAL code, and improve the generic GHASH implementation to
           resist cache-timing attacks and use much less memory

         - Reimplement the AES-GCM library and the "gcm" crypto_aead
           template on top of the GHASH library. Remove "ghash" from the
           crypto_shash API, as it's no longer needed

         - Enable optimizations by default

     - SM3

         - Migrate the kernel's existing SM3 code into lib/crypto/, and
           reimplement the traditional crypto API's "sm3" on top of it

         - I don't recommend using SM3, but this cleanup is worthwhile
           to organize the code the same way as other algorithms

 - Testing improvements:

     - Add a KUnit test suite for each of the new library APIs

     - Migrate the existing ChaCha20Poly1305 test to KUnit

     - Make the KUnit all_tests.config enable all crypto library tests

     - Move the test kconfig options to the Runtime Testing menu

 - Other updates to arch-optimized crypto code:

     - Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine

     - Remove some MD5 implementations that are no longer worth keeping

     - Drop big endian and voluntary preemption support from the arm64
       code, as those configurations are no longer supported on arm64

 - Make jitterentropy and samples/tsm-mr use the crypto library APIs

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (66 commits)
  lib/crypto: arm64: Assume a little-endian kernel
  arm64: fpsimd: Remove obsolete cond_yield macro
  lib/crypto: arm64/sha3: Remove obsolete chunking logic
  lib/crypto: arm64/sha512: Remove obsolete chunking logic
  lib/crypto: arm64/sha256: Remove obsolete chunking logic
  lib/crypto: arm64/sha1: Remove obsolete chunking logic
  lib/crypto: arm64/poly1305: Remove obsolete chunking logic
  lib/crypto: arm64/gf128hash: Remove obsolete chunking logic
  lib/crypto: arm64/chacha: Remove obsolete chunking logic
  lib/crypto: arm64/aes: Remove obsolete chunking logic
  lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h>
  lib/crypto: aesgcm: Don't disable IRQs during AES block encryption
  lib/crypto: aescfb: Don't disable IRQs during AES block encryption
  lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
  lib/crypto: sparc: Drop optimized MD5 code
  lib/crypto: mips: Drop optimized MD5 code
  lib: Move crypto library tests to Runtime Testing menu
  crypto: sm3 - Remove 'struct sm3_state'
  crypto: sm3 - Remove the original "sm3_block_generic()"
  crypto: sm3 - Remove sm3_base.h
  ...
2026-04-13 17:31:39 -07:00
Linus Torvalds
de639344bb Merge tag 'audit-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:

 - Improved handling of unknown status requests from userspace

   The current kernel code ignores unknown/unused request bits sent from
   userspace and returns an error code based on the results of the
   request(s) it does understand. The patch from Ricardo fixes this so
   that unknown requests return an -EINVAL to userspace, making
   compatibility a bit easier moving forward.

 - A number of small style and formatting cleanups

* tag 'audit-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: handle unknown status requests in audit_receive_msg()
  audit: fix coding style issues
  audit: remove redundant initialization of static variables to 0
  audit: fix whitespace alignment in include/uapi/linux/audit.h
2026-04-13 14:56:54 -07:00
Linus Torvalds
26ff969926 Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
 "Toolchain and infrastructure:

   - Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).

     As proposed in LPC 2025 and the Maintainers Summit [1], we are
     going to follow Debian Stable's Rust versions as our minimum
     versions.

     Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
     'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
     kernel developers to upgrade.

     Other major distributions support a Rust version that is high
     enough as well, including:

       + Arch Linux.
       + Fedora Linux.
       + Gentoo Linux.
       + Nix.
       + openSUSE Slowroll and openSUSE Tumbleweed.
       + Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
         their versioned packages.

     The merged patch series comes with the associated cleanups and
     simplifications treewide that can be performed thanks to both
     bumps, as well as documentation updates.

     In addition, start using 'bindgen''s '--with-attribute-custom-enum'
     feature to set the 'cfi_encoding' attribute for the 'lru_status'
     enum used in Binder.

     Link: https://lwn.net/Articles/1050174/ [1]

   - Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
     inlines C helpers into Rust.

     Essentially, it performs a step similar to LTO, but just for the
     helpers, i.e. very local and fast.

     It relies on 'llvm-link' and its '--internalize' flag, and requires
     a compatible LLVM between Clang and 'rustc' (i.e. same major
     version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled
     for two architectures for now.

     The result is a measurable speedup in different workloads that
     different users have tested. For instance, for the null block
     driver, it amounts to a 2%.

   - Support global per-version flags.

     While we already have per-version flags in many places, we didn't
     have a place to set global ones that depend on the compiler
     version, i.e. in 'rust_common_flags', which sometimes is needed to
     e.g. tweak the lints set per version.

     Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
     since it had a change in behavior.

   - Support overriding the crate name and apply it to Rust Binder,
     which wanted the module to be called 'rust_binder'.

   - Add the remaining '__rust_helper' annotations (started in the
     previous cycle).

  'kernel' crate:

   - Introduce the 'const_assert!' macro: a more powerful version of
     'static_assert!' that can refer to generics inside functions or
     implementation bodies, e.g.:

         fn f<const N: usize>() {
             const_assert!(N > 1);
         }

         fn g<T>() {
             const_assert!(size_of::<T>() > 0, "T cannot be ZST");
         }

     In addition, reorganize our set of build-time assertion macros
     ('{build,const,static_assert}!') to live in the 'build_assert'
     module.

     Finally, improve the docs as well to clarify how these are
     different from one another and how to pick the right one to use,
     and their equivalence (if any) to the existing C ones for extra
     clarity.

   - 'sizes' module: add 'SizeConstants' trait.

     This gives us typed 'SZ_*' constants (avoiding casts) for use in
     device address spaces where the address width depends on the
     hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.),
     e.g.:

         let gpu_heap = 14 * u64::SZ_1M;
         let mmio_window = u32::SZ_16M;

   - 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus
     simplify the users in Tyr and PWM.

   - 'ptr' module: add 'const_align_up'.

   - 'str' module: improve the documentation of the 'c_str!' macro to
     explain that one should only use it for non-literal cases (for the
     other case we instead use C string literals, e.g. 'c"abc"').

   - Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such
     use in the 'task' module.

   - 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
     outside of the 'types' module, i.e. update the last remaining
     instances and finally remove the re-exports.

   - 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
     including runtime-tested examples.

     The intention is to hopefully prevent UB that assumes the result of
     the function is not 'NULL' if successful. This originated from a
     case of UB I noticed in 'regulator' that created a 'NonNull' on it.

  Timekeeping:

   - Expand the example section in the 'HrTimer' documentation.

   - Mark the 'ClockSource' trait as unsafe to ensure valid values for
     'ktime_get()'.

   - Add 'Delta::from_nanos()'.

  'pin-init' crate:

   - Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
     'ZeroableOption' for 'NonZero*'.

   - Improve feature gate handling for unstable features.

   - Declutter the documentation of implementations of 'Zeroable' for
     tuples.

   - Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.

  rust-analyzer:

   - Add type annotations to 'generate_rust_analyzer.py'.

   - Add support for scripts written in Rust ('generate_rust_target.rs',
     'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').

   - Refactor 'generate_rust_analyzer.py' to explicitly identify host
     and target crates, improve readability, and reduce duplication.

  And some other fixes, cleanups and improvements"

* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits)
  rust: sizes: add SizeConstants trait for device address space constants
  rust: kernel: update `file_with_nul` comment
  rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0
  rust: kbuild: support global per-version flags
  rust: declare cfi_encoding for lru_status
  docs: rust: general-information: use real example
  docs: rust: general-information: simplify Kconfig example
  docs: rust: quick-start: remove GDB/Binutils mention
  docs: rust: quick-start: remove Nix "unstable channel" note
  docs: rust: quick-start: remove Gentoo "testing" note
  docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title
  docs: rust: quick-start: update minimum Ubuntu version
  docs: rust: quick-start: update Ubuntu versioned packages
  docs: rust: quick-start: openSUSE provides `rust-src` package nowadays
  rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1
  rust: kbuild: update `bindgen --rust-target` version and replace comment
  rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1
  rust: rust_is_available: remove warning for `bindgen` 0.66.[01]
  rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie)
  rust: block: update `const_refs_to_static` MSRV TODO comment
  ...
2026-04-13 09:54:20 -07:00
Linus Torvalds
fdcbb1bc06 Merge branch 'nocache-cleanup'
This series cleans up some of the special user copy functions naming and
semantics.  In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.

To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.

The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.

This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions.  This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.

I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/

* nocache-cleanup:
  x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
  x86: rename and clean up __copy_from_user_inatomic_nocache()
  x86-64: rename misleadingly named '__copy_user_nocache()' function
2026-04-13 08:39:51 -07:00
Thomas Gleixner
ff1c0c5d07 Merge branch 'timers/urgent' into timers/core
to resolve the conflict with urgent fixes.
2026-04-11 07:58:33 +02:00
Yury Norov
7ca1d7f939 lib/prime_numbers: drop temporary buffer in dump_primes()
The function uses temporary buffer to convert primes bitmap into
human readable format. Switch to using kunit_info("%*pbl")", and
drop the buffer.

Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-04-09 13:28:05 -04:00
Christian Brauner
e3b2cf6e5d kernfs: pass struct ns_common instead of const void * for namespace tags
kernfs has historically used const void * to pass around namespace tags
used for directory-level namespace filtering. The only current user of
this is sysfs network namespace tagging where struct net pointers are
cast to void *.

Replace all const void * namespace parameters with const struct
ns_common * throughout the kernfs, sysfs, and kobject namespace layers.
This includes the kobj_ns_type_operations callbacks, kobject_namespace(),
and all sysfs/kernfs APIs that accept or return namespace tags.

Passing struct ns_common is needed because various codepaths require
access to the underlying namespace. A struct ns_common can always be
converted back to the concrete namespace type (e.g., struct net) via
container_of() or to_ns_common() in the reverse direction.

This is a preparatory change for switching to ns_id-based directory
iteration to prevent a KASLR pointer leak through the current use of
raw namespace pointers as hash seeds and comparison keys.

Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-09 14:36:52 +02:00
Miguel Ojeda
b06b348e85 Merge tag 'rust-timekeeping-for-v7.1' of https://github.com/Rust-for-Linux/linux into rust-next
Pull timekeeping updates from Andreas Hindborg:

 - Expand the example section in the 'HrTimer' documentation.

 - Mark the 'ClockSource' trait as unsafe to ensure valid values for
   'ktime_get()'.

 - Add 'Delta::from_nanos()'.

This is a back merge since the pull request has a newer base -- we will
avoid that in the future.

And, given it is a back merge, it happens to resolve the "subtle" conflict
around '--remap-path-{prefix,scope}' that I discussed in linux-next [1],
plus a few other common conflicts. The result matches what we did for
next-20260407.

The actual diffstat (i.e. using a temporary merge of upstream first) is:

    rust/kernel/time.rs         |  32 ++++-
    rust/kernel/time/hrtimer.rs | 336 ++++++++++++++++++++++++++++++++++++++++++++
    2 files changed, 362 insertions(+), 6 deletions(-)

Link: https://lore.kernel.org/linux-next/CANiq72kdxB=W3_CV1U44oOK3SssztPo2wLDZt6LP94TEO+Kj4g@mail.gmail.com/ [1]

* tag 'rust-timekeeping-for-v7.1' of https://github.com/Rust-for-Linux/linux:
  hrtimer: add usage examples to documentation
  rust: time: make ClockSource unsafe trait
  rust/time: Add Delta::from_nanos()
2026-04-08 10:44:11 +02:00
Ard Biesheuvel
8fdef85d60 lib/crc: arm64: Simplify intrinsics implementation
NEON intrinsics are useful because they remove the need for manual
register allocation, and the resulting code can be re-compiled and
optimized for different micro-architectures, and shared between arm64
and 32-bit ARM.

However, the strong typing of the vector variables can lead to
incomprehensible gibberish, as is the case with the new CRC64
implementation. To address this, let's repaint all variables as
uint64x2_t to minimize the number of vreinterpretq_xxx() calls, and to
be able to rely on the ^ operator for exclusive OR operations. This
makes the code much more concise and readable.

While at it, wrap the calls to vmull_p64() et al in order to have a more
consistent calling convention, and encapsulate any remaining
vreinterpret() calls that are still needed.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260330144630.33026-11-ardb@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-02 16:14:53 -07:00
Ard Biesheuvel
f956dc8131 lib/crc: arm64: Use existing macros for kernel-mode FPU cflags
Use the existing CC_FPU_CFLAGS and CC_NO_FPU_CFLAGS to pass the
appropriate compiler command line options for building kernel mode NEON
intrinsics code. This is tidier, and will make it easier to reuse the
code for 32-bit ARM.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260330144630.33026-9-ardb@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-02 16:14:53 -07:00
Ard Biesheuvel
e0718ed60d lib/crc: arm64: Drop unnecessary chunking logic from crc64
On arm64, kernel mode NEON executes with preemption enabled, so there is
no need to chunk the input by hand.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260330144630.33026-8-ardb@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-02 16:14:53 -07:00
Eric Biggers
5276ea17a2 lib/crc: arm64: Assume a little-endian kernel
Since support for big-endian arm64 kernels was removed, the CPU_LE()
macro now unconditionally emits the code it is passed, and the CPU_BE()
macro now unconditionally discards the code it is passed.

Simplify the assembly code in lib/crc/arm64/ accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401004431.151432-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-02 16:13:18 -07:00
Yury Norov
d57e74f104 bitmap: introduce bitmap_weighted_xor()
The function helps to XOR bitmaps and calculate Hamming weight of
the result in one pass.

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-04-01 20:03:07 -04:00
Eric Biggers
12b11e47f1 lib/crypto: arm64: Assume a little-endian kernel
Since support for big-endian arm64 kernels was removed, the CPU_LE()
macro now unconditionally emits the code it is passed, and the CPU_BE()
macro now unconditionally discards the code it is passed.

Simplify the assembly code in lib/crypto/arm64/ accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401003331.144065-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:15 -07:00
Eric Biggers
6d575f11c7 lib/crypto: arm64/sha3: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the SHA-3 code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
7116418f6b lib/crypto: arm64/sha512: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the SHA-512 code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
fe1233c2eb lib/crypto: arm64/sha256: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the SHA-256 code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
fd5017138c lib/crypto: arm64/sha1: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the SHA-1 code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
dec1061f0a lib/crypto: arm64/poly1305: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the Poly1305 code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
d3a5cc5c92 lib/crypto: arm64/gf128hash: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the GHASH and POLYVAL code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:10 -07:00
Eric Biggers
63fcc765e1 lib/crypto: arm64/chacha: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the ChaCha code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:09 -07:00
Eric Biggers
11d6bc70ff lib/crypto: arm64/aes: Remove obsolete chunking logic
Since commit aefbab8e77 ("arm64: fpsimd: Preserve/restore kernel mode
NEON at context switch"), kernel-mode NEON sections have been
preemptible on arm64.  And since commit 7dadeaa6e8 ("sched: Further
restrict the preemption modes"), voluntary preemption is no longer
supported on arm64 either.  Therefore, there's no longer any need to
limit the length of kernel-mode NEON sections on arm64.

Simplify the AES-CBC-MAC code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260401000548.133151-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-04-01 13:02:09 -07:00
Eric Biggers
8aeeb5255d lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h>
Since the lib/crypto/ files that include <crypto/algapi.h> need it only
for the transitive inclusion of <crypto/utils.h> (and not all the
traditional crypto API stuff that the rest of <crypto/algapi.h> is
filled with), replace these inclusions with direct inclusions of
<crypto/utils.h>.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260331024438.51783-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-31 17:19:31 -07:00
Eric Biggers
8f45af945f lib/crypto: aesgcm: Don't disable IRQs during AES block encryption
aes_encrypt() now uses AES instructions when available instead of always
using table-based code.  AES instructions are constant-time and don't
benefit from disabling IRQs as a constant-time hardening measure.

In fact, on two architectures (arm and riscv) disabling IRQs is
counterproductive because it prevents the AES instructions from being
used.  (See the may_use_simd() implementation on those architectures.)

Therefore, let's remove the IRQ disabling/enabling and leave the choice
of constant-time hardening measures to the AES library code.

Note that currently the arm table-based AES code (which runs on arm
kernels that don't have ARMv8 CE) disables IRQs, while the generic
table-based AES code does not.  So this does technically regress in
constant-time hardening when that generic code is used.  But as
discussed in commit a22fd0e3c4 ("lib/crypto: aes: Introduce improved
AES library") I think just leaving IRQs enabled is the right choice.
Disabling them is slow and can cause problems, and AES instructions
(which modern CPUs have) solve the problem in a much better way anyway.

Link: https://lore.kernel.org/r/20260331024430.51755-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-31 17:19:22 -07:00
Eric Biggers
1aa82df3eb lib/crypto: aescfb: Don't disable IRQs during AES block encryption
aes_encrypt() now uses AES instructions when available instead of always
using table-based code.  AES instructions are constant-time and don't
benefit from disabling IRQs as a constant-time hardening measure.

In fact, on two architectures (arm and riscv) disabling IRQs is
counterproductive because it prevents the AES instructions from being
used.  (See the may_use_simd() implementation on those architectures.)

Therefore, let's remove the IRQ disabling/enabling and leave the choice
of constant-time hardening measures to the AES library code.

Note that currently the arm table-based AES code (which runs on arm
kernels that don't have ARMv8 CE) disables IRQs, while the generic
table-based AES code does not.  So this does technically regress in
constant-time hardening when that generic code is used.  But as
discussed in commit a22fd0e3c4 ("lib/crypto: aes: Introduce improved
AES library") I think just leaving IRQs enabled is the right choice.
Disabling them is slow and can cause problems, and AES instructions
(which modern CPUs have) solve the problem in a much better way anyway.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260331024414.51545-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-31 17:19:15 -07:00
Linus Torvalds
809b997a5c x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
This finishes the work on these odd functions that were only implemented
by a handful of architectures.

The 'flushcache' function was only used from the iterator code, and
let's make it do the same thing that the nontemporal version does:
remove the two underscores and add the user address checking.

Yes, yes, the user address checking is also done at iovec import time,
but we have long since walked away from the old double-underscore thing
where we try to avoid address checking overhead at access time, and
these functions shouldn't be so special and old-fashioned.

The arm64 version already did the address check, in fact, so there it's
just a matter of renaming it.  For powerpc and x86-64 we now do the
proper user access boilerplate.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-03-30 15:05:57 -07:00
Linus Torvalds
5de7bcaadf x86: rename and clean up __copy_from_user_inatomic_nocache()
Similarly to the previous commit, this renames the somewhat confusingly
named function.  But in this case, it was at least less confusing: the
__copy_from_user_inatomic_nocache is indeed copying from user memory,
and it is indeed ok to be used in an atomic context, so it will not warn
about it.

But the previous commit also removed the NTB mis-use of the
__copy_from_user_inatomic_nocache() function, and as a result every
call-site is now _actually_ doing a real user copy.  That means that we
can now do the proper user pointer verification too.

End result: add proper address checking, remove the double underscores,
and change the "nocache" to "nontemporal" to more accurately describe
what this x86-only function actually does.  It might be worth noting
that only the target is non-temporal: the actual user accesses are
normal memory accesses.

Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's
before XMM2 in the Pentium III) we end up just falling back on a regular
user copy, so nothing can actually depend on the non-temporal semantics,
but that has always been true.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-03-30 15:05:57 -07:00
Linus Torvalds
d0c3bcd5b8 Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
 "Fix missing zeroization of the ChaCha state"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: chacha: Zeroize permuted_state before it leaves scope
2026-03-30 13:40:48 -07:00
Eric Biggers
d2a68aba85 lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
Move the ChaCha20Poly1305 test from an ad-hoc self-test to a KUnit test.

Keep the same test logic for now, just translated to KUnit.

Moving to KUnit has multiple benefits, such as:

- Consistency with the rest of the lib/crypto/ tests.

- Kernel developers familiar with KUnit, which is used kernel-wide, can
  quickly understand the test and how to enable and run it.

- The test will be automatically run by anyone using
  lib/crypto/.kunitconfig or KUnit's all_tests.config.

- Results are reported using the standard KUnit mechanism.

- It eliminates one of the few remaining back-references to crypto/ from
  lib/crypto/, specifically a reference to CONFIG_CRYPTO_SELFTESTS.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260327224229.137532-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-30 12:35:30 -07:00
Eric Biggers
23e5c306a2 lib/crypto: sparc: Drop optimized MD5 code
MD5 is obsolete.  Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky.  It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the SPARC optimized MD5 code.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for SPARC.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326203341.60393-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-30 12:35:16 -07:00
Eric Biggers
91cd9a0337 lib/crypto: mips: Drop optimized MD5 code
MD5 is obsolete.  Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky.  It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the MIPS Cavium Octeon optimized MD5
code.  Note that this code runs on only one particular line of SoCs.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for the same SoCs.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326204824.62010-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-30 12:35:05 -07:00
Gary Guo
3a2486cc1d kbuild: rust: provide an option to inline C helpers into Rust
A new experimental Kconfig option, `RUST_INLINE_HELPERS` is added to
allow C helpers (which were created to allow Rust to call into
inline/macro C functions without having to re-implement the logic in
Rust) to be inlined into Rust crates without performing global LTO.

If the option is enabled, the following is performed:
* For helpers, instead of compiling them to an object file to be linked
  into vmlinux, they're compiled to LLVM IR bitcode. Two versions are
  generated: one for built-in code (`helpers.bc`) and one for modules
  (`helpers_module.bc`, with -DMODULE defined). This ensures that C
  macros/inlines that behave differently for modules (e.g. static calls)
  function correctly when inlined.
* When a Rust crate or object is compiled, instead of generating an
  object file, LLVM bitcode is generated.
* llvm-link is invoked with --internalize to combine the helper bitcode
  with the crate bitcode. This step is similar to LTO, but this is much
  faster since it only needs to inline the helpers.
* clang is invoked to turn the combined bitcode into a final object file.
* Since clang may produce LLVM bitcode when LTO is enabled, and objtool
  requires ELF input, $(cmd_ld_single) is invoked to ensure the object
  is converted to ELF before objtool runs.

The --internalize flag tells llvm-link to treat all symbols in
helpers.bc using `internal` linkage [1]. This matches the behavior of
`clang` on `static inline` functions, and avoids exporting the symbol
from the object file.

To ensure that RUST_INLINE_HELPERS is not incompatible with BTF, we pass
the -g0 flag when building helpers. See commit 5daa0c35a1 ("rust:
Disallow BTF generation with Rust + LTO") for details.

We have an intended triple mismatch of `aarch64-unknown-none` vs
`aarch64-unknown-linux-gnu`, so we pass --suppress-warnings to llvm-link
to suppress it.

I considered adding some sort of check that KBUILD_MODNAME is not
present in helpers_module.bc, but this is actually not so easy to carry
out because .bc files store strings in a weird binary format, so you
cannot just grep it for a string to check whether it ended up using
KBUILD_MODNAME anywhere.

[ Andreas writes:

    For the rnull driver, enabling helper inlining with this patch
    gives an average speedup of 2% over the set of 120 workloads that
    we publish on [2].

    Link: https://rust-for-linux.com/null-block-driver [2]

  This series also uncovered a pre-existing UB instance thanks to an
  `objtool` warning which I noticed while testing the series (details
  in the mailing list).

      - Miguel ]

Link: https://github.com/llvm/llvm-project/pull/170397 [1]
Co-developed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Co-developed-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Link: https://patch.msgid.link/20260203-inline-helpers-v2-3-beb8547a03c9@google.com
[ Some changes, apart from the rebase:

  - Added "(EXPERIMENTAL)" to Kconfig as the commit mentions.

  - Added `depends on ARM64 || X86_64` and `!UML` for now, since this is
    experimental, other architectures may require other changes (e.g.
    the issues I mentioned in the mailing list for ARM and UML) and they
    are not really tested so far. So let arch maintainers pick this up
    if they think it is worth it.

  - Gated the `cmd_ld_single` step also into the new mode, which also
    means that any possible future `objcopy` step is done after the
    translation, as expected.

  - Added `.gitignore` for `.bc` with exception for existing script.

  - Added `part-of-*` for helpers bitcode files as discussed, and
    dropped `$(if $(filter %_module.bc,$@),-DMODULE)` since `-DMODULE`
    is already there (would be duplicated otherwise).

  - Moved `LLVM_LINK` to keep binutils list alphabetized.

  - Fixed typo in title.

  - Dropped second `cmd_ld_single` commit message paragraph.

      - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-03-30 02:03:52 +02:00
Demian Shulhan
63432fd625 lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON
Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR
software implementation is slow, which creates a bottleneck in NVMe and
other storage subsystems.

The acceleration is implemented using C intrinsics (<arm_neon.h>) rather
than raw assembly for better readability and maintainability.

Key highlights of this implementation:
- Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency
  spikes on large buffers.
- Pre-calculates and loads fold constants via vld1q_u64() to minimize
  register spilling.
- Benchmarks show the break-even point against the generic implementation
  is around 128 bytes. The PMULL path is enabled only for len >= 128.

Performance results (kunit crc_benchmark on Cortex-A72):
- Generic (len=4096): ~268 MB/s
- PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)

Signed-off-by: Demian Shulhan <demyansh@gmail.com>
Link: https://lore.kernel.org/r/20260329074338.1053550-1-demyansh@gmail.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-29 13:22:13 -07:00
Arnd Bergmann
2598ab9d63 bug: avoid format attribute warning for clang as well
Like gcc, clang-22 now also warns about a function that it incorrectly
identifies as a printf-style format:

lib/bug.c:190:22: error: diagnostic behavior may be improved by adding the 'format(printf, 1, 0)' attribute to the declaration of '__warn_printf' [-Werror,-Wmissing-format-attribute]
  179 | static void __warn_printf(const char *fmt, struct pt_regs *regs)
      | __attribute__((format(printf, 1, 0)))
  180 | {
  181 |         if (!fmt)
  182 |                 return;
  183 |
  184 | #ifdef HAVE_ARCH_BUG_FORMAT_ARGS
  185 |         if (regs) {
  186 |                 struct arch_va_list _args;
  187 |                 va_list *args = __warn_args(&_args, regs);
  188 |
  189 |                 if (args) {
  190 |                         vprintk(fmt, *args);
      |                                           ^

Revert the change that added a gcc-specific workaround, and instead add
the generic annotation that avoid the warning.

Link: https://lkml.kernel.org/r/20260323205534.1284284-1-arnd@kernel.org
Fixes: d36067d6ea ("bug: Hush suggest-attribute=format for __warn_printf()")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/all/20251208141618.2805983-1-andriy.shevchenko@linux.intel.com/T/#u
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Eric Biggers
e5046823f8 lib/crypto: chacha: Zeroize permuted_state before it leaves scope
Since the ChaCha permutation is invertible, the local variable
'permuted_state' is sufficient to compute the original 'state', and thus
the key, even after the permutation has been done.

While the kernel is quite inconsistent about zeroizing secrets on the
stack (and some prominent userspace crypto libraries don't bother at all
since it's not guaranteed to work anyway), the kernel does try to do it
as a best practice, especially in cases involving the RNG.

Thus, explicitly zeroize 'permuted_state' before it goes out of scope.

Fixes: c08d0e6473 ("crypto: chacha20 - Add a generic ChaCha20 stream cipher implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326032920.39408-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-27 13:35:35 -07:00
Thomas Weißschuh
5dc9cf835a vdso/timens: Move functions to new file
As a preparation of the untangling of time namespaces and the vDSO, move
the glue functions between those subsystems into a new file.

While at it, switch the mutex lock and mmap_read_lock() in the vDSO
namespace code to guard().

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260326-vdso-timens-decoupling-v2-1-c82693a7775f@linutronix.de
2026-03-26 15:44:22 +01:00
Philipp Hahn
723ddce93e debugobjects: Drop likely() around !IS_ERR_OR_NULL()
IS_ERR_OR_NULL() already uses likely(!ptr) internally. checkpatch points
out the nesting: Remove the explicit use of likely().

Change generated with coccinelle.

Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260310-b4-is_err_or_null-v1-59-bd63b656022d@avm.de
2026-03-25 18:07:39 +01:00
Yury Norov
95d324fb1b bitmap: add test_zero_nbits()
In most real-life cases, 0-length bitmap provided by user is a sign of
an error. The API doesn't provide any guarantees on returned value, and
the bitmap pointers are not dereferenced.

Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-03-24 13:39:53 -04:00
Marco Elver
a21c1e961d compiler: Simplify generic RELOC_HIDE()
When enabling Context Analysis (CONTEXT_ANALYSIS := y) in arch/x86/kvm
code, Clang's Thread Safety Analysis failed to recognize that identical
per_cpu() accesses refer to the same lock:

|   CC [M]  arch/x86/kvm/vmx/posted_intr.o
| arch/x86/kvm/vmx/posted_intr.c:186:2: error: releasing raw_spinlock '__ptr + __per_cpu_offset[vcpu->cpu]' that was not held [-Werror,-Wthread-safety-analysis]
|   186 |         raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
|       |         ^
| ./include/linux/spinlock.h:276:32: note: expanded from macro 'raw_spin_unlock'
|   276 | #define raw_spin_unlock(lock)           _raw_spin_unlock(lock)
|       |                                         ^
| arch/x86/kvm/vmx/posted_intr.c:207:1: error: raw_spinlock '__ptr + __per_cpu_offset[vcpu->cpu]' is still held at the end of function [-Werror,-Wthread-safety-analysis]
|   207 | }
|       | ^
| arch/x86/kvm/vmx/posted_intr.c:182:2: note: raw_spinlock acquired here
|   182 |         raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
|       |         ^
| ./include/linux/spinlock.h:235:2: note: expanded from macro 'raw_spin_lock_nested'
|   235 |         _raw_spin_lock(((void)(subclass), (lock)))
|       |         ^
| 2 errors generated.

This occurred because the default RELOC_HIDE() implementation (used by
the per-CPU macros) is a statement expression containing an intermediate
'unsigned long' variable (this version appears to predate Git history).

While the analysis strips away inner casts when resolving pointer
aliases, it stops when encountering intermediate non-pointer variables
(this is Thread Safety Analysis specific and irrelevant for codegen).
This prevents the analysis from concluding that the pointers passed to
e.g. raw_spin_lock() and raw_spin_unlock() were identical when per-CPU
accessors are used.

Simplify RELOC_HIDE() to a single expression. This preserves the intent
of obfuscating UB-introducing out-of-bounds pointer calculations from
the compiler via the 'unsigned long' cast, but allows the alias analysis
to successfully resolve the pointers.

Using a recent Clang version, I observe that generated code remains the
same for vmlinux; the intermediate variable was already being optimized
away (for any respectable modern compiler, not doing so would be an
optimizer bug). Note that GCC provides its own version of RELOC_HIDE(),
so this change only affects Clang builds.

Add a test case to lib/test_context-analysis.c to catch any regressions.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/all/e3946223-4543-4a76-a328-9c6865e95192@acm.org/
Link: https://patch.msgid.link/20260319135245.1420780-1-elver@google.com
2026-03-24 15:08:05 +01:00
Eric Biggers
7ac21b4032 lib: Move crypto library tests to Runtime Testing menu
Currently the kconfig options for the crypto library KUnit tests appear
in the menu:

    -> Library routines
      -> Crypto library routines

However, this is the only content of "Crypto library routines".  I.e.,
it is empty when CONFIG_KUNIT=n.  This is because the crypto library
routines themselves don't have (or need to have) prompts.

Since this usually ends up as an unnecessary empty menu, let's remove
this menu and instead source the lib/crypto/tests/Kconfig file from
lib/Kconfig.debug inside the "Runtime Testing" menu:

    -> Kernel hacking
      -> Kernel Testing and Coverage
        -> Runtime Testing

This puts the prompts alongside the ones for most of the other lib/
KUnit tests.  This seems to be a much better match to how the kconfig
menus are organized.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20260322032438.286296-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-23 17:50:59 -07:00
Eric Biggers
ef01e1eafb crypto: sm3 - Remove the original "sm3_block_generic()"
Since the architecture-optimized SM3 code was migrated into lib/crypto/,
sm3_block_generic() is no longer called.  Remove it.  Then, since this
frees up the name, rename sm3_transform() to sm3_block_generic()
(matching the naming convention used in other hash algorithms).

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260321040935.410034-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-23 17:50:59 -07:00
Eric Biggers
17ba6108d3 lib/crypto: x86/sm3: Migrate optimized code into library
Instead of exposing the x86-optimized SM3 code via an x86-specific
crypto_shash algorithm, instead just implement the sm3_blocks() library
function.  This is much simpler, it makes the SM3 library functions be
x86-optimized, and it fixes the longstanding issue where the
x86-optimized SM3 code was disabled by default.  SM3 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Tweak the prototype of sm3_transform_avx() to match what the library
expects, including changing the block count to size_t.  Note that the
assembly code actually already treated this argument as size_t.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260321040935.410034-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-23 17:50:59 -07:00
Eric Biggers
5f6bbba5e9 lib/crypto: riscv/sm3: Migrate optimized code into library
Instead of exposing the riscv-optimized SM3 code via a riscv-specific
crypto_shash algorithm, instead just implement the sm3_blocks() library
function.  This is much simpler, it makes the SM3 library functions be
riscv-optimized, and it fixes the longstanding issue where the
riscv-optimized SM3 code was disabled by default.  SM3 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Tweak the prototype of sm3_transform_zvksh_zvkb() to match what the
library expects, including changing the block count to size_t.
Note that the assembly code already treated it as size_t.

Note: to see the diff from arch/riscv/crypto/sm3-riscv64-glue.c to
lib/crypto/riscv/sm3.h, view this commit with 'git show -M10'.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260321040935.410034-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-23 17:50:59 -07:00
Eric Biggers
9f69f52b46 lib/crypto: arm64/sm3: Migrate optimized code into library
Instead of exposing the arm64-optimized SM3 code via arm64-specific
crypto_shash algorithms, instead just implement the sm3_blocks() library
function.  This is much simpler, it makes the SM3 library functions be
arm64-optimized, and it fixes the longstanding issue where the
arm64-optimized SM3 code was disabled by default.  SM3 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Tweak the SM3 assembly function prototypes to match what the library
expects, including changing the block count from 'int' to 'size_t'.
sm3_ce_transform() had to be updated to access 'x2' instead of 'w2',
while sm3_neon_transform() already used 'x2'.

Remove the CFI stubs which are no longer needed because the SM3 assembly
functions are no longer ever indirectly called.

Remove the dependency on KERNEL_MODE_NEON.  It was unnecessary, because
KERNEL_MODE_NEON is always enabled on arm64.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260321040935.410034-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-23 17:50:59 -07:00