Commit Graph

3828 Commits

Author SHA1 Message Date
Linus Torvalds
334fbe734e Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
7de6b4a246 Merge tag 'wq-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:

 - New default WQ_AFFN_CACHE_SHARD affinity scope subdivides LLCs into
   smaller shards to improve scalability on machines with many CPUs per
   LLC

 - Misc:
    - system_dfl_long_wq for long unbound works
    - devm_alloc_workqueue() for device-managed allocation
    - sysfs exposure for ordered workqueues and the EFI workqueue
    - removal of HK_TYPE_WQ from wq_unbound_cpumask
    - various small fixes

* tag 'wq-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (21 commits)
  workqueue: validate cpumask_first() result in llc_populate_cpu_shard_id()
  workqueue: use NR_STD_WORKER_POOLS instead of hardcoded value
  workqueue: avoid unguarded 64-bit division
  docs: workqueue: document WQ_AFFN_CACHE_SHARD affinity scope
  workqueue: add test_workqueue benchmark module
  tools/workqueue: add CACHE_SHARD support to wq_dump.py
  workqueue: set WQ_AFFN_CACHE_SHARD as the default affinity scope
  workqueue: add WQ_AFFN_CACHE_SHARD affinity scope
  workqueue: fix typo in WQ_AFFN_SMT comment
  workqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask
  workqueue: unlink pwqs from wq->pwqs list in alloc_and_link_pwqs() error path
  workqueue: Remove NULL wq WARN in __queue_delayed_work()
  workqueue: fix parse_affn_scope() prefix matching bug
  workqueue: devres: Add device-managed allocate workqueue
  workqueue: Add system_dfl_long_wq for long unbound works
  tools/workqueue/wq_dump.py: add NODE prefix to all node columns
  tools/workqueue/wq_dump.py: fix column alignment in node_nr/max_active section
  tools/workqueue/wq_dump.py: remove backslash separator from node_nr/max_active header
  efi: Allow to expose the workqueue via sysfs
  workqueue: Allow to expose ordered workqueues via sysfs
  ...
2026-04-15 10:32:08 -07:00
Linus Torvalds
00c6649baf Merge tag 'media/v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:

 - new CSI tegra support, covering Tegra20 and Tegra30

 - new camera sensor drivers: T4ka3 and ov2732

 - m88ds3103: add 3103c chip support

 - uvcvideo: add support for Intel RealSense D436/D555 and P010 pixel format

 - synopsys csi2rx: add i.MX93 support

 - imx8-isi: add i.MX95 support

 - imx8mq-mipi-csi2: add i.MX8ULP support

 - dw100: add V4L2 requests support

 - support for DTV devices from Hauppauge got some improvements

 - media staging: dropped starfive-camss driver

 - media docs: document multi-committers model and improve maint profile

 - media core:
    - add v4l2_subdev_get_frame_desc_passthrough() helper
    - improve error handling in fwnode parsing

 - lots of driver fixes, cleanups and improvements

* tag 'media/v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (251 commits)
  Revert "media: cx231xx: add USB ID 2040:8360 for Hauppauge WinTV-HVR-935"
  media: synopsys: csi2rx: add i.MX93 support
  media: dt-bindings: add NXP i.MX93 compatible string
  media: synopsys: csi2rx: Use enum and u32 array for register offsets
  media: synopsys: csi2rx: implement .get_frame_desc() callback
  media: synopsys: csi2rx: only check errors from devm_clk_bulk_get_all()
  media: synopsys: csi2rx: use devm_reset_control_get_optional_exclusive()
  media: i2c: imx283: add support for non-continuous MIPI clock mode
  media: i2c: ov08d10: add support for 24 MHz input clock
  media: i2c: ov08d10: add support for reset and power management
  media: i2c: ov08d10: add support for binding via device tree
  dt-bindings: media: i2c: document Omnivision OV08D10 CMOS image sensor
  media: i2c: ov08d10: add missing newline to prints
  media: i2c: ov08d10: fix some typos in comments
  media: i2c: ov08d10: remove duplicate register write
  media: i2c: ov08d10: fix image vertical start setting
  media: i2c: ov08d10: fix runtime PM handling in probe
  staging: media: ipu7: Update TODO
  media: Add t4ka3 camera sensor driver
  media: i2c: Add ov2732 image sensor driver
  ...
2026-04-15 08:32:10 -07:00
Linus Torvalds
91a4855d6c Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support HW queue leasing, allowing containers to be granted access
     to HW queues for zero-copy operations and AF_XDP

   - Number of code moves to help the compiler with inlining. Avoid
     output arguments for returning drop reason where possible

   - Rework drop handling within qdiscs to include more metadata about
     the reason and dropping qdisc in the tracepoints

   - Remove the rtnl_lock use from IP Multicast Routing

   - Pack size information into the Rx Flow Steering table pointer
     itself. This allows making the table itself a flat array of u32s,
     thus making the table allocation size a power of two

   - Report TCP delayed ack timer information via socket diag

   - Add ip_local_port_step_width sysctl to allow distributing the
     randomly selected ports more evenly throughout the allowed space

   - Add support for per-route tunsrc in IPv6 segment routing

   - Start work of switching sockopt handling to iov_iter

   - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
     buffer size drifting up

   - Support MSG_EOR in MPTCP

   - Add stp_mode attribute to the bridge driver for STP mode selection.
     This addresses concerns about call_usermodehelper() usage

   - Remove UDP-Lite support (as announced in 2023)

   - Remove support for building IPv6 as a module. Remove the now
     unnecessary function calling indirection

  Cross-tree stuff:

   - Move Michael MIC code from generic crypto into wireless, it's
     considered insecure but some WiFi networks still need it

  Netfilter:

   - Switch nft_fib_ipv6 module to no longer need temporary dst_entry
     object allocations by using fib6_lookup() + RCU.

     Florian W reports this gets us ~13% higher packet rate

   - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
     switch the service tables to be per-net. Convert some code that
     walks the service lists to use RCU instead of the service_mutex

   - Add more opinionated input validation to lower security exposure

   - Make IPVS hash tables to be per-netns and resizable

  Wireless:

   - Finished assoc frame encryption/EPPKE/802.1X-over-auth

   - Radar detection improvements

   - Add 6 GHz incumbent signal detection APIs

   - Multi-link support for FILS, probe response templates and client
     probing

   - New APIs and mac80211 support for NAN (Neighbor Aware Networking,
     aka Wi-Fi Aware) so less work must be in firmware

  Driver API:

   - Add numerical ID for devlink instances (to avoid having to create
     fake bus/device pairs just to have an ID). Support shared devlink
     instances which span multiple PFs

   - Add standard counters for reporting pause storm events (implement
     in mlx5 and fbnic)

   - Add configuration API for completion writeback buffering (implement
     in mana)

   - Support driver-initiated change of RSS context sizes

   - Support DPLL monitoring input frequency (implement in zl3073x)

   - Support per-port resources in devlink (implement in mlx5)

  Misc:

   - Expand the YAML spec for Netfilter

  Drivers

   - Software:
      - macvlan: support multicast rx for bridge ports with shared
        source MAC address
      - team: decouple receive and transmit enablement for IEEE 802.3ad
        LACP "independent control"

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - support high order pages in zero-copy mode (for payload
           coalescing)
         - support multiple packets in a page (for systems with 64kB
           pages)
      - Broadcom 25-400GE (bnxt):
         - implement XDP RSS hash metadata extraction
         - add software fallback for UDP GSO, lowering the IOMMU cost
      - Broadcom 800GE (bnge):
         - add link status and configuration handling
         - add various HW and SW statistics
      - Marvell/Cavium:
         - NPC HW block support for cn20k
      - Huawei (hinic3):
         - add mailbox / control queue
         - add rx VLAN offload
         - add driver info and link management

   - Ethernet NICs:
      - Marvell/Aquantia:
         - support reading SFP module info on some AQC100 cards
      - Realtek PCI (r8169):
         - add support for RTL8125cp
      - Realtek USB (r8152):
         - support for the RTL8157 5Gbit chip
         - add 2500baseT EEE status/configuration support

   - Ethernet NICs embedded and off-the-shelf IP:
      - Synopsys (stmmac):
         - cleanup and reorganize SerDes handling and PCS support
         - cleanup descriptor handling and per-platform data
         - cleanup and consolidate MDIO defines and handling
         - shrink driver memory use for internal structures
         - improve Tx IRQ coalescing
         - improve TCP segmentation handling
         - add support for Spacemit K3
      - Cadence (macb):
         - support PHYs that have inband autoneg disabled with GEM
         - support IEEE 802.3az EEE
         - rework usrio capabilities and handling
      - AMD (xgbe):
         - improve power management for S0i3
         - improve TX resilience for link-down handling

   - Virtual:
      - Google cloud vNIC:
         - support larger ring sizes in DQO-QPL mode
         - improve HW-GRO handling
         - support UDP GSO for DQO format
      - PCIe NTB:
         - support queue count configuration

   - Ethernet PHYs:
      - automatically disable PHY autonomous EEE if MAC is in charge
      - Broadcom:
         - add BCM84891/BCM84892 support
      - Micrel:
         - support for LAN9645X internal PHY
      - Realtek:
         - add RTL8224 pair order support
         - support PHY LEDs on RTL8211F-VD
         - support spread spectrum clocking (SSC)
      - Maxlinear:
         - add PHY-level statistics via ethtool

   - Ethernet switches:
      - Maxlinear (mxl862xx):
         - support for bridge offloading
         - support for VLANs
         - support driver statistics

   - Bluetooth:
      - large number of fixes and new device IDs
      - Mediatek:
         - support MT6639 (MT7927)
         - support MT7902 SDIO

   - WiFi:
      - Intel (iwlwifi):
         - UNII-9 and continuing UHR work
      - MediaTek (mt76):
         - mt7996/mt7925 MLO fixes/improvements
         - mt7996 NPU support (HW eth/wifi traffic offload)
      - Qualcomm (ath12k):
         - monitor mode support on IPQ5332
         - basic hwmon temperature reporting
         - support IPQ5424
      - Realtek:
         - add USB RX aggregation to improve performance
         - add USB TX flow control by tracking in-flight URBs

   - Cellular:
      - IPA v5.2 support"

* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
  net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
  wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
  wireguard: allowedips: remove redundant space
  tools: ynl: add sample for wireguard
  wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
  MAINTAINERS: Add netkit selftest files
  selftests/net: Add additional test coverage in nk_qlease
  selftests/net: Split netdevsim tests from HW tests in nk_qlease
  tools/ynl: Make YnlFamily closeable as a context manager
  net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
  net: airoha: Fix VIP configuration for AN7583 SoC
  net: caif: clear client service pointer on teardown
  net: strparser: fix skb_head leak in strp_abort_strp()
  net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
  selftests/bpf: add test for xdp_master_redirect with bond not up
  net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
  net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
  sctp: disable BH before calling udp_tunnel_xmit_skb()
  sctp: fix missing encap_port propagation for GSO fragments
  net: airoha: Rely on net_device pointer in ETS callbacks
  ...
2026-04-14 18:36:10 -07:00
Linus Torvalds
c43267e679 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
 "The biggest changes are MPAM enablement in drivers/resctrl and new PMU
  support under drivers/perf.

  On the core side, FEAT_LSUI lets futex atomic operations with EL0
  permissions, avoiding PAN toggling.

  The rest is mostly TLB invalidation refactoring, further generic entry
  work, sysreg updates and a few fixes.

  Core features:

   - Add support for FEAT_LSUI, allowing futex atomic operations without
     toggling Privileged Access Never (PAN)

   - Further refactor the arm64 exception handling code towards the
     generic entry infrastructure

   - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
     through it

  Memory management:

   - Refactor the arm64 TLB invalidation API and implementation for
     better control over barrier placement and level-hinted invalidation

   - Enable batched TLB flushes during memory hot-unplug

   - Fix rodata=full block mapping support for realm guests (when
     BBML2_NOABORT is available)

  Perf and PMU:

   - Add support for a whole bunch of system PMUs featured in NVIDIA's
     Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
     for CPU/C2C memory latency PMUs)

   - Clean up iomem resource handling in the Arm CMN driver

   - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}

  MPAM (Memory Partitioning And Monitoring):

   - Add architecture context-switch and hiding of the feature from KVM

   - Add interface to allow MPAM to be exposed to user-space using
     resctrl

   - Add errata workaround for some existing platforms

   - Add documentation for using MPAM and what shape of platforms can
     use resctrl

  Miscellaneous:

   - Check DAIF (and PMR, where relevant) at task-switch time

   - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
     (only relevant to asynchronous or asymmetric tag check modes)

   - Remove a duplicate allocation in the kexec code

   - Remove redundant save/restore of SCS SP on entry to/from EL0

   - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
     descriptions

   - Add kselftest coverage for cmpbr_sigill()

   - Update sysreg definitions"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits)
  arm64: rsi: use linear-map alias for realm config buffer
  arm64: Kconfig: fix duplicate word in CMDLINE help text
  arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
  arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
  arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
  arm64: kexec: Remove duplicate allocation for trans_pgd
  ACPI: AGDI: fix missing newline in error message
  arm64: Check DAIF (and PMR) at task-switch time
  arm64: entry: Use split preemption logic
  arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
  arm64: entry: Consistently prefix arm64-specific wrappers
  arm64: entry: Don't preempt with SError or Debug masked
  entry: Split preemption from irqentry_exit_to_kernel_mode()
  entry: Split kernel mode logic from irqentry_{enter,exit}()
  entry: Move irqentry_enter() prototype later
  entry: Remove local_irq_{enable,disable}_exit_to_user()
  ...
2026-04-14 16:48:56 -07:00
Linus Torvalds
e9635f2a73 Merge tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FRED updates from Borislav Petkov:
 "We made the FRED support an opt-in initially out of fear of it
  breaking machines left and right in the case of a hw bug in the first
  generation of machines supporting it.

  Now that that the FRED code has seen a lot of hammering, flip the
  logic to be opt-out as is the usual case with new hw features"

* tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fred: Remove kernel log message when initializing exceptions
  x86/fred: Enable FRED by default
2026-04-14 14:50:51 -07:00
Linus Torvalds
9f2bb6c7b3 Merge tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Dave Hansen:

 - Complete LASS enabling: deal with vsyscall and EFI

   The existing Linear Address Space Separation (LASS) support punted
   on support for common EFI and vsyscall configs. Complete the
   implementation by supporting EFI and vsyscall=xonly.

 - Clean up CPUID usage in newer Intel "avs" audio driver and update the
   x86-cpuid-db file

* tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0
  ASoC: Intel: avs: Include CPUID header at file scope
  ASoC: Intel: avs: Check maximum valid CPUID leaf
  x86/cpu: Remove LASS restriction on vsyscall emulation
  x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
  x86/vsyscall: Restore vsyscall=xonly mode under LASS
  x86/traps: Consolidate user fixups in the #GP handler
  x86/vsyscall: Reorganize the page fault emulation code
  x86/cpu: Remove LASS restriction on EFI
  x86/efi: Disable LASS while executing runtime services
  x86/cpu: Defer LASS enabling until userspace comes up
2026-04-14 14:24:45 -07:00
Linus Torvalds
c1fe867b5b Merge tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:

 - A rework of the hrtimer subsystem to reduce the overhead for
   frequently armed timers, especially the hrtick scheduler timer:

     - Better timer locality decision

     - Simplification of the evaluation of the first expiry time by
       keeping track of the neighbor timers in the RB-tree by providing
       a RB-tree variant with neighbor links. That avoids walking the
       RB-tree on removal to find the next expiry time, but even more
       important allows to quickly evaluate whether a timer which is
       rearmed changes the position in the RB-tree with the modified
       expiry time or not. If not, the dequeue/enqueue sequence which
       both can end up in rebalancing can be completely avoided.

     - Deferred reprogramming of the underlying clock event device. This
       optimizes for the situation where a hrtimer callback sets the
       need resched bit. In that case the code attempts to defer the
       re-programming of the clock event device up to the point where
       the scheduler has picked the next task and has the next hrtick
       timer armed. In case that there is no immediate reschedule or
       soft interrupts have to be handled before reaching the reschedule
       point in the interrupt entry code the clock event is reprogrammed
       in one of those code paths to prevent that the timer becomes
       stale.

     - Support for clocksource coupled clockevents

       The TSC deadline timer is coupled to the TSC. The next event is
       programmed in TSC time. Currently this is done by converting the
       CLOCK_MONOTONIC based expiry value into a relative timeout,
       converting it into TSC ticks, reading the TSC adding the delta
       ticks and writing the deadline MSR.

       As the timekeeping core has the conversion factors for the TSC
       already, the whole back and forth conversion can be completely
       avoided. The timekeeping core calculates the reverse conversion
       factors from nanoseconds to TSC ticks and utilizes the base
       timestamps of TSC and CLOCK_MONOTONIC which are updated once per
       tick. This allows a direct conversion into the TSC deadline value
       without reading the time and as a bonus keeps the deadline
       conversion in sync with the TSC conversion factors, which are
       updated by adjtimex() on systems with NTP/PTP enabled.

     - Allow inlining of the clocksource read and clockevent write
       functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.

   With all those enhancements in place a hrtick enabled scheduler
   provides the same performance as without hrtick. But also other
   hrtimer users obviously benefit from these optimizations.

 - Robustness improvements and cleanups of historical sins in the
   hrtimer and timekeeping code.

 - Rewrite of the clocksource watchdog.

   The clocksource watchdog code has over time reached the state of an
   impenetrable maze of duct tape and staples. The original design,
   which was made in the context of systems far smaller than today, is
   based on the assumption that the to be monitored clocksource (TSC)
   can be trivially compared against a known to be stable clocksource
   (HPET/ACPI-PM timer).

   Over the years this rather naive approach turned out to have major
   flaws. Long delays between the watchdog invocations can cause wrap
   arounds of the reference clocksource. The access to the reference
   clocksource degrades on large multi-sockets systems dure to
   interconnect congestion. This has been addressed with various
   heuristics which degraded the accuracy of the watchdog to the point
   that it fails to detect actual TSC problems on older hardware which
   exposes slow inter CPU drifts due to firmware manipulating the TSC to
   hide SMI time.

   The rewrite addresses this by:

     - Restricting the validation against the reference clocksource to
       the boot CPU which is usually closest to the legacy block which
       contains the reference clocksource (HPET/ACPI-PM).

     - Do a round robin validation betwen the boot CPU and the other
       CPUs based only on the TSC with an algorithm similar to the TSC
       synchronization code during CPU hotplug.

     - Being more leniant versus remote timeouts

 - The usual tiny fixes, cleanups and enhancements all over the place

* tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  alarmtimer: Access timerqueue node under lock in suspend
  hrtimer: Fix incorrect #endif comment for BITS_PER_LONG check
  posix-timers: Fix stale function name in comment
  timers: Get this_cpu once while clearing the idle state
  clocksource: Rewrite watchdog code completely
  clocksource: Don't use non-continuous clocksources as watchdog
  x86/tsc: Handle CLOCK_SOURCE_VALID_FOR_HRES correctly
  MIPS: Don't select CLOCKSOURCE_WATCHDOG
  parisc: Remove unused clocksource flags
  hrtimer: Add a helper to retrieve a hrtimer from its timerqueue node
  hrtimer: Remove trailing comma after HRTIMER_MAX_CLOCK_BASES
  hrtimer: Mark index and clockid of clock base as const
  hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry event
  hrtimer: Drop spurious space in 'enum hrtimer_base_type'
  hrtimer: Don't zero-initialize ret in hrtimer_nanosleep()
  hrtimer: Remove hrtimer_get_expires_ns()
  timekeeping: Mark offsets array as const
  timekeeping/auxclock: Consistently use raw timekeeper for tk_setup_internals()
  timer_list: Print offset as signed integer
  tracing: Use explicit array size instead of sentinel elements in symbol printing
  ...
2026-04-14 10:27:07 -07:00
Linus Torvalds
5181afcdf9 Merge tag 'docs-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux
Pull documentation updates from Jonathan Corbet:
 "A busier cycle than I had expected for docs, including:

   - Translations: some overdue updates to the Japanese translations,
     Chinese translations for some of the Rust documentation, and the
     beginnings of a Portuguese translation.

   - New documents covering CPU isolation, managed interrupts, debugging
     Python gbb scripts, and more.

   - More tooling work from Mauro, reducing docs-build warnings, adding
     self tests, improving man-page output, bringing in a proper C
     tokenizer to replace (some of) the mess of kernel-doc regexes, and
     more.

   - Update and synchronize changes.rst and scripts/ver_linux, and put
     both into alphabetical order.

  ... and a long list of documentation updates, typo fixes, and general
  improvements"

* tag 'docs-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux: (162 commits)
  Documentation: core-api: real-time: correct spelling
  doc: Add CPU Isolation documentation
  Documentation: Add managed interrupts
  Documentation: seq_file: drop 2.6 reference
  docs/zh_CN: update rust/index.rst translation
  docs/zh_CN: update rust/quick-start.rst translation
  docs/zh_CN: update rust/coding-guidelines.rst translation
  docs/zh_CN: update rust/arch-support.rst translation
  docs/zh_CN: sync process/2.Process.rst with English version
  docs/zh_CN: fix an inconsistent statement in dev-tools/testing-overview
  tracing: Documentation: Update histogram-design.rst for fn() handling
  docs: sysctl: Add documentation for /proc/sys/xen/
  Docs: hid: intel-ish-hid: make long URL usable
  Documentation/kernel-parameters: fix architecture alignment for pt, nopt, and nobypass
  sched/doc: Update yield_task description in sched-design-CFS
  Documentation/rtla: Convert links to RST format
  docs: fix typos and duplicated words across documentation
  docs: fix typo in zoran driver documentation
  docs: add an Assisted-by mention to submitting-patches.rst
  Revert "scripts/checkpatch: add Assisted-by: tag validation"
  ...
2026-04-14 08:47:08 -07:00
Linus Torvalds
d7c8087a9c Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMOI) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
2026-04-13 19:47:52 -07:00
Linus Torvalds
2e31b16101 Merge tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support updates from Rafael Wysocki:
 "These include an update of the CMOS RTC driver and the related ACPI
  and x86 code that, among other things, switches it over to using the
  platform device interface for device binding on x86 instead of the PNP
  device driver interface (which allows the code in question to be
  simplified quite a bit), a major update of the ACPI Time and Alarm
  Device (TAD) driver adding an RTC class device interface to it, and
  updates of core ACPI drivers that remove some unnecessary and not
  really useful code from them.

  Apart from that, two drivers are converted to using the platform
  driver interface for device binding instead of the ACPI driver one,
  which is slated for removal, support for the Performance Limited
  register is added to the ACPI CPPC library and there are some
  janitorial updates of it and the related cpufreq CPPC driver, the ACPI
  processor driver is fixed and cleaned up, and NVIDIA vendor CPER
  record handler is added to the APEI GHES code.

  Also, the interface for obtaining a CPU UID from ACPI is consolidated
  across architectures and used for fixing a problem with the PCI TPH
  Steering Tag on ARM64, there are two updates related to ACPICA, a
  minor ACPI OS Services Layer (OSL) update, and a few assorted updates
  related to ACPI tables parsing.

  Specifics:

   - Update maintainers information regarding ACPICA (Rafael Wysocki)

   - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()
     (Kees Cook)

   - Trigger an ordered system power off after encountering a fatal
     error operator in AML (Armin Wolf)

   - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)

   - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure
     from the ACPI PPTT parser (Ben Horgan)

   - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
     DeSimone)

   - Address multiple assorted issues and clean up the code in the ACPI
     processor idle driver (Huisong Li)

   - Replace strlcat() in the ACPI processor idle drive with a better
     alternative (Andy Shevchenko)

   - Rearrange and clean up acpi_processor_errata_piix4() (Rafael
     Wysocki)

   - Move reference performance to capabilities and fix an uninitialized
     variable in the ACPI CPPC library (Pengjie Zhang)

   - Add support for the Performance Limited Register to the ACPI CPPC
     library (Sumit Gupta)

   - Add cppc_get_perf() API to read performance controls, extend
     cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
     library warn on missing mandatory DESIRED_PERF register (Sumit
     Gupta)

   - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in
     target callbacks to allow it to control performance bounds via
     standard scaling_min_freq and scaling_max_freq sysfs attributes and
     add sysfs documentation for the Performance Limited Register to it
     (Sumit Gupta)

   - Add ACPI support to the platform device interface in the CMOS RTC
     driver, make the ACPI core device enumeration code create a
     platform device for the CMOS RTC, and drop CMOS RTC PNP device
     support (Rafael Wysocki)

   - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
     driver and clean up the CMOS RTC ACPI address space handler (Rafael
     Wysocki)

   - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
     and allow that driver to work without a dedicated IRQ if the ACPI
     alarm is used (Rafael Wysocki)

   - Clean up the ACPI TAD driver in various ways and add an RTC class
     device interface, including both the RTC setting/reading and alarm
     timer support, to it (Rafael Wysocki)

   - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
     drivers (Rafael Wysocki)

   - Rework checking for duplicate video bus devices and consolidate
     pnp.bus_id workarounds handling in the ACPI video bus driver
     (Rafael Wysocki)

   - Update the ACPI core device drivers to stop setting
     acpi_device_name() unnecessarily (Rafael Wysocki)

   - Rearrange code using acpi_device_class() in the ACPI core device
     drivers and update them to stop setting acpi_device_class()
     unnecessarily (Rafael Wysocki)

   - Define ACPI_AC_CLASS in one place (Rafael Wysocki)

   - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver
     to bind to platform devices instead of ACPI devices (Rafael
     Wysocki)

   - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
     hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
     Feng)

   - Consolidate the interface for obtaining a CPU UID from ACPI across
     architectures and use it to address incorrect PCI TPH Steering Tag
     on ARM64 resulting from the invalid assumption that the ACPI
     Processor UID would always be the same as the corresponding logical
     CPU ID in Linux (Chengwen Feng)"

* tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  ACPICA: Update maintainers information
  watchdog: ni903x_wdt: Convert to a platform driver
  ACPI: PAD: xen: Convert to a platform driver
  ACPI: processor: idle: Reset cpuidle on C-state list changes
  cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
  PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM
  ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
  perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
  ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
  x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
  PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
  ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
  ACPI: tables: Enable FPDT on LoongArch
  ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
  ACPI: processor: idle: Reset power_setup_done flag on initialization failure
  ACPI: TAD: Add alarm support to the RTC class device interface
  ...
2026-04-13 19:25:07 -07:00
Linus Torvalds
0b0128e64a Merge tag 'xfs-merge-7.1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Carlos Maiolino:
 "There aren't any new features.

  The whole series is just a collection of bug fixes and code
  refactoring. There is some new information added a couple new
  tracepoints, new data added to mountstats, but no big changes"

* tag 'xfs-merge-7.1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (41 commits)
  xfs: fix number of GC bvecs
  xfs: untangle the open zones reporting in mountinfo
  xfs: expose the number of open zones in sysfs
  xfs: reduce special casing for the open GC zone
  xfs: streamline GC zone selection
  xfs: refactor GC zone selection helpers
  xfs: rename xfs_zone_gc_iter_next to xfs_zone_gc_iter_irec
  xfs: put the open zone later xfs_open_zone_put
  xfs: add a separate tracepoint for stealing an open zone for GC
  xfs: delay initial open of the GC zone
  xfs: fix a resource leak in xfs_alloc_buftarg()
  xfs: handle too many open zones when mounting
  xfs: refactor xfs_mount_zones
  xfs: fix integer overflow in busy extent sort comparator
  xfs: fix integer overflow in deferred intent sort comparators
  xfs: fold xfs_setattr_size into xfs_vn_setattr_size
  xfs: remove a duplicate assert in xfs_setattr_size
  xfs: return default quota limits for IDs without a dquot
  xfs: start gc on zonegc_low_space attribute updates
  xfs: don't decrement the buffer LRU count for in-use buffers
  ...
2026-04-13 17:03:48 -07:00
Linus Torvalds
7fe6ac157b Merge tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:

 - Add shared memory zero-copy I/O support for ublk, bypassing per-I/O
   copies between kernel and userspace by matching registered buffer
   PFNs at I/O time. Includes selftests.

 - Refactor bio integrity to support filesystem initiated integrity
   operations and arbitrary buffer alignment.

 - Clean up bio allocation, splitting bio_alloc_bioset() into clear fast
   and slow paths. Add bio_await() and bio_submit_or_kill() helpers,
   unify synchronous bi_end_io callbacks.

 - Fix zone write plug refcount handling and plug removal races. Add
   support for serializing zone writes at QD=1 for rotational zoned
   devices, yielding significant throughput improvements.

 - Add SED-OPAL ioctls for Single User Mode management and a STACK_RESET
   command.

 - Add io_uring passthrough (uring_cmd) support to the BSG layer.

 - Replace pp_buf in partition scanning with struct seq_buf.

 - zloop improvements and cleanups.

 - drbd genl cleanup, switching to pre_doit/post_doit.

 - NVMe pull request via Keith:
      - Fabrics authentication updates
      - Enhanced block queue limits support
      - Workqueue usage updates
      - A new write zeroes device quirk
      - Tagset cleanup fix for loop device

 - MD pull requests via Yu Kuai:
      - Fix raid5 soft lockup in retry_aligned_read()
      - Fix raid10 deadlock with check operation and nowait requests
      - Fix raid1 overlapping writes on writemostly disks
      - Fix sysfs deadlock on array_state=clear
      - Proactive RAID-5 parity building with llbitmap, with
        write_zeroes_unmap optimization for initial sync
      - Fix llbitmap barrier ordering, rdev skipping, and bitmap_ops
        version mismatch fallback
      - Fix bcache use-after-free and uninitialized closure
      - Validate raid5 journal metadata payload size
      - Various cleanups

 - Various other fixes, improvements, and cleanups

* tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (146 commits)
  ublk: fix tautological comparison warning in ublk_ctrl_reg_buf
  scsi: bsg: fix buffer overflow in scsi_bsg_uring_cmd()
  block: refactor blkdev_zone_mgmt_ioctl
  MAINTAINERS: update ublk driver maintainer email
  Documentation: ublk: address review comments for SHMEM_ZC docs
  ublk: allow buffer registration before device is started
  ublk: replace xarray with IDA for shmem buffer index allocation
  ublk: simplify PFN range loop in __ublk_ctrl_reg_buf
  ublk: verify all pages in multi-page bvec fall within registered range
  ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer support
  xfs: use bio_await in xfs_zone_gc_reset_sync
  block: add a bio_submit_or_kill helper
  block: factor out a bio_await helper
  block: unify the synchronous bi_end_io callbacks
  xfs: fix number of GC bvecs
  selftests/ublk: add read-only buffer registration test
  selftests/ublk: add filesystem fio verify test for shmem_zc
  selftests/ublk: add hugetlbfs shmem_zc test for loop target
  selftests/ublk: add shared memory zero-copy test
  selftests/ublk: add UBLK_F_SHMEM_ZC support for loop target
  ...
2026-04-13 15:51:31 -07:00
Frederic Weisbecker
f0efd29aa6 doc: Add CPU Isolation documentation
nohz_full was introduced in v3.10 in 2013, which means this
documentation is overdue for 13 years.

Fortunately Paul wrote a part of the needed documentation a while ago,
especially concerning nohz_full in Documentation/timers/no_hz.rst and
also about per-CPU kthreads in
Documentation/admin-guide/kernel-per-CPU-kthreads.rst

Introduce a new page that gives an overview of CPU isolation in general.

Acked-by: Waiman Long <longman@redhat.com>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260402094749.18879-1-frederic@kernel.org>
2026-04-12 13:08:28 -06:00
Rafael J. Wysocki
8e2308c1ae Merge branches 'acpica', 'acpi-osl' and 'acpi-tables'
Merge ACPICA updates, an ACPI OS service layer (OSL) update and
assorted updates related to parsing ACPI tables for 7.1-rc1:

 - Update maintainers information regarding ACPICA (Rafael Wysocki)

 - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees
   Cook)

 - Trigger an ordered system power off after encountering a fatal error
   operator in AML (Armin Wolf)

 - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)

 - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from
   the ACPI PPTT parser (Ben Horgan)

 - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
   DeSimone)

* acpica:
  ACPICA: Update maintainers information
  ACPICA: Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()

* acpi-osl:
  ACPI: OSL: Poweroff when encountering a fatal ACPI error

* acpi-tables:
  ACPI: tables: Enable FPDT on LoongArch
  Documentation: ABI: add FBPT and S3PT entries to sysfs-firmware-acpi
  ACPI: FPDT: expose FBPT and S3PT subtables via sysfs
  ACPI: PPTT: Remove duplicate structure, acpi_pptt_cache_v1_full
2026-04-09 21:19:34 +02:00
Shubham Chakraborty
136799e52c docs: sysctl: Add documentation for /proc/sys/xen/
Add documentation for the Xen hypervisor sysctl controls in
/proc/sys/xen/balloon/.

Documents the hotplug_unpopulated tunable (available when
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is enabled) which controls
whether unpopulated memory regions are automatically hotplugged
when the Xen balloon driver needs to reclaim memory.

The documentation is based on source code analysis of
drivers/xen/balloon.c.

Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260304150419.16738-1-chakrabortyshubham66@gmail.com>
2026-04-09 08:44:31 -06:00
Li RongQing
15d49089e5 Documentation/kernel-parameters: fix architecture alignment for pt, nopt, and nobypass
Commit ab0e7f2076 ("Documentation: Merge x86-specific boot options doc
into kernel-parameters.txt") introduced a formatting regression where
architecture tags were placed on separate lines with broken indentation.
This caused the 'nopt' [X86] parameter to appear as if it belonged to
the [PPC/POWERNV] section.

Furthermore, since the main 'iommu=' parameter heading already specifies
it is for [X86, EARLY], the subsequent standalone [X86] tags for 'pt',
'nopt', and the AMD GART options are redundant and clutter the
documentation.

Clean up the formatting by removing these redundant tags and properly
attributing the 'nobypass' option to [PPC/POWERNV].

Fixes: ab0e7f2076 ("Documentation: Merge x86-specific boot options doc into kernel-parameters.txt")
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260330105957.2271-1-lirongqing@baidu.com>
2026-04-09 08:36:52 -06:00
Manuel Cortez
96c1f4517e docs: fix typos and duplicated words across documentation
Fix the following typos and duplicated words:

- admin-guide/pm/intel-speed-select.rst: "weather" -> "whether"
- core-api/real-time/differences.rst: "the the" -> "the"
- admin-guide/bcache.rst: "to to" -> "to"

Signed-off-by: Manuel Cortez <mdjesuscv@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260406030323.1196-1-mdjesuscv@gmail.com>
2026-04-09 08:21:32 -06:00
Christoph Hellwig
62c89988dc xfs: expose the number of open zones in sysfs
Add a sysfs attribute for the current number of open zones so that it
can be trivially read from userspace in monitoring or testing software.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-04-07 13:28:47 +02:00
Liew Rui Yan
6f1e182387 Docs/mm/damon: document min_nr_regions constraint and rationale
The current DAMON implementation requires 'min_nr_regions' to be at least
3.  However, this constraint is not explicitly documented in the
admin-guide documents, nor is its design rationale explained in the design
document.

Add a section in design.rst to explain the rationale: the virtual address
space monitoring design needs to handle at least three regions to
accommodate two large unmapped areas.  While this is specific to 'vaddr',
DAMON currently enforces it across all operation sets for consistency.

Also update reclaim.rst and lru_sort.rst by adding cross-references to
this constraint within their respective 'min_nr_regions' parameter
description sections, ensuring users are aware of the lower bound.

This change is motivated from a recent discussion [1].

Link: https://lkml.kernel.org/r/20260320052428.213230-1-aethernet65535@gmail.com
Link: https://lore.kernel.org/damon/20260319151528.86490-1-sj@kernel.org/T/#t [1]
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:34 -07:00
Liew Rui Yan
4bdbddb4e4 Docs/mm/damon: document exclusivity of special-purpose modules
Add a section in design.rst to explain that DAMON special-purpose kernel
modules (LRU_SORT, RECLAIM, STAT) run in an exclusive manner and return
-EBUSY if another is already running.

Update lru_sort.rst, reclaim.rst and stat.rst by adding cross-references
to this exclusivity rule at the end of their respective Example sections.

This change is motivated from another discussion [1].

Link: https://lkml.kernel.org/r/20260315162945.80994-1-sj@kernel.org
Link: https://lore.kernel.org/damon/20260314002119.79742-1-sj@kernel.org/T/#t [1]
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:30 -07:00
SeongJae Park
d9cfe515d3 Docs/admin-guide/mm/damon/usage: document goal_tuner sysfs file
Update the DAMON usage document for the new sysfs file for the goal based
quota auto-tuning algorithm selection.

Link: https://lkml.kernel.org/r/20260310010529.91162-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
Sergey Senozhatsky
cedfa028b5 zram: remove chained recompression
Chained recompression has unpredictable behavior and is not useful in
practice.

First, systems usually configure just one alternative recompression
algorithm, which has slower compression/decompression but better
compression ratio.  A single alternative algorithm doesn't need chaining.

Second, even with multiple recompression algorithms, chained recompression
is suboptimal.  If a lower priority algorithm succeeds, the page is never
attempted with a higher priority algorithm, leading to worse memory
savings.  If a lower priority algorithm fails, the page is still attempted
with a higher priority algorithm, wasting resources on the failed lower
priority attempt.

In either case, the system would be better off targeting a specific
priority directly.

Chained recompression also significantly complicates the code.  Remove it.

Link: https://lkml.kernel.org/r/20260311084312.1766036-6-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Sergey Senozhatsky
be5f13d948 zram: update recompression documentation
Emphasize usage of the `priority` parameter for recompression and explain
why `algo` parameter can lead to unexpected behavior and thus is not
recommended.

Link: https://lkml.kernel.org/r/20260311084312.1766036-5-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Akinobu Mita
1eba4c9599 docs: mm: fix typo in numa_memory_policy.rst
Fix a typo: MPOL_INTERLEAVED -> MPOL_INTERLEAVE.

Link: https://lkml.kernel.org/r/20260310151837.5888-1-akinobu.mita@gmail.com
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
SeongJae Park
d7f00084f6 Docs/admin-guide/mm/damn/lru_sort: fix intervals autotune parameter name
The section name should be the same as the parameter name.  Fix it.

Link: https://lkml.kernel.org/r/20260307195356.203753-6-sj@kernel.org
Fixes: ed581147a4 ("Docs/admin-guide/mm/damon/lru_sort: document intervals autotuning")
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
Kiryl Shutsemau
d50569612c mm: rename the 'compound_head' field in the 'struct page' to 'compound_info'
The 'compound_head' field in the 'struct page' encodes whether the page is
a tail and where to locate the head page.  Bit 0 is set if the page is a
tail, and the remaining bits in the field point to the head page.

As preparation for changing how the field encodes information about the
head page, rename the field to 'compound_info'.

Link: https://lkml.kernel.org/r/20260227194302.274384-4-kas@kernel.org
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:08 -07:00
Marco Elver
da735962d0 kfence: add kfence.fault parameter
Add kfence.fault parameter to control the behavior when a KFENCE error is
detected (similar in spirit to kasan.fault=<mode>).

The supported modes for kfence.fault=<mode> are:

  - report: print the error report and continue (default).
  - oops: print the error report and oops.
  - panic: print the error report and panic.

In particular, the 'oops' mode offers a trade-off between no mitigation
on report and panicking outright (if panic_on_oops is not set).

Link: https://lkml.kernel.org/r/20260225203639.3159463-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:06 -07:00
Jason Miu
6b0dd42d76 kho: remove finalize state and clients
Eliminate the `kho_finalize()` function and its associated state from the
KHO subsystem.  The transition to a radix tree for memory tracking makes
the explicit "finalize" state and its serialization step obsolete.

Remove the `kho_finalize()` and `kho_finalized()` APIs and their stub
implementations.  Update KHO client code and the debugfs interface to no
longer call or depend on the `kho_finalize()` mechanism.

Complete the move towards a stateless KHO, simplifying the overall design
by removing unnecessary state management.

Link: https://lkml.kernel.org/r/20260206021428.3386442-3-jasonmiu@google.com
Signed-off-by: Jason Miu <jasonmiu@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:04 -07:00
Jiayuan Chen
5ad41a38c3 mm: zswap: add per-memcg stat for incompressible pages
Patch series "mm: zswap: add per-memcg stat for incompressible pages", v3.

In containerized environments, knowing which cgroup is contributing
incompressible pages to zswap is essential for effective resource
management.  This series adds a new per-memcg stat 'zswap_incomp' to track
incompressible pages, along with a selftest.


This patch (of 2):

The global zswap_stored_incompressible_pages counter was added in commit
dca4437a58 ("mm/zswap: store <PAGE_SIZE compression failed page as-is")
to track how many pages are stored in raw (uncompressed) form in zswap. 
However, in containerized environments, knowing which cgroup is
contributing incompressible pages is essential for effective resource
management [1].

Add a new memcg stat 'zswap_incomp' to track incompressible pages per
cgroup.  This helps administrators and orchestrators to:

1. Identify workloads that produce incompressible data (e.g., encrypted
   data, already-compressed media, random data) and may not benefit from
   zswap.

2. Make informed decisions about workload placement - moving
   incompressible workloads to nodes with larger swap backing devices
   rather than relying on zswap.

3. Debug zswap efficiency issues at the cgroup level without needing to
   correlate global stats with individual cgroups.

While the compression ratio can be estimated from existing stats (zswap /
zswapped * PAGE_SIZE), this doesn't distinguish between "uniformly poor
compression" and "a few completely incompressible pages mixed with highly
compressible ones".  The zswap_incomp stat provides direct visibility into
the latter case.

Link: https://lkml.kernel.org/r/20260213071827.5688-1-jiayuan.chen@linux.dev
Link: https://lkml.kernel.org/r/20260213071827.5688-2-jiayuan.chen@linux.dev
Link: https://lore.kernel.org/linux-mm/CAF8kJuONDFj4NAksaR4j_WyDbNwNGYLmTe-o76rqU17La=nkOw@mail.gmail.com/ [1]
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:00 -07:00
Mario Limonciello (AMD)
6927f21852 cpufreq/amd-pstate: Add support for raw EPP writes
The energy performance preference field of the CPPC request MSR
supports values from 0 to 255, but the strings only offer 4 values.

The other values are useful for tuning the performance of some
workloads.

Add support for writing the raw energy performance preference value
to the sysfs file.  If the last value written was an integer then
an integer will be returned.  If the last value written was a string
then a string will be returned.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:25 -05:00
Mario Limonciello (AMD)
798c47593c cpufreq/amd-pstate: Add support for platform profile class
The platform profile core allows multiple drivers and devices to
register platform profile support.

When the legacy platform profile interface is used all drivers will
adjust the platform profile as well.

Add support for registering every CPU with the platform profile handler
when dynamic EPP is enabled.

The end result will be that changing the platform profile will modify
EPP accordingly.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:15 -05:00
Mario Limonciello (AMD)
da8afb1c66 cpufreq/amd-pstate: add kernel command line to override dynamic epp
Add `amd_dynamic_epp=enable` and `amd_dynamic_epp=disable` to override
the kernel configuration option `CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`
locally.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:11 -05:00
Mario Limonciello (AMD)
e30ca6dd53 cpufreq/amd-pstate: Add dynamic energy performance preference
Dynamic energy performance preference changes the EPP profile based on
whether the machine is running on AC or DC power.

A notification chain from the power supply core is used to adjust EPP
values on plug in or plug out events.

When enabled, the driver exposes a sysfs toggle for dynamic EPP, blocks
manual writes to energy_performance_preference while it "owns" the EPP
updates.

For non-server systems:
    * the default EPP for AC mode is `performance`.
    * the default EPP for DC mode is `balance_performance`.

For server systems dynamic EPP is mostly a no-op.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:02 -05:00
Ninad Naik
a362ae6e7e Documentation: amd-pstate: fix dead links in the reference section
The links for AMD64 Architecture Programmer's Manual and PPR for AMD
Family 19h Model 51h, Revision A1 Processors redirect to a generic page.
Update the links to the working ones.

Signed-off-by: Ninad Naik <ninadnaik07@gmail.com>
Link: https://lore.kernel.org/r/20260330190855.1115304-1-ninadnaik07@gmail.com
Acked-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:58 -05:00
Gautham R. Shenoy
88d2ca6a68 Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Add documentation for the sysfs files
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq
and
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_count.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:51 -05:00
Gautham R. Shenoy
a5bc4c44ae Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Add the missing amd_pstate_prefcore_ranking filenames in the sysfs
listing example leading to the descriptions of these
parameters. Clarify when will the file be visible.

Fixes: 15a2b764ea ("amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:48 -05:00
Gautham R. Shenoy
7e1cf24efb Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
Add the missing amd_pstate_hw_prefcore filenames in the sysfs listing
example leading to the descriptions of these parameters. Clarify when
will the file be visible.

Fixes: b96b82d1af ("cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:44 -05:00
Breno Leitao
41e3ccca00 docs: workqueue: document WQ_AFFN_CACHE_SHARD affinity scope
Update kernel-parameters.txt and workqueue.rst to reflect the new
cache_shard affinity scope and the default change from cache to
cache_shard.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-01 10:24:18 -10:00
Damien Le Moal
b2a78fec34 zloop: add max_open_zones option
Introduce the new max_open_zones option to allow specifying a limit on
the maximum number of open zones of a zloop device. This change allows
creating a zloop device that can more closely mimick the characteristics
of a physical SMR drive.

When set to a non zero value, only up to max_open_zones zones can be in
the implicit open (BLK_ZONE_COND_IMP_OPEN) and explicit open
(BLK_ZONE_COND_EXP_OPEN) conditions at any time. The transition to the
implicit open condition of a zone on a write operation can result in an
implicit close of an already implicitly open zone. This is handled in
the function zloop_do_open_zone(). This function also handles
transitions to the explicit open condition. Implicit close transitions
are handled using an LRU ordered list of open zones which is managed
using the helper functions zloop_lru_rotate_open_zone() and
zloop_lru_remove_open_zone().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260326203245.946830-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-31 08:33:28 -06:00
H. Peter Anvin
ac66a73be0 x86/fred: Enable FRED by default
When FRED was added to the mainline kernel, it was set up as an explicit
opt-in due to the risk of regressions before hardware was available publicly.

Now, Panther Lake (Core Ultra 300 series) has been released, and benchmarking
by Phoronix has shown that it provides a significant performance benefit on
most workloads:

  https://www.phoronix.com/review/intel-fred-panther-lake

Accordingly, enable FRED by default if the CPU supports it. FRED can of
course still be disabled via the fred=off command line option.

Touch up Kconfig help too.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://patch.msgid.link/20260325230151.1898287-2-hpa@zytor.com
2026-03-27 16:04:47 +01:00
Christoph Hellwig
829def1e35 zloop: forget write cache on force removal
Add a new options that causes zloop to truncate the zone files to the
write pointer value recorded at the last cache flush to simulate
unclean shutdowns.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://patch.msgid.link/20260323071156.2940772-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-25 06:49:01 -06:00
Besar Wicaksono
2f89b7f78c perf: add NVIDIA Tegra410 C2C PMU
Adds NVIDIA C2C PMU support in Tegra410 SOC. This PMU is
used to measure memory latency between the SOC and device
memory, e.g GPU Memory (GMEM), CXL Memory, or memory on
remote Tegra410 SOC.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:33 +00:00
Besar Wicaksono
429b7638b2 perf: add NVIDIA Tegra410 CPU Memory Latency PMU
Adds CPU Memory (CMEM) Latency PMU support in Tegra410 SOC.
The PMU is used to measure latency between the edge of the
Unified Coherence Fabric to the local system DRAM.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:32 +00:00
Besar Wicaksono
3dd7302230 perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU
Adds PCIE-TGT PMU support in Tegra410 SOC. This PMU is
instanced in each root complex in the SOC and it captures
traffic originating from any source towards PCIE BAR and CXL
HDM range. The traffic can be filtered based on the
destination root port or target address range.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:32 +00:00
Besar Wicaksono
bf585ba147 perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU
Adds PCIE PMU support in Tegra410 SOC. This PMU is instanced
in each root complex in the SOC and can capture traffic from
PCIE device to various memory types. This PMU can filter traffic
based on the originating root port or BDF and the target memory
types (CPU DRAM, GPU Memory, CXL Memory, or remote Memory).

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:32 +00:00
Besar Wicaksono
f5caf26fd6 perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU
The Unified Coherence Fabric (UCF) contains last level cache
and cache coherent interconnect in Tegra410 SOC. The PMU in
this device can be used to capture events related to access
to the last level cache and memory from different sources.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:32 +00:00
Besar Wicaksono
d332424d1d perf/arm_cspmu: nvidia: Rename doc to Tegra241
The documentation in nvidia-pmu.rst contains PMUs specific
to NVIDIA Tegra241 SoC. Rename the file for this specific
SoC to have better distinction with other NVIDIA SoC.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 12:37:32 +00:00
Thorsten Leemhuis
9d43740074 docs: reporting-issues: create a proper appendix explaining specialties
Merge "Why some bugs remain unfixed and some reports are ignored" with
the closing words while rewriting and extending the text.

The result spends fewer words on explaining things that are normal in
FLOSS -- while outlining where the kernel is different and how that
makes bug reporting more complicated than in other FLOSS projects.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <473b36fa9723c46b7167004752f097e6c26d7278.1773750701.git.linux@leemhuis.info>
2026-03-22 14:47:53 -06:00
Thorsten Leemhuis
a9a1c262df docs: verify-bugs-… and quickly-build-…: improve feedback section
Mention sending patches in the section about feedback. This syncs them
with a section a earlier patch added to reporting-issues.rst, which
was based on these sections and improved during review.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <cb219ccd15271bfb99ecce01dcbdbb03cccd7be1.1773750701.git.linux@leemhuis.info>
2026-03-22 14:47:53 -06:00