Commit Graph

134333 Commits

Author SHA1 Message Date
Linus Torvalds
7ce4de1cda Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:

 - Various minor code cleanups and fixes for hns, iser, cxgb4, hfi1,
   rxe, erdma, mana_ib

 - Prefetch supprot for rxe ODP

 - Remove memory window support from hns as new device FW is no longer
   support it

 - Remove qib, it is very old and obsolete now, Cornelis wishes to
   restructure the hfi1/qib shared layer

 - Fix a race in destroying CQs where we can still end up with work
   running because the work is cancled before the driver stops
   triggering it

 - Improve interaction with namespaces:
     * Follow the devlink namespace for newly spawned RDMA devices
     * Create iopoib net devces in the parent IB device's namespace
     * Allow CAP_NET_RAW checks to pass in user namespaces

 - A new flow control scheme for IB MADs to try and avoid queue
   overflows in the network

 - Fix 2G message sizes in bnxt_re

 - Optimize mkey layout for mlx5 DMABUF

 - New "DMA Handle" concept to allow controlling PCI TPH and steering
   tags

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (71 commits)
  RDMA/siw: Change maintainer email address
  RDMA/mana_ib: add support of multiple ports
  RDMA/mlx5: Refactor optional counters steering code
  RDMA/mlx5: Add DMAH support for reg_user_mr/reg_user_dmabuf_mr
  IB: Extend UVERBS_METHOD_REG_MR to get DMAH
  RDMA/mlx5: Add DMAH object support
  RDMA/core: Introduce a DMAH object and its alloc/free APIs
  IB/core: Add UVERBS_METHOD_REG_MR on the MR object
  net/mlx5: Add support for device steering tag
  net/mlx5: Expose IFC bits for TPH
  PCI/TPH: Expose pcie_tph_get_st_table_size()
  RDMA/mlx5: Fix incorrect MKEY masking
  RDMA/mlx5: Fix returned type from _mlx5r_umr_zap_mkey()
  RDMA/mlx5: remove redundant check on err on return expression
  RDMA/mana_ib: add additional port counters
  RDMA/mana_ib: Fix DSCP value in modify QP
  RDMA/efa: Add CQ with external memory support
  RDMA/core: Add umem "is_contiguous" and "start_dma_addr" helpers
  RDMA/uverbs: Add a common way to create CQ with umem
  RDMA/mlx5: Optimize DMABUF mkey page size
  ...
2025-07-31 12:19:55 -07:00
Linus Torvalds
7d767a9528 Merge tag 'for-linus-6.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:

 - fix for a UAF in the xen gntdev-dmabuf driver

 - fix in the xen netfront driver avoiding spurious interrupts

 - fix in the gntdev driver avoiding a large stack allocation

 - cleanup removing some dead code

 - build warning fix

 - cleanup of the sysfs code in the xen-pciback driver

* tag 'for-linus-6.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netfront: Fix TX response spurious interrupts
  xen/gntdev: remove struct gntdev_copy_batch from stack
  xen: fix UAF in dmabuf_exp_from_pages()
  xen: Remove some deadcode (x)
  xen-pciback: Replace scnprintf() with sysfs_emit_at()
  xen/xenbus: fix W=1 build warning in xenbus_va_dev_error function
2025-07-30 17:03:49 -07:00
Linus Torvalds
d9104cec3e Merge tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:

 - Remove usermode driver (UMD) framework (Thomas Weißschuh)

 - Introduce Strongly Connected Component (SCC) in the verifier to
   detect loops and refine register liveness (Eduard Zingerman)

 - Allow 'void *' cast using bpf_rdonly_cast() and corresponding
   '__arg_untrusted' for global function parameters (Eduard Zingerman)

 - Improve precision for BPF_ADD and BPF_SUB operations in the verifier
   (Harishankar Vishwanathan)

 - Teach the verifier that constant pointer to a map cannot be NULL
   (Ihor Solodrai)

 - Introduce BPF streams for error reporting of various conditions
   detected by BPF runtime (Kumar Kartikeya Dwivedi)

 - Teach the verifier to insert runtime speculation barrier (lfence on
   x86) to mitigate speculative execution instead of rejecting the
   programs (Luis Gerhorst)

 - Various improvements for 'veristat' (Mykyta Yatsenko)

 - For CONFIG_DEBUG_KERNEL config warn on internal verifier errors to
   improve bug detection by syzbot (Paul Chaignon)

 - Support BPF private stack on arm64 (Puranjay Mohan)

 - Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's
   node (Song Liu)

 - Introduce kfuncs for read-only string opreations (Viktor Malik)

 - Implement show_fdinfo() for bpf_links (Tao Chen)

 - Reduce verifier's stack consumption (Yonghong Song)

 - Implement mprog API for cgroup-bpf programs (Yonghong Song)

* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
  selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
  selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
  bpf: Add log for attaching tracing programs to functions in deny list
  bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
  bpf: Fix various typos in verifier.c comments
  bpf: Add third round of bounds deduction
  selftests/bpf: Test invariants on JSLT crossing sign
  selftests/bpf: Test cross-sign 64bits range refinement
  selftests/bpf: Update reg_bound range refinement logic
  bpf: Improve bounds when s64 crosses sign boundary
  bpf: Simplify bounds refinement from s32
  selftests/bpf: Enable private stack tests for arm64
  bpf, arm64: JIT support for private stack
  bpf: Move bpf_jit_get_prog_name() to core.c
  bpf, arm64: Fix fp initialization for exception boundary
  umd: Remove usermode driver framework
  bpf/preload: Don't select USERMODE_DRIVER
  selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
  selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
  selftests/bpf: Increase xdp data size for arm64 64K page size
  ...
2025-07-30 09:58:50 -07:00
Linus Torvalds
8be4d31cb8 Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Wrap datapath globals into net_aligned_data, to avoid false sharing

   - Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)

   - Add SO_INQ and SCM_INQ support to AF_UNIX

   - Add SIOCINQ support to AF_VSOCK

   - Add TCP_MAXSEG sockopt to MPTCP

   - Add IPv6 force_forwarding sysctl to enable forwarding per interface

   - Make TCP validation of whether packet fully fits in the receive
     window and the rcv_buf more strict. With increased use of HW
     aggregation a single "packet" can be multiple 100s of kB

   - Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
     improves latency up to 33% for sockmap users

   - Convert TCP send queue handling from tasklet to BH workque

   - Improve BPF iteration over TCP sockets to see each socket exactly
     once

   - Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code

   - Support enabling kernel threads for NAPI processing on per-NAPI
     instance basis rather than a whole device. Fully stop the kernel
     NAPI thread when threaded NAPI gets disabled. Previously thread
     would stick around until ifdown due to tricky synchronization

   - Allow multicast routing to take effect on locally-generated packets

   - Add output interface argument for End.X in segment routing

   - MCTP: add support for gateway routing, improve bind() handling

   - Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink

   - Add a new neighbor flag ("extern_valid"), which cedes refresh
     responsibilities to userspace. This is needed for EVPN multi-homing
     where a neighbor entry for a multi-homed host needs to be synced
     across all the VTEPs among which the host is multi-homed

   - Support NUD_PERMANENT for proxy neighbor entries

   - Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM

   - Add sequence numbers to netconsole messages. Unregister
     netconsole's console when all net targets are removed. Code
     refactoring. Add a number of selftests

   - Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
     should be used for an inbound SA lookup

   - Support inspecting ref_tracker state via DebugFS

   - Don't force bonding advertisement frames tx to ~333 ms boundaries.
     Add broadcast_neighbor option to send ARP/ND on all bonded links

   - Allow providing upcall pid for the 'execute' command in openvswitch

   - Remove DCCP support from Netfilter's conntrack

   - Disallow multiple packet duplications in the queuing layer

   - Prevent use of deprecated iptables code on PREEMPT_RT

  Driver API:

   - Support RSS and hashing configuration over ethtool Netlink

   - Add dedicated ethtool callbacks for getting and setting hashing
     fields

   - Add support for power budget evaluation strategy in PSE /
     Power-over-Ethernet. Generate Netlink events for overcurrent etc

   - Support DPLL phase offset monitoring across all device inputs.
     Support providing clock reference and SYNC over separate DPLL
     inputs

   - Support traffic classes in devlink rate API for bandwidth
     management

   - Remove rtnl_lock dependency from UDP tunnel port configuration

  Device drivers:

   - Add a new Broadcom driver for 800G Ethernet (bnge)

   - Add a standalone driver for Microchip ZL3073x DPLL

   - Remove IBM's NETIUCV device driver

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support zero-copy Tx of DMABUF memory
         - take page size into account for page pool recycling rings
      - Intel (100G, ice, idpf):
         - idpf: XDP and AF_XDP support preparations
         - idpf: add flow steering
         - add link_down_events statistic
         - clean up the TSPLL code
         - preparations for live VM migration
      - nVidia/Mellanox:
         - support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
         - optimize context memory usage for matchers
         - expose serial numbers in devlink info
         - support PCIe congestion metrics
      - Meta (fbnic):
         - add 25G, 50G, and 100G link modes to phylink
         - support dumping FW logs
      - Marvell/Cavium:
         - support for CN20K generation of the Octeon chips
      - Amazon:
         - add HW clock (without timestamping, just hypervisor time access)

   - Ethernet virtual:
      - VirtIO net:
         - support segmentation of UDP-tunnel-encapsulated packets
      - Google (gve):
         - support packet timestamping and clock synchronization
      - Microsoft vNIC:
         - add handler for device-originated servicing events
         - allow dynamic MSI-X vector allocation
         - support Tx bandwidth clamping

   - Ethernet NICs consumer, and embedded:
      - AMD:
         - amd-xgbe: hardware timestamping and PTP clock support
      - Broadcom integrated MACs (bcmgenet, bcmasp):
         - use napi_complete_done() return value to support NAPI polling
         - add support for re-starting auto-negotiation
      - Broadcom switches (b53):
         - support BCM5325 switches
         - add bcm63xx EPHY power control
      - Synopsys (stmmac):
         - lots of code refactoring and cleanups
      - TI:
         - icssg-prueth: read firmware-names from device tree
         - icssg: PRP offload support
      - Microchip:
         - lan78xx: convert to PHYLINK for improved PHY and MAC management
         - ksz: add KSZ8463 switch support
      - Intel:
         - support similar queue priority scheme in multi-queue and
           time-sensitive networking (taprio)
         - support packet pre-emption in both
      - RealTek (r8169):
         - enable EEE at 5Gbps on RTL8126
      - Airoha:
         - add PPPoE offload support
         - MDIO bus controller for Airoha AN7583

   - Ethernet PHYs:
      - support for the IPQ5018 internal GE PHY
      - micrel KSZ9477 switch-integrated PHYs:
         - add MDI/MDI-X control support
         - add RX error counters
         - add cable test support
         - add Signal Quality Indicator (SQI) reporting
      - dp83tg720: improve reset handling and reduce link recovery time
      - support bcm54811 (and its MII-Lite interface type)
      - air_en8811h: support resume/suspend
      - support PHY counters for QCA807x and QCA808x
      - support WoL for QCA807x

   - CAN drivers:
      - rcar_canfd: support for Transceiver Delay Compensation
      - kvaser: report FW versions via devlink dev info

   - WiFi:
      - extended regulatory info support (6 GHz)
      - add statistics and beacon monitor for Multi-Link Operation (MLO)
      - support S1G aggregation, improve S1G support
      - add Radio Measurement action fields
      - support per-radio RTS threshold
      - some work around how FIPS affects wifi, which was wrong (RC4 is
        used by TKIP, not only WEP)
      - improvements for unsolicited probe response handling

   - WiFi drivers:
      - RealTek (rtw88):
         - IBSS mode for SDIO devices
      - RealTek (rtw89):
         - BT coexistence for MLO/WiFi7
         - concurrent station + P2P support
         - support for USB devices RTL8851BU/RTL8852BU
      - Intel (iwlwifi):
         - use embedded PNVM in (to be released) FW images to fix
           compatibility issues
         - many cleanups (unused FW APIs, PCIe code, WoWLAN)
         - some FIPS interoperability
      - MediaTek (mt76):
         - firmware recovery improvements
         - more MLO work
      - Qualcomm/Atheros (ath12k):
         - fix scan on multi-radio devices
         - more EHT/Wi-Fi 7 features
         - encapsulation/decapsulation offload
      - Broadcom (brcm80211):
         - support SDIO 43751 device

   - Bluetooth:
      - hci_event: add support for handling LE BIG Sync Lost event
      - ISO: add socket option to report packet seqnum via CMSG
      - ISO: support SCM_TIMESTAMPING for ISO TS

   - Bluetooth drivers:
      - intel_pcie: support Function Level Reset
      - nxpuart: add support for 4M baudrate
      - nxpuart: implement powerup sequence, reset, FW dump, and FW loading"

* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
  dpll: zl3073x: Fix build failure
  selftests: bpf: fix legacy netfilter options
  ipv6: annotate data-races around rt->fib6_nsiblings
  ipv6: fix possible infinite loop in fib6_info_uses_dev()
  ipv6: prevent infinite loop in rt6_nlmsg_size()
  ipv6: add a retry logic in net6_rt_notify()
  vrf: Drop existing dst reference in vrf_ip6_input_dst
  net/sched: taprio: align entry index attr validation with mqprio
  net: fsl_pq_mdio: use dev_err_probe
  selftests: rtnetlink.sh: remove esp4_offload after test
  vsock: remove unnecessary null check in vsock_getname()
  igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
  stmmac: xsk: fix negative overflow of budget in zerocopy mode
  dt-bindings: ieee802154: Convert at86rf230.txt yaml format
  net: dsa: microchip: Disable PTP function of KSZ8463
  net: dsa: microchip: Setup fiber ports for KSZ8463
  net: dsa: microchip: Write switch MAC address differently for KSZ8463
  net: dsa: microchip: Use different registers for KSZ8463
  net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
  dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
  ...
2025-07-30 08:58:55 -07:00
Linus Torvalds
99e731bcb8 Merge tag 'timers-cleanups-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
 "A treewide cleanup of struct cycle_counter const annotations.

  The initial idea of making them const was correct as they were
  seperate instances. When they got embedded into larger data
  structures, which are even modified by the callback this got moot. The
  only reason why this went unnoticed is that the required
  container_of() casts the const attribute forcefully away.

  Stop pretending that it is const"

* tag 'timers-cleanups-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/timecounter: Fix the lie that struct cyclecounter is const
2025-07-29 14:02:53 -07:00
Linus Torvalds
22c5696e3f Merge tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
 "debugfs:
   - Remove unneeded debugfs_file_{get,put}() instances
   - Remove last remnants of debugfs_real_fops()
   - Allow storing non-const void * in struct debugfs_inode_info::aux

  sysfs:
   - Switch back to attribute_group::bin_attrs (treewide)
   - Switch back to bin_attribute::read()/write() (treewide)
   - Constify internal references to 'struct bin_attribute'

  Support cache-ids for device-tree systems:
   - Add arch hook arch_compact_of_hwid()
   - Use arch_compact_of_hwid() to compact MPIDR values on arm64

  Rust:
   - Device:
       - Introduce CoreInternal device context (for bus internal methods)
       - Provide generic drvdata accessors for bus devices
       - Provide Driver::unbind() callbacks
       - Use the infrastructure above for auxiliary, PCI and platform
       - Implement Device::as_bound()
       - Rename Device::as_ref() to Device::from_raw() (treewide)
       - Implement fwnode and device property abstractions
       - Implement example usage in the Rust platform sample driver
   - Devres:
       - Remove the inner reference count (Arc) and use pin-init instead
       - Replace Devres::new_foreign_owned() with devres::register()
       - Require T to be Send in Devres<T>
       - Initialize the data kept inside a Devres last
       - Provide an accessor for the Devres associated Device
   - Device ID:
       - Add support for ACPI device IDs and driver match tables
       - Split up generic device ID infrastructure
       - Use generic device ID infrastructure in net::phy
   - DMA:
       - Implement the dma::Device trait
       - Add DMA mask accessors to dma::Device
       - Implement dma::Device for PCI and platform devices
       - Use DMA masks from the DMA sample module
   - I/O:
       - Implement abstraction for resource regions (struct resource)
       - Implement resource-based ioremap() abstractions
       - Provide platform device accessors for I/O (remap) requests
   - Misc:
       - Support fallible PinInit types in Revocable
       - Implement Wrapper<T> for Opaque<T>
       - Merge pin-init blanket dependencies (for Devres)

  Misc:
   - Fix OF node leak in auxiliary_device_create()
   - Use util macros in device property iterators
   - Improve kobject sample code
   - Add device_link_test() for testing device link flags
   - Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
   - Hint to prefer container_of_const() over container_of()"

* tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (84 commits)
  rust: io: fix broken intra-doc links to `platform::Device`
  rust: io: fix broken intra-doc link to missing `flags` module
  rust: io: mem: enable IoRequest doc-tests
  rust: platform: add resource accessors
  rust: io: mem: add a generic iomem abstraction
  rust: io: add resource abstraction
  rust: samples: dma: set DMA mask
  rust: platform: implement the `dma::Device` trait
  rust: pci: implement the `dma::Device` trait
  rust: dma: add DMA addressing capabilities
  rust: dma: implement `dma::Device` trait
  rust: net::phy Change module_phy_driver macro to use module_device_table macro
  rust: net::phy represent DeviceId as transparent wrapper over mdio_device_id
  rust: device_id: split out index support into a separate trait
  device: rust: rename Device::as_ref() to Device::from_raw()
  arm64: cacheinfo: Provide helper to compress MPIDR value into u32
  cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
  cacheinfo: Set cache 'id' based on DT data
  container_of: Document container_of() is not to be used in new code
  driver core: auxiliary bus: fix OF node leak
  ...
2025-07-29 12:15:39 -07:00
Jakub Kicinski
c58c18be88 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.17 net-next PR.

Conflicts:

net/core/neighbour.c
  1bbb76a899 ("neighbour: Fix null-ptr-deref in neigh_flush_dev().")
  13a936bb99 ("neighbour: Protect tbl->phash_buckets[] with a dedicated mutex.")
  03dc03fa04 ("neighbor: Add NTF_EXT_VALIDATED flag for externally validated entries")

Adjacent changes:

drivers/net/usb/usbnet.c
  0d9cfc9b8c ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
  2c04d279e8 ("net: usb: Convert tasklet API to new bottom half workqueue mechanism")

net/ipv6/route.c
  31d7d67ba1 ("ipv6: annotate data-races around rt->fib6_nsiblings")
  1caf272972 ("ipv6: adopt dst_dev() helper")
  3b3ccf9ed0 ("net: Remove unnecessary NULL check for lwtunnel_fill_encap()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:49:45 -07:00
Stanislav Fomichev
f388f807ec vrf: Drop existing dst reference in vrf_ip6_input_dst
Commit ff3fbcdd47 ("selftests: tc: Add generic erspan_opts matching support
for tc-flower") started triggering the following kmemleak warning:

unreferenced object 0xffff888015fb0e00 (size 512):
  comm "softirq", pid 0, jiffies 4294679065
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 40 d2 85 9e ff ff ff ff  ........@.......
    41 69 59 9d ff ff ff ff 00 00 00 00 00 00 00 00  AiY.............
  backtrace (crc 30b71e8b):
    __kmalloc_noprof+0x359/0x460
    metadata_dst_alloc+0x28/0x490
    erspan_rcv+0x4f1/0x1160 [ip_gre]
    gre_rcv+0x217/0x240 [ip_gre]
    gre_rcv+0x1b8/0x400 [gre]
    ip_protocol_deliver_rcu+0x31d/0x3a0
    ip_local_deliver_finish+0x37d/0x620
    ip_local_deliver+0x174/0x460
    ip_rcv+0x52b/0x6b0
    __netif_receive_skb_one_core+0x149/0x1a0
    process_backlog+0x3c8/0x1390
    __napi_poll.constprop.0+0xa1/0x390
    net_rx_action+0x59b/0xe00
    handle_softirqs+0x22b/0x630
    do_softirq+0xb1/0xf0
    __local_bh_enable_ip+0x115/0x150

vrf_ip6_input_dst unconditionally sets skb dst entry, add a call to
skb_dst_drop to drop any existing entry.

Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Fixes: 9ff7438460 ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250725160043.350725-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:28:45 -07:00
Alexander Stein
5737383fae net: fsl_pq_mdio: use dev_err_probe
Silence deferred probes using dev_err_probe(). This can happen when
the ethernet PHY uses an IRQ line attached to a i2c GPIO expander. If the
i2c bus is not yet ready, a probe deferral can occur.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250725055615.259945-1-alexander.stein@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:27:25 -07:00
Jason Xing
3b7c13dfdc igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
There is no break time in the while() loop, so every time at the end of
igb_xmit_zc(), negative overflow of nb_pkts will occur, which renders
the return value always false. But theoretically, the result should be
set after calling xsk_tx_peek_release_desc_batch(). We can take
i40e_xmit_zc() as a good example.

Returning false means we're not done with transmission and we need one
more poll, which is exactly what igb_xmit_zc() always did before this
patch. After this patch, the return value depends on the nb_pkts value.
Two cases might happen then:
1. if (nb_pkts < budget), it means we process all the possible data, so
   return true and no more necessary poll will be triggered because of
   this.
2. if (nb_pkts == budget), it means we might have more data, so return
   false to let another poll run again.

Fixes: f8e284a02a ("igb: Add AF_XDP zero-copy Tx support")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20250723142327.85187-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:23:09 -07:00
Jason Xing
2764ab51d5 stmmac: xsk: fix negative overflow of budget in zerocopy mode
A negative overflow can happen when the budget number of descs are
consumed. as long as the budget is decreased to zero, it will again go
into while (budget-- > 0) statement and get decreased by one, so the
overflow issue can happen. It will lead to returning true whereas the
expected value should be false.

In this case where all the budget is used up, it means zc function
should return false to let the poll run again because normally we
might have more data to process. Without this patch, zc function would
return true instead.

Fixes: 132c32ee5b ("net: stmmac: Add TX via XDP zero-copy socket")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20250723142327.85187-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:23:08 -07:00
Tristram Ha
620e2392db net: dsa: microchip: Disable PTP function of KSZ8463
The PTP function of KSZ8463 is on by default.  However, its proprietary
way of storing timestamp directly in a reserved field inside the PTP
message header is not suitable for use with the current Linux PTP stack
implementation.  It is necessary to disable the PTP function to not
interfere the normal operation of the MAC.

Note the PTP driver for KSZ switches does not work for KSZ8463 and is not
activated for it.

Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250725001753.6330-7-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 17:01:56 -07:00
Tristram Ha
006983e597 net: dsa: microchip: Setup fiber ports for KSZ8463
The fiber ports in KSZ8463 cannot be detected internally, so it requires
specifying that condition in the device tree.  Like the one used in
Micrel PHY the port link can only be read and there is no write to the
PHY.  The driver programs registers to operate fiber ports correctly.

Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250725001753.6330-6-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 17:01:56 -07:00
Tristram Ha
5bcdb1373a net: dsa: microchip: Write switch MAC address differently for KSZ8463
KSZ8463 uses 16-bit register definitions so it writes differently for
8-bit switch MAC address.

Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250725001753.6330-5-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 17:01:56 -07:00
Tristram Ha
15b8d3e386 net: dsa: microchip: Use different registers for KSZ8463
KSZ8463 does not use same set of registers as KSZ8863 so it is necessary
to change some registers when using KSZ8463.

Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250725001753.6330-4-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 17:01:56 -07:00
Tristram Ha
84c47bfc5b net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
KSZ8463 switch is a 3-port switch based from KSZ8863.  Its major
difference from other KSZ SPI switches is its register access is not a
simple continual 8-bit transfer with automatic address increase but uses
a byte-enable mechanism specifying 8-bit, 16-bit, or 32-bit access.  Its
registers are also defined in 16-bit format because it shares a design
with a MAC controller using 16-bit access.  As a result some common
register accesses need to be re-arranged.

This patch adds the basic structure for using KSZ8463.  It cannot use the
same regmap table for other KSZ switches as it interprets the 16-bit
value as little-endian and its SPI commands are different.

KSZ8463 uses a byte-enable mechanism to specify 8-bit, 16-bit, and 32-bit
access.  The register is first shifted right by 2 then left by 4.  Extra
4 bits are added.  If the access is 8-bit one of the 4 bits is set.  If
the access is 16-bit two of the 4 bits are set.  If the access is 32-bit
all 4 bits are set.  The SPI command for read or write is then added.

Because of this register transformation separate SPI read and write
functions are provided for KSZ8463.

KSZ8463's internal PHYs use standard PHY register definitions so there is
no need to remap things.  However, the hardware has a bug that the high
word and low word of the PHY id are swapped.  In addition the port
registers are arranged differently so KSZ8463 has its own mapping for
port registers and PHY registers.  Therefore the PORT_CTRL_ADDR macro is
replaced with the get_port_addr helper function.

Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250725001753.6330-3-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 17:01:56 -07:00
Mun Yew Tham
a5e290aab8 net: stmmac: dwmac-socfpga: Add xgmac support for Agilex5
Add support for Agilex5 compatible value.

Signed-off-by: Mun Yew Tham <mun.yew.tham@altera.com>
Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com>
Link: https://patch.msgid.link/20250724154052.205706-5-matthew.gerlach@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 16:55:38 -07:00
Jakub Kicinski
eccf7a3480 Merge tag 'linux-can-fixes-for-6.16-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2025-07-25

The patch is by Stephane Grosjean and adds support the recent firmware
of USB CAN FD interfaces to the peak_usb driver.

* tag 'linux-can-fixes-for-6.16-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: peak_usb: fix USB FD devices potential malfunction
====================

Link: https://patch.msgid.link/20250725101619.4095105-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 16:48:37 -07:00
Jakub Kicinski
ecc383e5fe Merge tag 'linux-can-next-for-6.17-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:

====================
pull-request: can-next 2025-07-25

The first patch is by Khaled Elnaggar and converts the janz-ican3
driver's fwinfo_show() to sysfs_emit().

Vincent Mailhol contributes 3 patches that first fix a warning in the
ti_hecc driver and then add missing COMPILE_TEST more compile
coverage to the ti_hecc and tscan1 driver.

Randy Dunlap's patch let's the tscan1 driver depend on PC104.

A patch by Luis Felipe Hernandez fixes a kernel-doc error in the
ctucanfd driver.

Jimmy Assarsson contributes 10 patches for the kvaser_pciefd and 11
for the kvaser_usb driver. Both series simplify the identification of
physical the CAN interfaces and add devlink support to get information
about the running firmware.

* tag 'linux-can-next-for-6.17-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (27 commits)
  Documentation: devlink: add devlink documentation for the kvaser_usb driver
  can: kvaser_usb: Add devlink port support
  can: kvaser_usb: Expose device information via devlink info_get()
  can: kvaser_usb: Add devlink support
  can: kvaser_usb: Store additional device information
  can: kvaser_usb: Store the different firmware version components in a struct
  can: kvaser_usb: Move comment regarding max_tx_urbs
  can: kvaser_usb: Add intermediate variables
  can: kvaser_usb: Assign netdev.dev_port based on device channel index
  can: kvaser_usb: Add support for ethtool set_phys_id()
  can: kvaser_usb: Add support to control CAN LEDs on device
  Documentation: devlink: add devlink documentation for the kvaser_pciefd driver
  can: kvaser_pciefd: Add devlink port support
  can: kvaser_pciefd: Expose device firmware version via devlink info_get()
  can: kvaser_pciefd: Add devlink support
  can: kvaser_pciefd: Split driver into C-file and header-file.
  can: kvaser_pciefd: Store device channel index
  can: kvaser_pciefd: Store the different firmware version components in a struct
  can: kvaser_pciefd: Add intermediate variable for device struct in probe()
  can: kvaser_pciefd: Add support for ethtool set_phys_id()
  ...
====================

Link: https://patch.msgid.link/20250725161327.4165174-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 16:27:19 -07:00
Jakub Kicinski
e9e91870ac Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
libie: commonize adminq structure

Michal Swiatkowski says:

It is a prework to allow reusing some specific Intel code (eq. fwlog).

Move common *_aq_desc structure to libie header and changing
it in ice, ixgbe, i40e and iavf.

Only generic adminq commands can be easily moved to common header, as
rest is slightly different. Format remains the same. It will be better
to correctly move it when it will be needed to commonize other part of
the code.

Move *_aq_str() to new libie module (libie_adminq) and use it across
drivers. The functions are exactly the same in each driver. Some more
adminq helpers/functions can be moved to libie_adminq when needed.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  i40e: use libie_aq_str
  iavf: use libie_aq_str
  ice: use libie_aq_str
  libie: add adminq helper for converting err to str
  iavf: use libie adminq descriptors
  i40e: use libie adminq descriptors
  ixgbe: use libie adminq descriptors
  ice, libie: move generic adminq descriptors to lib
====================

Link: https://patch.msgid.link/20250724182826.3758850-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 16:01:18 -07:00
Kees Cook
9203e0a82c wireguard: peer: Replace sockaddr with sockaddr_inet
As part of the removal of the variably-sized sockaddr for kernel
internals, replace struct sockaddr with sockaddr_inet in the endpoint
union.

No binary changes; the union size remains unchanged due to sockaddr_inet
matching the size of sockaddr_in6.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250722171836.1078436-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 15:29:58 -07:00
Feng Liu
5474ca2118 net/mlx5e: Expose TIS via devlink tx reporter diagnose
Underneath "TIS Config" tag expose TIS diagnostic information.
Expose the tisn of each TC under each lag port.

$ sudo devlink health diagnose auxiliary/mlx5_core.eth.2/131072 reporter tx
......
  TIS Config:
      lag port: 0 tc: 0 tisn: 0
      lag port: 1 tc: 0 tisn: 8
......

Signed-off-by: Feng Liu <feliu@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1753194228-333722-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 15:27:54 -07:00
Alexandre Cassen
71670f766b net/mlx5e: Support routed networks during IPsec MACs initialization
Remote IPsec tunnel endpoint may refer to a network segment that is
not directly connected to the host. In such a case, IPsec tunnel
endpoints are connected to a router and reachable via a routing path.
In IPsec packet offload mode, HW is initialized with the MAC address
of both IPsec tunnel endpoints.

Extend the current IPsec init MACs procedure to resolve nexthop for
routed networks. Direct neighbour lookup and probe is still used
for directly connected networks and as a fallback mechanism if fib
lookup fails.

Signed-off-by: Alexandre Cassen <acassen@corp.free.fr>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1753194228-333722-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 15:27:54 -07:00
Kyle Hendry
5ac0002385 net: dsa: b53: mmap: Implement bcm63xx ephy power control
Implement the phy enable/disable calls for b53 mmap, and
set the power down registers in the ephy control register
appropriately.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-8-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:34 -07:00
Kyle Hendry
e8e13073df net: dsa: b53: mmap: Add register layout for bcm6368
Add ephy register info for bcm6368.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-7-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:34 -07:00
Kyle Hendry
c251304ab0 net: dsa: b53: mmap: Add register layout for bcm6318
Add ephy register info for bcm6318, which also applies to
bcm6328 and bcm6362.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-6-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:33 -07:00
Kyle Hendry
aed2aaa3c9 net: dsa: b53: mmap: Add syscon reference and register layout for bcm63268
On bcm63xx SoCs there are registers that control the PHYs in
the GPIO controller. Allow the b53 driver to access them
by passing in the syscon through the device tree.

Add a structure to describe the ephy control register
and add register info for bcm63268.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-5-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:33 -07:00
Kyle Hendry
fcf02a462f net: dsa: b53: Define chip IDs for more bcm63xx SoCs
Add defines for bcm6318, bcm6328, bcm6362, bcm6368 chip IDs,
update tables and switch init.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-4-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:33 -07:00
Kyle Hendry
be7a79145d net: dsa: b53: Add phy_enable(), phy_disable() methods
Add phy enable/disable to b53 ops to be called when
enabling/disabling ports.

Signed-off-by: Kyle Hendry <kylehendrydev@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250724035300.20497-2-kylehendrydev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 14:07:33 -07:00
Florian Larysch
49db61c27c net: phy: micrel: fix KSZ8081/KSZ8091 cable test
Commit 21b688dabe ("net: phy: micrel: Cable Diag feature for lan8814
phy") introduced cable_test support for the LAN8814 that reuses parts of
the KSZ886x logic and introduced the cable_diag_reg and pair_mask
parameters to account for differences between those chips.

However, it did not update the ksz8081_type struct, so those members are
now 0, causing no pairs to be tested in ksz886x_cable_test_get_status
and ksz886x_cable_test_wait_for_completion to poll the wrong register
for the affected PHYs (Basic Control/Reset, which is 0 in normal
operation) and exit immediately.

Fix this by setting both struct members accordingly.

Fixes: 21b688dabe ("net: phy: micrel: Cable Diag feature for lan8814 phy")
Cc: stable@vger.kernel.org
Signed-off-by: Florian Larysch <fl@n621.de>
Link: https://patch.msgid.link/20250723222250.13960-1-fl@n621.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 13:43:24 -07:00
Tristram Ha
165a7f5db9 net: dsa: microchip: Fix wrong rx drop MIB counter for KSZ8863
When KSZ8863 support was first added to KSZ driver the RX drop MIB
counter was somehow defined as 0x105.  The TX drop MIB counter
starts at 0x100 for port 1, 0x101 for port 2, and 0x102 for port 3, so
the RX drop MIB counter should start at 0x103 for port 1, 0x104 for
port 2, and 0x105 for port 3.

There are 5 ports for KSZ8895, so its RX drop MIB counter starts at
0x105.

Fixes: 4b20a07e10 ("net: dsa: microchip: ksz8795: add support for ksz88xx chips")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250723030403.56878-1-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 13:09:23 -07:00
Simon Horman
9312ee7649 octeontx2-af: use unsigned int as iterator for unsigned values
The local variable i is used to iterate over unsigned
values. The lower bound of the loop is set to 0. While
the upper bound is cgx->lmac_count, where they lmac_count is
an u8. So the theoretical upper bound is 255.

As is, GCC can't see this range of values and warns that
a formatted string, which includes the %d representation of i,
may overflow the buffer provided.

GCC 15.1.0 says:

  .../cgx.c: In function 'cgx_lmac_init':
  .../cgx.c:1737:49: warning: '%d' directive writing between 1 and 11 bytes into a region of size between 4 and 6 [-Wformat-overflow=]
   1737 |                 sprintf(lmac->name, "cgx_fwi_%d_%d", cgx->cgx_id, i);
        |                                                 ^~
  .../cgx.c:1737:37: note: directive argument in the range [-2147483641, 254]
   1737 |                 sprintf(lmac->name, "cgx_fwi_%d_%d", cgx->cgx_id, i);
        |                                     ^~~~~~~~~~~~~~~
  .../cgx.c:1737:17: note: 'sprintf' output between 12 and 24 bytes into a destination of size 16
   1737 |                 sprintf(lmac->name, "cgx_fwi_%d_%d", cgx->cgx_id, i);
        |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Empirically, changing the type of i from (signed) int to unsigned int
addresses this problem. I assume by allowing GCC to see the range of
values described above.

Also update the format specifiers for the integer values in the string
in question from %d to %u. This seems appropriate as they are now both
unsigned.

No functional change intended.
Compile tested only.

Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250724-octeontx2-af-unsigned-v1-1-c745c106e06f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:30:50 -07:00
Yi Cong
a75afcd188 usbnet: Set duplex status to unknown in the absence of MII
Currently, USB CDC devices that do not use MDIO to get link status have
their duplex mode set to half-duplex by default. However, since the CDC
specification does not define a duplex status, this can be misleading.

This patch changes the default to DUPLEX_UNKNOWN in the absence of MII,
which more accurately reflects the state of the link and avoids implying
an incorrect or error state.

Link: https://lore.kernel.org/all/20250723152151.70a8034b@kernel.org/
Signed-off-by: Yi Cong <yicong@kylinos.cn>
Acked-by: Oliver Neukum <oneukum@suse.com>
Link: https://patch.msgid.link/20250724013133.1645142-1-yicongsrfy@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:05:22 -07:00
Stanislav Fomichev
0349659fd7 macsec: set IFF_UNICAST_FLT priv flag
Cosmin reports the following locking issue:

  # BUG: sleeping function called from invalid context at
  kernel/locking/mutex.c:275
  #   dump_stack_lvl+0x4f/0x60
  #   __might_resched+0xeb/0x140
  #   mutex_lock+0x1a/0x40
  #   dev_set_promiscuity+0x26/0x90
  #   __dev_set_promiscuity+0x85/0x170
  #   __dev_set_rx_mode+0x69/0xa0
  #   dev_uc_add+0x6d/0x80
  #   vlan_dev_open+0x5f/0x120 [8021q]
  #  __dev_open+0x10c/0x2a0
  #  __dev_change_flags+0x1a4/0x210
  #  netif_change_flags+0x22/0x60
  #  do_setlink.isra.0+0xdb0/0x10f0
  #  rtnl_newlink+0x797/0xb00
  #  rtnetlink_rcv_msg+0x1cb/0x3f0
  #  netlink_rcv_skb+0x53/0x100
  #  netlink_unicast+0x273/0x3b0
  #  netlink_sendmsg+0x1f2/0x430

Which is similar to recent syzkaller reports in [0] and [1] and triggers
because macsec does not advertise IFF_UNICAST_FLT although it has proper
ndo_set_rx_mode callback that takes care of pushing uc/mc addresses
down to the real device.

In general, dev_uc_add call path is problematic for stacking
non-IFF_UNICAST_FLT because we might grab netdev instance lock under
addr_list_lock spinlock, so this is not a systemic fix.

0: https://lore.kernel.org/netdev/686d55b4.050a0220.1ffab7.0014.GAE@google.com
1: https://lore.kernel.org/netdev/68712acf.a00a0220.26a83e.0051.GAE@google.com/
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/netdev/2aff4342b0f5b1539c02ffd8df4c7e58dd9746e7.camel@nvidia.com
Fixes: 7e4d784f58 ("net: hold netdev instance lock during rtnetlink operations")
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Tested-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250723224715.1341121-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:03:46 -07:00
John Ernberg
0d9cfc9b8c net: usbnet: Avoid potential RCU stall on LINK_CHANGE event
The Gemalto Cinterion PLS83-W modem (cdc_ether) is emitting confusing link
up and down events when the WWAN interface is activated on the modem-side.

Interrupt URBs will in consecutive polls grab:
* Link Connected
* Link Disconnected
* Link Connected

Where the last Connected is then a stable link state.

When the system is under load this may cause the unlink_urbs() work in
__handle_link_change() to not complete before the next usbnet_link_change()
call turns the carrier on again, allowing rx_submit() to queue new SKBs.

In that event the URB queue is filled faster than it can drain, ending up
in a RCU stall:

    rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 33108 jiffies s: 201 root: 0x1/.
    rcu: blocking rcu_node structures (internal RCU debug):
    Sending NMI from CPU 1 to CPUs 0:
    NMI backtrace for cpu 0

    Call trace:
     arch_local_irq_enable+0x4/0x8
     local_bh_enable+0x18/0x20
     __netdev_alloc_skb+0x18c/0x1cc
     rx_submit+0x68/0x1f8 [usbnet]
     rx_alloc_submit+0x4c/0x74 [usbnet]
     usbnet_bh+0x1d8/0x218 [usbnet]
     usbnet_bh_tasklet+0x10/0x18 [usbnet]
     tasklet_action_common+0xa8/0x110
     tasklet_action+0x2c/0x34
     handle_softirqs+0x2cc/0x3a0
     __do_softirq+0x10/0x18
     ____do_softirq+0xc/0x14
     call_on_irq_stack+0x24/0x34
     do_softirq_own_stack+0x18/0x20
     __irq_exit_rcu+0xa8/0xb8
     irq_exit_rcu+0xc/0x30
     el1_interrupt+0x34/0x48
     el1h_64_irq_handler+0x14/0x1c
     el1h_64_irq+0x68/0x6c
     _raw_spin_unlock_irqrestore+0x38/0x48
     xhci_urb_dequeue+0x1ac/0x45c [xhci_hcd]
     unlink1+0xd4/0xdc [usbcore]
     usb_hcd_unlink_urb+0x70/0xb0 [usbcore]
     usb_unlink_urb+0x24/0x44 [usbcore]
     unlink_urbs.constprop.0.isra.0+0x64/0xa8 [usbnet]
     __handle_link_change+0x34/0x70 [usbnet]
     usbnet_deferred_kevent+0x1c0/0x320 [usbnet]
     process_scheduled_works+0x2d0/0x48c
     worker_thread+0x150/0x1dc
     kthread+0xd8/0xe8
     ret_from_fork+0x10/0x20

Get around the problem by delaying the carrier on to the scheduled work.

This needs a new flag to keep track of the necessary action.

The carrier ok check cannot be removed as it remains required for the
LINK_RESET event flow.

Fixes: 4b49f58fff ("usbnet: handle link change")
Cc: stable@vger.kernel.org
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Link: https://patch.msgid.link/20250723102526.1305339-1-john.ernberg@actia.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:02:51 -07:00
Jijie Shao
15dc08fd2c net: hibmcge: support for statistics of reset failures
Add a statistical item to count the number of reset failures.
This statistical item can be queried using ethtool -S or
reported through diagnose information.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250723074826.2756135-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:01:53 -07:00
Shahar Shitrit
e80d655615 net/mlx5e: Fix potential deadlock by deferring RX timeout recovery
mlx5e_reporter_rx_timeout() is currently invoked synchronously
in the driver's open error flow. This causes the thread holding
priv->state_lock to attempt acquiring the devlink lock, which
can result in a circular dependency with other devlink operations.

For example:

- Devlink health diagnose flow:
  - __devlink_nl_pre_doit() acquires the devlink lock.
  - devlink_nl_health_reporter_diagnose_doit() invokes the
    driver's diagnose callback.
  - mlx5e_rx_reporter_diagnose() then attempts to acquire
    priv->state_lock.

- Driver open flow:
  - mlx5e_open() acquires priv->state_lock.
  - If an error occurs, devlink_health_reporter may be called,
    attempting to acquire the devlink lock.

To prevent this circular locking scenario, defer the RX timeout
recovery by scheduling it via a workqueue. This ensures that the
recovery work acquires locks in a consistent order: first the
devlink lock, then priv->state_lock.

Additionally, make the recovery work acquire the netdev instance
lock to safely synchronize with the open/close channel flows,
similar to mlx5e_tx_timeout_work. Repeatedly attempt to acquire
the netdev instance lock until it is taken or the target RQ is no
longer active, as indicated by the MLX5E_STATE_CHANNELS_ACTIVE bit.

Fixes: 32c57fb268 ("net/mlx5e: Report and recover from rx timeout")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:00:17 -07:00
Jianbo Liu
6d19c44b5c net/mlx5e: Remove skb secpath if xfrm state is not found
Hardware returns a unique identifier for a decrypted packet's xfrm
state, this state is looked up in an xarray. However, the state might
have been freed by the time of this lookup.

Currently, if the state is not found, only a counter is incremented.
The secpath (sp) extension on the skb is not removed, resulting in
sp->len becoming 0.

Subsequently, functions like __xfrm_policy_check() attempt to access
fields such as xfrm_input_state(skb)->xso.type (which dereferences
sp->xvec[sp->len - 1]) without first validating sp->len. This leads to
a crash when dereferencing an invalid state pointer.

This patch prevents the crash by explicitly removing the secpath
extension from the skb if the xfrm state is not found after hardware
decryption. This ensures downstream functions do not operate on a
zero-length secpath.

 BUG: unable to handle page fault for address: ffffffff000002c8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 282e067 P4D 282e067 PUD 0
 Oops: Oops: 0000 [#1] SMP
 CPU: 12 UID: 0 PID: 0 Comm: swapper/12 Not tainted 6.15.0-rc7_for_upstream_min_debug_2025_05_27_22_44 #1 NONE
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:__xfrm_policy_check+0x61a/0xa30
 Code: b6 77 7f 83 e6 02 74 14 4d 8b af d8 00 00 00 41 0f b6 45 05 c1 e0 03 48 98 49 01 c5 41 8b 45 00 83 e8 01 48 98 49 8b 44 c5 10 <0f> b6 80 c8 02 00 00 83 e0 0c 3c 04 0f 84 0c 02 00 00 31 ff 80 fa
 RSP: 0018:ffff88885fb04918 EFLAGS: 00010297
 RAX: ffffffff00000000 RBX: 0000000000000002 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000000
 RBP: ffffffff8311af80 R08: 0000000000000020 R09: 00000000c2eda353
 R10: ffff88812be2bbc8 R11: 000000001faab533 R12: ffff88885fb049c8
 R13: ffff88812be2bbc8 R14: 0000000000000000 R15: ffff88811896ae00
 FS:  0000000000000000(0000) GS:ffff8888dca82000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff000002c8 CR3: 0000000243050002 CR4: 0000000000372eb0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  ? try_to_wake_up+0x108/0x4c0
  ? udp4_lib_lookup2+0xbe/0x150
  ? udp_lib_lport_inuse+0x100/0x100
  ? __udp4_lib_lookup+0x2b0/0x410
  __xfrm_policy_check2.constprop.0+0x11e/0x130
  udp_queue_rcv_one_skb+0x1d/0x530
  udp_unicast_rcv_skb+0x76/0x90
  __udp4_lib_rcv+0xa64/0xe90
  ip_protocol_deliver_rcu+0x20/0x130
  ip_local_deliver_finish+0x75/0xa0
  ip_local_deliver+0xc1/0xd0
  ? ip_protocol_deliver_rcu+0x130/0x130
  ip_sublist_rcv+0x1f9/0x240
  ? ip_rcv_finish_core+0x430/0x430
  ip_list_rcv+0xfc/0x130
  __netif_receive_skb_list_core+0x181/0x1e0
  netif_receive_skb_list_internal+0x200/0x360
  ? mlx5e_build_rx_skb+0x1bc/0xda0 [mlx5_core]
  gro_receive_skb+0xfd/0x210
  mlx5e_handle_rx_cqe_mpwrq+0x141/0x280 [mlx5_core]
  mlx5e_poll_rx_cq+0xcc/0x8e0 [mlx5_core]
  ? mlx5e_handle_rx_dim+0x91/0xd0 [mlx5_core]
  mlx5e_napi_poll+0x114/0xab0 [mlx5_core]
  __napi_poll+0x25/0x170
  net_rx_action+0x32d/0x3a0
  ? mlx5_eq_comp_int+0x8d/0x280 [mlx5_core]
  ? notifier_call_chain+0x33/0xa0
  handle_softirqs+0xda/0x250
  irq_exit_rcu+0x6d/0xc0
  common_interrupt+0x81/0xa0
  </IRQ>

Fixes: b2ac7541e3 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:00:17 -07:00
Alexei Lazar
fd4b97246a net/mlx5e: Clear Read-Only port buffer size in PBMC before update
When updating the PBMC register, we read its current value,
modify desired fields, then write it back.

The port_buffer_size field within PBMC is Read-Only (RO).
If this RO field contains a non-zero value when read,
attempting to write it back will cause the entire PBMC
register update to fail.

This commit ensures port_buffer_size is explicitly cleared
to zero after reading the PBMC register but before writing
back the modified value.
This allows updates to other fields in the PBMC register to succeed.

Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 11:00:17 -07:00
Bjorn Helgaas
fe09560f82 net: Fix typos
Fix typos in comments and error messages.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250723201528.2908218-1-helgaas@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25 10:29:07 -07:00
Jimmy Assarsson
aa6a5c995e can: kvaser_usb: Add devlink port support
Register each CAN channel of the device as an devlink physical port.
This makes it easier to get device information for a given network
interface (i.e. can2).

Example output:
  $ devlink dev
  usb/1-1.3:1.0

  $ devlink port
  usb/1-1.3:1.0/0: type eth netdev can0 flavour physical port 0 splittable false
  usb/1-1.3:1.0/1: type eth netdev can1 flavour physical port 1 splittable false

  $ devlink port show can1
  usb/1-1.3:1.0/1: type eth netdev can1 flavour physical port 0 splittable false

  $ devlink dev info
  usb/1-1.3:1.0:
    driver kvaser_usb
    serial_number 1020
    versions:
        fixed:
          board.rev 1
          board.id 7330130009653
        running:
          fw 3.22.527

  $ ethtool -i can1
  driver: kvaser_usb
  version: 6.12.10-arch1-1
  firmware-version: 3.22.527
  expansion-rom-version:
  bus-info: 1-1.3:1.0
  supports-statistics: no
  supports-test: no
  supports-eeprom-access: no
  supports-register-dump: no
  supports-priv-flags: no

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-11-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:22 +02:00
Jimmy Assarsson
8720aed90c can: kvaser_usb: Expose device information via devlink info_get()
Expose device information via devlink info_get():
  * Serial number
  * Firmware version
  * Hardware revision
  * EAN (product number)

Example output:
  $ devlink dev
  usb/1-1.2:1.0

  $ devlink dev info
  usb/1-1.2:1.0:
    driver kvaser_usb
    serial_number 1020
    versions:
        fixed:
          board.rev 1
          board.id 7330130009653
        running:
          fw 3.22.527

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-10-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:21 +02:00
Jimmy Assarsson
9505a83fc4 can: kvaser_usb: Add devlink support
Add devlink support at device level.

Example output:
  $ devlink dev
  usb/1-1.3:1.0

  $ devlink dev info
  usb/1-1.3:1.0:
    driver kvaser_usb

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-9-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:21 +02:00
Jimmy Assarsson
0020f2ba40 can: kvaser_usb: Store additional device information
Store additional device information; EAN (product number), serial_number
and hardware revision.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-8-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:20 +02:00
Jimmy Assarsson
280eba332b can: kvaser_usb: Store the different firmware version components in a struct
Store firmware version in kvaser_usb_fw_version struct, specifying the
different components of the version number.
And drop debug prinout of firmware version, since later patches will expose
it via the devlink interface.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-7-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:20 +02:00
Jimmy Assarsson
7506789c53 can: kvaser_usb: Move comment regarding max_tx_urbs
Move comment regarding max_tx_urbs, to where the struct member is declared.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-6-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:20 +02:00
Jimmy Assarsson
827158a67c can: kvaser_usb: Add intermediate variables
Add intermediate variables, for readability and to simplify future patches.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-5-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:19 +02:00
Jimmy Assarsson
c151b06a08 can: kvaser_usb: Assign netdev.dev_port based on device channel index
Assign netdev.dev_port based on the device channel index, to indicate the
port number of the network device.
While this driver already uses netdev.dev_id for that purpose, dev_port is
more appropriate. However, retain dev_id to avoid potential regressions.

Fixes: 3e66d0138c ("can: populate netdev::dev_id for udev discrimination")
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-4-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:19 +02:00
Jimmy Assarsson
3d7a3de9eb can: kvaser_usb: Add support for ethtool set_phys_id()
Add support for ethtool set_phys_id(), to physically locate devices by
flashing a LED on the device.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-3-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:19 +02:00
Jimmy Assarsson
478248f1bc can: kvaser_usb: Add support to control CAN LEDs on device
Add support to turn on/off CAN LEDs on device.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250725123452.41-2-extja@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-25 18:01:18 +02:00