linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-18 06:44:00 -04:00

Author	SHA1	Message	Date
Andy Roulin	54fc83a172	net: bridge: add stp_mode attribute for STP mode selection The bridge-stp usermode helper is currently restricted to the initial network namespace, preventing userspace STP daemons (e.g. mstpd) from operating on bridges in other network namespaces. Since commit `ff62198553` ("bridge: Only call /sbin/bridge-stp for the initial network namespace"), bridges in non-init namespaces silently fall back to kernel STP with no way to use userspace STP. Add a new bridge attribute IFLA_BR_STP_MODE that allows explicit per-bridge control over STP mode selection: BR_STP_MODE_AUTO (default) - Existing behavior: invoke the /sbin/bridge-stp helper in init_net only; fall back to kernel STP if it fails or in non-init namespaces. BR_STP_MODE_USER - Directly enable userspace STP (BR_USER_STP) without invoking the helper. Works in any network namespace. Userspace is responsible for ensuring an STP daemon manages the bridge. BR_STP_MODE_KERNEL - Directly enable kernel STP (BR_KERNEL_STP) without invoking the helper. The mode can only be changed while STP is disabled, or set to the same value (-EBUSY otherwise). IFLA_BR_STP_MODE is processed before IFLA_BR_STP_STATE in br_changelink(), so both can be set atomically in a single netlink message. The mode can also be changed in the same message that disables STP. The stp_mode struct field is u8 since all possible values fit, while NLA_U32 is used for the netlink attribute since it occupies the same space in the netlink message as NLA_U8. A new stp_helper_active boolean tracks whether the /sbin/bridge-stp helper was invoked during br_stp_start(), so that br_stp_stop() only calls the helper for stop when it was called for start. This avoids calling the helper asymmetrically when stp_mode changes between start and stop. Suggested-by: Ido Schimmel <idosch@nvidia.com> Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Link: https://patch.msgid.link/20260405205224.3163000-2-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-10 15:52:24 -07:00
Daniel Borkmann	4810389605	netkit: Add single device mode for netkit Add a single device mode for netkit instead of netkit pairs. The primary target for the paired devices is to connect network namespaces, of course, and support has been implemented in projects like Cilium [0]. For the rxq leasing the plan is to support two main scenarios related to single device mode: * For the use-case of io_uring zero-copy, the control plane can either set up a netkit pair where the peer device can perform rxq leasing which is then tied to the lifetime of the peer device, or the control plane can use a regular netkit pair to connect the hostns to a Pod/container and dynamically add/remove rxq leasing through a single device without having to interrupt the device pair. In the case of io_uring, the memory pool is used as skb non-linear pages, and thus the skb will go its way through the regular stack into netkit. Things like the netkit policy when no BPF is attached or skb scrubbing etc apply as-is in case the paired devices are used, or if the backend memory is tied to the single device and traffic goes through a paired device. * For the use-case of AF_XDP, the control plane needs to use netkit in the single device mode. The single device mode currently enforces only a pass policy when no BPF is attached, and does not yet support BPF link attachments for AF_XDP. skbs sent to that device get dropped at the moment. Given AF_XDP operates at a lower layer of the stack tying this to the netkit pair did not make sense. In future, the plan is to allow BPF at the XDP layer which can: i) process traffic coming from the AF_XDP application (e.g. QEMU with AF_XDP backend) to filter egress traffic or to push selected egress traffic up to the single netkit device to the local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the single netkit into the AF_XDP application (e.g. DHCP replies). Also, the control-plane can dynamically manage rxq leasing for the single netkit device without having to interrupt (e.g. down/up cycle) the main netkit pair for the Pod which has traffic going in and out. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Jordan Rife <jordan@jrife.io> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0] Link: https://patch.msgid.link/20260402231031.447597-11-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 18:21:47 -07:00
Paolo Abeni	ba1b8c97b9	geneve: add netlink support for GRO hint Allow configuring and dumping the new device option, and cache its value into the geneve socket itself. The new option is not tie to it any code yet. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/2295d4e4d1e919a3189425141bbc71c7850a2de0.1769011015.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-23 11:31:14 -08:00
Hangbin Liu	651765e8d5	netlink: specs: add big-endian byte-order for u32 IPv4 addresses The fix commit converted several IPv4 address attributes from binary to u32, but forgot to specify byte-order: big-endian. Without this, YNL tools display IPv4 addresses incorrectly due to host-endian interpretation. Add the missing byte-order: big-endian to all affected u32 IPv4 address fields to ensure correct parsing and display. Fixes: `1064d521d1` ("netlink: specs: support ipv4-or-v6 for dual-stack fields") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20251125112048.37631-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-26 17:14:17 -08:00
Hangbin Liu	1064d521d1	netlink: specs: support ipv4-or-v6 for dual-stack fields Since commit `1b255e1bea` ("tools: ynl: add ipv4-or-v6 display hint"), we can display either IPv4 or IPv6 addresses for a single field based on the address family. However, most dual-stack fields still use the ipv4 display hint. This update changes them to use the new ipv4-or-v6 display hint and converts IPv4-only fields to use the u32 type. Field changes: - v4-or-v6 - IFA_ADDRESS, IFA_LOCAL - IFLA_GRE_LOCAL, IFLA_GRE_REMOTE - IFLA_VTI_LOCAL, IFLA_VTI_REMOTE - IFLA_IPTUN_LOCAL, IFLA_IPTUN_REMOTE - NDA_DST - RTA_DST, RTA_SRC, RTA_GATEWAY, RTA_PREFSRC - FRA_SRC, FRA_DST - ipv4 - IFA_BROADCAST - IFLA_GENEVE_REMOTE - IFLA_IPTUN_6RD_RELAY_PREFIX Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251117024457.3034-3-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-18 18:42:10 -08:00
Felix Maurer	c294432be1	netlink: specs: rt-link: Add attributes for hsr YNL wasn't able to decode the linkinfo from hsr interfaces. Add the linkinfo attribute definitions for hsr interfaces. Example output now looks like this: $ ynl --spec Documentation/netlink/specs/rt-link.yaml --do getlink \ --json '{"ifname": "hsr0"}' --output-json \| jq .linkinfo { "kind": "hsr", "data": { "slave1": 15, "slave2": 13, "supervision-addr": "01:15:4e:00:01:00", "seq-nr": 64511, "version": 1, "protocol": 0 } } Signed-off-by: Felix Maurer <fmaurer@redhat.com> Link: https://patch.msgid.link/926077a70de614f1539c905d06515e258905255e.1762968225.git.fmaurer@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-13 17:46:31 -08:00
Alasdair McWilliam	b73b8146d7	rtnetlink: add needed_{head,tail}room attributes Various network interface types make use of needed_{head,tail}room values to efficiently reserve buffer space for additional encapsulation headers, such as VXLAN, Geneve, IPSec, etc. However, it is not currently possible to query these values in a generic way. Introduce ability to query the needed_{head,tail}room values of a network device via rtnetlink, such that applications that may wish to use these values can do so. For example, Cilium agent iterates over present devices based on user config (direct routing, vxlan, geneve, wireguard etc.) and in future will configure netkit in order to expose the needed_{head,tail}room into K8s pods. See `b9ed315d3c` ("netkit: Allow for configuring needed_{head,tail}room"). Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alasdair McWilliam <alasdair@mcwilliam.dev> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20250917095543.14039-1-alasdair@mcwilliam.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-19 17:21:55 -07:00
Matthieu Baerts (NGI0)	12e74931ee	netlink: specs: explicitly declare block scalar strings In YAML, it is allowed to declare a scalar strings at the next lines without explicitly declaring them as a block. Yet, they looks weird, and can cause issues when ':' or '#' are present. The modified lines didn't have issues with the special characters, but it seems better to explicitly declare such blocks as scalar strings to encourage people to "properly" declare future scalar strings. The right angle bracket is used with a minus sign to indicate that the folded style should be used without adding extra newlines. By doing that, the output is not changed compared to what was done before this patch. Suggested-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250913-net-next-ynl-attr-doc-rst-v3-3-4f06420d87db@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-15 18:27:19 -07:00
Jakub Kicinski	28aa52b618	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.16-rc4). Conflicts: Documentation/netlink/specs/mptcp_pm.yaml `9e6dd4c256` ("netlink: specs: mptcp: replace underscores with dashes in names") `ec362192aa` ("netlink: specs: fix up indentation errors") https://lore.kernel.org/20250626122205.389c2cd4@canb.auug.org.au Adjacent changes: Documentation/netlink/specs/fou.yaml `791a9ed0a4` ("netlink: specs: fou: replace underscores with dashes in names") `880d43ca9a` ("netlink: specs: clean up spaces in brackets") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-26 10:40:50 -07:00
Jakub Kicinski	8d7e211ea9	netlink: specs: rt-link: replace underscores with dashes in names We're trying to add a strict regexp for the name format in the spec. Underscores will not be allowed, dashes should be used instead. This makes no difference to C (codegen, if used, replaces special chars in names) but it gives more uniform naming in Python. Fixes: `b2f63d904e` ("doc/netlink: Add spec for rt link messages") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250624211002.3475021-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-25 15:36:28 -07:00
Donald Hunter	ce6bd277e1	netlink: specs: add doc start markers to yaml Clean up all document-start warnings reported by yamllint in the netlink specs: warning missing document start "---" (document-start) Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> # mptcp_pm.yaml Link: https://patch.msgid.link/20250610125944.85265-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-11 14:01:19 -07:00
Jakub Kicinski	8af7a919c5	netlink: specs: rt-link: decode ip6gre Driver tests now require GRE tunnels, while we don't configure them with YNL, YNL will complain when it sees link types it doesn't recognize. Teach it decoding ip6gre tunnels. The attrs are largely the same as IPv4 GRE. Correct the type of encap-limit, but note that this attr is only used in ip6gre, so the mistake didn't matter until now. Fixes: `0d0f4174f6` ("selftests: drv-net: add a simple TSO test") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250603135357.502626-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-06-05 12:50:10 +02:00
Jakub Kicinski	de92258e3b	netlink: specs: rt-link: add missing byte-order properties A number of fields in the ip tunnels are lacking the big-endian designation. I suspect this is not intentional, as decoding the ports with the right endian seems objectively beneficial. Fixes: `6ffdbb93a5` ("netlink: specs: rt_link: decode ip6tnl, vti and vti6 link attrs") Fixes: `077b6022d2` ("doc/netlink/specs: Add sub-message type to rt_link family") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250603135357.502626-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-06-05 12:50:10 +02:00
Jakub Kicinski	dc3f63bc3e	netlink: specs: rt-link: add C naming info for ovpn C naming info for OVPN which was added since I adjusted the existing attrs. Also add missing reference to a header needed for a bridge struct. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250515231650.1325372-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-16 16:32:05 -07:00
Jakub Kicinski	720447bd0b	netlink: specs: rt-link: remove implicit structs from devconf devconf is even odder than SNMP. On input it reports an array of u32s which seem to be indexed by the enum values - 1. On output kernel expects a nest where each attr has the enum type as the nla type. sub-type: u32 is probably best we can do right now. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250506194101.696272-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-07 18:21:50 -07:00
Jakub Kicinski	ab91c140be	netlink: specs: remove implicit structs for SNMP counters uAPI doesn't define structs for the SNMP counters, just enums to index them as arrays. Switch to the same representation in the spec. C codegen will soon need all the struct types to actually exist. Note that the existing definition was broken, anyway, as the first member should be the number of counters reported. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250506194101.696272-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-07 18:21:49 -07:00
Jakub Kicinski	622d7050cf	netlink: specs: rt-link: add notification for newlink Add a notification entry for netlink so that we can test ntf handling in classic netlink and C. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:16 -07:00
Jakub Kicinski	1c224f19ff	netlink: specs: rt-link: make bond's ipv6 address attribute fixed size ns-ip6-target is an indexed-array. Codegen for variable size binary array would be a bit tedious, tell C that we know the size of these attributes, since they are IPv6 addrs. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:16 -07:00
Jakub Kicinski	e6e1f53f02	netlink: specs: rt-link: adjust AF_ nest for C codegen The AF nest is indexed by AF ID, so it's a bit strange, but with minor adjustments C codegen deals with it just fine. Entirely unclear why the names have been in quotes here. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:16 -07:00
Jakub Kicinski	b12b0f4181	netlink: specs: rt-link: add C naming info Add properties needed for C codegen to match names with uAPI headers. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:15 -07:00
Jakub Kicinski	c703d258f6	netlink: specs: rt-link: remove duplicated group in attr list group is listed twice for newlink. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:15 -07:00
Jakub Kicinski	ed43ce6ab2	netlink: specs: rt-link: remove if-netnsid from attr list if-netnsid an alias to target-netnsid: IFLA_TARGET_NETNSID = IFLA_IF_NETNSID, /* new alias */ We don't have a definition for this attr. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:15 -07:00
Jakub Kicinski	43b606d984	netlink: specs: rt-link: remove the fixed members from attrs The purpose of the attribute list is to list the attributes which will be included in a given message to shrink the objects for families with huge attr spaces. Fixed headers are always present in their entirety (between netlink header and the attrs) so there's no point in listing their members. Current C codegen doesn't expect them and tries to look them up in the attribute space. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250418021706.1967583-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-23 16:07:15 -07:00
Jakub Kicinski	240ce924d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.15-rc3). No conflicts. Adjacent changes: tools/net/ynl/pyynl/ynl_gen_c.py `4d07bbf2d4` ("tools: ynl-gen: don't declare loop iterator in place") `7e8ba0c7de` ("tools: ynl: don't use genlmsghdr in classic netlink") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-17 12:26:50 -07:00
Antonio Quartulli	c2d950c467	ovpn: add basic interface creation/destruction/management routines Add basic infrastructure for handling ovpn interfaces. Tested-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20250415-b4-ovpn-v26-3-577f6097b964@openvpn.net Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-04-17 12:30:02 +02:00
Jakub Kicinski	cd5e64fb95	netlink: specs: rename rtnetlink specs in accordance with family name The rtnetlink family names are set to rt-$name within the YAML but the files are called rt_$name. C codegen assumes that the generated file name will match the family. The use of dashes is in line with our general expectation that name properties in the spec use dashes not underscores (even tho, as Donald points out most genl families use underscores in the name). We have 3 un-ideal options to choose from: - accept the slight inconsistency with old families using _, or - accept the slight annoyance with all languages having to do s/-/_/ when looking up family ID, or - accept the inconsistency with all name properties in new YAML spec being separated with - and just the family name always using _. Pick option 1 and rename the rtnl spec files. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250410014658.782120-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-10 20:14:40 -07:00

26 Commits