Add an env NetDrvContEnv for container based selftests. This automates
the setup of a netns, netkit pair with one inside the netns, and a BPF
program that forwards skbs from the NETIF host inside the container.
Currently only netkit is used, but other virtual netdevs e.g. veth can
be used too.
Expect netkit container datapath selftests to have a publicly routable
IP prefix to assign to netkit in a container, such that packets will
land on eth0. The BPF skb forward program will then forward such packets
from the host netns to the container netns.
Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260305181803.2912736-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 77b9c4a438, reversing
changes made to 4515ec4ad5:
931420a2fc ("selftests/net: Add netkit container tests")
ab771c938d ("selftests/net: Make NetDrvContEnv support queue leasing")
6be87fbb27 ("selftests/net: Add env for container based tests")
61d99ce3df ("selftests/net: Add bpf skb forwarding program")
920da36341 ("netkit: Add xsk support for af_xdp applications")
eef51113f8 ("netkit: Add netkit notifier to check for unregistering devices")
b5ef109d22 ("netkit: Implement rtnl_link_ops->alloc and ndo_queue_create")
b5c3fa4a0b ("netkit: Add single device mode for netkit")
0073d2fd67 ("xsk: Proxy pool management for leased queues")
1ecea95dd3 ("xsk: Extend xsk_rcv_check validation")
804bf334d0 ("net: Proxy netdev_queue_get_dma_dev for leased queues")
0caa9a8dde ("net: Proxy net_mp_{open,close}_rxq for leased queues")
ff8889ff91 ("net, ethtool: Disallow leased real rxqs to be resized")
9e2103f361 ("net: Add lease info to queue-get response")
31127dedde ("net: Implement netdev_nl_queue_create_doit")
a5546e18f7 ("net: Add queue-create operation")
The series will conflict with io_uring work, and the code needs more
polish.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new parameter `lease` to NetDrvContEnv that sets up queue leasing
in the env.
The NETIF also has some ethtool parameters changed to support memory
provider tests. This is needed in NetDrvContEnv rather than individual
test cases since the cleanup to restore NETIF can't be done, until the
netns in the env is gone.
Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260115082603.219152-16-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add an env NetDrvContEnv for container based selftests. This automates
the setup of a netns, netkit pair with one inside the netns, and a BPF
program that forwards skbs from the NETIF host inside the container.
Currently only netkit is used, but other virtual netdevs e.g. veth can
be used too.
Expect netkit container datapath selftests to have a publicly routable
IP prefix to assign to netkit in a container, such that packets will
land on eth0. The BPF skb forward program will then forward such packets
from the host netns to the container netns.
Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260115082603.219152-15-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The PSP responder fails when zero or multiple PSP devices are detected.
There's an option to select the device id to use (-d) but it's
currently not used from the PSP self test. It's also hard to use because
the PSP test doesn't dump the PSP devices so can't choose one.
When zero devices are detected, psp_responder fails which will cause the
parent test to fail as well instead of skipping PSP tests.
Fix both of these problems. Change psp_responder to:
- not fail when no PSP devs are detected.
- get an optional -i ifindex argument instead of -d.
- select the correct PSP dev from the dump corresponding to ifindex or
- select the first PSP dev when -i is not given.
- fail when multiple devs are found and -i is not given.
- warn and continue when the requested ifindex is not found.
Also plumb the ifindex from the Python test.
With these, when there are no PSP devs found or the wrong one is chosen,
psp_responder opens the server socket, listens for control connections
normally, and leaves the skipping of the various test cases which
require a PSP device (~most, but not all of them) to the parent test.
This results in output like:
ok 1 psp.test_case # SKIP No PSP devices found
[...]
ok 12 psp.dev_get_device # SKIP No PSP devices found
ok 13 psp.dev_get_device_bad
ok 14 psp.dev_rotate # SKIP No PSP devices found
[...]
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Link: https://patch.msgid.link/20260109110851.2952906-2-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GenerateTraffic was added to spin up long-running iperf3 load, mainly
to drive high PPS background traffic. It was never meant to provide
stable throughput numbers, and trying to repurpose it for measurement
does not make sense.
Introduce Iperf3Runner to allow tests to split out server/client
configuration, control start/stop, and collect JSON output for
analysis. This makes it possible to measure bandwidth directly when
validating egress shaping.
GenerateTraffic stays as the background load generator, reusing the
common iperf3 helpers under the hood.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-3-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We're already saving the info about the local dev in env.dev
for the tests, save remote dev as well. This is more symmetric,
env generally provides the same info for local and remote end.
While at it make sure that we reliably get the detailed info
about the local dev. nsim used to read the dev info without -d.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20251120021024.2944527-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There's a lot of cases where we try to re-run the same code with
different parameters. We currently need to either use a generator
method or create a "main" case implementation which then gets called
by trivial case functions:
def _test(x, y, z):
...
def case_int():
_test(1, 2, 3)
def case_str():
_test('a', 'b', 'c')
Add support for variants, similar to kselftests_harness.h and
a lot of other frameworks. Variants can be added as decorator
to test functions:
@ksft_variants([(1, 2, 3), ('a', 'b', 'c')])
def case(x, y, z):
...
ksft_run() will auto-generate case names:
case.1_2_3
case.a_b_c
Because the names may not always be pretty (and to avoid forcing
classes to implement case-friendly __str__()) add a wrapper class
KsftNamedVariant which lets the user specify the name for the variant.
Note that ksft_run's args are still supported. ksft_run splices args
and variant params together.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20251120021024.2944527-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Convert remaining __init__ files similar to what we did in
commit b615879dbf ("selftests: drv-net: make linters happy with our imports")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linters are still not very happy with our __init__ files,
which was pointed out in recent review (see Link).
We have previously started importing things one by one to
make linters happy with the test files (which import from __init__).
But __init__ file itself still makes linters unhappy.
To clean it up I believe we must completely remove the wildcard
imports, and assign the imported modules to __all__.
hds.py needs to be fixed because it seems to be importing
the Python standard random from lib.net.
We can't use ksft_pr() / ktap_result() in case importing
from net.lib fails. Linters complain that those helpers
themselves may not have been imported.
Link: https://lore.kernel.org/9d215979-6c6d-4e9b-9cdd-39cff595866e@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20251003164748.860042-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
On fast machines the tests run in quick succession so even
when tests clean up after themselves the carrier may need
some time to come back.
Specifically in NIPA when ping.py runs right after netpoll_basic.py
the first ping command fails.
Since the context manager callbacks are now common NetDrvEpEnv
gets an ip link up call as well.
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250812142054.750282-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make require_cmd() calls explicit about whether commands are needed
locally, remotely, or both.
Since require_cmd() defaults to local=True, tests should explicitly set
local=False when commands are only needed remotely.
- socat: Set local=False since it's only needed on remote hosts.
- iperf3: Use single call with both local=True and remote=True since
it's needed on both hosts.
This avoids unnecessary test failures when commands are missing locally
but available remotely where actually needed, and consolidates a
duplicate require_cmd() call into single call that checks both hosts.
Fixes: 0d0f4174f6 ("selftests: drv-net: add a simple TSO test")
Fixes: f1e68a1a4a ("selftests: drv-net: add require_XYZ() helpers for validating env")
Fixes: c76bab22e9 ("selftests: drv-net: rss_input_xfrm: Check test prerequisites before running")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250723135454.649342-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The require_cmd() method was checking for command availability locally
even when remote=True was specified, due to a missing host parameter.
Fix by passing host=self.remote when checking remote command
availability, ensuring commands are verified on the correct host.
Fixes: f1e68a1a4a ("selftests: drv-net: add require_XYZ() helpers for validating env")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250723135454.649342-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A few packets may still be sent out during the termination of iperf
processes. These late packets cause failures in rss_ctx.py when they
arrive on queues expected to be empty.
Example failure observed:
Check failed 2 != 0 traffic on inactive queues (context 1):
[0, 0, 1, 1, 386385, 397196, 0, 0, 0, 0, ...]
Check failed 4 != 0 traffic on inactive queues (context 2):
[0, 0, 0, 0, 2, 2, 247152, 253013, 0, 0, ...]
Check failed 2 != 0 traffic on inactive queues (context 3):
[0, 0, 0, 0, 0, 0, 1, 1, 282434, 283070, ...]
To avoid such failures, wait until all client sockets for the requested
port are either closed or in the TIME_WAIT state.
Fixes: 847aa551fa ("selftests: drv-net: rss_ctx: factor out send traffic and check")
Signed-off-by: Nimrod Oren <noren@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722122655.3194442-1-noren@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bpftrace is very useful for low level driver testing. perf or trace-cmd
would also do for collecting data from tracepoints, but they require
much more post-processing.
Add a wrapper for running bpftrace and sanitizing its output.
bpftrace has JSON output, which is great, but it prints loose objects
and in a slightly inconvenient format. We have to read the objects
line by line, and while at it return them indexed by the map name.
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-1-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This test suite validates the functionality of the devlink-rate API for
traffic class (TC) bandwidth allocation. It ensures that bandwidth can
be distributed between different traffic classes as configured, and
verifies that explicit TC-to-queue mapping is required for the
allocation to be effective.
The first test (test_no_tc_mapping_bandwidth) is marked as expected
failure on mlx5, since the hardware automatically enforces traffic
class separation by dynamically moving queues to the correct TC
scheduler, even without explicit TC-to-queue mapping configuration.
Test output on mlx5:
1..2
# Created VF interface: eth5
# Created VLAN eth5.101 on eth5 with tc 3 and IP 198.51.100.2
# Created VLAN eth5.102 on eth5 with tc 4 and IP 198.51.100.10
# Set representor eth4 up and added to bridge
# Bandwidth check results without TC mapping:
# TC 3: 0.19 Gbits/sec
# TC 4: 0.76 Gbits/sec
# Total bandwidth: 0.95 Gbits/sec
# TC 3 percentage: 20.0%
# TC 4 percentage: 80.0%
ok 1 devlink_rate_tc_bw.test_no_tc_mapping_bandwidth # XFAIL Bandwidth matched 80/20 split without TC mapping
# Created VF interface: eth5
# Created VLAN eth5.101 on eth5 with tc 3 and IP 198.51.100.2
# Created VLAN eth5.102 on eth5 with tc 4 and IP 198.51.100.10
# Set representor eth4 up and added to bridge
# Bandwidth check results with TC mapping:
# TC 3: 0.21 Gbits/sec
# TC 4: 0.78 Gbits/sec
# Total bandwidth: 0.98 Gbits/sec
# TC 3 percentage: 21.1%
# TC 4 percentage: 78.9%
# Bandwidth is distributed as 80/20 with TC mapping
ok 2 devlink_rate_tc_bw.test_tc_mapping_bandwidth
# Totals: pass:1 fail:0 xfail:1 xpass:0 skip:0 error:0
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250629142138.361537-9-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Revert fbbf93556f ("selftests: nic_performance: Add selftest for performance of NIC driver")
Revert c087dc5439 ("selftests: nic_link_layer: Add selftest case for speed and duplex states")
Revert 6116075e18 ("selftests: nic_link_layer: Add link layer selftest for NIC driver")
These tests don't clean up after themselves, don't use the disruptive
annotations, don't get included in make install etc. etc. The tests
were added before we have any "HW" runner, so the issues were missed.
Our CI doesn't have any way of excluding broken tests, remove these
for now to stop the random pollution of results due to broken env.
We can always add them back once / if fixed.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250507140109.929801-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests to check that the napi retained the IRQ after down/up,
multiple changes in the number of rx queues and after
attaching/releasing XDP program.
Tested on ice and idpf:
# NETIF=<iface> tools/testing/selftests/drivers/net/hw/irq.py
KTAP version 1
1..4
ok 1 irq.check_irqs_reported
ok 2 irq.check_reconfig_queues
ok 3 irq.check_reconfig_xdp
ok 4 irq.check_down
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
Tested-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20250224232228.990783-7-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Looks like more and more tests want to iterate over IP version,
run the same test over ipv4 and ipv6. The current naming of
members in the env class makes it a bit awkward, we have
separate members for ipv4 and ipv6 parameters.
Store the parameters inside dicts, so that tests can easily
index them with ip version.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250218225426.77726-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refering to C binaries from Python code is going to be a common
need. Add a helper to convert from path in relation to the test.
Meaning, if the test is in the same directory as the binary, the
call would be simply: cfg.rpath("binary").
The helper name "rpath" is not great. I can't think of a better
name that would be accurate yet concise.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250207184140.1730466-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Most of our tests use rtnetlink to read device stats, so they
don't expose the drivers much to paths in which device stats
are read under RCU. Add tests which hammer profcs reads to
make sure drivers:
- don't sleep while reporting stats,
- can handle parallel reads,
- can handle device going down while reading.
Set ifname on the env class in NetDrvEnv, we already do that
in NetDrvEpEnv.
KTAP version 1
1..7
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
ok 5 stats.check_down
ok 6 stats.procfs_hammer
# completed up/down cycles: 6
ok 7 stats.procfs_downup_hammer
# Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250107022932.2087744-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add selftest case to check the send and receive throughput.
Supported link modes between local NIC driver and partner
are varied. Then send and receive throughput is captured
and verified. Test uses iperf3 tool.
Add iperf3 server/client function in GenerateTraffic class.
Signed-off-by: Mohan Prasad J <mohan.prasad@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add new @ksft_disruptive decorator to mark the tests that might
be disruptive to the system. Depending on how well the previous
test works in the CI we might want to disable disruptive tests
by default and only let the developers run them manually.
KSFT framework runs disruptive tests by default. DISRUPTIVE=False
environment (or config file) can be used to disable these tests.
ksft_setup should be called by the test cases that want to use
new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).
In the future we can add similar decorators to, for example, avoid
running slow tests all the time. And/or have some option to run
only 'fast' tests for some sort of smoke test scenario.
$ DISRUPTIVE=False ./stats.py
KTAP version 1
1..5
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
ok 5 stats.check_down # SKIP marked as disruptive
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:1 error:0
v3:
- parse yes and properly treat non-zero nums as true (Petr)
v2:
- convert from cli argument to env variable (Jakub)
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20240802000309.2368-2-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests focusing on indirection table configuration and
creating extra RSS contexts in drivers which support it.
$ export NETIF=eth0 REMOTE_...
$ ./drivers/net/hw/rss_ctx.py
KTAP version 1
1..8
ok 1 rss_ctx.test_rss_key_indir
ok 2 rss_ctx.test_rss_context
ok 3 rss_ctx.test_rss_context4
# Increasing queue count 44 -> 66
# Failed to create context 32, trying to test what we got
ok 4 rss_ctx.test_rss_context32 # SKIP Tested only 31 contexts, wanted 32
ok 5 rss_ctx.test_rss_context_overlap
ok 6 rss_ctx.test_rss_context_overlap2
# .. sprays traffic like a headless chicken ..
not ok 7 rss_ctx.test_rss_context_out_of_order
ok 8 rss_ctx.test_rss_context4_create_with_cfg
# Totals: pass:6 fail:1 xfail:0 xpass:0 skip:1 error:0
Note that rss_ctx.test_rss_context_out_of_order fails with the device
I tested with, but it seems to be a device / driver bug.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240626012456.2326192-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a selftest for netdev generic netlink. For now there is only a
single test that exercises the `queue-get` API.
The test works with netdevsim by default or with a real device by
setting NETIF.
Add a timeout param to cmd() since ethtool -L can take a long time on
real devices.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240507163228.2066817-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We created a separate directory for HW-only tests, recently.
Glue in the Python test library there, Python is a bit annoying
when it comes to using library code located "lower"
in the directory structure.
Reuse the Env class, but let tests require non-nsim setup.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240429144426.743476-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
More complex tests often have to spawn a background process,
like a server which will respond to requests or tcpdump.
Add support for creating such processes using the with keyword:
with bkg("my-daemon", ..):
# my-daemon is alive in this block
My initial thought was to add this support to cmd() directly
but it runs the command in the constructor, so by the time
we __enter__ it's too late to make sure we used "background=True".
Second useful helper transplanted from net_helper.sh is
wait_port_listen().
The test itself uses socat, which insists on v6 addresses
being wrapped in [], it's not the only command which requires
this format, so add the wrapped address to env. The hope
is to save test code from checking if address is v6.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240420025237.3309296-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>