Pavel Begunkov says:
====================
Add support for providers with large rx buffer
Many modern NICs support configurable receive buffer lengths, and zcrx and
memory providers can use buffers larger than 4K to improve performance.
When paired with hw-gro larger rx buffer sizes can drastically reduce
the number of buffers traversing the stack and save a lot of processing
time. It also allows to give to users larger contiguous chunks of data.
Single stream benchmarks showed up to ~30% CPU util improvement.
E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC:
packets=23987040 (MB=2745098), rps=199559 (MB/s=22837)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 1.53 0.00 27.78 2.72 1.31 66.45 0.22
packets=24078368 (MB=2755550), rps=200319 (MB/s=22924)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 0.69 0.00 8.26 31.65 1.83 57.00 0.57
This series adds net infrastructure for memory providers configuring
the size and implements it for bnxt. It's an opt-in feature for drivers,
they should advertise support for the parameter in the qops and must check
if the hardware supports the given size. It's limited to memory providers
as it drastically simplifies implementation. It doesn't affect the fast
path zcrx uAPI, and the user exposed parameter is defined in zcrx terms,
which allows it to be flexible and adjusted in the future.
A liburing example can be found at [2]
full branch:
[1] https://github.com/isilence/linux.git zcrx/large-buffers-v8
Liburing example:
[2] https://github.com/isilence/liburing.git zcrx/rx-buf-len
* tag 'net-queue-rx-buf-len-v9' of https://github.com/isilence/linux:
io_uring/zcrx: document area chunking parameter
selftests: iou-zcrx: test large chunk sizes
eth: bnxt: support qcfg provided rx page size
eth: bnxt: adjust the fill level of agg queues with larger buffers
eth: bnxt: store rx buffer size per queue
net: pass queue rx page size from memory provider
net: add bare bone queue configs
net: reduce indent of struct netdev_queue_mgmt_ops members
net: memzero mp params when closing a queue
====================
Link: https://patch.msgid.link/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The iou-zcrx selftest currently runs the server on the remote host
and the client on the local host. This commit flips the endpoints
such that server runs on localhost and client on remote.
This change brings the iou-zcrx selftest in convention with other
selftests.
Drive-by fix for a missing import exception that happens when the
network interface has less than 2 combined channels.
Test plan: ran iou-zcrx.py selftest between 2 physical machines
Signed-off-by: Vishwanath Seshagiri <vishs@fb.com>
Link: https://patch.msgid.link/20250710165337.614159-1-vishs@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
RSS contexts are used to shard work across multiple queues for an
application using io_uring zero copy receive. Add a test case checking
that steering flows into an RSS context works.
Until I add multi-thread support to the selftest binary, this test case
only has 1 queue in the RSS context.
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250425022049.3474590-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For bnxt when the agg ring is used then tcp-data-split is automatically
reported to be enabled, but __net_mp_open_rxq() requires tcp-data-split
to be explicitly enabled by the user.
Enable tcp-data-split explicitly in io_uring zc rx selftest.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250409163153.2747918-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a selftest for io_uring zero copy Rx. This test cannot run locally
and requires a remote host to be configured in net.config. The remote
host must have hardware support for zero copy Rx as listed in the
documentation page. The test will restore the NIC config back to before
the test and is idempotent.
liburing is required to compile the test and be installed on the remote
host running the test.
Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>