Commit Graph

17 Commits

Author SHA1 Message Date
Jeroen de Borst
056a70924a gve: Add header split ethtool stats
To record the stats of header split packets, three stats are added in
the driver's ethtool stats.

- rx_hsplit_pkt is the split packets count with header split
- rx_hsplit_bytes is the received header bytes count with header split
- rx_hsplit_unsplit_pkt is the unsplit packet count due to header buffer
  overflow or zero header length when header split is enabled

Currently, it's entering the stats_update critical section more than
once per packet. We have plans to avoid that in the future change to let
all the stats_update happen in one place at the end of
`gve_rx_poll_dqo`.

Co-developed-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 10:03:32 +00:00
Jeroen de Borst
5e37d8254e gve: Add header split data path
Add header buffers and ethtool support to enable header split via the
tcp-data-split flag in ethtool's ringparam config. A coherent dma memory
is allocated for the header buffers. There is one header buffer per ring
entry by calculating the offset to the header-buffers starting address.
The header buffer is always copied directly into the skb and payload is
always added as frags. When there is a header buffer overflow or the
header length is 0, the driver places the whole unsplit packet in frags.

When toggling header split, the driver will call gve_adjust_config to
set its queues appropriately. If header split is enabled by the user and
the max packet buffer size is no less than 4KB, driver will set the
packet buffer size as 4KB to support TCP_ZEROCOPY_RECEIVE. Otherwise the
driver will use the default 2KB as the packet buffer size.

`ethtool -G <dev> tcp-data-split on/off` is the command to toggle header
split.
`ethtool -g <dev>` will show the status of header split with the field
of `tcp-data-split`.

Co-developed-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 10:03:32 +00:00
Jeroen de Borst
0b43cf527d gve: Add header split device option
To enable header split via ethtool, we first need to query the device to
get the max rx buffer size and header buffer size. Add a device option
to get these values and store them in the driver. If the header buffer
size received from the device is non-zero, it means header split is
supported in the device.

Currently the max rx buffer size will only be used when header split is
enabled which will set the data_buffer_size_dqo to be the max rx buffer
size. Also change the data_buffer_size_dqo from int to u16 since we are
modifying it and making it to be consistent with max_rx_buffer_size.

Co-developed-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 10:03:31 +00:00
Shailend Chand
f13697cc7a gve: Switch to config-aware queue allocation
The new config-aware functions will help achieve the goal of being able
to allocate resources for new queues while there already are active
queues serving traffic.

These new functions work off of arbitrary queue allocation configs
rather than just the currently active config in priv, and they return
the newly allocated resources instead of writing them into priv.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122182632.1102721-4-shailend@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-23 17:41:31 -08:00
Eric Dumazet
817c7cd204 gve: fix frag_list chaining
gve_rx_append_frags() is able to build skbs chained with frag_list,
like GRO engine.

Problem is that shinfo->frag_list should only be used
for the head of the chain.

All other links should use skb->next pointer.

Otherwise, built skbs are not valid and can cause crashes.

Equivalent code in GRO (skb_gro_receive()) is:

    if (NAPI_GRO_CB(p)->last == p)
        skb_shinfo(p)->frag_list = skb;
    else
        NAPI_GRO_CB(p)->last->next = skb;
    NAPI_GRO_CB(p)->last = skb;

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bailey Forrest <bcf@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Catherine Sullivan <csully@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04 06:52:27 +01:00
Rushil Gupta
e7075ab4fb gve: RX path for DQO-QPL
The RX path allocates the QPL page pool at queue creation, and
tries to reuse these pages through page recycling. This patch
ensures that on refill no non-QPL pages are posted to the device.

When the driver is running low on free buffers, an ondemand
allocation step kicks in that allocates a non-qpl page for
SKB business to free up the QPL page in use.

gve_try_recycle_buf was moved to gve_rx_append_frags so that driver does
not attempt to mark buffer as used if a non-qpl page was allocated
ondemand.

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:36 +01:00
Praveen Kaligineedi
2e80aeae9f gve: XDP support GQI-QPL: helper function changes
This patch adds/modifies helper functions needed to add XDP
support.

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:20 +00:00
Shailend Chand
82fd151d38 gve: Reduce alloc and copy costs in the GQ rx path
Previously, even if just one of the many fragments of a 9k packet
required a copy, we'd copy the whole packet into a freshly-allocated
9k-sized linear SKB, and this led to performance issues.

By having a pool of pages to copy into, each fragment can be
independently handled, leading to a reduced incidence of
allocation and copy.

Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-02 11:52:51 +00:00
Shailend Chand
8ccac4edc8 gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20220913000901.959546-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:31:06 -07:00
Catherine Sullivan
a92f7a6fee gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:45:01 -08:00
David Awogbemila
37149e9374 gve: Implement packet continuation for RX.
This enables the driver to receive RX packets spread across multiple
buffers:

For a given multi-fragment packet the "packet continuation" bit is set
on all descriptors except the last one. These descriptors' payloads are
combined into a single SKB before the SKB is handed to the
networking stack.

This change adds a "packet buffer size" notion for RX queues. The
CreateRxQueue AdminQueue command sent to the device now includes the
packet_buffer_size.

We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the
driver the opportunity to flip pages where we can instead of copying.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:13:12 +01:00
David Awogbemila
1344e751e9 gve: Add RX context.
This refactor moves the skb_head and skb_tail fields into a new
gve_rx_ctx struct. This new struct will contain information about the
current packet being processed. This is in preparation for
multi-descriptor RX packets.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:13:12 +01:00
Bailey Forrest
1bfa4d0cb5 gve: DQO: Remove incorrect prefetch
The prefetch is incorrectly using the dma address instead of the virtual
address.

It's supposed to be:
prefetch((char *)buf_state->page_info.page_address +
	 buf_state->page_info.page_offset)

However, after correcting this mistake, there is no evidence of
performance improvement.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-02 12:06:17 -07:00
Dan Carpenter
ecd89c02da gve: DQO: Fix off by one in gve_rx_dqo()
The rx->dqo.buf_states[] array is allocated in gve_rx_alloc_ring_dqo()
and it has rx->dqo.num_buf_states so this > needs to >= to prevent an
out of bounds access.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:49:44 -07:00
Bailey Forrest
9b8dd5e5ea gve: DQO: Add RX path
The RX queue has an array of `gve_rx_buf_state_dqo` objects. All
allocated pages have an associated buf_state object. When a buffer is
posted on the RX buffer queue, the buffer ID will be the buf_state's
index into the RX queue's array.

On packet reception, the RX queue will have one descriptor for each
buffer associated with a received packet. Each RX descriptor will have
a buffer_id that was posted on the buffer queue.

Notable mentions:

- We use a default buffer size of 2048 bytes. Based on page size, we
  may post separate sections of a single page as separate buffers.

- The driver holds an extra reference on pages passed up the receive
  path with an skb and keeps these pages on a list. When posting new
  buffers to the NIC, we check if any of these pages has only our
  reference, or another buffer sized segment of the page has no
  references. If so, it is free to reuse. This page recycling approach
  is a common netdev optimization that reduces page alloc/free calls.

- Pages in the free list have a page_count bias in order to avoid an
  atomic increment of pagecount every time we attempt to reuse a page.
  # references = page_count() - bias

- In order to track when a page is safe to reuse, we keep track of the
  last offset which had a single SKB reference. When this occurs, it
  implies that every single other offset is reusable. Otherwise, we
  don't know if offsets can be safely reused.

- We maintain two free lists of pages. List #1 (recycled_buf_states)
  contains pages we know can be reused right away. List #2
  (used_buf_states) contains pages which cannot be used right away. We
  only attempt to get pages from list #2 when list #1 is empty. We only
  attempt to use a small fixed number pages from list #2 before giving
  up and allocating a new page. Both lists are FIFOs in hope that by the
  time we attempt to reuse a page, the references were dropped.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest
9c1a59a2f4 gve: DQO: Add ring allocation and initialization
Allocate the buffer and completion ring structures. Do not populate the
rings yet. That will happen in the respective rx and tx datapath
follow-on patches

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest
5e8c5adf95 gve: DQO: Add core netdev features
Add napi netdev device registration, interrupt handling and initial tx
and rx polling stubs. The stubs will be filled in follow-on patches.

Also:
- LRO feature advertisement and handling
- Also update ethtool logic

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00