linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-05 06:52:34 -04:00

Author	SHA1	Message	Date
Jeroen de Borst	056a70924a	gve: Add header split ethtool stats To record the stats of header split packets, three stats are added in the driver's ethtool stats. - rx_hsplit_pkt is the split packets count with header split - rx_hsplit_bytes is the received header bytes count with header split - rx_hsplit_unsplit_pkt is the unsplit packet count due to header buffer overflow or zero header length when header split is enabled Currently, it's entering the stats_update critical section more than once per packet. We have plans to avoid that in the future change to let all the stats_update happen in one place at the end of `gve_rx_poll_dqo`. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-04 10:03:32 +00:00
Jeroen de Borst	5e37d8254e	gve: Add header split data path Add header buffers and ethtool support to enable header split via the tcp-data-split flag in ethtool's ringparam config. A coherent dma memory is allocated for the header buffers. There is one header buffer per ring entry by calculating the offset to the header-buffers starting address. The header buffer is always copied directly into the skb and payload is always added as frags. When there is a header buffer overflow or the header length is 0, the driver places the whole unsplit packet in frags. When toggling header split, the driver will call gve_adjust_config to set its queues appropriately. If header split is enabled by the user and the max packet buffer size is no less than 4KB, driver will set the packet buffer size as 4KB to support TCP_ZEROCOPY_RECEIVE. Otherwise the driver will use the default 2KB as the packet buffer size. `ethtool -G <dev> tcp-data-split on/off` is the command to toggle header split. `ethtool -g <dev>` will show the status of header split with the field of `tcp-data-split`. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-04 10:03:32 +00:00
Jeroen de Borst	0b43cf527d	gve: Add header split device option To enable header split via ethtool, we first need to query the device to get the max rx buffer size and header buffer size. Add a device option to get these values and store them in the driver. If the header buffer size received from the device is non-zero, it means header split is supported in the device. Currently the max rx buffer size will only be used when header split is enabled which will set the data_buffer_size_dqo to be the max rx buffer size. Also change the data_buffer_size_dqo from int to u16 since we are modifying it and making it to be consistent with max_rx_buffer_size. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-04 10:03:31 +00:00
Shailend Chand	f13697cc7a	gve: Switch to config-aware queue allocation The new config-aware functions will help achieve the goal of being able to allocate resources for new queues while there already are active queues serving traffic. These new functions work off of arbitrary queue allocation configs rather than just the currently active config in priv, and they return the newly allocated resources instead of writing them into priv. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240122182632.1102721-4-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-23 17:41:31 -08:00
Eric Dumazet	817c7cd204	gve: fix frag_list chaining gve_rx_append_frags() is able to build skbs chained with frag_list, like GRO engine. Problem is that shinfo->frag_list should only be used for the head of the chain. All other links should use skb->next pointer. Otherwise, built skbs are not valid and can cause crashes. Equivalent code in GRO (skb_gro_receive()) is: if (NAPI_GRO_CB(p)->last == p) skb_shinfo(p)->frag_list = skb; else NAPI_GRO_CB(p)->last->next = skb; NAPI_GRO_CB(p)->last = skb; Fixes: `9b8dd5e5ea` ("gve: DQO: Add RX path") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bailey Forrest <bcf@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Catherine Sullivan <csully@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-09-04 06:52:27 +01:00
Rushil Gupta	e7075ab4fb	gve: RX path for DQO-QPL The RX path allocates the QPL page pool at queue creation, and tries to reuse these pages through page recycling. This patch ensures that on refill no non-QPL pages are posted to the device. When the driver is running low on free buffers, an ondemand allocation step kicks in that allocates a non-qpl page for SKB business to free up the QPL page in use. gve_try_recycle_buf was moved to gve_rx_append_frags so that driver does not attempt to mark buffer as used if a non-qpl page was allocated ondemand. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:36 +01:00
Praveen Kaligineedi	2e80aeae9f	gve: XDP support GQI-QPL: helper function changes This patch adds/modifies helper functions needed to add XDP support. Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:29:20 +00:00
Shailend Chand	82fd151d38	gve: Reduce alloc and copy costs in the GQ rx path Previously, even if just one of the many fragments of a 9k packet required a copy, we'd copy the whole packet into a freshly-allocated 9k-sized linear SKB, and this led to performance issues. By having a pool of pages to copy into, each fragment can be independently handled, leading to a reduced incidence of allocation and copy. Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:52:51 +00:00
Shailend Chand	8ccac4edc8	gve: Fix GFP flags when allocing pages Use GFP_ATOMIC when allocating pages out of the hotpath, continue to use GFP_KERNEL when allocating pages during setup. GFP_KERNEL will allow blocking which allows it to succeed more often in a low memory environment but in the hotpath we do not want to allow the allocation to block. Fixes: `9b8dd5e5ea` ("gve: DQO: Add RX path") Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20220913000901.959546-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-19 18:31:06 -07:00
Catherine Sullivan	a92f7a6fee	gve: Fix GFP flags when allocing pages Use GFP_ATOMIC when allocating pages out of the hotpath, continue to use GFP_KERNEL when allocating pages during setup. GFP_KERNEL will allow blocking which allows it to succeed more often in a low memory environment but in the hotpath we do not want to allow the allocation to block. Fixes: `f5cedc84a3` ("gve: Add transmit and receive support") Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-26 18:45:01 -08:00
David Awogbemila	37149e9374	gve: Implement packet continuation for RX. This enables the driver to receive RX packets spread across multiple buffers: For a given multi-fragment packet the "packet continuation" bit is set on all descriptors except the last one. These descriptors' payloads are combined into a single SKB before the SKB is handed to the networking stack. This change adds a "packet buffer size" notion for RX queues. The CreateRxQueue AdminQueue command sent to the device now includes the packet_buffer_size. We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the driver the opportunity to flip pages where we can instead of copying. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-25 14:13:12 +01:00
David Awogbemila	1344e751e9	gve: Add RX context. This refactor moves the skb_head and skb_tail fields into a new gve_rx_ctx struct. This new struct will contain information about the current packet being processed. This is in preparation for multi-descriptor RX packets. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-25 14:13:12 +01:00
Bailey Forrest	1bfa4d0cb5	gve: DQO: Remove incorrect prefetch The prefetch is incorrectly using the dma address instead of the virtual address. It's supposed to be: prefetch((char *)buf_state->page_info.page_address + buf_state->page_info.page_offset) However, after correcting this mistake, there is no evidence of performance improvement. Fixes: `9b8dd5e5ea` ("gve: DQO: Add RX path") Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-02 12:06:17 -07:00
Dan Carpenter	ecd89c02da	gve: DQO: Fix off by one in gve_rx_dqo() The rx->dqo.buf_states[] array is allocated in gve_rx_alloc_ring_dqo() and it has rx->dqo.num_buf_states so this > needs to >= to prevent an out of bounds access. Fixes: `9b8dd5e5ea` ("gve: DQO: Add RX path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-29 11:49:44 -07:00
Bailey Forrest	9b8dd5e5ea	gve: DQO: Add RX path The RX queue has an array of `gve_rx_buf_state_dqo` objects. All allocated pages have an associated buf_state object. When a buffer is posted on the RX buffer queue, the buffer ID will be the buf_state's index into the RX queue's array. On packet reception, the RX queue will have one descriptor for each buffer associated with a received packet. Each RX descriptor will have a buffer_id that was posted on the buffer queue. Notable mentions: - We use a default buffer size of 2048 bytes. Based on page size, we may post separate sections of a single page as separate buffers. - The driver holds an extra reference on pages passed up the receive path with an skb and keeps these pages on a list. When posting new buffers to the NIC, we check if any of these pages has only our reference, or another buffer sized segment of the page has no references. If so, it is free to reuse. This page recycling approach is a common netdev optimization that reduces page alloc/free calls. - Pages in the free list have a page_count bias in order to avoid an atomic increment of pagecount every time we attempt to reuse a page. # references = page_count() - bias - In order to track when a page is safe to reuse, we keep track of the last offset which had a single SKB reference. When this occurs, it implies that every single other offset is reusable. Otherwise, we don't know if offsets can be safely reused. - We maintain two free lists of pages. List #1 (recycled_buf_states) contains pages we know can be reused right away. List #2 (used_buf_states) contains pages which cannot be used right away. We only attempt to get pages from list #2 when list #1 is empty. We only attempt to use a small fixed number of pages from list #2 before giving up and allocating a new page. Both lists are FIFOs in hope that by the time we attempt to reuse a page, the references were dropped. Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-24 12:47:38 -07:00
Bailey Forrest	9c1a59a2f4	gve: DQO: Add ring allocation and initialization Allocate the buffer and completion ring structures. Do not populate the rings yet. That will happen in the respective rx and tx datapath follow-on patches Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-24 12:47:38 -07:00
Bailey Forrest	5e8c5adf95	gve: DQO: Add core netdev features Add napi netdev device registration, interrupt handling and initial tx and rx polling stubs. The stubs will be filled in follow-on patches. Also: - LRO feature advertisement and handling - Also update ethtool logic Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-24 12:47:38 -07:00

17 Commits
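
The chaining rule described in commit 817c7cd204 ("gve: fix frag_list chaining") can be illustrated with a minimal userspace sketch. It is not the driver's code: `struct toy_skb`, `toy_chain_append` and `toy_chain_count` are hypothetical stand-ins, and the `frag_list` member here only models `skb_shinfo(skb)->frag_list`. The point it shows is the one from the commit message: only the head of the chain uses `frag_list`, and every later segment is linked through `next`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct sk_buff; not the kernel layout. */
struct toy_skb {
	struct toy_skb *next;      /* links every segment after the first one */
	struct toy_skb *frag_list; /* models skb_shinfo(skb)->frag_list, head only */
	unsigned int len;          /* bytes carried by this segment alone */
	unsigned int total_len;    /* running total, meaningful on the head only */
};

/*
 * Append seg to the chain that starts at head. *tail must point at the last
 * segment appended so far (or at head itself when the chain is still empty).
 * This mirrors the branch quoted in the commit message from skb_gro_receive().
 */
static void toy_chain_append(struct toy_skb *head, struct toy_skb **tail,
			     struct toy_skb *seg)
{
	assert(head && tail && *tail && seg);

	if (*tail == head)
		head->frag_list = seg;  /* first extra segment hangs off the head */
	else
		(*tail)->next = seg;    /* later segments are chained via next */

	*tail = seg;
	head->total_len += seg->len;
}

/* Count the head plus every chained segment by walking frag_list, then next. */
static unsigned int toy_chain_count(const struct toy_skb *head)
{
	unsigned int n = 1;

	for (const struct toy_skb *s = head->frag_list; s; s = s->next)
		n++;
	return n;
}
```

A caller would start with `tail` pointing at the head and call `toy_chain_append` once per extra segment. The bug the commit describes corresponds to always taking the first branch on the current tail, which would leave non-head segments carrying their own `frag_list`.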