HW reports each A-MSDU subframe as a separate
sk_buff. It is impossible to configure it to
behave differently.
Until now ath10k was reconstructing A-MSDUs from
subframes which involved a lot of memory
operations. This proved to be a significant
contributor to degraded RX performance.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Instead of allocating sk_buff for a mere 16-byte
tx fragment list buffer use headroom of the
original msdu sk_buff.
This decreases CPU cache pressure and improves
performance.
Measured improvement on AP135 is 560mbps ->
590mbps of UDP TX briding traffic.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Until now the all MSDU transfer related structures
were freed when all resources were unreferenced.
Now HTC transfer is freed independently and HTT
transfer is so too.
This yields a way more simpler ath10k_skb_cb and
should possibly enable parallel pipe processing
(which is now serialized in
ath10k_pci_process_ce routine).
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This reduces number of memory accesses and
hopefully contributes to better performance in the
future.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It's more efficient to simply check num_pending_tx
value instead of traversing whole bitmap of
msdu ids.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>