Commit Graph

23 Commits

Author SHA1 Message Date
Jacob Moroni
62356fccb1 RDMA/irdma: Remove doorbell elision logic
In some cases, this logic can result in doorbell writes being
skipped when they should not have been (at least on GEN3 HW),
so remove it. This also means that the mb() can be safely
downgraded to dma_wmb().

Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251125025350.180-9-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-11-26 02:26:05 -05:00
Jay Bhat
da58d4223b RDMA/irdma: Take a lock before moving SRQ tail in poll_cq
Need to take an SRQ lock in poll_cq before moving SRQ tail.

Signed-off-by: Jay Bhat <jay.bhat@intel.com>
Reviewed-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251031021726.1003-7-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-11-06 02:23:23 -05:00
Jay Bhat
0a19274555 RDMA/irdma: CQ size and shadow update changes for GEN3
CQ shadow area should not be updated at the end of a page (once every
64th CQ entry), except when CQ has no more CQEs. SW must also increase
the requested CQ size by 1 and make sure the CQ is not exactly one page
in size. This is to address a quirk in the hardware.

Signed-off-by: Jay Bhat <jay.bhat@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251031021726.1003-4-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-11-02 06:52:58 -05:00
Jay Bhat
cd84d8001e RDMA/irdma: Silently consume unsignaled completions
In case we get an unsignaled error completion, we silently consume the CQE by
pretending the QP does not exist. Without this, bookkeeping for signaled
completions does not work correctly.

Signed-off-by: Jay Bhat <jay.bhat@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251031021726.1003-5-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-11-02 06:52:51 -05:00
Shiraz Saleem
42f1d09909 RDMA/irdma: Extend CQE Error and Flush Handling for GEN3 Devices
Enhance the CQE error and flush handling specific to GEN3 devices.
Unlike GEN1/2 devices, which depend on software to generate completions
in error, GEN3 devices leverage firmware to generate CQEs in error for
all WQEs posted after a QP moves to an error state.

Key changes include:
- Updating the CQ poll logic to properly advance the CQ head in the
event of a flush CQE.
- Updating the flush logic for GEN3 to pass error WQE idx
for SQ on an AE to flush out unprocessed WQEs in error.
- Isolating the decoding of AE to flush codes into a separate routine
irdma_ae_to_qp_err_code. This routine can now be leveraged to
flush error CQEs on an AE and when error CQE is received for SRQ.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250827152545.2056-16-tatyana.e.nikolova@intel.com
Tested-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-18 04:48:46 -04:00
Faisal Latif
a24a29c874 RDMA/irdma: Add Atomic Operations support
Extend irdma to support atomic operations, namely Compare and Swap and
Fetch and Add, for GEN3 devices.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250827152545.2056-15-tatyana.e.nikolova@intel.com
Tested-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-18 04:48:46 -04:00
Faisal Latif
563e1feb5f RDMA/irdma: Add SRQ support
Implement verb API and UAPI changes to support SRQ functionality in GEN3
devices.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250827152545.2056-13-tatyana.e.nikolova@intel.com
Tested-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-18 04:48:46 -04:00
Mustafa Ismail
d5edd33364 RDMA/irdma: Add GEN3 core driver support
Introduce support for the GEN3 auxiliary core driver, which is
responsible for initializing PCI-level RDMA resources.

Facilitate host-driver communication with the device's Control Plane (CP)
to discover capabilities and perform privileged operations through an
RDMA-specific messaging interface built atop the IDPF mailbox and virtual
channel protocol.

Establish the RDMA virtual channel message interface and incorporate
operations to retrieve the hardware version and discover capabilities
from the CP.

Additionally, set up the RDMA MMIO regions and initialize the RF structure.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Co-developed-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250827152545.2056-3-tatyana.e.nikolova@intel.com
Tested-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-18 04:48:45 -04:00
Krzysztof Kozlowski
3ec648c631 IB: Use capital "OR" for multiple licenses in SPDX
Documentation/process/license-rules.rst and checkpatch expect the SPDX
identifier syntax for multiple licenses to use capital "OR".  Correct it
to keep consistent format and avoid copy-paste issues.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-11 14:14:00 +03:00
Shiraz Saleem
295c95aa7e RDMA/irdma: Drop unused kernel push code
The driver has code blocks for kernel push WQEs but does not
map the doorbell page rendering this mode non functional [1]

Remove code associated with this feature from the kernel fast
path as there is currently no plan of record to support this.

This also address a sparse issue reported by lkp.

drivers/infiniband/hw/irdma/uk.c:285:24: sparse: sparse: incorrect type in assignment (different base types) @@     expected bool [usertype] push_wqe:1 @@     got restricted __le32 [usertype] *push_db @@
drivers/infiniband/hw/irdma/uk.c:285:24: sparse:     expected bool [usertype] push_wqe:1
drivers/infiniband/hw/irdma/uk.c:285:24: sparse:     got restricted __le32 [usertype] *push_db
drivers/infiniband/hw/irdma/uk.c:386:24: sparse: sparse: incorrect type in assignment (different base types) @@     expected bool [usertype] push_wqe:1 @@     got restricted __le32 [usertype] *push_db @@

[1] https://lore.kernel.org/linux-rdma/20230815051809.GB22185@unreal/T/#t

Fixes: 272bba19d6 ("RDMA: Remove unnecessary ternary operators")
Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308110251.BV6BcwUR-lkp@intel.com/
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230816001209.1721-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-08-16 09:24:52 +03:00
Sindhu Devale
72d422c246 RDMA/irdma: Use HW specific minimum WQ size
HW GEN1 and GEN2 have different min WQ sizes but they are
currently set to the same value.

Use a gen specific attribute min_hw_wq_size and extend ABI to
pass it to user-space.

Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155525.1081-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30 15:43:00 +03:00
Sindhu Devale
3a84987204 RDMA/irdma: Allow accurate reporting on QP max send/recv WR
Currently the attribute cap.max_send_wr and cap.max_recv_wr
sent from user-space during create QP are the provider computed
SQ/RQ depth as opposed to raw values passed from application.
This inhibits computation of an accurate value for max_send_wr
and max_recv_wr for this QP in the kernel which matches the value
returned in user create QP. Also these capabilities needs to be
reported from the driver in query QP.

Add support by extending the ABI to allow the raw cap.max_send_wr and
cap.max_recv_wr to be passed from user-space, while keeping compatibility
for the older scheme.

The internal HW depth and shift needed for the WQs needs to be computed
now for both kernel and user-mode QPs. Add new helpers to assist with this:
irdma_uk_calc_depth_shift_sq, irdma_uk_calc_depth_shift_rq and
irdma_uk_calc_depth_shift_wq.

Consolidate all the user mode QP setup into a new function
irdma_setup_umode_qp which keeps it with its counterpart
irdma_setup_kmode_qp.

Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155525.1081-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30 15:43:00 +03:00
Mustafa Ismail
24419777e9 RDMA/irdma: Fix RQ completion opcode
The opcode written by HW, in the RQ CQE, is the
RoCEv2/iWARP protocol opcode from the received
packet and not the SW opcode as currently assumed.
Fix this by returning the raw operation type and
queue type in the CQE to irdma_process_cqe and add
2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map
IRDMA HW op types to IB op types.

Note that for iWARP, only Write with Immediate is
supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM
when there is immediate data present.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Mustafa Ismail
4f44e519b6 RDMA/irdma: Fix inline for multiple SGE's
Currently, inline send and inline write assume a single
SGE and only copy data from the first one. Add support
for multiple SGE's.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Sindhu-Devale
7f51a961f8 RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the
correct flush_code and event_type. Fix these up so that the
correct IBV error and event codes are returned to application.

Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to
return the correct WC error code.

Fixes: 44d9e52977 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:52 +03:00
Shiraz Saleem
2c4b14ea95 RDMA/irdma: Remove enum irdma_status_code
Replace use of custom irdma_status_code with linux error codes.

Remove enum irdma_status_code and header in which its defined.

Link: https://lore.kernel.org/r/20220217151851.1518-2-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 15:24:18 -04:00
Zhu Yanjun
9ed8110c9b RDMA/irdma: optimize rx path by removing unnecessary copy
In the function irdma_post_recv, the function irdma_copy_sg_list is
not needed since the struct irdma_sge and ib_sge have the similar
member variables. The struct irdma_sge can be replaced with the
struct ib_sge totally.

This can increase the rx performance of irdma.

Link: https://lore.kernel.org/r/20211030104226.253346-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 14:39:33 -03:00
Zhu Yanjun
86479f8a3f RDMA/irdma: Remove the unused spin lock in struct irdma_qp_uk
The spin lock in struct irdma_qp_uk is not used. So remove it.

Link: https://lore.kernel.org/r/20211021230612.153812-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:36:58 -03:00
Zhu Yanjun
dede33da0d RDMA/irdma: Make irdma_uk_cq_init() return a void
The function irdma_uk_cq_init always returns 0, so make it void and delete
all the return value checks.

Link: https://lore.kernel.org/r/20211019153717.3836-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-20 19:54:16 -03:00
Zhu Yanjun
0bed5dfa5a RDMA/irdma: Remove irdma_uk_mw_bind()
The function irdma_uk_mw_bind is not used. So remove it.

Link: https://lore.kernel.org/r/20211011110128.4057-2-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:39 -03:00
Sindhu Devale
9f7fa37a6b RDMA/irdma: Report correct WC error when there are MW bind errors
Report the correct WC error when MW bind error related asynchronous events
are generated by HW.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-5-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
d3bdcd5963 RDMA/irdma: Report correct WC error when transport retry counter is exceeded
When the retry counter exceeds, as the remote QP didn't send any Ack or
Nack an asynchronous event (AE) for too many retries is generated. Add
code to handle the AE and set the correct IB WC error code
IB_WC_RETRY_EXC_ERR.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-4-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Mustafa Ismail
551c46edc7 RDMA/irdma: Add user/kernel shared libraries
Building the WQE descriptors for different verb
operations are similar in kernel and user-space.
Add these shared libraries.

Link: https://lore.kernel.org/r/20210602205138.889-12-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00