Add a new unused state to the admin completion context state machine,
replacing the occupied field. This improves the completion validity
check because it now enforces that the context is in the submitted state
before it is completed. Also add an allocated state as an intermediate
state between unused and submitted.
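
The transitions described above can be sketched roughly as follows.
This is an illustration only: the enum, struct and function names are
hypothetical, and the completed state is an assumption that this
message does not mention.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the driver's real identifiers may differ. */
enum comp_ctx_state {
	CTX_UNUSED,    /* in the pool, owned by nobody */
	CTX_ALLOCATED, /* taken from the pool, command not yet submitted */
	CTX_SUBMITTED, /* command posted, completion pending */
	CTX_COMPLETED, /* assumed final state once the CQE is consumed */
};

struct comp_ctx {
	enum comp_ctx_state state;
};

/* Move ctx from 'from' to 'to'; refuse if it is in any other state. */
static bool ctx_transition(struct comp_ctx *ctx, enum comp_ctx_state from,
			   enum comp_ctx_state to)
{
	if (ctx->state != from)
		return false;
	ctx->state = to;
	return true;
}

static bool ctx_alloc(struct comp_ctx *ctx)
{
	return ctx_transition(ctx, CTX_UNUSED, CTX_ALLOCATED);
}

static bool ctx_submit(struct comp_ctx *ctx)
{
	return ctx_transition(ctx, CTX_ALLOCATED, CTX_SUBMITTED);
}

/* The validity check: only a submitted context may be completed. */
static bool ctx_complete(struct comp_ctx *ctx)
{
	return ctx_transition(ctx, CTX_SUBMITTED, CTX_COMPLETED);
}

/* Return the context to the pool; putting it twice is a caller bug. */
static void ctx_put(struct comp_ctx *ctx)
{
	assert(ctx->state != CTX_UNUSED);
	ctx->state = CTX_UNUSED;
}
```

With a boolean occupied field, an allocated but not yet submitted
context would look the same as a submitted one; with explicit states
the completion path can reject it.
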
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20251210130614.36460-3-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
In admin command completion, we receive a CQE with the command ID, which
is constructed from the context index and entropy bits taken from the
admin queue producer counter. To help detect memory corruption in the
received CQE, validate the full command ID stored in the fetched context
against the CQE command ID. If there is a mismatch, complete the CQE
with an error.
Also use the LSBs of the admin queue producer counter so that an entropy
mismatch is detected across a smaller number of commands.
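
A minimal sketch of the check, under stated assumptions: the 5-bit
index width (a queue depth of 32), the bit layout and all names below
are hypothetical and chosen only to illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTX_IDX_BITS 5u /* assumed: admin queue depth of 32 */
#define CTX_IDX_MASK ((1u << CTX_IDX_BITS) - 1u)

struct comp_ctx {
	uint16_t cmd_id; /* full command ID recorded at submission */
};

/* Low bits: context index. High bits: LSBs of the producer counter. */
static uint16_t make_cmd_id(uint16_t ctx_idx, uint16_t pc)
{
	assert(ctx_idx <= CTX_IDX_MASK);
	return (uint16_t)(((uint32_t)pc << CTX_IDX_BITS) | ctx_idx);
}

/*
 * Fetch the context by the index bits of the CQE command ID, then
 * compare the full ID. A mismatch means the CQE does not belong to the
 * command recorded in that context and should be completed with error.
 */
static bool cqe_matches_ctx(const struct comp_ctx *pool, uint16_t cqe_cmd_id)
{
	const struct comp_ctx *ctx = &pool[cqe_cmd_id & CTX_IDX_MASK];

	return ctx->cmd_id == cqe_cmd_id;
}
```

Because the entropy comes from the low bits of the producer counter, two
consecutive commands that reuse the same context index already carry
different command IDs.
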
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20251210130614.36460-2-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
This patch adds support for CQ notifications through the standard verbs
API.
In order to achieve that, a new event queue (EQ) object is introduced,
which is in charge of reporting completion events to the driver. On
driver load, EQs are allocated and the affinity of each is set to a
single CPU. When a user app creates a CQ with a completion channel, the
completion vector number is converted to an EQ number, and that EQ
reports the CQ events.
In addition, the CQ creation admin command now returns an offset for the
CQ doorbell, which is mapped to the userspace provider and is used to arm
the CQ when requested by the user.
The EQs use a single doorbell (located on the registers BAR), which
encodes the EQ number and the arm flag as part of the doorbell value.
The driver polls an EQ on each new EQE and arms it when the poll is
completed.
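
As an illustration of a doorbell value that carries both fields, the
sketch below packs an EQ number and an arm flag into one 32-bit word.
The bit positions (EQ number in the low 16 bits, arm flag in bit 31)
and all names are assumptions; this message does not state the layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define EQ_DB_EQN_MASK 0xffffu
#define EQ_DB_ARM_BIT  (1u << 31)

/* Build the value that would be written to the shared EQ doorbell. */
static uint32_t eq_db_value(uint32_t eqn, bool arm)
{
	assert(eqn <= EQ_DB_EQN_MASK);
	return (eqn & EQ_DB_EQN_MASK) | (arm ? EQ_DB_ARM_BIT : 0u);
}
```

Sharing one register works because every write names its target EQ, so
no per-EQ doorbell address is needed.
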
Link: https://lore.kernel.org/r/20211003105605.29222-1-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
A pointer to store the command completion must be provided, as it is
always used in efa_com_put_comp_ctx() to return the completion context
back to the pool. Remove the NULL pointer check and the redundant
'status' field stored on the context, as the status can be retrieved
from the completion itself.
Link: https://lore.kernel.org/r/20210126120702.9807-2-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
We cannot rely on the entry memcpy, as we only copy the actual size of
the command; the rest of the bytes must be memset to zero.
Currently, providing non-zero memory has no user-visible impact.
However, since admin commands are extendable (in a backwards-compatible
way), everything beyond the size of the command must be cleared to
prevent issues in the future.
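
A minimal sketch of the idea, assuming a fixed 64-byte queue entry; the
entry size, struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AQ_ENTRY_SIZE 64 /* assumed entry size, for illustration */

struct aq_entry {
	unsigned char bytes[AQ_ENTRY_SIZE];
};

/*
 * Copy a command into a queue slot. The slot may hold stale bytes from
 * an earlier, longer command, so clear the whole entry first; memcpy
 * alone only overwrites the first cmd_size bytes.
 */
static void aq_write_entry(struct aq_entry *slot, const void *cmd,
			   size_t cmd_size)
{
	assert(cmd_size <= sizeof(*slot));
	memset(slot, 0, sizeof(*slot));
	memcpy(slot, cmd, cmd_size);
}
```

If a later revision of a command grows by a field, the consumer of the
entry then reads zero for that field from older submitters rather than
leftover data.
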
Fixes: 0420e54256 ("RDMA/efa: Implement functions that submit and complete admin commands")
Link: https://lore.kernel.org/r/20191112092608.46964-1-galpress@amazon.com
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Make admin command IDs easier to distinguish by using relevant bits from
the producer counter.
This allows us to differentiate admin commands with the same producer
index (which happens after the admin queue wraps around), which is
helpful when debugging.
Signed-off-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The admin command abort flow is buggy (use-after-free) and not really
necessary, as it is guaranteed that no user verbs threads are running in
parallel after ib_unregister_device() is called. Delete it.
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>