Add debugging prints to the modify QP verb to help understand the cause a
returned error.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to create multiple GSI QPs, we need to set the source QP number to
one on all these QPs. Add the necessary definitions and infrastructure to
do that.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The driver checks the csum from the HW when completion arrived and marks
it in the wc->wc_flags field for the ulp drivers.
These is for packets from type IB_WC_RECV only.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to support LSO and CSUM in the TX flow the driver does the
following:
* LSO bit for the enum mlx5_ib_qp_flags was added, indicates QP that
supports LSO offloads.
* Enables the special offload when the QP is created, and enable the
special work request id (IB_WR_LSO) when comes.
* Calculates the size of the WQE according to the new WQE format that
support these offloads.
* Handles the new WQE format when arrived, sets the relevant
fields, and copies the needed data.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Modify mlx5_ib_process_mad to use PPCNT and query_vport commands
instead of MAD_IFC, as MAD_IFC is deprecated on new firmware
versions (and doesn't support RoCE anyway).
Traffic counters exist in both 32-bit and 64-bit forms.
Declaring support of extended coutners results in traffic counters
to be read in their 64-bit form only via the query_vport command.
Error counters exist only in 32-bit form and read via PPCNT command.
This commit also adds counters support in RoCE.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In case of failure returned from query function in
IB device registration, we need to clean IB cache which
was missed.
This change fixes it.
Fixes: 3e153a93a1 ('IB/core: Save the device attributes on the device
structure')
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Since the wait list is not protected against concurrent access
it must be processed from the context of the completion handler.
Replace the wait list processing code in the IB CM RTU callback
handler by code that triggers a completion handler. This patch
fixes the following rare crash:
WARNING: CPU: 2 PID: 78656 at lib/list_debug.c:53 __list_del_entry+0x67/0xd0()
list_del corruption, ffff88041ae404b8->next is LIST_POISON1 (dead000000000100)
Call Trace:
[<ffffffff81251c6b>] dump_stack+0x4f/0x74
[<ffffffff810574ab>] warn_slowpath_common+0x8b/0xd0
[<ffffffff81057591>] warn_slowpath_fmt+0x41/0x70
[<ffffffff8126f007>] __list_del_entry+0x67/0xd0
[<ffffffff8126f081>] list_del+0x11/0x40
[<ffffffffa0265242>] srpt_cm_handler+0x172/0x1a4 [ib_srpt]
[<ffffffffa0370370>] cm_process_work+0x20/0xf0 [ib_cm]
[<ffffffffa0370dae>] cm_establish_handler+0xbe/0x110 [ib_cm]
[<ffffffffa03733e7>] cm_work_handler+0x67/0xd0 [ib_cm]
[<ffffffff8107184d>] process_one_work+0x1bd/0x460
[<ffffffff81073148>] worker_thread+0x118/0x420
[<ffffffff81078444>] kthread+0xe4/0x100
[<ffffffff8151caff>] ret_from_fork+0x3f/0x70
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If an initiator observes LUN deletion during shutdown of the
target stack then that will trigger an I/O error even when using
multipathd. Users need a way to avoid that shutting down the
target stack causes I/O errors, e.g. by providing a way to force
initiator logout. Hence close all sessions if a target port is
disabled.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The only allowed return values for the write_pending() callback
function are 0, -EAGAIN and -ENOMEM. Since attempting to perform
RDMA over a disconnecting channel will result in an IB error
completion anyway, remove the code that checks the channel state
from srpt_write_pending().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Last WQE Reached event is only generated after one or more work
requests have been queued on the QP associated with a session. Since
session shutdown can start before any work requests have been queued,
use a zero-length RDMA write to wait until a QP has been drained.
Additionally, rework the code for closing and disconnecting a session.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In a later patch a function that can block will be called while
iterating over the rch_list. Hence protect that list with a
mutex instead of a spinlock. And since it is not allowed to sleep
while the task state != TASK_RUNNING, convert the list test in
srpt_ch_list_empty() into a lockless test.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In the CM REQ message handler, store the channel pointer in
cm_id->context such that the function srpt_find_channel() is no
longer needed. Additionally, make the CM event messages more
informative.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
srpt_abort_cmd() must not be called in state SRPT_STATE_DATA_IN. Issue
a warning if this occurs.
srpt_abort_cmd() must not invoke target_put_sess_cmd() for commands
in state SRPT_STATE_DONE because the srpt_abort_cmd() callers already
do this when necessary. Hence remove this call.
If an RDMA read fails the corresponding SCSI command must fail. Hence
add a transport_generic_request_failure() call.
Remove an incorrect srpt_abort_cmd() call from srpt_rdma_write_done().
Avoid that srpt_send_done() calls srpt_abort_cmd() for finished SCSI
commands.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The target core function that should be called if target_submit_cmd()
fails is target_put_sess_cmd(). Additionally, change the return type
of srpt_handle_cmd() from int into void.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid that srpt_close_session() waits if it doesn't have to wait.
Additionally, increase the time during which srpt_close_session()
waits until closing a session has finished. This makes it easier
to detect session shutdown bugs.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The target core guarantees that shutdown_session() is only invoked
once per session. This means that the ib_srpt target driver doesn't
have to track whether or not shutdown_session() has been called.
Additionally, ensure that target_sess_cmd_list_set_waiting() is
called before target_wait_for_sess_cmds() by moving it into
srpt_release_channel_work().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The only allowed channel state changes are those that change
the channel state into a state with a higher numerical value.
This allows to merge the functions srpt_set_ch_state() and
srpt_test_and_set_ch_state() into a single function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Just like other target drivers, use scsilun_to_int() to unpack SCSI
LUN numbers. This patch only changes the behavior of ib_srpt for LUN
numbers >= 16384.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Since struct srpt_node_acl is identical to struct se_node_acl,
remove the definition of the former structure. This patch does
not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Although sizeof is an operator and hence in many cases parentheses can
be left out, the recommended kernel coding style is to surround the
sizeof argument with parentheses. This patch does not change any
functionality. It has been generated by running the following shell
command:
sed -i 's/sizeof \([^ );,]*\)/sizeof(\1)/g' drivers/infiniband/ulp/srpt/*.[ch]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.
This patch fixes the following kernel crash:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
[<ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
[<ffffffff8109726f>] kthread+0xcf/0xe0
[<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: 3e4f574857 ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds support to create RoCE-v2 compatible AH. It uses ahid
field to tell network-header-type to user space library. The library
has to decode network-header-type from ahid field.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch implements following changes to support RoCE-v2
in the RC path:
* Get the GID-type for a given sgid.
* Based on the GID-type get IPv4/IPv6 L3-address
and give those to underlying device.
* Resolve and provide network header type to device.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds following changes to support RoCE-v2
in the UD path.
* During AH creation GID-type is resolved for a given gid-index.
* Based on GID-type protocol header is built.
* Work completion reports network header type and set
IB_WC_WITH_NETWORK_HDR_TYPE flag in wc->wc_flags to indicate
that the network header type is valid.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Since create_singlethread_workqueue uses kzalloc internally,
it can fail when the system is under memory pressure, so need
to handle it.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Reviewed-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Doug Ledford <dledford@redhat.com>
GRO is simpler to use than the old inet_lro library, and is compatible
with forwarding and bridging configurations.
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add support for receiving multicast/unicast traffic with
the don't trap rule.
Sniffing these packets requires a flow steering rule of type NORMAL
at priority 0 with flag IB_FLOW_ATTR_FLAGS_DONT_TRAP set.
Choosing between multicast or unicast is done via ethernet L2 dest_mac
mask and value:
- If mask is all zeros - unicast and multicast are set.
- If mask non zero - only mask with multicast bit 1 and rest 0 is
supported, the mac value will choose if it is
multicast or unicast rule.
If the mask multicast bit is on and some other bits are on too, it means
a request for specific multicast or unicast, this is not supported,
either receive all multicast or all unicast.
Only when limitations are met registered QP will receive requested type
but other QPs can receive same traffic if registered for it.
Otherwise, if limitations are not met, an error will be returned.
Limitations:
- Rule must be with priority 0.
- A0 mode is not supported.
- Sniffer QP cannot appear in any other flow steering rule.
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't trap flag (i.e. IB_FLOW_ATTR_FLAGS_DONT_TRAP) indicates that QP
will receive traffic, but will not steal it.
When a packet matches a flow steering rule that was created with
the don't trap flag, the QPs assigned to this rule will get this
packet, but matching will continue to other equal/lower priority
rules. This will let other QPs assigned to those rules to get the
packet too.
If both don't trap rule and other rules have the same priority
and match the same packet, the behavior is undefined.
The don't trap flag can't be set with default rule types
(i.e. IB_FLOW_ATTR_ALL_DEFAULT, IB_FLOW_ATTR_MC_DEFAULT) as default rules
don't have rules after them and don't trap has no meaning here.
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Wall time obtained from ktime_get_real_ns is susceptible to sudden jumps due to
user setting the time or due to NTP. Boot time is constantly increasing time
better suited for comparing two timestamps.
Signed-off-by: Abhilash Jindal <klock.android@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
i40iw_verbs.[ch] are to handle iwarp interface.
Changes since v2:
Made infiniband interface changes for 4.5
removed i40iw_reg_phys_mr() for 4.5
made changes as made by Christoph Hellwig made for nes
in i40iw_get_dma_mr().
Changes since v1:
Following modification based on Christoph Hellwig's feedback
-remove kmap() calls and moved to i40iw_cm.c.
-cleanup some of casts
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
i40iw_hw.c, i40iw_utils.c and i40iw_osdep.h are files to handle
interrupts and processing.
Changes since v1:
Cleanup/removed some macros reported by Christoph Hellwig.
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
i40iw_cm.c i40iw_cm.h are used for connection management.
changes since v2:
Implemented interface changes as reg_phys_mr() is
not part of inifiniband interface Done as
Christoph Hellwig <hch@infradead.org> did for nes.
Changes since v1:
improved casts
moved kmap() from i40iw_verbs.c to make them short
lived.
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
i40iw_main.c contains routines for i40e <=> i40iw interface and setup.
i40iw.h is header file for main device data structures.
i40iw_status.h is for return status codes.
Changes from v2:
more cast improvement
fixed timing issue during unload
added paramater change call from i40e
Changes from v1:
improved casting issues
do not print error using pr_err
change from bits to bool in i40iw_cqp_request{}
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add completion objects, named sq_drained and rq_drained, to the c4iw_qp
struct. The queue-specific completion object is signaled when the last
CQE is drained from the CQ for that queue.
Add c4iw_drain_sq() to block until qp->rq_drained is completed.
Add c4iw_drain_rq() to block until qp->sq_drained is completed.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add provider-specific drain_sq/drain_rq functions for providers needing
special drain logic.
Add static functions __ib_drain_sq() and __ib_drain_rq() which post noop
WRs to the SQ or RQ and block until their completions are processed.
This ensures the applications completions for work requests posted prior
to the drain work request have all been processed.
Add API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp().
For the drain logic to work, the caller must:
ensure there is room in the CQ(s) and QP for the drain work request
and completion.
allocate the CQ using ib_alloc_cq() and the CQ poll context cannot be
IB_POLL_DIRECT.
ensure that there are no other contexts that are posting WRs concurrently.
Otherwise the drain is not guaranteed.
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The cxgb4 prints an MMIO resource using the "0x%x" and "%p" format
strings on the length and start, respective, but that
triggers a compiler warning when using a 64-bit resource_size_t
on a 32-bit architecture:
drivers/infiniband/hw/cxgb4/device.c: In function 'c4iw_rdev_open':
drivers/infiniband/hw/cxgb4/device.c:807:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
(void *)pci_resource_start(rdev->lldi.pdev, 2),
This changes the format string to use %pR instead, which pretty-prints
the resource, avoids the warning and is shorter.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The max depth of a fastreg mr depends on whether the device supports
DSGL or not. So compute it dynamically based on the device support and
the module use_dsgl option.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This series provides support for iWARP applications to specify a TOS
value and have that map to a VLAN Priority for iw_cxgb4 iWARP connections.
In iw_cxgb4, when allocating an L2T entry, pass the skb_priority based
on the tos value in the cm_id. Also pass the correct tos value during
connection setup so the passive side gets the client's desired tos.
When sending the FLOWC work request to FW, if the egress device is
in a vlan, then use the vlan priority bits as the scheduling class.
This allows associating RDMA connections with scheduling classes to
provide traffic shaping per flow.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>