Add controller logs to kdump.
Driver allocates DMA memory and communicates this address to FW. In the
event of system crash, host driver notifies the firmware about the crash
and firmware posts all the necessary logs in the pre-allocated host buffer
for firmware debugging.
Once firmware notifies the completion of the log uploading to the host
memory and host continues with the OS crash dump saving.
This is a "feature" driven capability and is backward compatible with
existing controller FW.
Rename some prefixes for OFA (Online-Firmware Activation ofa_*) buffers to
host_memory_*. So, not a lot of actual functional changes to
smartpqi_init.c, mainly determining the memory size allocation.
Added a function to notify the controller to copy debug data into host
memory before continuing kdump.
Most of the functional changes are in smartpqi_sis.c where the actual
handshaking is done.
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-2-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow user to override the default driver timeout for controller ready.
There are some rare configurations which require the driver to wait longer
than the normal 3 minutes for the controller to complete its bootup
sequence and be ready to accept commands from the driver.
The module parameter is:
ctrl_ready_timeout= { 0 | 30-1800 }
and specifies the timeout in seconds for the driver to wait for controller
ready. The valid range is 0 or 30-1800. The default value is 0, which
causes the driver to use a timeout of 180 seconds (3 minutes).
Link: https://lore.kernel.org/r/165730607666.177165.9221211345284471213.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fail all outstanding requests after a PCI linkdown.
Block access to device SCSI attributes during the following conditions:
"Cable pull" is called PQI_CTRL_SURPRISE_REMOVAL.
"PCIe Link Down" is called PQI_CTRL_GRACEFUL_REMOVAL.
Block access to device SCSI attributes during and in rare instances when
the controller goes offline.
Either outstanding requests or the access of SCSI attributes post linkdown
can lead to a hang.
Post linkdown, driver does not fail the outstanding requests leading to
long wait time before all the IOs eventually fail.
Also access of the SCSI attributes by host applications can lead to a
system hang.
Link: https://lore.kernel.org/r/165730603578.177165.4699352086827187263.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct kdump hangs when controller is locked up.
There are occasions when a controller reboot (controller soft reset) is
issued when a controller firmware crash dump is in progress.
This leads to incomplete controller firmware crash dump:
- When the controller crash dump is in progress, and a kdump is initiated,
the driver issues inbound doorbell reset to bring back the controller in
SIS mode.
- If the controller is in locked up state, the inbound doorbell reset does
not work causing controller initialization failures. This results in the
driver hanging waiting for SIS mode.
To avoid an incomplete controller crash dump, add in a controller crash
dump handshake:
- Controller will indicate start and end of the controller crash dump by
setting some register bits.
- Driver will look these bits when a kdump is initiated. If a controller
crash dump is in progress, the driver will wait for the controller crash
dump to complete before issuing the controller soft reset then complete
driver initialization.
Link: https://lore.kernel.org/r/20210928235442.201875-3-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- when OFA event occurs, driver will stop traffic to RAID/HBA path. Driver
waits for all the outstanding requests to complete.
- Driver sends OFA event acknowledgment to firmware.
- Driver will wait until the new firmware is up and running.
- Driver will free up the resources.
- Driver calls SIS/PQI initialization and rescans the device list.
- Driver will resume the traffic to RAID/HBA path.
Reviewed-by: Murthy Bhat <murthy.bhat@microsemi.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>