crypto: qat - add heartbeat feature

Under some circumstances, firmware in the QAT devices could become
unresponsive. The Heartbeat feature provides a mechanism to detect
unresponsive devices.

The QAT FW periodically writes to memory a set of counters that allow
to detect the liveness of a device. This patch adds logic to enable
the reporting of those counters, analyze them and report if a device
is alive or not.

In particular this adds
  (1) heartbeat enabling, reading and detection logic
  (2) reporting of heartbeat status and configuration via debugfs
  (3) documentation for the newly created sysfs entries
  (4) configuration of FW settings related to heartbeat, e.g. tick period
  (5) logic to convert time in ms (provided by the user) to clock ticks

This patch introduces a new folder in debugfs called heartbeat with the
following attributes:
 - status
 - queries_sent
 - queries_failed
 - config

All attributes except config are reading only. In particular:
 - `status` file returns 0 when device is operational and -1 otherwise.
 - `queries_sent` returns the total number of heartbeat queries sent.
 - `queries_failed` returns the total number of heartbeat queries failed.
 - `config` allows to adjust the frequency at which the firmware writes
   counters to memory. This period is given in milliseconds and it is
   fixed for GEN4 devices.

Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This commit is contained in:
Damian Muszynski
2023-06-30 19:03:57 +02:00
committed by Herbert Xu
parent e2980ba57e
commit 359b84f8db
21 changed files with 662 additions and 2 deletions

View File

@@ -233,6 +233,7 @@ struct adf_hw_device_data {
u8 num_accel;
u8 num_logical_accel;
u8 num_engines;
u32 num_hb_ctrs;
};
/* CSR write macro */
@@ -245,6 +246,11 @@ struct adf_hw_device_data {
#define ADF_CFG_NUM_SERVICES 4
#define ADF_SRV_TYPE_BIT_LEN 3
#define ADF_SRV_TYPE_MASK 0x7
#define ADF_AE_ADMIN_THREAD 7
#define ADF_NUM_THREADS_PER_AE 8
#define ADF_NUM_PKE_STRAND 2
#define ADF_AE_STRAND0_THREAD 8
#define ADF_AE_STRAND1_THREAD 9
#define GET_DEV(accel_dev) ((accel_dev)->accel_pci_dev.pci_dev->dev)
#define GET_BARS(accel_dev) ((accel_dev)->accel_pci_dev.pci_bars)
@@ -301,6 +307,7 @@ struct adf_accel_dev {
struct module *owner;
struct adf_accel_pci accel_pci_dev;
struct adf_timer *timer;
struct adf_heartbeat *heartbeat;
union {
struct {
/* protects VF2PF interrupts access */