The SEV-SNP firmware provides the SNP_CONFIG command used to set various
system-wide configuration values for SNP guests, such as the reported
TCB version used when signing guest attestation reports. Add an
interface to set this via userspace.
[ mdr: Squash in doc patch from Dionna, drop extended request/
certificate handling and simplify this to a simple wrapper around
SNP_CONFIG fw cmd. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-26-michael.roth@amd.com
The SNP_COMMIT command is used to commit the currently installed version
of the SEV firmware. Once committed, the firmware cannot be replaced
with a previous firmware version (cannot be rolled back). This command
will also update the reported TCB to match that of the currently
installed firmware.
[ mdr: Note the reported TCB update in the documentation/commit. ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-25-michael.roth@amd.com
Add a kdump safe version of sev_firmware_shutdown() and register it as a
crash_kexec_post_notifier so it will be invoked during panic/crash to do
SEV/SNP shutdown. This is required for transitioning all IOMMU pages to
reclaim/hypervisor state, otherwise re-init of IOMMU pages during
crashdump kernel boot fails and panics the crashdump kernel.
This panic notifier runs in atomic context, hence it ensures not to
acquire any locks/mutexes and polls for PSP command completion instead
of depending on PSP command completion interrupt.
[ mdr: Remove use of "we" in comments. ]
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-21-michael.roth@amd.com
The behavior of legacy SEV commands is altered when the firmware is
initialized for SNP support. In that case, all command buffer memory
that may get written to by legacy SEV commands must be marked as
firmware-owned in the RMP table prior to issuing the command.
Additionally, when a command buffer contains a system physical address
that points to additional buffers that firmware may write to, special
handling is needed depending on whether:
1) the system physical address points to guest memory
2) the system physical address points to host memory
To handle case #1, the pages of these buffers are changed to
firmware-owned in the RMP table before issuing the command, and restored
to hypervisor-owned after the command completes.
For case #2, a bounce buffer is used instead of the original address.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-19-michael.roth@amd.com
For SEV/SEV-ES, a buffer can be used to access non-volatile data so it
can be initialized from a file specified by the init_ex_path CCP module
parameter instead of relying on the SPI bus for NV storage, and
afterward the buffer can be read from to sync new data back to the file.
When SNP is enabled, the pages comprising this buffer need to be set to
firmware-owned in the RMP table before they can be accessed by firmware
for subsequent updates to the initial contents.
Implement that handling here.
[ bp: Carve out allocation into a helper. ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-18-michael.roth@amd.com
The behavior and requirement for the SEV-legacy command is altered when
the SNP firmware is in the INIT state. See SEV-SNP firmware ABI
specification for more details.
Allocate the Trusted Memory Region (TMR) as a 2MB-sized/aligned region
when SNP is enabled to satisfy new requirements for SNP. Continue
allocating a 1MB-sized region for !SNP configuration.
[ bp: Carve out TMR allocation into a helper. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-17-michael.roth@amd.com
Before SNP VMs can be launched, the platform must be appropriately
configured and initialized via the SNP_INIT command.
During the execution of SNP_INIT command, the firmware configures
and enables SNP security policy enforcement in many system components.
Some system components write to regions of memory reserved by early
x86 firmware (e.g. UEFI). Other system components write to regions
provided by the operation system, hypervisor, or x86 firmware.
Such system components can only write to HV-fixed pages or Default
pages. They will error when attempting to write to pages in other page
states after SNP_INIT enables their SNP enforcement.
Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list of
system physical address ranges to convert into the HV-fixed page states
during the RMP initialization. If INIT_RMP is 1, hypervisors should
provide all system physical address ranges that the hypervisor will
never assign to a guest until the next RMP re-initialization.
For instance, the memory that UEFI reserves should be included in the
range list. This allows system components that occasionally write to
memory (e.g. logging to UEFI reserved regions) to not fail due to
RMP initialization and SNP enablement.
Note that SNP_INIT(_EX) must not be executed while non-SEV guests are
executing, otherwise it is possible that the system could reset or hang.
The psp_init_on_probe module parameter was added for SEV/SEV-ES support
and the init_ex_path module parameter to allow for time for the
necessary file system to be mounted/available.
SNP_INIT(_EX) does not use the file associated with init_ex_path. So, to
avoid running into issues where SNP_INIT(_EX) is called while there are
other running guests, issue it during module probe regardless of the
psp_init_on_probe setting, but maintain the previous deferrable handling
for SEV/SEV-ES initialization.
[ mdr: Squash in psp_init_on_probe changes from Tom, reduce
proliferation of 'probe' function parameter where possible.
bp: Fix 32-bit allmodconfig build. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-14-michael.roth@amd.com
AMD introduced the next generation of SEV called SEV-SNP (Secure Nested
Paging). SEV-SNP builds upon existing SEV and SEV-ES functionality while
adding new hardware security protection.
Define the commands and structures used to communicate with the AMD-SP
when creating and managing the SEV-SNP guests. The SEV-SNP firmware spec
is available at developer.amd.com/sev.
[ mdr: update SNP command list and SNP status struct based on current
spec, use C99 flexible arrays, fix kernel-doc issues. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-13-michael.roth@amd.com
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc_node() function instead of the
argument count * size in the kzalloc_node() function.
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1]
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1]
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Switch to raw_smp_processor_id() to prevent a number of
warnings from kernel debugging. We do not care about
preemption here, as the CPU number is only used as a
poor mans load balancing or device selection. If preemption
happens during an encrypt/decrypt operation a small performance
hit will occur but everything will continue to work, so just
ignore it. This commit is similar to e7a9b05ca4
("crypto: cavium - Fix smp_processor_id() warnings").
[ 7538.874350] BUG: using smp_processor_id() in preemptible [00000000] code: af_alg06/8438
[ 7538.874368] caller is debug_smp_processor_id+0x1c/0x28
[ 7538.874373] CPU: 50 PID: 8438 Comm: af_alg06 Kdump: loaded Not tainted 5.10.0.pc+ #18
[ 7538.874377] Call trace:
[ 7538.874387] dump_backtrace+0x0/0x210
[ 7538.874389] show_stack+0x2c/0x38
[ 7538.874392] dump_stack+0x110/0x164
[ 7538.874394] check_preemption_disabled+0xf4/0x108
[ 7538.874396] debug_smp_processor_id+0x1c/0x28
[ 7538.874406] sec_create_qps+0x24/0xe8 [hisi_sec2]
[ 7538.874408] sec_ctx_base_init+0x20/0x4d8 [hisi_sec2]
[ 7538.874411] sec_aead_ctx_init+0x68/0x180 [hisi_sec2]
[ 7538.874413] sec_aead_sha256_ctx_init+0x28/0x38 [hisi_sec2]
[ 7538.874421] crypto_aead_init_tfm+0x54/0x68
[ 7538.874423] crypto_create_tfm_node+0x6c/0x110
[ 7538.874424] crypto_alloc_tfm_node+0x74/0x288
[ 7538.874426] crypto_alloc_aead+0x40/0x50
[ 7538.874431] aead_bind+0x50/0xd0
[ 7538.874433] alg_bind+0x94/0x148
[ 7538.874439] __sys_bind+0x98/0x118
[ 7538.874441] __arm64_sys_bind+0x28/0x38
[ 7538.874445] do_el0_svc+0x88/0x258
[ 7538.874447] el0_svc+0x1c/0x28
[ 7538.874449] el0_sync_handler+0x8c/0xb8
[ 7538.874452] el0_sync+0x148/0x180
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Read the values of some device registers before the device
is reset, these values help analyze the cause of the device exception.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Support get device current state. The value 0 indicates that
the device is busy, and the value 1 indicates that the
device is idle. When the device is in suspended, 1 is returned.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes the debugfs_create_dir() error checking in
iaa_crypto_debugfs_init(). Because the debugfs_create_dir() is developed
in a way that the caller can safely handle the errors that
occur during the creation of DebugFS nodes.
Signed-off-by: Minjie Du <duminjie@vivo.com>
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The use of array_size() leads gcc to assume the memcpy() can have a larger
limit than actually possible, which triggers a string fortification warning:
In file included from include/linux/string.h:296,
from include/linux/bitmap.h:12,
from include/linux/cpumask.h:12,
from include/linux/sched.h:16,
from include/linux/delay.h:23,
from include/linux/iopoll.h:12,
from drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.c:3:
In function 'fortify_memcpy_chk',
inlined from 'adf_gen4_init_thd2arb_map' at drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.c:401:3:
include/linux/fortify-string.h:579:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
579 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/fortify-string.h:588:4: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
588 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add an explicit range check to avoid this.
Fixes: 5da6a2d535 ("crypto: qat - generate dynamically arbiter mappings")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Relocate all crypto files in vmx driver to arch/powerpc/crypto directory
and remove vmx directory.
drivers/crypto/vmx/aes.c rename to arch/powerpc/crypto/aes.c
drivers/crypto/vmx/aes_cbc.c rename to arch/powerpc/crypto/aes_cbc.c
drivers/crypto/vmx/aes_ctr.c rename to arch/powerpc/crypto/aes_ctr.c
drivers/crypto/vmx/aes_xts.c rename to arch/powerpc/crypto/aes_xts.c
drivers/crypto/vmx/aesp8-ppc.h rename to arch/powerpc/crypto/aesp8-ppc.h
drivers/crypto/vmx/aesp8-ppc.pl rename to arch/powerpc/crypto/aesp8-ppc.pl
drivers/crypto/vmx/ghash.c rename to arch/powerpc/crypto/ghash.c
drivers/crypto/vmx/ghashp8-ppc.pl rename to arch/powerpc/crypto/ghashp8-ppc.pl
drivers/crypto/vmx/vmx.c rename to arch/powerpc/crypto/vmx.c
deleted files:
drivers/crypto/vmx/Makefile
drivers/crypto/vmx/Kconfig
drivers/crypto/vmx/ppc-xlate.pl
This patch has been tested has passed the selftest. The patch is also tested with
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled.
Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The kfree() function was called in up to two cases by the
__virtio_crypto_akcipher_do_req() function during error handling
even if the passed variable contained a null pointer.
This issue was detected by using the Coccinelle software.
* Adjust jump targets.
* Delete two initialisations which became unnecessary
with this refactoring.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ahash_alg->setkey is updated to ahash_nosetkey in ahash.c
so checking setkey() function to determine hmac algorithm is not valid.
to fix this added is_hmac variable in structure caam_hash_alg to determine
whether the algorithm is hmac or not.
Fixes: 2f1f34c1bf ("crypto: ahash - optimize performance when wrapping shash")
Signed-off-by: Gaurav Jain <gaurav.jain@nxp.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The commit "crypto: qat - generate dynamically arbiter mappings"
introduced a regression on qat_402xx devices.
This is reported when the driver probes the device, as indicated by
the following error messages:
4xxx 0000:0b:00.0: enabling device (0140 -> 0142)
4xxx 0000:0b:00.0: Generate of the thread to arbiter map failed
4xxx 0000:0b:00.0: Direct firmware load for qat_402xx_mmp.bin failed with error -2
The root cause of this issue was the omission of a necessary function
pointer required by the mapping algorithm during the implementation.
Fix it by adding the missing function pointer.
Fixes: 5da6a2d535 ("crypto: qat - generate dynamically arbiter mappings")
Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In some configurations e.g. systems with CXL, a numa node can have 0
cpus and cpumask_nth() will return a cpu value that doesn't exist,
which will result in an attempt to add an entry to the wq table at a
bad index.
To fix this, when iterating the cpus for a node, skip any node that
doesn't have cpus.
Also, as a precaution, add a warning and bail if cpumask_nth() returns
a nonexistent cpu.
Reported-by: Zhang, Rex <rex.zhang@intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Do not spam the kernel log with unnecessary error messages when processing
requests that aren't a multiple of AES block size.
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The 'active' flag is only used to indirectly set the 'first' flag.
Drop the 'active' flag and set 'first' directly in sahara_sha_init().
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Switch to use dev_err_probe() to simplify the error paths and unify
message template. While at it, also remove explicit error messages
from every potential -ENOMEM.
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use devm_clk_get_enabled() helper to simplify probe/remove code. Also, use
dev_err_probe() for error reporting.
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Where applicable, use BIT() macro instead of shift operation to improve
readability. No functional change.
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When testing sahara sha256 speed performance with tcrypt (mode=404) on
imx53-qsrb board, multiple "Invalid numbers of src SG." errors are
reported. This was traced to sahara_walk_and_recalc() resizing req->src
and causing the subsequent dma_map_sg() call to fail.
Now that the previous commit fixed sahara_sha_hw_links_create() to take
into account the actual request size, rather than relying on sg->length
values, the resize operation is no longer necessary.
Therefore, remove sahara_walk_and_recalc() and simplify associated logic.
Fixes: 5a2bb93f59 ("crypto: sahara - add support for SHA1/256")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It's not always the case that the entire sg entry needs to be processed.
Currently, when nbytes is less than sg->length, "Descriptor length" errors
are encountered.
To fix this, take the actual request size into account when populating the
hw links.
Fixes: 5a2bb93f59 ("crypto: sahara - add support for SHA1/256")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sahara_sha_hw_data_descriptor_create() returns negative error codes on
failure, so make sure the errors are correctly handled / propagated.
Fixes: 5a2bb93f59 ("crypto: sahara - add support for SHA1/256")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The sg lists are not unmapped in case of timeout errors. Fix this.
Fixes: 5a2bb93f59 ("crypto: sahara - add support for SHA1/256")
Fixes: 5de8875281 ("crypto: sahara - Add driver for SAHARA2 accelerator.")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Set the reqsize for sha algorithms to sizeof(struct sahara_sha_reqctx), the
extra space is not needed.
Fixes: 5a2bb93f59 ("crypto: sahara - add support for SHA1/256")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In case of a zero-length input, exit gracefully from sahara_aes_crypt().
Fixes: 5de8875281 ("crypto: sahara - Add driver for SAHARA2 accelerator.")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The thread-to-arbiter mapping describes which arbiter can assign jobs
to an acceleration engine thread.
The existing mappings are functionally correct, but hardcoded and not
optimized.
Replace the static mappings with an algorithm that generates optimal
mappings, based on the loaded configuration.
The logic has been made common so that it can be shared between all
QAT GEN4 devices.
Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Expose through debugfs ring pair telemetry data for QAT GEN4 devices.
This allows to gather metrics about the PCIe channel and device TLB for
a selected ring pair. It is possible to monitor maximum 4 ring pairs at
the time per device.
For details, refer to debugfs-driver-qat_telemetry in Documentation/ABI.
This patch is based on earlier work done by Wojciech Ziemba.
Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Expose through debugfs device telemetry data for QAT GEN4 devices.
This allows to gather metrics about the performance and the utilization
of a device. In particular, statistics on (1) the utilization of the
PCIe channel, (2) address translation, when SVA is enabled and (3) the
internal engines for crypto and data compression.
If telemetry is supported by the firmware, the driver allocates a DMA
region and a circular buffer. When telemetry is enabled, through the
`control` attribute in debugfs, the driver sends to the firmware, via
the admin interface, the `TL_START` command. This triggers the device to
periodically gather telemetry data from hardware registers and write it
into the DMA memory region. The device writes into the shared region
every second.
The driver, every 500ms, snapshots the DMA shared region into the
circular buffer. This is then used to compute basic metric
(min/max/average) on each counter, every time the `device_data` attribute
is queried.
Telemetry counters are exposed through debugfs in the folder
/sys/kernel/debug/qat_<device>_<BDF>/telemetry.
For details, refer to debugfs-driver-qat_telemetry in Documentation/ABI.
This patch is based on earlier work done by Wojciech Ziemba.
Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Extend the admin interface with two new public APIs to enable
and disable the telemetry feature: adf_send_admin_tl_start() and
adf_send_admin_tl_stop().
The first, sends to the firmware, through the ICP_QAT_FW_TL_START
message, the IO address where the firmware will write telemetry
metrics and a list of ring pairs (maximum 4) to be monitored.
It returns the number of accelerators of each type supported by
this hardware. After this message is sent, the firmware starts
periodically reporting telemetry data using by writing into the
dma buffer specified as input.
The second, sends the admin message ICP_QAT_FW_TL_STOP
which stops the reporting of telemetry data.
This patch is based on earlier work done by Wojciech Ziemba.
Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In order for shared workqeues to work properly, desc->priv should be
set to 0 rather than 1. The need for this is described in commit
f5ccf55e10 (dmaengine/idxd: Re-enable kernel workqueue under DMA
API), so we need to make IAA consistent with IOMMU settings, otherwise
we get:
[ 141.948389] IOMMU: dmar15: Page request in Privilege Mode
[ 141.948394] dmar15: Invalid page request: 2000026a100101 ffffb167
Dedicated workqueues ignore this field and are unaffected.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>