During packet transmission, if the system is under heavy load,
the hardware might complete processing the packet and free the
request memory (req) before the transmission function finishes.
If the software subsequently accesses this req, a use-after-free
error will occur. The qp_ctx memory exists throughout the packet
sending process, so replace the req with the qp_ctx.
Fixes: f0ae287c50 ("crypto: hisilicon/sec2 - implement full backlog mode for sec")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
When all hardware queues are busy and no shareable queue,
new processes fail to apply for queues. To avoid affecting
tasks, support fallback mechanism when hardware queues are
unavailable.
Fixes: c16a70c1f2 ("crypto: hisilicon/sec - add new algorithm mode for AEAD")
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Consolidate the creation and start of qp into the function
hisi_qm_alloc_qps_node. This change eliminates the need for
each module to perform these steps in two separate phases
(creation and start).
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Originally, when a queue was requested, it could only be configured
with the default algorithm type of 0. Now, when multiple tfms use
the same queue, the queue must be selected based on its attributes
to meet the requirements of tfm tasks. So the algorithm type
attribute of queue need to be distinguished. Just like a queue used
for compression in ZIP cannot be used for decompression tasks.
Fixes: 3f1ec97aac ("crypto: hisilicon/qm - Put device finding logic into QM")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When multiple tfm use a same qp, the backlog data should be managed
centrally by the qp, rather than in the qp_ctx of each req.
Additionally, since SEC_BD_TYPE1 and SEC_BD_TYPE2 cannot use the
tag of the sqe to carry the virtual address of the req, the sent
sqe is stored in the qp. This allows the callback function to get
the req address. To handle the differences between hardware types,
the callback functions are split into two separate implementations.
Fixes: f0ae287c50 ("crypto: hisilicon/sec2 - implement full backlog mode for sec")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix the false-positive warning of qp_ctx being unitialised in
sec_request_init. The value of ctx_q_num defaults to 2 and is
guaranteed to be non-zero.
Thus qp_ctx is always initialised. However, the compiler is
not aware of this constraint on ctx_q_num. Restructure the loop
so that it is obvious to the compiler that ctx_q_num cannot be
zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces a hierarchical backlog mechanism to cache
user data in high-throughput encryption/decryption scenarios,
the implementation addresses packet loss issues when hardware
queues overflow during peak loads.
First, we use sec_alloc_req_id to obtain an exclusive resource
from the pre-allocated resource pool of each queue, if no resource
is allocated, perform the DMA map operation on the request memory.
When the task is ready, we will attempt to send it to the hardware,
if the hardware queue is already full, we cache the request into
the backlog list, then return an EBUSY status to the upper layer
and instruct the packet-sending thread to pause transmission.
Simultaneously, when the hardware completes a task, it triggers
the sec callback function, within this function, reattempt to send
the requests from the backlog list and wake up the sending thread
until the hardware queue becomes fully occupied again.
In addition, it handles such exceptions like the hardware is reset
when packets are sent, it will switch to the software computing
and release occupied resources.
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The following splat was triggered when booting the kernel built with
arm64's defconfig + CRYPTO_SELFTESTS + DMA_API_DEBUG.
------------[ cut here ]------------
DMA-API: hisi_sec2 0000:75:00.0: cacheline tracking EEXIST, overlapping mappings aren't supported
WARNING: CPU: 24 PID: 1273 at kernel/dma/debug.c:596 add_dma_entry+0x248/0x308
Call trace:
add_dma_entry+0x248/0x308 (P)
debug_dma_map_sg+0x208/0x3e4
__dma_map_sg_attrs+0xbc/0x118
dma_map_sg_attrs+0x10/0x24
hisi_acc_sg_buf_map_to_hw_sgl+0x80/0x218 [hisi_qm]
sec_cipher_map+0xc4/0x338 [hisi_sec2]
sec_aead_sgl_map+0x18/0x24 [hisi_sec2]
sec_process+0xb8/0x36c [hisi_sec2]
sec_aead_crypto+0xe4/0x264 [hisi_sec2]
sec_aead_encrypt+0x14/0x20 [hisi_sec2]
crypto_aead_encrypt+0x24/0x38
test_aead_vec_cfg+0x480/0x7e4
test_aead_vec+0x84/0x1b8
alg_test_aead+0xc0/0x498
alg_test.part.0+0x518/0x524
alg_test+0x20/0x64
cryptomgr_test+0x24/0x44
kthread+0x130/0x1fc
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
DMA-API: Mapped at:
debug_dma_map_sg+0x234/0x3e4
__dma_map_sg_attrs+0xbc/0x118
dma_map_sg_attrs+0x10/0x24
hisi_acc_sg_buf_map_to_hw_sgl+0x80/0x218 [hisi_qm]
sec_cipher_map+0xc4/0x338 [hisi_sec2]
This occurs in selftests where the input and the output scatterlist point
to the same underlying memory (e.g., when tested with INPLACE_TWO_SGLISTS
mode).
The problem is that the hisi_sec2 driver maps these two different
scatterlists using the DMA_BIDIRECTIONAL flag which leads to overlapped
write mappings which are not supported by the DMA layer.
Fix it by using the fine grained and correct DMA mapping directions. While
at it, switch the DMA directions used by the hisi_zip driver too.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
During encryption and decryption, user requests
must be checked first, if the specifications that
are not supported by the hardware are used, the
software computing is used for processing.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The hardware only supports authentication sizes
that are 4-byte aligned. Therefore, the driver
switches to software computation in this case.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to the HMAC RFC, the authentication key
can be 0 bytes, and the hardware can handle this
scenario. Therefore, remove the incorrect validation
for this case.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the digest alg is HMAC-SHAx or another, the authsize may be less
than 4 bytes and mac_len of the BD is set to zero, the hardware considers
it a BD configuration error and reports a ras error, so the sec driver
needs to switch to software calculation in this case, this patch add a
check for it and remove unnecessary check that has been done by crypto.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the AEAD algorithm is used for encryption or decryption,
the input authentication length varies, the hardware needs to
obtain the input length to pass the integrity check verification.
Currently, the driver uses a fixed authentication length,which
causes decryption failure, so the length configuration is modified.
In addition, the step of setting the auth length is unnecessary,
so it was deleted from the setkey function.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Query the capability register status of accelerator devices
(SEC, HPRE and ZIP) through the debugfs interface, for example:
cat cap_regs. The purpose is to improve the robustness and
locability of hardware devices and drivers.
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AIV is one of the SEC resources. When releasing resources,
it need to release the AIV resources at the same time.
Otherwise, memory leakage occurs.
The aiv resource release is added to the sec resource release
function.
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes following cleanup issues:
- The return value of the function is
inconsistent with the actual return type.
- After the pointer type is directly converted
to the `__le64` type, the program may crash
or produce unexpected results.
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pre-store the valid value of the sec alg support related capability
register in sec_qm_init(), which will be called by probe process.
It can reduce the number of capability register queries and avoid
obtaining incorrect values in abnormal scenarios, such as reset
failed and the memory space disabled.
Fixes: 921715b6b7 ("crypto: hisilicon/sec - get algorithm bitmap from registers")
Signed-off-by: Zhiqi Song <songzhiqi1@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the Kunpeng accelerator executes tasks such as encryption
and decryption have minimum requirements on the number of device
queues. If the number of queues does not meet the requirement,
the process initialization will fail. Therefore, the driver checks
the number of queues on the device before registering the algorithm.
If the number does not meet the requirements, the driver does not register
the algorithm to crypto subsystem, the device is still added to the
qm_list.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When sec_aead_mac_init returns an error code, sec_cipher_map
will exit abnormally, the hardware sgl should be unmmaped.
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add function 'sec_get_alg_bitmap' to get hardware algorithm bitmap
before register algorithm to crypto, instead of determining
whether to register an algorithm based on hardware platform's version.
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hardware V3 and later versions can obtain qp num and depth supported
by the hardware from registers. To be compatible with later hardware
versions, get qp num and depth from registers instead of fixed marcos.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The authentication algorithm supports a maximum of 128-byte keys.
The allocated key memory is insufficient.
Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Should not to uses the CRYPTO_ALG_ALLOCATES_MEMORY in SEC2. The SEC2
driver uses the pre-allocated buffers, including the src sgl pool, dst
sgl pool and other qp ctx resources. (e.g. IV buffer, mac buffer, key
buffer). The SEC2 driver doesn't allocate memory during request processing.
The driver only maps software sgl to allocated hardware sgl during I/O. So
here is fix it.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to the subreq pointer misuse the private context memory. The aead
soft crypto occasionally casues the OS panic as setting the 64K page.
Here is fix it.
Fixes: 6c46a3297b ("crypto: hisilicon/sec - add fallback tfm...")
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the correct print format. Printing an unsigned int value should
use %u instead of %d.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CTR counter is 32bit rollover default on the BD.
But the NIST standard is 128bit rollover. it cause the
testing failed, so need to fix the BD configuration.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Modify the print of information that might lead to user misunderstanding.
Currently only XTS mode need the fallback tfm when using 192bit key.
Others algs not need soft fallback tfm. So others algs can return
directly.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Modify the SEC request structure, combines two common parameters of the
SEC request into one parameter.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add fallback tfm supporting for hisi_sec driver. Due to the Kunpeng920's
CCM/GCM algorithm not supports 0 byte src length. So the driver needs to
setting the soft fallback aead tfm.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add fallback tfm supporting for hisi_sec driver. Due to the hardware
not supports 192bit key length when using XTS mode. So the driver needs
to setting the soft fallback skcipher tfm for user.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to Kunpeng930 adds new SQE data structure, the SEC driver needs
to be upgraded. It mainly includes bd parsing process and bd filling
process.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>