Commit Graph

587 Commits

Author SHA1 Message Date
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Weili Qian
6d0de6014b crypto: hisilicon/qm - increase wait time for mailbox
The device requires more time to process queue stop and function stop
mailbox commands compared to other mailbox commands . In the current
driver, the mailbox processing wait time for queue stop and function
stop is less than the device timeout, which may cause the driver to
incorrectly assume that the mailbox processing has failed. Therefore,
the driver wait time for queue stop and function stop should be set to
be greater than the device timeout.  And PF and VF communication
relies on mailbox, the communication wait time should also be modified.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Weili Qian
3296992ffc crypto: hisilicon/qm - obtain the mailbox configuration at one time
The malibox needs to be triggered by a 128bit atomic operation.
The reason is that the PF and VFs of the device share the mmio memory
of the mailbox, and the mutex cannot lock mailbox operations in
different functions, especially when passing through VFs to
virtual machines.

Currently, the write operation to the mailbox is already a 128-bit
atomic write. The read operation also needs to be modified to a
128-bit atomic read. Since there is no general 128-bit IO memory
access API in the current ARM64 architecture, and the stp and ldp
instructions do not guarantee atomic access to device memory, they
cannot be extracted as a general API. Therefore, the 128-bit atomic
read and write operations need to be implemented in the driver.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Weili Qian
fc8ae11b84 crypto: hisilicon/qm - remove unnecessary code in qm_mb_write()
Since the HiSilicon accelerator is used only on the
ARM64 architectures, the implementations for other
architectures are not needed, so remove the unnecessary code.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Chenghai Huang
ebf35d8f93 crypto: hisilicon/qm - move the barrier before writing to the mailbox register
Before sending the data via the mailbox to the hardware, to ensure
that the data accessed by the hardware is the most up-to-date,
a write barrier should be added before writing to the mailbox register.
The current memory barrier is placed after writing to the register,
the barrier order should be modified to be before writing to the register.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Weili Qian
3d3135057f crypto: hisilicon/trng - support tfms sharing the device
Since the number of devices is limited, and the number
of tfms may exceed the number of devices, to ensure that
tfms can be successfully allocated, support tfms
sharing the same device.

Fixes: e4d9d10ef4 ("crypto: hisilicon/trng - add support for PRNG")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Chenghai Huang
ea377793f4 crypto: hisilicon/zip - add lz4 algorithm for hisi_zip
Add the "hisi-lz4-acomp" algorithm by the crypto acomp. When the
8th bit of the capability register is 1, the lz4 algorithm will
register to crypto acomp, and the window length is configured to
16K by default.

Since the "hisi-lz4-acomp" currently only support compression
direction, decompression is completed by the soft lz4 algorithm.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-31 10:52:31 +08:00
Chenghai Huang
4154f7d3b1 crypto: hisilicon/sgl - fix inconsistent map/unmap direction issue
Ensure that the direction for dma_map_sg and dma_unmap_sg is
consistent.

Fixes: 2566de3e06 ("crypto: hisilicon - Use fine grained DMA mapping direction")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Qi Tao
e750743962 crypto: hisilicon/sec2 - support skcipher/aead fallback for hardware queue unavailable
When all hardware queues are busy and no shareable queue,
new processes fail to apply for queues. To avoid affecting
tasks, support fallback mechanism when hardware queues are
unavailable.

Fixes: c16a70c1f2 ("crypto: hisilicon/sec - add new algorithm mode for AEAD")
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Weili Qian
6aff4d977e crypto: hisilicon/hpre - support the hpre algorithm fallback
When all hardware queues are busy and no shareable queue,
new processes fail to apply for queues. To avoid affecting
tasks, support fallback mechanism when hardware queues are
unavailable.

HPRE driver supports DH algorithm, limited to prime numbers up to 4K.
It supports prime numbers larger than 4K via fallback mechanism.

Fixes: 05e7b906aa ("crypto: hisilicon/hpre - add 'ECDH' algorithm")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
73398f85a4 crypto: hisilicon/zip - support fallback for zip
When the hardware queue resource busy(no shareable queue)
or memery alloc fail in initialization of acomp_alg, use
soft algorithm to complete the work.

Fixes: 1a9e6f59ca ("crypto: hisilicon/zip - remove zlib and gzip")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
2a75decec1 crypto: hisilicon/qm - optimize device selection priority based on queue ref count and NUMA distance
Add device sorting criteria to prioritize devices with fewer
references and closer NUMA distances. Devices that are fully
occupied will not be prioritized for use.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
4705489742 crypto: hisilicon/qm - add reference counting to queues for tfm kernel reuse
Add reference counting to queues. When all queues are occupied, tfm
will reuse queues with the same algorithm type that have already
been allocated in the kernel. The corresponding queue will be
released when the reference count reaches 1.

Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
72f3bbebff crypto: hisilicon - consolidate qp creation and start in hisi_qm_alloc_qps_node
Consolidate the creation and start of qp into the function
hisi_qm_alloc_qps_node. This change eliminates the need for
each module to perform these steps in two separate phases
(creation and start).

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
8cd9b608ee crypto: hisilicon/qm - centralize the sending locks of each module into qm
When a single queue used by multiple tfms, the protection of shared
resources by individual module driver programs is no longer
sufficient. The hisi_qp_send needs to be ensured by the lock in qp.

Fixes: 5fdb4b345c ("crypto: hisilicon - add a lock for the qp send operation")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
21452eaa06 crypto: hisilicon/qm - enhance the configuration of req_type in queue attributes
Originally, when a queue was requested, it could only be configured
with the default algorithm type of 0. Now, when multiple tfms use
the same queue, the queue must be selected based on its attributes
to meet the requirements of tfm tasks. So the algorithm type
attribute of queue need to be distinguished. Just like a queue used
for compression in ZIP cannot be used for decompression tasks.

Fixes: 3f1ec97aac ("crypto: hisilicon/qm - Put device finding logic into QM")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
lizhi
3a19847581 crypto: hisilicon/hpre: extend tag field to 64 bits for better performance
This commit expands the tag field in hpre_sqe structure from 16-bit
to 64-bit. The change enables storing request addresses directly
in the tag field, allowing callback functions to access request messages
without the previous indirection mechanism.

By eliminating the need for lookup tables, this modification reduces lock
contention and associated overhead, leading to improved efficiency and
simplified code.

Fixes: c8b4b47707 ("crypto: hisilicon - add HiSilicon HPRE accelerator")
Signed-off-by: lizhi <lizhi206@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
08eb67d23e crypto: hisilicon/sec - move backlog management to qp and store sqe in qp for callback
When multiple tfm use a same qp, the backlog data should be managed
centrally by the qp, rather than in the qp_ctx of each req.

Additionally, since SEC_BD_TYPE1 and SEC_BD_TYPE2 cannot use the
tag of the sqe to carry the virtual address of the req, the sent
sqe is stored in the qp. This allows the callback function to get
the req address. To handle the differences between hardware types,
the callback functions are split into two separate implementations.

Fixes: f0ae287c50 ("crypto: hisilicon/sec2 - implement full backlog mode for sec")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
19c2475ce1 crypto: hisilicon/zip - adjust the way to obtain the req in the callback function
In the shared queue design, multiple tfms use same qp, and one qp
need to corresponds to multiple qp_ctx. So use tag to obtain the
req virtual address. Build a one-to-one relationship between tfm
and qp_ctx. finaly remove the old get_tag operation.

Fixes: 2bcf36348c ("crypto: hisilicon/zip - initialize operations about 'sqe' in 'acomp_alg.init'")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-01-16 14:02:06 +08:00
Chenghai Huang
b74fd80d7f crypto: hisilicon/qm - fix incorrect judgment in qm_get_complete_eqe_num()
In qm_get_complete_eqe_num(), the function entry has already
checked whether the interrupt is valid, so the interrupt event
can be processed directly. Currently, the interrupt valid bit is
being checked again redundantly, and no interrupt processing is
performed. Therefore, the loop condition should be modified to
directly process the interrupt event, and use do while instead of
the current while loop, because the condition is always satisfied
on the first iteration.

Fixes: f5a332980a ("crypto: hisilicon/qm - add the save operation of eqe and aeqe")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-12-19 14:47:46 +08:00
Linus Torvalds
a3ebb59eee Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Move libvfio selftest artifacts in preparation of more tightly
   coupled integration with KVM selftests (David Matlack)

 - Fix comment typo in mtty driver (Chu Guangqing)

 - Support for new hardware revision in the hisi_acc vfio-pci variant
   driver where the migration registers can now be accessed via the PF.
   When enabled for this support, the full BAR can be exposed to the
   user (Longfang Liu)

 - Fix vfio cdev support for VF token passing, using the correct size
   for the kernel structure, thereby actually allowing userspace to
   provide a non-zero UUID token. Also set the match token callback for
   the hisi_acc, fixing VF token support for this this vfio-pci variant
   driver (Raghavendra Rao Ananta)

 - Introduce internal callbacks on vfio devices to simplify and
   consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO
   data, removing various ioctl intercepts with a more structured
   solution (Jason Gunthorpe)

 - Introduce dma-buf support for vfio-pci devices, allowing MMIO regions
   to be exposed through dma-buf objects with lifecycle managed through
   move operations. This enables low-level interactions such as a
   vfio-pci based SPDK drivers interacting directly with dma-buf capable
   RDMA devices to enable peer-to-peer operations. IOMMUFD is also now
   able to build upon this support to fill a long standing feature gap
   versus the legacy vfio type1 IOMMU backend with an implementation of
   P2P support for VM use cases that better manages the lifecycle of the
   P2P mapping (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)

 - Convert eventfd triggering for error and request signals to use RCU
   mechanisms in order to avoid a 3-way lockdep reported deadlock issue
   (Alex Williamson)

 - Fix a 32-bit overflow introduced via dma-buf support manifesting with
   large DMA buffers (Alex Mastro)

 - Convert nvgrace-gpu vfio-pci variant driver to insert mappings on
   fault rather than at mmap time. This conversion serves both to make
   use of huge PFNMAPs but also to both avoid corrected RAS events
   during reset by now being subject to vfio-pci-core's use of
   unmap_mapping_range(), and to enable a device readiness test after
   reset (Ankit Agrawal)

 - Refactoring of vfio selftests to support multi-device tests and split
   code to provide better separation between IOMMU and device objects.
   This work also enables a new test suite addition to measure parallel
   device initialization latency (David Matlack)

* tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio: (65 commits)
  vfio: selftests: Add vfio_pci_device_init_perf_test
  vfio: selftests: Eliminate INVALID_IOVA
  vfio: selftests: Split libvfio.h into separate header files
  vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c
  vfio: selftests: Rename vfio_util.h to libvfio.h
  vfio: selftests: Stop passing device for IOMMU operations
  vfio: selftests: Move IOVA allocator into iova_allocator.c
  vfio: selftests: Move IOMMU library code into iommu.c
  vfio: selftests: Rename struct vfio_dma_region to dma_region
  vfio: selftests: Upgrade driver logging to dev_err()
  vfio: selftests: Prefix logs with device BDF where relevant
  vfio: selftests: Eliminate overly chatty logging
  vfio: selftests: Support multiple devices in the same container/iommufd
  vfio: selftests: Introduce struct iommu
  vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode
  vfio: selftests: Allow passing multiple BDFs on the command line
  vfio: selftests: Split run.sh into separate scripts
  vfio: selftests: Move run.sh into scripts directory
  vfio/nvgrace-gpu: wait for the GPU mem to be ready
  vfio/nvgrace-gpu: Inform devmem unmapped after reset
  ...
2025-12-04 18:42:48 -08:00
Linus Torvalds
a619fe35ab Merge tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Rewrite memcpy_sglist from scratch
   - Add on-stack AEAD request allocation
   - Fix partial block processing in ahash

  Algorithms:
   - Remove ansi_cprng
   - Remove tcrypt tests for poly1305
   - Fix EINPROGRESS processing in authenc
   - Fix double-free in zstd

  Drivers:
   - Use drbg ctr helper when reseeding xilinx-trng
   - Add support for PCI device 0x115A to ccp
   - Add support of paes in caam
   - Add support for aes-xts in dthev2

  Others:
   - Use likely in rhashtable lookup
   - Fix lockdep false-positive in padata by removing a helper"

* tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
  crypto: zstd - fix double-free in per-CPU stream cleanup
  crypto: ahash - Zero positive err value in ahash_update_finish
  crypto: ahash - Fix crypto_ahash_import with partial block data
  crypto: lib/mpi - use min() instead of min_t()
  crypto: ccp - use min() instead of min_t()
  hwrng: core - use min3() instead of nested min_t()
  crypto: aesni - ctr_crypt() use min() instead of min_t()
  crypto: drbg - Delete unused ctx from struct sdesc
  crypto: testmgr - Add missing DES weak and semi-weak key tests
  Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist"
  crypto: scatterwalk - Fix memcpy_sglist() to always succeed
  crypto: iaa - Request to add Kanchana P Sridhar to Maintainers.
  crypto: tcrypt - Remove unused poly1305 support
  crypto: ansi_cprng - Remove unused ansi_cprng algorithm
  crypto: asymmetric_keys - fix uninitialized pointers with free attribute
  KEYS: Avoid -Wflex-array-member-not-at-end warning
  crypto: ccree - Correctly handle return of sg_nents_for_len
  crypto: starfive - Correctly handle return of sg_nents_for_len
  crypto: iaa - Fix incorrect return value in save_iaa_wq()
  crypto: zstd - Remove unnecessary size_t cast
  ...
2025-12-03 11:28:38 -08:00
Miaoqian Lin
59b0afd01b crypto: hisilicon/qm - Fix device reference leak in qm_get_qos_value
The qm_get_qos_value() function calls bus_find_device_by_name() which
increases the device reference count, but fails to call put_device()
to balance the reference count and lead to a device reference leak.

Add put_device() calls in both the error path and success path to
properly balance the reference count.

Found via static analysis.

Fixes: 22d7a6c39c ("crypto: hisilicon/qm - add pci bdf number check")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-11-06 14:29:49 +08:00
Longfang Liu
4868d2d52d crypto: hisilicon - qm updates BAR configuration
On new platforms greater than QM_HW_V3, the configuration region for the
live migration function of the accelerator device is no longer
placed in the VF, but is instead placed in the PF.

Therefore, the configuration region of the live migration function
needs to be opened when the QM driver is loaded. When the QM driver
is uninstalled, the driver needs to clear this configuration.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Reviewed-by: Shameer Kolothum <shameerkolothum@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20251030015744.131771-2-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-05 14:56:16 -07:00
nieweiqiang
d912494747 crypto: hisilicon/qm - add missing default in switch in qm_vft_data_cfg
Add default case to avoid warnings and meet code style requirements.

Signed-off-by: nieweiqiang <nieweiqiang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-31 17:50:43 +08:00
nieweiqiang
51996331f7 crypto: hisilicon/sgl - remove unnecessary checks for curr_hw_sgl error
Before calling acc_get_sgl(), the mem_block has already been
created. acc_get_sgl() will not return NULL or any other error.
so the return value check can be removed.

Signed-off-by: nieweiqiang <nieweiqiang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-31 17:50:43 +08:00
nieweiqiang
1b5645e454 crypto: hisilicon/qm - add concurrency protection for variable err_threshold
The isolate_strategy_store function is not protected
by a lock. If sysfs operations and functions that depend
on the err_threshold variable,such as qm_hw_err_isolate(),
execute concurrently, the outcome will be unpredictable.
Therefore, concurrency protection should be added for
the err_threshold variable.

Signed-off-by: nieweiqiang <nieweiqiang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-31 17:50:42 +08:00
nieweiqiang
f5a332980a crypto: hisilicon/qm - add the save operation of eqe and aeqe
The eqe and aeqe are device updated values that include the
valid bit and queue number. In the current process, there is no
memory barrier added, so it cannot be guaranteed that the valid
bit is read before other processes are executed. Since eqe and aeqe
are only 4 bytes and the device writes them to memory in a single
operation, saving the values of eqe and aeqe ensures that the valid
bit and queue number read by the CPU were written by the device
simultaneously.

Signed-off-by: nieweiqiang <nieweiqiang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-31 17:50:42 +08:00
nieweiqiang
e7066160f5 crypto: hisilicon/qm - restore original qos values
When the new qos valus setting fails, restore to
the original qos values.

Fixes: 72b010dc33 ("crypto: hisilicon/qm - supports writing QoS int the host")
Signed-off-by: nieweiqiang <nieweiqiang@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-23 12:55:43 +08:00
Linus Torvalds
908057d185 Merge tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Drivers:
   - Add ciphertext hiding support to ccp
   - Add hashjoin, gather and UDMA data move features to hisilicon
   - Add lz4 and lz77_only to hisilicon
   - Add xilinx hwrng driver
   - Add ti driver with ecb/cbc aes support
   - Add ring buffer idle and command queue telemetry for GEN6 in qat

  Others:
   - Use rcu_dereference_all to stop false alarms in rhashtable
   - Fix CPU number wraparound in padata"

* tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (78 commits)
  dt-bindings: rng: hisi-rng: convert to DT schema
  crypto: doc - Add explicit title heading to API docs
  hwrng: ks-sa - fix division by zero in ks_sa_rng_init
  KEYS: X.509: Fix Basic Constraints CA flag parsing
  crypto: anubis - simplify return statement in anubis_mod_init
  crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
  crypto: hisilicon/qm - clear all VF configurations in the hardware
  crypto: hisilicon - enable error reporting again
  crypto: hisilicon/qm - mask axi error before memory init
  crypto: hisilicon/qm - invalidate queues in use
  crypto: qat - Return pointer directly in adf_ctl_alloc_resources
  crypto: aspeed - Fix dma_unmap_sg() direction
  rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
  crypto: comp - Use same definition of context alloc and free ops
  crypto: omap - convert from tasklet to BH workqueue
  crypto: qat - Replace kzalloc() + copy_from_user() with memdup_user()
  crypto: caam - double the entropy delay interval for retry
  padata: WQ_PERCPU added to alloc_workqueue users
  padata: replace use of system_unbound_wq with system_dfl_wq
  crypto: cryptd - WQ_PERCPU added to alloc_workqueue users
  ...
2025-10-04 14:59:29 -07:00
Chenghai Huang
f0cafb02de crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
When the initialization of qm->debug.acc_diff_reg fails,
the probe process does not exit. However, after qm->debug.qm_diff_regs is
freed, it is not set to NULL. This can lead to a double free when the
remove process attempts to free it again. Therefore, qm->debug.qm_diff_regs
should be set to NULL after it is freed.

Fixes: 8be0913389 ("crypto: hisilicon/debugfs - Fix debugfs uninit process issue")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-20 20:21:04 +08:00
Weili Qian
64b9642fc2 crypto: hisilicon/qm - clear all VF configurations in the hardware
When disabling SR-IOV, clear the configuration of each VF
in the hardware. Do not exit the configuration clearing process
due to the failure of a single VF. Additionally, Clear the VF
configurations before decrementing the PM counter.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-20 20:21:04 +08:00
Weili Qian
80736a97cf crypto: hisilicon - enable error reporting again
When an error occurs on the device, an interrupt is reported.
When the firmware forwards the interrupt to the driver and masks the
error. If the driver does not enable error reporting when an error does
not need to be reset, the device does not report the error to the driver
when the error occurs again. Therefore, after the driver obtains the
information, the error reporting needs to be enabled again.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-20 20:21:04 +08:00
Weili Qian
3d716c51e0 crypto: hisilicon/qm - mask axi error before memory init
After the device memory is cleared, if the software sends
the doorbell operation, the hardware may trigger a axi error
when processing the doorbell. This error is caused by memory
clearing and hardware access to address 0. Therefore, the axi
error is masked during this period.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-20 20:21:03 +08:00
Weili Qian
85acd1b26b crypto: hisilicon/qm - invalidate queues in use
Before the device reset, although the driver has set the queue
status to intercept doorbells sent by the task process, the reset
thread is isolated from the user-mode task process, so the task process
may still send doorbells. Therefore, before the reset, the queue is
directly invalidated, and the device directly discards the doorbells
sent by the process.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-20 20:21:03 +08:00
Qianfeng Rong
a9a84a853c crypto: hisilicon/sec - Use int type to store negative error codes
Change the 'ret' variable in sec_hw_init() from u32 to int, as
it needs to store either negative error codes or zero returned by
sec_ipv4_hashmask().

No effect on runtime.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-13 12:11:06 +08:00
Eric Biggers
ee289d3abe crypto: hisilicon/hpre - Remove unused curve25519 kpp support
Curve25519 is used only via the library API, not the crypto_kpp API.  In
preparation for removing the unused crypto_kpp API for Curve25519,
remove the unused "hpre-curve25519" kpp algorithm.

Cc: Longfang Liu <liulongfang@huawei.com>
Cc: Zhiqi Song <songzhiqi1@huawei.com>
Link: https://lore.kernel.org/r/20250906213523.84915-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-09-06 14:45:49 -07:00
Zhushuai Yin
886d698120 crypto: hisilicon/zip - add hashjoin, gather, and UDMA data move features
The new version of the hisilicon zip driver supports the hash join
and gather features, as well as the data move feature (UDMA),
including data copying and memory initialization functions.These
features are registered to the uacce subsystem.

Signed-off-by: Zhushuai Yin <yinzhushuai@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-06 15:57:23 +08:00
Chenghai Huang
cf79ed6aac crypto: hisilicon/zip - add lz4 and lz77_only to algorithm sysfs
The current hisilicon zip supports the new algorithms lz77_only and
lz4. To enable user space to recognize the new algorithm support,
add lz77_only and lz4 to the sysfs. Users can now use the new
algorithms through uacce.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-06 15:57:23 +08:00
Herbert Xu
56e6f77ebd crypto: hisilicon/sec2 - Fix false-positive warning of uninitialised qp_ctx
Fix the false-positive warning of qp_ctx being unitialised in
sec_request_init.  The value of ctx_q_num defaults to 2 and is
guaranteed to be non-zero.

Thus qp_ctx is always initialised.  However, the compiler is
not aware of this constraint on ctx_q_num.  Restructure the loop
so that it is obvious to the compiler that ctx_q_num cannot be
zero.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-09-06 15:57:23 +08:00
Qianfeng Rong
41eab2a959 crypto: hisilicon - use kcalloc() instead of kzalloc()
As noted in the kernel documentation [1], open-coded multiplication in
allocator arguments is discouraged because it can lead to integer overflow.

Use devm_kcalloc() to gain built-in overflow protection, making memory
allocation safer when calculating allocation size compared to explicit
multiplication.

Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments #1
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Chenghai Huang
dcd2d5fda2 crypto: hisilicon/zip - enable literal length in stream mode compression
In stream mode, the hardware needs to combine the length of the
previous literal to calculate the length of the current literal.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Weili Qian
9228facb30 crypto: hisilicon/qm - request reserved interrupt for virtual function
The device interrupt vector 3 is an error interrupt for
physical function and a reserved interrupt for virtual function.
However, the driver has not registered the reserved interrupt for
virtual function. When allocating interrupts, the number of interrupts
is allocated based on powers of two, which includes this interrupt.
When the system enables GICv4 and the virtual function passthrough
to the virtual machine, releasing the interrupt in the driver
triggers a warning.

The WARNING report is:
WARNING: CPU: 62 PID: 14889 at arch/arm64/kvm/vgic/vgic-its.c:852 its_free_ite+0x94/0xb4

Therefore, register a reserved interrupt for VF and set the
IRQF_NO_AUTOEN flag to avoid that warning.

Fixes: 3536cc55ca ("crypto: hisilicon/qm - support get device irq information from hardware registers")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Zhushuai Yin
6a2c9164b5 crypto: hisilicon/qm - check whether the input function and PF are on the same device
Function rate limiting is set through physical function driver.
Users configure by providing function information and rate limit values.
Before configuration, it is necessary to check whether the
provided function and PF belong to the same device.

Fixes: 22d7a6c39c ("crypto: hisilicon/qm - add pci bdf number check")
Signed-off-by: Zhushuai Yin <yinzhushuai@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Weili Qian
1f9128f121 crypto: hisilicon - check the sva module status while enabling or disabling address prefetch
After enabling address prefetch, check the sva module status. If all
previous prefetch requests from the sva module are not completed, then
disable the address prefetch to ensure normal execution of new task
operations. After disabling address prefetch, check if all requests
from the sva module have been completed.

Fixes: a5c164b195 ("crypto: hisilicon/qm - support address prefetching")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Chenghai Huang
0dcd21443d crypto: hisilicon - re-enable address prefetch after device resuming
When the device resumes from a suspended state, it will revert to its
initial state and requires re-enabling. Currently, the address prefetch
function is not re-enabled after device resuming. Move the address prefetch
enable to the initialization process. In this way, the address prefetch
can be enabled when the device resumes by calling the initialization
process.

Fixes: 607c191b37 ("crypto: hisilicon - support runtime PM for accelerator device")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Chenghai Huang
d4e0815104 crypto: hisilicon/zip - remove unnecessary validation for high-performance mode configurations
When configuring the high-performance mode register, there is no
need to verify whether the register has been successfully
enabled, as there is no possibility of a write failure for this
register.

Fixes: a9864bae18 ("crypto: hisilicon/zip - add zip comp high perf mode configuration")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-08-30 15:43:26 +08:00
Zhiqi Song
982fd1a74d crypto: hisilicon/hpre - fix dma unmap sequence
Perform DMA unmapping operations before processing data.
Otherwise, there may be unsynchronized data accessed by
the CPU when the SWIOTLB is enabled.

Signed-off-by: Zhiqi Song <songzhiqi1@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-07-27 22:41:45 +10:00