Add support for gracefully terminating hardware cyclic DMA transfers when
a new transfer is queued and is flagged with DMA_PREP_LOAD_EOT. Without
this, cyclic transfers would continue indefinitely until we brute force
it with .device_terminate_all().
When a new descriptor is queued while a cyclic transfer is active, mark
the cyclic transfer for termination. For hardware with scatter-gather
support, modify the last segment flags to trigger end-of-transfer. For
non-SG hardware, clear the CYCLIC flag to allow natural completion.
Older IP core versions (pre-4.6.a) can prefetch data when clearing the
CYCLIC flag, causing corruption in the next transfer. Work around this
by disabling and re-enabling the core to flush prefetched data.
The cyclic_eot flag tracks transfers marked for termination, preventing
new transfers from starting until the cyclic one completes. Non-EOT
transfers submitted after cyclic transfers are discarded with a warning.
Also note that for hardware cyclic transfers not using SG, we need to
make sure that chan->next_desc is also set to NULL (so we can look at
possible EOT transfers) and we also need to move the queue check to
after axi_dmac_get_next_desc() because with hardware based cyclic
transfers we might get the queue marked as full and hence we would not
be able to check for cyclic termination.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-5-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
As of now, to terminate a cyclic transfer, one pretty much needs to use
brute force and terminate all transfers with .device_terminate_all().
With this change, when a cyclic transfer terminates (and generates an
EOT interrupt), look at any new pending transfer with the DMA_PREP_LOAD_EOT
flag set. If there is one, the current cyclic transfer is terminated and
the next one is enqueued. If the flag is not set, that transfer is ignored.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-4-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
In some supported platforms as ARCH_ZYNQMP, part of the memory is mapped
above 32bit addresses and since the DMA mask, by default, is set to 32bits,
we would need to rely on swiotlb (which incurs a performance penalty)
for the DMA mappings. Thus, we can write either the SRC or DEST high
addresses with 1's and read them back. The last bit set on the return
value will reflect the IP address bus width and so we can update the
device DMA mask accordingly.
While at it, support bigger that 32 bits transfers in IP without HW
scatter gather support.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
base-commit: 3980351785
change-id: 20251104-axi-dmac-fixes-and-improvs-e3ad512a329c
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Link: https://patch.msgid.link/20251104-axi-dmac-fixes-and-improvs-v1-3-3e6fd9328f72@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
For HW scatter gather transfers we still need to look for the queue. The
HW is capable of queueing 3 concurrent transfers and if we try more than
that we'll get the submit queue full and should return. Otherwise, if we
go ahead and program the new transfer, we end up discarding it.
Fixes: e97dc74359 ("dmaengine: axi-dmac: Add support for scatter-gather transfers")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
base-commit: 3980351785
change-id: 20251104-axi-dmac-fixes-and-improvs-e3ad512a329c
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Link: https://patch.msgid.link/20251104-axi-dmac-fixes-and-improvs-v1-2-3e6fd9328f72@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
If 'hw_cyclic' is false we should still be able to do cyclic transfers in
"software". That was not working for the case where 'desc->num_sgs' is 1
because 'chan->next_desc' is never set with the current desc which means
that the cyclic transfer only runs once and in the next SOT interrupt we
do nothing since vchan_next_desc() will return NULL.
Fix it by setting 'chan->next_desc' as soon as we get a new desc via
vchan_next_desc().
Fixes: 0e3b67b348 ("dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
base-commit: 3980351785
change-id: 20251104-axi-dmac-fixes-and-improvs-e3ad512a329c
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Link: https://patch.msgid.link/20251104-axi-dmac-fixes-and-improvs-v1-1-3e6fd9328f72@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Instead of notifying userspace in the end-of-transfer (EOT) interrupt
and program the hardware in the start-of-transfer (SOT) interrupt, we
can do both things in the EOT, allowing us to mask the SOT, and halve
the number of interrupts sent by the HDL core.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20231215131313.23840-5-paul@crapouillou.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Implement support for scatter-gather transfers. Build a chain of
hardware descriptors, each one corresponding to a segment of the
transfer, and linked to the next one. The hardware will transfer the
chain and only fire interrupts when the whole chain has been
transferred.
Support for scatter-gather is automatically enabled when the driver
detects that the hardware supports it, by writing then reading the
AXI_DMAC_REG_SG_ADDRESS register. If not available, the driver will fall
back to standard DMA transfers.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20231215131313.23840-4-paul@crapouillou.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Change where and how the DMA transfers meta-data is stored, to prepare
for the upcoming introduction of scatter-gather support.
Allocate hardware descriptors in the format that the HDL core will be
expecting them when the scatter-gather feature is enabled, and use these
fields to store the data that was previously stored in the axi_dmac_sg
structure.
Note that the 'x_len' and 'y_len' fields now contain the transfer length
minus one, since that's what the hardware will expect in these fields.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20231215131313.23840-3-paul@crapouillou.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() is renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230919133207.1400430-9-u.kleine-koenig@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Starting with core version 4.3.a the DMA bus attributes can (and should) be
read from the INTERFACE_DESCRIPTION (0x10) register.
For older core versions, this will still need to be provided from the
device-tree.
The bus-type values are identical to the ones stored in the device-trees,
so we just need to read them. Bus-width values are stored in log2 values,
so we just need to use them as shift values to make them equivalent to the
current format.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20200825151950.57605-7-alexandru.ardelean@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The channel parameters (which are read from the device-tree) are adjusted
for the DMAEngine framework in the axi_dmac_parse_chan_dt() function, after
they are read from the device-tree.
When we want to read these from registers, we will need to use the same
logic, so this change splits the logic into a separate function.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20200825151950.57605-6-alexandru.ardelean@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Pull dmaengine updates from Vinod Koul:
- Add support in dmaengine core to do device node checks for DT devices
and update bunch of drivers to use that and remove open coding from
drivers
- New driver/driver support for new hardware, namely:
- MediaTek UART APDMA
- Freescale i.mx7ulp edma2
- Synopsys eDMA IP core version 0
- Allwinner H6 DMA
- Updates to axi-dma and support for interleaved cyclic transfers
- Greg's debugfs return value check removals on drivers
- Updates to stm32-dma, hsu, dw, pl330, tegra drivers
* tag 'dmaengine-5.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (68 commits)
dmaengine: Revert "dmaengine: fsl-edma: add i.mx7ulp edma2 version support"
dmaengine: at_xdmac: check for non-empty xfers_list before invoking callback
Documentation: dmaengine: clean up description of dmatest usage
dmaengine: tegra210-adma: remove PM_CLK dependency
dmaengine: fsl-edma: add i.mx7ulp edma2 version support
dt-bindings: dma: fsl-edma: add new i.mx7ulp-edma
dmaengine: fsl-edma-common: version check for v2 instead
dmaengine: fsl-edma-common: move dmamux register to another single function
dmaengine: fsl-edma: add drvdata for fsl-edma
dmaengine: Revert "dmaengine: fsl-edma: support little endian for edma driver"
dmaengine: rcar-dmac: Reject zero-length slave DMA requests
dmaengine: dw: Enable iDMA 32-bit on Intel Elkhart Lake
dmaengine: dw-edma: fix semicolon.cocci warnings
dmaengine: sh: usb-dmac: Use [] to denote a flexible array member
dmaengine: dmatest: timeout value of -1 should specify infinite wait
dmaengine: dw: Distinguish ->remove() between DW and iDMA 32-bit
dmaengine: fsl-edma: support little endian for edma driver
dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width"
dmagengine: pl330: add code to get reset property
dt-bindings: pl330: document the optional resets property
...
When a partial transfer is received, the driver should not submit any more
segments to the hardware, as they will be ignored/unused until a new
transfer start operation is done.
This change implements this by adding a new flag on the AXI DMAC
descriptor. This flags is set to true, if there was a partial transfer in
a previously completed segment. When that flag is true, the TLAST flag is
added to the to the submitted segment, signaling the controller to stop
receiving more segments.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Starting with version 4.2.a, the AXI DMAC controller can report partial
transfers that have been issued.
This change implements computing DMA residue information for transfers,
based on that reported information.
The way this is done, is to dequeue the partial transfers from the FIFO of
partial transfers, store the partial length to the correct segment &
descriptor, and compute the residue before submitting the DMA cookie to the
DMA framework.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The change replaces the old license information in the comment header with
the new SPDX license specifier.
As well as bumping the year range from 2013-2015 to 2013-2019.
The latter also reflects recent changes that were added to the driver.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The `copy_align` property is a generic property that describes alignment
for DMA memcpy & sg ops.
It serves mostly an informational purpose, and can be used in DMA tests, to
pass the info to know what alignment to expect.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Starting with version 4.1.a the AXI-DMAC is capable of reporting the
required length alignment.
The LSBs that are required to be set for alignment will always read back as
set from the transfer length register. It is not possible to clear them by
writing a 0. This means the driver can discover the length alignment
requirement by writing 0 to that register and reading back the value.
Since the DMA will support length alignment requirements that are different
from the address alignment requirement track both of them independently.
For older versions of the peripheral assume that the length alignment
requirement is equal to the address alignment requirement.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The AXI-DMAC supports different types of interface for the data source and
destination ports. Typically one of those ports is a memory-mapped
interface while the other is some kind of streaming interface.
The information about which kind of interface is used for each port is
encoded in the devicetree.
It is also possible in the driver to detect whether a port supports
memory-mapped transfers or not. For streaming interfaces the address
register is read-only and will always return 0. So in order to check if a
port supports memory-mapped transfers write a non-zero value to the
corresponding address register and check that the value read-back is still
non zero.
This allows to detect mismatches between the devicetree description and the
actual hardware configuration.
Unfortunately it is not possible to autodetect the interface types since
there is no method to distinguish between the different streaming ports. So
the best thing that can be done is to error out when a memory mapped port
is described in the devicetree but none is detected in the hardware.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The TLAST flag is used by the DMAC HDL controller to signal to the
controller that the following segment (to be submitted) is the last one (in
a series of segments).
A receiver DMA (typically another DMAC) can read this parameter (from the
transfer), and terminate the transfer earlier. A typical use-case for this,
is when the receiver expects a certain amount of segments, but for some
reason (e.g. an ADC capture which can have an unknown number of digital
samples) the number of actual segments is smaller. The receiver would read
this flag, and then the DMAC would finish.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The DMAC HDL core supports interleaved & cyclic transfers.
An example use-case for this mode is when the controller is used as a
video DMA.
This change sets the `cyclic` field to true, so that when the IRQ comes and
the `axi_dmac_transfer_done()` callback is called (from the interrupt
handler) the proper `vchan_cyclic_callback()` is called. This way the
DMAEngine framework will process data correctly for interleaved + cyclic
transfers.
This doesn't fix anything. It's an enhancement to the driver.
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In 2D transfers (for the AXI DMAC), the number of frames (numf) represents
Y_LENGTH, and the length of a frame is X_LENGTH. 2D transfers are useful
for video transfers where screen resolutions ( X * Y ) are typically
aligned for X, but not for Y.
There is no requirement for Y_LENGTH to be aligned to the bus-width (or
anything), and this is also true for AXI DMAC.
Checking the Y_LENGTH for alignment causes false errors when initiating DMA
transfers. This change fixes this by checking only that the Y_LENGTH is
non-zero.
Fixes: 0e3b67b348 ("dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller")
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Some synthesis time configuration parameters of the DMA controller can be
inferred from the hardware itself.
Use this information as it is more reliably than the information specified
in the devicetree which might be outdated if the HDL project got changed.
Deprecate the devicetree properties that can be inferred from the hardware
itself.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The axi-dmac driver currently rejects transfers with segments that are
larger than what the hardware can handle.
Re-work the driver so that these large segments are split into multiple
segments instead where each segment is smaller or equal to the maximum
segment size.
This allows the driver to handle transfers with segments of arbitrary size.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Bogdan Togorean <bogdan.togorean@analog.com>
Signed-off-by: Alexandru Ardelean <alex.ardelean@analog.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Request IRQ with IRQF_SHARED flag to enable setups with multiple
instances of the core sharing a single IRQ line.
This works out since the IRQ handler already checks if there is
an actual IRQ pending and returns IRQ_NONE otherwise.
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
When running in software cyclic mode the driver currently does not go back
to the first segment once the last segment has been reached. Effectively
making the transfer non-cyclic.
Fix this by going back to the first segment once the last segment has been
reached for cyclic transfers.
Special care need to be taken to avoid a segment from being submitted
multiple times concurrently, which could happen for transfers with a number
of segments that is smaller than the DMA controller's internal queue.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>