Pull dmaengine updates from Vinod Koul:
"A new driver, couple of device support and binding conversion along
with bunch of driver updates are the main features of this.
New hardware support:
- TI AM62Ax controller support
- Xilinx xdma driver
- Qualcomm SM6125, SM8550, QDU1000/QRU1000 GPI controller
Updates:
- Runtime pm support for at_xdmac driver
- IMX sdma binding conversion to yaml and HDMI audio support
- IMX mxs binding conversion to yaml"
* tag 'dmaengine-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (35 commits)
dmaengine: idma64: Update bytes_transferred field
dmaengine: imx-sdma: Set DMA channel to be private
dmaengine: dw: Move check for paused channel to dwc_get_residue()
dmaengine: ptdma: check for null desc before calling pt_cmd_callback
dmaengine: dw-axi-dmac: Do not dereference NULL structure
dmaengine: idxd: Fix default allowed read buffers value in group
dmaengine: sf-pdma: pdma_desc memory leak fix
dmaengine: Simplify dmaenginem_async_device_register() function
dmaengine: use sysfs_emit() to instead of scnprintf()
dmaengine: Make an order in struct dma_device definition
dt-bindings: dma: cleanup examples - indentation, lowercase hex
dt-bindings: dma: drop unneeded quotes
dmaengine: xilinx: xdma: Add user logic interrupt support
dmaengine: xilinx: xdma: Add xilinx xdma driver
dmaengine: drivers: Use devm_platform_ioremap_resource()
dmaengine: at_xdmac: remove empty line
dmaengine: at_xdmac: add runtime pm support
dmaengine: at_xdmac: align properly function members
dmaengine: ppc4xx: Convert to use sysfs_emit()/sysfs_emit_at() APIs
dmaengine: sun6i: Set the maximum segment size
...
Currently default read buffers that is allowed in a group is 0.
grpcfg will be configured to max read buffers that IDXD can support if
the group's allowed read buffers value is 0. But 0 is an invalid
read buffers value and user may get confused when seeing the invalid
initial value 0 through sysfs interface.
To show only valid allowed read buffers value and eliminate confusion,
directly initialize the allowed read buffers to IDXD's max read buffers.
User still can change the value through sysfs interface.
Suggested-by: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230127192855.966929-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
On driver unload any pending descriptors are flushed and pending
DMA descriptors are explicitly completed:
idxd_dmaengine_drv_remove() ->
drv_disable_wq() ->
idxd_wq_free_irq() ->
idxd_flush_pending_descs() ->
idxd_dma_complete_txd()
With this done during driver unload any remaining descriptor is
likely stuck and can be dropped. Even so, the descriptor may still
have a callback set that could no longer be accessible. An
example of such a problem is when the dmatest fails and the dmatest
module is unloaded. The failure of dmatest leaves descriptors with
dma_async_tx_descriptor::callback pointing to code that no longer
exist. This causes a page fault as below at the time the IDXD driver
is unloaded when it attempts to run the callback:
BUG: unable to handle page fault for address: ffffffffc0665190
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
Fix this by clearing the callback pointers on the transmit
descriptors only when workqueue is disabled.
Fixes: 403a2e2365 ("dmaengine: idxd: change MSIX allocation based on per wq activation")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/37d06b772aa7f8863ca50f90930ea2fd80b38fc3.1670452419.git.reinette.chatre@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
On driver unload any pending descriptors are flushed at the
time the interrupt is freed:
idxd_dmaengine_drv_remove() ->
drv_disable_wq() ->
idxd_wq_free_irq() ->
idxd_flush_pending_descs().
If there are any descriptors present that need to be flushed this
flow triggers a "not present" page fault as below:
BUG: unable to handle page fault for address: ff391c97c70c9040
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
The address that triggers the fault is the address of the
descriptor that was freed moments earlier via:
drv_disable_wq()->idxd_wq_free_resources()
Fix the use after free by freeing the descriptors after any possible
usage. This is done after idxd_wq_reset() to ensure that the memory
remains accessible during possible completion writes by the device.
Fixes: 63c14ae6c1 ("dmaengine: idxd: refactor wq driver enable/disable operations")
Suggested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/6c4657d9cff0a0a00501a7b928297ac966e9ec9d.1670452419.git.reinette.chatre@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The workqueue is enabled when the appropriate driver is loaded and
disabled when the driver is removed. When the driver is removed it
assumes that the workqueue was enabled successfully and proceeds to
free allocations made during workqueue enabling.
Failure during workqueue enabling does not prevent the driver from
being loaded. This is because the error path within drv_enable_wq()
returns success unless a second failure is encountered
during the error path. By returning success it is possible to load
the driver even if the workqueue cannot be enabled and
allocations that do not exist are attempted to be freed during
driver remove.
Some examples of problematic flows:
(a)
idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
In above flow, if idxd_wq_request_irq() fails then
idxd_wq_unmap_portal() is called on error exit path, but
drv_enable_wq() returns 0 because idxd_wq_disable() succeeds. The
driver is thus loaded successfully.
idxd_dmaengine_drv_remove()->drv_disable_wq()->idxd_wq_unmap_portal()
Above flow on driver unload triggers the WARN in devm_iounmap() because
the device resource has already been removed during error path of
drv_enable_wq().
(b)
idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
In above flow, if idxd_wq_request_irq() fails then
idxd_wq_init_percpu_ref() is never called to initialize the percpu
counter, yet the driver loads successfully because drv_enable_wq()
returns 0.
idxd_dmaengine_drv_remove()->__idxd_wq_quiesce()->percpu_ref_kill():
Above flow on driver unload triggers a BUG when attempting to drop the
initial ref of the uninitialized percpu ref:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Fix the drv_enable_wq() error path by returning the original error that
indicates failure of workqueue enabling. This ensures that the probe
fails when an error is encountered and the driver remove paths are only
attempted when the workqueue was enabled successfully.
Fixes: 1f2bb40337 ("dmaengine: idxd: move wq_enable() to device.c")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/e8d8116e5efa0fd14fadc5adae6ffd319f0e5ff1.1670452419.git.reinette.chatre@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
>From Intel IAA spec [1], Intel IAA does not support batch processing.
Two batch related default values for IAA are incorrect in current code:
(1) The max batch size of device is set during device initialization,
that indicates batch is supported. It should be always 0 on IAA.
(2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32)
as the default value regardless of Intel DSA or IAA device during
work queue setup and cleanup. It should be always 0 on IAA.
Fix the issues by setting the max batch size of device and max batch
size of work queue to 0 on IAA device, that means batch is not
supported.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/721858
Fixes: 23084545db ("dmaengine: idxd: set max_xfer and max_batch for RO device")
Fixes: 92452a72eb ("dmaengine: idxd: set defaults for wq configs")
Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DSA 2.0 add the capability of configuring DMA ops on a per workqueue basis.
This means that certain ops can be disabled by the system administrator for
certain wq. By default, all ops are available. A bitmap is used to store
the ops due to total op size of 256 bits and it is more convenient to use a
range list to specify which bits are enabled.
One of the usage to support this is for VM migration between different
iteration of devices. The newer ops are disabled in order to allow guest to
migrate to a host that only support older ops. Another usage is to
restrict the WQ to certain operations for QoS of performance.
A sysfs of ops_config attribute is added per wq. It is only usable when the
ops_config bit is set under WQ_CAP register. This means that this attribute
will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range
list for the bits per operation the WQ supports.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Testing shown that when a wq mode is setup to be dedicated and then torn
down and reconfigured to shared, the wq configured end up being dedicated
anyays. The root cause is when idxd_device_wqs_clear_state() gets called
during idxd_driver removal, idxd_wq_disable_cleanup() does not get called
vs when the wq driver is removed first. The check of wq state being
"enabled" causes the cleanup to be bypassed. However, idxd_driver->remove()
releases all wq drivers. So the wqs goes to "disabled" state and will never
be "enabled". By that point, the driver has no idea if the wq was
previously configured or clean. So force call idxd_wq_disable_cleanup() on
all wqs always to make sure everything gets cleaned up.
Reported-by: Tony Zhu <tony.zhu@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Fixes: 0dcfe41e9a ("dmanegine: idxd: cleanup all device related bits after disabling device")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220628230056.2527816-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Pull dmaengine updates from Vinod Koul:
"Nothing special, this includes a couple of new device support and new
driver support and bunch of driver updates.
New support:
- Tegra gpcdma driver support
- Qualcomm SM8350, Sm8450 and SC7280 device support
- Renesas RZN1 dma and platform support
Updates:
- stm32 device pause/resume support and updates
- DMA memset ops Documentation and usage clarification
- deprecate '#dma-channels' & '#dma-requests' bindings
- driver updates for stm32, ptdma idsx etc"
* tag 'dmaengine-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (87 commits)
dmaengine: idxd: make idxd_wq_enable() return 0 if wq is already enabled
dmaengine: sun6i: Add support for the D1 variant
dmaengine: sun6i: Add support for 34-bit physical addresses
dmaengine: sun6i: Do not use virt_to_phys
dt-bindings: dma: sun50i-a64: Add compatible for D1
dmaengine: tegra: Remove unused switch case
dmaengine: tegra: Fix uninitialized variable usage
dmaengine: stm32-dma: add device_pause/device_resume support
dmaengine: stm32-dma: rename pm ops before dma pause/resume introduction
dmaengine: stm32-dma: pass DMA_SxSCR value to stm32_dma_handle_chan_done()
dmaengine: stm32-dma: introduce stm32_dma_sg_inc to manage chan->next_sg
dmaengine: stm32-dmamux: avoid reset of dmamux if used by coprocessor
dmaengine: qcom: gpi: Add support for sc7280
dt-bindings: dma: pl330: Add power-domains
dmaengine: stm32-mdma: use dev_dbg on non-busy channel spurious it
dmaengine: stm32-mdma: fix chan initialization in stm32_mdma_irq_handler()
dmaengine: stm32-mdma: remove GISR1 register
dmaengine: ti: deprecate '#dma-channels'
dmaengine: mmp: deprecate '#dma-channels'
dmaengine: pxa: deprecate '#dma-channels' and '#dma-requests'
...
The driver currently programs the system pasid to the WQ preemptively when
system pasid is enabled. Given that a dwq will reprogram the pasid and
possibly a different pasid, the programming is not necessary. The pasid_en
bit can be set for swq as it does not need pasid programming but
needs the pasid_en bit. Remove system pasid programming on device config
write. Add pasid programming for kernel wq type on wq driver enable. The
char dev driver already reprograms the dwq on ->open() call so there's no
change.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164935607115.1660372.6734518676950372366.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Change the driver where WQ interrupt is requested only when wq is being
enabled. This new scheme set things up so that request_threaded_irq() is
only called when a kernel wq type is being enabled. This also sets up for
future interrupt request where different interrupt handler such as wq
occupancy interrupt can be setup instead of the wq completion interrupt.
Not calling request_irq() until the WQ actually needs an irq also prevents
wasting of CPU irq vectors on x86 systems, which is a limited resource.
idxd_flush_pending_descs() is moved to device.c since descriptor flushing
is now part of wq disable rather than shutdown().
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
With irq_entry already being associated with the wq in a 1:1 relationship,
embed the irq_entry in the idxd_wq struct and remove back pointers for
idxe_wq and idxd_device. In the process of this work, clean up the interrupt
handle assignment so that there's no decision to be made during submit
call on where interrupt handle value comes from. Set the interrupt handle
during irq request initialization time.
irq_entry 0 is designated as special and is tied to the device itself.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS
descriptor submission. While on host, it is not as likely that ENQCMDS
return busy during normal operations due to the driver controlling the
number of descriptors allocated for submission. However, when the driver is
operating as a guest driver, the chance of retry goes up significantly due
to sharing a wq with multiple VMs. A default value is provided with the
system admin being able to tune the value on a per WQ basis.
Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
"Interrupt handle revoked" is an event that happens when the driver is
running on a guest kernel and the VM is migrated to a new machine.
The device will trigger an interrupt that signals to the guest driver
that the interrupt handles need to be replaced.
The misc irq thread function calls a helper function to handle the
event. The function uses the WQ percpu_ref to quiesce the kernel
submissions. It then replaces the interrupt handles by requesting
interrupt handle command for each I/O MSIX vector. Once the handle is
updated, the driver will unblock the submission path to allow new
submissions.
The submitter will attempt to acquire a percpu_ref before submission. When
the request fails, it will wait on the wq_resurrect 'completion'.
The driver does anticipate the possibility of descriptors being submitted
before the WQ percpu_ref is killed. If a descriptor has already been
submitted, it will return with incorrect interrupt handle status. The
descriptor will be re-submitted with the new interrupt handle on the
completion path. For descriptors with incorrect interrupt handles,
completion interrupt won't be triggered.
At the completion of the interrupt handle refresh, the handling function
will call idxd_int_handle_refresh_drain() to issue drain descriptors to
each of the wq with associated interrupt handle. The drain descriptor will have
interrupt request set but without completion record. This will ensure all
descriptors with incorrect interrupt completion handle get drained and
a completion interrupt is triggered for the guest driver to process them.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Enabling device and wq returns standard errno and that does not provide
enough details to indicate what exactly failed. The hardware command status
is only 8bits. Expand the command status to 32bits and use the upper 16
bits to define software errors to provide more details on the exact
failure. Bit 31 will be used to indicate the error is software set as the
driver is using some of the spec defined hardware error as well.
Cc: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373579.1968485.5891788397526827892.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>