linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-24 17:42:27 -04:00

Author	SHA1	Message	Date
Krishna chaitanya chundru	ceeb64f41f	bus: mhi: host: Add tracing support This change adds ftrace support for the following functions, which helps in debugging issues when there is a Channel state & MHI state change and also when we receive data and control events: 1. mhi_intvec_mhi_states 2. mhi_process_data_event_ring 3. mhi_process_ctrl_ev_ring 4. mhi_gen_tre 5. mhi_update_channel_state 6. mhi_tryset_pm_state 7. mhi_pm_st_worker Change the implementation of the arrays which hold the enum to string mappings to make it consistent in both the trace header file and other files. Wherever the trace events are added, debug messages are removed. Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240206-ftrace_support-v11-1-3f71dc187544@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2024-02-06 11:54:44 +05:30
Jeffrey Hugo	bce3f77068	bus: mhi: host: Add MHI_PM_SYS_ERR_FAIL state When processing a SYSERR, if the device does not respond to the MHI_RESET from the host, the host will be stuck in a difficult to recover state. The host will remain in MHI_PM_SYS_ERR_PROCESS and not clean up the host channels. Clients will not be notified of the SYSERR via the destruction of their channel devices, which means clients may think that the device is still up. Subsequent SYSERR events such as a device fatal error will not be processed as the state machine cannot transition from PROCESS back to DETECT. The only way to recover from this is to unload the mhi module (wipe the state machine state) or for the mhi controller to initiate SHUTDOWN. This issue was discovered by stress testing soc_reset events on AIC100 via the sysfs node. soc_reset is processed entirely in hardware. When the register write hits the endpoint hardware, it causes the soc to reset without firmware involvement. In stress testing, there is a rare race where soc_reset N will cause the soc to reset and PBL to signal SYSERR (fatal error). If soc_reset N+1 is triggered before PBL can process the MHI_RESET from the host, then the soc will reset again, and re-run PBL from the beginning. This will cause PBL to lose all state. PBL will be waiting for the host to respond to the new syserr, but host will be stuck expecting the previous MHI_RESET to be processed. Additionally, the AMSS EE firmware (QSM) was hacked to synthetically reproduce the issue by simulating a FW hang after the QSM issued a SYSERR. In this case, soc_reset would not recover the device. For this failure case, to recover the device, we need a state similar to PROCESS, but can transition to DETECT. There is not a viable existing state to use. POR has the needed transitions, but assumes the device is in a good state and could allow the host to attempt to use the device. Allowing PROCESS to transition to DETECT invites the possibility of parallel SYSERR processing which could get the host and device out of sync. Thus, invent a new state - MHI_PM_SYS_ERR_FAIL. This is essentially a holding state. It allows us to clean up the host elements that are based on the old state of the device (channels), but does not allow us to directly advance back to an operational state. It does allow the detection and processing of another SYSERR which may recover the device, or allows the controller to do a clean shutdown. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240112180800.536733-1-quic_jhugo@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2024-01-30 23:52:40 +05:30
Qiang Yu	6ab3d50b10	bus: mhi: host: Add a separate timeout parameter for waiting ready Some devices (e.g. SDX75) take longer than expected (default, 8 seconds) to set ready after reboot. Hence add an optional ready timeout parameter and pass the appropriate timeout value to mhi_poll_reg_field() to wait long enough for the device to become ready as part of the power up sequence. Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/1699344890-87076-2-git-send-email-quic_qianyu@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2023-12-14 10:57:34 +05:30
Qiang Yu	cabce92dd8	bus: mhi: host: Skip MHI reset if device is in RDDM In RDDM EE, the device cannot process an MHI reset issued by the host. In case of MHI power off, the host issues an MHI reset and polls for it to get cleared until it times out. Since this timeout cannot be avoided in case of RDDM, skip the MHI reset in this scenario. Cc: <stable@vger.kernel.org> Fixes: a6e2e3522f ("bus: mhi: core: Add support for PM state transitions") Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://lore.kernel.org/r/1684390959-17836-1-git-send-email-quic_qianyu@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2023-07-12 17:49:38 +05:30
Qiang Yu	869a99907f	bus: mhi: host: Fix race between channel preparation and M0 event There is a race condition where mhi_prepare_channel() updates the read and write pointers as the base address and in parallel, if an M0 transition occurs, the tasklet goes ahead and rings doorbells for all channels with a delta in TRE rings assuming they are already enabled. This causes a null pointer access. Fix it by adding a channel enabled check before ringing channel doorbells. Cc: stable@vger.kernel.org # 5.19 Fixes: a6e2e3522f "bus: mhi: core: Add support for PM state transitions" Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://lore.kernel.org/r/1665889532-13634-1-git-send-email-quic_qianyu@quicinc.com [mani: CCed stable list] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2022-10-28 22:59:10 +05:30
Qiang Yu	1227d2a20c	bus: mhi: host: Move IRQ allocation to controller registration phase During runtime, the MHI endpoint may be powered up/down several times. So instead of allocating and destroying the IRQs all the time, let's just enable/disable IRQs during power up/down. The IRQs will be allocated during mhi_register_controller() and freed during mhi_unregister_controller(). This works well for things like PCI hotplug also as once the PCI device gets removed, the controller will get unregistered. And once it comes back, it will get registered back and even if the IRQ configuration changes (MSI), that will get accounted. Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://lore.kernel.org/r/1655952183-66792-1-git-send-email-quic_qianyu@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2022-06-24 12:54:19 +05:30
Bhaumik Bhatt	0bca889fd6	bus: mhi: host: Bail on writing register fields if read fails Helper API to write register fields relies on successful reads of the register/address prior to the write. Bail out if a failure is seen when reading the register before the actual write is performed. Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/1650304226-11080-2-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2022-04-23 18:57:32 +05:30
Jeffrey Hugo	36e5505dfb	bus: mhi: host: Wait for ready state after reset After the device has signaled the end of reset by clearing the reset bit, it will automatically reinit MHI and the internal device structures. Once that is done, the device will signal it has entered the ready state. Signaling the ready state involves sending an interrupt (MSI) to the host which might cause IOMMU faults if it occurs at the wrong time. If the controller is being powered down, and possibly removed, then the reset flow would only wait for the end of reset. At which point, the host and device would start a race. The host may complete its reset work, and remove the interrupt handler, which would cause the interrupt to be disabled in the IOMMU. If that occurs before the device signals the ready state, then the IOMMU will fault since it blocked an interrupt. While harmless, the fault would appear like a serious issue has occurred so let's silence it by making sure the device hits the ready state before the host completes its reset processing. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Hemant Kumar <quic_hemantk@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/1650302562-30964-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2022-04-23 18:57:32 +05:30
Manivannan Sadhasivam	3a1b8e281a	bus: mhi: Make mhi_state_str[] array static inline and move to common.h mhi_state_str[] array could be used by MHI endpoint stack also. So let's make the array as "static inline function" and move it inside the "common.h" header so that the endpoint stack could also make use of it. Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20220301160308.107452-11-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-18 14:02:55 +01:00
Manivannan Sadhasivam	d28cab4d4a	bus: mhi: Use bitfield operations for register read and write Functions like mhi_read_reg_field(), mhi_poll_reg_field() and mhi_write_reg_field() could be modified to not depend on the shift value passed as an argument. Instead, the bitfield operation could be used to extract the shift value from the mask itself. This eliminates the need to define _SHIFT (and _SHFT) macros and simplifies the code a bit. For shift values that cannot be determined at build time, the "__ffs()" helper is used to find the shift value at runtime. While at it, let's also get rid of 32-bit masks like CHDBOFF_CHDBOFF_MASK by doing the full 32-bit register read. Suggested-by: Alex Elder <elder@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20220301160308.107452-6-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-18 14:02:54 +01:00
Manivannan Sadhasivam	a0f5a63066	bus: mhi: Move host MHI code to "host" directory In preparation of the endpoint MHI support, let's move the host MHI code to its own "host" directory and adjust the toplevel MHI Kconfig & Makefile. While at it, let's also move the "pci_generic" driver to "host" directory as it is a host MHI controller driver. Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20220301160308.107452-5-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-18 14:02:54 +01:00
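
Commit d28cab4d4a above describes deriving a field's shift from its mask instead of passing a separate _SHIFT value. The following is a minimal userspace sketch of that idea, not the kernel code: the sketch_* names are hypothetical, and __builtin_ctz (a GCC/Clang builtin) stands in for the kernel's __ffs() helper.

```c
#include <stdint.h>

/*
 * Index of the lowest set bit in mask. Stand-in for the kernel's __ffs();
 * like __ffs(), the result is undefined for mask == 0.
 */
static unsigned int sketch_lowest_set_bit(uint32_t mask)
{
    return (unsigned int)__builtin_ctz(mask);
}

/* Extract a field: keep only the masked bits, then shift them down to bit 0. */
static uint32_t sketch_field_get(uint32_t mask, uint32_t reg)
{
    return (reg & mask) >> sketch_lowest_set_bit(mask);
}

/*
 * Replace a field: clear the masked bits, then OR in the new value shifted
 * into position. Bits of val that do not fit in the field are dropped.
 */
static uint32_t sketch_field_set(uint32_t mask, uint32_t reg, uint32_t val)
{
    return (reg & ~mask) | ((val << sketch_lowest_set_bit(mask)) & mask);
}
```

With a hypothetical mask of 0x0000ff00, sketch_field_get() on the value 0x12345678 yields 0x56, and no shift constant has to be defined alongside the mask.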

11 Commits
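
The holding-state behaviour described in commit bce3f77068 can be illustrated with a small transition table. This is a simplified sketch built only from that commit message: the SK_* states and sketch_* names are hypothetical, the successful recovery path is collapsed into a single assumed "operational" state, and the table is not the kernel's actual PM transition table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; one bit each so targets can be OR-ed. */
enum sketch_pm_state {
    SK_OPERATIONAL     = 1u << 0,
    SK_SYS_ERR_DETECT  = 1u << 1,
    SK_SYS_ERR_PROCESS = 1u << 2,
    SK_SYS_ERR_FAIL    = 1u << 3,
    SK_SHUTDOWN        = 1u << 4,
};

struct sketch_transition {
    enum sketch_pm_state from;
    unsigned int to_mask; /* OR of the states reachable from 'from' */
};

static const struct sketch_transition sketch_table[] = {
    { SK_OPERATIONAL,     SK_SYS_ERR_DETECT | SK_SHUTDOWN },
    { SK_SYS_ERR_DETECT,  SK_SYS_ERR_PROCESS | SK_SHUTDOWN },
    /* PROCESS has no path back to DETECT, to avoid parallel SYSERR handling. */
    { SK_SYS_ERR_PROCESS, SK_OPERATIONAL | SK_SYS_ERR_FAIL | SK_SHUTDOWN },
    /* FAIL is a holding state: only a new SYSERR or a shutdown leaves it. */
    { SK_SYS_ERR_FAIL,    SK_SYS_ERR_DETECT | SK_SHUTDOWN },
    { SK_SHUTDOWN,        0 },
};

/* Return true if the sketch table allows moving from 'from' to 'to'. */
static bool sketch_can_transition(enum sketch_pm_state from,
                                  enum sketch_pm_state to)
{
    for (size_t i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
        if (sketch_table[i].from == from)
            return (sketch_table[i].to_mask & (unsigned int)to) != 0;
    }
    return false;
}
```

In this sketch a host stuck after an unanswered reset moves from PROCESS to FAIL, from which a later SYSERR can be detected again, while a direct jump from FAIL to the operational state is rejected.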