linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-24 09:35:52 -04:00

Author	SHA1	Message	Date
Ville Syrjälä	783d8b8087	drm/i915/psr: Re-enable PSR1 on hsw/bdw All known issues fixed now, so re-enable PSR1 on hsw/bdw. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-14-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:59:44 +03:00
Ville Syrjälä	3e3c8e294b	drm/i915/psr: Allow PSR with sprite enabled on hsw/bdw Can't see why we'd want the sprite blocking PSR entry. Mask it out. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-13-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:59:10 +03:00
Ville Syrjälä	1d3ebcfc5d	drm/i915/psr: Don't skip both TP1 and TP2/3 on hsw/bdw WA 0479 says: "Do not skip both TP1 and TP2/TP3". Let's just stick the minimum 100us TP2/3 time in there to avoid that. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-12-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:58:56 +03:00
Ville Syrjälä	4d2391a0dd	drm/i915/psr: Do no mask display register writes on hsw/bdw hsw/bdw lack the pipe register vs. display register distinction in their PSR masking capabilities. So to keep our CURSURFLIVE tricks working we need to just unmask all display register writes on these platforms. The downside being that any display regitster (eg. even SWF regs) will cause a PSR exit. Note that WaMaskMMIOWriteForPSR asks us to mask this on bdw, but that won't work since we need those CURSURFLIVE tricks. Observations on actual hardware show that this causes one extra PSR exit ~every 10 seconds, which is pretty much irrelevant. I suspect this is due to the pcode poking at IPS_CTL. Disabling IPS does not stop it however, so either I'm wrong or pcode pokes at the register regardless of whether it's actually trying to enable/disable IPS. Also when the machine is busy (eg. just running 'find /') these extra PSR exits cease, which again points at pcode or some other PM entity as being the culprit. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-11-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:58:33 +03:00
Ville Syrjälä	8a824f8fbf	drm/i915/psr: Implement WaPsrDPRSUnmaskVBlankInSRD:hsw Bspec asks us to unmask "vblank to registers" in the DPRS unit. Note that I was unable to observe any change in hardware behviour due to this bit on HSW. But let's do this anyway in case it matters in some cases, and the corresponding bit on BDW is abolutely critical as without it the hardware won't generate any vblanks whatsoever after PSR exit. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-10-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:58:07 +03:00
Ville Syrjälä	a77c3fe304	drm/i915/psr: Implement WaPsrDPAMaskVBlankInSRD:hsw Implement WaPsrDPAMaskVBlankInSRD:hsw, which makes the hardware generate the extra vblank between link training and first frame being transmitted. This is the same thing that's controlled by TRANS_CHICKEN[21] on skl+ (but due to the funky double buffering it's effectively always at the rest value after DC5 exit). So for consistent behaviour we want every platform to generate said vblank. BDW is already setting this up correctly. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-9-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:57:45 +03:00
Ville Syrjälä	e8b883c123	drm/i915/psr: Restore PSR interrupt handler for HSW Add the PSR interrupt handling code back for HSW. Looks like the removal was never completed anyway since the irq setup code was lest untouched. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-8-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:57:25 +03:00
Ville Syrjälä	52b9c1ff2d	drm/i915/psr: HSW/BDW have no PSR2 Deal with HSW/BDW in transcoder_has_psr2(). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-7-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:57:05 +03:00
Ville Syrjälä	a181e94013	drm/i915/psr: Bring back HSW/BDW PSR AUX CH registers/setup Reintroduce the special PSR AUX CH setup for hsw/bdw. Not all of it was even removed (BDW AUX data registers were left behind). Update the code to use REG_BIT() & co. while at it. v2: Define the SRD_AUX_CTL bits in terms of DP_AUX_CTL bits (Jouni) Add a comment explaining the hand rolled DPCD write (Jouni) Cc: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-6-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:55:56 +03:00
Ville Syrjälä	c18cee2ee8	drm/i915/psr: Reintroduce HSW PSR1 registers Add back hsw'w special SRD/PSR1 registers. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-5-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:55:21 +03:00
Ville Syrjälä	6a6b0ab2f3	drm/i915/psr: Wrap PSR1 register with functions In preparation for re-introducing HSW's different PSR1 register offeets let's just wrap all the registers into functions. Avoids having to make the register macros more complex. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-4-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:55:05 +03:00
Ville Syrjälä	460dc4ba14	drm/i915/psr: Fix BDW PSR AUX CH data register offsets The multiplication got replaced by an addition in some cleanup. This means we never write the correct data to some of the BDW PSR data registers and thus we fail to actually wake up the panel from PSR. Fixes: `4ab4fa1032` ("drm/i915/psr: Make PSR registers relative to transcoders") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-3-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:54:47 +03:00
Ville Syrjälä	5197c49d20	drm/i915: Re-init clock gating on coming out of PC8+ PC8+ clobbers a bunch of displays registers which need to be restored by hand or else we lost a bunch of workarounds. The important ones for us are at least CHICKEN_PAR2* and CHICKEN_PIPESL. Curiously at least some CHICKEN_PAR1 registers are preserved by the hardware/firmware. Unfortunately Bspec doens't really specify what gets clobbered vs. preserved so further reverse engieering might be warranted to figure out the specifics. Note that PCH_LP_PARTITION_LEVEL_DISABLE is also set by lpt_init_clock_gating() so the rmw in hsw_disable_pc8() is now redundant. Remove it. TODO: I suspect most gt stuff doesn't need this and we should finish moving all of them from init_clock_gating() to a more appropriate place... Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-2-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com>	2023-06-16 17:54:13 +03:00
Marek Vasut	7f947be02a	drm/bridge: tc358764: Fix debug print parameter order The debug print parameters were swapped in the output and they were printed as decimal values, both the hardware address and the value. Update the debug print to print the parameters in correct order, and use hexadecimal print for both address and value. Fixes: `f38b7cca6d` ("drm/bridge: tc358764: Add DSI to LVDS bridge driver") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Robert Foss <rfoss@kernel.org> Signed-off-by: Robert Foss <rfoss@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230615152817.359420-1-marex@denx.de	2023-06-16 14:01:41 +02:00
Dmitry Baryshkov	452c46ccf6	drm/msm/dsi: split dsi_ctrl_config() function It makes no sense to pass NULL parameters to dsi_ctrl_config() in the disable case. Split dsi_ctrl_config() into enable and disable parts and drop unused params. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Patchwork: https://patchwork.freedesktop.org/patch/542559/ Link: https://lore.kernel.org/r/20230614224402.296825-2-dmitry.baryshkov@linaro.org	2023-06-16 12:46:47 +03:00
Dmitry Baryshkov	e2fd7dda3b	drm/msm/dsi: dsi_host: drop unused clocks Several source clocks are not used anymore, so stop handling them. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Patchwork: https://patchwork.freedesktop.org/patch/542558/ Link: https://lore.kernel.org/r/20230614224402.296825-1-dmitry.baryshkov@linaro.org	2023-06-16 12:46:47 +03:00
Dmitry Baryshkov	c7c4afd943	drm/msm/dpu: remove unused INTF_NONE interfaces sm6115, sm6375 and qcm2290 do not have INTF_0. Drop corresponding interface definitions. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Patchwork: https://patchwork.freedesktop.org/patch/542180/ Link: https://lore.kernel.org/r/20230613001004.3426676-4-dmitry.baryshkov@linaro.org	2023-06-16 12:43:38 +03:00
Dmitry Baryshkov	9a6c13b847	drm/msm/dpu: correct MERGE_3D length Each MERGE_3D block has just two registers. Correct the block length accordingly. Fixes: `4369c93cf3` ("drm/msm/dpu: initial support for merge3D hardware block") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/542177/ Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Link: https://lore.kernel.org/r/20230613001004.3426676-3-dmitry.baryshkov@linaro.org	2023-06-16 12:43:24 +03:00
Dmitry Baryshkov	0b78be614c	drm/msm/dpu: fix sc7280 and sc7180 PINGPONG done interrupts During IRQ conversion we have lost the PP_DONE interrupts for sc7280 platform. This was left unnoticed, because this interrupt is only used for CMD outputs and probably no sc7[12]80 systems use DSI CMD panels. Fixes: `667e9985ee` ("drm/msm/dpu: replace IRQ lookup with the data in hw catalog") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Patchwork: https://patchwork.freedesktop.org/patch/542175/ Link: https://lore.kernel.org/r/20230613001004.3426676-2-dmitry.baryshkov@linaro.org	2023-06-16 12:43:11 +03:00
Dave Airlie	c8a5d5ea3b	nouveau: fix client work fence deletion race This seems to have existed for ever but is now more apparant after commit `9bff18d134` ("drm/ttm: use per BO cleanup workers") My analysis: two threads are running, one in the irq signalling the fence, in dma_fence_signal_timestamp_locked, it has done the DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the callbacks. The second thread in nouveau_cli_work_ready, where it sees the fence is signalled, so then puts the fence, cleanups the object and frees the work item, which contains the callback. Thread one goes again and tries to call the callback and causes the use-after-free. Proposed fix: lock the fence signalled check in nouveau_cli_work_ready, so either the callbacks are done or the memory is freed. Reviewed-by: Karol Herbst <kherbst@redhat.com> Fixes: `11e451e740` ("drm/nouveau: remove fence wait code from deferred client work handler") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/dri-devel/20230615024008.1600281-1-airlied@gmail.com/	2023-06-16 11:19:34 +10:00
Dave Airlie	8e04cddf3b	Merge tag 'drm-intel-next-2023-06-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-next drm/i915 feature pull #2 for v6.5: Features and functionality: - Meteorlake PM demand (Vinod, Mika) - Switch to dedicated workqueues to stop using flush_scheduled_work() (Luca) Refactoring and cleanups: - Move display runtime init under display/ (Matt) - Async flip error message clarifications (Arun) Fixes: - Remove 10bit gamma on desktop gen3 parts, they don't support it (Ville) - Fix driver probe error handling if driver creation fails (Matt) - Fix all -Wunused-but-set-variable warnings, and enable it for i915 (Jani) - Stop using edid_blob_ptr (Jani) - Fix log level for "CDS interlane align done" (Khaled) - Fix an unnecessary include prefix (Matt) Merges: - Backmerge drm-next to sync with drm-intel-gt-next (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87o7lnpxz2.fsf@intel.com	2023-06-16 08:09:08 +10:00
Wayne Lin	72f1de49ff	drm/dp_mst: Clear MSG_RDY flag before sending new message [Why] The sequence for collecting down_reply from source perspective should be: Request_n->repeat (get partial reply of Request_n->clear message ready flag to ack DPRX that the message is received) till all partial replies for Request_n are received->new Request_n+1. Now there is chance that drm_dp_mst_hpd_irq() will fire new down request in the tx queue when the down reply is incomplete. Source is restricted to generate interveleaved message transactions so we should avoid it. Also, while assembling partial reply packets, reading out DPCD DOWN_REP Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be wrapped up as a complete operation for reading out a reply packet. Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might be risky. e.g. If the reply of the new request has overwritten the DPRX DOWN_REP Sideband MSG buffer before source writing one to clear DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply for the new request. Should handle the up request in the same way. [How] Separete drm_dp_mst_hpd_irq() into 2 steps. After acking the MST IRQ event, driver calls drm_dp_mst_hpd_irq_send_new_request() and might trigger drm_dp_mst_kick_tx() only when there is no on going message transaction. Changes since v1: * Reworked on review comments received -> Adjust the fix to let driver explicitly kick off new down request when mst irq event is handled and acked -> Adjust the commit message Changes since v2: * Adjust the commit message * Adjust the naming of the divided 2 functions and add a new input parameter "ack". * Adjust code flow as per review comments. Changes since v3: * Update the function description of drm_dp_mst_hpd_irq_handle_event Changes since v4: * Change ack of drm_dp_mst_hpd_irq_handle_event() to be an array align the size of esi[] Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 17:55:41 -04:00
Philip Yang	5d1c70bb6e	drm/amdgpu: Increase hmm range get pages timeout If hmm_range_fault returns -EBUSY, we should call hmm_range_fault again to validate the remaining pages. On one system with NUMA auto balancing enabled, hmm_range_fault takes 6 seconds for 1GB range because CPU migrate the range one page at a time. To be safe, increase timeout value to 1 second for 128MB range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 17:55:41 -04:00
Philip Yang	d728eda3c5	drm/amdgpu: Enable translate further for GC v9.4.3 To extend UTCL2 reach. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 17:55:41 -04:00
Javier Martinez Canillas	e254b584db	drm/ssd130x: Remove hardcoded bits-per-pixel in ssd130x_buf_alloc() The driver only supports OLED controllers that have a native DRM_FORMAT_C1 pixel format and that is why it has harcoded a division of the width by 8. But the driver might be extended to support devices that have a different pixel format. So it's better to use the struct drm_format_info helpers to compute the size of the buffer, used to store the pixels in native format. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-6-javierm@redhat.com	2023-06-15 23:50:46 +02:00
Javier Martinez Canillas	49d7d581ce	drm/ssd130x: Don't allocate buffers on each plane update The resolutions for these panels are fixed and defined in the Device Tree, so there's no point to allocate the buffers on each plane update and that can just be done once. Let's do the allocation and free on the encoder enable and disable helpers since that's where others initialization and teardown operations are done. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-5-javierm@redhat.com	2023-06-15 23:50:45 +02:00
Javier Martinez Canillas	179a790aaf	drm/ssd130x: Set the page height value in the device info data The driver only supports OLED controllers that have a page height of 8 but there are devices that have different page heights. So it is better to not hardcode this value and instead have it as a per controller data value. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-4-javierm@redhat.com	2023-06-15 23:50:44 +02:00
Javier Martinez Canillas	f1f288d07a	drm/ssd130x: Make default width and height to be controller dependent Currently the driver hardcodes the default values to 96x16 pixels but this default resolution depends on the controller. The datasheets for the chips describes the following display controller resolutions: - SH1106: 132 x 64 Dot Matrix OLED/PLED - SSD1305: 132 x 64 Dot Matrix OLED/PLED - SSD1306: 128 x 64 Dot Matrix OLED/PLED - SSD1307: 128 x 39 Dot Matrix OLED/PLED - SSD1309: 128 x 64 Dot Matrix OLED/PLED Add this information to the devices' info structures, and use it set as a default if not defined in DT rather than hardcoding to an arbitrary value. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-2-javierm@redhat.com	2023-06-15 23:50:41 +02:00
Harshit Mogalapalli	ce432fd34c	drm/i915/huc: Fix missing error code in intel_huc_init() Smatch warns: drivers/gpu/drm/i915/gt/uc/intel_huc.c:388 intel_huc_init() warn: missing error code 'err' When the allocation of VMAs fail: The value of err is zero at this point and it is passed to PTR_ERR and also finally returning zero which is success instead of failure. Fix this by adding the missing error code when VMA allocation fails. Fixes: `08872cb13a` ("drm/i915/mtl/huc: auth HuC via GSC") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230614223646.2583633-1-daniele.ceraolospurio@intel.com	2023-06-15 14:12:19 -07:00
Lijo Lazar	0e41639d9a	drm/amdgpu: Remove unused NBIO interface Set compute partition mode interface in NBIO is no longer used. Remove the only implementation from NBIO v7.9 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 14:33:56 -04:00
James Zhu	973fddea6f	drm/amdkfd: update user space last_event_age Update user space last_event_age when event age is enabled. It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
James Zhu	96cdb5384d	drm/amdkfd: set activated flag true when event age unmatchs Set waiter's activated flag true when event age unmatchs with last_event_age. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
James Zhu	4057e6ce33	drm/amdkfd: add event_age tracking when receiving interrupt Add event_age tracking when receiving interrupt. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
ZhenGuo Yin	4f9b94d848	drm/scheduler: avoid infinite loop if entity's dependency is a scheduled error fence [Why] drm_sched_entity_add_dependency_cb ignores the scheduled fence and return false. If entity's dependency is a scheduler error fence and drm_sched_stop is called due to TDR, drm_sched_entity_pop_job will wait for the dependency infinitely. [How] Do not wait or ignore the scheduled error fence, add drm_sched_entity_wakeup callback for the dependency with scheduled error fence. Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
ZhenGuo Yin	71eaac368d	drm/amdgpu: add entity error check in amdgpu_ctx_get_entity [Why] UMD is not aware of entity error, and will keep submitting jobs into the error entity. [How] Add entity error check when getting entity from ctx. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	f88e295e90	drm/amdgpu: add VM generation token Instead of using the VRAM lost counter add a 64bit token which indicates if a context or job is still valid to use. Should the VRAM be lost or the page tables need re-creation the token will change indicating that userspace needs to act and re-create the contexts and re-submit the work. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	55bf196f60	drm/amdgpu: reset VM when an error is detected When some problem with the updates of page tables is detected reset the state machine of the VM and re-create all page tables from scratch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	e84e697d92	drm/amdgpu: abort submissions during prepare on error Forward errors from previous submissions to this one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	89fae8dc41	drm/amdgpu: mark soft recovered fences with -ENODATA Set the fence error code before trying to soft-recover it. It gets overwritten when a hard recovery is required. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	0a33b11d26	drm/amdgpu: mark force completed fences with -ECANCELED When we force complete fences we should mark them as canceled. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	b13eb02ba8	drm/amdgpu: add amdgpu_error_* debugfs file This allows us to insert some error codes into the bottom of the pipeline on an engine. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:54 -04:00
Alex Deucher	2eb841bdbc	drm/amdgpu: mark GC 9.4.3 experimental for now Mark as experimental for now until we get closer to production to avoid possible undesireable behavior when mixing newer boards with older kernels. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:54 -04:00
Lijo Lazar	b00f55374c	drm/amdgpu: Use PSP FW API for partition switch Use PSP firmware interface for switching compute partitions. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Lijo Lazar	fe381726c9	drm/amdgpu: Change nbio v7.9 xcp status definition PARTITION_MODE field in PARTITION_COMPUTE_STATUS register is defined as below by firmware. SPX = 0, DPX = 1, TPX = 2, QPX = 3, CPX = 4 Change driver definition accordingly. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Christian König	ca0b954a43	drm/amdgpu: make sure that BOs have a backing store It's perfectly possible that the BO is about to be destroyed and doesn't have a backing store associated with it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Christian König	e2ad8e2df4	drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory We need to grab the lock of the BO or otherwise can run into a crash when we try to inspect the current location. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Stanley.Yang	43aedbf4da	drm/amdgpu: Add checking mc_vram_size Do not compare injection address with mc_vram_size if mc_vram_size is zero. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Stanley.Yang	38298ce6fc	drm/amdgpu: Optimize checking ras supported Using "is_app_apu" to identify device in the native APU mode or carveout mode. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Candice Li	6fac3964a9	drm/amdgpu: Add channel_dis_num to ras init flags Add disabled channel number to ras init flags. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Candice Li	bcd9a5f8b9	drm/amdgpu: Update total channel number for umc v8_10 Update total channel number for umc v8_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00

... 29 30 31 32 33 ...

95756 Commits