linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-30 20:42:33 -04:00

Author	SHA1	Message	Date
Dave Airlie	68b89e23c2	Merge tag 'drm-intel-gt-next-2024-04-26' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next UAPI Changes: - drm/i915/guc: Use context hints for GT frequency Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint - drm/i915/gt: Enable only one CCS for compute workload Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance *** NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by default to mitigate a hardware bug. All the EUs will still remain usable, and all the userspace drivers have been confirmed to be able to dynamically detect the change in number of CCS engines and adjust. For the smaller percent of applications that get perf benefit from letting the userspace driver dispatch across all 4 CCS engines we will be introducing a sysfs control as a later patch to choose 4 CCS each with 25% EUs (or 50% if 2 CCS). NOTE: A regression has been reported at https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895 However Andi has been triaging the issue and we're closing in a fix to the gap in the W/A implementation: https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html Driver Changes: - Add new and fix to existing workarounds: Wa_14018575942 (MTL), Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438, Wa_14020495402 (Gen12.70) (Tejas, John, Lucas) - Fix UAF on destroy against retire race and remove two earlier partial fixes (Janusz) - Limit the reserved VM space to only the platforms that need it (Andi) - Reset queue_priority_hint on parking for execlist platforms (Chris) - Fix gt reset with GuC submission is disabled (Nirmoy) - Correct capture of EIR register on hang (John) - Remove usage of the deprecated ida_simple_xx() API - Refactor confusing __intel_gt_reset() (Nirmoy) - Fix the fix for GuC reset lock confusion (John) - Simplify/extend platform check for Wa_14018913170 (John) - Replace dev_priv with i915 (Andi) - Add and use gt_to_guc() wrapper (Andi) - Remove bogus null check (Rodrigo, Dan) . Selftest improvements (Janusz, Nirmoy, Daniele) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com	2024-04-30 14:40:43 +10:00
Nirmoy Das	4d3421e04c	drm/i915: Fix gt reset with GuC submission is disabled Currently intel_gt_reset() kills the GuC and then resets requested engines. This is problematic because there is a dedicated CSB FIFO which only GuC can access and if that FIFO fills up, the hardware will block on the next context switch until there is space that means the system is effectively hung. If an engine is reset whilst actively executing a context, a CSB entry will be sent to say that the context has gone idle. Thus if reset happens on a very busy system then killing GuC before killing the engines will lead to deadlock because of filled up CSB FIFO. To address this issue, the GuC should be killed only after resetting the requested engines and before calling intel_gt_init_hw(). v2: Improve commit message(John) Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240422201951.633-2-nirmoy.das@intel.com	2024-04-24 18:48:32 +02:00
Nirmoy Das	31c3c53ee3	drm/i915: Refactor confusing __intel_gt_reset() __intel_gt_reset() is really for resetting engines though the name might suggest something else. So add a helper function to remove confusions with no functional changes. v2: Move intel_gt_reset_all_engines() next to intel_gt_reset_engine() to make diff simple(John) Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240422201951.633-1-nirmoy.das@intel.com	2024-04-24 18:48:31 +02:00
Dave Airlie	0208ca55aa	Backmerge tag 'v6.9-rc5' into drm-next Linux 6.9-rc5 I've had a persistent msm failure on clang, and the fix is in fixes so just pull it back to fix that. Signed-off-by: Dave Airlie <airlied@redhat.com>	2024-04-22 14:35:52 +10:00
Christophe JAILLET	2af231e1b8	drm/i915/guc: Remove usage of the deprecated ida_simple_xx() API ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So a -1 has been added when needed. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7108c1871c6cb08d403c4fa6534bc7e6de4cb23d.1705245316.git.christophe.jaillet@wanadoo.fr Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-14 15:09:18 -04:00
Jani Nikula	87816d6074	drm/i915/gt: drop display clock info from gt debugfs The same info is available in i915_cdclk_info. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/50461f13ab09b162de25d3f3587890548f4db499.1712599670.git.jani.nikula@intel.com	2024-04-09 11:29:38 +03:00
John Harrison	152191e5e9	drm/i915/guc: Fix the fix for reset lock confusion The previous fix for the circlular lock splat about the busyness worker wasn't quite complete. Even though the reset-in-progress flag is cleared at the start of intel_uc_reset_finish, the entire function is still inside the reset mutex lock. Not sure why the patch appeared to fix the issue both locally and in CI. However, it is now back again. There is a further complication that the wedge code path within intel_gt_reset() jumps around so much that it results in nested reset_prepare/_finish calls. That is, the call sequence is: intel_gt_reset \| reset_prepare \| __intel_gt_set_wedged \| \| reset_prepare \| \| reset_finish \| reset_finish The nested finish means that even if the clear of the in-progress flag was moved to the end of _finish, it would still be clear for the entire second call. Surprisingly, this does not seem to be causing any other problems at present. As an aside, a wedge on fini does not call the finish functions at all. The reset_in_progress flag is left set (twice). So instead of trying to cancel the worker anywhere at all in the reset path, just add a cancel to intel_guc_submission_fini instead. Note that it is not a problem if the worker is still active during a reset. Either it will run before the reset path starts locking things and will simply block the reset code for a tiny amount of time. Or it will run after the locks have been acquired and will early exit due to the try-lock. Also, do not use the reset-in-progress flag to decide whether a synchronous cancel is safe (from a lockdep perspective) or not. Instead, use the actual reset mutex state (both the genuine one and the custom rolled BACKOFF one). Fixes: `0e00a8814e` ("drm/i915/guc: Avoid circular locking issue on busyness flush") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com (cherry picked from commit `3563d85531`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-08 13:09:37 -04:00
John Harrison	3563d85531	drm/i915/guc: Fix the fix for reset lock confusion The previous fix for the circlular lock splat about the busyness worker wasn't quite complete. Even though the reset-in-progress flag is cleared at the start of intel_uc_reset_finish, the entire function is still inside the reset mutex lock. Not sure why the patch appeared to fix the issue both locally and in CI. However, it is now back again. There is a further complication that the wedge code path within intel_gt_reset() jumps around so much that it results in nested reset_prepare/_finish calls. That is, the call sequence is: intel_gt_reset \| reset_prepare \| __intel_gt_set_wedged \| \| reset_prepare \| \| reset_finish \| reset_finish The nested finish means that even if the clear of the in-progress flag was moved to the end of _finish, it would still be clear for the entire second call. Surprisingly, this does not seem to be causing any other problems at present. As an aside, a wedge on fini does not call the finish functions at all. The reset_in_progress flag is left set (twice). So instead of trying to cancel the worker anywhere at all in the reset path, just add a cancel to intel_guc_submission_fini instead. Note that it is not a problem if the worker is still active during a reset. Either it will run before the reset path starts locking things and will simply block the reset code for a tiny amount of time. Or it will run after the locks have been acquired and will early exit due to the try-lock. Also, do not use the reset-in-progress flag to decide whether a synchronous cancel is safe (from a lockdep perspective) or not. Instead, use the actual reset mutex state (both the genuine one and the custom rolled BACKOFF one). Fixes: `0e00a8814e` ("drm/i915/guc: Avoid circular locking issue on busyness flush") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com	2024-04-05 09:31:37 -07:00
Rodrigo Vivi	7406538860	drm/i915/guc: Remove bogus null check This null check is bogus because we are already using 'ce' stuff in many places before this function is called. Having this here is useless and confuses static analyzer tools that can see: struct intel_engine_cs *engine = ce->engine; before this check, in the same function. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202403101225.7AheJhZJ-lkp@intel.com/ Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328213107.90632-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:44:22 -04:00
Andi Shyti	6db31251bb	drm/i915/gt: Enable only one CCS for compute workload Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance. This change can be tested with igt i915_query. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com (cherry picked from commit `2bebae0112`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:26:10 -04:00
Andi Shyti	ea315f98e5	drm/i915/gt: Do not generate the command streamer for all the CCS We want a fixed load CCS balancing consisting in all slices sharing one single user engine. For this reason do not create the intel_engine_cs structure with its dedicated command streamer for CCS slices beyond the first. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Acked-by: Michal Mrozek <michal.mrozek@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com (cherry picked from commit `c7a5aa4e57`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:26:10 -04:00
Andi Shyti	bc9a1ec012	drm/i915/gt: Disable HW load balancing for CCS The hardware should not dynamically balance the load between CCS engines. Wa_14019159160 recommends disabling it across all platforms. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com (cherry picked from commit `f5d2904cf8`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:26:10 -04:00
Andi Shyti	94bf3e60e1	drm/i915/gt: Limit the reserved VM space to only the platforms that need it Commit `9bb66c179f` ("drm/i915: Reserve some kernel space per vm") reduces the available VM space of one page in order to apply Wa_16018031267 and Wa_16018063123. This page was reserved indiscrimitely in all platforms even when not needed. Limit it to DG2 onwards. Fixes: `9bb66c179f` ("drm/i915: Reserve some kernel space per vm") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327200546.640108-1-andi.shyti@linux.intel.com (cherry picked from commit `9721634441`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:26:10 -04:00
Rodrigo Vivi	5add703f6a	Merge drm/drm-next into drm-intel-next Catching up on 6.9-rc2 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-02 08:17:13 -04:00
Andi Shyti	2bebae0112	drm/i915/gt: Enable only one CCS for compute workload Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance. This change can be tested with igt i915_query. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com	2024-03-30 01:17:36 +01:00
Andi Shyti	c7a5aa4e57	drm/i915/gt: Do not generate the command streamer for all the CCS We want a fixed load CCS balancing consisting in all slices sharing one single user engine. For this reason do not create the intel_engine_cs structure with its dedicated command streamer for CCS slices beyond the first. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Acked-by: Michal Mrozek <michal.mrozek@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com	2024-03-30 01:17:35 +01:00
Andi Shyti	f5d2904cf8	drm/i915/gt: Disable HW load balancing for CCS The hardware should not dynamically balance the load between CCS engines. Wa_14019159160 recommends disabling it across all platforms. Fixes: `d2eae8e98d` ("drm/i915/dg2: Drop force_probe requirement") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: <stable@vger.kernel.org> # v6.2+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com	2024-03-30 01:17:35 +01:00
Andi Shyti	9721634441	drm/i915/gt: Limit the reserved VM space to only the platforms that need it Commit `9bb66c179f` ("drm/i915: Reserve some kernel space per vm") reduces the available VM space of one page in order to apply Wa_16018031267 and Wa_16018063123. This page was reserved indiscrimitely in all platforms even when not needed. Limit it to DG2 onwards. Fixes: `9bb66c179f` ("drm/i915: Reserve some kernel space per vm") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327200546.640108-1-andi.shyti@linux.intel.com	2024-03-28 20:19:11 +01:00
Chris Wilson	4a3859ea52	drm/i915/gt: Reset queue_priority_hint on parking Originally, with strict in order execution, we could complete execution only when the queue was empty. Preempt-to-busy allows replacement of an active request that may complete before the preemption is processed by HW. If that happens, the request is retired from the queue, but the queue_priority_hint remains set, preventing direct submission until after the next CS interrupt is processed. This preempt-to-busy race can be triggered by the heartbeat, which will also act as the power-management barrier and upon completion allow us to idle the HW. We may process the completion of the heartbeat, and begin parking the engine before the CS event that restores the queue_priority_hint, causing us to fail the assertion that it is MIN. <3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 166.210781] Dumping ftrace buffer: <0>[ 166.210795] --------------------------------- ... <0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 } <0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646 <0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0 <0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659 <0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40 <0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 } <0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2 <0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin <0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2 <0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin <0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660 <0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns } <0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked <0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040 <0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns } <0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns } <0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 167.303811] --------------------------------- <4>[ 167.304722] ------------[ cut here ]------------ <2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283! <4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1 <4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 <4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915] <4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915] <4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48 <4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246 <4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006 <4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 <4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001 <4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000 <4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0 <4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000 <4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0 <4>[ 167.304955] PKRU: 55555554 <4>[ 167.304957] Call Trace: <4>[ 167.304958] <TASK> <4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915] <4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915] <4>[ 167.305800] retire_requests+0x51/0x80 [i915] <4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915] <4>[ 167.305985] process_scheduled_works+0x2db/0x530 <4>[ 167.305990] worker_thread+0x18c/0x350 <4>[ 167.305993] kthread+0xfe/0x130 <4>[ 167.305997] ret_from_fork+0x2c/0x50 <4>[ 167.306001] ret_from_fork_asm+0x1b/0x30 <4>[ 167.306004] </TASK> It is necessary for the queue_priority_hint to be lower than the next request submission upon waking up, as we rely on the hint to decide when to kick the tasklet to submit that first request. Fixes: `22b7a426bb` ("drm/i915/execlists: Preempt-to-busy") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit `98850e96cf`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-03-28 12:16:16 -04:00
Tejas Upadhyay	186bce6827	drm/i915/mtl: Update workaround 14018575942 Applying WA 14018575942 only on Compute engine has impact on some apps like chrome. Updating this WA to apply on Render engine as well as it is helping with performance on Chrome. Note: There is no concern from media team thus not applying WA on media engines. We will revisit if any issues reported from media team. V2(Matt): - Use correct WA number Fixes: `668f37e1ee` ("drm/i915/mtl: Update workaround 14018778641") Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228103738.2018458-1-tejas.upadhyay@intel.com (cherry picked from commit `7127128017`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-03-28 12:16:15 -04:00
Lucas De Marchi	e580fc47b5	drm/i915: Delete stray .rej file drivers/gpu/drm/i915/gt/intel_workarounds.c.rej was incorrectly added to the tree after solving a conflict. Remove it. Fixes: `326e30e462` ("drm/i915: Drop dead code for pvc") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20240325083435.4f970eec@canb.auug.org.au Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240325144728.537855-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-26 06:43:28 -07:00
Chris Wilson	98850e96cf	drm/i915/gt: Reset queue_priority_hint on parking Originally, with strict in order execution, we could complete execution only when the queue was empty. Preempt-to-busy allows replacement of an active request that may complete before the preemption is processed by HW. If that happens, the request is retired from the queue, but the queue_priority_hint remains set, preventing direct submission until after the next CS interrupt is processed. This preempt-to-busy race can be triggered by the heartbeat, which will also act as the power-management barrier and upon completion allow us to idle the HW. We may process the completion of the heartbeat, and begin parking the engine before the CS event that restores the queue_priority_hint, causing us to fail the assertion that it is MIN. <3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 166.210781] Dumping ftrace buffer: <0>[ 166.210795] --------------------------------- ... <0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 } <0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646 <0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0 <0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659 <0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40 <0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 } <0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2 <0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin <0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2 <0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin <0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660 <0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns } <0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked <0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040 <0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns } <0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns } <0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 167.303811] --------------------------------- <4>[ 167.304722] ------------[ cut here ]------------ <2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283! <4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1 <4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 <4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915] <4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915] <4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48 <4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246 <4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006 <4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 <4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001 <4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000 <4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0 <4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000 <4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0 <4>[ 167.304955] PKRU: 55555554 <4>[ 167.304957] Call Trace: <4>[ 167.304958] <TASK> <4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915] <4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915] <4>[ 167.305800] retire_requests+0x51/0x80 [i915] <4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915] <4>[ 167.305985] process_scheduled_works+0x2db/0x530 <4>[ 167.305990] worker_thread+0x18c/0x350 <4>[ 167.305993] kthread+0xfe/0x130 <4>[ 167.305997] ret_from_fork+0x2c/0x50 <4>[ 167.306001] ret_from_fork_asm+0x1b/0x30 <4>[ 167.306004] </TASK> It is necessary for the queue_priority_hint to be lower than the next request submission upon waking up, as we rely on the hint to decide when to kick the tasklet to submit that first request. Fixes: `22b7a426bb` ("drm/i915/execlists: Preempt-to-busy") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com	2024-03-26 09:34:25 +01:00
Lucas De Marchi	1bfc03b137	drm/i915: Remove special handling for !RCS_MASK() With both XEHPSDV and PVC removed (as platforms, most of their code remain used by others), there's no need to handle !RCS_MASK() as other platforms don't ever have fused-off render. Remove those code paths and the special WA flag when initializing GuC. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-7-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:15:00 -07:00
Lucas De Marchi	326e30e462	drm/i915: Drop dead code for pvc PCI IDs for PVC were never added and platform always marked with force_probe. Drop what's not used and rename some places as needed. The registers not used anymore are also removed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:56 -07:00
Lucas De Marchi	48ba4a6dc3	drm/i915: Update IP_VER(12, 50) With no platform using graphics/media IP_VER(12, 50), replace the checks throughout the code with IP_VER(12, 55) so the code makes sense by itself with no additional explanation of previous baggage. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:52 -07:00
Lucas De Marchi	cb4046d289	drm/i915: Drop dead code for xehpsdv PCI IDs for XEHPSDV were never added and platform always marked with force_probe. Drop what's not used and rename some places to either be xehp or dg2, depending on the platform/IP checks. The registers not used anymore are also removed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:39 -07:00
Radhakrishna Sripada	b4985cce81	drm/i915/xelpg: Add Wa_14020495402 Disable clockgating for TDL SVHS fub. v2: Implement in general render/compute wa's(MattR) Bspec: 46045 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318210025.562698-1-radhakrishna.sripada@intel.com	2024-03-20 10:51:12 -07:00
Lucas De Marchi	af7c4a648e	drm/i915: Drop WA 16015675438 With dynamic load-balancing disabled on the compute side, there's no reason left to enable WA 16015675438. Drop it from both PVC and DG2. Note that this can be done because now the driver always set a fixed partition of EUs during initialization via the ccs_mode configuration. The flag to GuC is still needed because of 18020744125, so update the comment accordingly. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Mateusz Jablonski <mateusz.jablonski@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306144723.1826977-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-11 08:32:35 -07:00
John Harrison	7ad6a8fae5	drm/i915/guc: Enable Wa_14019159160 Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a super-set of Wa_16019325821, so requires turning that one as well as setting the new flag for Wa_14019159160 itself. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-4-John.C.Harrison@Intel.com	2024-03-07 15:26:47 -08:00
John Harrison	6cc7a5c7dc	drm/i915/guc: Add support for w/a KLVs To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-3-John.C.Harrison@Intel.com	2024-03-07 15:26:45 -08:00
John Harrison	f673d59e31	drm/i915: Enable Wa_16019325821 Some platforms require holding RCS context switches until CCS is idle (the reverse w/a of Wa_14014475959). Some platforms require both versions. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-2-John.C.Harrison@Intel.com	2024-03-07 15:26:42 -08:00
Vinay Belgaumkar	cec82816d0	drm/i915/guc: Use context hints for GT frequency Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint v2: Rename flags as per review suggestions (Rodrigo, Tvrtko). Also, use flag bits in intel_context as it allows finer control for toggling per engine if needed (Tvrtko). v3: Minor review comments (Tvrtko) v4: Update comment (Sushma) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306012759.204938-1-vinay.belgaumkar@intel.com	2024-03-07 10:25:06 -08:00
Tejas Upadhyay	7127128017	drm/i915/mtl: Update workaround 14018575942 Applying WA 14018575942 only on Compute engine has impact on some apps like chrome. Updating this WA to apply on Render engine as well as it is helping with performance on Chrome. Note: There is no concern from media team thus not applying WA on media engines. We will revisit if any issues reported from media team. V2(Matt): - Use correct WA number Fixes: `668f37e1ee` ("drm/i915/mtl: Update workaround 14018778641") Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228103738.2018458-1-tejas.upadhyay@intel.com	2024-03-07 14:48:26 +01:00
John Harrison	4ae86a7f8d	drm/i915/guc: Simplify/extend platform check for Wa_14018913170 The above w/a is required for every platform that the i915 driver supports. It is fixed on the latest platforms but they are only supported by Xe instead of i915. So just remove the platform check completely and keep the code simple. v2: Add extra comment (review feedback from Rodrigo). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223202846.1532176-1-John.C.Harrison@Intel.com	2024-03-05 17:19:50 -08:00
Janusz Krzysztofik	6616e04817	drm/i915/selftest_hangcheck: Check sanity with more patience While trying to reproduce some other issues reported by CI for i915 hangcheck live selftest, I found them hidden behind timeout failures reported by igt_hang_sanitycheck -- the very first hangcheck test case executed. Feb 22 19:49:06 DUT1394ACMR kernel: calling mei_gsc_driver_init+0x0/0xff0 [mei_gsc] @ 121074 Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] Cannot find any crtc or sizes Feb 22 19:49:06 DUT1394ACMR kernel: probe of i915.mei-gsc.768 returned 0 after 1475 usecs Feb 22 19:49:06 DUT1394ACMR kernel: probe of i915.mei-gscfi.768 returned 0 after 1441 usecs Feb 22 19:49:06 DUT1394ACMR kernel: initcall mei_gsc_driver_init+0x0/0xff0 [mei_gsc] returned 0 after 3010 usecs Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG_GEM enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG_RUNTIME_PM enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915: Performing live selftests with st_random_seed=0x4c26c048 st_timeout=500 Feb 22 19:49:07 DUT1394ACMR kernel: i915: Running hangcheck Feb 22 19:49:07 DUT1394ACMR kernel: calling mei_hdcp_driver_init+0x0/0xff0 [mei_hdcp] @ 121074 Feb 22 19:49:07 DUT1394ACMR kernel: i915: Running intel_hangcheck_live_selftests/igt_hang_sanitycheck Feb 22 19:49:07 DUT1394ACMR kernel: probe of 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04 returned 0 after 1398 usecs Feb 22 19:49:07 DUT1394ACMR kernel: probe of i915.mei-gsc.768-b638ab7e-94e2-4ea2-a552-d1c54b627f04 returned 0 after 97 usecs Feb 22 19:49:07 DUT1394ACMR kernel: initcall mei_hdcp_driver_init+0x0/0xff0 [mei_hdcp] returned 0 after 101960 usecs Feb 22 19:49:07 DUT1394ACMR kernel: calling mei_pxp_driver_init+0x0/0xff0 [mei_pxp] @ 121094 Feb 22 19:49:07 DUT1394ACMR kernel: probe of 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1 returned 0 after 435 usecs Feb 22 19:49:07 DUT1394ACMR kernel: mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915]) Feb 22 19:49:07 DUT1394ACMR kernel: 100ms wait for request failed on rcs0, err=-62 Feb 22 19:49:07 DUT1394ACMR kernel: probe of i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1 returned 0 after 158425 usecs Feb 22 19:49:07 DUT1394ACMR kernel: initcall mei_pxp_driver_init+0x0/0xff0 [mei_pxp] returned 0 after 224159 usecs Feb 22 19:49:07 DUT1394ACMR kernel: i915/intel_hangcheck_live_selftests: igt_hang_sanitycheck failed with error -5 Feb 22 19:49:07 DUT1394ACMR kernel: i915: probe of 0000:03:00.0 failed with error -5 Those request waits, once timed out after 100ms, have never been confirmed to still persist over another 100ms, always being able to complete within the originally requested wait time doubled. Taking into account potentially significant additional concurrent workload generated by new auxiliary drivers that didn't exist before and now are loaded in parallel with the i915 module also when loaded in selftest mode, relax our expectations on time consumed by the sanity check request before it completes. Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228152500.38267-2-janusz.krzysztofik@linux.intel.com	2024-03-05 17:42:05 +01:00
John Harrison	e45afbeb59	drm/i915/guc: Correct capture of EIR register on hang The EIR register (0x20B0) was being included in the engine class list for render and compute as the absolute register address. However, it is actually a ring register available on all engines at an offset of (base) + 0xB0. As it was included as an RCS engine but with the absolute address, GuC was adding on another 0x2000 and coming out at an invalid location. Thus it would reject the register and complain about only managing a partial capture. So update the list to use the RING_EIR version of the register and include it for all engines. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223203204.1533410-1-John.C.Harrison@Intel.com	2024-03-04 15:35:22 -08:00
Andi Shyti	3f2f20da79	drm/i915/guc: Use the new gt_to_guc() wrapper Get the guc reference from the gt using the gt_to_guc() helper. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231229102734.674362-3-andi.shyti@linux.intel.com	2024-03-01 13:19:43 +01:00
Andi Shyti	01b2b8cc1e	drm/i915/gt: Create the gt_to_guc() wrapper We already have guc_to_gt() and getting to guc from the GT it requires some mental effort. Add the gt_to_guc(). Given the reference to the "gt", the gt_to_guc() will return the pinter to the "guc". Update all the files under the gt/ directory. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231229102734.674362-2-andi.shyti@linux.intel.com	2024-03-01 13:19:43 +01:00
Dave Airlie	c6d6a82d8a	Merge tag 'drm-misc-next-fixes-2024-02-29' of https://anongit.freedesktop.org/git/drm/drm-misc into drm-next Short summary of fixes pull: i915: - Fix NULL-pointer deref imx: - dcss: Fix resource-size calculation firmware: - sysfb: Fix returned error code Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240229085331.GA25863@localhost.localdomain	2024-03-01 19:38:13 +10:00
Dave Airlie	ca7a1d0d18	Merge tag 'drm-intel-next-2024-02-27-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next drm/i915 feature pull #2 for v6.9: Features and functionality: - DP tunneling and bandwidth allocation support (Imre) - Add more ADL-N PCI IDs (Gustavo) - Enable fastboot also on older platforms (Ville) - Bigjoiner force enable debugfs option for testing (Stan) Refactoring and cleanups: - Remove unused structs and struct members (Jiri Slaby) - Use per-device debug logging (Ville) - State check improvements (Ville) - Hardcoded cd2x divider cleanups (Ville) - CDCLK documentation updates (Ville, Rodrigo) Fixes: - HDCP MST Type1 fixes (Suraj) - Fix MTL C20 PHY PLL values (Ravi) - More hardware access prevention during init (Imre) - Always enable decompression with tile4 on Xe2 (Juha-Pekka) - Improve LNL package C residency (Suraj) drm core changes: - DP tunneling and bandwidth allocation helpers (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87sf1devbj.fsf@intel.com	2024-02-28 11:02:55 +10:00
Thomas Zimmermann	b0fda2fcb4	Merge drm/drm-next into drm-misc-next-fixes Backmerging from drm/drm-next to prepare drm-misc-next-fixes for the rest of the release cycle. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-02-26 15:12:53 +01:00
Tvrtko Ursulin	32ca5ebfde	drm/i915: Fix possible null pointer dereference after drm_dbg_printer conversion Request can be NULL if no guilty request was identified so simply use engine->i915 instead. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `d50892a955` ("drm/i915: switch from drm_debug_printer() to device specific drm_dbg_printer()") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20240219131423.1854991-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-02-22 18:50:06 +01:00
Jiri Slaby (SUSE)	394a1376d8	drm/i915: remove intel_guc::ads_engine_usage_size intel_guc::ads_engine_usage_size was never used since its addition in commit `77cdd054dd` (drm/i915/pmu: Connect engine busyness stats from GuC to pmu). Drop it. Found by https://github.com/jirislaby/clang-struct. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240216065326.6910-9-jirislaby@kernel.org	2024-02-19 15:35:17 -05:00
Dave Airlie	9ac4beb757	Merge tag 'drm-misc-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.9: UAPI Changes: Cross-subsystem Changes: arch: - powerpc/ps3: select CONFIG_VIDEO Core Changes: ci: - msm: fix apq8016 runner display: - use newer DRM print helpers documentation: - fix typos print: - add device-specific error and debug printers sysfb: - set Linux parent device for firmware framebuffer tests: - mm: use newer DRM print helpers Driver Changes: bridge: - switch to ->read_edid callback throughout the bridge drivers - remove old ->get_edid callback i915: - use newer DRM print helpers lima: - improve stability by fixes to error handling and recovery mediathek: - switch to ->read_edid callback msm: - switch to ->read_edid callback omap: - switch to ->read_edid callback panel: - add Powkiddy RGB10MAX3 plus DT bindings - st7703: support panel rotation plus DT bindings rockchip: - DT bindings: remove port, add power-domains xe: - use newer DRM print helpers xlnx: - switch to ->read_edid callback Signed-off-by: Dave Airlie <airlied@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmXOD/oACgkQaA3BHVML # eiMWMAgArTVXF4UQ+FUxYZB5QTm2veYIpilvwmzaQLNxsM9SsWpzwMIVAi+xf93g # uqUqkl6QvZ9pJg6bxuXRNcJw/GObIO4x6tn+LkbccczgHiHwvn6ydNdUoMx8ulne # EsGC0z8bb5Gpwh9b/pnBul2AoIE7PHAJltgH271/O2xnhFMUbchQ0ckHvWnn8/GA # Nef145ySX4gkYtY8u2TRr4r6Bkp7Tpiyv6ipU7Cpu7KqyveTDMx3c9r5FaiHnJT/ # Hx/5s87q0Bx2m+iNjlBLJzYjF2UWth+pbfiu3xwyWOE7hdkPLwCQ5mqHWcFFqxfb # Vuj9jP+Vb68L7EvGpq2LArLdhZjHIQ== # =SsjX # -----END PGP SIGNATURE----- # gpg: Signature made Thu 15 Feb 2024 23:22:02 AEST # gpg: using RSA key 7217FBAC8CE9CF6344A168E5680DC11D530B7A23 # gpg: Can't check signature: No public key From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240215132610.GA1464@localhost.localdomain	2024-02-16 13:16:40 +10:00
Dave Airlie	6f167a3673	Merge tag 'drm-intel-gt-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - Add GuC submission interface version query (Tvrtko Ursulin) Driver Changes: Fixes/improvements/new stuff: - Atomically invalidate userptr on mmu-notifier (Jonathan Cavitt) - Update handling of MMIO triggered reports (Umesh Nerlige Ramappa) - Don't make assumptions about intel_wakeref_t type (Jani Nikula) - Add workaround 14019877138 [xelpg] (Tejas Upadhyay) - Allow for very slow HuC loading [huc] (John Harrison) - Flush context destruction worker at suspend [guc] (Alan Previn) - Close deregister-context race against CT-loss [guc] (Alan Previn) - Avoid circular locking issue on busyness flush [guc] (John Harrison) - Use rc6.supported flag from intel_gt for rc6_enable sysfs (Juan Escamilla) - Reflect the true and current status of rc6_enable (Juan Escamilla) - Wake GT before sending H2G message [mtl] (Vinay Belgaumkar) - Restart the heartbeat timer when forcing a pulse (John Harrison) Future platform enablement: - Extend driver code of Xe_LPG to Xe_LPG+ [xelpg] (Harish Chegondi) - Extend some workarounds/tuning to gfx version 12.74 [xelpg] (Matt Roper) Miscellaneous: - Reconcile Excess struct member kernel-doc warnings (Randy Dunlap) - Change wa and EU_PERF_CNTL registers to MCR type [guc] (Shuicheng Lin) - Add flex arrays to struct i915_syncmap (Erick Archer) - Increasing the sleep time for live_rc6_manual [selftests] (Anirban Sk) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zc3iIVsiAwo+bu10@tursulin-desk	2024-02-16 11:19:15 +10:00
John Harrison	eb927f01df	drm/i915/gt: Restart the heartbeat timer when forcing a pulse The context persistence code does things like send super high priority heartbeat pulses to ensure any leaked context can still be pre-empted and thus isn't a total denial of service but only a minor denial of service. Unfortunately, it wasn't bothering to restart the heartbeat worker with a fresh timeout. Thus, if a persistent context happened to be closed just before the heartbeat was going to go ping anyway then the forced pulse would get a negligble execution time. And as the forced pulse is super high priority, the worker thread's next step is a reset. Which means a potentially innocent system randomly goes boom when attempting to close a context. So, force a re-schedule of the worker thread with the appropriate timeout. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240110210216.4125092-1-John.C.Harrison@Intel.com	2024-02-14 17:17:35 -08:00
Anirban Sk	599b0d8ce6	drm/i915/selftests: Increasing the sleep time for live_rc6_manual Sometimes gt_pm live_rc6_manual selftest fails due to no power being measured for the rc6 disabled period. Therefore increasing the rc6 disable period from 250ms to 1000ms to rule out such sporadic failure. v3: - More descriptive and improved commit message (Anshuman) Signed-off-by: Anirban Sk <sk.anirban@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240212050738.1162198-1-sk.anirban@intel.com	2024-02-13 19:55:52 +05:30
Jani Nikula	d2dda3bf5c	drm/i915: use drm_printf() with the drm_err_printer intead of pr_err() There's already a related drm_printer. Use it to preserve the context instead of a separate pr_err(). Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/246c0c275d05c919d959983e1784e3f7347f4540.1705410327.git.jani.nikula@intel.com	2024-02-09 11:52:09 +02:00
Jani Nikula	d50892a955	drm/i915: switch from drm_debug_printer() to device specific drm_dbg_printer() Prefer the device specific debug printer. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/f2614dfcba295be20c650cdab24c3979d265f422.1705410327.git.jani.nikula@intel.com	2024-02-09 11:52:06 +02:00
Jani Nikula	5e0c04c8c4	drm/print: make drm_err_printer() device specific by using drm_err() With few users for drm_err_printer(), it's still feasible to convert it to be device specific. Use drm_err() under the hood. While at it, make the prefix optional. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/2a9cdcfc1df44568078f7c131e2e7e0f7c94e97e.1705410327.git.jani.nikula@intel.com	2024-02-09 11:51:13 +02:00

1 2 3 4 5 ...

3210 Commits