DG2 has issues. To work around one of these the GuC must schedule
apps in an exclusive manner across both RCS and CCS. That is, if a
context from app X is running on RCS then all CCS engines must sit
idle even if there are contexts from apps Y, Z, ... waiting to run. A
certain OS favours RCS to the total starvation of CCS. Linux does not.
Hence the GuC now has a scheduling policy setting to control this
abitration.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922201209.1446343-2-John.C.Harrison@Intel.com
If attempting to perform a GT reset takes long than 5 seconds (including
resetting the display for gen3/4), then we declare all hope lost and
discard all user work and wedge the device to prevent further
misbehaviour. 5 seconds is too short a time for such drastic action, as
we may be stuck on other timeouts and watchdogs. If we allow a little
bit longer before hitting the big red button, we should at the very
least capture other hung task indicators pointing towards the reason why
the reset was hanging; and allow more marginal cases the extra headroom
to complete the reset without further collateral damage.
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916204823.1897089-1-ashutosh.dixit@intel.com
On pre-ddi platforms we have slightly different code being
used for HDMI TMDS clock to dotclock conversion between the
state computation and state readout. Both of these need to
round the same way in order to not get a mismatch between
the computed and read out states. Fix up the rounding
direction in the readout path to match what is used during
state computation.
Another option would to just use intel_crtc_dotclock()
in the readout path as well, but I don't really want to
do that as the current code more accurately represents
how the hardware really works; The HDMI port register
defines whether we're actually outputting 8bpc or 12bpc
over HDMI, and the PIPECONF bpc setting just defines what
goes over FDI between the CPU and PCH. The fact that we
try to cram all that into a single pipe_bpp during state
computation is perhaps not entirely great...
Fixes: f2c9df1010 ("drm/i915: Round TMDS clock to nearest")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220926193021.23287-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
During system resume DP MST requires AUX to be working already before
the HW state readout of the given encoder. Since AUX requires the
encoder/PHY TypeC mode to be initialized, which atm only happens during
HW state readout, these AUX transfers can change the TypeC mode
incorrectly (disconnecting the PHY for an enabled encoder) and trigger
the state check WARNs in intel_tc_port_sanitize().
Fix this by initializing the TypeC mode earlier both during driver
loading and system resume and making sure that the mode can't change
until the encoder's state is read out. While at it add the missing
DocBook comments and rename
intel_tc_port_sanitize()->intel_tc_port_sanitize_mode() for consistency.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922172148.2913088-1-imre.deak@intel.com
We always allocate two DPLLs (TC and TBT) for TC ports. This
is because we can't know ahead of time wherher we need to put
the PHY into DP-Alt or TBT mode.
However during readout we can obviously only read out the state
of the DPLL that the port is actually using. Thus the state after
readout will not have both DPLLs populated.
We run into problems if during readout the TC port is in DP-Alt
mode, but we then perform a modeset on the port without going
through the full .compute_config() machinery, and during said
modeset the port cannot be switched back into DP-Alt mode and
we need to take the TBT fallback path. Such a modeset can
happen eg. due to cdclk reprogramming.
This wasn't a problem earlier because we did all the DPLL
calculations much later in the modeset. So even if flagged
a modeset very late we'd still have gone through the DPLL
calculations. But now all the DPLL calculations happen much
earlier and so we need to deal with it, or else we'll attempt
a modeset without a DPLL.
To guarantee that we always have both DPLLs fully cal/ulated
for TC ports force a full modeset computation during the
initial commit.
v2: Avoid bitwise operation on bool (Jani)
Call the return variable 'fastset' to convey its meaning
Reported-by: Lee Shawn C <shawn.c.lee@intel.com>
Fixes: b000abd3b3 ("drm/i915: Do .crtc_compute_clock() earlier")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922191236.4194-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit eddb4afcb6)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Commit 00c6cbfd4e ("drm/i915: move pipe_mask and cpu_transcoder_mask
to runtime info") moved the pipe_mask member from struct
intel_device_info to intel_runtime_info, but overlooked some of our
platforms initializing device info .display = {}. This is significant,
as pipe_mask is the single point of truth for a device having a display
or not; the platforms in question left pipe_mask to whatever was set for
the platforms they "inherit" from in the complex macro scheme we have.
Add new NO_DISPLAY macro initializing .__runtime.pipe_mask = 0, which
will cause the device info .display sub-struct to be zeroed in
intel_device_info_runtime_init(). A better solution (or simply audit of
proper use of HAS_DISPLAY() checks) is required before moving forward
with [1].
Also clear all the display related members in runtime info if there's no
display. The latter is a bit tedious, but it's for completeness at this
time, to ensure similar functionality as before.
[1] https://lore.kernel.org/r/dfda1bf67f02ceb07c280b7a13216405fd1f7a34.1660137416.git.jani.nikula@intel.com
Fixes: 00c6cbfd4e ("drm/i915: move pipe_mask and cpu_transcoder_mask to runtime info")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Maarten Lankhort <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916082642.3451961-1-jani.nikula@intel.com
(cherry picked from commit 86570b7b12)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Do all the checks in intel_dp_initial_fastset_check() instead
of bailing out on the first condition that triggers.
This makes for better debug logs since we see all the reasons
why the full modeset computation is forced.
Also avoid the risk of someone accidentally adding a check
later in the function that would require connectors_changed=true
(ie. no fastset at all), but an earlier check may have already
bailed out with just mode_changed=true (ie. fastset is still
possible).
Pimp the debugs with the encoder id+name while at it.
v2: Call the return variable 'fastset' to convey its meaning
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922191314.4252-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
We always allocate two DPLLs (TC and TBT) for TC ports. This
is because we can't know ahead of time wherher we need to put
the PHY into DP-Alt or TBT mode.
However during readout we can obviously only read out the state
of the DPLL that the port is actually using. Thus the state after
readout will not have both DPLLs populated.
We run into problems if during readout the TC port is in DP-Alt
mode, but we then perform a modeset on the port without going
through the full .compute_config() machinery, and during said
modeset the port cannot be switched back into DP-Alt mode and
we need to take the TBT fallback path. Such a modeset can
happen eg. due to cdclk reprogramming.
This wasn't a problem earlier because we did all the DPLL
calculations much later in the modeset. So even if flagged
a modeset very late we'd still have gone through the DPLL
calculations. But now all the DPLL calculations happen much
earlier and so we need to deal with it, or else we'll attempt
a modeset without a DPLL.
To guarantee that we always have both DPLLs fully cal/ulated
for TC ports force a full modeset computation during the
initial commit.
v2: Avoid bitwise operation on bool (Jani)
Call the return variable 'fastset' to convey its meaning
Reported-by: Lee Shawn C <shawn.c.lee@intel.com>
Fixes: b000abd3b3 ("drm/i915: Do .crtc_compute_clock() earlier")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922191236.4194-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Commit 00c6cbfd4e ("drm/i915: move pipe_mask and cpu_transcoder_mask
to runtime info") moved the pipe_mask member from struct
intel_device_info to intel_runtime_info, but overlooked some of our
platforms initializing device info .display = {}. This is significant,
as pipe_mask is the single point of truth for a device having a display
or not; the platforms in question left pipe_mask to whatever was set for
the platforms they "inherit" from in the complex macro scheme we have.
Add new NO_DISPLAY macro initializing .__runtime.pipe_mask = 0, which
will cause the device info .display sub-struct to be zeroed in
intel_device_info_runtime_init(). A better solution (or simply audit of
proper use of HAS_DISPLAY() checks) is required before moving forward
with [1].
Also clear all the display related members in runtime info if there's no
display. The latter is a bit tedious, but it's for completeness at this
time, to ensure similar functionality as before.
[1] https://lore.kernel.org/r/dfda1bf67f02ceb07c280b7a13216405fd1f7a34.1660137416.git.jani.nikula@intel.com
Fixes: 00c6cbfd4e ("drm/i915: move pipe_mask and cpu_transcoder_mask to runtime info")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Maarten Lankhort <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916082642.3451961-1-jani.nikula@intel.com
When we submit a new pair of contexts to ELSP for execution, we start a
timer by which point we expect the HW to have switched execution to the
pending contexts. If the promotion to the new pair of contexts has not
occurred, we declare the executing context to have hung and force the
preemption to take place by resetting the engine and resubmitting the
new contexts.
This can lead to an unfair situation where almost all of the preemption
timeout is consumed by the first context which just switches into the
second context immediately prior to the timer firing and triggering the
preemption reset (assuming that the timer interrupts before we process
the CS events for the context switch). The second context hasn't yet had
a chance to yield to the incoming ELSP (and send the ACk for the
promotion) and so ends up being blamed for the reset.
If we see that a context switch has occurred since setting the
preemption timeout, but have not yet received the ACK for the ELSP
promotion, rearm the preemption timer and check again. This is
especially significant if the first context was not schedulable and so
we used the shortest timer possible, greatly increasing the chance of
accidentally blaming the second innocent context.
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Tested-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
(cherry picked from commit 107ba1a2c7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We can simplify the vlv watermark sanitation by reusing the
second half of vlv_compute_pipe_wm() to convert the sanitized
raw watermarks into the proper form to be used as the
optimal/intermediate watermarks.
Also to be consistent with normal watermark computation the sanitized
watermarks should be all 0 for any disabled plane. Previously we
zeroed out the watermarks only up to the level (ie. PM2/5/DVDFS)
that was enabled.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622155452.32587-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
In the unlikely case of not finding a fixed mode don't register
the eDP connector. I think there are some places where we'd oops
if we didn't have a fixed mode for eDP so presumable this doesn't
typically happen. But better safe than sorry.
Also pimp the debugs with the encoder id+name. I think dumping
the encoder rather than the connector provides more information
here (eg. to match against the port information in the VBT).
We can also drop the extra check from intel_edp_add_properties().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220912111814.17466-14-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
When we submit a new pair of contexts to ELSP for execution, we start a
timer by which point we expect the HW to have switched execution to the
pending contexts. If the promotion to the new pair of contexts has not
occurred, we declare the executing context to have hung and force the
preemption to take place by resetting the engine and resubmitting the
new contexts.
This can lead to an unfair situation where almost all of the preemption
timeout is consumed by the first context which just switches into the
second context immediately prior to the timer firing and triggering the
preemption reset (assuming that the timer interrupts before we process
the CS events for the context switch). The second context hasn't yet had
a chance to yield to the incoming ELSP (and send the ACk for the
promotion) and so ends up being blamed for the reset.
If we see that a context switch has occurred since setting the
preemption timeout, but have not yet received the ACK for the ELSP
promotion, rearm the preemption timer and check again. This is
especially significant if the first context was not schedulable and so
we used the shortest timer possible, greatly increasing the chance of
accidentally blaming the second innocent context.
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Tested-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
Use DECLARE_DYNDBG_CLASSMAP across DRM:
- in .c files, since macro defines/initializes a record
- in drivers, $mod_{drv,drm,param}.c
ie where param setup is done, since a classmap is param related
- in drm/drm_print.c
since existing __drm_debug param is defined there,
and we ifdef it, and provide an elaborated alternative.
- in drm_*_helper modules:
dp/drm_dp - 1st item in makefile target
drivers/gpu/drm/drm_crtc_helper.c - random pick iirc.
Since these modules all use identical CLASSMAP declarations (ie: names
and .class_id's) they will all respond together to "class DRM_UT_*"
query-commands:
:#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control
NOTES:
This changes __drm_debug from int to ulong, so BIT() is usable on it.
DRM's enum drm_debug_category values need to sync with the index of
their respective class-names here. Then .class_id == category, and
dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...).
Though DRM needs consistent categories across all modules, thats not
generally needed; modules X and Y could define FOO differently (ie a
different NAME => class_id mapping), changes are made according to
each module's private class-map.
No callsites are actually selected by this patch, since none are
class'd yet.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220912052852.1123868-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>