Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for boost if min softlimit is equal or
greater than it.
v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)
v3: Check requested freq in dec_waiters as well
v4: Only check min_softlimit against boost_freq. Limit this
optimization for server parts for now.
v5: min_softlimit can be greater than boost (Ashutosh)
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024171108.14373-1-vinay.belgaumkar@intel.com
Not all Dekel PHY registers have a lane instance, so having to specify
this when using them is awkward. It makes more sense to define each PHY
register with its full internal PHY offset where bits 15:12 is the lane
for lane-instanced PHY registers and just a register bank index for other
PHY registers. This way lane-instanced registers can be referred to with
the (tc_port, lane) parameters, while other registers just with a tc_port
parameter.
An additional benefit of this change is to prevent passing a Dekel
register to a generic MMIO access function or vice versa.
v2:
- Fix parameter reuse in the DKL_REG_MMIO definition.
v3:
- Rebase on latest patchset version.
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221025114457.2191004-3-imre.deak@intel.com
Accessing the TypeC DKL PHY registers during modeset-commit,
-verification, DP link-retraining and AUX power well toggling is racy
due to these code paths being concurrent and the PHY register bank
selection register (HIP_INDEX_REG) being shared between PHY instances
(aka TC ports) and the bank selection being not atomic wrt. the actual
PHY register access.
Add the required locking around each PHY register bank selection->
register access sequence.
Kudos to Ville for noticing the race conditions.
v2:
- Add the DKL PHY register accessors to intel_dkl_phy.[ch]. (Jani)
- Make the DKL_REG_TC_PORT macro independent of PHY internals.
- Move initing the DKL PHY lock to a more logical place.
v3:
- Fix parameter reuse in the DKL_REG_TC_PORT definition.
- Document the usage of phy_lock.
v4:
- Fix adding TC_PORT_1 offset in the DKL_REG_TC_PORT definition.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221025114457.2191004-1-imre.deak@intel.com
The connector->override_edid flag is strictly for EDID override debugfs
management, and drivers have no business using it.
The check for override_edid was added in commit 3019062905 ("drm/i915:
Ignore TMDS clock limit for DP++ when EDID override is set") to
facilitate mode list cross-checking against modes in override EDID when
the connector in question isn't even connected. The dual mode detect
fallback would do VBT based limiting in this case.
Instead of override EDID, check for connector forcing in the fallback.
v2: Simply use !connector->force (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c8b45867cf37134ab40be23e22825ca45adc6041.1666614699.git.jani.nikula@intel.com
For normal connector detect, there's really no point in trying dual mode
detect if the connector is disconnected. We can simplify the detect
sequence by skipping it. Since intel_hdmi_dp_dual_mode_detect() is only
called when EDID is present, we can drop the has_edid parameter.
The functional effect is speeding up disconnected connector detection
ever so slightly, and, combined with firmware EDID, also stop logging
about assuming dual mode adaptor.
It's a bit subtle, but this will also skip dual mode detect if the
connector is force connected and a) there's no EDID of any kind, normal
or override/firmware or b) there's EDID but it does not indicate
digital. These are corner cases no matter what, and arguably forcing
should not be limited by dual mode detect.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f8f2a4a147e1c87ba93269a607f71fc29c4b59f6.1666614699.git.jani.nikula@intel.com
Make glk_load_luts() a bit lighter for the common case
where neither the degamma LUT nor pipe CSC are enabled
by not loading the linear degamma LUT. Making .load_luts()
as lightweight as possible is a good idea since it may need
to execute from a vblank worker under tight deadlines.
My earlier reasoning for always loading the linear degamma LUT
was to avoid an extra LUT load when just enabling/disabling the
pipe CSC, but that is nonsense since we load the LUTs on every
flagged color management change/modeset anyway (either of which
is needed for a pipe CSC toggle).
We can also get rid of the glk_can_preload_luts() special
case since the presence of the degamma LUT will now always
match csc_enable.
v2: Fix typos (Uma)
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024161514.5340-6-ville.syrjala@linux.intel.com
Add an extra remapping step between the logical state of the LUTs
(hw.(de)gamma_lut) as specified via uapi/bigjoiner copy vs.
the actual state of the LUTs programmed into the hardware.
With this we should be finally able finish the (de)gamma
readout/state checker support for the remaining platforms
(ilk-skl) where the same hardware LUT can be positioned
either before or after the pipe CSC unit. Where we position
it depends on factors such as presence of the logical degamma
LUT, RGB vs. YCbCr output, full vs. limited RGB quantization
range.
Without the extra remapping step the state readout doesn't
really know whether the LUT read from the hardware is the
degamma or gamma LUT, and so we is unable to accurately store
it into our crtc state. With the remapping step we know
exactly where to put it given the order of the LUT vs. CSC
in the hardware state.
Only the initial hw->uapi state readout done during driver
load/resume still has the problem of not really knowing
what to do with the LUT(s). But we can just assume 1:1
mapping there and let subsequent commits fix things up.
Another benefit is that we now have a place for purely
internal LUTs, without complicating the bigjoiner uapi->hw
copy logic. This should prove useful for streamlining
glk degamma LUT handling.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024161514.5340-3-ville.syrjala@linux.intel.com
If GuC is being used and we initialized GuC-error-capture,
we need to be warning if we don't provide an error-capture
register list in the firmware ADS, for valid GT engines.
A warning makes sense as this would impact debugability
without realizing why a reglist wasn't retrieved and reported
by GuC.
However, depending on the platform, we might have certain
engines that have a register list for engine instance error state
but not for engine class. Thus, add a check only to warn if the
register list was non existent vs an empty list (use the
empty lists to skip the warning).
NOTE: if a future platform were to introduce new registers
in place of what was an empty list on existing / legacy hardware
engines no warning is provided as the empty list is meant
to be used intentionally. As an example, if a future hardware
were to add blitter engine-class-registers (new) on top
of the legacy blitter engine-instance-register (HEAD, TAIL, etc.),
no warning is generated.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019072930.17755-2-alan.previn.teres.alexis@intel.com
A workaround was added to the driver to allow compute workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually
and cause a reset of the hung engine.
However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.
The purpose of the workaround is actually to give compute tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and compute tasks can have long running threads.
So, rather than disabling the timeout completely, just set it to a
'long' value.
v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-5-John.C.Harrison@Intel.com
Compute workloads are inherently not pre-emptible for long periods on
current hardware. As a workaround for this, the pre-emption timeout
for compute capable engines was disabled. This is undesirable with GuC
submission as it prevents per engine reset of hung contexts. Hence the
next patch will re-enable the timeout but bumped up by an order of
magnitude.
However, the heartbeat might not respect that. Depending upon current
activity, a pre-emption to the heartbeat pulse might not even be
attempted until the last heartbeat period. Which means that only one
period is granted for the pre-emption to occur. With the aforesaid
bump, the pre-emption timeout could be significantly larger than this
heartbeat period.
So adjust the heartbeat code to take the pre-emption timeout into
account. When it reaches the final (high priority) period, it now
ensures the delay before hitting reset is bigger than the pre-emption
timeout.
v2: Fix for selftests which adjust the heartbeat period manually.
v3: Add FIXME comment about selftests. Add extra FIXME comment and
drm_notices when setting heartbeat to a non-default value (review
feedback from Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-4-John.C.Harrison@Intel.com
GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.
v2: Add helper functions for clamping (review feedback from Tvrtko).
v3: Add a bunch of BUG_ON range checks in addition to the checks
already in the clamping functions (Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-2-John.C.Harrison@Intel.com
DGFX platforms has lmem and cpu can access the lmem objects
via mmap and i915 internal i915_gem_object_pin_map() for
i915 own usages. Both of these methods has pre-requisite
requirement to keep GFX PCI endpoint in D0 for a supported
iomem transaction over PCI link. (Refer PCIe specs 5.3.1.4.1)
Both DG1/DG2 have a known hardware bug that violates the PCIe specs
and support the iomem read write transaction over PCIe bus despite
endpoint is D3 state.
Due to above H/W bug, we had never observed any issue with i915 runtime
PM versus lmem access.
But this issue becomes visible when PCIe gfx endpoint's upstream
bridge enters to D3, at this point any lmem read/write access will be
returned as unsupported request. But again this issue is not observed
on every platform because it has been observed on few host machines
DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
which really disables the PCIe power savings and leaves the bridge
at D0 state.
We need a unique interface to read/write from lmem with runtime PM
wakeref protection something similar to intel_uncore_{read, write},
keep autosuspend control to 'on' on all discrete platforms,
until we have a unique interface to read/write from lmem.
This just change the default autosuspend setting of i915 on dGPU,
user can still change it to 'auto'.
v2:
- Modified the commit message and subject with more information.
- Changed the Fixes tag to LMEM support commit. [Joonas]
- Changed !HAS_LMEM() Cond to !IS_DGFX(). [Rodrigo]
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014113258.1284226-1-anshuman.gupta@intel.com
(cherry picked from commit 66eb93e71a)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Be consistent in whether we flag a full modeset or a
fastset for the pipe. intel_modeset_all_pipes() would
seem to be the only codepath not getting this right.
The other case is when we flag the fastset initially,
currently we just clear the mode_changed flag and set
the update_pipe flag. But we could still have
connectors_changed==true or active_changed==true forcing
a full modeset anyway. So check for that after clearing
the mode_changed flag.
And let's add a WARN to make sure we did get it right.
v2: Deal with {connectors,active}_changed
Reviewed-by: Jani Nikula <jani.nikula@intel.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221021162442.27283-4-ville.syrjala@linux.intel.com
On BDW+ we have just the one set of DP M/N registers. The
values we write into said registers depends on whether we
want DRRS to be in high or low gear. This causes issues
for the state checker which currently has to assume either
set of M/N (high or low refresh rate) values may appear there.
That sort of works for M/N itself, but all other values
derived from the M/N (dotclock, pixel rate) are not handled
correctly, leading to potential for state checker mismatches.
Let's avoid all those problems by simply keeping DRRS in
high gear until the state checker has done its hardware
state readout.
Note that hitting this issue presumable became very hard
after commit 1b333c679a ("drm/i915: Do DRRS disable/enable
during pre/post_plane_update()") since the state check would
have to laze about for one full second (delay used by
intel_drrs_schedule_work()) to see the low refresh rate.
But it is still theoretically possible.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020120706.25728-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Since a7c01fa93a ("signal: break out of wait loops on kthread_stop()")
kthread_stop() started asserting a pending signal which wreaks havoc with
a few of our selftests. Mainly because they are not fully expecting to
handle signals, but also cutting the intended test runtimes short due
signal_pending() now returning true (via __igt_timeout), which therefore
breaks both the patterns of:
kthread_run()
..sleep for igt_timeout_ms to allow test to exercise stuff..
kthread_stop()
And check for errors recorded in the thread.
And also:
Main thread | Test thread
---------------+------------------------------
kthread_run() |
kthread_stop() | do stuff until __igt_timeout
| -- exits early due signal --
Where this kthread_stop() was assume would have a "join" semantics, which
it would have had if not the new signal assertion issue.
To recap, threads are now likely to catch a previously impossible
ERESTARTSYS or EINTR, marking the test as failed, or have a pointlessly
short run time.
To work around this start using kthread_work(er) API which provides
an explicit way of waiting for threads to exit. And for cases where
parent controls the test duration we add explicit signaling which threads
will now use instead of relying on kthread_should_stop().
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020130841.3845791-1-tvrtko.ursulin@linux.intel.com
We use all kinds of weird names for our base address registers.
Take the names from the spec and stick to them to avoid confusing
everyone.
The only exceptions are IOBAR and LMEMBAR since naming them
IOBAR_BAR and LMEMBAR_BAR looks too funny, and yet I think
that adding the _BAR to GTTMMADR & co. (which don't have one
in the spec name) does make it more clear what they are.
And IOBAR vs. GTTMMADR_BAR also looks a bit too inconsistent
for my taste.
v2: Fix gvt build
v3: Add GEN2_IO_BAR for completeness
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221005195646.17201-1-ville.syrjala@linux.intel.com
Acked-by: Matthew Auld <matthew.auld@intel.com>
DGFX platforms has lmem and cpu can access the lmem objects
via mmap and i915 internal i915_gem_object_pin_map() for
i915 own usages. Both of these methods has pre-requisite
requirement to keep GFX PCI endpoint in D0 for a supported
iomem transaction over PCI link. (Refer PCIe specs 5.3.1.4.1)
Both DG1/DG2 have a known hardware bug that violates the PCIe specs
and support the iomem read write transaction over PCIe bus despite
endpoint is D3 state.
Due to above H/W bug, we had never observed any issue with i915 runtime
PM versus lmem access.
But this issue becomes visible when PCIe gfx endpoint's upstream
bridge enters to D3, at this point any lmem read/write access will be
returned as unsupported request. But again this issue is not observed
on every platform because it has been observed on few host machines
DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
which really disables the PCIe power savings and leaves the bridge
at D0 state.
We need a unique interface to read/write from lmem with runtime PM
wakeref protection something similar to intel_uncore_{read, write},
keep autosuspend control to 'on' on all discrete platforms,
until we have a unique interface to read/write from lmem.
This just change the default autosuspend setting of i915 on dGPU,
user can still change it to 'auto'.
v2:
- Modified the commit message and subject with more information.
- Changed the Fixes tag to LMEM support commit. [Joonas]
- Changed !HAS_LMEM() Cond to !IS_DGFX(). [Rodrigo]
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014113258.1284226-1-anshuman.gupta@intel.com