UAPI Changes:
- Add uAPI for using PXP protected objects
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064
- Add PCI IDs and LMEM discovery/placement uAPI for DG1
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S
Cross-subsystem Changes:
- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"
Core Changes:
- Update ttm_move_memcpy for async use (Thomas)
Driver Changes:
- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
Sean, Anshuman)
See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)
- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)
- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add support to enable/disable PLANE_SURF Decryption Request bit.
It requires only to enable plane decryption support when following
condition met.
1. PXP session is enabled.
2. Buffer object is protected.
v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]
v3:
- intel_pxp_gem_object_status() API changes.
v4: use intel_pxp_is_active (Daniele)
v5: rebase and use the new protected object status checker (Daniele)
v6: used plane state for plane_decryption to handle async flip
as suggested by Ville.
v7: check pxp session while plane decrypt state computation. [Ville]
removed pointless code. [Ville]
v8 (Daniele): update PXP check
v9: move decrypt check after icl_check_nv12_planes() when overlays
have fb set (Juston)
v10 (Daniele): update PXP check again to match rework in earlier
patches and don't consider protection valid if the object has not
been used in an execbuf beforehand.
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Huang Sean Z <sean.z.huang@intel.com>
Cc: Gaurav Kumar <kumar.gaurav@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Juston Li <juston.li@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com> #v9
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-14-alan.previn.teres.alexis@intel.com
Adjust the link training code to accommodate per-lane drive settings,
if supported by the platform. Actually enabling this will involve
some changes to each platform's .set_signal_level() implementation,
so for the moment all supported platforms will keep using the current
codepath that just uses the same drive settings for all the lanes.
v2: Fix min() vs. max() fumble
v3: Compact the debug print to a single line
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211001130107.1746-10-ville.syrjala@linux.intel.com
Currently .set_signal_levels() is only used by encoders in DP mode.
For most modern platforms there is no essential difference between
DP and HDMI, and both codepaths just end up calling the same function
under the hood. Let's get remove the need for that extra indirection
by moving .set_signal_levels() into the encoder from intel_dp.
Since we already plumb the crtc_state/etc. into .set_signal_levels()
the code will do the right thing for both DP and HDMI.
HSW/BDW/SKL are the only platforms that need a bit of care on
account of having to preload the hardware buf_trans register
with the full set of values. So we must still remember to call
hsw_prepare_{dp,hdmi}_ddi_buffers() to do said preloading, and
.set_signal_levels() will just end up selecting the correct entry
for DP, and also setting up the iboost magic for both DP and HDMI.
Note that previously on HSW/BDW/SKL we did write to DDI_BUF_CTL to
select the correct entry until link training started, now that we
call .set_signal_levels() already from hsw_ddi_pre_enable_dp() that
is no longer the case. But it's all safe now that the
intel_ddi_init_dp_buf_reg() call was hoisted up and it no longer
sets up the DDI_BUF_CTL_ENABLE bit (that is still deferred until
link training).
v2: Rebase due to has_{iboost,buf_trans_select}()
Add some notes about the DDI_BUF_CTL situation on HSW/BDW/SKL (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211001130107.1746-4-ville.syrjala@linux.intel.com
The DP spec says:
"If the receiver keeps the same value in the ADJUST_REQUEST_LANEx_y
register(s) while the LANEx_CR_DONE bits remain unset, the transmitter
must loop four times with the same voltage swing. On the fifth time,
the transmitter must down-shift to the lower bit rate and must repeat
the CR-lock training sequence as described below."
Lets fix the code to follow that instead of terminating after five
times of transmitting the same signal levels. The text in spec feels
a little bit ambiguous still, but this is my best guess at its meaning.
As a bonus this also gets rid of the train_set[0] stuff which
would not work for per-lane drive settings anyway.
Cc: Imre Deak <imre.deak@intel.com>
CC: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211001160826.17080-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
While sanitizing the hardware state we're currently forcing
the pipe bottom color legacy csc/gamma bits on. That is not a
good idea as BIOSen are likely to leave gabage in the LUTs and
so doing this causes ugly visual glitches if and when the
planes covering the background get disabled. This was exactly
the case on this Dell Precision 5560 tgl laptop.
On icl+ we don't normally even use these legacy bits
anymore and instead use their GAMMA_MODE counterparts.
On earlier platforms the bits are used, but we still
shouldn't force them on without knowing what's in the LUT.
So two options, get rid of the whole thing, or do what
intel_color_commit() does to make sure the bottom color state
matches whatever out hardware readout produced. I chose the
latter since it'll match what happens on older platforms when
the primary plane gets turned off. In fact let's just call
intel_color_commit(). It'll also do some CSC programming but
since we don't have readout for that it'll actually just set
to all zeros. So in the unlikely case of CSC actually being
enabld by the BIOS we'll end up with all black until the first
atomic commit happens.
Still not totally sure what we should do about color management
features here in general. Probably the safest thing would be to
force everything off exactly at the same time when we disable
the primary plane as there is no guarantees that whatever the
LUTs/CSCs contain make any sense whatsoever without the
specific pixel data in the BIOS fb. And if we preserve the
primary plane then we should disable the color management
features exactly when the primary plane fb contents first
changes since the new content assumes more or less no
transformations. But of course synchronizing front buffer
rendering with anything else is a bit hard...
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3534
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210928185105.3030-1-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
The Wa_14014971508 is required to fix scanout when a feature that i915
do not support is enabled and this feature is not planned to be enabled
for adlp.
Keeping this workaround enabled can badly hurt power-savings when
a full frame fetch is required(see psr2_sel_fetch_plane_state_supported()
and psr2_sel_fetch_pipe_state_supported()).
Here a example that could badly hurt power-savings, userspace does
a page flip to a rotated plane, so CONTINUOS_FULL_FRAME set.
But then for a whole 30 seconds nothing in the screen requires updates
but because CONTINUOS_FULL_FRAME is set, it will not go into DC5/DC6.
Reverting Wa_14014971508 fixes that, as only a single frame will be
sent and then display can go to DC5/DC6 for those 30 seconds of
idleness.
BSpec: 54369
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930001409.254817-6-jose.souza@intel.com
Legacy cursor APIs are handled by intel_legacy_cursor_update(), that
calls drm_atomic_helper_update_plane() when going through the
slow/atomic path to update cursor, what was the case for PSR2
selective fetch.
drm_atomic_helper_update_plane() sets
drm_atomic_state->legacy_cursor_update to true when updating the
cursor plane, to allow several cursor updates to happen within the
same frame, as userspace does that.
If drivers waited for a vblank increment at the end of every cursor
movement that would cause a visible lag in the cursor.
But this optimization do not properly work with PSR2 selective fetch
dirt area calculation, for example if within a single frame the cursor
had 3 moves the final dirt area programmed to PSR2_MAN_TRK_CTL would
be based in the second movement as old state and third movement as new
state, not updating the area where cursor was in the first state.
So here switching back to the fast path approach in
intel_legacy_cursor_update() and handling cursor movements as
frontbuffer rendering(psr_force_hw_tracking_exit()), that is not the
most optimal for power-savings but is the solution that we have until
mailbox style updates is implemented.
Also removing the cursor workaround as not it is properly undestand
the issue and is know that it will never cover all the cases.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930001409.254817-5-jose.souza@intel.com
When PSR2 selective fetch is enabled writes to CURSURFLIVE alone do
not causes the panel to be updated when doing frontbuffer rendering.
From what I was able to figure from experiments the writes to
CURSURFLIVE takes PSR2 from deep sleep but panel is not updated
because PSR2_MAN_TRK_CTL has no start and end region set.
As we don't have the dirt area from current flush and invalidate API
and even if we did userspace could do several draws to frontbuffer and
we would need a way to append all the damaged areas of all the draws
that need to be part of next frame.
So here only programing PSR2_MAN_TRK_CTL to do a single full frame
fetch.
It is a safe approach as if scanout is in the visible area
the single full frame will only be visible for hardware in the next
frame because of the double buffering, and if scanout is in vblank
area it will be draw in the current frame.
No need to disable PSR and wait a few miliseconds to enable it again.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930001409.254817-4-jose.souza@intel.com
This unnecessary flushes are hurting power-savings are it causes
features like PSR, FBC and DRRS to disable it self to handle
frontbuffer rendering, below some explanation of why each removed
call is not necessary.
The flush in intel_prepare_plane_fb() is not required as framebuffer
will be flipped and power-saving features do the proper flip handling
in hardware.
intel_find_initial_plane_obj() flush is not required because it is
only executed during driver load and at this point the power-saving
features are not even enabled.
And the last one intelfb_create(), is also not required as at this
point the fbdev was just allocated, userspace will draw on
it what will trigger frontbuffer invalidates and flushes later on.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930001409.254817-3-jose.souza@intel.com
PSR2 selective is not supported over rotated and scaled planes.
We had the rotation check in intel_psr2_sel_fetch_config_valid()
but that code path is only execute when a modeset is needed and
those plane parameters can change without a modeset.
Pipe selective fetch restrictions are also needed, it could be added
in intel_psr_compute_config() but pippe scaling is computed after
it is executed, so leaving as is for now.
There is no much loss in this approach as it would cause selective
fetch to not enabled as for alderlake-P and newer will cause it to
switch to PSR1 that will have the same power-savings as do full pipe
fetch.
Also need to check those restricions in the second
for_each_oldnew_intel_plane_in_state() loop because the state could
only have a plane that is not affected by those restricitons but
the damaged area intersect with planes that has those restrictions,
so a full pipe fetch is required.
v2:
- also handling pipe restrictions
BSpec: 55229
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> # v1
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930001409.254817-1-jose.souza@intel.com
Setting DP_PORT_EN in intel_dp->DP is already handled by
intel_dp_enable_port() so there is no point in setting it also
from the link training code.
For DDI platforms a bit with that name doesn't even exist. The
counterpart is DDI_BUF_CTL_ENABLE, which is already set up by
intel_ddi_prepare_link_retrain(). Fortunately it is the same bit
so there was no harm in doing this from the platform independent
code as well. But it's just confusing when platform independent
code sets platform specific bits in intel_dp->DP. Just get rid
of it.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930134310.31669-3-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak.intel.com>
I want intel_dp->DP to be fully populated by the time the
initial vswing programming happens. To that end move the
intel_ddi_init_dp_buf_reg() call to an earlier spot.
Additionally we don't want intel_ddi_init_dp_buf_reg() to
set DDI_BUF_CTL_ENABLE since the port should only get enabled
at the start of link training (see intel_ddi_prepare_link_retrain()).
So any earlier write to the register should not set the enable bit.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930134310.31669-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Currently we clear the leftover vswing/preemphasis values only
at the start of link training. That means the initial vswing
programming performed during modeset is going to use stale values
left over from the previous link training sequence, and then at
the start of link training we're going to reset the levels back
to 0. Seems much better to make sure we start with level 0 from
the get go.
Additionally if LTTPRs are present the leftover vswing/preemphasis
values are those of the last link in the chain, so not the values
that our PHY is even using after a successful link training sequence.
So let's make sure everything is cleared up before we start
programming anything.
Suggested-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930134310.31669-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
With patch "drm/i915/vbt: Fix backlight parsing for VBT 234+"
the size of bdb_lfp_backlight_data structure has been increased,
causing if-statement in the parse_lfp_backlight function
that comapres this structure size to the one retrieved from BDB,
always to fail for older revisions.
This patch calculates expected size of the structure for a given
BDB version and compares it with the value gathered from BDB.
Tested on Chromebook Pixelbook (Nocturne) (reports bdb->version = 221)
Fixes: d381baad29 ("drm/i915/vbt: Fix backlight parsing for VBT 234+")
Tested-by: Lukasz Majczak <lma@semihalf.com>
Signed-off-by: Lukasz Majczak <lma@semihalf.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930134606.227234-1-lma@semihalf.com
Adjust the HSW+ transcoder state readout to just read through
all the possible transcoders for the pipe, and stuff the results
in a bitmask.
We can conveniently cross check the bitmask for invalid
combinations of enabled transcoders, and later we can easily
extend the bitmask readout to handle the bigjoiner case.
One slight change in behaviour is that we no longer read out
the AONOFF->force_pfit.pfit bit for all the enabled "panel
transcoders". But having more than one enabled would anyway
be illegal so no big loss. Also the AONOFF selection should
only ever be used on HSW, which only has the EDP transcoder
an no DSI transcoders.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210913144440.23008-10-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>