hsw/bdw lack the pipe register vs. display register distinction
in their PSR masking capabilities. So to keep our CURSURFLIVE
tricks working we need to just unmask all display register writes
on these platforms. The downside being that any display regitster
(eg. even SWF regs) will cause a PSR exit.
Note that WaMaskMMIOWriteForPSR asks us to mask this on bdw, but
that won't work since we need those CURSURFLIVE tricks. Observations
on actual hardware show that this causes one extra PSR exit ~every
10 seconds, which is pretty much irrelevant. I suspect this is
due to the pcode poking at IPS_CTL. Disabling IPS does not stop it
however, so either I'm wrong or pcode pokes at the register
regardless of whether it's actually trying to enable/disable IPS.
Also when the machine is busy (eg. just running 'find /') these
extra PSR exits cease, which again points at pcode or some other
PM entity as being the culprit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-11-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Implement WaPsrDPAMaskVBlankInSRD:hsw, which makes the hardware
generate the extra vblank between link training and first frame
being transmitted. This is the same thing that's controlled by
TRANS_CHICKEN[21] on skl+ (but due to the funky double buffering
it's effectively always at the rest value after DC5 exit). So
for consistent behaviour we want every platform to generate said
vblank. BDW is already setting this up correctly.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-9-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
PC8+ clobbers a bunch of displays registers which need to
be restored by hand or else we lost a bunch of workarounds.
The important ones for us are at least CHICKEN_PAR2* and
CHICKEN_PIPESL*.
Curiously at least some CHICKEN_PAR1* registers
are preserved by the hardware/firmware. Unfortunately Bspec
doens't really specify what gets clobbered vs. preserved
so further reverse engieering might be warranted to figure
out the specifics.
Note that PCH_LP_PARTITION_LEVEL_DISABLE is also set by
lpt_init_clock_gating() so the rmw in hsw_disable_pc8()
is now redundant. Remove it.
TODO: I suspect most gt stuff doesn't need this and we should
finish moving all of them from init_clock_gating() to
a more appropriate place...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609141404.12729-2-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
This seems to have existed for ever but is now more apparant after
commit 9bff18d134 ("drm/ttm: use per BO cleanup workers")
My analysis: two threads are running, one in the irq signalling the
fence, in dma_fence_signal_timestamp_locked, it has done the
DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the
callbacks.
The second thread in nouveau_cli_work_ready, where it sees the fence is
signalled, so then puts the fence, cleanups the object and frees the
work item, which contains the callback.
Thread one goes again and tries to call the callback and causes the
use-after-free.
Proposed fix: lock the fence signalled check in nouveau_cli_work_ready,
so either the callbacks are done or the memory is freed.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Fixes: 11e451e740 ("drm/nouveau: remove fence wait code from deferred client work handler")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/dri-devel/20230615024008.1600281-1-airlied@gmail.com/
drm/i915 feature pull #2 for v6.5:
Features and functionality:
- Meteorlake PM demand (Vinod, Mika)
- Switch to dedicated workqueues to stop using flush_scheduled_work() (Luca)
Refactoring and cleanups:
- Move display runtime init under display/ (Matt)
- Async flip error message clarifications (Arun)
Fixes:
- Remove 10bit gamma on desktop gen3 parts, they don't support it (Ville)
- Fix driver probe error handling if driver creation fails (Matt)
- Fix all -Wunused-but-set-variable warnings, and enable it for i915 (Jani)
- Stop using edid_blob_ptr (Jani)
- Fix log level for "CDS interlane align done" (Khaled)
- Fix an unnecessary include prefix (Matt)
Merges:
- Backmerge drm-next to sync with drm-intel-gt-next (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87o7lnpxz2.fsf@intel.com
[Why]
The sequence for collecting down_reply from source perspective should
be:
Request_n->repeat (get partial reply of Request_n->clear message ready
flag to ack DPRX that the message is received) till all partial
replies for Request_n are received->new Request_n+1.
Now there is chance that drm_dp_mst_hpd_irq() will fire new down
request in the tx queue when the down reply is incomplete. Source is
restricted to generate interveleaved message transactions so we should
avoid it.
Also, while assembling partial reply packets, reading out DPCD DOWN_REP
Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
wrapped up as a complete operation for reading out a reply packet.
Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
be risky. e.g. If the reply of the new request has overwritten the
DPRX DOWN_REP Sideband MSG buffer before source writing one to clear
DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
for the new request. Should handle the up request in the same way.
[How]
Separete drm_dp_mst_hpd_irq() into 2 steps. After acking the MST IRQ
event, driver calls drm_dp_mst_hpd_irq_send_new_request() and might
trigger drm_dp_mst_kick_tx() only when there is no on going message
transaction.
Changes since v1:
* Reworked on review comments received
-> Adjust the fix to let driver explicitly kick off new down request
when mst irq event is handled and acked
-> Adjust the commit message
Changes since v2:
* Adjust the commit message
* Adjust the naming of the divided 2 functions and add a new input
parameter "ack".
* Adjust code flow as per review comments.
Changes since v3:
* Update the function description of drm_dp_mst_hpd_irq_handle_event
Changes since v4:
* Change ack of drm_dp_mst_hpd_irq_handle_event() to be an array align
the size of esi[]
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If hmm_range_fault returns -EBUSY, we should call hmm_range_fault again
to validate the remaining pages. On one system with NUMA auto balancing
enabled, hmm_range_fault takes 6 seconds for 1GB range because CPU
migrate the range one page at a time. To be safe, increase timeout value
to 1 second for 128MB range.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The driver only supports OLED controllers that have a native DRM_FORMAT_C1
pixel format and that is why it has harcoded a division of the width by 8.
But the driver might be extended to support devices that have a different
pixel format. So it's better to use the struct drm_format_info helpers to
compute the size of the buffer, used to store the pixels in native format.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-6-javierm@redhat.com
The resolutions for these panels are fixed and defined in the Device Tree,
so there's no point to allocate the buffers on each plane update and that
can just be done once.
Let's do the allocation and free on the encoder enable and disable helpers
since that's where others initialization and teardown operations are done.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-5-javierm@redhat.com
Currently the driver hardcodes the default values to 96x16 pixels but this
default resolution depends on the controller. The datasheets for the chips
describes the following display controller resolutions:
- SH1106: 132 x 64 Dot Matrix OLED/PLED
- SSD1305: 132 x 64 Dot Matrix OLED/PLED
- SSD1306: 128 x 64 Dot Matrix OLED/PLED
- SSD1307: 128 x 39 Dot Matrix OLED/PLED
- SSD1309: 128 x 64 Dot Matrix OLED/PLED
Add this information to the devices' info structures, and use it set as a
default if not defined in DT rather than hardcoding to an arbitrary value.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230609170941.1150941-2-javierm@redhat.com
Update user space last_event_age when event age is enabled.
It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
drm_sched_entity_add_dependency_cb ignores the scheduled fence and return false.
If entity's dependency is a scheduler error fence and drm_sched_stop is called
due to TDR, drm_sched_entity_pop_job will wait for the dependency infinitely.
[How]
Do not wait or ignore the scheduled error fence, add drm_sched_entity_wakeup
callback for the dependency with scheduled error fence.
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
UMD is not aware of entity error, and will keep submitting jobs
into the error entity.
[How]
Add entity error check when getting entity from ctx.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of using the VRAM lost counter add a 64bit token which indicates
if a context or job is still valid to use.
Should the VRAM be lost or the page tables need re-creation the token will
change indicating that userspace needs to act and re-create the contexts
and re-submit the work.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When some problem with the updates of page tables is detected reset the
state machine of the VM and re-create all page tables from scratch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mark as experimental for now until we get closer to production
to avoid possible undesireable behavior when mixing newer
boards with older kernels.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PARTITION_MODE field in PARTITION_COMPUTE_STATUS register is defined as
below by firmware.
SPX = 0, DPX = 1, TPX = 2, QPX = 3, CPX = 4
Change driver definition accordingly.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>