Add some (probably overkill) locking to protect the vblank
timestamping constants updates during seamless M/N fastsets.
As everything should be naturally aligned I think the individual
pieces should probably end up updating atomically enough. So this
is only really meant to guarantee everyone sees a consistent whole.
All the drm_vblank.c usage is covered by vblank_time_lock,
and uncore.lock will take care of __intel_get_crtc_scanline()
that can also be called from outside the core vblank functionality.
Currently only crtc_clock and framedur_ns can change, but in
the future might fastset also across eg. vtotal/vblank_end
changes, so let's just grab the locks across the whole thing.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310235828.17439-2-ville.syrjala@linux.intel.com
When we change the M/N values seamlessly during a fastset we should
also update the vblank timestamping stuff to make sure the vblank
timestamp corrections/guesstimations come out exact.
Note that only crtc_clock and framedur_ns can actually end up
changing here during fastsets. Everything else we touch can
only change during full modesets.
Technically we should try to do this exactly at the start of
vblank, but that would require some kind of double buffering
scheme. Let's skip that for now and just update things right
after the commit has been submitted to the hardware. This
means the information will be properly up to date when the
vblank irq handler goes to work. Only if someone ends up
querying some vblanky stuff in between the commit and start
of vblank may we see a slight discrepancy.
Also this same problem really exists for the DRRS downclocking
stuff. But as that is supposed to be more or less transparent
to the user, and it only drops to low gear after a long delay
(1 sec currently) we probably don't have to worry about it.
Any time something is actively submitting updates DRRS will
remain in high gear and so the timestamping constants will
match the hardware state.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Fixes: e6f29923c0 ("drm/i915: Allow M/N change during fastset on bdw+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310235828.17439-1-ville.syrjala@linux.intel.com
On MTL, objects allocated through i915_gem_object_create_internal() are
mapped as uncached in GPU by default because HAS_LLC is false. However
in the live_hwsp_read selftest these watcher objects are mapped as WB
on CPU side. The conseqence is that the updates done by the GPU are not
immediately visible to CPU, thus the selftest is randomly failing due to
the stale data in CPU cache. Solution can be either setting WC for CPU +
UC for GPU, or WB for CPU + 1-way coherent WB for GPU.
To keep the consistency, let's simply inherit the same cache settings
from the timeline, which is WB for CPU + 1-way coherent WB for GPU,
because this test is supposed to emulate the behavior of the timeline
anyway.
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230315180800.2632766-1-fei.yang@intel.com
Add definitions for various pipe timestamp registers:
- frame timestamp (last start of vblank) (g4x+), already had this defined
- flip timestamp (when SURF was last written) (g4x+)
- flipdone timestamp (when last flipdone was signalled) (tgl+)
Note that on pre-tgl the flip related timestamps are only updated
for primary plane flips, but on tgl+ we can select which plane
updates them (via PIPE_MISC2). Let's define those related bits
as well.
Curiously VLV/CHV do not have the frame/flip timestamp registers,
despite all the other related registers being inherited from g4x.
This means we can get rid of the pipe_offsets[] usage for these,
and thus the implicit dev_priv is gone as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230314130255.23273-4-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Write-combining memory allows speculative reads by CPU.
ggtt->error_capture is WC mapped to CPU, so CPU/MMU can try
to prefetch memory beyond the error_capture, ie it tries
to read memory pointed by next PTE in GGTT.
If this PTE points to invalid address DMAR errors will occur.
This behaviour was observed on ADL and RPL platforms.
To avoid it, guard scratch page should be added after error_capture.
The patch fixes the most annoying issue with error capture but
since WC reads are used also in other places there is a risk similar
problem can affect them as well.
v2:
- modified commit message (I hope the diagnosis is correct),
- added bug checks to ensure scratch is initialized on gen3 platforms.
CI produces strange stacktrace for it suggesting scratch[0] is NULL,
to be removed after resolving the issue with gen3 platforms.
v3:
- removed bug checks, replaced with gen check.
v4:
- change code for scratch page insertion to support all platforms,
- add info in commit message there could be more similar issues
v5:
- check for nop_clear_range instead of gen8 (Tvrtko),
- re-insert scratch pages on resume (Tvrtko)
v6:
- use scratch_range callback to set scratch pages (Chris)
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230308-guard_error_capture-v6-2-1b5f31422563@intel.com
Some panels generate long HPD events even while connected to
the port. This cause some unexpected CI execution issues. A
new flag is added to track if such spurious long HPDs can be
ignored and are not processed further if the flag is set.
Debugfs entry is added to control the ignore long hpd flag.
v2: Address patch styling comments (Jani Nikula)
v3: Ignoring the HPD moved to hotplug handler and now applies
to all types of outputs (Imre Deak)
v4: use debugfs_create_bool and squash patches (Jani Nikula)
Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230215083832.287519-2-vinod.govindapillai@intel.com
drm-misc-next for v6.4-rc1:
Note: Only changes since pull request from 2023-02-23 are included here.
UAPI Changes:
- Convert rockchip bindings to YAML.
- Constify kobj_type structure in dma-buf.
- FBDEV cmdline parser fixes, and other small fbdev fixes for mode
parsing.
Cross-subsystem Changes:
- Add Neil Armstrong as linaro maintainer.
- Actually signal the private stub dma-fence.
Core Changes:
- Add function for adding syncobj dep to sched_job and use it in panfrost, v3d.
- Improve DisplayID 2.0 topology parsing and EDID parsing in general.
- Add a gem eviction function and callback for generic GEM shrinker
purposes.
- Prepare to convert shmem helper to use the GEM reservation lock instead of own
locking. (Actual commit itself got reverted for now)
- Move the suballocator from radeon and amdgpu drivers to core in preparation
for Xe.
- Assorted small fixes and documentation.
- Fixes to HPD polling.
- Assorted small fixes in simpledrm, bridge, accel, shmem-helper,
and the selftest of format-helper.
- Remove dummy resource when ttm bo is created, and during pipelined
gutting. Fix all drivers to accept a NULL ttm_bo->resource.
- Handle pinned BO moving prevention in ttm core.
- Set drm panel-bridge orientation before connector is registered.
- Remove dumb_destroy callback.
- Add documentation to GEM_CLOSE, PRIME_HANDLE_TO_FD, PRIME_FD_TO_HANDLE, GETFB2 ioctl's.
- Add atomic enable_plane callback, use it in ast, mgag200, tidss.
Driver Changes:
- Use drm_gem_objects_lookup in vc4.
- Assorted small fixes to virtio, ast, bridge/tc358762, meson, nouveau.
- Allow virtio KMS to be disabled and compiled out.
- Add Radxa 8/10HD, Samsung AMS495QA01 panels.
- Fix ivpu compiler errors.
- Assorted fixes to drm/panel, malidp, rockchip, ivpu, amdgpu, vgem,
nouveau, vc4.
- Assorted cleanups, simplifications and fixes to vmwgfx.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ac1f5186-54bb-02f4-ac56-907f5b76f3de@linux.intel.com
The Wa_14017073508 require to send Media Busy/Idle mailbox while
accessing Media tile. As of now it is getting handled while __gt_unpark,
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark i.e. without sending Busy Mailbox especially during
register reads. Forcewakes are taken without busy mailbox leads to
GPU HANG. So bringing mailbox calls under forcewake calls are no feasible
option as forcewake calls are atomic and mailbox calls are blocking.
The issue already fixed in B step so disabling MC6 on A step and
reverting previous commit which handles Wa_14017073508
Fixes: 8f70f1ec58 ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310061339.2495416-2-badal.nilawar@intel.com
Users reported oopses on list corruptions when using i915 perf with a
number of concurrently running graphics applications. Root cause analysis
pointed at an issue in barrier processing code -- a race among perf open /
close replacing active barriers with perf requests on kernel context and
concurrent barrier preallocate / acquire operations performed during user
context first pin / last unpin.
When adding a request to a composite tracker, we try to reuse an existing
fence tracker, already allocated and registered with that composite. The
tracker we obtain may already track another fence, may be an idle barrier,
or an active barrier.
If the tracker we get occurs a non-idle barrier then we try to delete that
barrier from a list of barrier tasks it belongs to. However, while doing
that we don't respect return value from a function that performs the
barrier deletion. Should the deletion ever fail, we would end up reusing
the tracker still registered as a barrier task. Since the same structure
field is reused with both fence callback lists and barrier tasks list,
list corruptions would likely occur.
Barriers are now deleted from a barrier tasks list by temporarily removing
the list content, traversing that content with skip over the node to be
deleted, then populating the list back with the modified content. Should
that intentionally racy concurrent deletion attempts be not serialized,
one or more of those may fail because of the list being temporary empty.
Related code that ignores the results of barrier deletion was initially
introduced in v5.4 by commit d8af05ff38 ("drm/i915: Allow sharing the
idle-barrier from other kernel requests"). However, all users of the
barrier deletion routine were apparently serialized at that time, then the
issue didn't exhibit itself. Results of git bisect with help of a newly
developed igt@gem_barrier_race@remote-request IGT test indicate that list
corruptions might start to appear after commit 311770173f ("drm/i915/gt:
Schedule request retirement when timeline idles"), introduced in v5.5.
Respect results of barrier deletion attempts -- mark the barrier as idle
only if successfully deleted from the list. Then, before proceeding with
setting our fence as the one currently tracked, make sure that the tracker
we've got is not a non-idle barrier. If that check fails then don't use
that tracker but go back and try to acquire a new, usable one.
v3: use unlikely() to document what outcome we expect (Andi),
- fix bad grammar in commit description.
v2: no code changes,
- blame commit 311770173f ("drm/i915/gt: Schedule request retirement
when timeline idles"), v5.5, not commit d8af05ff38 ("drm/i915: Allow
sharing the idle-barrier from other kernel requests"), v5.4,
- reword commit description.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6333
Fixes: 311770173f ("drm/i915/gt: Schedule request retirement when timeline idles")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org # v5.5
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230302120820.48740-1-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 5060060557)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
It seems that commit bc3c5e0809 ("drm/i915/sseu: Don't try to store EU
mask internally in UAPI format") exposed a potential out-of-bounds
access, reported by UBSAN as following on a laptop with a gen 11 i915
card:
UBSAN: array-index-out-of-bounds in drivers/gpu/drm/i915/gt/intel_sseu.c:65:27
index 6 is out of range for type 'u16 [6]'
CPU: 2 PID: 165 Comm: systemd-udevd Not tainted 6.2.0-9-generic #9-Ubuntu
Hardware name: Dell Inc. XPS 13 9300/077Y9N, BIOS 1.11.0 03/22/2022
Call Trace:
<TASK>
show_stack+0x4e/0x61
dump_stack_lvl+0x4a/0x6f
dump_stack+0x10/0x18
ubsan_epilogue+0x9/0x3a
__ubsan_handle_out_of_bounds.cold+0x42/0x47
gen11_compute_sseu_info+0x121/0x130 [i915]
intel_sseu_info_init+0x15d/0x2b0 [i915]
intel_gt_init_mmio+0x23/0x40 [i915]
i915_driver_mmio_probe+0x129/0x400 [i915]
? intel_gt_probe_all+0x91/0x2e0 [i915]
i915_driver_probe+0xe1/0x3f0 [i915]
? drm_privacy_screen_get+0x16d/0x190 [drm]
? acpi_dev_found+0x64/0x80
i915_pci_probe+0xac/0x1b0 [i915]
...
According to the definition of sseu_dev_info, eu_mask->hsw is limited to
a maximum of GEN_MAX_SS_PER_HSW_SLICE (6) sub-slices, but
gen11_sseu_info_init() can potentially set 8 sub-slices, in the
!IS_JSL_EHL(gt->i915) case.
Fix this by reserving up to 8 slots for max_subslices in the eu_mask
struct.
Reported-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Fixes: bc3c5e0809 ("drm/i915/sseu: Don't try to store EU mask internally in UAPI format")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230220171858.131416-1-andrea.righi@canonical.com
(cherry picked from commit 3cba09a6ac)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The CI results for the 'fast request' patch set (enables error return
codes for fire-and-forget H2G messages) hit an issue with the KMD
sending context submission requests on an invalid context. That was
caused by a fault injection probe failing the context creation of a
kernel context. However, there was no return code checking on any of
the kernel context registration paths. So the driver kept going and
tried to use the kernel context for the record defaults process.
This would not cause any actual problems. The invalid requests would
be rejected by GuC and ultimately the start up sequence would
correctly wedge due to the context creation failure. But fixing the
issue correctly rather ignoring it means we won't get CI complaining
when the fast request patch lands and enables the extra error checking.
So fix it by checking for errors and aborting as appropriate when
creating kernel contexts. While at it, clean up some other submission
init related failure cleanup paths. Also, rename guc_init_lrc_mapping
to guc_init_submission as the former name hasn't been valid in a long
time.
v2: Add another wrapper to keep the flow balanced (Daniele)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230217223308.3449737-3-John.C.Harrison@Intel.com
The pipe needs a certain amount of time during vblank to prefill
sufficiently. If the vblank is too short the relevant watermark
level must be disabled.
Start implementing the necessary calculations to check this.
Scaler and DSC prefill are left out for now as handling those
is not entirely trivial.
Also the PSR latency reporting override chicken bits would
need to be correctly configured based on the results of these
calculations. Just add some FIXMEs for now.
TODO: bspec isn't exactly crystal clear in its explanations
so quite a few open questions remain...
v2: Skip inacive pipes
Handle SAGV latency
v3: Rebase
v4: Fix handling of disabled wm levels (latency == 0)
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306164854.25928-1-ville.syrjala@linux.intel.com