Rename enum plane to enum i9xx_plane_id to make it clear that it only
applies to pre-SKL platforms.
enum i9xx_plane_id is a global identifier, whereas enum plane_id is
per-pipe. We need the old global identifier to index the primary plane
(and the pre-g4x sprite C if we ever expose it) registers on pre-SKL
platforms.
v2: Reorder patches
v3: s/old_plane_id/i9xx_plane_id/ (Daniel)
Pimp the commit message a bit
Note that i9xx_plane_id doesn't apply to SKL+
v4: Rebase due to power domain handling in plane readout
v5: Rebase due to crtc->dspaddr_offset removal
v6: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.com
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add a .get_hw_state() method for planes, returning true or false
depending on whether the plane is enabled. Use it to rewrite the
plane enabled/disabled asserts in platform agnostic fashion.
We do lose the pre-gen4 plane<->pipe mapping checks, but since we're
supposed sanitize that anyway it doesn't really matter.
v2: Reoder patches to not depend on enum old_plane_id
Just call assert_plane_disabled() from assert_planes_disabled()
v3: Deal with disabled power wells in .get_hw_state()
v4: Rebase due skl primary plane code removal
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Alex Villacís Lasso <alexvillacislasso@hotmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v2
Tested-by: Thierry Reding <thierry.reding@gmail.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-2-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bake in the conflict between the drm_print.h extraction and the
addition of DRM_DEBUG_LEASES since we lost it a few too many times.
Also fix a new use of drm_plane_helper_check_state in msm to follow
Ville's conversion in
commit a01cb8ba3f
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Nov 1 22:16:19 2017 +0200
drm: Move drm_plane_helper_check_state() into drm_atomic_helper.c
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
An earlier fix changed the return type from find_bb_size however the
integer return is being assigned to a unsigned int so the -ve error
check will never be detected. Make bb_size an int to fix this.
Detected by CoverityScan CID#1456886 ("Unsigned compared against 0")
Fixes: 1e3197d6ad ("drm/i915/gvt: Refine error handling for perform_bb_shadow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 24f8a29af4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
assert_rpm_wakelock_held is triggered from i915_pmic_bus_access_notifier
even though it gets unregistered on (runtime) suspend, this is caused
by a race happening under the following circumstances:
intel_runtime_pm_put does:
atomic_dec(&dev_priv->pm.wakeref_count);
pm_runtime_mark_last_busy(kdev);
pm_runtime_put_autosuspend(kdev);
And pm_runtime_put_autosuspend calls intel_runtime_suspend from
a workqueue, so there is ample of time between the atomic_dec() and
intel_runtime_suspend() unregistering the notifier. If the notifier
gets called in this windowd assert_rpm_wakelock_held falsely triggers
(at this point we're not runtime-suspended yet).
This commit adds disable_rpm_wakeref_asserts and
enable_rpm_wakeref_asserts calls around the
intel_uncore_forcewake_get(FORCEWAKE_ALL) call in
i915_pmic_bus_access_notifier fixing the false-positive WARN_ON.
Changes in v2:
-Reword comment explaining why disabling the wakeref asserts is
ok and necessary
Cc: stable@vger.kernel.org
Reported-by: FKr <bugs-freedesktop@ubermail.me>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110150301.9601-2-hdegoede@redhat.com
(cherry picked from commit ce30560c80)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
drivers/gpu/drm/i915/gvt/execlist.c:531:6: warning: symbol 'clean_execlist' was not declared. Should it be static?
drivers/gpu/drm/i915/gvt/execlist.c:545:6: warning: symbol 'reset_execlist' was not declared. Should it be static?
drivers/gpu/drm/i915/gvt/execlist.c:556:5: warning: symbol 'init_execlist' was not declared. Should it be static?
drivers/gpu/drm/i915/gvt/scheduler.c:248:6: warning: symbol 'release_shadow_wa_ctx' was not declared. Should it be static?
References: 06bb372f9a ("drm/i915/gvt: Introduce intel_vgpu_reset_submission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
I should have admitted defeat long ago as there has been a rare but
persistent error on Sandybridge where semaphore signaling did not
propagate to the waiter, leading to a GPU hang.
With the work on fence signaling for v4.9, the impact of using CPU driven
signaling was greatly reduced wrt to the latency of GPU semaphores,
though without logical rings support, the benefit of reordering work to
avoid bubbles is not realised (i.e. as it stands fence signaling is just
a slower, more costly version of HW semaphores; but works more
consistently). As a rough indicator of the difference,
with semaphores:
Sequential (3 engines, 1 processes): average 5.470us per cycle [expected 4.988us]
w/o semaphores:
Sequential (3 engines, 1 processes): average 15.771us per cycle [expected 4.923us]
In comparison, v3.4:
with semaphores:
Sequential (3 engines, 1 processes): average 16.066us per cycle [expected 11.842us]
w/o semaphores:
Sequential (3 engines, 1 processes): average 23.460us per cycle [expected 11.839us]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54226 #and 100+ dupes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-3-chris@chris-wilson.co.uk
Since removing the module parameter to force selection of ringbuffer
emission for gen8, the code is defunct. Remove it.
To put the difference into perspective, a couple of microbenchmarks
(bdw i7-5557u, 20170324):
ring execlists
exec continuous nops on all rings: 1.491us 2.223us
exec sequential nops on each ring: 12.508us 53.682us
single nop + sync: 9.272us 30.291us
vblank_mode=0 glxgears: ~11000fps ~9000fps
Since the earlier submission, gen8 ringbuffer submission has fallen
further and further behind in features. So while ringbuffer may hold the
throughput crown, in terms of interactive latency, execlists is much
better. Alas, we have no convenient metrics for such, other than
demonstrating things we can do with execlists but can not using
legacy ringbuffer submission.
We have made a few improvements to lowlevel execlists throughput,
and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026):
ring execlists
exec continuous nops on all rings: n/a 1.921us
exec sequential nops on each ring: n/a 44.621us
single nop + sync: n/a 21.953us
vblank_mode=0 glxgears: n/a ~18500fps
References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
drm_plane_helper_check_state() is supposed to do things the atomic way,
so it should not be inspecting crtc->enabled. Rather we should be
looking at crtc_state->enable.
We have a slight complication due to drm_plane_helper_check_update()
reusing drm_plane_helper_check_state() for non-atomic drivers. Thus
we'll have to pass the crtc_state in manally and construct a fake
crtc_state in drm_plane_helper_check_update().
v2: Fix the WARNs about plane_state->crtc matching crtc_state->crtc
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171101201558.6059-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The hardware needs some time to process the information received in the
ExecList Submission Port, and expects us to not write anything more until
it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or
PREEMPTED CSB event.
If we do not follow this, the driver could write new data into the ELSP
before HW had finishing fetching the previous one, putting us in
'undefined behaviour' space.
This seems to be the problem causing the spurious PREEMPTED & COMPLETE
events after a COMPLETE like the one below:
[] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3.
[] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007
[] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000
[] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE
[] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE
[] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006
[] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006
The ELSP writes that lead to this CSB sequence show that the HW hadn't
started executing the previous execlist (the one with only ctx 0x6) by the
time the new one was submitted; this is a bit more clear in the data
show in the EXECLIST_STATUS register at the time of the ELSP write.
[] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302
[] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302
[] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308
[] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308
Note that having to wait for this ack does not disable lite-restores,
although it may reduce their numbers.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that we have this stored in the device info, we can drop it from perf
part of the driver.
Note that this requires to init perf after we've computed the frequency,
hence why we move i915_perf_init() from i915_driver_init_early() to after
intel_device_info_runtime_init().
v2: Use div_u64 (Chris)
v3: Drop u64 divs by switching to kHz (Chris/Ville)
Move i915_perf_fini to i915_driver_cleanup_hw (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
As the request will, in the following patch, implicitly invoke a
context-switch on construction, we should precede that with a GPU TLB
invalidation. Also, even before using GGTT, we always want to invalidate
the TLBs for any updates (as well as the ppgtt invalidates that are
unconditionally applied by execbuf). Since we almost always require the
TLB invalidate, do it unconditionally on request allocation and so we can
remove it from all other paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Rodrigo gave a persuasive argument for keeping workarounds: that they
serve as a good guide for the bring up of the next generation. Not only
do workarounds persist into the early revisions, they show where the
workarounds were previously added to the code flow and sometimes the old
workarounds have an explanation that give insight into their wider
implications.
Based on his suggestion, document the policy that we want to keep the
workarounds from the current generation to guide the next. Older
preproduction workarounds we still want to remove to keep the code
clean.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets,
else we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old
state.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-2-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets, else
we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old state.
Changes since v1:
- Use intel_atomic_get_old_crtc_state. (ville)
Changes since v2:
- Always unset cxsr during modeset.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The firmware may have set up the pipe correctly, but the FIFO
underrun and CRC interrupts are likely not enabled.
This resulted in debugfs_test.read_all_entries failing on haswell,
because of a timeout when reading the crc debugfs entry.
Solve this by enabling FIFO underrun reporting after the initial
fastset, which lets interrupts be generated as expected.
Changes since v1:
- Always enable CPU FIFO underrun reporting for >GEN2,
and handle GEN2 correctly.
Changes since v2:
- Remove unneeded HAS_DDI, simplify GEN2 case.
Changes since v3:
- Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville)
- Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville)
Testcase: debugfs_test.read_all_entries
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113144043.58658-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The first test aims to check guc_init_doorbell_hw, changing the existing
guc clients and doorbells state before calling it.
The second test tries to create as many clients as it is currently possible
(currently limited to max number of doorbells) and exercise the doorbell
alloc/dealloc code.
Since our usage mode require very few clients/doorbells, this code has
been exercised very lightly and it's good to have a simple test for it.
As reference, this test already helped identify the bug fixed by
commit 7f1ea2ac30 ("drm/i915/guc: Fix doorbell id selection").
v2: Extend number of clients; check for client allocation failure when
number of doorbells is exceeded; validate client properties; reuse
guc_init_doorbell_hw (Chris).
v3: guc_init_doorbell_hw test added per Chris suggestion.
v4: Try to explain why guc_init_doorbell_hw exist and comment some
details in the subtest.
v5: Remove redundant pr_info at the beginning of each subtest (Chris);
rebase (s/i915_guc_client/intel_guc_client/).
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Pull AFS updates from David Howells:
"kAFS filesystem driver overhaul.
The major points of the overhaul are:
(1) Preliminary groundwork is laid for supporting network-namespacing
of kAFS. The remainder of the namespacing work requires some way
to pass namespace information to submounts triggered by an
automount. This requires something like the mount overhaul that's
in progress.
(2) sockaddr_rxrpc is used in preference to in_addr for holding
addresses internally and add support for talking to the YFS VL
server. With this, kAFS can do everything over IPv6 as well as
IPv4 if it's talking to servers that support it.
(3) Callback handling is overhauled to be generally passive rather
than active. 'Callbacks' are promises by the server to tell us
about data and metadata changes. Callbacks are now checked when
we next touch an inode rather than actively going and looking for
it where possible.
(4) File access permit caching is overhauled to store the caching
information per-inode rather than per-directory, shared over
subordinate files. Whilst older AFS servers only allow ACLs on
directories (shared to the files in that directory), newer AFS
servers break that restriction.
To improve memory usage and to make it easier to do mass-key
removal, permit combinations are cached and shared.
(5) Cell database management is overhauled to allow lighter locks to
be used and to make cell records autonomous state machines that
look after getting their own DNS records and cleaning themselves
up, in particular preventing races in acquiring and relinquishing
the fscache token for the cell.
(6) Volume caching is overhauled. The afs_vlocation record is got rid
of to simplify things and the superblock is now keyed on the cell
and the numeric volume ID only. The volume record is tied to a
superblock and normal superblock management is used to mediate
the lifetime of the volume fscache token.
(7) File server record caching is overhauled to make server records
independent of cells and volumes. A server can be in multiple
cells (in such a case, the administrator must make sure that the
VL services for all cells correctly reflect the volumes shared
between those cells).
Server records are now indexed using the UUID of the server
rather than the address since a server can have multiple
addresses.
(8) File server rotation is overhauled to handle VMOVED, VBUSY (and
similar), VOFFLINE and VNOVOL indications and to handle rotation
both of servers and addresses of those servers. The rotation will
also wait and retry if the server says it is busy.
(9) Data writeback is overhauled. Each inode no longer stores a list
of modified sections tagged with the key that authorised it in
favour of noting the modified region of a page in page->private
and storing a list of keys that made modifications in the inode.
This simplifies things and allows other keys to be used to
actually write to the server if a key that made a modification
becomes useless.
(10) Writable mmap() is implemented. This allows a kernel to be build
entirely on AFS.
Note that Pre AFS-3.4 servers are no longer supported, though this can
be added back if necessary (AFS-3.4 was released in 1998)"
* tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
afs: Protect call->state changes against signals
afs: Trace page dirty/clean
afs: Implement shared-writeable mmap
afs: Get rid of the afs_writeback record
afs: Introduce a file-private data record
afs: Use a dynamic port if 7001 is in use
afs: Fix directory read/modify race
afs: Trace the sending of pages
afs: Trace the initiation and completion of client calls
afs: Fix documentation on # vs % prefix in mount source specification
afs: Fix total-length calculation for multiple-page send
afs: Only progress call state at end of Tx phase from rxrpc callback
afs: Make use of the YFS service upgrade to fully support IPv6
afs: Overhaul volume and server record caching and fileserver rotation
afs: Move server rotation code into its own file
afs: Add an address list concept
afs: Overhaul cell database management
afs: Overhaul permit caching
afs: Overhaul the callback handling
afs: Rename struct afs_call server member to cm_server
...
Spec describe all values in MHz. We handle our
clocks in KHz. This includes the best_dco_centrality that was
forgot in the same unity as spec. Consequently we couldn't
get a good divider for high frequenies. Hence HDMI 2.0 wasn't
working.
Spec tells 999999 for initial best_dco_centrality meaning the
max value in MHz.
Since we convert dco from MHz to KHz we also need to convert
this initial best_doc_centrality to 999999000 or 999999999
or even better, to the max that its variable allow.
This patch also replaces the use of "* KHz(1)" with the values
directly on KHz to avoid future confusion.
v2: Use U32_MAX instead of random 99999 as spec tells. (Ville).
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114234223.10600-1-rodrigo.vivi@intel.com