Commit Graph

17886 Commits

Author SHA1 Message Date
Chris Wilson
79e6770cb1 drm/i915: Remove obsolete ringbuffer emission for gen8+
Since removing the module parameter to force selection of ringbuffer
emission for gen8, the code is defunct. Remove it.

To put the difference into perspective, a couple of microbenchmarks
(bdw i7-5557u, 20170324):
                                        ring          execlists
exec continuous nops on all rings:   1.491us            2.223us
exec sequential nops on each ring:  12.508us           53.682us
single nop + sync:                   9.272us           30.291us

vblank_mode=0 glxgears:            ~11000fps           ~9000fps

Since the earlier submission, gen8 ringbuffer submission has fallen
further and further behind in features. So while ringbuffer may hold the
throughput crown, in terms of interactive latency, execlists is much
better. Alas, we have no convenient metrics for such, other than
demonstrating things we can do with execlists but can not using
legacy ringbuffer submission.

We have made a few improvements to lowlevel execlists throughput,
and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026):

                                        ring          execlists
exec continuous nops on all rings:       n/a            1.921us
exec sequential nops on each ring:       n/a           44.621us
single nop + sync:                       n/a           21.953us

vblank_mode=0 glxgears:                  n/a          ~18500fps

References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
2017-11-20 21:54:58 +00:00
Chris Wilson
fb5c551ad5 drm/i915: Remove i915.enable_execlists module parameter
Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
2017-11-20 21:53:59 +00:00
Ville Syrjälä
a01cb8ba3f drm: Move drm_plane_helper_check_state() into drm_atomic_helper.c
drm_plane_helper_check_update() isn't a transitional helper, so let's
rename it to drm_atomic_helper_check_plane_state() and move it into
drm_atomic_helper.c.

v2: Fix the WARNs about plane_state->crtc matching crtc_state->crtc

Cc: Daniel Vetter <daniel@ffwll.ch>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171101201619.6175-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-11-20 21:14:22 +02:00
Ville Syrjälä
10b47ee02d drm: Check crtc_state->enable rather than crtc->enabled in drm_plane_helper_check_state()
drm_plane_helper_check_state() is supposed to do things the atomic way,
so it should not be inspecting crtc->enabled. Rather we should be
looking at crtc_state->enable.

We have a slight complication due to drm_plane_helper_check_update()
reusing drm_plane_helper_check_state() for non-atomic drivers. Thus
we'll have to pass the crtc_state in manally and construct a fake
crtc_state in drm_plane_helper_check_update().

v2: Fix the WARNs about plane_state->crtc matching crtc_state->crtc

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171101201558.6059-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-11-20 20:33:21 +02:00
Michel Thierry
ba74cb10c7 drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write
The hardware needs some time to process the information received in the
ExecList Submission Port, and expects us to not write anything more until
it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or
PREEMPTED CSB event.

If we do not follow this, the driver could write new data into the ELSP
before HW had finishing fetching the previous one, putting us in
'undefined behaviour' space.

This seems to be the problem causing the spurious PREEMPTED & COMPLETE
events after a COMPLETE like the one below:

[] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3.
[] vcs0:  Execlist CSB[0]: 0x00000018 _ 0x00000007
[] vcs0:  Execlist CSB[1]: 0x00000001 _ 0x00000000
[] vcs0:  Execlist CSB[2]: 0x00000018 _ 0x00000007  <<< COMPLETE
[] vcs0:  Execlist CSB[3]: 0x00000012 _ 0x00000007  <<< PREEMPTED & COMPLETE
[] vcs0:  Execlist CSB[4]: 0x00008002 _ 0x00000006
[] vcs0:  Execlist CSB[5]: 0x00000014 _ 0x00000006

The ELSP writes that lead to this CSB sequence show that the HW hadn't
started executing the previous execlist (the one with only ctx 0x6) by the
time the new one was submitted; this is a bit more clear in the data
show in the EXECLIST_STATUS register at the time of the ELSP write.

[] vcs0: ELSP[0] = 0x0_0        [execlist1] - status_reg = 0x0_302
[] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302

[] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308
[] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308

Note that having to wait for this ack does not disable lite-restores,
although it may reduce their numbers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-20 17:01:38 +00:00
Chris Wilson
2a6c4241fc drm/i915/selftest: Make guc clients static
Make the private array used for stashing test clients static, to silence
sparse.

References: 55bd6bd757 ("drm/i915/selftests: Add a GuC doorbells selftest")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120132606.4254-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2017-11-20 16:50:27 +00:00
Lionel Landwerlin
9f9b2792b6 drm/i915/perf: reuse timestamp frequency from device info
Now that we have this stored in the device info, we can drop it from perf
part of the driver.

Note that this requires to init perf after we've computed the frequency,
hence why we move i915_perf_init() from i915_driver_init_early() to after
intel_device_info_runtime_init().

v2: Use div_u64 (Chris)

v3: Drop u64 divs by switching to kHz (Chris/Ville)
    Move i915_perf_fini to i915_driver_cleanup_hw (Matthew)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
2017-11-20 16:09:04 +00:00
Chris Wilson
3fef5cda97 drm/i915: Automatic i915_switch_context for legacy
During request construction, after pinning the context we know whether
or not we have to emit a context switch. So move this common operation
from every caller into i915_gem_request_alloc() itself.

v2: Always submit the request if we emitted some commands during request
construction, as typically it also involves changes in global state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk
2017-11-20 15:56:16 +00:00
Chris Wilson
2113184c6f drm/i915: Pull the unconditional GPU cache invalidation into request construction
As the request will, in the following patch, implicitly invoke a
context-switch on construction, we should precede that with a GPU TLB
invalidation. Also, even before using GGTT, we always want to invalidate
the TLBs for any updates (as well as the ppgtt invalidates that are
unconditionally applied by execbuf). Since we almost always require the
TLB invalidate, do it unconditionally on request allocation and so we can
remove it from all other paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-20 15:56:16 +00:00
Lionel Landwerlin
7c52a2219d drm/i915/perf: replace .reg accesses with i915_mmio_reg_offset
This replaces accesses to the reg field of the i915_reg_t structure
with the i915_mmio_reg_offset() inline function.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ewelina Musial <ewelina.musial@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-2-lionel.g.landwerlin@intel.com
2017-11-20 15:46:02 +00:00
Chris Wilson
1f5f9edb44 drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE | COMPLETE events
If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can
assert that if we see a COMPLETE | PREEMPTED event, then it should be
impossible for it to also contain an IDLE_ACTIVE flag.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20 14:50:45 +00:00
Chris Wilson
d8747afb11 drm/i915/execlists: Reduce completed event mask to COMPLETE | PREEMPTED
Since we get a COMPLETE event when the context switch occurs on
RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs
before that point, COMPLETE | PREEMPTED should cover all possible context
switch completion events. We can move the ELEMENT_SWITCH info message
from the COMPLETED_MASK into an assertion for when we are performing a
switch to port[1].

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20 14:50:03 +00:00
Chris Wilson
e40dd22624 drm/i915/execlists: Listen to COMPLETE context event not ACTIVE_IDLE
Since commit e1fee72c2e
Author: Oscar Mateo <oscar.mateo@intel.com>
Date:   Thu Jul 24 17:04:40 2014 +0100

    drm/i915/bdw: Avoid non-lite-restore preemptions

execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting
when one context completed and it either continued onto the next (in port
1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final
context-switch event, but on recent gen it appears that we now get
separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE
events may not be coupled to a context (since it is a general state rather
than a specific context completion event).

v2: Update the history, execlists did originally start out by listening
to the COMPLETE event not ACTIVE_IDLE.
v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE.

References: bspec/12255
References: https://bugs.freedesktop.org/show_bug.cgi?id=103800
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
2017-11-20 14:49:16 +00:00
Ville Syrjälä
675f7ff35b drm/i915: Fix init_clock_gating for resume
Moving the init_clock_gating() call from intel_modeset_init_hw() to
intel_modeset_gem_init() had an unintended effect of not applying
some workarounds on resume. This, for example, cause some kind of
corruption to appear at the top of my IVB Thinkpad X1 Carbon LVDS
screen after hibernation. Fix the problem by explicitly calling
init_clock_gating() from the resume path.

I really hope this doesn't break something else again. At least
the problems reported at https://bugs.freedesktop.org/show_bug.cgi?id=103549
didn't make a comeback, even after a hibernate cycle.

v2: Reorder the init_clock_gating vs. modeset_init_hw to match
    the display reset path (Rodrigo)

Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 6ac4327276 ("drm/i915: Move init_clock_gating() back to where it was")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116160215.25715-1-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-20 14:53:43 +02:00
Rodrigo Vivi
010d118c20 drm/i915: Update DRIVER_DATE to 20171117
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-17 14:47:02 -08:00
Chris Wilson
6a7a6a982a drm/i915: Add a policy note for removing workarounds
Rodrigo gave a persuasive argument for keeping workarounds: that they
serve as a good guide for the bring up of the next generation. Not only
do workarounds persist into the early revisions, they show where the
workarounds were previously added to the code flow and sometimes the old
workarounds have an explanation that give insight into their wider
implications.

Based on his suggestion, document the policy that we want to keep the
workarounds from the current generation to guide the next. Older
preproduction workarounds we still want to remove to keep the code
clean.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-17 18:20:36 +00:00
Chris Wilson
223c73a366 drm/i915/selftests: Report ENOMEM clearly for an allocation failure
If we can not run the drunk_hole test because we couldn't allocate the
memory for the permutation array (even after we tried trimming the
size), report a clear ENOMEM. Similary, if we are asked to operate on a
hole too small for ourselves, make it skip quietly.

v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL.
v3: Fixup similar construction for lowlevel_hole
v4: Use u64 >> 1 to avoid 64b div.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117101732.4335-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117162945.16390-1-chris@chris-wilson.co.uk
2017-11-17 18:20:36 +00:00
Radhakrishna Sripada
0cfecb7c4b Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"
This reverts commit 8f067837c4.

HSD says "WA withdrawn. It was causing corruption with some images.
WA is not strictly necessary since this bug just causes loss of FBC
compression with some sizes and images, but doesn't break anything."

Fixes: 8f067837c4 ("drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117010825.23118-1-radhakrishna.sripada@intel.com
2017-11-17 09:52:14 -08:00
Maarten Lankhorst
248c2435cb drm/i915: Calculate g4x intermediate watermarks correctly
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.

When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.

CXSR must always be disabled in the intermediate case for modesets,
else we get a WARN for vblank wait timeout.

Also rename crtc_state to new_crtc_state, to distinguish it from the old
state.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-2-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17 15:24:07 +01:00
Maarten Lankhorst
5b9489cb8e drm/i915: Calculate vlv/chv intermediate watermarks correctly, v3.
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.

When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.

CXSR must always be disabled in the intermediate case for modesets, else
we get a WARN for vblank wait timeout.

Also rename crtc_state to new_crtc_state, to distinguish it from the old state.

Changes since v1:
- Use intel_atomic_get_old_crtc_state. (ville)
Changes since v2:
- Always unset cxsr during modeset.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17 15:23:36 +01:00
Maarten Lankhorst
199ea381d9 drm/i915: Pass crtc_state to ips toggle functions, v2
Changes since v1:
- Only pass crtc_state, not crtc.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-8-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17 12:14:25 +01:00
Maarten Lankhorst
93313538c1 drm/i915: Pass idle crtc_state to intel_dp_sink_crc
IPS can only be enabled if the primary plane is visible, so
first make sure sw state matches hw state by waiting for hw_done.

After this pass crtc_state to intel_dp_sink_crc() so that can be used,
instead of using legacy pointers.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-7-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17 12:14:25 +01:00
Maarten Lankhorst
33a49868e5 drm/i915: Enable FIFO underrun reporting after initial fastset, v4.
The firmware may have set up the pipe correctly, but the FIFO
underrun and CRC interrupts are likely not enabled.

This resulted in debugfs_test.read_all_entries failing on haswell,
because of a timeout when reading the crc debugfs entry.

Solve this by enabling FIFO underrun reporting after the initial
fastset, which lets interrupts be generated as expected.

Changes since v1:
- Always enable CPU FIFO underrun reporting for >GEN2,
  and handle GEN2 correctly.
Changes since v2:
- Remove unneeded HAS_DDI, simplify GEN2 case.
Changes since v3:
- Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville)
- Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville)

Testcase: debugfs_test.read_all_entries
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113144043.58658-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17 12:14:25 +01:00
Chris Wilson
41729bf224 drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
Commit  21cc6431e0 ("drm/i915: Mark the userptr invalidate workqueue
as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning
for hitting i915_gem_userptr_mn_invalidate_range_start from within the
shrinker, but I failed to notice userptr has 2 similarly named
workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas
we only wait upon i915-userptr-release from inside the reclaim paths.

[62530.869510] workqueue: PF_MEMALLOC task 7983(gem_shrink) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
[62530.869515] ------------[ cut here ]------------
[62530.869519] WARNING: CPU: 1 PID: 7983 at kernel/workqueue.c:2434 check_flush_dependency+0x7f/0x110
[62530.869519] Modules linked in: pegasus mii ip6table_filter ip6_tables bnep iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul 8250_dw ghash_clmulni_intel snd_seq pcbc snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 iwlwifi btintel crypto_simd glue_helper cryptd bluetooth snd intel_cstate input_leds idma64 intel_rapl_perf ecdh_generic serio_raw soundcore cfg80211 wmi_bmof virt_dma intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio winbond_cir soc_button_array rc_core spidev tpm_crb intel_hid acpi_pad mac_hid sparse_keymap
[62530.869546]  parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid
[62530.869557] CPU: 1 PID: 7983 Comm: gem_shrink Tainted: G     U  W    L  4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1
[62530.869558] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017
[62530.869559] task: ffffa1049dbeec80 task.stack: ffffae7d05c44000
[62530.869560] RIP: 0010:check_flush_dependency+0x7f/0x110
[62530.869561] RSP: 0018:ffffae7d05c473a0 EFLAGS: 00010286
[62530.869562] RAX: 000000000000006e RBX: ffffa1049540f400 RCX: ffffffffa3e55788
[62530.869562] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000202
[62530.869563] RBP: ffffae7d05c473c0 R08: 000000000000006e R09: 000000000038bb0e
[62530.869563] R10: 0000000000000000 R11: 000000000000006e R12: ffffa1049dbeec80
[62530.869564] R13: 0000000000000000 R14: 0000000000000000 R15: ffffae7d05c473e0
[62530.869565] FS:  00007f621b129880(0000) GS:ffffa1050b240000(0000) knlGS:0000000000000000
[62530.869566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[62530.869566] CR2: 00007f6214400000 CR3: 0000000353a17003 CR4: 00000000003606e0
[62530.869567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[62530.869567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[62530.869568] Call Trace:
[62530.869570]  flush_workqueue+0x115/0x3d0
[62530.869573]  ? wake_up_process+0x15/0x20
[62530.869596]  i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
[62530.869614]  ? i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
[62530.869616]  __mmu_notifier_invalidate_range_start+0x55/0x80
[62530.869618]  try_to_unmap_one+0x791/0x8b0
[62530.869620]  ? call_rwsem_down_read_failed+0x18/0x30
[62530.869622]  rmap_walk_anon+0x10b/0x260
[62530.869624]  rmap_walk+0x48/0x60
[62530.869625]  try_to_unmap+0x93/0xf0
[62530.869626]  ? page_remove_rmap+0x2a0/0x2a0
[62530.869627]  ? page_not_mapped+0x20/0x20
[62530.869629]  ? page_get_anon_vma+0x90/0x90
[62530.869630]  ? invalid_mkclean_vma+0x20/0x20
[62530.869631]  migrate_pages+0x946/0xaa0
[62530.869633]  ? __ClearPageMovable+0x10/0x10
[62530.869635]  ? isolate_freepages_block+0x3c0/0x3c0
[62530.869636]  compact_zone+0x22f/0x970
[62530.869638]  compact_zone_order+0xa3/0xd0
[62530.869640]  try_to_compact_pages+0x1a5/0x2a0
[62530.869641]  ? try_to_compact_pages+0x1a5/0x2a0
[62530.869643]  __alloc_pages_direct_compact+0x50/0x110
[62530.869644]  __alloc_pages_slowpath+0x4da/0xf30
[62530.869646]  __alloc_pages_nodemask+0x262/0x280
[62530.869648]  alloc_pages_vma+0x165/0x1e0
[62530.869649]  shmem_alloc_hugepage+0xd0/0x130
[62530.869651]  ? __radix_tree_insert+0x45/0x230
[62530.869652]  ? __vm_enough_memory+0x29/0x130
[62530.869654]  shmem_alloc_and_acct_page+0x10d/0x1e0
[62530.869655]  shmem_getpage_gfp+0x426/0xc00
[62530.869657]  shmem_fault+0xa0/0x1e0
[62530.869659]  ? file_update_time+0x60/0x110
[62530.869660]  __do_fault+0x1e/0xc0
[62530.869661]  __handle_mm_fault+0xa35/0x1170
[62530.869662]  handle_mm_fault+0xcc/0x1c0
[62530.869664]  __do_page_fault+0x262/0x4f0
[62530.869666]  do_page_fault+0x2e/0xe0
[62530.869667]  page_fault+0x22/0x30
[62530.869668] RIP: 0033:0x404335
[62530.869669] RSP: 002b:00007fff7829e420 EFLAGS: 00010216
[62530.869670] RAX: 00007f6210400000 RBX: 0000000000000004 RCX: 0000000000b80000
[62530.869670] RDX: 0000000000002e01 RSI: 0000000000008000 RDI: 0000000000000004
[62530.869671] RBP: 0000000000000019 R08: 0000000000000002 R09: 0000000000000000
[62530.869671] R10: 0000000000000559 R11: 0000000000000246 R12: 0000000008000000
[62530.869672] R13: 00000000004042f0 R14: 0000000000000004 R15: 000000000000007e
[62530.869673] Code: 00 8b b0 18 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 c0 06 00 00 4d 89 f0 48 c7 c7 40 c0 c8 a3 c6 05 68 c5 e8 00 01 e8 c2 68 04 00 <0f> ff 4d 85 ed 74 18 49 8b 45 20 48 8b 70 08 8b 86 00 01 00 00
[62530.869691] ---[ end trace 01e01ad0ff5781f8 ]---

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739
Fixes: 21cc6431e0 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-11-17 10:58:55 +00:00
Chris Wilson
290b20a6d3 drm/i915: Add might_sleep() check to wait_for()
We should long past the time of trying to use wait_for() from inside
atomic contexts, so add a might_sleep() check to prevent misuse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114215655.4849-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-11-17 10:38:04 +00:00
Michel Thierry
55bd6bd757 drm/i915/selftests: Add a GuC doorbells selftest
The first test aims to check guc_init_doorbell_hw, changing the existing
guc clients and doorbells state before calling it.

The second test tries to create as many clients as it is currently possible
(currently limited to max number of doorbells) and exercise the doorbell
alloc/dealloc code.

Since our usage mode require very few clients/doorbells, this code has
been exercised very lightly and it's good to have a simple test for it.

As reference, this test already helped identify the bug fixed by
commit 7f1ea2ac30 ("drm/i915/guc: Fix doorbell id selection").

v2: Extend number of clients; check for client allocation failure when
number of doorbells is exceeded; validate client properties; reuse
guc_init_doorbell_hw (Chris).

v3: guc_init_doorbell_hw test added per Chris suggestion.

v4: Try to explain why guc_init_doorbell_hw exist and comment some
details in the subtest.

v5: Remove redundant pr_info at the beginning of each subtest (Chris);
rebase (s/i915_guc_client/intel_guc_client/).

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-17 10:02:39 +00:00
Rodrigo Vivi
3dd435ef69 Merge tag 'gvt-next-2017-11-16' of https://github.com/intel/gvt-linux into drm-intel-next-queued
gvt-next-2017-11-16

- CSB HWSP update support (Weinan)
- GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo)
- full virtualized opregion (Xiaolin)
- VM health check for sane fallback (Fred)
- workload submission code refactor for future enabling (Zhi)
- Updated repo URL in MAINTAINERS (Zhenyu)
- other many misc fixes

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116092007.ww5bvfx7rf36bjmn@zhen-hp.sh.intel.com
2017-11-16 12:12:28 -08:00
Linus Torvalds
487e2c9f44 Merge tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS updates from David Howells:
 "kAFS filesystem driver overhaul.

  The major points of the overhaul are:

   (1) Preliminary groundwork is laid for supporting network-namespacing
       of kAFS. The remainder of the namespacing work requires some way
       to pass namespace information to submounts triggered by an
       automount. This requires something like the mount overhaul that's
       in progress.

   (2) sockaddr_rxrpc is used in preference to in_addr for holding
       addresses internally and add support for talking to the YFS VL
       server. With this, kAFS can do everything over IPv6 as well as
       IPv4 if it's talking to servers that support it.

   (3) Callback handling is overhauled to be generally passive rather
       than active. 'Callbacks' are promises by the server to tell us
       about data and metadata changes. Callbacks are now checked when
       we next touch an inode rather than actively going and looking for
       it where possible.

   (4) File access permit caching is overhauled to store the caching
       information per-inode rather than per-directory, shared over
       subordinate files. Whilst older AFS servers only allow ACLs on
       directories (shared to the files in that directory), newer AFS
       servers break that restriction.

       To improve memory usage and to make it easier to do mass-key
       removal, permit combinations are cached and shared.

   (5) Cell database management is overhauled to allow lighter locks to
       be used and to make cell records autonomous state machines that
       look after getting their own DNS records and cleaning themselves
       up, in particular preventing races in acquiring and relinquishing
       the fscache token for the cell.

   (6) Volume caching is overhauled. The afs_vlocation record is got rid
       of to simplify things and the superblock is now keyed on the cell
       and the numeric volume ID only. The volume record is tied to a
       superblock and normal superblock management is used to mediate
       the lifetime of the volume fscache token.

   (7) File server record caching is overhauled to make server records
       independent of cells and volumes. A server can be in multiple
       cells (in such a case, the administrator must make sure that the
       VL services for all cells correctly reflect the volumes shared
       between those cells).

       Server records are now indexed using the UUID of the server
       rather than the address since a server can have multiple
       addresses.

   (8) File server rotation is overhauled to handle VMOVED, VBUSY (and
       similar), VOFFLINE and VNOVOL indications and to handle rotation
       both of servers and addresses of those servers. The rotation will
       also wait and retry if the server says it is busy.

   (9) Data writeback is overhauled. Each inode no longer stores a list
       of modified sections tagged with the key that authorised it in
       favour of noting the modified region of a page in page->private
       and storing a list of keys that made modifications in the inode.

       This simplifies things and allows other keys to be used to
       actually write to the server if a key that made a modification
       becomes useless.

  (10) Writable mmap() is implemented. This allows a kernel to be build
       entirely on AFS.

  Note that Pre AFS-3.4 servers are no longer supported, though this can
  be added back if necessary (AFS-3.4 was released in 1998)"

* tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
  afs: Protect call->state changes against signals
  afs: Trace page dirty/clean
  afs: Implement shared-writeable mmap
  afs: Get rid of the afs_writeback record
  afs: Introduce a file-private data record
  afs: Use a dynamic port if 7001 is in use
  afs: Fix directory read/modify race
  afs: Trace the sending of pages
  afs: Trace the initiation and completion of client calls
  afs: Fix documentation on # vs % prefix in mount source specification
  afs: Fix total-length calculation for multiple-page send
  afs: Only progress call state at end of Tx phase from rxrpc callback
  afs: Make use of the YFS service upgrade to fully support IPv6
  afs: Overhaul volume and server record caching and fileserver rotation
  afs: Move server rotation code into its own file
  afs: Add an address list concept
  afs: Overhaul cell database management
  afs: Overhaul permit caching
  afs: Overhaul the callback handling
  afs: Rename struct afs_call server member to cm_server
  ...
2017-11-16 11:41:22 -08:00
Rodrigo Vivi
9672a69c76 drm/i915/cnl: Extend HDMI 2.0 support to CNL.
Starting on GLK we support HDMI 2.0. So this patch only
extend the work Shashank has made to GLK to CNL.

v2: The version that compiles :/
v3: Invert order to newer || older platforms check. (Ville).

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115184205.8104-1-rodrigo.vivi@intel.com
2017-11-16 09:45:52 -08:00
Rodrigo Vivi
8a00678a09 drm/i915/cnl: Simplify dco_fraction calculation.
I confess I never fully understood that previous calculation,
so this is not a "fix". But let's simplify this math
so poor brains like mine can read and make some sense of
it in the future.

v2: Don't follow the spec since that gives invalid
    values and it is also confusing. This Ville's
    version is much simpler.
v3: Use u64 cast instead of declaring a u64 dco. (Ville).

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115184257.8633-1-rodrigo.vivi@intel.com
2017-11-16 09:45:39 -08:00
Rodrigo Vivi
cacf6fe7c6 drm/i915/cnl: Don't blindly replace qdiv.
Accordingly to spec "If Kdiv != 2, then Qdiv must be 1."
but we already handle qdiv values properly and this case here
should be spurious. But instead of blindly replacing let's
warn loudly instead. Because it means something was really
wrong on initial setup.

Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-6-rodrigo.vivi@intel.com
2017-11-16 09:45:16 -08:00
Rodrigo Vivi
063c886197 drm/i915/cnl: Fix wrpll math for higher freqs.
Spec describe all values in MHz. We handle our
clocks in KHz. This includes the best_dco_centrality that was
forgot in the same unity as spec. Consequently we couldn't
get a good divider for high frequenies. Hence HDMI 2.0 wasn't
working.

Spec tells 999999 for initial best_dco_centrality meaning the
max value in MHz.
Since we convert dco from MHz to KHz we also need to convert
this initial best_doc_centrality to 999999000 or 999999999
or even better, to the max that its variable allow.

This patch also replaces the use of "* KHz(1)" with the values
directly on KHz to avoid future confusion.

v2: Use U32_MAX instead of random 99999 as spec tells. (Ville).

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114234223.10600-1-rodrigo.vivi@intel.com
2017-11-16 09:44:48 -08:00
Rodrigo Vivi
5eca81de88 drm/i915/cnl: Fix, simplify and unify wrpll variable sizes.
- 64 bits is not needed for afe_clock now we don't convert
  that to Hz.
- 16 bits is not enough for all dco stuff.
- unsigned is not relevant/needed for all divisors values.

Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-4-rodrigo.vivi@intel.com
2017-11-16 09:44:18 -08:00
Rodrigo Vivi
ecc2069a02 drm/i915/cnl: Remove useless conversion.
No functional change. Just starting the wrpll fixes
with a clean-up to make units a bit more clear.

Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-3-rodrigo.vivi@intel.com
2017-11-16 09:44:05 -08:00
Rodrigo Vivi
ec2f343e72 drm/i915/cnl: Remove spurious central_freq.
"Display software must leave this field at the default value.
It no longer needs to be configured as part of PLL programming."

We respect this already and we are setting up the default
one line below: "DPLL_CFGCR1_CENTRAL_FREQ".

Also we don't touch anywhere else this central_freq for cnl.
So let's remove from the final write.

No functional change. Only a clean-up patch.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-2-rodrigo.vivi@intel.com
2017-11-16 09:43:37 -08:00
Chris Wilson
4fe95b042d drm/i915/selftests: exercise_ggtt may have nothing to do
When operating on the live_ggtt we have to find a usuable hole for our
test. It is possible for there to be no hole we can use, so initialise
the err to 0 for the early exit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115152558.31252-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-16 15:47:17 +00:00
Ville Syrjälä
738a814386 drm/i915: Don't sanitize frame start delay if the pipe is off
Avoid touching PIPECONF in intel_sanitize_crtc() unless the pipe is
actually on. Should cure some unclaimed register accesses during reset,
as we are rather cavalier in our approach to powerdomain management.

We don't have to sanitize this if the pipe is off since we will
overwrite the frame start delay anyway when turning the pipe on.

v2: Amended commit message to implicate the reset path (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102249
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115200442.15051-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 17:15:57 +02:00
Sagar Arun Kamble
a269574489 drm/i915/guc: Rename i915_guc_submission.c|h to intel_guc_submission.c|h
With all component structures and functions named appropriately, change
the names of GuC submission source files. There were bunch of style issues
in guc_submission.c that are highlighted now by checkpatch. Fix those.
Update name in Documentation/gpu. (Joonas)

v2: Rebase.

v3: Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-6-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:06:18 +00:00
Sagar Arun Kamble
5afc8b49e4 drm/i915/guc: Rename i915_guc_client struct to intel_guc_client
GuC submission clients are currently being used in kernel only hence
update the structure name to intel_guc_client.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-5-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:04:17 +00:00
Sagar Arun Kamble
db14d0c5ae drm/i915/guc: Update name and prototype of GuC submission interface functions
i915 GuC submission is hardware interface and GuC APIs that are not user
facing should be named intel_guc* hence we change GuC submission related
functions name prefix to intel_guc. Also changed the parameter to these
functions to intel_guc struct.

v2: Using local guc variable in intel_uc_fini_hw. (Michal Wajdeczko)
Rebase.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-4-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:03:44 +00:00
Sagar Arun Kamble
6fa1f6fbb1 drm/i915/guc: Update names of submission related static functions
i915_guc_submit, i915_guc_dequeue, i915_guc_submission_park and
i915_guc_submission_upark are functions internal to GuC submission
hence remove "i915_" prefix.

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-3-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:02:06 +00:00
Sagar Arun Kamble
c6dce8f140 drm/i915: Update execlists tasklet naming
intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related
tasklet functions. Name them with "submission_tasklet" suffix and
remove intel/i915 prefix as they are static. Also rename irq_tasklet
as just tasklet for clarity.

v2: s/_bh/_tasklet (Chris)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:01:31 +00:00
Chris Wilson
d710fc16ff drm/i915: Prevent overflow of execbuf.buffer_count and num_cliprects
We check whether the multiplies will overflow prior to calling
kmalloc_array so that we can respond with -EINVAL for the invalid user
arguments rather than treating it as an -ENOMEM that would otherwise
occur. However, as Dan Carpenter pointed out, we did an addition on the
unsigned int prior to passing to kmalloc_array where it would be
promoted to size_t for the calculation, thereby allowing it to overflow
and underallocate.

v2: buffer_count is currently limited to INT_MAX because we treat it as
signaled variable for LUT_HANDLE in eb_lookup_vma
v3: Move common checks for eb1/eb2 into the same function
v4: Put the check back for nfence*sizeof(user_fence) overflow
v5: access_ok uses ULONG_MAX but kvmalloc_array uses SIZE_MAX
v6: size_t and unsigned long are not type-equivalent on 32b

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116105059.25142-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-16 14:17:08 +00:00
Chris Wilson
c534612e78 drm/i915: Clear breadcrumb node when cancelling signaling
When we call intel_engine_cancel_signaling() to stop reporting when
a request is completed via an asynchronous signal, we remove that request
from the breadcrumb wait queue. However, we may be concurrently
processing that request in the signaler itself, the actual operations on
the request's node itself are serialised but we do not actually clear the
waiter after removing it from the tree allowing both parties to attempt
to do so and corrupting the rbtree. (Previously removing from the
breadcrumb wait queue could only be done on behalf of i915_wait_request,
so this race could not happen).

Reported-by: "He, Bo" <bo.he@intel.com>
Fixes: 9eb143bbec ("drm/i915: Allow a request to be cancelled")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "He, Bo" <bo.he@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115121458.24655-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-16 14:17:08 +00:00
Mika Kuoppala
d5653ec322 drm/i915: Print the condition causing GEM_BUG_ON
It is easier to categorize and debug bugs if the failed condition
is in plain sight in the actual dmesg output. Make it so.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116083954.3357-1-mika.kuoppala@linux.intel.com
2017-11-16 15:35:47 +02:00
Linus Torvalds
e60e1ee606 Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
fred gao
f2880e04f3 drm/i915/gvt: Move request alloc to dispatch_workload path only
Previously the performance is improved through the workload auditing
and shadowing ahead of vGPU scheduling, however, there is the case that
more requests are allocated in submit_context before the previous request
is added, the timeline will hold its seqno which is later.

This patch is to move the request alloc to dispatch_workload function,
where is the same place as request is added.

It will fix the issue of kernel BUG for (timeline->seqno != request->fence.seqno)
check when add_request.

Fixes: 89ea20b930 ("drm/i915/gvt: Factor out scan and shadow from workload dispatch")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-11-16 11:51:55 +08:00
Xiong Zhang
b2d6ef7061 drm/i915/gvt: Let each vgpu has separate opregion memory
Currently every vgpu share a common gvt opregion memory, but
it is freed at vgpu destroy, then the later vgpu doesn't have
opregion memory once the first vgpu is destroyed. This cause
guest function failure like reboot, second or later boot.

This patch allocate and init virt opregion memory for each
vgpu, so this memory could be freed at vgpu destroy.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-11-16 11:48:35 +08:00
Xiong Zhang
295764cd2f drm/i915/gvt: Limit read hw reg to active vgpu
mmio_read_from_hw() let vgpu could read hw reg, if vgpu's workload
is running on hw, things is good. Otherwise vgpu will get other
vgpu's reg val, it is unsafe.

This patch limit such hw access to active vgpu. If vgpu isn't
running on hw, the reg read of this vgpu will get the last active
val which saved at schedule_out.

v2: ring timestamp is walking continuously even if the ring is idle.
    so read hw directly. (Zhenyu)

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-11-16 11:48:35 +08:00
Zhenyu Wang
5c35258de6 Revert "drm/i915/gvt: Refine broken PPGTT scratch"
This reverts commit b20d09886fd1b74cd2255d846029a049e524db14.

This caused windows driver boot errors for invalid page address.
Revert for now.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-11-16 11:48:35 +08:00