Commit Graph

3179 Commits

Author SHA1 Message Date
Tejas Upadhyay
24ea098b7c drm/i915/jsl: Split EHL/JSL platform info and PCI ids
Recently we came across requirement to identify EHL and JSL
platform to program them differently. Thus Split the basic
platform definition, macros, and PCI IDs to differentiate
between EHL and JSL platforms. Also, IS_ELKHARTLAKE is replaced
with IS_JSL_EHL everywhere.

Changes since V1 :
	- Rebased to avoid merge conflicts
	- Added missed check for jasperlake in intel_uc_fw.c

Cc : Matt Roper <matthew.d.roper@intel.com>
Cc : Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013192948.63470-1-tejaskumarx.surendrakumar.upadhyay@intel.com
2020-10-14 09:31:34 +02:00
Ayaz A Siddiqui
4d8a5cfe3b drm/i915/gt: Initialize reserved and unspecified MOCS indices
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.

These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.

v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mathew Alwin <alwin.mathew@intel.com>
Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
Cc: Spruit Neil R <neil.r.spruit@intel.com>
Cc: Zhou Cheng <cheng.zhou@intel.com>
Cc: Benemelis Mike G <mike.g.benemelis@intel.com>

Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
Cc: stable@vger.kernel.org
2020-10-13 10:24:12 +01:00
Matt Roper
3bcacad3d7 drm/i915: Update gen12 multicast register ranges
The updated bspec forcewake table also provides us with new multicast
ranges that should be reflected in our workaround code.

Note that there are different types of multicast registers with
different styles of replication and different steering registers.  The
i915 MCR range lists we're updating here are only used to ensure we can
verify workarounds properly (i.e., if we can't steer register reads we
don't want to verify workarounds where an unsteered read might hit a
fused-off instance of the unit).  Because of this, we don't need to
include any of the multicast ranges where all instances of the register
will always present and fusing doesn't play a role.  Specifically, that
means that we are not including the MCR ranges designated as "SQIDI" in
the bspec.

Bspec: 66696
Cc: Caz Yokoyama <caz.yokoyama@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-4-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-10-09 18:54:44 -07:00
Matt Roper
55e3c17095 drm/i915: Rename FORCEWAKE_BLITTER to FORCEWAKE_GT
The power well that we've been referring to as the 'blitter' well is
actually more of a general GT power well which contains a lot of things
other than the blitter engine registers.  The FORCEWAKE_BLITTER name in
the code was used for historic reasons, but no longer matches how the
bspec describes this power well and just causes confusion for people not
familiar with this area of the code.  Let's rename it to FORCEWAKE_GT to
more accurately describe the role of the power well and match how the
modern bspec refers to it.

v2:
 - Add a comment noting that the GT power well includes the blitter
   engine. (Jose)

Bspec: 66696, 66534, 67609
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-2-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-10-09 18:51:27 -07:00
Lucas De Marchi
2606b26923 drm/i915/dg1: Define MOCS table for DG1
DG1 has a new MOCS table. We still use the old definition of the table,
but as for any dgfx card it doesn't contain the control_value values
(these values don't matter as we won't program them).

Bspec: 45101

v2: Reword the comment to state that the last few entries are reserved
    instead of "the last two". DG1 reserves four instead of two from
    previous platforms (from Matt Roper)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-3-lucas.demarchi@intel.com
2020-10-07 13:51:20 -07:00
Chris Wilson
bf9bd6a512 drm/i915/gt: Track the most recent pulse for the heartbeat
Since we track the idle_pulse for flushing the barriers and avoid
re-emitting the pulse upon idling if no futher action is required, this
also impacts the heartbeat. Before emitting a fresh heartbeat, we look
at the engine idle status and assume that if the pulse was the last
request emitted along the heartbeat, the engine is idling and a
heartbeat pulse not required. This assumption fails, but we can reuse
the idle pulse as the heartbeat if we are yet to emit one, and so track
the status of that pulse for our engine health check.

This impacts tgl/rcs0 as we rely on the heartbeat for our healthcheck for
the normal preemption detection mechanism is disabled by default.

Testcase: igt/gem_exec_schedule/preempt-hang/rcs0 #tgl
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006094653.7558-1-chris@chris-wilson.co.uk
2020-10-07 10:23:11 +01:00
Tvrtko Ursulin
934941ed5a drm/i915: Fix DMA mapped scatterlist lookup
As the previous patch fixed the places where we walk the whole scatterlist
for DMA addresses, this patch fixes the random lookup functionality.

To achieve this we have to add a second lookup iterator and add a
i915_gem_object_get_sg_dma helper, to be used analoguous to existing
i915_gem_object_get_sg_dma. Therefore two lookup caches are maintained per
object and they are flushed at the same point for simplicity. (Strictly
speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages,
but today this conincides with unsetting of the pages in general.)

Partial VMA view is then fixed to use the new DMA lookup and properly
query sg length.

v2:
 * Checkpatch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Cc: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
2020-10-06 12:49:12 +01:00
Tvrtko Ursulin
8a473dbadc drm/i915: Fix DMA mapped scatterlist walks
When walking DMA mapped scatterlists sg_dma_len has to be used since it
can be different (coalesced) from the backing store entry.

This also means we have to end the walk when encountering a zero length
DMA entry and cannot rely on the normal sg list end marker.

Both issues were there in theory for some time but were hidden by the fact
Intel IOMMU driver was never coalescing entries. As there are ongoing
efforts to change this we need to start handling it.

v2:
 * Use unsigned int for local storing sg_dma_len. (Logan)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 85d1225ec0 ("drm/i915: Introduce & use new lightweight SGL iterators")
References: b31144c0da ("drm/i915: Micro-optimise gen6_ppgtt_insert_entries()")
Reported-by: Tom Murphy <murphyt7@tcd.ie>
Suggested-by: Tom Murphy <murphyt7@tcd.ie> # __sgt_iter
Suggested-by: Logan Gunthorpe <logang@deltatee.com> # __sgt_iter
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-1-tvrtko.ursulin@linux.intel.com
2020-10-06 12:49:07 +01:00
Chris Wilson
25dc89d527 drm/i915/gt: Scrub HW state on remove
Currently we do a final scrub of the HW state upon release. However,
when rebinding the device, this is too late as the device may either
have been partially rebound or the device is no longer accessible. If
the device has been removed before release, the reset goes astray
leaving the device in an inconsistent state, unlikely to work without a
full PCI reset. Furthermore, if the device is partially rebound before
the HW scrubbing, there may be leftover HW state that should have been
scrubbed. Either way, we need to push the scrubbing earlier before the
removal, so into unregister. The danger is that on older machines,
resetting the GPU also impact the display engine and so the reset should
be after modesetting is disabled (and before reuse we need to recover
modesetting).

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2508
Testcase: igt/core_hotunplug
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929112639.24223-1-chris@chris-wilson.co.uk
2020-10-06 12:09:30 +01:00
Lucas De Marchi
b1e93a85f8 drm/i915: don't conflate is_dgfx with fake lmem
When using fake lmem for tests, we are overriding the setting in
device info for dgfx devices. Current users of IS_DGFX() except one are
correct. However, as we add support for DG1, we are going to use it in
additional places to trigger dgfx-only code path.

In future if we need we can use HAS_LMEM() instead of IS_DGFX() in the
places that make sense to also contemplate fake lmem use.

v2: update gen8_gmch_probe() to use HAS_LMEM(): we need to steal the
mappable aperture later(which is fine since it doesn't exist on "DGFX"),
and use it as a substitute for LMEMBAR. The !mappable aperture property
is also useful since it exercises some other parts of the code too.
(Matthew Auld)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-1-lucas.demarchi@intel.com
2020-10-05 15:54:44 -07:00
Chris Wilson
ca65fc0d8e drm/i915/gt: Always send a pulse down the engine after disabling heartbeat
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.

v2: Tvrtko asked if we could reduce the double pulse to a single, which
opened up a discussion of how we should handle the pulse-error after
attempting to change the property, and the desire to serialise
adjustment of the property with its validating pulse, and unwind upon
failure.

Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk
(cherry picked from commit 3dd66a94de)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:48 -04:00
Chris Wilson
7d442ea7c5 drm/i915: Cancel outstanding work after disabling heartbeats on an engine
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.

Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
(cherry picked from commit 7a991cd3e3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:46 -04:00
Maarten Lankhorst
159ace7ffe drm/i915: Fix uninitialised variable in intel_context_create_request.
In case backoff fails with an error, we return an undefined rq,
assign err to rq correctly.

Fixes: 8a929c9eb1 ("drm/i915: Use ww pinning for intel_context_create_request()")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 4316b19dee)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:32 -04:00
Chris Wilson
922d369b29 drm/i915/gt: Clear the buffer pool age before use
If we create a new node, it is possible for the slab allocator to return
us a recently freed node. If that node was just retired, it will retain
the current jiffy as its node->age. There is then a miniscule window,
where as that node is retired, it will appear on the free list with an
incorrect age and be eligible for reuse by one thread, and then by a
second thread as the correct node->age is written.

Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk
(cherry picked from commit 9bb34ff25c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:23 -04:00
Chris Wilson
b05734720d drm/i915/gt: Retire cancelled requests on unload
If we manage to hit the intel_gt_set_wedged_on_fini() while active, i.e.
module unload during a stress test, we may cancel the requests but not
clean up. This leads to a very slow module unload as we wait for
something or other to trigger the retirement flushing, or timeout and
unload with a bunch of warnings. Instead if we explicitly cancel then
cleanup on an active unload, it should be instant and quiet.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-3-chris@chris-wilson.co.uk
2020-09-30 17:59:26 +01:00
Chris Wilson
eb3afbe18e drm/i915/selftests: Finish pending mock requests on cancellation.
Flush all the pending requests from the mock engine when they are
cancelled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-2-chris@chris-wilson.co.uk
2020-09-30 17:59:25 +01:00
Chris Wilson
5e39b4d94c drm/i915/gt: Signal cancelled requests
After marking the requests on an engine as cancelled upon wedging, send
any signals for their completions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-1-chris@chris-wilson.co.uk
2020-09-30 17:59:25 +01:00
Chris Wilson
3dd66a94de drm/i915/gt: Always send a pulse down the engine after disabling heartbeat
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.

v2: Tvrtko asked if we could reduce the double pulse to a single, which
opened up a discussion of how we should handle the pulse-error after
attempting to change the property, and the desire to serialise
adjustment of the property with its validating pulse, and unwind upon
failure.

Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk
2020-09-29 09:03:17 +01:00
Chris Wilson
7a991cd3e3 drm/i915: Cancel outstanding work after disabling heartbeats on an engine
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.

Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
2020-09-29 09:01:03 +01:00
Jani Nikula
5ae26012a1 drm/i915/uc: tune down GuC communication enabled/disabled messages
The GuC communication enabled/disabled messages are too noisy in info
level. Convert them from info to debug level, and switch to device based
logging while at it.

Reported-by: Karol Herbst <kherbst@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917165056.29766-1-jani.nikula@intel.com
2020-09-23 12:07:23 +03:00
Dave Airlie
6ea6be7708 Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.10:

UAPI Changes:

Cross-subsystem Changes:
  - virtio: Merged a PR for patches that will affect drm/virtio

Core Changes:
  - dev: More devm_drm convertions and removal of drm_dev_init
  - atomic: Split out drm_atomic_helper_calc_timestamping_constants of
    drm_atomic_helper_update_legacy_modeset_state
  - ttm: More rework

Driver Changes:
  - i915: selftests improvements
  - panfrost: support for Amlogic SoC
  - vc4: one fix
  - tree-wide: conversions to devm_drm_dev_alloc,
  - ast: simplifications of the atomic modesetting code
  - panfrost: multiple fixes
  - vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
2020-09-23 09:52:24 +10:00
Maarten Lankhorst
4316b19dee drm/i915: Fix uninitialised variable in intel_context_create_request.
In case backoff fails with an error, we return an undefined rq,
assign err to rq correctly.

Fixes: 8a929c9eb1 ("drm/i915: Use ww pinning for intel_context_create_request()")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-21 11:09:46 +02:00
Daniel Vetter
82be0d7540 drm/i915/selftest: Create mock_destroy_device
Just some prep work before we rework the lifetime handling, which
requires replacing all the drm_dev_put in selftests by something else.

v2: Don't go with a static inline, upsets the header tests and
separation.

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-2-daniel.vetter@ffwll.ch
2020-09-21 10:36:24 +02:00
Chris Wilson
29545e5cd2 drm/i915/gt: Remove defunct intel_virtual_engine_get_sibling()
As the last user was eliminated in commit e21fecdcde40 ("drm/i915/gt:
Distinguish the virtual breadcrumbs from the irq breadcrumbs"), we can
remove the function. One less implementation detail creeping beyond its
scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826132811.17577-16-chris@chris-wilson.co.uk
2020-09-18 12:18:03 +01:00
Chris Wilson
0bda4b80d9 drm/i915/gt: Show engine properties in the pretty printer
When debugging the engine state, include the user properties that may,
or may not, have been adjusted by the user/test.

For example,
vecs0
	...
	Properties:
		heartbeat_interval_ms: 2500 [default 2500]
		max_busywait_duration_ns: 8000 [default 8000]
		preempt_timeout_ms: 640 [default 640]
		stop_timeout_ms: 100 [default 100]
		timeslice_duration_ms: 1 [default 1]

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-1-chris@chris-wilson.co.uk
2020-09-18 11:52:36 +01:00
Swathi Dhanavanthri
dc6798a520 drm/i915/tgl, rkl: Make Wa_1606700617/22010271021 permanent
This workaround applies to all TGL and RKL steppings.

Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200911221158.4700-1-swathi.dhanavanthri@intel.com
2020-09-17 12:42:08 -07:00
Chris Wilson
4ff64bcfe2 drm/i915/gt: Use a mmio read of the CSB in case of failure
If we find the GPU didn't update the CSB within 50us, we currently fail
and eventually reset the GPU. Lets report the value from the mmio space
as a last resort, it may just stave off an unnecessary GPU reset.

References: HSDES#22011327657
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-4-chris@chris-wilson.co.uk
2020-09-15 15:33:54 +01:00
Chris Wilson
884c407412 drm/i915/gt: Apply the CSB w/a for all
Since we expect to inline the csb_parse() routines, the w/a for the
stale CSB data on Tigerlake will be pulled into process_csb(), and so we
might as well simply reuse the logic for all, and so will hopefully
avoid any strange behaviour on Icelake that was not covered by our
previous w/a.

References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-3-chris@chris-wilson.co.uk
2020-09-15 15:33:54 +01:00
Chris Wilson
233c1ae3c8 drm/i915/gt: Wait for CSB entries on Tigerlake
On Tigerlake, we are seeing a repeat of commit d8f5053117 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.

However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:

    513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@chris-wilson.co.uk
2020-09-15 15:33:53 +01:00
Chris Wilson
f24a44e52f drm/i915/gt: Widen CSB pointer to u64 for the parsers
A CSB entry is 64b, and it is simpler for us to treat it as an array of
64b entries than as an array of pairs of 32b entries.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
2020-09-15 15:33:53 +01:00
Chris Wilson
6cb304b312 drm/i915/gt: Check for a registered driver with IPS
If the ips module calls into the driver during an unbind/bind cycle, we
may see the driver while it has unregistered itself from ips and try and
dereference a NULL ips_mchdev pointer.

<1> [211.928844] BUG: kernel NULL pointer dereference, address: 0000000000000014
<1> [211.928861] #PF: supervisor read access in kernel mode
<1> [211.928871] #PF: error_code(0x0000) - not-present page
<6> [211.928881] PGD 0 P4D 0
<4> [211.928890] Oops: 0000 [#1] PREEMPT SMP PTI
<4> [211.928900] CPU: 3 PID: 327 Comm: ips-monitor Not tainted 5.9.0-rc5-CI-CI_DRM_9008+ #1
<4> [211.928914] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013
<4> [211.929056] RIP: 0010:mchdev_get+0x5a/0x180 [i915]
<4> [211.929067] Code: c0 5a 74 0d 80 3d f1 53 29 00 00 0f 84 ab 00 00 00 48 8b 1d c8 a8 29 00 e8 d3 18 89 e1 85 c0 74 09 80 3d d1 53 29 00 00 74 65 <8b> 4b 14 48 8d 7b 14 85 c9 0f 84 09 01 00 00 8d 51 01 89 c8 f0 0f
<4> [211.929095] RSP: 0018:ffffc900002efe60 EFLAGS: 00010202
<4> [211.929105] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffff8881297acf40
<4> [211.929118] RDX: 0000000000000000 RSI: ffffffff8264e2c0 RDI: ffff8881297ad820
<4> [211.929130] RBP: ffffc900002efe68 R08: ffff8881297ad820 R09: 00000000fffffffe
<4> [211.929143] R10: ffff8881297acf40 R11: 00000000fff74c96 R12: ffff8881294dfa18
<4> [211.929155] R13: 0000000000000067 R14: ffff888126eff640 R15: ffff888126efe840
<4> [211.929168] FS:  0000000000000000(0000) GS:ffff888133d80000(0000) knlGS:0000000000000000
<4> [211.929182] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [211.929194] CR2: 0000000000000014 CR3: 0000000002610000 CR4: 00000000000006e0
<4> [211.929206] Call Trace:
<4> [211.929294]  i915_read_mch_val+0x15/0x380 [i915]
<4> [211.929309]  ? ips_monitor+0x3fb/0x630 [intel_ips]
<4> [211.929321]  ips_monitor+0x53c/0x630 [intel_ips]
<4> [211.929334]  ? ips_gpu_lower+0x30/0x30 [intel_ips]
<4> [211.929348]  kthread+0x14d/0x170
<4> [211.929358]  ? kthread_park+0x80/0x80
<4> [211.929369]  ret_from_fork+0x22/0x30
<4> [211.929382] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic ledtrig_audio i915 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei intel_ips lpc_ich ptp prime_numbers pps_core
<4> [211.929437] CR2: 0000000000000014

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915105113.26564-1-chris@chris-wilson.co.uk
2020-09-15 14:21:30 +01:00
Chris Wilson
9bb34ff25c drm/i915/gt: Clear the buffer pool age before use
If we create a new node, it is possible for the slab allocator to return
us a recently freed node. If that node was just retired, it will retain
the current jiffy as its node->age. There is then a miniscule window,
where as that node is retired, it will appear on the free list with an
incorrect age and be eligible for reuse by one thread, and then by a
second thread as the correct node->age is written.

Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk
2020-09-15 11:34:05 +01:00
Rodrigo Vivi
0ea8a56de2 Merge drm/drm-next into drm-intel-next-queued
Sync drm-intel-gt-next here so we can have an unified fixes flow.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-11 20:00:20 -04:00
Dave Airlie
1f4b2aca79 Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added)

UAPI Changes:
(- Potential implicit changes from WW locking refactoring)

Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)

Driver Changes:

- MAJOR: Apply WW locking across the driver (Maarten)

- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)

- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)

- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
2020-09-09 07:55:22 +10:00
Dave Airlie
ce5c207c6b Merge tag 'v5.9-rc4' into drm-next
Backmerge 5.9-rc4 as there is a nasty qxl conflict
that needs to be resolved.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 14:41:40 +10:00
Thomas Hellström
e0ee152fce drm/i915: Unlock the shared hwsp_gtt object after pinning
The hwsp_gtt object is used for sub-allocation and could therefore
be shared by many contexts causing unnecessary contention during
concurrent context pinning.
However since we're currently locking it only for pinning, it remains
resident until we unpin it, and therefore it's safe to drop the
lock early, allowing for concurrent thread access.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:08:11 +03:00
Chris Wilson
b4d9145b01 drm/i915: Be wary of data races when reading the active execlists
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.

The latest occurence of this was spotted by KCSAN:

[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548]  __await_execution+0x217/0x370 [i915]
[ 1413.563891]  i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235]  i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577]  i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967]  i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998]  drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022]  drm_ioctl+0x2ff/0x480
[ 1413.565046]  __x64_sys_ioctl+0x87/0xd0
[ 1413.565069]  do_syscall_64+0x4d/0x80
[ 1413.565094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To complicate matters, we have to both avoid the read tearing of *active and
avoid any write tearing as perform the pending[] -> inflight[] promotion of the
execlists.

This is because we cannot rely on the memcpy doing u64 aligned copies on all
kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE
annotations to satisfy KCSAN.

v2: When in doubt, write the same comment again.
v3: Expanded commit message.

Fixes: b55230e5e8 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:07:49 +03:00
Maarten Lankhorst
15b6c92498 drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v3.
Make sure vma_lock is not used as inner lock when kernel context is used,
and add ww handling where appropriate.

Ensure that execbuf selftests keep passing by using ww handling.

Changes since v2:
- Fix i915_gem_context finally.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-22-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:32:06 +03:00
Maarten Lankhorst
8a929c9eb1 drm/i915: Use ww pinning for intel_context_create_request()
We want to get rid of intel_context_pin(), convert
intel_context_create_request() first. :)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-21-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:59 +03:00
Maarten Lankhorst
052e04f170 drm/i915/selftests: Fix locking inversion in lrc selftest.
This function does not use intel_context_create_request, so it has
to use the same locking order as normal code. This is required to
shut up lockdep in selftests.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-20-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:50 +03:00
Maarten Lankhorst
dd878c0cec drm/i915: Dirty hack to fix selftests locking inversion
Some i915 selftests still use i915_vma_lock() as inner lock, and
intel_context_create_request() intel_timeline->mutex as outer lock.
Fortunately for selftests this is not an issue, they should be fixed
but we can move ahead and cleanify lockdep now.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-19-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:43 +03:00
Maarten Lankhorst
c8d225946a drm/i915: Kill last user of intel_context_create_request outside of selftests
Instead of using intel_context_create_request(), use intel_context_pin()
and i915_create_request directly.

Now all those calls are gone outside of selftests. :)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-17-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:29 +03:00
Maarten Lankhorst
47b086934f drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
As a preparation step for full object locking and wait/wound handling
during pin and object mapping, ensure that we always pass the ww context
in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this
happens.

This also requires changing the order of eb_parse slightly, to ensure
we pass ww at a point where we could still handle -EDEADLK safely.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:13 +03:00
Maarten Lankhorst
3999a70879 drm/i915: Rework intel_context pinning to do everything outside of pin_mutex
Instead of doing everything inside of pin_mutex, we move all pinning
outside. Because i915_active has its own reference counting and
pinning is also having the same issues vs mutexes, we make sure
everything is pinned first, so the pinning in i915_active only needs
to bump refcounts. This allows us to take pin refcounts correctly
all the time.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-14-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:02 +03:00
Maarten Lankhorst
bfdf8b1d38 drm/i915: Use ww locking in intel_renderstate.
We want to start using ww locking in intel_context_pin, for this
we need to lock multiple objects, and the single i915_gem_object_lock
is not enough.

Convert to using ww-waiting, and make sure we always pin intel_context_state,
even if we don't have a renderstate object.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-10-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:31 +03:00
Maarten Lankhorst
80f0b679d6 drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory
eviction. We don't use it yet, but lets start adding the definition
first.

To use it, we have to pass a non-NULL ww to gem_object_lock, and don't
unlock directly. It is done in i915_gem_ww_ctx_fini.

Changes since v1:
- Change ww_ctx and obj order in locking functions (Jonas Lahtinen)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:44 +03:00
Chris Wilson
e23005604b drm/i915/gt: Hold context/request reference while breadcrumbs are active
Currently we hold no actual reference to the request nor context while
they are attached to a breadcrumb. To avoid freeing the request/context
too early, we serialise with cancel-breadcrumbs by taking the irq
spinlock in i915_request_retire(). The alternative is to take a
reference for a new breadcrumb and release it upon signaling; removing
the more frequently hit contention point in i915_request_retire().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:29 +03:00
Chris Wilson
3f7dc10716 drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier
Move the __intel_breadcrumbs_arm_irq earlier, next to the disarm_irq, so
that we can make use of it in the following patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:25 +03:00
Chris Wilson
82adf90113 drm/i915/gt: Shrink i915_page_directory's slab bucket
kmalloc uses power-of-two slab buckets for small allocations (up to a
few pages). Since i915_page_directory is a page of pointers, plus a
couple more, this is rounded up to 8K, and we waste nearly 50% of that
allocation. Long terms this leads to poor memory utilisation, bloating
the kernel footprint, but the problem is exacerbated by our conservative
preallocation scheme for binding VMA. As we are required to allocate all
levels for each vma just in case we need to insert them upon binding,
this leads to a large multiplication factor for a single page vma. By
halving the allocation we need for the page directory structure, we
halve the impact of that factor, bringing workloads that once fitted into
memory, hopefully back to fitting into memory.

We maintain the split between i915_page_directory and i915_page_table as
we only need half the allocation for the lowest, most populous, level.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:23 +03:00
Chris Wilson
89351925a4 drm/i915/gt: Switch to object allocations for page directories
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:08 +03:00