Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.
v2: Check for logical inversion when rebasing - we were incorrectly
checking for this request being active, and instead busywaiting for
when the GPU was not yet processing the request of interest.
v3: Try another colour for the seqno names.
v4: Another colour for the function names.
v5: Remove the forced coherency when checking for the active request. On
reflection and plenty of recent experimentation, the issue is not a
cache coherency problem - but an irq/seqno ordering problem (timing issue).
Here, we do not need the w/a to force ordering of the read with an
interrupt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk
When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quick as possible. However, if it is a slow
request, we want to sleep as quickly as possible. The tradeoff between
waiting and sleeping is roughly the time it takes to sleep on a request,
on the order of a microsecond. Based on measurements of synchronous
workloads from across big core and little atom, I have set the limit for
busywaiting as 10 microseconds. In most of the synchronous cases, we can
reduce the limit down to as little as 2 miscroseconds, but that leaves
quite a few test cases regressing by factors of 3 and more.
The code currently uses the jiffie clock, but that is far too coarse (on
the order of 10 milliseconds) and results in poor interactivity as the
CPU ends up being hogged by slow requests. To get microsecond resolution
we need to use a high resolution timer. The cheapest of which is polling
local_clock(), but that is only valid on the same CPU. If we switch CPUs
because the task was preempted, we can also use that as an indicator that
the system is too busy to waste cycles on spinning and we should sleep
instead.
__i915_spin_request was introduced in
commit 2def4ad99b [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 7 16:20:41 2015 +0100
drm/i915: Optimistically spin for the request completion
v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
so we can use native register sizes on smaller architectures. Mention
the approximate microseconds units for elapsed time and add some extra
comments describing the reason for busywaiting.
v3: Raise the limit to 10us
v4: Now 5us.
Reported-by: Jens Axboe <axboe@kernel.dk>
Link: https://lkml.org/lkml/2015/11/12/621
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-3-git-send-email-chris@chris-wilson.co.uk
pm_runtime_{use,dont_use}_autosuspend() controls whether the device's
sysfs power/autosuspend_delay_ms file is writeable or returns -EIO on
access to user space. Since
commit 25b181b46e
Author: Imre Deak <imre.deak@intel.com>
Date: Thu Dec 17 13:44:56 2015 +0200
drm/i915: get a permanent RPM reference on platforms w/o RPM support
this sysfs file is writeable also on platforms without RPM support, but
userspace (at least IGT) depends on this file being unchangable to
determine whether the device supports runtime PM at all. So restore the
old behavior.
This gets rid of igt/pm_rpm failures on old platforms without RPM
support, where the test should be skipped.
Testcase: igt/pm_rpm/basic-rte
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450371873-878-1-git-send-email-imre.deak@intel.com
"exec->exec_bo" is NULL at this point so this code returns success. We
want to return -ENOMEM.
Fixes: d5b1a78a77 ('drm/vc4: Add support for drawing 3D frames.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
"state" is smaller than "kernel_state" so we end up corrupting memory.
Fixes: 214613656b ('drm/vc4: Add an interface for capturing the GPU state after a hang.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
The copy_to/from_user() functions return the number of bytes remaining
to be copied. We want to return error codes here.
Also it's a bad idea to print an error message if a copy from user fails
because users can use that to spam /var/log/messages which is annoying
so I removed those.
Fixes: 214613656b ('drm/vc4: Add an interface for capturing the GPU state after a hang.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
In some cases the amount of vram may be less than the BAR size,
if so, limit visible vram to the amount of actual vram, not the
BAR size.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A long time ago (before 3.14) we relied on a permanent pinning of the
ifbdev to lock the fb in place inside the GGTT. However, the
introduction of stealing the BIOS framebuffer and reusing its address in
the GGTT for the fbdev has muddied waters and we use an inherited fb.
However, the inherited fb is only pinned whilst it is active and we no
longer have an explicit pin for the info->system_base mmapping used by
the fbdev. The result is that after some aperture pressure the fbdev may
be evicted, but we continue to write the fbcon into the same GGTT
address - overwriting anything else that may be put into that offset.
The effect is most pronounced across suspend/resume as
intel_fbdev_set_suspend() does a full clear over the whole scanout.
v2: Only unpin the intel_fb is we allocate it. If we inherit the fb from
the BIOS, we do not own the pinned vma (except for the reference we add
in this patch for our access via info->screen_base).
v3: Finish balancing the vma pinning for the normal !preallocated case.
v4: Try to simplify the pinning even further.
v5: Leak the VMA (cleaned up by object-free) to avoid complicated error paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1449245126-26158-1-git-send-email-chris@chris-wilson.co.uk
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In some cases we want to check whether we hold an RPM wakelock reference
for the whole duration of a sequence. To achieve this add a new RPM
atomic sequence counter that we increment any time the wakelock refcount
drops to zero. Check whether the sequence number stays the same during
the atomic section and that we hold the wakelock at the beginning of the
section.
Motivated by Chris.
v2-v3:
- unchanged
v4:
- swap the order of atomic_read() and assert_rpm_wakelock_held() in
assert_rpm_atomic_begin() to avoid race
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450203038-5150-10-git-send-email-imre.deak@intel.com
Atm, we assert that the device is not suspended until the point when the
device is truly put to a suspended state. This is fine, but we can catch
more problems if we check that RPM refcount is non-zero. After that one
drops to zero we shouldn't access the device any more, even if the actual
device suspend may be delayed. Change assert_rpm_wakelock_held()
accordingly to check for a non-zero RPM refcount in addition to the
current device-not-suspended check.
For the new asserts to work we need to annotate every place explicitly in
the code where we expect that the device is powered. The places where we
only assume this, but may not hold an RPM reference:
- driver load
We assume the device to be powered until we enable RPM. Make this
explicit by taking an RPM reference around the load function.
- system and runtime sudpend/resume handlers
These handlers are called when the RPM reference becomes 0 and know the
exact point after which the device can get powered off. Disable the
RPM-reference-held check for their duration.
- the IRQ, hangcheck and RPS work handlers
These handlers are flushed in the system/runtime suspend handler
before the device is powered off, so it's guaranteed that they won't
run while the device is powered off even though they don't hold any
RPM reference. Disable the RPM-reference-held check for their duration.
In all these cases we still check that the device is not suspended.
These explicit annotations also have the positive side effect of
documenting our assumptions better.
This caught additional WARNs from the atomic modeset path, those should
be fixed separately.
v2:
- remove the redundant HAS_RUNTIME_PM check (moved to patch 1) (Ville)
v3:
- use a new dedicated RPM wakelock refcount to also catch cases where
our own RPM get/put functions were not called (Chris)
- assert also that the new RPM wakelock refcount is 0 in the RPM
suspend handler (Chris)
- change the assert error message to be more meaningful (Chris)
- prevent false assert errors and check that the RPM wakelock is 0 in
the RPM resume handler too
- prevent false assert errors in the hangcheck work too
- add a device not suspended assert check to the hangcheck work
v4:
- rename disable/enable_rpm_asserts to disable/enable_rpm_wakeref_asserts
and wakelock_count to wakeref_count
- disable the wakeref asserts in the IRQ handlers and RPS work too
- update/clarify commit message
v5:
- mark places we plan to change to use proper RPM refcounting with
separate DISABLE/ENABLE_RPM_WAKEREF_ASSERTS aliases (Chris)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1450227139-13471-1-git-send-email-imre.deak@intel.com
As a preparation for follow-up patches add a new helper that checks
whether we hold an RPM reference, since this is what we want most of
the cases. Atm this helper will only check for the HW suspended state, a
follow-up patch will do the actual change to check the refcount instead.
One exception is the forcewake release timer function, where it's
guaranteed that the HW is on even though the RPM refcount drops to zero.
This guarantee is provided by flushing the timer in the runtime suspend
handler. So leave the assert_device_not_suspended check in place there.
Also rename assert_device_suspended for consistency and export these
helpers as a preparation for the follow-up patches.
No functional change.
v3:
- change the assert warning message to be more meaningful (Chris)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1450203038-5150-6-git-send-email-imre.deak@intel.com
We don't really need to check this flag in the get/put/assert helpers,
as on platforms without RPM support we won't ever enable RPM. That means
pm.suspend will be always false and the assert will be always true.
Do this to simplify the code and to let us extend the RPM asserts to all
platforms for a better coverage.
Motivated by Ville.
v2-v3:
- unchanged
v4:
- remove the HAS_RUNTIME_PM check from intel_runtime_pm_enable() too
made possible by the previous two patches
v5:
- rebased on the previous new patch in the series that keeps
HAS_RUNTIME_PM() in intel_runtime_pm_enable() with a permanent
reference taken there
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450352931-16498-1-git-send-email-imre.deak@intel.com
Currently we disable RPM functionality on platforms that doesn't support
this by not putting/getting the RPM reference we receive from the RPM
core during driver loading/unloading respectively. This is somewhat
obscure, so make it more explicit by keeping a reference dedicated for
this particular purpose whenever the driver is loaded. This makes it
possible to remove the HAS_RUNTIME_PM() special casing from every other
places in the next patch.
v2:
- fix intel_runtime_pm_get vs. intel_runtime_pm_put in
intel_power_domains_fini()
v3:
- take only a low level RPM reference so the ref tracking asserts
continue to work (Ville)
- update the commit message
- move the patch earlier for bisectability
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450352696-16135-1-git-send-email-imre.deak@intel.com
This fixes a random corruption under memory pressure. We need to fence
the BO for the user fence as well, otherwise it might be swapped out
and the GPU could write the fence value to an undesired location.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
drm/tegra: Changes for v4.5-rc1
This adds support for the version of host1x found on Tegra210 SoCs. It
also makes use of the new atomic suspend/resume functionality to bring
this feature to Tegra.
Other than that it's mostly small fixes and cleanups, with some prep-
work for things that will hopefully get merged for the next release.
* tag 'drm/tegra/for-4.5-rc1' of git://anongit.freedesktop.org/tegra/linux:
drm/tegra: Advertise DRIVER_ATOMIC
drm/tegra: Use DRIVER level for IOMMU aperture message
drm/tegra: checking for IS_ERR() instead of NULL
drm/tegra: dc: Add missing of_node_put()
drm/tegra: Implement subsystem-level suspend/resume
drm/tegra: sor: Remove unnecessary conditional
drm/tegra: sor: Operate on struct drm_dp_aux *
drm/tegra: Use drm_gem_object_unreference_unlocked()
drm/tegra: Don't take dev->struct_mutex in mmap offset ioctl
drm/tegra: Use unlocked gem unreferencing
drm/tegra: Use new multi-driver module helpers
gpu: host1x: Add Tegra210 support
gpu: host1x: Remove core driver on unregister
gpu: host1x: Use platform_register/unregister_drivers()
drm/panel: Changes for v4.5-rc1
This set of changes brings in a few more helpers for DSI support as well
as a couple of new drivers and support for some more simple panels.
* tag 'drm/panel/for-4.5-rc1' of git://anongit.freedesktop.org/tegra/linux:
drm/panel: simple: Add QiaoDian qd43003c0-40
of: Add vendor prefix for QiaoDian Xianshi
drm/panel: add kernel doc for size attributes in panel_desc
drm/panel: simple: Add support for Kyocera TCG121XGLP panel
devicetree: add vendor prefix for Kyocera Corporation
drm/bridge: Remove gratuitous blank line
drm/bridge: dw-hdmi: Use dashes in filenames
drm/panel: Add Sharp LS043T1LE01 MIPI DSI panel
dt-bindings: Add Sharp LS043T1LE01 panel binding
drm/dsi: Add Turn On/Shutdown Peripheral command helpers
drm/panel: Add Panasonic VVX10F034N00 MIPI DSI panel
dt-bindings: Add Panasonic VVX10F034N00 panel binding
drm/panel: simple: Add support for Innolux G121X1-L03
drm/panel: simple: Add support for BOE TV080WUM-NL0
dt-bindings: Add BOE TV080WUM-NL0 panel binding
of: Add vendor prefix for BOE Technology Group
drm/dsi: Add a helper to get bits per pixel of MIPI DSI pixel format
I broke AVI/HDMI/SPD infoframes on HSW+ with the register type
safety changes. We were supposed to check that the infoframe data
register is valid before writing the infoframe data, but the check
ended up inverted, and so in practice we never wrote or enabled
these infoframes.
We were still sending out the GCP infoframe when the sink was
deep-color capable. That and the fact that we use a single
bool to track our infoframe state meant that the state checker
only caught this when a HDMI sink that doesn't do deep-color was
used.
We really need to fix our infoframe state checking to be much
more anal. But in the meantime let's just fix the regression.
In fact let's just throw out the register validity check and
convert some of the "unknown info frame type" debug messages
into MISSING_CASE(). So far we support the same set of infoframe
types on all platforms, so the silent debug messages make no
sense.
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: f0f59a00a1 ("drm/i915: Type safe register read/write")
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (irc)
Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (irc)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450282200-4203-1-git-send-email-ville.syrjala@linux.intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93119
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Document that 'width' and 'height' are measured in millimeters.
Signed-off-by: Ulrich Ölmann <u.oelmann@pengutronix.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Kyocera TCG121XGLP panel is an XGA LCD TFT panel connected through
LVDS, which can be supported by the simple panel driver.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The decision about which source will be used for VBT is done in
intel_parse_bios(), not in the VBT validation function. Make the VBT
validation function strictly about validation, and move the debug
logging to where it logically belongs.
Also split the logging about where the valid VBT was found and what the
signature is. This will make even more sense in the future when the
validation for ACPI OpRegion based VBT takes place at OpRegion setup
time.
v2: Split logging about VBT signature and BDB version.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450178092-27148-1-git-send-email-jani.nikula@intel.com