Always true these days. It has been added originally to work
around some issues with the agp layer in 2.6.29:
commit ac5c4e7618
Author: Dave Airlie <airlied@redhat.com>
Date: Fri Dec 19 15:38:34 2008 +1000
drm/i915: GEM on PAE has problems - disable it for now.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We always set it so there's no point in checking. We could
instead add a bit that tells us whether gem is actually
initialized (i.e. either kms or gem_init_ioctl called), but
that's imho not worth it.
So just rip it out.
There's a little change in the wait_ring timeout, but we've never
run with anything else than the 60 second timeout, even on dri1
userspace.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This ioctl used in a kms driver is only useful to create massive
havoc.
v2: Bail out with -ENODEV as suggested by Chris Wilson.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Also ditch the cargo-culted dev_priv checks - either we have a
giant hole in our setup code or this is useless. Plainly bogus
to check for it in either case.
v2: Chris Wilson noticed that I've missed one bogus dev_priv check.
v3: The check in the overlay code is redundant (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The specs recommend that this bit be set on PineView. No reason is
given, but it sounds like a powersaving bit that we should expect the
BIOS to be setting...
v2: Rebase on top of _MASKED_ENABLE_BIT
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We slightly modify the initialisation sequence to move the
initialisation of the memory managers earlier and in particular before
probing outputs and detecting any existing output configuration. This is
essential if we wish to track preallocated objects and preserve them
whilst initialising GEM.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The use of the mm_list by deferred-free breaks the following patches to
extend the range of objects tracked. We can simplify things if we just
make the unbind during free uninterrutible.
Note that unbinding should never fail, because we hold an additional
reference on every active object. Only the ilk vt-d workaround breaks
this, but already takes care of not failing by waiting for the gpu to
quiescent non-interruptible. But the existence of the deferred free
list casted some doubts on this theory, hence WARN if the unbind fails
and only then retry non-interruptible.
We can kill this additional code after a release in case the theory is
indeed right and no one has hit that WARN.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Simplify object tracking by removing the inactive but pinned list. The
only place where this was used is for counting the available memory,
which is just as easy performed by checking all objects on the rare
occasions it is required (application startup). For ease of debugging,
we keep the reporting of pinned objects through the error-state and
debugfs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was only used by one external caller who would just be as happy
with evict-everything, so perform the replacement and make the function
private.
In the process we note that unbinding the inactive list should not fail,
and make it a warning instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, we only bump the inactive LRU of an object when we bind
into the GTT for a page-fault. As the object may be used many times
before its mapping is zapped, we do not mark it as active as
frequently as we should. Userspace should be calling set-to-GTT-domain
before each pointer deference (for synchronous access) and so is a
good place to mark the buffer as active.
Marking the buffer as recently used places it at the end of the
inactive eviction queue, though still before anything with outstanding
rendering. This reduces the likelihood of evicting a buffer that is
going to be used again by the CPU in the near future. This way we can
hopefully avoid to kick out upload buffers right before we use them on
the gpu.
Note that we need to check that the object is not active or pinned,
for otherwise we create havoc on the active/pinned lists, which also
use obj->mm_list.
The active lists are sorted by and evicted in last GPU rendering
order, access by the CPU to a still active buffer therefore does not
affect its eviction ordering. Pinned objects are currently excluded
from eviction, therefore the only list that we need to bump for GTT
access by the CPU is the inactive list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added further explanations to the commit message as discussed
on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We were attempting to use a per-ring spinlock whilst modifying global
IRQ flags. A recipe for rare missed interrupts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Copy&pasted from the vlv setup code. According to docs, we need that
on ivb, too.
v2: Use new masked bit handling macros.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... and put them to so good use.
Note that there's functional change in vlv clock gating code, we now
no longer spuriously read back the current value of the bit. According
to Bspec the high bits should always read zero, so ORing this in
should have no effect.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
PCH PLLs aren't required for outputs on the CPU, so we shouldn't just
treat them as part of the pipe.
So split the code out and manage PCH PLLs separately, allocating them
when needed or trying to re-use existing PCH PLL setups when the timings
match.
v2: add num_pch_pll field to dev_priv (Daniel)
don't NULL the pch_pll pointer in disable or DPMS will fail (Jesse)
put register offsets in pll struct (Chris)
v3: Decouple enable/disable of PLLs from get/put.
v4: Track temporary PLL disabling during modeset
v5: Tidy PLL initialisation by only checking for num_pch_pll == 0 (Eugeni)
v6: Avoid mishandling allocation failure by embedding the small array of
PLLs into the device struct
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44309
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> (up to v2)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3+)
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Tested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
gen2 hardware has some significant differences from the other interrupt
routines that were glossed over and then forgotten about in the
transition to KMS. Such as
- 16bit IIR
- PendingFlip status bit
This patch reintroduces a handler specifically for gen2 for the purpose
of handling pageflips correctly, simplifying code in the process.
v2: Also fixup ring get/put irq to only access 16bit registers (Daniel)
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24202
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41793
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: use posting_read16 in intel_ringbuffer.c and kill _driver
from the function names.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we fail to unbind and so abort the change in tiling, we will have
removed the VMA for the object for no reason. The likelihood of unbind
failing is slim (other than ERESTARTSYS which will cause userspace to
try again), so the change is mostly for the principle.
Also improve the slightly stale comment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename obj->tiling_changed to obj->fence_dirty so that it is clear that
it flags when the parameters for an active fence (including the
no-fence) register are changed.
Also, do not set this flag when the object does not have a fence
register allocated currently and the gpu does not depend upon the
unfence. This case works exactly like when a tiled object lost its
fence and hence does not need additional handling for the tiling
change in the code.
v2: Use fence_dirty to better express what the flag tracks and add a few
more details to the comments to serve as a reminder of how the GPU also
uses the unfenced register slot.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add some bikeshed to the commit message about the stricter
use of fence_dirty.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When fixing up the crt load detect code I've failed to notice the same
problem in the tv load detect code. Again, unconditionally use the
load detect pipe infrastructure, it gets things right.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As with one of the earlier patches in the series, we're forced to cast
for copy_[to|from]_user. Again because of the nature of the GEN x86
exclusivity, this should be safe.
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
[danvet: Added some bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These were mostly straight forward. No forced casting needed.
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
[danvet: fix conflict with ringbuffer_data removal and drop the hunk
about the status page - that needs more care to fix up.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the exception of a forced cast for phys_obj stuff (a problem in
other patches as well) all of these are fairly simple __iomem compliance
fixes.
As with other patches, yank/paste errors may exist.
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
[danvet: Added comment to explain the __iomem cast.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Almost all of the errors related __iomem problems.
Most of the changes here are trivial, however there is plenty of chance
for yank/paste errors.
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was originally used as an attempt to diagnose GPU hangs, but was
never very reliable and superseded by the i915_error_state capture on
hangcheck. It now lies languishing unused and unwanted.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
IvyBridge requires an extra frame between disabling the low power
watermarks and enabling scaling on the sprite plane. If the scaling
is already enabled, then we have already disabled the low power
watermarks and need not incur an extra wait.
Similarly, as we disable the scaling when turning off the sprite plane,
we can update the scaling enabled flag and restore the low power
watermarks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pretty useful to debug our DP bandwidth woes.
v2: Also print out the required and available link bandwidth,
suggested by Chris Wilson.
v3: Also print out the input parameters so that diagnosing failures to
find a valid dp link configuration is possible.
v4: s/Display port/DP/ to shorten the output.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On SandyBridge IPS was entirely implemented in hardware and not reliant
on the driver monitoring power consumption and feeding back desired run
states, so the hardware is able to adapt quicker and more flexibly. Which
is a huge relief for us as we no longer have to carry empirically
derived magic algorithms.
Yet despite the advance in technology, the driver was still doing its
IPS polling on all machines. Restrict it to the only supported hardware,
Clarkdale/Arrandale.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49025
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in
OpenGL programs such as Google MapsGL, Google Earth, and gzdoom when
using separate stencil buffers. Without it, the GPU tries to use the
LRA eviction policy, which isn't supported. This was supposed to be off
by default, but seems to be on for many machines.
This cannot be done in gen6_init_clock_gating with most of the other
workaround bits; the render ring needs to exist. Otherwise, the
register write gets dropped on the floor (one printk will show it
changed, but a second printk immediately following shows the value
reverts to the old one).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535
Cc: stable@vger.kernel.org
Cc: Rob Castle <futuredub@gmail.com>
Cc: Eric Appleman <erappleman@gmail.com>
Cc: aaron667@gmx.net
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We seem to have a decent confusion between the output timings and the
input timings of the sdvo encoder. If I understand the code correctly,
we use the original mode unchanged for the output timings, safe for
the lvds case. And we should use the adjusted mode for input timings.
Clarify the situation by adding an explicit output_dtd to the sdvo
mode_set function and streamline the code-flow by moving the input and
output mode setting in the sdvo encode together.
Furthermore testing showed that the sdvo input timing needs the
unadjusted dotclock, the sdvo chip will automatically compute the
required pixel multiplier to get a dotclock above 100 MHz.
Fix this up when converting a drm mode to an sdvo dtd.
This regression was introduced in
commit c74696b9c8
Author: Pavel Roskin <proski@gnu.org>
Date: Thu Sep 2 14:46:34 2010 -0400
i915: revert some checks added by commit 32aad86f
particularly the following hunk:
diff --git a/drivers/gpu/drm/i915/intel_sdvo.c
b/drivers/gpu/drm/i915/intel_sdvo.c
index 093e914..62d22ae 100644
--- a/drivers/gpu/drm/i915/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/intel_sdvo.c
@@ -1122,11 +1123,9 @@ static void intel_sdvo_mode_set(struct drm_encoder *encoder,
/* We have tried to get input timing in mode_fixup, and filled into
adjusted_mode */
- if (intel_sdvo->is_tv || intel_sdvo->is_lvds) {
- intel_sdvo_get_dtd_from_mode(&input_dtd, adjusted_mode);
+ intel_sdvo_get_dtd_from_mode(&input_dtd, adjusted_mode);
+ if (intel_sdvo->is_tv || intel_sdvo->is_lvds)
input_dtd.part2.sdvo_flags = intel_sdvo->sdvo_flags;
- } else
- intel_sdvo_get_dtd_from_mode(&input_dtd, mode);
/* If it's a TV, we already set the output timing in mode_fixup.
* Otherwise, the output timing is equal to the input timing.
Due to questions raised in review, below a more elaborate analysis of
the bug at hand:
Sdvo seems to have two timings, one is the output timing which will be
sent over whatever is connected on the other side of the sdvo chip (panel,
hdmi screen, tv), the other is the input timing which will be generated by
the gmch pipe. It looks like sdvo is expected to scale between the two.
To make things slightly more complicated, we have a bunch of special
cases:
- For lvds panel we always use a fixed output timing, namely
intel_sdvo->sdvo_lvds_fixed_mode, hence that special case.
- Sdvo has an interface to generate a preferred input timing for a given
output timing. This is the confusing thing that I've tried to clear up
with the follow-on patches.
- A special requirement is that the input pixel clock needs to be between
100MHz and 200MHz (likely to keep it within the electromechanical design
range of PCIe), 270MHz on later gen4+. Lower pixel clocks are
doubled/quadrupled.
The thing this patch tries to fix is that the pipe needs to be
explicitly instructed to double/quadruple the pixels and needs the
correspondingly higher pixel clock, whereas the sdvo adaptor seems to
do that itself and needs the unadjusted pixel clock. For the sdvo
encode side we already set the pixel mutliplier with a different
command (0x21).
This patch tries to fix this mess by:
- Keeping the output mode timing in the unadjusted plain mode, safe
for the lvds case.
- Storing the input timing in the adjusted_mode with the adjusted
pixel clock. This way we don't need to frob around with the core
crtc mode set code.
- Fixing up the pixelclock when constructing the sdvo dtd timing
struct. This is why the first hunk of the patch is an integral part
of the series.
- Dropping the is_tv special case because input_dtd is equivalent to
adjusted_mode after these changes. Follow-up patches clear this up
further (by simply ripping out intel_sdvo->input_dtd because it's
not needed).
v2: Extend commit message with an in-depth bug analysis.
Reported-and-Tested-by: Bernard Blackham <b-linuxgit@largestprime.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48157
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On 32-bit systems, a large args->num_cliprects from userspace via ioctl
may overflow the allocation size, leading to out-of-bounds access.
This vulnerability was introduced in commit 432e58ed ("drm/i915: Avoid
allocation for execbuffer object list").
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On 32-bit systems, a large args->buffer_count from userspace via ioctl
may overflow the allocation size, leading to out-of-bounds access.
This vulnerability was introduced in commit 8408c282 ("drm/i915:
First try a normal large kmalloc for the temporary exec buffers").
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Somehow we have a fast-path that tries to avoid going through
the load-detect code when the encode already has a crtc associated.
But this fails horribly when the crtc is off. The load detect pipe
itself manages this case well (and also does not forget to restore the
dpms state), so just rip out this special case.
The issue seems to go back all the way to the commit that originally
introduced load-detection on the vga output:
commit e4a5d54f92
Author: Ma Ling <ling.ma@intel.com>
Date: Tue May 26 11:31:00 2009 +0800
drm/i915: Add support for VGA load detection (pre-945).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43020
Reported-by: Jean Delvare <khali@linux-fr.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This continues the theme started with vm_brk() and vm_munmap():
vm_mmap() does the same thing as do_mmap(), but additionally does the
required VM locking.
This uninlines (and rewrites it to be clearer) do_mmap(), which sadly
duplicates it in mm/mmap.c and mm/nommu.c. But that way we don't have
to export our internal do_mmap_pgoff() function.
Some day we hopefully don't have to export do_mmap() either, if all
modular users can become the simpler vm_mmap() instead. We're actually
very close to that already, with the notable exception of the (broken)
use in i810, and a couple of stragglers in binfmt_elf.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It looks like we also need to flush the render cache when we just
invalidate it. This fixes a regression in i-g-t/gem_tiled_blits on my
i855gm. I guess the render cache there is virtually indexed, so we
need to clean it when changing gtt mappings.
This regression has been introduce in
commit 46f0f8d120
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Apr 18 11:12:11 2012 +0100
drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When the change to start adjusting the sync polarity of the LVDS mode
was introduced in
commit aa9b500ddf
Author: Bryan Freed <bfreed@google.com>
Date: Wed Jan 12 13:43:19 2011 -0800
drm/i915: Honour LVDS sync polarity from EDID
we made the change in state verbose so that we could quickly spot any
regressions that made have also been introduced with it. As there do not
appear to have been any, remove the extra logging.
v2: Remove the no longer used variables.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds intel_pm routine for generic power-related infrastructure
initialization.
v2: now that all the platform-specific stuff is initialized in one place, we
can also add back the static definitions to platform-specific functions which
we abstract now.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This moves the clock gating-related functions into intel_pm module.
Also, please note that we do change the function type from static to
non-static in this patch for the move, to prevent breaking bisecting with
non-working intermediate commit. Those are returned back to static form in
the following patch which setups a generic PM initialization function,
which was split into a different one to simplify review.
v2: rebase on top of latest drm-intel-next-queued to incorporate all the
changes that went there meanwhile.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This moves DRPS, RPS and RC6-related functionality into intel_pm module.
It also removes the linux/cpufreq.h include from intel_display, as its
only user was the GPU turbo-related functionality in Gen6+ code path.
v2: rebase on top of latest drm-intel-next-queued adding the bits that
shifted around since the last patch.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The previous patch had way too long lines, this fixes them to fit into a
reasonable screen space.
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We can also take advantage of the new 'no retire' mode for seqno waiting
to avoid having to take a reference on the old fence object whilst
flushing an existing fence.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we have a routine that is able to clear the fences as well as
setup up the register for a tiled object, remove the surplus routines to
clear the fences.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>