If the end of the system DMA window is farther away from the start of
physical RAM than the size of the GPU linear window, move the linear
window so that it ends at the same address than the system DMA window.
This allows to map command buffer from CMA, which is likely to reside
at the end of the system DMA window, while also overlapping as much
RAM as possible, in order to optimize regular buffer mappings through
the linear window.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The retire worker is kicked for each fence, either the normal way
by signaling the fence from the event completion interrupt or by
the recover worker if the GPU got stuck. Moving the RPM put into
the retire worker allows us to have it in a single place for
both cases.
This also shaves off quite a bit of the CPU time spent in hardirq
context, as arming the autosuspend timer when the RPM refcount
drops to 0 is a relatively costly operation.
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Ignore GPUs with a 2.0 front end. These have a different register
layout for the front end, which provokes imprecise aborts from the
register accesses in the 'gpu' debugfs file.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The hardware description macros define the mask and shifts the wrong
way around for the intended use, leading to the condition never being
true and the chip revision ending up with the wrong value.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>