Commit Graph

113 Commits

Author SHA1 Message Date
Dan Carpenter
2579b8b0ec drm/nouveau/fifo/gk104-: Silence a locking warning
Presumably we can never actually hit this return, but static checkers
complain that we should unlock before we return.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-05-17 09:09:41 +10:00
Ilia Mirkin
eef4988ab4 drm/nouveau/fifo/nv40: no ctxsw for pre-nv44 mpeg engine
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-29 22:39:23 +10:00
Alexandre Courbot
af3a4f7efb drm/nouveau/fifo: add GP10B support
GP10B's FIFO is similar to GP100's, but only allows 512 channels.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-04-06 14:39:04 +10:00
Ben Skeggs
75d115f2aa drm/nouveau/fifo/gk104-: preempt recovery
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:16 +10:00
Ben Skeggs
3ebef76a1d drm/nouveau/fifo/gk104-: trigger mmu fault before attempting engine recovery
Greatly improves the chances of recovering the GPU from a CTXSW_TIMEOUT.

Tested with piglit's arb_shader_image_load_store-atomicity, which causes
GR to hang in such a way that recovery failed (CTXSW_TIMEOUT continually
re-triggers).

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:15 +10:00
Ben Skeggs
03f16f5f27 drm/nouveau/fifo/gk104-: ACK SCHED_ERROR before attempting CTXSW_TIMEOUT recovery
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:15 +10:00
Ben Skeggs
91b9d659ab drm/nouveau/fifo/gk104-: directly use new recovery code for ctxsw timeout
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:14 +10:00
Ben Skeggs
3534821df5 drm/nouveau/fifo/gk104-: directly use new recovery code for mmu faults
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:14 +10:00
Ben Skeggs
eaa5ed65ee drm/nouveau/fifo/gk104-: reset all engines a killed channel is still active on
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:13 +10:00
Ben Skeggs
0faaa47d44 drm/nouveau/fifo/gk104-: refactor recovery code
This will serve as a basis for implementing some improvements to how
we recover the GPU from channel errors.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:13 +10:00
Ben Skeggs
ec5c6bda19 drm/nouveau/fifo/gk104-: better detection of chid when parsing engine status
The previous commit simply changes the interface, but should result in
the same behaviour as previously.  This commit has been split out from
it as it can result in a different channel being selected.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:12 +10:00
Ben Skeggs
b88917fe0f drm/nouveau/fifo/gk104-: separate out engine status parsing
We'll be wanting to reuse this logic in more places.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:12 +10:00
Ben Skeggs
21e6de29bb drm/nouveau/fifo: add an api for initiating channel recovery
This will be used by callers outside of fifo interrupt handlers.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:12 +10:00
Ben Skeggs
ff9f29abf0 drm/nouveau/fifo/gf100-: provide notification to user if channel is killed
There are instances (such as non-recoverable GPU page faults) where
NVKM decides that a channel's context is no longer viable, and will
be removed from the runlist.

This commit notifies the owner of the channel when this happens, so
it has the opportunity to take some kind of recovery action instead
of hanging.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:08 +10:00
Ben Skeggs
40cea73984 drm/nouveau/fifo/g84-: rename non-stall interrupt event
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:08 +10:00
Ben Skeggs
e774055a07 drm/nouveau/fifo: tidy up channel creation event code
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:08 +10:00
Ben Skeggs
d2ee360564 drm/nouveau/core/memory: distinguish between coherent/non-coherent targets
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 15:15:01 +10:00
Ben Skeggs
83e85d91b2 drm/nouveau/dma: lookup objects with nvkm_object_search()
Custom code is no longer needed here.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 15:14:57 +10:00
Ben Skeggs
ec884f74f1 drm/nouveau/fifo/gf100-: recover from host mmu faults
This has been on the TODO list for a while now, recovering from things
such as attempting to execute a push buffer or touch a semaphore in an
unmapped memory area.

The only thing required on the HW side here is that the offending
channel is removed from the runlist, and *not* a full reset of PFIFO.

This used to be a bit messier to handle before the rework to make use
of engine topology info, but is apparently now trivial.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13 11:38:51 +10:00
Ben Skeggs
b27add13f5 drm/nouveau/fifo/gf100-: protect channel preempt with subdev mutex
This avoids an issue that occurs when we're attempting to preempt multiple
channels simultaneously.  HW seems to ignore preempt requests while it's
still processing a previous one, which, well, makes sense.

Fixes random "fifo: SCHED_ERROR 0d []" + GPCCS page faults during parallel
piglit runs on (at least) GM107.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-11-07 14:05:13 +10:00
Baoyou Xie
e08a1d97d3 drm/nouveau: mark symbols static where possible
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:29:1: warning: no previous prototype for 'nvbios_fan_table' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:56:1: warning: no previous prototype for 'nvbios_fan_entry' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:184:1: warning: no previous prototype for 'gt215_clk_info' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:99:1: warning: no previous prototype for 'gt215_link_train_calc' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:153:1: warning: no previous prototype for 'gt215_link_train' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:271:1: warning: no previous prototype for 'gt215_link_train_init' [-Wmissing-prototypes]
....

In fact, both functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-11-07 14:04:36 +10:00
Ilia Mirkin
666ca3d8f1 drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-09-22 17:32:20 +10:00
Ben Skeggs
e8ff979492 drm/nouveau/fifo/gp100: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-07-14 11:53:25 +10:00
Ben Skeggs
1fe8c02fbc drm/nouveau/fifo/gk104-: translate engidx into human-readable name in debug output
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-07-14 11:53:25 +10:00
Ben Skeggs
952eb819e3 drm/nouveau/top: take nvkm_device as argument to public functions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-07-14 11:53:25 +10:00
Ben Skeggs
0cdc3fdfb7 drm/nouveau/fifo/gm107-: remove engines from mmu engine mapping array
These are specified by PTOP on Maxwell GPUs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
289e082706 drm/nouveau/fifo/gk104-: identify mmu engine ids for host faults
It appears these don't map to PBDMAs (at least on Kepler, it may or may
be valid for Fermi - this hasn't been checked), but to runlists.

This drops the NVKM_ENGINE_FIFO data from the entries too, as resetting
all of PFIFO is *not* the way to handle such faults.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
e50d0237fc drm/nouveau/fifo/gk104-: implement support for PTOP fault info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
91419acf78 drm/nouveau/fifo/gk104-: abstract mmu fault data structures
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
98ac3f061a drm/nouveau/fifo/gk104-: subclass func
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
e93e198d46 drm/nouveau/fifo/gk104-: use device info from top subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
56d06fa29e drm/nouveau/core: remove pmc_enable argument from subdev ctor
These are now specified directly in the MC subdev.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
7c4f87c9e5 drm/nouveau/fifo/gm107: KeplerChannelGpfifoB, and 2048 channels
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:48 +10:00
Ben Skeggs
63f8c9b7f6 drm/nouveau/fifo/gk110: expose KeplerChannelGpfifoB
This class supports a WFI method (0x0078) that's not present on the
KeplerChannelGpfifoA class.

The binary driver exposes both classes on these GPUs for some reason,
though there doesn't appear to be any difference in the setup that's
done for each (ie. even if you allocate GpfifoA, the WFI method will
still work).

We shall just expose GpfifoB, as I don't see a good reason to report
the presence of both.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:48 +10:00
Ben Skeggs
b4c5fc4b85 drm/nouveau/fifo/gk104: submit NOP after all PBDMA_INTR_0, not just DEVICE
Prevents the same interrupt from re-triggering forever.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:47 +10:00
Ben Skeggs
4a3f63f808 drm/nouveau/fifo/gk104: add vic plumbing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:47 +10:00
Ben Skeggs
a8b005fd52 drm/nouveau/fifo/gk104: add sec plumbing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:46 +10:00
Ben Skeggs
608fd040b7 drm/nouveau/fifo/gk104: add nvdec plumbing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:46 +10:00
Ben Skeggs
9e4fff3205 drm/nouveau/fifo/gk104: add nvenc plumbing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:46 +10:00
Ben Skeggs
5d7fa4de46 drm/nouveau/fifo/gk104: add msenc plumbing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:45 +10:00
Ben Skeggs
1f5ff7f52b drm/nouveau/fifo/gk104: make use of topology info during gpfifo construction
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:42 +10:00
Ben Skeggs
19f89279fa drm/nouveau/fifo/gk104: make use of topology info during fault recovery
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:42 +10:00
Ben Skeggs
af83a67779 drm/nouveau/fifo/gk104: make use of topology info when handling ctxsw timeout
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:41 +10:00
Ben Skeggs
41e5171ba8 drm/nouveau/fifo/gk104: read device topology information from hw
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:41 +10:00
Ben Skeggs
69aa40e276 drm/nouveau/fifo/gk104: cosmetic engine->runlist changes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:40 +10:00
Ben Skeggs
acdf7d4f7e drm/nouveau/fifo/gk104: don't attempt recovery of unknown mmu engines
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:40 +10:00
Ben Skeggs
55252da161 drm/nouveau/fifo/gk104: identify fault-recovery members more clearly
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:39 +10:00
Ben Skeggs
6d39b83f13 drm/nouveau/fifo/gk104: rename spoon to pbdma, and move detection to oneinit
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:39 +10:00
Ben Skeggs
1015d81122 drm/nouveau/fifo/gf100: fix certain engines not being recovered after a fault
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:38 +10:00
Ben Skeggs
f22d7d45fa drm/nouveau/fifo/gf100: don't attempt recovery of unknown mmu engines
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:38 +10:00