Commit Graph

254 Commits

Author SHA1 Message Date
Linus Torvalds
8893a6bfff Merge tag 'drm-next-2024-01-15-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "This is just a wrap up of fixes from the last few days. It has the
  proper fix to the i915/xe collision, we can clean up what you did
  later once rc1 lands.

  Otherwise it's a few other i915, a v3d, rockchip and a nouveau fix to
  make GSP load on some original Turing GPUs.

  i915:
   - Fixes for kernel-doc warnings enforced in linux-next
   - Another build warning fix for string formatting of intel_wakeref_t
   - Display fixes for DP DSC BPC and C20 PLL state verification

  v3d:
   - register readout fix

  rockchip:
   - two build warning fixes

  nouveau:
   - fix GSP loading on Turing with different nvdec configuration"

* tag 'drm-next-2024-01-15-1' of git://anongit.freedesktop.org/drm/drm:
  nouveau/gsp: handle engines in runl without nonstall interrupts.
  drm/i915/perf: reconcile Excess struct member kernel-doc warnings
  drm/i915/guc: reconcile Excess struct member kernel-doc warnings
  drm/i915/gt: reconcile Excess struct member kernel-doc warnings
  drm/i915/gem: reconcile Excess struct member kernel-doc warnings
  drm/i915/dp: Fix the max DSC bpc supported by source
  drm/i915: don't make assumptions about intel_wakeref_t type
  drm/i915/dp: Fix the PSR debugfs entries wrt. MST connectors
  drm/i915/display: Fix C20 pll selection for state verification
  drm/v3d: Fix support for register debugging on the RPi 4
  drm/rockchip: vop2: Drop unused if_dclk_rate variable
  drm/rockchip: vop2: Drop superfluous include
2024-01-17 14:59:05 -08:00
Dave Airlie
205e18c135 nouveau/gsp: handle engines in runl without nonstall interrupts.
It appears on TU106 GPUs (2070), that some of the nvdec engines
are in the runlist but have no valid nonstall interrupt, nouveau
didn't handle that too well.

This should let nouveau/gsp work on those.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/all/20240110011826.3996289-1-airlied@gmail.com/
2024-01-15 16:04:48 +10:00
Linus Torvalds
cf65598d59 Merge tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "This contains two major new drivers:

   - imagination is a first driver for Imagination Technologies devices,
     it only covers very specific devices, but there is hope to grow it

   - xe is a reboot of the i915 GPU (shares display) side using a more
     upstream focused development model, and trying to maximise code
     sharing. It's not enabled for any hw by default, and will hopefully
     get switched on for Intel's Lunarlake.

  This also drops a bunch of the old UMS ioctls. It's been dead long
  enough.

  amdgpu has a bunch of new color management code that is being used in
  the Steam Deck.

  amdgpu also has a new ACPI WBRF interaction to help avoid radio
  interference.

  Otherwise it's the usual lots of changes in lots of places.

  Detailed summary:

  new drivers:
   - imagination - new driver for Imagination Technologies GPU
   - xe - new driver for Intel GPUs using core drm concepts

  core:
   - add CLOSE_FB ioctl
   - remove old UMS ioctls
   - increase max objects to accomodate AMD color mgmt

  encoder:
   - create per-encoder debugfs directory

  edid:
   - split out drm_eld
   - SAD helpers
   - drop edid_firmware module parameter

  format-helper:
   - cache format conversion buffers

  sched:
   - move from kthread to workqueue
   - rename some internals
   - implement dynamic job-flow control

  gpuvm:
   - provide more features to handle GEM objects

  client:
   - don't acquire module reference

  displayport:
   - add mst path property documentation

  fdinfo:
   - alignment fix

  dma-buf:
   - add fence timestamp helper
   - add fence deadline support

  bridge:
   - transparent aux-bridge for DP/USB-C
   - lt8912b: add suspend/resume support and power regulator support

  panel:
   - edp: AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49
   - chromebook panel support
   - elida-kd35t133: rework pm
   - powkiddy RK2023 panel
   - himax-hx8394: drop prepare/unprepare and shutdown logic
   - BOE BP101WX1-100, Powkiddy X55, Ampire AM8001280G
   - Evervision VGG644804, SDC ATNA45AF01
   - nv3052c: register docs, init sequence fixes, fascontek FS035VG158
   - st7701: Anbernic RG-ARC support
   - r63353 panel controller
   - Ilitek ILI9805 panel controller
   - AUO G156HAN04.0

  simplefb:
   - support memory regions
   - support power domains

  amdgpu:
   - add new 64-bit sequence number infrastructure
   - add AMD specific color management
   - ACPI WBRF support for RF interference handling
   - GPUVM updates
   - RAS updates
   - DCN 3.5 updates
   - Rework PCIe link speed handling
   - Document GPU reset types
   - DMUB fixes
   - eDP fixes
   - NBIO 7.9/7.11 updates
   - SubVP updates
   - XGMI PCIe state dumping for aqua vanjaram
   - GFX11 golden register updates
   - enable tunnelling on high pri compute

  amdkfd:
   - Migrate TLB flushing logic to amdgpu
   - Trap handler fixes
   - Fix restore workers handling on suspend/resume
   - Fix possible memory leak in pqm_uninit()
   - support import/export of dma-bufs using GEM handles

  radeon:
   - fix possible overflows in command buffer checking
   - check for errors in ring_lock

  i915:
   - reorg display code for reuse in xe driver
   - fdinfo memory stats printing
   - DP MST bandwidth mgmt improvements
   - DP panel replay enabling
   - MTL C20 phy state verification
   - MTL DP DSC fractional bpp support
   - Audio fastset support
   - use dma_fence interfaces instead of i915_sw_fence
   - Separate gem and display code
   - AUX register macro refactoring
   - Separate display module/device parameters
   - Move display capabilities debugfs under display
   - Makefile cleanups
   - Register cleanups
   - Move display lock inits under display/
   - VLV/CHV DPIO PHY register and interface refactoring
   - DSI VBT sequence refactoring
   - C10/C20 PHY PLL hardware readout
   - DPLL code cleanups
   - Cleanup PXP plane protection checks
   - Improve display debug msgs
   - PSR selective fetch fixes/improvements
   - DP MST fixes
   - Xe2LPD FBC restrictions removed
   - DGFX uses direct VBT pin mapping
   - more MTL WAs
   - fix MTL eDP bug
   - eliminate use of kmap_atomic

  habanalabs:
   - sysfs entry to identify a device minor id with debugfs path
   - sysfs entry to expose device module id
   - add signed device info retrieval through INFO ioctl
   - add Gaudi2C device support
   - pcie reset prepare/done hooks

  msm:
   - Add support for SDM670, SM8650
   - Handle the CFG interconnect to fix the obscure hangs / timeouts
   - Kconfig fix for QMP dependency
   - use managed allocators
   - DPU: SDM670, SM8650 support
   - DPU: Enable SmartDMA on SM8350 and SM8450
   - DP: enable runtime PM support
   - GPU: add metadata UAPI
   - GPU: move devcoredumps to GPU device
   - GPU: convert to drm_exec

  ivpu:
   - update FW API
   - new debugfs file
   - a new NOP job submission test mode
   - improve suspend/resume
   - PM improvements
   - MMU PT optimizations
   - firmware profile frequency support
   - support for uncached buffers
   - switch to gem shmem helpers
   - replace kthread with threaded irqs

  rockchip:
   - rk3066_hdmi: convert to atomic
   - vop2: support nv20 and nv30
   - rk3588 support

  mediatek:
   - use devm_platform_ioremap_resource
   - stop using iommu_present
   - MT8188 VDOSYS1 display support

  panfrost:
   - PM improvements
   - improve interrupt handling as poweroff

  qaic:
   - allow to run with single MSI
   - support host/device time sync
   - switch to persistent DRM devices

  exynos:
   - fix potential error pointer dereference
   - fix wrong error checking
   - add missing call to drm_atomic_helper_shutdown

  omapdrm:
   - dma-fence lockdep annotation fix

  tidss:
   - dma-fence lockdep annotation fix
   - support for AM62A7

  v3d:
   - BCM2712 - rpi5 support
   - fdinfo + gputop support
   - uapi for CPU job handling

  virtio-gpu:
   - add context debug name"

* tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drm: (2340 commits)
  drm/amd/display: Allow z8/z10 from driver
  drm/amd/display: fix bandwidth validation failure on DCN 2.1
  drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well
  drm/amd/display: Move fixpt_from_s3132 to amdgpu_dm
  drm/amd/display: Fix recent checkpatch errors in amdgpu_dm
  Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"
  drm/amd/display: avoid stringop-overflow warnings for dp_decide_lane_settings()
  drm/amd/display: Fix power_helpers.c codestyle
  drm/amd/display: Fix hdcp_log.h codestyle
  drm/amd/display: Fix hdcp2_execution.c codestyle
  drm/amd/display: Fix hdcp_psp.h codestyle
  drm/amd/display: Fix freesync.c codestyle
  drm/amd/display: Fix hdcp_psp.c codestyle
  drm/amd/display: Fix hdcp1_execution.c codestyle
  drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init
  drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'
  drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'
  drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c
  drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'
  drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()'
  ...
2024-01-12 11:32:19 -08:00
Dave Airlie
3108cc0323 nouveau/gsp: free userd allocation.
This was being leaked.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-6-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Daniel Vetter
a13fee31f5 Merge v6.7-rc3 into drm-next
Thomas Zimermann needs 8d6ef26501 ("drm/ast: Disconnect BMC if
physical connector is connected") for further ast work in -next.

Minor conflicts in ivpu between 3de6d95978 ("accel/ivpu: Pass D0i3
residency time to the VPU firmware") and 3f7c063492
("accel/ivpu/37xx: Fix hangs related to MMIO reset") changing adjacent
lines.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-11-28 11:55:56 +01:00
Yang Li
b3c5a7de9a drm/nouveau/fifo: Remove duplicated include in chan.c
./drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c: chid.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7603
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122004926.84933-1-yang.lee@linux.alibaba.com
2023-11-22 15:41:13 -05:00
Dave Airlie
ab93edb2f9 nouveau/gsp: allocate enough space for all channel ids.
This probably isn't the ideal fix, but we ended up using chids
sparsely, and lots of things rely on indexing into the full range,
so just allocate the full range up front.

The GSP code fixes 8 channels into a userd page, but we end up using
a single userd page per channel so end up sparsely using the range.

Fixes a few crashes seen with multiple channels.

Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/277
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231121201109.2988516-1-airlied@gmail.com
2023-11-21 22:28:01 +01:00
Dave Airlie
b5bad8c16b nouveau/gsp: move to 535.113.01
This moves the initial effort to the latest 535 firmware.

The gsp msg structs have changed, and the message passing also.
The wpr also seems to have some struct changes.

This version of the firmware will be what we are stuck on for a while,
until we can refactor the driver and work out a better path forward.

Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-11-03 12:57:14 +10:00
Dave Airlie
b76827a3a9 nouveau: fix r535 build on 32-bit arm.
This needs the proper division macros.

Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231030012814.1208972-1-airlied@gmail.com
2023-10-31 15:11:14 +10:00
Ben Skeggs
015185cc67 drm/nouveau/ofa/r535: initial support
Adds support for allocating OFA classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-45-skeggsb@gmail.com
2023-10-31 15:08:19 +10:00
Ben Skeggs
ca9686340a drm/nouveau/nvjpg/r535: initial support
Adds support for allocating NVJPG classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-44-skeggsb@gmail.com
2023-10-31 15:08:19 +10:00
Ben Skeggs
08ab88f5a0 drm/nouveau/nvenc/r535: initial support
Adds support for allocating VIDEO_ENCODER classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-43-skeggsb@gmail.com
2023-10-31 15:08:18 +10:00
Ben Skeggs
142cd60243 drm/nouveau/nvdec/r535: initial support
Adds support for allocating VIDEO_DECODER classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-42-skeggsb@gmail.com
2023-10-31 15:08:17 +10:00
Ben Skeggs
361c3cd8ae drm/nouveau/gr/r535: initial support
Adds support for allocating GR classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-41-skeggsb@gmail.com
2023-10-31 15:08:17 +10:00
Ben Skeggs
b5ce219ab3 drm/nouveau/ce/r535: initial support
Adds support for allocating DMA_COPY classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-40-skeggsb@gmail.com
2023-10-31 15:08:17 +10:00
Ben Skeggs
2a77d015b5 drm/nouveau/fifo/r535: initial support
- Adds support for allocating CHANNEL_GPFIFO classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-39-skeggsb@gmail.com
2023-10-31 15:08:16 +10:00
Ben Skeggs
da1fbcc09e drm/nouveau/fifo/tu102-: prepare for GSP-RM
- (temporarily) disable if GSP-RM detected, will be added later
- add dtor() so GSP-RM paths can cleanup properly
- add alternate engine context mapping interface for RM engines
- add alternate chid interfaces to handle RM USERD oddities

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-26-skeggsb@gmail.com
2023-10-31 15:08:14 +10:00
Ben Skeggs
55e1a59960 drm/nouveau/fifo/ga100-: add per-runlist nonstall intr handling
GSP-RM will enforce this, so implement on HW too so we can share code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-8-skeggsb@gmail.com
2023-07-06 17:22:33 +02:00
Ben Skeggs
84ab065e7a drm/nouveau/fifo/ga100-: remove individual runlists rather than failing oneinit
We're adding better support for the non-stall interrupt, which will need
to fetch the interrupt vector from the runlist's primary engine.

NVKM doesn't support all target engines (ie. NVDEC etc), and it wouldn't
be ideal to completely fail initialisation in this case.

Instead.  Remove runlists where we can't determine all the needed info.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-7-skeggsb@gmail.com
2023-07-06 17:22:33 +02:00
Ben Skeggs
670451c33c drm/nouveau/fifo: return ERR_PTR from nvkm_runl_new()
Callers expect this - not NULL.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-6-skeggsb@gmail.com
2023-07-06 17:22:33 +02:00
Tom Rix
c14bff92ab drm/nouveau/fifo: set nvkm_engn_cgrp_get storage-class-specifier to static
smatch reports
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:33:18:
  warning: symbol 'nvkm_engn_cgrp_get' was not declared. Should it be static?

nvkm_engn_cgrp_get is only used in runl.c, so it should be static

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230228221533.3240520-1-trix@redhat.com
2023-03-16 14:53:15 +01:00
Tom Rix
abe3c66f34 drm/nouveau/fifo: set gf100_fifo_nonstall_block_dump storage-class-specifier to static
gcc with W=1 reports
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: error:
  no previous prototype for ‘gf100_fifo_nonstall_block’ [-Werror=missing-prototypes]
  451 | gf100_fifo_nonstall_block(struct nvkm_event *event, int type, int index)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~

gf100_fifo_nonstall_block is only used in gf100.c, so it should be static

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230303132731.1919329-1-trix@redhat.com
2023-03-16 14:53:15 +01:00
Ben Skeggs
0ceceaa9ae drm/nouveau/fifo: expose function to read engine ctxsw status
Needed to support Ampere differences in gr/gf100-:

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:45:10 +10:00
Ben Skeggs
1ed02c3f2d drm/nouveau/engine: add HAL for engine-specific rc reset procedure
Will be used to improve gr reset on GF100 and newer.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:45:10 +10:00
Ben Skeggs
7f4f35ea5b drm/nouveau/fifo/ga100-: initial support
- replaces the hacked-up version that existed solely to support TTM

v2. remove earlier hack preventing use of non-stall intr for fences

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09 10:44:50 +10:00
Ben Skeggs
06db7fded6 drm/nouveau/fifo: add new channel classes
Exposes a bunch of the new features that became possible as a result
of the earlier commits.  DRM will build on this in the future to add
support for features such as SCG ("async compute") and multi-device
rendering, as part of the work necessary to be able to write a half-
decent vulkan driver - finally.

For the moment, this just crudely ports DRM to the API changes.

- channel class interfaces now the same for all HW classes
- channel group class exposed (SCG)
- channel runqueue selector exposed (SCG)
- channel sub-device id control exposed (multi-device rendering)
- channel names in logging will reflect creating process, not fd owner
- explicit USERD allocation required by VOLTA_CHANNEL_GPFIFO_A and newer
- drm is smarter about determining the appropriate channel class to use

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:50 +10:00
Ben Skeggs
7ac2933281 drm/nouveau/fifo: add new engine object handling
Simplifies the GPU-specific code, completing the switch to newer HALs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
8ab849d6dd drm/nouveau/fifo: add new engine context handling
Builds on the context tracking that was added earlier.

- marks engine context PTEs as 'priv' where possible

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
3647c53bd7 drm/nouveau/fifo: add RAMFC info to nvkm_chan_func
- adds support for specifying SUBDEVICE_ID for channel
- rounds non-power-of-two GPFIFO sizes down, rather than up

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
fbe9f4337c drm/nouveau/fifo: add USERD info to nvkm_chan_func
And use it to cleanup multiple implementations of almost the same thing.

- prepares for non-polled / client-provided USERD
- only zeroes relevant "registers", rather than entire USERD

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
d3e7a4392c drm/nouveau/fifo: add RAMIN info to nvkm_chan_func
Currently provided by {chan,dma,gpfifo}*.c, and those are going away.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
b084fff210 drm/nouveau/fifo: add common runlist control
- less dependence on waiting for runlist updates, on GPUs that allow it
- supports runqueue selector in RAMRL entries
- completes switch to common runl/cgrp/chan topology info

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
4d60100a23 drm/nouveau/fifo: add common channel recovery
That sure was fun to untangle.

- handled per-runlist, rather than globally
- more straight-forward process in general
- various potential SW/HW races have been fixed
- fixes lockdep issues that were present in >=gk104's prior implementation
- volta recovery now actually stands a chance of working
- volta/turing waiting for PBDMA idle before engine reset
- turing using hw-provided TSG info for CTXSW_TIMEOUT

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
0b1bb1296f drm/nouveau/fifo: kill channel on NV_PPBDMA_INTR_1_CTXNOTVALID
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
520db0405e drm/nouveau/fifo: kill channel on a selection of PBDMA errors
A bunch of these can be handled in such a way that the channel can
continue, however, any of these are a pretty decent sign something
has gone horribly wrong, and the safest option is to disable the
channel.

This is a bit of a hack, we will want to handle these individually
and dump relevant debug info for each at some point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
acff941535 drm/nouveau/fifo: add chan/cgrp preempt()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:49 +10:00
Ben Skeggs
67059b9fb8 drm/nouveau/fifo: add chan start()/stop()
- nvkm_chan_error() built on top, stops channel and sends 'killed' event
- removes an odd double-bashing of channel enable regs on kepler and up
- pokes doorbell on turing and up, after enabling channel

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
62742b5ef3 drm/nouveau/fifo: add chan bind()/unbind()
- stops programming (non-existent) runl id field on bind(), from maxwell

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
3a6bc9c242 drm/nouveau/fifo: add runlist block()/allow()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
4a492fd5d2 drm/nouveau/fifo: add runlist wait()
- adds g8x/turing registers, which were missing before
- switches fermi to polled wait, like later hw (see: 4f2fc25c0f8bc...)

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
f48dd29361 drm/nouveau/fifo: add new engine context tracking
Channel groups have somewhat more complicated requirements than what we
currently support.  An engine context is shared between all channels in
a channel group, VEID/subctx support (later) brings per-VEID components,
and we need to track an individual channel's engine context pointers.

This commit adds the structures and refcounting to support the above,
wrapping the prior implementation for the moment.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
c358f53871 drm/nouveau/fifo: add new channel lookup interfaces
- supports per-runlist CHIDs
- channel group lock held across reference, rather than global lock

v2:
- remove unnecessary parenthesis

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
e43c872c03 drm/nouveau/fifo: merge mmu fault handlers together
After updating GF100 implementation from the GK104/TU102 ones, and using
the new runlist/engine topology info, all three handlers become (almost)
identical.

- there's a temporary kludge to call through to the HW-specific recovery
- engine fault mapping info determined at load time, not on every fault

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
923f1ff527 drm/nouveau/fifo: move PBDMA intr to runq
- merges gf100/gk104- NV_PFIFO_INTR_0_PBDMA and NV_PPBDMA_INTR_0 code

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
87c8602431 drm/nouveau/fifo: move PBDMA init to runq
- bumps pbdma timeout to value RM uses on newer HW
- bumps fb timeout to max from boot default
- one/both of these greatly improves stability on // piglit runs

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
324176e7c8 drm/nouveau/fifo: program NV_PFIFO_FB_TIMEOUT on init
NVGPU and RM both program this value.

Fixes a bunch of random hangs running parallel piglit.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
965c41d911 drm/nouveau/fifo: tidy global PBDMA init
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09 10:44:48 +10:00
Ben Skeggs
d67f3b9646 drm/nouveau/fifo: tidy up non-stall intr handling
- removes a layer of indirection in the intr handling
- prevents non-stall ctrl racing with unknown intrs

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:47 +10:00
Ben Skeggs
2fc71a0566 drm/nouveau/fifo: use explicit intr interfaces
More control, and shallower call-chain to get to the point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:47 +10:00
Ben Skeggs
0fc72ee9d8 drm/nouveau/fifo: use runlist engine info to lookup engine classes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09 10:44:47 +10:00