Commit Graph

1141 Commits

Author SHA1 Message Date
Dave Airlie
5493bf2d0f Merge tag 'drm-misc-fixes-2024-04-18' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

nouveau:
- dp: Don't probe DP ports twice
- nv04: Fix OOB access
- nv50: Disable AUX bus for disconnected DP ports
- nvkm: Fix race condition

panel:
- Don't unregister DSI devices in several drivers

ttm:
- Stop pooling cached NUMA pages

v3d:
- Fix enabled_ns increment

vmwgfx:
- Fix PRIME import/export
- Fix CRTC's atomic check for primary planes
- Sort plane formats by preference

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418072229.GA8983@localhost.localdomain
2024-04-19 10:22:28 +10:00
Dave Airlie
fff1386cc8 nouveau: fix instmem race condition around ptr stores
Running a lot of VK CTS in parallel against nouveau, once every
few hours you might see something like this crash.

BUG: kernel NULL pointer dereference, address: 0000000000000008
PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1
RSP: 0000:ffffac20c5857838 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001
RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180
RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10
R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c
R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c
FS:  00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:

...

 ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
 ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
 nvkm_vmm_iter+0x351/0xa20 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __lock_acquire+0x3ed/0x2170
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand
wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in
the nv50_instobj_acquire called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around
the same time, and Thread A hits the refcount_set line, and in
lockstep thread B succeeds at refcount_inc_not_zero, there is a
chance the ptrs value won't have been stored since refcount_set
is unordered. Force a memory barrier here, I picked smp_mb, since
we want it on all CPUs and it's write followed by a read.

v2: use paired smp_rmb/smp_wmb.

Cc: <stable@vger.kernel.org>
Fixes: be55287aa5 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
2024-04-15 10:51:28 +02:00
Dave Airlie
1b24b3cd1a Merge tag 'drm-misc-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

ast:
- Fix soft lockup

client:
- Protect connector modes with mode_config mutex

host1x:
- Do not setup DMA for virtual addresses

ivpu:
- Fix deadlock in context_xa
- PCI fixes
- Fixes to error handling

nouveau:
- gsp: Fix OOB access
- Fix casting

panfrost:
- Fix error path in MMU code

qxl:
- Revert "drm/qxl: simplify qxl_fence_wait"

vmwgfx:
- Enable DMA for SEV mappings

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411073403.GA9895@localhost.localdomain
2024-04-12 05:35:46 +10:00
Dave Airlie
718c4fb221 nouveau: fix devinit paths to only handle display on GSP.
This reverts:
nouveau/gsp: don't check devinit disable on GSP.
and applies a further fix.

It turns out the open gpu driver, checks this register,
but only for display.

Match that behaviour and in the turing path only disable
the display block. (ampere already only does displays).

Fixes: 5d4e8ae6e5 ("nouveau/gsp: don't check devinit disable on GSP.")
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240408064243.2219527-1-airlied@gmail.com
2024-04-09 13:14:13 +10:00
Arnd Bergmann
185fdb4697 nouveau: fix function cast warning
Calling a function through an incompatible pointer type causes breaks
kcfi, so clang warns about the assignment:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
   73 |         .fini = (void(*)(void *))kfree,

Avoid this with a trivial wrapper.

Fixes: c39f472e9f ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org
2024-04-05 18:30:29 +02:00
Kees Cook
838ae9f45c nouveau/gsp: Avoid addressing beyond end of rpc->entries
Using the end of rpc->entries[] for addressing runs into both compile-time
and run-time detection of accessing beyond the end of the array. Use the
base pointer instead, since was allocated with the additional bytes for
storing the strings. Avoids the following warning in future GCC releases
with support for __counted_by:

In function 'fortify_memcpy_chk',
    inlined from 'r535_gsp_rpc_set_registry' at ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1123:3:
../include/linux/fortify-string.h:553:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  553 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

for this code:

	strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
	...
                memcpy(strings, r535_registry_entries[i].name, name_len);

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240330141159.work.063-kees@kernel.org
2024-04-05 18:30:29 +02:00
Linus Torvalds
7ee0490121 Merge tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu,
  it's xe and nouveau then a few scattered core fixes.

  core:
   - fix rounding in drm_fixp2int_round()

  bridge:
   - fix documentation for DRM_BRIDGE_OP_EDID

  sun4i:
   - fix 64-bit division on 32-bit architectures

  tests:
   - fix dependency on DRM_KMS_HELPER

  probe-helper:
   - never return negative values from .get_modes() plus driver fixes

  xe:
   - invalidate userptr vma on page pin fault
   - fail early on sysfs file creation error
   - skip VMA pinning on xe_exec if no batches

  nouveau:
   - clear bo resource bus after eviction
   - documentation fixes
   - don't check devinit disable on GSP

  amdgpu:
   - Freesync fixes
   - UAF IOCTL fixes
   - Fix mmhub client ID mapping
   - IH 7.0 fix
   - DML2 fixes
   - VCN 4.0.6 fix
   - GART bind fix
   - GPU reset fix
   - SR-IOV fix
   - OD table handling fixes
   - Fix TA handling on boards without display hardware
   - DML1 fix
   - ABM fix
   - eDP panel fix
   - DPPCLK fix
   - HDCP fix
   - Revert incorrect error case handling in ioremap
   - VPE fix
   - HDMI fixes
   - SDMA 4.4.2 fix
   - Other misc fixes

  amdkfd:
   - Fix duplicate BO handling in process restore"

* tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits)
  drm/amdgpu/pm: Don't use OD table on Arcturus
  drm/amdgpu: drop setting buffer funcs in sdma442
  drm/amd/display: Fix noise issue on HDMI AV mute
  drm/amd/display: Revert Remove pixle rate limit for subvp
  Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode"
  Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()"
  drm/amd/display: Add a dc_state NULL check in dc_state_release
  drm/amd/display: Return the correct HDCP error code
  drm/amd/display: Implement wait_for_odm_update_pending_complete
  drm/amd/display: Lock all enabled otg pipes even with no planes
  drm/amd/display: Amend coasting vtotal for replay low hz
  drm/amd/display: Fix idle check for shared firmware state
  drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane
  drm/amd/display: Init DPPCLK from SMU on dcn32
  drm/amd/display: Add monitor patch for specific eDP
  drm/amd/display: Allow dirty rects to be sent to dmub when abm is active
  drm/amd/display: Override min required DCFCLK in dml1_validate
  drm/amdgpu: Bypass display ta if display hw is not available
  drm/amdgpu: correct the KGQ fallback message
  drm/amdgpu/pm: Check the validity of overdiver power limit
  ...
2024-03-21 19:04:31 -07:00
Dave Airlie
5d4e8ae6e5 nouveau/gsp: don't check devinit disable on GSP.
GSP should be handling this and I can see no evidence in opengpu
driver that this register should be touched.

Fixed acceleration on 2080 Ti GPUs.

Fixes: 15740541e8 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM")

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314014521.2695233-1-airlied@gmail.com
2024-03-19 14:34:55 +01:00
Linus Torvalds
480e035fc4 Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "Highlights are usual, more AMD IP blocks for future hw, i915/xe
  changes, Displayport tunnelling support for i915, msm YUV over DP
  changes, new tests for ttm, but its mostly a lot of stuff all over the
  place from lots of people.

  core:
   - EDID cleanups
   - scheduler error handling fixes
   - managed: add drmm_release_action() with tests
   - add ratelimited drm debug print
   - DPCD PSR early transport macro
   - DP tunneling and bandwidth allocation helpers
   - remove built-in edids
   - dp: Avoid AUX transfers on powered-down displays
   - dp: Add VSC SDP helpers

  cross drivers:
   - use new drm print helpers
   - switch to ->read_edid callback
   - gem: add stats for shared buffers plus updates to amdgpu, i915, xe

  syncobj:
   - fixes to waiting and sleeping

  ttm:
   - add tests
   - fix errno codes
   - simply busy-placement handling
   - fix page decryption

  media:
   - tc358743: fix v4l device registration

  video:
   - move all kernel parameters for video behind CONFIG_VIDEO

  sound:
   - remove <drm/drm_edid.h> include from header

  ci:
   - add tests for msm
   - fix apq8016 runner

  efifb:
   - use copy of global screen_info state

  vesafb:
   - use copy of global screen_info state

  simplefb:
   - fix logging

  bridge:
   - ite-6505: fix DP link-training bug
   - samsung-dsim: fix error checking in probe
   - samsung-dsim: add bsh-smm-s2/pro boards
   - tc358767: fix regmap usage
   - imx: add i.MX8MP HDMI PVI plus DT bindings
   - imx: add i.MX8MP HDMI TX plus DT bindings
   - sii902x: fix probing and unregistration
   - tc358767: limit pixel PLL input range
   - switch to new drm_bridge_read_edid() interface

  panel:
   - ltk050h3146w: error-handling fixes
   - panel-edp: support delay between power-on and enable; use put_sync
     in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49
     V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
   - panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
   - panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings
   - add BOE TH101MB31IG002-28A plus DT bindings
   - add EDT ETML1010G3DRA plus DT bindings
   - add Novatek NT36672E LCD DSI plus DT bindings
   - nt36523: support 120Hz timings, fix includes
   - simple: fix display timings on RK32FN48H
   - visionox-vtdr6130: fix initialization
   - add Powkiddy RGB10MAX3 plus DT bindings
   - st7703: support panel rotation plus DT bindings
   - add Himax HX83112A plus DT bindings
   - ltk500hd1829: add support for ltk101b4029w and admatec 9904370
   - simple: add BOE BP082WX1-100 8.2" panel plus DT bindungs

  panel-orientation-quirks:
   - GPD Win Mini

  amdgpu:
   - Validate DMABuf imports in compute VMs
   - Add RAS ACA framework
   - PSP 13 fixes
   - Misc code cleanups
   - Replay fixes
   - Atom interpretor PS, WS bounds checking
   - DML2 fixes
   - Audio fixes
   - DCN 3.5 Z state fixes
   - Remove deprecated ida_simple usage
   - UBSAN fixes
   - RAS fixes
   - Enable seq64 infrastructure
   - DC color block enablement
   - Documentation updates
   - DC documentation updates
   - DMCUB updates
   - ATHUB 4.1 support
   - LSDMA 7.0 support
   - JPEG DPG support
   - IH 7.0 support
   - HDP 7.0 support
   - VCN 5.0 support
   - SMU 13.0.6 updates
   - NBIO 7.11 updates
   - SDMA 6.1 updates
   - MMHUB 3.3 updates
   - DCN 3.5.1 support
   - NBIF 6.3.1 support
   - VPE 6.1.1 support

  amdkfd:
   - Validate DMABuf imports in compute VMs
   - SVM fixes
   - Trap handler updates and enhancements
   - Fix cache size reporting
   - Relocate the trap handler

  radeon:
   - Atom interpretor PS, WS bounds checking
   - Misc code cleanups

  xe:
   - new query for GuC submission version
   - Remove unused persistent exec_queues
   - Add vram frequency sysfs attributes
   - Add the flag XE_VM_BIND_FLAG_DUMPABLE
   - Drop pre-production workarounds
   - Drop kunit tests for unsupported platforms
   - Start pumbling SR-IOV support with memory based interrupts for VF
   - Allow to map BO in GGTT with PAT index corresponding to XE_CACHE_UC
     to work with memory based interrupts
   - Add GuC Doorbells Manager as prep work SR-IOV
   - Implement additional workarounds for xe2 and MTL
   - Program a few registers according to perfomance guide spec for Xe2
   - Fix remaining 32b build issues and enable it back
   - Fix build with CONFIG_DEBUG_FS=n
   - Fix warnings from GuC ABI headers
   - Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF
   - Release mmap mappings on rpm suspend
   - Disable mid-thread preemption when not properly supported by
     hardware
   - Fix xe_exec by reserving extra fence slot for CPU bind
   - Fix xe_exec with full long running exec queue
   - Canonicalize addresses where needed for Xe2 and add to devcoredum
   - Toggle USM support for Xe2
   - Only allow 1 ufence per exec / bind IOCTL
   - Add GuC firmware loading for Lunar Lake
   - Add XE_VMA_PTE_64K VMA flag

  i915:
   - Add more ADL-N PCI IDs
   - Enable fastboot also on older platforms
   - Early transport for panel replay and PSR
   - New ARL PCI IDs
   - DP TPS4 PHY test pattern support
   - Unify and improve VSC SDP for PSR and non-PSR cases
   - Refactor memory regions and improve debug logging
   - Rework global state serialization
   - Remove unused CDCLK divider fields
   - Unify HDCP connector logging format
   - Use display instead of graphics version in display code
   - Move VBT and opregion debugfs next to the implementation
   - Abstract opregion interface, use opaque type
   - MTL fixes
   - HPD handling fixes
   - Add GuC submission interface version query
   - Atomically invalidate userptr on mmu-notifier
   - Update handling of MMIO triggered reports
   - Don't make assumptions about intel_wakeref_t type
   - Extend driver code of Xe_LPG to Xe_LPG+
   - Add flex arrays to struct i915_syncmap
   - Allow for very slow HuC loading
   - DP tunneling and bandwidth allocation support

  msm:
   - Correct bindings for MSM8976 and SM8650 platforms
   - Start migration of MDP5 platforms to DPU driver
   - X1E80100 MDSS support
   - DPU:
      - Improve DSC allocation, fixing several important corner cases
      - Add support for SDM630/SDM660 platforms
      - Simplify dpu_encoder_phys_ops
      - Apply fixes targeting DSC support with a single DSC encoder
      - Apply fixes for HCTL_EN timing configuration
      - X1E80100 support
      - Add support for YUV420 over DP
   - GPU:
      - fix sc7180 UBWC config
      - fix a7xx LLC config
      - new gpu support: a305B, a750, a702
      - machine support: SM7150 (different power levels than other a618)
      - a7xx devcoredump support

  habanalabs:
   - configure IRQ affinity according to NUMA node
   - move HBM MMU page tables inside the HBM
   - improve device reset
   - check extended PCIe errors

  ivpu:
   - updates to firmware API
   - refactor BO allocation

  imx:
   - use devm_ functions during init

  hisilicon:
   - fix EDID includes

  mgag200:
   - improve ioremap usage
   - convert to struct drm_edid
   - Work around PCI write bursts

  nouveau:
   - disp: use kmemdup()
   - fix EDID includes
   - documentation fixes

  qaic:
   - fixes to BO handling
   - make use of DRM managed release
   - fix order of remove operations

  rockchip:
   - analogix_dp: get encoder port from DT
   - inno_hdmi: support HDMI for RK3128
   - lvds: error-handling fixes

  ssd130x:
   - support SSD133x plus DT bindings

  tegra:
   - fix error handling

  tilcdc:
   - make use of DRM managed release

  v3d:
   - show memory stats in debugfs
   - Support display MMU page size

  vc4:
   - fix error handling in plane prepare_fb
   - fix framebuffer test in plane helpers

  virtio:
   - add venus capset defines

  vkms:
   - fix OOB access when programming the LUT
   - Kconfig improvements

  vmwgfx:
   - unmap surface before changing plane state
   - fix memory leak in error handling
   - documentation fixes
   - list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid
   - fix null-pointer deref in execbuf
   - refactor display-mode probing
   - fix fencing for creating cursor MOBs
   - fix cursor-memory lifetime

  xlnx:
   - fix live video input for ZynqMP DPSUB

  lima:
   - fix memory leak

  loongson:
   - fail if no VRAM present

  meson:
   - switch to new drm_bridge_read_edid() interface

  renesas:
   - add RZ/G2L DU support plus DT bindings

  mxsfb:
   - Use managed mode config

  sun4i:
   - HDMI: updates to atomic mode setting

  mediatek:
   - Add display driver for MT8188 VDOSYS1
   - DSI driver cleanups
   - Filter modes according to hardware capability
   - Fix a null pointer crash in mtk_drm_crtc_finish_page_flip

  etnaviv:
   - enhancements for NPU and MRT support"

* tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits)
  drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo
  drm/amd/pm: wait for completion of the EnableGfxImu message
  drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1
  drm/amdgpu: add smu 14.0.1 support
  drm/amdgpu: add VPE 6.1.1 discovery support
  drm/amdgpu/vpe: add VPE 6.1.1 support
  drm/amdgpu/vpe: don't emit cond exec command under collaborate mode
  drm/amdgpu/vpe: add collaborate mode support for VPE
  drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE
  drm/amdgpu/vpe: add multi instance VPE support
  drm/amdgpu/discovery: add nbif v6_3_1 ip block
  drm/amdgpu: Add nbif v6_3_1 ip block support
  drm/amdgpu: Add pcie v6_1_0 ip headers (v5)
  drm/amdgpu: Add nbif v6_3_1 ip headers (v5)
  arch/powerpc: Remove <linux/fb.h> from backlight code
  macintosh/via-pmu-backlight: Include <linux/backlight.h>
  fbdev/chipsfb: Include <linux/backlight.h>
  drm/etnaviv: Restore some id values
  drm/amdkfd: make kfd_class constant
  drm/amdgpu: add ring timeout information in devcoredump
  ...
2024-03-13 18:34:05 -07:00
Timur Tabi
dea185b71b drm/nouveau: fix kerneldoc warnings
kernel test robot complains about missing kerneldoc entries:

  drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c⚠️
    Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240210002900.148982-1-ttabi@nvidia.com
2024-03-12 22:03:52 +01:00
Sid Pranjale
f6ecfdad35 drm/nouveau: keep DMA buffers required for suspend/resume
Nouveau deallocates a few buffers post GPU init which are required for GPU suspend/resume to function correctly.
This is likely not as big an issue on systems where the NVGPU is the only GPU, but on multi-GPU set ups it leads to a regression where the kernel module errors and results in a system-wide rendering freeze.

This commit addresses that regression by moving the two buffers required for suspend and resume to be deallocated at driver unload instead of post init.

Fixes: 042b5f8384 ("drm/nouveau: fix several DMA buffer leaks")
Signed-off-by: Sid Pranjale <sidpranjale127@protonmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01 15:27:04 +10:00
Daniel Vetter
f112b68f27 Merge v6.8-rc6 into drm-next
Thomas Zimmermann asked to backmerge -rc6 for drm-misc branches,
there's a few same-area-changed conflicts (xe and amdgpu mostly) that
are getting a bit too annoying.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-02-26 11:41:07 +01:00
Dave Airlie
1d492944d3 nouveau/gsp: add kconfig option to enable GSP paths by default
Turing and Ampere will continue to use the old paths by default,
but we should allow distros to decide what the policy is.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214040632.661069-1-airlied@gmail.com
2024-02-23 10:00:41 +10:00
Dan Carpenter
65323796de drm/nouveau/mmu/r535: uninitialized variable in r535_bar_new_()
If gf100_bar_new_() fails then "bar" is not initialized.

Fixes: 5bf0257136 ("drm/nouveau/mmu/r535: initial support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/dab21df7-4d90-4479-97d8-97e5d228c714@moroto.mountain
2024-02-16 18:58:31 +01:00
Arnd Bergmann
0affdba22a nouveau: fix function cast warnings
clang-16 warns about casting between incompatible function types:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
  161 |         .fini = (void(*)(void *))release_firmware,

This one was done to use the generic shadow_fw_release() function as a
callback for struct nvbios_source. Change it to use the same prototype
as the other five instances, with a trivial helper function that actually
calls release_firmware.

Fixes: 70c0f263cc ("drm/nouveau/bios: pull in basic vbios subdev, more to come later")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213095753.455062-1-arnd@kernel.org
2024-02-16 18:38:47 +01:00
Thomas Zimmermann
0e85f1ae4a Merge drm/drm-next into drm-misc-next
Backmerging to update drm-misc-next to the state of v6.8-rc3. Also
fixes a build problem with xe.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-02-07 13:02:20 +01:00
Timur Tabi
34e659f34a drm/nouveau: nvkm_gsp_radix3_sg() should use nvkm_gsp_mem_ctor()
Function nvkm_gsp_radix3_sg() uses nvkm_gsp_mem objects to allocate the
radix3 tables, but it unnecessarily creates those objects manually
instead of using the standard nvkm_gsp_mem_ctor() function like the
rest of the code does.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-2-ttabi@nvidia.com
2024-02-05 18:41:09 +01:00
Timur Tabi
042b5f8384 drm/nouveau: fix several DMA buffer leaks
Nouveau manages GSP-RM DMA buffers with nvkm_gsp_mem objects.  Several of
these buffers are never dealloced.  Some of them can be deallocated
right after GSP-RM is initialized, but the rest need to stay until the
driver unloads.

Also futher bullet-proof these objects by poisoning the buffer and
clearing the nvkm_gsp_mem object when it is deallocated.  Poisoning
the buffer should trigger an error (or crash) from GSP-RM if it tries
to access the buffer after we've deallocated it, because we were wrong
about when it is safe to deallocate.

Finally, change the mem->size field to a size_t because that's the same
type that dma_alloc_coherent expects.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-1-ttabi@nvidia.com
2024-02-05 18:25:13 +01:00
Dave Airlie
61712c9478 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 8d55b0a940 ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240130032643.2498315-1-airlied@gmail.com
2024-02-05 17:36:48 +01:00
Dave Airlie
f8e4806e0d Merge tag 'drm-misc-next-2024-01-11' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.9:

UAPI Changes:

virtio:
- add Venus capset defines

Cross-subsystem Changes:

Core Changes:

- fix drm_fixp2int_ceil()
- documentation fixes
- clean ups
- allow DRM_MM_DEBUG with DRM=m
- build fixes for debugfs support
- EDID cleanups
- sched: error-handling fixes
- ttm: add tests

Driver Changes:

bridge:
- ite-6505: fix DP link-training bug
- samsung-dsim: fix error checking in probe
- tc358767: fix regmap usage

efifb:
- use copy of global screen_info state

hisilicon:
- fix EDID includes

mgag200:
- improve ioremap usage
- convert to struct drm_edid

nouveau:
- disp: use kmemdup()
- fix EDID includes
- documentation fixes

panel:
- ltk050h3146w: error-handling fixes
- panel-edp: support delay between power-on and enable; use put_sync in
  unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0,
  BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
- panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
- panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings

qaic:
- fixes to BO handling
- make use of DRM managed release
- fix order of remove operations

rockchip:
- analogix_dp: get encoder port from DT
- inno_hdmi: support HDMI for RK3128
- lvds: error-handling fixes

simplefb:
- fix logging

ssd130x:
- support SSD133x plus DT bindings

tegra:
- fix error handling

tilcdc:
- make use of DRM managed release

v3d:
- show memory stats in debugfs

vc4:
- fix error handling in plane prepare_fb
- fix framebuffer test in plane helpers

vesafb:
- use copy of global screen_info state

virtio:
- cleanups

vkms:
- fix OOB access when programming the LUT
- Kconfig improvements

vmwgfx:
- unmap surface before changing plane state
- fix memory leak in error handling
- documentation fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240111154902.GA8448@linux-uq9g
2024-02-05 13:50:15 +10:00
Dave Airlie
b5e69be185 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Fixes: 8d55b0a940 ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/576336/
2024-02-02 17:15:16 +10:00
Jani Nikula
1cff237962 drm/nouveau/acr/ga102: remove unused but set variable
Fix the W=1 warning -Wunused-but-set-variable.

Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4d9f62fa6963acfd8b7d8f623799ba3a516e347d.1704908087.git.jani.nikula@intel.com
2024-01-31 21:48:09 +02:00
Maxime Ripard
4db102dcb0 Merge drm/drm-next into drm-misc-next
Kickstart 6.9 development cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-01-29 14:20:23 +01:00
Linus Torvalds
0dde2bf67b Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Fix race conditions in device probe path
   - Retire IOMMU bus_ops
   - Support for passing custom allocators to page table drivers
   - Clean up Kconfig around IOMMU_SVA
   - Support for sharing SVA domains with all devices bound to a mm
   - Firmware data parsing cleanup
   - Tracing improvements for iommu-dma code
   - Some smaller fixes and cleanups

  ARM-SMMU drivers:
   - Device-tree binding updates:
      - Add additional compatible strings for Qualcomm SoCs
      - Document Adreno clocks for Qualcomm's SM8350 SoC
   - SMMUv2:
      - Implement support for the ->domain_alloc_paging() callback
      - Ensure Secure context is restored following suspend of Qualcomm
        SMMU implementation
   - SMMUv3:
      - Disable stalling mode for the "quiet" context descriptor
      - Minor refactoring and driver cleanups

  Intel VT-d driver:
   - Cleanup and refactoring

  AMD IOMMU driver:
   - Improve IO TLB invalidation logic
   - Small cleanups and improvements

  Rockchip IOMMU driver:
   - DT binding update to add Rockchip RK3588

  Apple DART driver:
   - Apple M1 USB4/Thunderbolt DART support
   - Cleanups

  Virtio IOMMU driver:
   - Add support for iotlb_sync_map
   - Enable deferred IO TLB flushes"

* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-18 15:16:57 -08:00
Linus Torvalds
8893a6bfff Merge tag 'drm-next-2024-01-15-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "This is just a wrap up of fixes from the last few days. It has the
  proper fix to the i915/xe collision, we can clean up what you did
  later once rc1 lands.

  Otherwise it's a few other i915, a v3d, rockchip and a nouveau fix to
  make GSP load on some original Turing GPUs.

  i915:
   - Fixes for kernel-doc warnings enforced in linux-next
   - Another build warning fix for string formatting of intel_wakeref_t
   - Display fixes for DP DSC BPC and C20 PLL state verification

  v3d:
   - register readout fix

  rockchip:
   - two build warning fixes

  nouveau:
   - fix GSP loading on Turing with different nvdec configuration"

* tag 'drm-next-2024-01-15-1' of git://anongit.freedesktop.org/drm/drm:
  nouveau/gsp: handle engines in runl without nonstall interrupts.
  drm/i915/perf: reconcile Excess struct member kernel-doc warnings
  drm/i915/guc: reconcile Excess struct member kernel-doc warnings
  drm/i915/gt: reconcile Excess struct member kernel-doc warnings
  drm/i915/gem: reconcile Excess struct member kernel-doc warnings
  drm/i915/dp: Fix the max DSC bpc supported by source
  drm/i915: don't make assumptions about intel_wakeref_t type
  drm/i915/dp: Fix the PSR debugfs entries wrt. MST connectors
  drm/i915/display: Fix C20 pll selection for state verification
  drm/v3d: Fix support for register debugging on the RPi 4
  drm/rockchip: vop2: Drop unused if_dclk_rate variable
  drm/rockchip: vop2: Drop superfluous include
2024-01-17 14:59:05 -08:00
Dave Airlie
205e18c135 nouveau/gsp: handle engines in runl without nonstall interrupts.
It appears on TU106 GPUs (2070), that some of the nvdec engines
are in the runlist but have no valid nonstall interrupt, nouveau
didn't handle that too well.

This should let nouveau/gsp work on those.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/all/20240110011826.3996289-1-airlied@gmail.com/
2024-01-15 16:04:48 +10:00
Randy Dunlap
eeb8e8d9f1 drm/nouveau/volt/gk20a: don't misuse kernel-doc comments
Change kernel-doc "/**" comments to common "/*" comments to prevent
kernel-doc warnings:

gk20a.c:49: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0)
gk20a.c:49: warning: missing initial short description on line:
 * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0)
gk20a.c:62: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * cvb_t_mv =
gk20a.c:62: warning: missing initial short description on line:
 * cvb_t_mv =

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: nouveau@lists.freedesktop.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231231233633.6596-4-rdunlap@infradead.org
2024-01-08 18:37:52 +01:00
Randy Dunlap
5f807f00b5 drm/nouveau/bios/init: drop kernel-doc notation
The "/**" comments in this file are not kernel-doc comments. They are
used on static functions which can have kernel-doc comments, but that
is not the primary focus of kernel-doc comments.
Since these comments are incomplete for kernel-doc notation, remove
the kernel-doc "/**" markers and make them common comments.

This prevents scripts/kernel-doc from issuing 68 warnings:

init.c:584: warning: Function parameter or member 'init' not described in 'init_reserved'

and 67 warnings like this one:
init.c:611: warning: expecting prototype for INIT_DONE(). Prototype was for init_done() instead

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231216201152.31376-1-rdunlap@infradead.org
2024-01-08 18:33:58 +01:00
Dave Airlie
9c9dd22ba5 nouveau/gsp: always free the alloc messages on r535
Fixes a memory leak seen with kmemleak.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-10-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Dave Airlie
4ae3a20102 nouveau/gsp: don't free ctrl messages on errors
It looks like for some messages the upper layers need to get access to the
results of the message so we can interpret it.

Rework the ctrl push interface to not free things and cleanup properly
whereever it errors out.

Requested-by: Lyude
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-9-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Dave Airlie
59f6a3d8db nouveau/gsp: convert gsp errors to generic errors
This should let the upper layers retry as needed on EAGAIN.

There may be other values we will care about in the future, but
this covers our present needs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-8-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Lyude Paul
cf22fc2846 drm/nouveau/gsp: Fix ACPI MXDM/MXDS method invocations
Currently we get an error from ACPI because both of these arguments expect
a single argument, and we don't provide one. I'm not totally clear on what
that argument does, but we're able to find the missing value from
_acpiCacheMethodData() in src/kernel/platform/acpi_common.c in nvidia's
driver. So, let's add that - which doesn't get eDP displays to power on
quite yet, but gets rid of the argument warning at least.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-7-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Dave Airlie
a9b9b42b54 nouveau/gsp: free acpi object after use
This fixes a memory leak for the acpi dod object.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-5-airlied@gmail.com
2024-01-05 12:27:53 +10:00
Dave Airlie
34ce62a51e nouveau/gsp: drop some acpi related debug
These were leftover debug, if we need to bring them back do so
for debugging later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-3-airlied@gmail.com
2024-01-05 12:27:52 +10:00
Dave Airlie
24ab185d98 nouveau/gsp: add three notifier callbacks that we see in normal operation (v2)
Add NULL callbacks for some things GSP calls that we don't handle, but know about
so we avoid the logging.

v2: Timur suggested allowing null fn.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-2-airlied@gmail.com
2024-01-05 12:27:52 +10:00
Joerg Roedel
75f74f85a4 Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-01-03 09:59:32 +01:00
Thierry Reding
46dec61643 drm/nouveau: Fixup gk20a instobj hierarchy
Commit 12c9b05da9 ("drm/nouveau/imem: support allocations not
preserved across suspend") uses container_of() to cast from struct
nvkm_memory to struct nvkm_instobj, assuming that all instance objects
are derived from struct nvkm_instobj. For the gk20a family that's not
the case and they are derived from struct nvkm_memory instead. This
causes some subtle data corruption (nvkm_instobj.preserve ends up
mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference
in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also
prevents suspend/resume from working.

Fix this by making struct gk20a_instobj derive from struct nvkm_instobj
instead.

Fixes: 12c9b05da9 ("drm/nouveau/imem: support allocations not preserved across suspend")
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com
2023-12-15 14:10:40 +10:00
Jason Gunthorpe
bf9cd9fef9 iommu/tegra: Use tegra_dev_iommu_get_stream_id() in the remaining places
This API was defined to formalize the access to internal iommu details on
some Tegra SOCs, but a few callers got missed. Add them.

The helper already masks by 0xFFFF so remove this code from the callers.

Suggested-by: Thierry Reding <thierry.reding@gmail.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-12 10:18:51 +01:00
Dave Airlie
cb9c919364 nouveau/tu102: flush all pdbs on vmm flush
This is a hack around a bug exposed with the GSP code, I'm not sure
what is happening exactly, but it appears some of our flushes don't
result in proper tlb invalidation for out BAR2 and we get a BAR2
fault from GSP and it all dies.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130010852.4034774-1-airlied@gmail.com
2023-11-30 05:47:42 +01:00
Timur Tabi
88a2b4d34a nouveau/gsp: document some aspects of GSP-RM
Document a few aspects of communication with GSP-RM. These comments are
derived from notes made during early development of GSP-RM support in
Nouveau, but were not included in the initial patch set.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122202840.2565153-1-ttabi@nvidia.com
2023-11-30 00:40:23 +01:00
Gustavo A. R. Silva
52fdb99cc4 nouveau/gsp: replace zero-length array with flex-array member and use __counted_by
Fake flexible arrays (zero-length and one-element arrays) are deprecated,
and should be replaced by flexible-array members. So, replace
zero-length array with a flexible-array member in `struct
PACKED_REGISTRY_TABLE`.

Also annotate array `entries` with `__counted_by()` to prepare for the
coming implementation by GCC and Clang of the `__counted_by` attribute.
Flexible array members annotated with `__counted_by` can have their
accesses bounds-checked at run-time via `CONFIG_UBSAN_BOUNDS` (for array
indexing) and `CONFIG_FORTIFY_SOURCE` (for strcpy/memcpy-family functions).

This fixes multiple -Warray-bounds warnings:
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1069:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=]
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1070:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=]
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1071:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=]
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1072:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=]

While there, also make use of the struct_size() helper, and address
checkpatch.pl warning:
WARNING: please, no spaces at the start of a line

This results in no differences in binary output.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZVZbX7C5suLMiBf+@work
2023-11-29 03:11:49 +01:00
Dan Carpenter
45b7955b77 nouveau/gsp/r535: remove a stray unlock in r535_gsp_rpc_send()
This unlock doesn't belong here and it leads to a double unlock in
the caller, r535_gsp_rpc_push().

Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a0293812-c05d-45f0-a535-3f24fe582c02@moroto.mountain
2023-11-29 03:04:41 +01:00
Dan Carpenter
42bd415bd8 nouveau/gsp/r535: Fix a NULL vs error pointer bug
The r535_gsp_cmdq_get() function returns error pointers but this code
checks for NULL.  Also we need to propagate the error pointer back to
the callers in r535_gsp_rpc_get().  Returning NULL will lead to a NULL
pointer dereference.

Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f71996d9-d1cb-45ea-a4b2-2dfc21312d8c@kili.mountain
2023-11-14 22:40:25 +01:00
Dan Carpenter
09f12bf9f7 nouveau/gsp/r535: uninitialized variable in r535_gsp_acpi_mux_id()
The if we hit the "continue" statement on the first iteration through
the loop then "handle_mux" needs to be set to NULL so we continue
looping.

Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1d864f6e-43e9-43d8-9d90-30e76c9c843b@moroto.mountain
2023-11-14 22:40:18 +01:00
Dave Airlie
8d55b0a940 nouveau/gsp: add some basic registry entries.
The nvidia driver sets these two basic registry entries always,
so copy it.

Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-11-03 12:57:23 +10:00
Dave Airlie
5177e5fa6e nouveau/gsp: fix message signature.
This original one was backwards, compared to traces from nvidia driver.

Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-11-03 12:57:19 +10:00
Dave Airlie
b5bad8c16b nouveau/gsp: move to 535.113.01
This moves the initial effort to the latest 535 firmware.

The gsp msg structs have changed, and the message passing also.
The wpr also seems to have some struct changes.

This version of the firmware will be what we are stuck on for a while,
until we can refactor the driver and work out a better path forward.

Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-11-03 12:57:14 +10:00
Ben Skeggs
015185cc67 drm/nouveau/ofa/r535: initial support
Adds support for allocating OFA classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-45-skeggsb@gmail.com
2023-10-31 15:08:19 +10:00
Ben Skeggs
ca9686340a drm/nouveau/nvjpg/r535: initial support
Adds support for allocating NVJPG classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-44-skeggsb@gmail.com
2023-10-31 15:08:19 +10:00
Ben Skeggs
08ab88f5a0 drm/nouveau/nvenc/r535: initial support
Adds support for allocating VIDEO_ENCODER classes from RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-43-skeggsb@gmail.com
2023-10-31 15:08:18 +10:00