Commit Graph

6700 Commits

Author SHA1 Message Date
Dave Airlie
5493bf2d0f Merge tag 'drm-misc-fixes-2024-04-18' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

nouveau:
- dp: Don't probe DP ports twice
- nv04: Fix OOB access
- nv50: Disable AUX bus for disconnected DP ports
- nvkm: Fix race condition

panel:
- Don't unregister DSI devices in several drivers

ttm:
- Stop pooling cached NUMA pages

v3d:
- Fix enabled_ns increment

vmwgfx:
- Fix PRIME import/export
- Fix CRTC's atomic check for primary planes
- Sort plane formats by preference

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418072229.GA8983@localhost.localdomain
2024-04-19 10:22:28 +10:00
Lyude Paul
bf52d7f9b2 drm/nouveau/dp: Don't probe eDP ports twice harder
I didn't pay close enough attention the last time I tried to fix this
problem - while we currently do correctly take care to make sure we don't
probe a connected eDP port more then once, we don't do the same thing for
eDP ports we found to be disconnected.

So, fix this and make sure we only ever probe eDP ports once and then leave
them at that connector state forever (since without HPD, it's not going to
change on its own anyway). This should get rid of the last few GSP errors
getting spit out during runtime suspend and resume on some machines, as we
tried to reprobe eDP ports in response to ACPI hotplug probe events.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404233736.7946-3-lyude@redhat.com
(cherry picked from commit fe6660b661)
2024-04-15 14:46:50 -04:00
Lyude Paul
ee7e980dc7 drm/nouveau/kms/nv50-: Disable AUX bus for disconnected DP ports
GSP has its own state for keeping track of whether or not a given display
connector is plugged in or not, and enforces this state on the driver. In
particular, AUX transactions on a DisplayPort connector which GSP says is
disconnected can never succeed - and can in some cases even cause
unexpected timeouts, which can trickle up to cause other problems. A good
example of this is runtime power management: where we can actually get
stuck trying to resume the GPU if a userspace application like fwupd tries
accessing a drm_aux_dev for a disconnected port. This was an issue I hit a
few times with my Slimbook Executive 16 - where trying to offload something
to the discrete GPU would wake it up, and then potentially cause it to
timeout as fwupd tried to immediately access the dp_aux_dev nodes for
nouveau.

Likewise: we don't really have any cases I know of where we'd want to
ignore this state and try an aux transaction anyway - and failing pointless
aux transactions immediately can even speed things up. So - let's start
enabling/disabling the aux bus in nouveau_dp_detect() to fix this. We
enable the aux bus during connector probing, and leave it enabled if we
discover something is actually on the connector. Otherwise, we just shut it
off.

This should fix some people's runtime PM issues (like myself), and also get
rid of quite of a lot of GSP error spam in dmesg.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404233736.7946-2-lyude@redhat.com
(cherry picked from commit 9c8a10bf1f)
2024-04-15 14:45:48 -04:00
Mikhail Kobuk
cf92bb778e drm: nv04: Fix out of bounds access
When Output Resource (dcb->or) value is assigned in
fabricate_dcb_output(), there may be out of bounds access to
dac_users array in case dcb->or is zero because ffs(dcb->or) is
used as index there.
The 'or' argument of fabricate_dcb_output() must be interpreted as a
number of bit to set, not value.

Utilize macros from 'enum nouveau_or' in calls instead of hardcoding.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff3 ("drm/nouveau: fabricate DCB encoder table for iMac G4")
Fixes: 670820c0e6 ("drm/nouveau: Workaround incorrect DCB entry on a GeForce3 Ti 200.")
Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411110854.16701-1-m.kobuk@ispras.ru
2024-04-15 10:54:13 +02:00
Dave Airlie
fff1386cc8 nouveau: fix instmem race condition around ptr stores
Running a lot of VK CTS in parallel against nouveau, once every
few hours you might see something like this crash.

BUG: kernel NULL pointer dereference, address: 0000000000000008
PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1
RSP: 0000:ffffac20c5857838 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001
RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180
RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10
R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c
R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c
FS:  00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:

...

 ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
 ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
 nvkm_vmm_iter+0x351/0xa20 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __lock_acquire+0x3ed/0x2170
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand
wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in
the nv50_instobj_acquire called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around
the same time, and Thread A hits the refcount_set line, and in
lockstep thread B succeeds at refcount_inc_not_zero, there is a
chance the ptrs value won't have been stored since refcount_set
is unordered. Force a memory barrier here, I picked smp_mb, since
we want it on all CPUs and it's write followed by a read.

v2: use paired smp_rmb/smp_wmb.

Cc: <stable@vger.kernel.org>
Fixes: be55287aa5 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
2024-04-15 10:51:28 +02:00
Dave Airlie
1b24b3cd1a Merge tag 'drm-misc-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

ast:
- Fix soft lockup

client:
- Protect connector modes with mode_config mutex

host1x:
- Do not setup DMA for virtual addresses

ivpu:
- Fix deadlock in context_xa
- PCI fixes
- Fixes to error handling

nouveau:
- gsp: Fix OOB access
- Fix casting

panfrost:
- Fix error path in MMU code

qxl:
- Revert "drm/qxl: simplify qxl_fence_wait"

vmwgfx:
- Enable DMA for SEV mappings

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411073403.GA9895@localhost.localdomain
2024-04-12 05:35:46 +10:00
Dave Airlie
718c4fb221 nouveau: fix devinit paths to only handle display on GSP.
This reverts:
nouveau/gsp: don't check devinit disable on GSP.
and applies a further fix.

It turns out the open gpu driver, checks this register,
but only for display.

Match that behaviour and in the turing path only disable
the display block. (ampere already only does displays).

Fixes: 5d4e8ae6e5 ("nouveau/gsp: don't check devinit disable on GSP.")
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240408064243.2219527-1-airlied@gmail.com
2024-04-09 13:14:13 +10:00
Arnd Bergmann
185fdb4697 nouveau: fix function cast warning
Calling a function through an incompatible pointer type causes breaks
kcfi, so clang warns about the assignment:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
   73 |         .fini = (void(*)(void *))kfree,

Avoid this with a trivial wrapper.

Fixes: c39f472e9f ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org
2024-04-05 18:30:29 +02:00
Kees Cook
838ae9f45c nouveau/gsp: Avoid addressing beyond end of rpc->entries
Using the end of rpc->entries[] for addressing runs into both compile-time
and run-time detection of accessing beyond the end of the array. Use the
base pointer instead, since was allocated with the additional bytes for
storing the strings. Avoids the following warning in future GCC releases
with support for __counted_by:

In function 'fortify_memcpy_chk',
    inlined from 'r535_gsp_rpc_set_registry' at ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1123:3:
../include/linux/fortify-string.h:553:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  553 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

for this code:

	strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
	...
                memcpy(strings, r535_registry_entries[i].name, name_len);

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240330141159.work.063-kees@kernel.org
2024-04-05 18:30:29 +02:00
Dave Airlie
be141849ec nouveau/uvmm: fix addr/range calcs for remap operations
dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8
was causing a remap operation like the below.

op_remap: prev: 0000003fffed0000 00000000000f0000 00000000a5abd18a 0000000000000000
op_remap: next:
op_remap: unmap: 0000003fffed0000 0000000000100000 0
op_map: map: 0000003ffffc0000 0000000000010000 000000005b1ba33c 00000000000e0000

This was resulting in an unmap operation from 0x3fffed0000+0xf0000, 0x100000
which was corrupting the pagetables and oopsing the kernel.

Fixes the prev + unmap range calcs to use start/end and map back to addr/range.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328024317.2041851-1-airlied@gmail.com
2024-03-28 17:58:31 +01:00
Colin Ian King
c60ebc58f2 drm/nouveau/gr/gf100: Remove second semicolon
There is a statement with two semicolons. Remove the second one, it
is redundant.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240315090930.2429958-1-colin.i.king@gmail.com
2024-03-28 17:58:31 +01:00
Thomas Zimmermann
36a1818f5a Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get drm-misc-fixes to the state of v6.9-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-03-25 21:11:58 +01:00
Linus Torvalds
7ee0490121 Merge tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu,
  it's xe and nouveau then a few scattered core fixes.

  core:
   - fix rounding in drm_fixp2int_round()

  bridge:
   - fix documentation for DRM_BRIDGE_OP_EDID

  sun4i:
   - fix 64-bit division on 32-bit architectures

  tests:
   - fix dependency on DRM_KMS_HELPER

  probe-helper:
   - never return negative values from .get_modes() plus driver fixes

  xe:
   - invalidate userptr vma on page pin fault
   - fail early on sysfs file creation error
   - skip VMA pinning on xe_exec if no batches

  nouveau:
   - clear bo resource bus after eviction
   - documentation fixes
   - don't check devinit disable on GSP

  amdgpu:
   - Freesync fixes
   - UAF IOCTL fixes
   - Fix mmhub client ID mapping
   - IH 7.0 fix
   - DML2 fixes
   - VCN 4.0.6 fix
   - GART bind fix
   - GPU reset fix
   - SR-IOV fix
   - OD table handling fixes
   - Fix TA handling on boards without display hardware
   - DML1 fix
   - ABM fix
   - eDP panel fix
   - DPPCLK fix
   - HDCP fix
   - Revert incorrect error case handling in ioremap
   - VPE fix
   - HDMI fixes
   - SDMA 4.4.2 fix
   - Other misc fixes

  amdkfd:
   - Fix duplicate BO handling in process restore"

* tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits)
  drm/amdgpu/pm: Don't use OD table on Arcturus
  drm/amdgpu: drop setting buffer funcs in sdma442
  drm/amd/display: Fix noise issue on HDMI AV mute
  drm/amd/display: Revert Remove pixle rate limit for subvp
  Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode"
  Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()"
  drm/amd/display: Add a dc_state NULL check in dc_state_release
  drm/amd/display: Return the correct HDCP error code
  drm/amd/display: Implement wait_for_odm_update_pending_complete
  drm/amd/display: Lock all enabled otg pipes even with no planes
  drm/amd/display: Amend coasting vtotal for replay low hz
  drm/amd/display: Fix idle check for shared firmware state
  drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane
  drm/amd/display: Init DPPCLK from SMU on dcn32
  drm/amd/display: Add monitor patch for specific eDP
  drm/amd/display: Allow dirty rects to be sent to dmub when abm is active
  drm/amd/display: Override min required DCFCLK in dml1_validate
  drm/amdgpu: Bypass display ta if display hw is not available
  drm/amdgpu: correct the KGQ fallback message
  drm/amdgpu/pm: Check the validity of overdiver power limit
  ...
2024-03-21 19:04:31 -07:00
Dave Airlie
5d4e8ae6e5 nouveau/gsp: don't check devinit disable on GSP.
GSP should be handling this and I can see no evidence in opengpu
driver that this register should be touched.

Fixed acceleration on 2080 Ti GPUs.

Fixes: 15740541e8 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM")

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314014521.2695233-1-airlied@gmail.com
2024-03-19 14:34:55 +01:00
Linus Torvalds
480e035fc4 Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "Highlights are usual, more AMD IP blocks for future hw, i915/xe
  changes, Displayport tunnelling support for i915, msm YUV over DP
  changes, new tests for ttm, but its mostly a lot of stuff all over the
  place from lots of people.

  core:
   - EDID cleanups
   - scheduler error handling fixes
   - managed: add drmm_release_action() with tests
   - add ratelimited drm debug print
   - DPCD PSR early transport macro
   - DP tunneling and bandwidth allocation helpers
   - remove built-in edids
   - dp: Avoid AUX transfers on powered-down displays
   - dp: Add VSC SDP helpers

  cross drivers:
   - use new drm print helpers
   - switch to ->read_edid callback
   - gem: add stats for shared buffers plus updates to amdgpu, i915, xe

  syncobj:
   - fixes to waiting and sleeping

  ttm:
   - add tests
   - fix errno codes
   - simply busy-placement handling
   - fix page decryption

  media:
   - tc358743: fix v4l device registration

  video:
   - move all kernel parameters for video behind CONFIG_VIDEO

  sound:
   - remove <drm/drm_edid.h> include from header

  ci:
   - add tests for msm
   - fix apq8016 runner

  efifb:
   - use copy of global screen_info state

  vesafb:
   - use copy of global screen_info state

  simplefb:
   - fix logging

  bridge:
   - ite-6505: fix DP link-training bug
   - samsung-dsim: fix error checking in probe
   - samsung-dsim: add bsh-smm-s2/pro boards
   - tc358767: fix regmap usage
   - imx: add i.MX8MP HDMI PVI plus DT bindings
   - imx: add i.MX8MP HDMI TX plus DT bindings
   - sii902x: fix probing and unregistration
   - tc358767: limit pixel PLL input range
   - switch to new drm_bridge_read_edid() interface

  panel:
   - ltk050h3146w: error-handling fixes
   - panel-edp: support delay between power-on and enable; use put_sync
     in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49
     V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
   - panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
   - panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings
   - add BOE TH101MB31IG002-28A plus DT bindings
   - add EDT ETML1010G3DRA plus DT bindings
   - add Novatek NT36672E LCD DSI plus DT bindings
   - nt36523: support 120Hz timings, fix includes
   - simple: fix display timings on RK32FN48H
   - visionox-vtdr6130: fix initialization
   - add Powkiddy RGB10MAX3 plus DT bindings
   - st7703: support panel rotation plus DT bindings
   - add Himax HX83112A plus DT bindings
   - ltk500hd1829: add support for ltk101b4029w and admatec 9904370
   - simple: add BOE BP082WX1-100 8.2" panel plus DT bindungs

  panel-orientation-quirks:
   - GPD Win Mini

  amdgpu:
   - Validate DMABuf imports in compute VMs
   - Add RAS ACA framework
   - PSP 13 fixes
   - Misc code cleanups
   - Replay fixes
   - Atom interpretor PS, WS bounds checking
   - DML2 fixes
   - Audio fixes
   - DCN 3.5 Z state fixes
   - Remove deprecated ida_simple usage
   - UBSAN fixes
   - RAS fixes
   - Enable seq64 infrastructure
   - DC color block enablement
   - Documentation updates
   - DC documentation updates
   - DMCUB updates
   - ATHUB 4.1 support
   - LSDMA 7.0 support
   - JPEG DPG support
   - IH 7.0 support
   - HDP 7.0 support
   - VCN 5.0 support
   - SMU 13.0.6 updates
   - NBIO 7.11 updates
   - SDMA 6.1 updates
   - MMHUB 3.3 updates
   - DCN 3.5.1 support
   - NBIF 6.3.1 support
   - VPE 6.1.1 support

  amdkfd:
   - Validate DMABuf imports in compute VMs
   - SVM fixes
   - Trap handler updates and enhancements
   - Fix cache size reporting
   - Relocate the trap handler

  radeon:
   - Atom interpretor PS, WS bounds checking
   - Misc code cleanups

  xe:
   - new query for GuC submission version
   - Remove unused persistent exec_queues
   - Add vram frequency sysfs attributes
   - Add the flag XE_VM_BIND_FLAG_DUMPABLE
   - Drop pre-production workarounds
   - Drop kunit tests for unsupported platforms
   - Start pumbling SR-IOV support with memory based interrupts for VF
   - Allow to map BO in GGTT with PAT index corresponding to XE_CACHE_UC
     to work with memory based interrupts
   - Add GuC Doorbells Manager as prep work SR-IOV
   - Implement additional workarounds for xe2 and MTL
   - Program a few registers according to perfomance guide spec for Xe2
   - Fix remaining 32b build issues and enable it back
   - Fix build with CONFIG_DEBUG_FS=n
   - Fix warnings from GuC ABI headers
   - Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF
   - Release mmap mappings on rpm suspend
   - Disable mid-thread preemption when not properly supported by
     hardware
   - Fix xe_exec by reserving extra fence slot for CPU bind
   - Fix xe_exec with full long running exec queue
   - Canonicalize addresses where needed for Xe2 and add to devcoredum
   - Toggle USM support for Xe2
   - Only allow 1 ufence per exec / bind IOCTL
   - Add GuC firmware loading for Lunar Lake
   - Add XE_VMA_PTE_64K VMA flag

  i915:
   - Add more ADL-N PCI IDs
   - Enable fastboot also on older platforms
   - Early transport for panel replay and PSR
   - New ARL PCI IDs
   - DP TPS4 PHY test pattern support
   - Unify and improve VSC SDP for PSR and non-PSR cases
   - Refactor memory regions and improve debug logging
   - Rework global state serialization
   - Remove unused CDCLK divider fields
   - Unify HDCP connector logging format
   - Use display instead of graphics version in display code
   - Move VBT and opregion debugfs next to the implementation
   - Abstract opregion interface, use opaque type
   - MTL fixes
   - HPD handling fixes
   - Add GuC submission interface version query
   - Atomically invalidate userptr on mmu-notifier
   - Update handling of MMIO triggered reports
   - Don't make assumptions about intel_wakeref_t type
   - Extend driver code of Xe_LPG to Xe_LPG+
   - Add flex arrays to struct i915_syncmap
   - Allow for very slow HuC loading
   - DP tunneling and bandwidth allocation support

  msm:
   - Correct bindings for MSM8976 and SM8650 platforms
   - Start migration of MDP5 platforms to DPU driver
   - X1E80100 MDSS support
   - DPU:
      - Improve DSC allocation, fixing several important corner cases
      - Add support for SDM630/SDM660 platforms
      - Simplify dpu_encoder_phys_ops
      - Apply fixes targeting DSC support with a single DSC encoder
      - Apply fixes for HCTL_EN timing configuration
      - X1E80100 support
      - Add support for YUV420 over DP
   - GPU:
      - fix sc7180 UBWC config
      - fix a7xx LLC config
      - new gpu support: a305B, a750, a702
      - machine support: SM7150 (different power levels than other a618)
      - a7xx devcoredump support

  habanalabs:
   - configure IRQ affinity according to NUMA node
   - move HBM MMU page tables inside the HBM
   - improve device reset
   - check extended PCIe errors

  ivpu:
   - updates to firmware API
   - refactor BO allocation

  imx:
   - use devm_ functions during init

  hisilicon:
   - fix EDID includes

  mgag200:
   - improve ioremap usage
   - convert to struct drm_edid
   - Work around PCI write bursts

  nouveau:
   - disp: use kmemdup()
   - fix EDID includes
   - documentation fixes

  qaic:
   - fixes to BO handling
   - make use of DRM managed release
   - fix order of remove operations

  rockchip:
   - analogix_dp: get encoder port from DT
   - inno_hdmi: support HDMI for RK3128
   - lvds: error-handling fixes

  ssd130x:
   - support SSD133x plus DT bindings

  tegra:
   - fix error handling

  tilcdc:
   - make use of DRM managed release

  v3d:
   - show memory stats in debugfs
   - Support display MMU page size

  vc4:
   - fix error handling in plane prepare_fb
   - fix framebuffer test in plane helpers

  virtio:
   - add venus capset defines

  vkms:
   - fix OOB access when programming the LUT
   - Kconfig improvements

  vmwgfx:
   - unmap surface before changing plane state
   - fix memory leak in error handling
   - documentation fixes
   - list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid
   - fix null-pointer deref in execbuf
   - refactor display-mode probing
   - fix fencing for creating cursor MOBs
   - fix cursor-memory lifetime

  xlnx:
   - fix live video input for ZynqMP DPSUB

  lima:
   - fix memory leak

  loongson:
   - fail if no VRAM present

  meson:
   - switch to new drm_bridge_read_edid() interface

  renesas:
   - add RZ/G2L DU support plus DT bindings

  mxsfb:
   - Use managed mode config

  sun4i:
   - HDMI: updates to atomic mode setting

  mediatek:
   - Add display driver for MT8188 VDOSYS1
   - DSI driver cleanups
   - Filter modes according to hardware capability
   - Fix a null pointer crash in mtk_drm_crtc_finish_page_flip

  etnaviv:
   - enhancements for NPU and MRT support"

* tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits)
  drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo
  drm/amd/pm: wait for completion of the EnableGfxImu message
  drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1
  drm/amdgpu: add smu 14.0.1 support
  drm/amdgpu: add VPE 6.1.1 discovery support
  drm/amdgpu/vpe: add VPE 6.1.1 support
  drm/amdgpu/vpe: don't emit cond exec command under collaborate mode
  drm/amdgpu/vpe: add collaborate mode support for VPE
  drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE
  drm/amdgpu/vpe: add multi instance VPE support
  drm/amdgpu/discovery: add nbif v6_3_1 ip block
  drm/amdgpu: Add nbif v6_3_1 ip block support
  drm/amdgpu: Add pcie v6_1_0 ip headers (v5)
  drm/amdgpu: Add nbif v6_3_1 ip headers (v5)
  arch/powerpc: Remove <linux/fb.h> from backlight code
  macintosh/via-pmu-backlight: Include <linux/backlight.h>
  fbdev/chipsfb: Include <linux/backlight.h>
  drm/etnaviv: Restore some id values
  drm/amdkfd: make kfd_class constant
  drm/amdgpu: add ring timeout information in devcoredump
  ...
2024-03-13 18:34:05 -07:00
Thomas Zimmermann
a2e7496b45 Merge drm/drm-fixes into drm-misc-fixes
Backmerging to sync before merging the patchset at [1].

[1] https://lore.kernel.org/all/cover.1709913674.git.jani.nikula@intel.com/

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-03-13 09:43:21 +01:00
Timur Tabi
dea185b71b drm/nouveau: fix kerneldoc warnings
kernel test robot complains about missing kerneldoc entries:

  drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c⚠️
    Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240210002900.148982-1-ttabi@nvidia.com
2024-03-12 22:03:52 +01:00
Dave Airlie
f35c9af45e nouveau: reset the bo resource bus info after an eviction
Later attempts to refault the bo won't happen and the whole
GPU does to lunch. I think Christian's refactoring of this
code out to the driver broke this not very well tested path.

Fixes: 141b15e591 ("drm/nouveau: move io_reserve_lru handling into the driver v5")
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240311072037.287905-1-airlied@gmail.com
2024-03-12 21:13:19 +01:00
Duoming Zhou
16e87fe23d nouveau/dmem: handle kcalloc() allocation failure
The kcalloc() in nouveau_dmem_evict_chunk() will return null if
the physical memory has run out. As a result, if we dereference
src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs
will happen.

Moreover, the GPU is going away. If the kcalloc() fails, we could not
evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL
flag in kcalloc().

Finally, as there is no need to have physically contiguous memory,
this patch switches kcalloc() to kvcalloc() in order to avoid
failing allocations.

CC: <stable@vger.kernel.org> # v6.1
Fixes: 249881232e ("nouveau/dmem: evict device private memory during release")
Suggested-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306050104.11259-1-duoming@zju.edu.cn
2024-03-08 17:36:28 +01:00
Dave Airlie
b7cc4ff787 nouveau: lock the client object tree.
It appears the client object tree has no locking unless I've missed
something else. Fix races around adding/removing client objects,
mostly vram bar mappings.

 4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: 0000 [#1] PREEMPT SMP PTI
[ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
[ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
[ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe
[ 4562.099506] RSP: 0000:ffffa94cc420bbf8 EFLAGS: 00010206
[ 4562.099512] RAX: 6677ed422bceb814 RBX: ffff98108791f400 RCX: ffff9810f26b8f58
[ 4562.099517] RDX: 0000000000000000 RSI: ffff9810f26b9158 RDI: ffff98108791f400
[ 4562.099519] RBP: ffff9810f26b9158 R08: 0000000000000000 R09: 0000000000000000
[ 4562.099521] R10: ffffa94cc420bc48 R11: 0000000000000001 R12: ffff9810f02a7cc0
[ 4562.099526] R13: 0000000000000000 R14: 00000000000000ff R15: 0000000000000007
[ 4562.099528] FS:  00007f629c5017c0(0000) GS:ffff98142c700000(0000) knlGS:0000000000000000
[ 4562.099534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4562.099536] CR2: 00007f629a882000 CR3: 000000017019e004 CR4: 00000000003706f0
[ 4562.099541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4562.099542] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4562.099544] Call Trace:
[ 4562.099555]  <TASK>
[ 4562.099573]  ? die_addr+0x36/0x90
[ 4562.099583]  ? exc_general_protection+0x246/0x4a0
[ 4562.099593]  ? asm_exc_general_protection+0x26/0x30
[ 4562.099600]  ? nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099730]  nvkm_ioctl+0xa1/0x250 [nouveau]
[ 4562.099861]  nvif_object_map_handle+0xc8/0x180 [nouveau]
[ 4562.099986]  nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau]
[ 4562.100156]  ? dma_resv_test_signaled+0x26/0xb0
[ 4562.100163]  ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm]
[ 4562.100182]  ? __mutex_unlock_slowpath+0x2a/0x270
[ 4562.100189]  nouveau_ttm_fault+0x69/0xb0 [nouveau]
[ 4562.100356]  __do_fault+0x32/0x150
[ 4562.100362]  do_fault+0x7c/0x560
[ 4562.100369]  __handle_mm_fault+0x800/0xc10
[ 4562.100382]  handle_mm_fault+0x17c/0x3e0
[ 4562.100388]  do_user_addr_fault+0x208/0x860
[ 4562.100395]  exc_page_fault+0x7f/0x200
[ 4562.100402]  asm_exc_page_fault+0x26/0x30
[ 4562.100412] RIP: 0033:0x9b9870
[ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7
[ 4562.100422] RSP: 002b:00007fff9ba2dc70 EFLAGS: 00010246
[ 4562.100426] RAX: 0000000000000004 RBX: 000000000dd65e10 RCX: 000000fff0000000
[ 4562.100428] RDX: 00007f629a882000 RSI: 00007f629a882000 RDI: 0000000000000066
[ 4562.100432] RBP: 00007fff9ba2e570 R08: 0000000000000000 R09: 0000000123ddf000
[ 4562.100434] R10: 0000000000000001 R11: 0000000000000246 R12: 000000007fffffff
[ 4562.100436] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 4562.100446]  </TASK>
[ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801
[ 4562.100541]  ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse
[ 4562.100616] ---[ end trace 0000000000000000 ]---

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
2024-03-08 13:40:56 +10:00
Dave Airlie
83c34dcbe0 Merge tag 'drm-misc-fixes-2024-03-07' of https://anongit.freedesktop.org/git/drm/drm-misc into drm-fixes
A connector status polling fix, a timings fix for the Himax83102-j02
panel, a deadlock fix for nouveau, A controversial format fix for udl
that got reverted to allow further discussion, and a build fix for the
drm/buddy kunit tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307-quizzical-auburn-starling-0ade8f@houat
2024-03-08 13:38:08 +10:00
Karol Herbst
daf8739c33 drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf
If VM_BIND is enabled on the client the legacy submission ioctl can't be
used, however if a client tries to do so regardless it will return an
error. In this case the clients mutex remained unlocked leading to a
deadlock inside nouveau_drm_postclose or any other nouveau ioctl call.

Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305133853.2214268-1-kherbst@redhat.com
2024-03-05 14:59:38 +01:00
Sid Pranjale
f6ecfdad35 drm/nouveau: keep DMA buffers required for suspend/resume
Nouveau deallocates a few buffers post GPU init which are required for GPU suspend/resume to function correctly.
This is likely not as big an issue on systems where the NVGPU is the only GPU, but on multi-GPU set ups it leads to a regression where the kernel module errors and results in a system-wide rendering freeze.

This commit addresses that regression by moving the two buffers required for suspend and resume to be deallocated at driver unload instead of post init.

Fixes: 042b5f8384 ("drm/nouveau: fix several DMA buffer leaks")
Signed-off-by: Sid Pranjale <sidpranjale127@protonmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01 15:27:04 +10:00
Dave Airlie
f7916c47f6 nouveau: report byte usage in VRAM usage.
Turns out usage is always in bytes not shifted.

Fixes: 72fa02fdf8 ("nouveau: add an ioctl to report vram usage")
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01 15:26:28 +10:00
Thomas Zimmermann
379ca03b72 drm/nouveau: Include <linux/backlight.h>
Resolves the proxy include via <linux/fb.h>, which does not require the
backlight header.

v3:
	* fix grammar in commit message

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219093941.3684-3-tzimmermann@suse.de
2024-02-28 09:59:26 +01:00
Daniel Vetter
f112b68f27 Merge v6.8-rc6 into drm-next
Thomas Zimmermann asked to backmerge -rc6 for drm-misc branches,
there's a few same-area-changed conflicts (xe and amdgpu mostly) that
are getting a bit too annoying.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-02-26 11:41:07 +01:00
Dave Airlie
72fa02fdf8 nouveau: add an ioctl to report vram usage
This reports the currently used vram allocations.

userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-02-23 10:20:07 +10:00
Dave Airlie
3f4d8aac6e nouveau: add an ioctl to return vram bar size.
This returns the BAR resources size so userspace can make
decisions based on rebar support.

userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-02-23 10:20:03 +10:00
Dave Airlie
1d492944d3 nouveau/gsp: add kconfig option to enable GSP paths by default
Turing and Ampere will continue to use the old paths by default,
but we should allow distros to decide what the policy is.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214040632.661069-1-airlied@gmail.com
2024-02-23 10:00:41 +10:00
Dave Airlie
f581dbb34c Merge tag 'drm-misc-fixes-2024-02-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A list handling fix and 64bit division on 32bit platform fix for the
drm/buddy allocator, a cast warning and an initialization fix for
nouveau, a bridge handling fix for meson, an initialisation fix for
ivpu, a SPARC build fix for fbdev, a double-free fix for ttm, and two
fence handling fixes for syncobj.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/gl2antuifidtzn3dfm426p7xwh5fxj23behagwh26owfnosh2w@gqoa7vj5prnh
2024-02-23 08:09:50 +10:00
Dan Carpenter
65323796de drm/nouveau/mmu/r535: uninitialized variable in r535_bar_new_()
If gf100_bar_new_() fails then "bar" is not initialized.

Fixes: 5bf0257136 ("drm/nouveau/mmu/r535: initial support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/dab21df7-4d90-4479-97d8-97e5d228c714@moroto.mountain
2024-02-16 18:58:31 +01:00
Arnd Bergmann
0affdba22a nouveau: fix function cast warnings
clang-16 warns about casting between incompatible function types:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
  161 |         .fini = (void(*)(void *))release_firmware,

This one was done to use the generic shadow_fw_release() function as a
callback for struct nvbios_source. Change it to use the same prototype
as the other five instances, with a trivial helper function that actually
calls release_firmware.

Fixes: 70c0f263cc ("drm/nouveau/bios: pull in basic vbios subdev, more to come later")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213095753.455062-1-arnd@kernel.org
2024-02-16 18:38:47 +01:00
Dave Airlie
38df7e5e6c Merge tag 'drm-misc-fixes-2024-02-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A suspend/resume error fix for ivpu, a couple of scheduler fixes for
nouveau, a patch to support large page arrays in prime, a uninitialized
variable fix in crtc, a locking fix in rockchip/vop2 and a buddy
allocator error reporting fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b4ffqzigtfh6cgzdpwuk6jlrv3dnk4hu6etiizgvibysqgtl2p@42n2gdfdd5eu
2024-02-16 14:33:38 +10:00
Arnd Bergmann
2c80a2b715 nouveau/svm: fix kvcalloc() argument order
The conversion to kvcalloc() mixed up the object size and count
arguments, causing a warning:

drivers/gpu/drm/nouveau/nouveau_svm.c: In function 'nouveau_svm_fault_buffer_ctor':
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: error: 'kvcalloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
 1010 |         buffer->fault = kvcalloc(sizeof(*buffer->fault), buffer->entries, GFP_KERNEL);
      |                                        ^
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: note: earlier argument should specify number of elements, later size of each element

The behavior is still correct aside from the warning, but fixing it avoids
the warnings and can help the compiler track the individual objects better.

Fixes: 71e4bbca07 ("nouveau/svm: Use kvcalloc() instead of kvzalloc()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212112230.1117284-1-arnd@kernel.org
2024-02-12 15:53:49 +01:00
Danilo Krummrich
a1d8700d90 drm/nouveau: omit to create schedulers using the legacy uAPI
Omit to create scheduler instances when using the legacy uAPI. When
using the legacy NOUVEAU_GEM_PUSHBUF ioctl no scheduler instance is
required, hence omit creating scheduler instances in
nouveau_abi16_ioctl_channel_alloc().

Tested-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202000606.3526-2-dakr@redhat.com
2024-02-12 11:41:03 +01:00
Danilo Krummrich
9a0c32d698 drm/nouveau: don't fini scheduler if not initialized
nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.

Instead of embedding struct nouveau_sched into struct nouveau_cli and
struct nouveau_chan_abi16, allocate struct nouveau_sched separately,
such that we can check for the corresponding pointer to be NULL in the
particular *_fini() functions.

It makes sense to allocate struct nouveau_sched separately anyway, since
in a subsequent commit we can also avoid to allocate a struct
nouveau_sched in nouveau_abi16_ioctl_channel_alloc() at all, if the
VM_BIND uAPI has been disabled due to the legacy uAPI being used.

Fixes: 5f03a507b2 ("drm/nouveau: implement 1:1 scheduler - entity relationship")
Reported-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Closes: https://lore.kernel.org/nouveau/20240131213917.1545604-1-ttabi@nvidia.com/
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202000606.3526-1-dakr@redhat.com
2024-02-12 11:40:46 +01:00
Dave Airlie
6c2bf9ca24 Merge tag 'drm-misc-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A null pointer dereference fix for v3d, a TTM pool initialization fix,
several fixes for nouveau around register size, DMA buffer leaks and API
consistency, a multiple fixes for ivpu around MMU setup, initialization
and firmware interactions.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4wsi2i6kgkqdu7nzp4g7hxasbswnrmc5cakgf5zzvnix53u7lr@4rmp7hwblow3
2024-02-09 11:11:21 +10:00
Thomas Zimmermann
0e85f1ae4a Merge drm/drm-next into drm-misc-next
Backmerging to update drm-misc-next to the state of v6.8-rc3. Also
fixes a build problem with xe.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-02-07 13:02:20 +01:00
Timur Tabi
34e659f34a drm/nouveau: nvkm_gsp_radix3_sg() should use nvkm_gsp_mem_ctor()
Function nvkm_gsp_radix3_sg() uses nvkm_gsp_mem objects to allocate the
radix3 tables, but it unnecessarily creates those objects manually
instead of using the standard nvkm_gsp_mem_ctor() function like the
rest of the code does.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-2-ttabi@nvidia.com
2024-02-05 18:41:09 +01:00
Timur Tabi
042b5f8384 drm/nouveau: fix several DMA buffer leaks
Nouveau manages GSP-RM DMA buffers with nvkm_gsp_mem objects.  Several of
these buffers are never dealloced.  Some of them can be deallocated
right after GSP-RM is initialized, but the rest need to stay until the
driver unloads.

Also futher bullet-proof these objects by poisoning the buffer and
clearing the nvkm_gsp_mem object when it is deallocated.  Poisoning
the buffer should trigger an error (or crash) from GSP-RM if it tries
to access the buffer after we've deallocated it, because we were wrong
about when it is safe to deallocate.

Finally, change the mem->size field to a size_t because that's the same
type that dma_alloc_coherent expects.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-1-ttabi@nvidia.com
2024-02-05 18:25:13 +01:00
Dave Airlie
61712c9478 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 8d55b0a940 ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240130032643.2498315-1-airlied@gmail.com
2024-02-05 17:36:48 +01:00
Dave Airlie
f8e4806e0d Merge tag 'drm-misc-next-2024-01-11' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.9:

UAPI Changes:

virtio:
- add Venus capset defines

Cross-subsystem Changes:

Core Changes:

- fix drm_fixp2int_ceil()
- documentation fixes
- clean ups
- allow DRM_MM_DEBUG with DRM=m
- build fixes for debugfs support
- EDID cleanups
- sched: error-handling fixes
- ttm: add tests

Driver Changes:

bridge:
- ite-6505: fix DP link-training bug
- samsung-dsim: fix error checking in probe
- tc358767: fix regmap usage

efifb:
- use copy of global screen_info state

hisilicon:
- fix EDID includes

mgag200:
- improve ioremap usage
- convert to struct drm_edid

nouveau:
- disp: use kmemdup()
- fix EDID includes
- documentation fixes

panel:
- ltk050h3146w: error-handling fixes
- panel-edp: support delay between power-on and enable; use put_sync in
  unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0,
  BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
- panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
- panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings

qaic:
- fixes to BO handling
- make use of DRM managed release
- fix order of remove operations

rockchip:
- analogix_dp: get encoder port from DT
- inno_hdmi: support HDMI for RK3128
- lvds: error-handling fixes

simplefb:
- fix logging

ssd130x:
- support SSD133x plus DT bindings

tegra:
- fix error handling

tilcdc:
- make use of DRM managed release

v3d:
- show memory stats in debugfs

vc4:
- fix error handling in plane prepare_fb
- fix framebuffer test in plane helpers

vesafb:
- use copy of global screen_info state

virtio:
- cleanups

vkms:
- fix OOB access when programming the LUT
- Kconfig improvements

vmwgfx:
- unmap surface before changing plane state
- fix memory leak in error handling
- documentation fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240111154902.GA8448@linux-uq9g
2024-02-05 13:50:15 +10:00
Dave Airlie
39126abc5e nouveau: offload fence uevents work to workqueue
This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing off the work from the irq into a workqueue.

Cc: linux-stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/576237/
2024-02-02 17:15:47 +10:00
Dave Airlie
b5e69be185 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Fixes: 8d55b0a940 ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/576336/
2024-02-02 17:15:16 +10:00
Jani Nikula
041261ac4c drm/nouveau/svm: remove unused but set variables
Fix the W=1 warning -Wunused-but-set-variable.

Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8b133e7ec0e9aef728be301ac019c5ddcb3bbf51.1704908087.git.jani.nikula@intel.com
2024-01-31 21:48:37 +02:00
Jani Nikula
1cff237962 drm/nouveau/acr/ga102: remove unused but set variable
Fix the W=1 warning -Wunused-but-set-variable.

Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4d9f62fa6963acfd8b7d8f623799ba3a516e347d.1704908087.git.jani.nikula@intel.com
2024-01-31 21:48:09 +02:00
Maxime Ripard
4db102dcb0 Merge drm/drm-next into drm-misc-next
Kickstart 6.9 development cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-01-29 14:20:23 +01:00
Dave Airlie
4d7acc8f48 Revert "nouveau: push event block/allowing out of the fence context"
This reverts commit eacabb5462.

This commit causes some regressions in desktop usage, this will
reintroduce the original deadlock in DRI_PRIME situations, I've
got an idea to fix it by offloading to a workqueue in a different
spot, however this code has a race condition where we sometimes
miss interrupts so I'd like to fix that as well.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-01-27 04:04:34 +10:00
Somalapuram Amaranath
a78a8da51b drm/ttm: replace busy placement with flags v6
Instead of a list of separate busy placement add flags which indicate
that a placement should only be used when there is room or if we need to
evict.

v2: add missing TTM_PL_FLAG_IDLE for i915
v3: fix auto build test ERROR on drm-tip/drm-tip
v4: fix some typos pointed out by checkpatch
v5: cleanup some rebase problems with VMWGFX
v6: implement some missing VMWGFX functionality pointed out by Zack,
    rename the flags as suggested by Michel, rebase on drm-tip and
    adjust XE as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240112125158.2748-4-christian.koenig@amd.com
2024-01-25 09:59:44 +01:00
Linus Torvalds
e08b575815 Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
 "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
  thrown in.

  The amdgpu ones are just the usual couple of weeks of fixes. The xe
  ones are bunch of cleanups for the new xe driver, the fix you put in
  on the merge commit and the kconfig fix that was hiding the problem
  from me.

  amdgpu:
   - DSC fixes
   - DC resource pool fixes
   - OTG fix
   - DML2 fixes
   - Aux fix
   - GFX10 RLC firmware handling fix
   - Revert a broken workaround for SMU 13.0.2
   - DC writeback fix
   - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
     version

  amdkfd:
   - Fix dma-buf exports using GEM handles

  nouveau:
   - fix a unneeded WARN_ON triggering

  xe:
   - Fix for definition of wakeref_t
   - Fix for an error code aliasing
   - Fix for VM_UNBIND_ALL in the case there are no bound VMAs
   - Fixes for a number of __iomem address space mismatches reported by
     sparse
   - Fixes for the assignment of exec_queue priority
   - A Fix for skip_guc_pc not taking effect
   - Workaround for a build problem on GCC 11
   - A couple of fixes for error paths
   - Fix a Flat CCS compression metadata copy issue
   - Fix a misplace array bounds checking
   - Don't have display support depend on EXPERT (as discussed on IRC)"

* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
  nouveau/vmm: don't set addr on the fail path to avoid warning
  drm/amdgpu: Enable GFXOFF for Compute on GFX11
  drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
  drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
  drm/amdkfd: init drm_client with funcs hook
  drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
  drm/amdgpu: Fix the null pointer when load rlc firmware
  drm/amd/display: Align the returned error code with legacy DP
  drm/amd/display: Fix DML2 watermark calculation
  drm/amd/display: Clear OPTC mem select on disable
  drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
  drm/amd/display: Add logging resource checks
  drm/amd/display: Init link enc resources in dc_state only if res_pool presents
  drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
  drm/amd/display: Avoid enum conversion warning
  drm/amd/pm: Fix smuv13.0.6 current clock reporting
  drm/amd/pm: Add error log for smu v13.0.6 reset
  drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
  drm/amdgpu: drop exp hw support check for GC 9.4.3
  drm/amdgpu: move debug options init prior to amdgpu device init
  ...
2024-01-19 11:50:00 -08:00