Commit Graph

6873 Commits

Author SHA1 Message Date
Alistair Popple
82ba975e4c mm: allow compound zone device pages
Zone device pages are used to represent various type of device memory
managed by device drivers.  Currently compound zone device pages are not
supported.  This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.

A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.

Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.

A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head page. 
For zone device pages page->compound_head is shared with page->pgmap.

The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections.  Therefore pgmap is the same for both
head and tail pages and can be moved into the folio and we can use the
standard scheme to find compound_head from a tail page.

Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:39 -07:00
David Hildenbrand
599b684a78 mm/rmap: convert make_device_exclusive_range() to make_device_exclusive()
The single "real" user in the tree of make_device_exclusive_range() always
requests making only a single address exclusive.  The current
implementation is hard to fix for properly supporting anonymous THP /
large folios and for avoiding messing with rmap walks in weird ways.

So let's always process a single address/page and return folio + page to
minimize page -> folio lookups.  This is a preparation for further
changes.

Reject any non-anonymous or hugetlb folios early, directly after GUP.

While at it, extend the documentation of make_device_exclusive() to
clarify some things.

Link: https://lkml.kernel.org/r/20250210193801.781278-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Lyude <lyude@redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yanteng Si <si.yanteng@linux.dev>
Cc: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:05:57 -07:00
Dave Airlie
ac3a75bd42 Merge tag 'drm-misc-fixes-2025-03-06' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A Kconfig fix for nouveau, locking and timestamp fixes for imagination,
a header guard fix for sched and a DPMS regression fix for bochs.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306-antelope-of-imminent-anger-bca19e@houat
2025-03-07 07:06:34 +10:00
Dave Airlie
debda50ad5 Merge tag 'drm-misc-fixes-2025-02-27' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Fix a rounding error in vkms, a header fix for img, a connector status
fix for nouveau, and a NULL pointer dereference fix for deferred IO
drivers.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227-antique-robust-earthworm-09dfd1@houat
2025-02-28 07:51:00 +10:00
Dave Airlie
6b481ab0e6 drm/nouveau: select FW caching
nouveau tries to load some firmware during suspend that it loaded
earlier, but with fw caching disabled it hangs suspend, so just rely on
FW cache enabling instead of working around it in the driver.

Fixes: 176fdcbddf ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250207012531.621369-1-airlied@gmail.com
2025-02-27 20:27:58 +01:00
Thomas Zimmermann
01f1d77a26 drm/nouveau: Do not override forced connector status
Keep user-forced connector status even if it cannot be programmed. Same
behavior as for the rest of the drivers.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250114100214.195386-1-tzimmermann@suse.de
2025-02-26 14:18:15 -05:00
Dave Airlie
395436f3bd Merge tag 'drm-misc-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
An reset signal polarity fix for the jd9365da-h3 panel, a folio handling
fix and config fix in nouveau, a dmem cgroup descendant pool handling
fix, and a missing header for amdxdna.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250220-glorious-cockle-of-might-5b35f7@houat
2025-02-21 09:16:35 +10:00
Aaron Kling
3dbc0215e3 drm/nouveau/pmu: Fix gp10b firmware guard
Most kernel configs enable multiple Tegra SoC generations, causing this
typo to go unnoticed. But in the case where a kernel config is strictly
for Tegra186, this is a problem.

Fixes: 989863d7cb ("drm/nouveau/pmu: select implementation based on available firmware")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250218-nouveau-gm10b-guard-v2-1-a4de71500d48@gmail.com
2025-02-19 13:31:59 +01:00
David Hildenbrand
b3fefbb30a nouveau/svm: fix missing folio unlock + put after make_device_exclusive_range()
In case we have to retry the loop, we are missing to unlock+put the
folio. In that case, we will keep failing make_device_exclusive_range()
because we cannot grab the folio lock, and even return from the function
with the folio locked and referenced, effectively never succeeding the
make_device_exclusive_range().

While at it, convert the other unlock+put to use a folio as well.

This was found by code inspection.

Fixes: 8f187163eb ("nouveau/svm: implement atomic SVM access")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250124181524.3584236-2-david@redhat.com
2025-02-14 15:52:58 +01:00
Linus Torvalds
c159dfbdd4 Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
 "Mainly individually changelogged singleton patches. The patch series
  in this pull are:

   - "lib min_heap: Improve min_heap safety, testing, and documentation"
     from Kuan-Wei Chiu provides various tightenings to the min_heap
     library code

   - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms
     some cleanup and Rust preparation in the xarray library code

   - "Update reference to include/asm-<arch>" from Geert Uytterhoeven
     fixes pathnames in some code comments

   - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses
     the new secs_to_jiffies() in various places where that is
     appropriate

   - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen
     switches two filesystems to the new mount API

   - "Convert ocfs2 to use folios" from Matthew Wilcox does that

   - "Remove get_task_comm() and print task comm directly" from Yafang
     Shao removes now-unneeded calls to get_task_comm() in various
     places

   - "squashfs: reduce memory usage and update docs" from Phillip
     Lougher implements some memory savings in squashfs and performs
     some maintainability work

   - "lib: clarify comparison function requirements" from Kuan-Wei Chiu
     tightens the sort code's behaviour and adds some maintenance work

   - "nilfs2: protect busy buffer heads from being force-cleared" from
     Ryusuke Konishi fixes an issues in nlifs when the fs is presented
     with a corrupted image

   - "nilfs2: fix kernel-doc comments for function return values" from
     Ryusuke Konishi fixes some nilfs kerneldoc

   - "nilfs2: fix issues with rename operations" from Ryusuke Konishi
     addresses some nilfs BUG_ONs which syzbot was able to trigger

   - "minmax.h: Cleanups and minor optimisations" from David Laight does
     some maintenance work on the min/max library code

   - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance
     work on the xarray library code"

* tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits)
  ocfs2: use str_yes_no() and str_no_yes() helper functions
  include/linux/lz4.h: add some missing macros
  Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent
  Xarray: remove repeat check in xas_squash_marks()
  Xarray: distinguish large entries correctly in xas_split_alloc()
  Xarray: move forward index correctly in xas_pause()
  Xarray: do not return sibling entries from xas_find_marked()
  ipc/util.c: complete the kernel-doc function descriptions
  gcov: clang: use correct function param names
  latencytop: use correct kernel-doc format for func params
  minmax.h: remove some #defines that are only expanded once
  minmax.h: simplify the variants of clamp()
  minmax.h: move all the clamp() definitions after the min/max() ones
  minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
  minmax.h: reduce the #define expansion of min(), max() and clamp()
  minmax.h: update some comments
  minmax.h: add whitespace around operators and after commas
  nilfs2: do not update mtime of renamed directory that is not moved
  nilfs2: handle errors that nilfs_prepare_chunk() may return
  CREDITS: fix spelling mistake
  ...
2025-01-26 17:50:53 -08:00
Simona Vetter
07c5b27720 Merge v6.13 into drm-next
A regression was caused by commit e4b5ccd392 ("drm/v3d: Ensure job
pointer is set to NULL after job completion"), but this commit is not
yet in next-fixes, fast-forward it.

Note that this recreates Linus merge in 96c84703f1 ("Merge tag
'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel")
because I didn't want to backmerge a random point in the merge window.

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
2025-01-23 14:42:21 +01:00
Linus Torvalds
96c84703f1 Merge tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "There are two external interactions of note, the msm tree pull in some
  opp tree, hopefully the opp tree arrives from the same git tree
  however it normally does.

  There is also a new cgroup controller for device memory, that is used
  by drm, so is merging through my tree. This will hopefully help open
  up gpu cgroup usage a bit more and move us forward.

  There is a new accelerator driver for the AMD XDNA Ryzen AI NPUs.

  Then the usual xe/amdgpu/i915/msm leaders and lots of changes and
  refactors across the board:

  core:
   - device memory cgroup controller added
   - Remove driver date from drm_driver
   - Add drm_printer based hex dumper
   - drm memory stats docs update
   - scheduler documentation improvements

  new driver:
   - amdxdna - Ryzen AI NPU support

  connector:
   - add a mutex to protect ELD
   - make connector setup two-step

  panels:
   - Introduce backlight quirks infrastructure
   - New panels: KDB KD116N2130B12, Tianma TM070JDHG34-00,
   - Multi-Inno Technology MI1010Z1T-1CP11

  bridge:
   - ti-sn65dsi83: Add ti,lvds-vod-swing optional properties
   - Provide default implementation of atomic_check for HDMI bridges
   - it605: HDCP improvements, MCCS Support

  xe:
   - make OA buffer size configurable
   - GuC capture fixes
   - add ufence and g2h flushes
   - restore system memory GGTT mappings
   - ioctl fixes
   - SRIOV PF scheduling priority
   - allow fault injection
   - lots of improvements/refactors
   - Enable GuC's WA_DUAL_QUEUE for newer platforms
   - IRQ related fixes and improvements

  i915:
   - More accurate engine busyness metrics with GuC submission
   - Ensure partial BO segment offset never exceeds allowed max
   - Flush GuC CT receive tasklet during reset preparation
   - Some DG2 refactor to fix DG2 bugs when operating with certain CPUs
   - Fix DG1 power gate sequence
   - Enabling uncompressed 128b/132b UHBR SST
   - Handle hdmi connector init failures, and no HDMI/DP cases
   - More robust engine resets on Haswell and older

  i915/xe display:
   - HDCP fixes for Xe3Lpd
   - New GSC FW ARL-H/ARL-U
   - support 3 VDSC engines 12 slices
   - MBUS joining sanitisation
   - reconcile i915/xe display power mgmt
   - Xe3Lpd fixes
   - UHBR rates for Thunderbolt

  amdgpu:
   - DRM panic support
   - track BO memory stats at runtime
   - Fix max surface handling in DC
   - Cleaner shader support for gfx10.3 dGPUs
   - fix drm buddy trim handling
   - SDMA engine reset updates
   - Fix doorbell ttm cleanup
   - RAS updates
   - ISP updates
   - SDMA queue reset support
   - Rework DPM powergating interfaces
   - Documentation updates and cleanups
   - DCN 3.5 updates
   - Use a pm notifier to more gracefully handle VRAM eviction on
     suspend or hibernate
   - Add debugfs interfaces for forcing scheduling to specific engine
     instances
   - GG 9.5 updates
   - IH 4.4 updates
   - Make missing optional firmware less noisy
   - PSP 13.x updates
   - SMU 13.x updates
   - VCN 5.x updates
   - JPEG 5.x updates
   - GC 12.x updates
   - DC FAMS updates

  amdkfd:
   - GG 9.5 updates
   - Logging improvements
   - Shader debugger fixes
   - Trap handler cleanup
   - Cleanup includes
   - Eviction fence wq fix

  msm:
   - MDSS:
      - properly described UBWC registers
      - added SM6150 (aka QCS615) support
   - DPU:
      - added SM6150 (aka QCS615) support
      - enabled wide planes if virtual planes are enabled (by using two
        SSPPs for a single plane)
      - added CWB hardware blocks support
   - DSI:
      - added SM6150 (aka QCS615) support
   - GPU:
      - Print GMU core fw version
      - GMU bandwidth voting for a740 and a750
      - Expose uche trap base via uapi
      - UAPI error reporting

  rcar-du:
   - Add r8a779h0 Support

  ivpu:
   - Fix qemu crash when using passthrough

  nouveau:
   - expose GSP-RM logging buffers via debugfs

  panfrost:
   - Add MT8188 Mali-G57 MC3 support

  rockchip:
   - Gamma LUT support

  hisilicon:
   - new HIBMC support

  virtio-gpu:
   - convert to helpers
   - add prime support for scanout buffers

  v3d:
   - Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL

  vc4:
   - Add support for BCM2712

  vkms:
   - line-per-line compositing algorithm to improve performance

  zynqmp:
   - Add DP audio support

  mediatek:
   - dp: Add sdp path reset
   - dp: Support flexible length of DP calibration data

  etnaviv:
   - add fdinfo memory support
   - add explicit reset handling"

* tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel: (1070 commits)
  drm/bridge: fix documentation for the hdmi_audio_prepare() callback
  doc/cgroup: Fix title underline length
  drm/doc: Include new drm-compute documentation
  cgroup/dmem: Fix parameters documentation
  cgroup/dmem: Select PAGE_COUNTER
  kernel/cgroup: Remove the unused variable climit
  drm/display: hdmi: Do not read EDID on disconnected connectors
  drm/tests: hdmi: Add connector disablement test
  drm/connector: hdmi: Do atomic check when necessary
  drm/amd/display: 3.2.316
  drm/amd/display: avoid reset DTBCLK at clock init
  drm/amd/display: improve dpia pre-train
  drm/amd/display: Apply DML21 Patches
  drm/amd/display: Use HW lock mgr for PSR1
  drm/amd/display: Revised for Replay Pseudo vblank control
  drm/amd/display: Add a new flag for replay low hz
  drm/amd/display: Remove unused read_ono_state function from Hwss module
  drm/amd/display: Do not elevate mem_type change to full update
  drm/amd/display: Do not wait for PSR disable on vbl enable
  drm/amd/display: Remove unnecessary eDP power down
  ...
2025-01-21 16:09:47 -08:00
Linus Torvalds
9bffa1ad25 Merge tag 'drm-fixes-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Final(?) set of fixes for 6.13, I think the holidays finally caught up
  with everyone, the misc changes are 2 weeks worth, otherwise amdgpu
  and xe are most of it. The largest pieces is a new test so I'm not too
  worried about that.

  kunit:
   - Fix W=1 build for kunit tests

  bridge:
   - Handle YCbCr420 better in bridge code, with tests
   - itee-it6263 error handling fix

  amdgpu:
   - SMU 13 fix
   - DP MST fixes
   - DCN 3.5 fix
   - PSR fixes
   - eDP fix
   - VRR fix
   - Enforce isolation fixes
   - GFX 12 fix
   - PSP 14.x fix

  xe:
   - Add steering info support for GuC register lists
   - Add means to wait for reset and synchronous reset
   - Make changing ccs_mode a synchronous action
   - Add missing mux registers
   - Mark ComputeCS read mode as UC on iGPU, unblocking ULLS on iGPU

  i915:
   - Relax clear color alignment to 64 bytes [fb]

  v3d:
   - Fix warn when unloading v3d

  nouveau:
   - Fix cross-device fence handling in nouveau
   - Fix backlight regression for macbooks 5,1

  vmwgfx:
   - Fix BO reservation handling in vmwgfx"

* tag 'drm-fixes-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel: (33 commits)
  drm/xe: Mark ComputeCS read mode as UC on iGPU
  drm/xe/oa: Add missing VISACTL mux registers
  drm/xe: make change ccs_mode a synchronous action
  drm/xe: introduce xe_gt_reset and xe_gt_wait_for_reset
  drm/xe/guc: Adding steering info support for GuC register lists
  drm/bridge: ite-it6263: Prevent error pointer dereference in probe()
  drm/v3d: Ensure job pointer is set to NULL after job completion
  drm/vmwgfx: Add new keep_resv BO param
  drm/vmwgfx: Remove busy_places
  drm/vmwgfx: Unreserve BO on error
  drm/amdgpu: fix fw attestation for MP0_14_0_{2/3}
  drm/amdgpu: always sync the GFX pipe on ctx switch
  drm/amdgpu: disable gfxoff with the compute workload on gfx12
  drm/amdgpu: Fix Circular Locking Dependency in AMDGPU GFX Isolation
  drm/i915/fb: Relax clear color alignment to 64 bytes
  drm/amd/display: Disable replay and psr while VRR is enabled
  drm/amd/display: Fix PSR-SU not support but still call the amdgpu_dm_psr_enable
  nouveau/fence: handle cross device fences properly
  drm/tests: connector: Add ycbcr_420_allowed tests
  drm/connector: hdmi: Validate supported_formats matches ycbcr_420_allowed
  ...
2025-01-16 19:49:26 -08:00
Dave Airlie
fa6493440f Merge tag 'drm-misc-fixes-2025-01-15' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.13:

- itee-it6263 error handling fix.
- Fix warn when unloading v3d.
- Fix W=1 build for kunit tests.
- Fix backlight regression for macbooks 5,1 in nouveau.
- Handle YCbCr420 better in bridge code, with tests.
- Fix cross-device fence handling in nouveau.
- Fix BO reservation handling in vmwgfx.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a89adcd5-2042-4e7f-93f4-2b299bb1ef17@linux.intel.com
2025-01-16 11:54:14 +10:00
Chris Bainbridge
14578923e8 ACPI: video: Fix random crashes due to bad kfree()
Commit c6a837088b ("drm/amd/display: Fetch the EDID from _DDC if
available for eDP") added function dm_helpers_probe_acpi_edid(), which
fetches the EDID from the BIOS by calling acpi_video_get_edid().

acpi_video_get_edid() returns a pointer to the EDID, but this pointer
does not originate from kmalloc() - it is actually the internal
"pointer" field from an acpi_buffer struct (which did come from
kmalloc()).

dm_helpers_probe_acpi_edid() then attempts to kfree() the EDID pointer,
resulting in memory corruption which leads to random, intermittent
crashes (e.g. 4% of boots will fail with some Oops).

Fix this by allocating a new array (which can be safely freed) for the
EDID data, and correctly freeing the acpi_buffer pointer.

The only other caller of acpi_video_get_edid() is nouveau_acpi_edid():
remove the extraneous kmemdup() here as the EDID data is now copied in
acpi_video_device_EDID().

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Fixes: c6a837088b ("drm/amd/display: Fetch the EDID from _DDC if available for eDP")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Closes: https://lore.kernel.org/amd-gfx/20250110175252.GBZ4FedNKqmBRaY4T3@fat_crate.local/T/#m324a23eb4c4c32fa7e89e31f8ba96c781e496fb1
Link: https://patch.msgid.link/Z4K_oQL7eA9Owkbs@debian.local
[ rjw: Changed function description comment into a kerneldoc one ]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-13 21:09:10 +01:00
Dave Airlie
1f9910b41c nouveau/fence: handle cross device fences properly
The fence sync logic doesn't handle a fence sync across devices
as it tries to write to a channel offset from one device into
the fence bo from a different device, which won't work so well.

This patch fixes that to avoid using the sync path in the case
where the fences come from different nouveau drm devices.

This works fine on a single device as the fence bo is shared
across the devices, and mapped into each channels vma space,
the channel offsets are therefore okay to pass between sides,
so one channel can sync on the seqnos from the other by using
the offset into it's vma.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
[ Fix compilation issue; remove version log from commit messsage.
  - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250109005553.623947-1-airlied@gmail.com
2025-01-13 15:49:49 +01:00
Yafang Shao
7e70433c2b drivers: remove get_task_comm() and print task comm directly
Since task->comm is guaranteed to be NUL-terminated, we can print it
directly without the need to copy it into a separate buffer.  This
simplifies the code and avoids unnecessary operations.

Link: https://lkml.kernel.org/r/20241219023452.69907-6-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org> (For tty)
Reviewed-by: Lyude Paul <lyude@redhat.com> (For nouveau)
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: "André Almeida" <andrealmeid@igalia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 20:21:16 -08:00
Jani Nikula
79cb1fad39 drm/mst: remove mgr parameter and debug logging from drm_dp_get_vc_payload_bw()
The struct drm_dp_mst_topology_mgr *mgr parameter is only used for debug
logging in case the passed in link rate or lane count are zero. There's
no further error checking as such, and the function returns 0.

There should be no case where the parameters are zero. The returned
value is generally used as a divisor, and if we were hitting this, we'd
be seeing division by zero.

Just remove the debug logging altogether, along with the mgr parameter,
so that the function can be used in non-MST contexts without the
topology manager.

v2: Also remove drm_dp_mst_helper_tests_init as unnecessary (Imre)

Cc: Imre Deak <imre.deak@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/72d77e7a7fe69c784e9df048b7e6f250fd7599e4.1735912293.git.jani.nikula@intel.com
2025-01-07 18:43:18 +02:00
Takashi Iwai
35243fc777 drm/nouveau/disp: Fix missing backlight control on Macbook 5,1
Macbook 5,1 with MCP79 lost its backlight control since the recent
change for supporting GSP-RM; it rewrote the whole nv50 backlight
control code and each display engine is supposed to have an entry for
IOR bl callback, but it didn't cover mcp77.

This patch adds the missing bl entry initialization for mcp77 display
engine to recover the backlight control.

Fixes: 2274ce7e36 ("drm/nouveau/disp: add output backlight control methods")
Cc: stable@vger.kernel.org
Link: https://bugzilla.suse.com/show_bug.cgi?id=1223838
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250102114944.11499-1-tiwai@suse.de
2025-01-07 15:45:55 +01:00
Imre Deak
5a83c9293c drm/nouveau/dp_mst: Expose a connector to kernel users after it's properly initialized
After a connector is added to the drm_mode_config::connector_list, it's
visible to any in-kernel users looking up connectors via the above list.
Make sure that the connector is properly initialized before such
look-ups, by initializing the connector with
drm_connector_dynamic_init() - which doesn't add the connector to the
list - and registering it with drm_connector_dynamic_register() - which
adds the connector to the list - after the initialization is complete.

v2: Fix s/drm_connector_dynamic_register()/drm_connector_dynamic_init()
    typo in the commit log.

Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241211230328.4012496-8-imre.deak@intel.com
2024-12-17 16:03:53 +02:00
Maarten Lankhorst
33f029af89 Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
The v6.13-rc2 release included a bunch of breaking changes,
specifically the MODULE_IMPORT_NS commit.

Backmerge in order to fix them before the next pull-request.

Include the fix from Stephen Roswell.

Caused by commit

  25c3fd1183 ("drm/virtio: Add a helper to map and note the dma addrs and lengths")

Interacting with commit

  cdd30ebb1b ("module: Convert symbol namespace to string literal")

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://patchwork.freedesktop.org/patch/msgid/20241209121717.2abe8026@canb.auug.org.au
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2024-12-09 16:35:21 +01:00
Jani Nikula
cb2e1c2136 drm: remove driver date from struct drm_driver and all drivers
We stopped using the driver initialized date in commit 7fb8af6798
("drm: deprecate driver date") and (eventually) started returning "0"
for drm_version ioctl instead.

Finish the job, and remove the unused date member from struct
drm_driver, its initialization from drivers, along with the common
DRIVER_DATE macros.

v2: Also update drivers/accel (kernel test robot)

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Simon Ser <contact@emersion.fr>
Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1f2bf2543aed270a06f6c707fd6ed1b78bf16712.1733322525.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-12-05 12:35:42 +02:00
Timur Tabi
214c9539cf drm/nouveau: expose GSP-RM logging buffers via debugfs
The LOGINIT, LOGINTR, LOGRM, and LOGPMU buffers are circular buffers
that have printf-like logs from GSP-RM and PMU encoded in them.

LOGINIT, LOGINTR, and LOGRM are allocated by Nouveau and their DMA
addresses are passed to GSP-RM during initialization. The buffers are
required for GSP-RM to initialize properly.

LOGPMU is also allocated by Nouveau, but its contents are updated
when Nouveau receives an NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPC from
GSP-RM. Nouveau then copies the RPC to the buffer.

The messages are encoded as an array of variable-length structures that
contain the parameters to an NV_PRINTF call. The format string and
parameter count are stored in a special ELF image that contains only
logging strings. This image is not currently shipped with the Nvidia
driver.

There are two methods to extract the logs.

OpenRM tries to load the logging ELF, and if present, parses the log
buffers in real time and outputs the strings to the kernel console.

Alternatively, and this is the method used by this patch, the buffers
can be exposed to user space, and a user-space tool (along with the
logging ELF image) can parse the buffer and dump the logs.

This method has the advantage that it allows the buffers to be parsed
even when the logging ELF file is not available to the user. However,
it has the disadvantage the debugfs entries need to remain until the
driver is unloaded.

The buffers are exposed via debugfs. If GSP-RM fails to initialize, then
Nouveau immediately shuts down the GSP interface. This would normally
also deallocate the logging buffers, thereby preventing the user from
capturing the debug logs.

To avoid this, introduce the keep-gsp-logging command line parameter. If
specified, and if at least one logging buffer has content, then Nouveau
will migrate these buffers into new debugfs entries that are retained
until the driver unloads.

An end-user can capture the logs using the following commands:

    cp /sys/kernel/debug/nouveau/<path>/loginit loginit
    cp /sys/kernel/debug/nouveau/<path>/logrm logrm
    cp /sys/kernel/debug/nouveau/<path>/logintr logintr
    cp /sys/kernel/debug/nouveau/<path>/logpmu logpmu

where (for a PCI device) <path> is the PCI ID of the GPU (e.g.
0000:65:00.0).

Since LOGPMU is not needed for normal GSP-RM operation, it is only
created if debugfs is available. Otherwise, the
NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPCs are ignored.

A simple way to test the buffer migration feature is to have
nvkm_gsp_init() return an error code.

Tested-by: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241030202952.694055-2-ttabi@nvidia.com
2024-12-04 21:47:53 +01:00
Timur Tabi
7c995e2fd9 drm/nouveau: retain device pointer in nvkm_gsp_mem object
Store the struct device pointer used to allocate the DMA buffer in
the nvkm_gsp_mem object.  This allows nvkm_gsp_mem_dtor() to release
the buffer without needing the nvkm_gsp.  This is needed so that
we can retain DMA buffers even after the nvkm_gsp object is deleted.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241030202952.694055-1-ttabi@nvidia.com
2024-12-04 21:47:47 +01:00
Danilo Krummrich
97118a1816 drm/nouveau: create module debugfs root
Typically DRM drivers use the DRM debugfs root entry. However, since
Nouveau is heading towards a split into a core and a DRM driver, create
a module specific debugfs root directory.

Subsequent patches make use of this new debugfs root in order to store
GSP-RM log bufferes (optionally beyond a device driver binding).

Acked-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241125142639.9126-1-dakr@kernel.org
2024-12-04 21:43:46 +01:00
Maxime Ripard
3aba2eba84 Merge drm/drm-next into drm-misc-next
Kickstart 6.14 cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-12-02 12:44:18 +01:00
Linus Torvalds
e70140ba0d Get rid of 'remove_new' relic from platform driver struct
The continual trickle of small conversion patches is grating on me, and
is really not helping.  Just get rid of the 'remove_new' member
function, which is just an alias for the plain 'remove', and had a
comment to that effect:

  /*
   * .remove_new() is a relic from a prototype conversion of .remove().
   * New drivers are supposed to implement .remove(). Once all drivers are
   * converted to not use .remove_new any more, it will be dropped.
   */

This was just a tree-wide 'sed' script that replaced '.remove_new' with
'.remove', with some care taken to turn a subsequent tab into two tabs
to make things line up.

I did do some minimal manual whitespace adjustment for places that used
spaces to line things up.

Then I just removed the old (sic) .remove_new member function, and this
is the end result.  No more unnecessary conversion noise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-01 15:12:43 -08:00
Zhi Wang
01ed662bdd nvkm: correctly calculate the available space of the GSP cmdq buffer
r535_gsp_cmdq_push() waits for the available page in the GSP cmdq
buffer when handling a large RPC request. When it sees at least one
available page in the cmdq, it quits the waiting with the amount of
free buffer pages in the queue.

Unfortunately, it always takes the [write pointer, buf_size) as
available buffer pages before rolling back and wrongly calculates the
size of the data should be copied. Thus, it can overwrite the RPC
request that GSP is currently reading, which causes GSP hang due
to corrupted RPC request:

[  549.209389] ------------[ cut here ]------------
[  549.214010] WARNING: CPU: 8 PID: 6314 at drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:116 r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.225678] Modules linked in: nvkm(E+) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) mlx5_ib(E) amd64_edac(E) edac_mce_amd(E) kvm_amd(E) ib_uverbs(E) kvm(E) ib_core(E) acpi_ipmi(E) ipmi_si(E) mxm_wmi(E) ipmi_devintf(E) rapl(E) i2c_piix4(E) wmi_bmof(E) joydev(E) ptdma(E) acpi_cpufreq(E) k10temp(E) pcspkr(E) ipmi_msghandler(E) xfs(E) libcrc32c(E) ast(E) i2c_algo_bit(E) crct10dif_pclmul(E) drm_shmem_helper(E) nvme_tcp(E) crc32_pclmul(E) ahci(E) drm_kms_helper(E) libahci(E) nvme_fabrics(E) crc32c_intel(E) nvme(E) cdc_ether(E) mlx5_core(E) nvme_core(E) usbnet(E) drm(E) libata(E) ccp(E) ghash_clmulni_intel(E) mii(E) t10_pi(E) mlxfw(E) sp5100_tco(E) psample(E) pci_hyperv_intf(E) wmi(E) dm_multipath(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) be2iscsi(E) bnx2i(E) cnic(E) uio(E) cxgb4i(E) cxgb4(E) tls(E) libcxgbi(E) libcxgb(E) qla4xxx(E)
[  549.225752]  iscsi_boot_sysfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) fuse(E) [last unloaded: gsp_log(E)]
[  549.326293] CPU: 8 PID: 6314 Comm: insmod Tainted: G            E      6.9.0-rc6+ #1
[  549.334039] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022
[  549.341781] RIP: 0010:r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.347343] Code: 08 00 00 89 da c1 e2 0c 48 8d ac 11 00 10 00 00 48 8b 0c 24 48 85 c9 74 1f c1 e0 0c 4c 8d 6d 30 83 e8 30 89 01 e9 68 ff ff ff <0f> 0b 49 c7 c5 92 ff ff ff e9 5a ff ff ff ba ff ff ff ff be c0 0c
[  549.366090] RSP: 0018:ffffacbccaaeb7d0 EFLAGS: 00010246
[  549.371315] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000923e28
[  549.378451] RDX: 0000000000000000 RSI: 0000000055555554 RDI: ffffacbccaaeb730
[  549.385590] RBP: 0000000000000001 R08: ffff8bd14d235f70 R09: ffff8bd14d235f70
[  549.392721] R10: 0000000000000002 R11: ffff8bd14d233864 R12: 0000000000000020
[  549.399854] R13: ffffacbccaaeb818 R14: 0000000000000020 R15: ffff8bb298c67000
[  549.406988] FS:  00007f5179244740(0000) GS:ffff8bd14d200000(0000) knlGS:0000000000000000
[  549.415076] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  549.420829] CR2: 00007fa844000010 CR3: 00000001567dc005 CR4: 0000000000770ef0
[  549.427963] PKRU: 55555554
[  549.430672] Call Trace:
[  549.433126]  <TASK>
[  549.435233]  ? __warn+0x7f/0x130
[  549.438473]  ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.443426]  ? report_bug+0x18a/0x1a0
[  549.447098]  ? handle_bug+0x3c/0x70
[  549.450589]  ? exc_invalid_op+0x14/0x70
[  549.454430]  ? asm_exc_invalid_op+0x16/0x20
[  549.458619]  ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.463565]  r535_gsp_msg_recv+0x46/0x230 [nvkm]
[  549.468257]  r535_gsp_rpc_push+0x106/0x160 [nvkm]
[  549.473033]  r535_gsp_rpc_rm_ctrl_push+0x40/0x130 [nvkm]
[  549.478422]  nvidia_grid_init_vgpu_types+0xbc/0xe0 [nvkm]
[  549.483899]  nvidia_grid_init+0xb1/0xd0 [nvkm]
[  549.488420]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.493213]  nvkm_device_pci_probe+0x305/0x420 [nvkm]
[  549.498338]  local_pci_probe+0x46/0xa0
[  549.502096]  pci_call_probe+0x56/0x170
[  549.505851]  pci_device_probe+0x79/0xf0
[  549.509690]  ? driver_sysfs_add+0x59/0xc0
[  549.513702]  really_probe+0xd9/0x380
[  549.517282]  __driver_probe_device+0x78/0x150
[  549.521640]  driver_probe_device+0x1e/0x90
[  549.525746]  __driver_attach+0xd2/0x1c0
[  549.529594]  ? __pfx___driver_attach+0x10/0x10
[  549.534045]  bus_for_each_dev+0x78/0xd0
[  549.537893]  bus_add_driver+0x112/0x210
[  549.541750]  driver_register+0x5c/0x120
[  549.545596]  ? __pfx_nvkm_init+0x10/0x10 [nvkm]
[  549.550224]  do_one_initcall+0x44/0x300
[  549.554063]  ? do_init_module+0x23/0x240
[  549.557989]  do_init_module+0x64/0x240

Calculate the available buffer page before rolling back based on
the result from the waiting.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017071922.2518724-3-zhiw@nvidia.com
2024-11-25 17:36:07 +01:00
Zhi Wang
8d9beb4aeb nvkm/gsp: correctly advance the read pointer of GSP message queue
A GSP event message consists three parts: message header, RPC header,
message body. GSP calculates the number of pages to write from the
total size of a GSP message. This behavior can be observed from the
movement of the write pointer.

However, nvkm takes only the size of RPC header and message body as
the message size when advancing the read pointer. When handling a
two-page GSP message in the non rollback case, It wrongly takes the
message body of the previous message as the message header of the next
message. As the "message length" tends to be zero, in the calculation of
size needs to be copied (0 - size of (message header)), the size needs to
be copied will be "0xffffffxx". It also triggers a kernel panic due to a
NULL pointer error.

[  547.614102] msg: 00000f90: ff ff ff ff ff ff ff ff 40 d7 18 fb 8b 00 00 00  ........@.......
[  547.622533] msg: 00000fa0: 00 00 00 00 ff ff ff ff ff ff ff ff 00 00 00 00  ................
[  547.630965] msg: 00000fb0: ff ff ff ff ff ff ff ff 00 00 00 00 ff ff ff ff  ................
[  547.639397] msg: 00000fc0: ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff  ................
[  547.647832] nvkm 0000:c1:00.0: gsp: peek msg rpc fn:0 len:0x0/0xffffffffffffffe0
[  547.655225] nvkm 0000:c1:00.0: gsp: get msg rpc fn:0 len:0x0/0xffffffffffffffe0
[  547.662532] BUG: kernel NULL pointer dereference, address: 0000000000000020
[  547.669485] #PF: supervisor read access in kernel mode
[  547.674624] #PF: error_code(0x0000) - not-present page
[  547.679755] PGD 0 P4D 0
[  547.682294] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  547.686643] CPU: 22 PID: 322 Comm: kworker/22:1 Tainted: G            E      6.9.0-rc6+ #1
[  547.694893] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022
[  547.702626] Workqueue: events r535_gsp_msgq_work [nvkm]
[  547.707921] RIP: 0010:r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  547.713375] Code: 00 8b 70 08 48 89 e1 31 d2 4c 89 f7 e8 12 f5 ff ff 48 89 c5 48 85 c0 0f 84 cf 00 00 00 48 81 fd 00 f0 ff ff 0f 87 c4 00 00 00 <8b> 55 10 41 8b 46 30 85 d2 0f 85 f6 00 00 00 83 f8 04 76 10 ba 05
[  547.732119] RSP: 0018:ffffabe440f87e10 EFLAGS: 00010203
[  547.737335] RAX: 0000000000000010 RBX: 0000000000000008 RCX: 000000000000003f
[  547.744461] RDX: 0000000000000000 RSI: ffffabe4480a8030 RDI: 0000000000000010
[  547.751585] RBP: 0000000000000010 R08: 0000000000000000 R09: ffffabe440f87bb0
[  547.758707] R10: ffffabe440f87dc8 R11: 0000000000000010 R12: 0000000000000000
[  547.765834] R13: 0000000000000000 R14: ffff9351df1e5000 R15: 0000000000000000
[  547.772958] FS:  0000000000000000(0000) GS:ffff93708eb00000(0000) knlGS:0000000000000000
[  547.781035] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  547.786771] CR2: 0000000000000020 CR3: 00000003cc220002 CR4: 0000000000770ef0
[  547.793896] PKRU: 55555554
[  547.796600] Call Trace:
[  547.799046]  <TASK>
[  547.801152]  ? __die+0x20/0x70
[  547.804211]  ? page_fault_oops+0x75/0x170
[  547.808221]  ? print_hex_dump+0x100/0x160
[  547.812226]  ? exc_page_fault+0x64/0x150
[  547.816152]  ? asm_exc_page_fault+0x22/0x30
[  547.820341]  ? r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  547.825184]  r535_gsp_msgq_work+0x42/0x50 [nvkm]
[  547.829845]  process_one_work+0x196/0x3d0
[  547.833861]  worker_thread+0x2fc/0x410
[  547.837613]  ? __pfx_worker_thread+0x10/0x10
[  547.841885]  kthread+0xdf/0x110
[  547.845031]  ? __pfx_kthread+0x10/0x10
[  547.848775]  ret_from_fork+0x30/0x50
[  547.852354]  ? __pfx_kthread+0x10/0x10
[  547.856097]  ret_from_fork_asm+0x1a/0x30
[  547.860019]  </TASK>
[  547.862208] Modules linked in: nvkm(E) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) amd64_edac(E) mlx5_ib(E) edac_mce_amd(E) kvm_amd(E) ib_uverbs(E) kvm(E) ib_core(E) acpi_ipmi(E) ipmi_si(E) ipmi_devintf(E) mxm_wmi(E) joydev(E) rapl(E) ptdma(E) i2c_piix4(E) acpi_cpufreq(E) wmi_bmof(E) pcspkr(E) k10temp(E) ipmi_msghandler(E) xfs(E) libcrc32c(E) ast(E) i2c_algo_bit(E) drm_shmem_helper(E) crct10dif_pclmul(E) drm_kms_helper(E) ahci(E) crc32_pclmul(E) nvme_tcp(E) libahci(E) nvme(E) crc32c_intel(E) nvme_fabrics(E) cdc_ether(E) nvme_core(E) usbnet(E) mlx5_core(E) ghash_clmulni_intel(E) drm(E) libata(E) ccp(E) mii(E) t10_pi(E) mlxfw(E) sp5100_tco(E) psample(E) pci_hyperv_intf(E) wmi(E) dm_multipath(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) be2iscsi(E) bnx2i(E) cnic(E) uio(E) cxgb4i(E) cxgb4(E) tls(E) libcxgbi(E) libcxgb(E) qla4xxx(E)
[  547.862283]  iscsi_boot_sysfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) fuse(E) [last unloaded: gsp_log(E)]
[  547.962691] CR2: 0000000000000020
[  547.966003] ---[ end trace 0000000000000000 ]---
[  549.012012] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1370499158 wd_nsec: 1370498904
[  549.043676] pstore: backend (erst) writing error (-28)
[  549.050924] RIP: 0010:r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  549.056389] Code: 00 8b 70 08 48 89 e1 31 d2 4c 89 f7 e8 12 f5 ff ff 48 89 c5 48 85 c0 0f 84 cf 00 00 00 48 81 fd 00 f0 ff ff 0f 87 c4 00 00 00 <8b> 55 10 41 8b 46 30 85 d2 0f 85 f6 00 00 00 83 f8 04 76 10 ba 05
[  549.075138] RSP: 0018:ffffabe440f87e10 EFLAGS: 00010203
[  549.080361] RAX: 0000000000000010 RBX: 0000000000000008 RCX: 000000000000003f
[  549.087484] RDX: 0000000000000000 RSI: ffffabe4480a8030 RDI: 0000000000000010
[  549.094609] RBP: 0000000000000010 R08: 0000000000000000 R09: ffffabe440f87bb0
[  549.101733] R10: ffffabe440f87dc8 R11: 0000000000000010 R12: 0000000000000000
[  549.108857] R13: 0000000000000000 R14: ffff9351df1e5000 R15: 0000000000000000
[  549.115982] FS:  0000000000000000(0000) GS:ffff93708eb00000(0000) knlGS:0000000000000000
[  549.124061] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  549.129807] CR2: 0000000000000020 CR3: 00000003cc220002 CR4: 0000000000770ef0
[  549.136940] PKRU: 55555554
[  549.139653] Kernel panic - not syncing: Fatal exception
[  549.145054] Kernel Offset: 0x18c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  549.165074] ---[ end Kernel panic - not syncing: Fatal exception ]---

Also, nvkm wrongly advances the read pointer when handling a two-page GSP
message in the rollback case. In the rollback case, the GSP message will
be copied in two rounds. When handling a two-page GSP message, nvkm first
copies amount of (GSP_PAGE_SIZE - header) data into the buffer, then
advances the read pointer by the result of DIV_ROUND_UP(size,
GSP_PAGE_SIZE). Thus, the read pointer is advanced by 1.

Next, nvkm copies the amount of (total size - (GSP_PAGE_SIZE -
header)) data into the buffer. The left amount of the data will be always
larger than one page since the message header is not taken into account
in the first copy. Thus, the read pointer is advanced by DIV_ROUND_UP(
size(larger than one page), GSP_PAGE_SIZE) = 2.

In the end, the read pointer is wrongly advanced by 3 when handling a
two-page GSP message in the rollback case.

Fix the problems by taking the total size of the message into account
when advancing the read pointer and calculate the read pointer in the end
of the all copies for the rollback case.

BTW: the two-page GSP message can be observed in the msgq when vGPU is
enabled.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017071922.2518724-2-zhiw@nvidia.com
2024-11-25 17:36:06 +01:00
Linus Torvalds
28eb75e178 Merge tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "There's a lot of rework, the panic helper support is being added to
  more drivers, v3d gets support for HW superpages, scheduler
  documentation, drm client and video aperture reworks, some new
  MAINTAINERS added, amdgpu has the usual lots of IP refactors, Intel
  has some Pantherlake enablement and xe is getting some SRIOV bits, but
  just lots of stuff everywhere.

  core:
   - split DSC helpers from DP helpers
   - clang build fixes for drm/mm test
   - drop simple pipeline support for gem vram
   - document submission error signaling
   - move drm_rect to drm core module from kms helper
   - add default client setup to most drivers
   - move to video aperture helpers instead of drm ones

  tests:
   - new framebuffer tests

  ttm:
   - remove swapped and pinned BOs from TTM lru

  panic:
   - fix uninit spinlock
   - add ABGR2101010 support

  bridge:
   - add TI TDP158 support
   - use standard PM OPS

  dma-fence:
   - use read_trylock instead of read_lock to help lockdep

  scheduler:
   - add errno to sched start to report different errors
   - add locking to drm_sched_entity_modify_sched
   - improve documentation

  xe:
   - add drm_line_printer
   - lots of refactoring
   - Enable Xe2 + PES disaggregation
   - add new ARL PCI ID
   - SRIOV development work
   - fix exec unnecessary implicit fence
   - define and parse OA sync props
   - forcewake refactoring

  i915:
   - Enable BMG/LNL ultra joiner
   - Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+
   - use DSB for plane/color mgmt
   - Arrow lake PCI IDs
   - lots of i915/xe display refactoring
   - enable PXP GuC autoteardown
   - Pantherlake (PTL) Xe3 LPD display enablement
   - Allow fastset HDR infoframe changes
   - write DP source OUI for non-eDP sinks
   - share PCI IDs between i915 and xe

  amdgpu:
   - SDMA queue reset support
   - SMU 13.0.6, JPEG 4.0.3 updates
   - Initial runtime repartitioning support
   - rework IP structs for multiple IP instances
   - Fetch EDID from _DDC if available
   - SMU13 zero rpm user control
   - lots of fixes/cleanups

  amdkfd:
   - Increase event FIFO size
   - add topology cap flag for per queue reset

  msm:
   - DPU:
      - SA8775P support
      - (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support
      - Enable large framebuffer support
      - Drop MSM8998 and SDM845
   - DP:
      - SA8775P support
   - GPU:
      - a7xx preemption support
      - Adreno A663 support

  ast:
   - warn about unsupported TX chips

  ivpu:
   - add coredump
   - add pantherlake support

  rockchip:
   - 4K@60Hz display enablement
   - generate pll programming tables

  panthor:
   - add timestamp query API
   - add realtime group priority
   - add fdinfo support

  etnaviv:
   - improve handling of DMA address limits
   - improve GPU hangcheck

  exynos:
   - Decon Exynos7870 support

  mediatek:
   - add OF graph support

  omap:
   - locking fixes

  bochs:
   - convert to gem/shmem from simpledrm

  v3d:
   - support big/super pages
   - add gemfs

  vc4:
   - BCM2712 support refactoring
   - add YUV444 format support

  udmabuf:
   - folio related fixes

  nouveau:
   - add panic support on nv50+"

* tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel: (1583 commits)
  drm/xe/guc: Fix dereference before NULL check
  drm/amd: Fix initialization mistake for NBIO 7.7.0
  Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"
  drm/amd/display: Fix failure to read vram info due to static BP_RESULT
  drm/amdgpu: enable GTT fallback handling for dGPUs only
  drm/amd/amdgpu: limit single process inside MES
  drm/fourcc: add AMD_FMT_MOD_TILE_GFX9_4K_D_X
  drm/amdgpu/mes12: correct kiq unmap latency
  drm/amdgpu: Support vcn and jpeg error info parsing
  drm/amd : Update MES API header file for v11 & v12
  drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling
  drm/amdkfd: change kfd process kref count at creation
  drm/amdgpu: Cleanup shift coding style
  drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data
  drm/amdgpu: Implement virt req_ras_err_count
  drm/amdgpu: VF Query RAS Caps from Host if supported
  drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry
  drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
  drm/amd/display: 3.2.309
  drm/amd/display: Adjust VSDB parser for replay feature
  ...
2024-11-21 14:56:17 -08:00
Thomas Zimmermann
b86711c6d6 drm/client: Move public client header to clients/ subdirectory
Move the public header file drm_client_setup.h to the clients/
subdirectory and update all drivers. No functional changes.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241108154600.126162-3-tzimmermann@suse.de
2024-11-15 09:42:13 +01:00
Dave Airlie
9776c0a75a nouveau/dp: handle retries for AUX CH transfers with GSP.
eb284f4b37 drm/nouveau/dp: Honor GSP link training retry timeouts

tried to fix a problem with panel retires, however it appears
the auxch also needs the same treatment, so add the same retry
wrapper around it.

This fixes some eDP panels after a suspend/resume cycle.

Fixes: eb284f4b37 ("drm/nouveau/dp: Honor GSP link training retry timeouts")
Cc: stable@vger.kernel.org
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-2-airlied@gmail.com
2024-11-14 11:50:01 +10:00
Dave Airlie
b6ad7debf5 nouveau: handle EBUSY and EAGAIN for GSP aux errors.
The upper layer transfer functions expect EBUSY as a return
for when retries should be done.

Fix the AUX error translation, but also check for both errors
in a few places.

Fixes: eb284f4b37 ("drm/nouveau/dp: Honor GSP link training retry timeouts")
Cc: stable@vger.kernel.org
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-1-airlied@gmail.com
2024-11-14 11:50:01 +10:00
Dave Airlie
21ec425eaf nouveau: fw: sync dma after setup is called.
When this code moved to non-coherent allocator the sync was put too
early for some firmwares which called the setup function, move the
sync down after the setup function.

Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Fixes: 9b340aeb26 ("nouveau/firmware: use dma non-coherent allocator")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114004603.3095485-1-airlied@gmail.com
2024-11-14 11:50:01 +10:00
Maarten Lankhorst
d78f0ee040 Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
Didn't notice drm/drm-next had the build fix for drm_bridge, so ended up
committing the same patch. Sync with drm and pretend it didn't happen?

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2024-11-04 14:45:21 +01:00
Jocelyn Falempe
1d26c846f3 drm/nouveau: Add drm_panic support for nv50+
Add drm_panic support for nv50+ cards.
It's enough to get the panic screen while running Gnome/Wayland with
an nv50+ nvidia GPU.
It doesn't support multi-plane or compressed format yet.
Tiling is tested on GTX1650 (Turing), GeForce GT 1030 (Pascal) and
Geforce 8800 GTS (Tesla).

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241022185553.1103384-4-jfalempe@redhat.com
2024-11-04 12:42:56 +01:00
Jocelyn Falempe
74cfa1efe2 drm/nouveau/disp: Move tiling functions to dispnv50/tile.h
Refactor, and move the tiling geometry functions to dispnv50/tile.h,
so they can be re-used by drm_panic.
No functional impact.

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241022185553.1103384-3-jfalempe@redhat.com
2024-11-04 12:38:06 +01:00
Dave Airlie
30169bb645 Backmerge v6.12-rc6 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
Backmerge Linus tree for some drm-fixes needed for msm and xe merges.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-11-04 14:25:33 +10:00
Li Huafei
a2f599046c drm/nouveau/gr/gf100: Fix missing unlock in gf100_gr_chan_new()
When the call to gf100_grctx_generate() fails, unlock gr->fecs.mutex
before returning the error.

Fixes smatch warning:

drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:480 gf100_gr_chan_new() warn: inconsistent returns '&gr->fecs.mutex'.

Fixes: ca081fff6e ("drm/nouveau/gr/gf100-: generate golden context during first object alloc")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241026173844.2392679-1-lihuafei1@huawei.com
2024-10-29 15:48:56 -04:00
Thomas Zimmermann
4785658660 drm/nouveau: Suspend and resume clients with client helpers
Replace calls to drm_fb_helper_set_suspend_unlocked() with calls
to the client functions drm_client_dev_suspend() and
drm_client_dev_resume(). Any registered in-kernel client will now
receive suspend and resume events.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241014085740.582287-10-tzimmermann@suse.de
2024-10-18 09:23:03 +02:00
Thomas Zimmermann
df7e8b522a drm/client: Move client event handlers to drm_client_event.c
A number of DRM-client functions serve as entry points from device
operations to client code. Moving them info a separate file will later
allow for a more fine-grained kernel configuration. For most of the
users it is sufficient to include <drm/drm_client_event.h> instead of
the full driver-side interface in <drm/drm_client.h>

v2:
- rename new files to drm_client_event.{c,h}

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241014085740.582287-7-tzimmermann@suse.de
2024-10-18 09:23:03 +02:00
Thomas Zimmermann
92f6453c9f drm/nouveau: Use video aperture helpers
DRM's aperture functions have long been implemented as helpers
under drivers/video/ for use with fbdev. Avoid the DRM wrappers by
calling the video functions directly.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930130921.689876-13-tzimmermann@suse.de
2024-10-14 15:28:47 +02:00
Dave Airlie
b634acb2a0 Merge tag 'drm-misc-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

fbdev-dma:
- Only clean up deferred I/O if instanciated

nouveau:
- dmem: Fix privileged error in copy engine channel; Fix possible
data leak in migrate_to_ram()
- gsp: Fix coding style

sched:
- Avoid leaking lockdep map

v3d:
- Stop active perfmon before destroying it

vc4:
- Stop active perfmon before destroying it

xe:
- Drop GuC submit_wq pool

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20241010133708.GA461532@localhost.localdomain
2024-10-11 09:03:30 +10:00
Dave Airlie
aa628ebb06 Merge tag 'drm-misc-next-2024-10-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.13:

UAPI Changes:
- Add drm fdinfo support to panthor, and add sysfs knob to toggle.

Cross-subsystem Changes:
- Convert fbdev drivers to use backlight power constants.
- Some small dma-fence fixes.
- Some kernel-doc fixes.

Core Changes:
- Small drm client fixes.
- Document requirements that you need to file a bug before marking a test as flaky.
- Remove swapped and pinned bo's from TTM lru list.

Driver Changes:
- Assorted small fixes to panel/elida-kd35t133, nouveau, vc4, imx.
- Fix some bridges to drop cached edids on power off.
- Add Jenson BL-JT60050-01A, Samsung s6e3ha8 & AMS639RQ08 panels.
- Make 180° rotation work on ilitek-ili9881c, even for already-rotated
  panels.
-

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/panthor/panthor_drv.c
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8dc111ca-d20c-4e0d-856e-c12d208cbf2a@linux.intel.com
2024-10-11 05:39:10 +10:00
Dave Airlie
54bc1d3255 Merge tag 'drm-misc-next-2024-09-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.13:

UAPI Changes:
- panthor: Add realtime group priority and priority query.

Cross-subsystem Changes:
- Add Vivek Kasireddy as udmabuf maintainer.
- Assorted udmabuf changes.
- Device tree binding updates.
- dmabuf documentation fixes.
- Move drm_rect to drm core module from kms helper.

Core Changes:
- Update scheduler documentation and concurrency fixes.
- drm/ci updates.
- Add memory-agnostic fbdev client and client-agnostic setup helper.
- Huge driver conversion for using the above.

Driver Changes:
- Assorted fixes to imx, panel/nt35510, sti, accel/ivpu, v3d, vkms,
  host1x.
- Add panel quirks for AYA NEO panels.
- Make module autoloading work for bridge/it6505 and mcde.
- Add huge page support to v3d using a custom shmfs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a9b95e6f-9f35-464e-83f6-bda75b35ee0b@linux.intel.com
2024-10-09 11:58:39 +10:00
Dave Airlie
7fefa1edc2 Merge tag 'drm-misc-next-2024-09-20' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.12:

UAPI Changes:
- Add panthor/DEV_QUERY_TIMESTAMP_INFO query.

Cross-subsystem Changes:
- Updated dt bindings.
- Add documentation explaining default errnos for fences.
- Mark dma-buf heaps creation functions as __init.

Core Changes:
- Split DSC helpers from DP helpers.
- Clang build fixes for drm/mm test.
- Remove simple pipeline support for gem-vram,
  no longer any users left after converting bochs.
- Add erno to drm_sched_start to distinguish between GPU and queue
  reset.
- Add drm_framebuffer testcases.
- Fix uninitialized spinlock acquisition with CONFIG_DRM_PANIC=n.
- Use read_trylock instead of read_lock in dma_fence_begin_signalling to
  quiesce lockdep.

Driver Changes:
- Assorted small fixes and updates for tegra, host1x, imagination,
  nouveau, panfrost, panthor, panel/ili9341, mali, exynos,
  panel/samsung-s6e3fa7, ast, bridge/ti-sn65dsi86, panel/himax-hx83112a,
  bridge/tc358767, bridge/imx8mp-hdmi-tx, panel/khadas-ts050,
  panel/nt36523, panel/sony-acx565akm, kmb, accel/qaic, omap, v3d.
- Add bridge/TI TDP158.
- Assorted documentation updates.
- Convert bochs from simple drm to gem shmem, and check modes
  against available memory.
- Many VC4 fixes, most related to scaling and YUV support.
- Convert some drivers to use SYSTEM_SLEEP_PM_OPS and RUNTIME_PM_OPS.
- Rockchip 4k@60 support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/445713a6-2427-4c53-8ec2-3a894ec62405@linux.intel.com
2024-10-09 09:03:46 +10:00
Yonatan Maman
835745a377 nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
The `nouveau_dmem_copy_one` function ensures that the copy push command is
sent to the device firmware but does not track whether it was executed
successfully.

In the case of a copy error (e.g., firmware or hardware failure), the
copy push command will be sent via the firmware channel, and
`nouveau_dmem_copy_one` will likely report success, leading to the
`migrate_to_ram` function returning a dirty HIGH_USER page to the user.

This can result in a security vulnerability, as a HIGH_USER page that may
contain sensitive or corrupted data could be returned to the user.

To prevent this vulnerability, we allocate a zero page. Thus, in case of
an error, a non-dirty (zero) page will be returned to the user.

Fixes: 5be73b6908 ("drm/nouveau/dmem: device memory helpers for SVM")
Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Co-developed-by: Gal Shalom <GalShalom@Nvidia.com>
Signed-off-by: Gal Shalom <GalShalom@Nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Cc: stable@vger.kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-3-ymaman@nvidia.com
2024-10-08 14:23:38 +02:00
Yonatan Maman
04e0481526 nouveau/dmem: Fix privileged error in copy engine channel
When `nouveau_dmem_copy_one` is called, the following error occurs:

[272146.675156] nouveau 0000:06:00.0: fifo: PBDMA9: 00000004 [HCE_PRIV]
ch 1 00000300 00003386

This indicates that a copy push command triggered a Host Copy Engine
Privileged error on channel 1 (Copy Engine channel). To address this
issue, modify the Copy Engine channel to allow privileged push commands

Fixes: 6de125383a ("drm/nouveau/fifo: expose runlist topology info on all chipsets")
Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Co-developed-by: Gal Shalom <GalShalom@Nvidia.com>
Signed-off-by: Gal Shalom <GalShalom@Nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-2-ymaman@nvidia.com
2024-10-08 14:23:35 +02:00
Colin Ian King
301d194d01 drm/nouveau/gsp: remove extraneous ; after mutex
The mutex field has two following semicolons, replace this with just
one semicolon.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240917120856.1877733-1-colin.i.king@gmail.com
2024-10-03 23:49:40 +02:00
Benjamin Szőke
231bb9b4c4 drm/nouveau/i2c: rename aux.c and aux.h to auxch.c and auxch.h
The goal is to clean-up Linux repository from AUX file names, because
the use of such file names is prohibited on other operating systems
such as Windows, so the Linux repository cannot be cloned and
edited on them.

Signed-off-by: Benjamin Szőke <egyszeregy@freemail.hu>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603091558.35672-1-egyszeregy@freemail.hu
2024-10-03 20:05:16 +02:00