Passing a structure by value into a function is sometimes problematic,
for a number of reasons. Of of these is a warning from the 32-bit arm
compiler:
drivers/gpu/drm/drm_gpusvm.c: In function '__drm_gpusvm_unmap_pages':
drivers/gpu/drm/drm_gpusvm.c:1152:33: note: parameter passing for argument of type 'struct drm_pagemap_addr' changed in GCC 9.1
1152 | dpagemap->ops->device_unmap(dpagemap,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1153 | dev, *addr);
| ~~~~~~~~~~~
This particular problem is harmless since we are not mixing compiler versions
inside of the compiler. However, passing this by reference avoids the warning
along with providing slightly better calling conventions as it avoids an
extra copy on the stack.
Fixes: 75af93b3f5 ("drm/pagemap, drm/xe: Support destination migration over interconnect")
Fixes: 2df55d9e66 ("drm/xe: Support pcie p2p dma as a fast interconnect")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260216134644.1025365-1-arnd@kernel.org
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit 95162db020)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In situations where no system memory is migrated to devmem, and in
upcoming patches where another GPU is performing the migration to
the newly allocated devmem buffer, there is nothing to ensure any
ongoing clear to the devmem allocation or async eviction from the
devmem allocation is complete.
Address that by passing a struct dma_fence down to the copy
functions, and ensure it is waited for before migration is marked
complete.
v3:
- New patch.
v4:
- Update the logic used for determining when to wait for the
pre_migrate_fence.
- Update the logic used for determining when to warn for the
pre_migrate_fence since the scheduler fences apparently
can signal out-of-order.
v5:
- Fix a UAF (CI)
- Remove references to source P2P migration (Himal)
- Put the pre_migrate_fence after migration.
v6:
- Pipeline the pre_migrate_fence dependency (Matt Brost)
Fixes: c5b3eb5a90 ("drm/xe: Add GPUSVM device memory copy vfunc functions")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.15+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit 16b5ad3195)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Support destination migration over interconnect when migrating from
device-private pages with the same dev_pagemap owner.
Since we now also collect device-private pages to migrate,
also abort migration if the range to migrate is already
fully populated with pages from the desired pagemap.
Finally return -EBUSY from drm_pagemap_populate_mm()
if the migration can't be completed without first migrating all
pages in the range to system. It is expected that the caller
will perform that before retrying the call to
drm_pagemap_populate_mm().
v3:
- Fix a bug where the p2p dma-address was never used.
- Postpone enabling destination interconnect migration,
since xe devices require source interconnect migration to
ensure the source L2 cache is flushed at migration time.
- Update the drm_pagemap_migrate_to_devmem() interface to
pass migration details.
v4:
- Define XE_INTERCONNECT_P2P unconditionally (CI)
- Include a missing header (CI)
v5:
- Use page order increments where possible (Matt Brost).
- Fix a negated value of can_migrate_same_pagemap.
- Move removal of some dead code to a separate patch (Matt Brost).
- Remove an unnecessary zdd get() and put() (Matt Brost).
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-23-thomas.hellstrom@linux.intel.com
Pagemaps are costly to set up and tear down, and they consume a lot
of system memory for the struct pages. Ideally they should be
created only when needed.
Add a caching mechanism to allow doing just that: Create the drm_pagemaps
when needed for migration. Keep them around to avoid destruction and
re-creation latencies and destroy inactive/unused drm_pagemaps on memory
pressure using a shrinker.
Only add the helper functions. They will be hooked up to the xe driver
in the upcoming patch.
v2:
- Add lockdep checking for drm_pagemap_put(). (Matt Brost)
- Add a copyright notice. (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-8-thomas.hellstrom@linux.intel.com
If a device holds a reference on a foregin device's drm_pagemap,
and a device unbind is executed on the foreign device,
Typically that foreign device would evict its device-private
pages and then continue its device-managed cleanup eventually
releasing its drm device and possibly allow for module unload.
However, since we're still holding a reference on a drm_pagemap,
when that reference is released and the provider module is
unloaded we'd execute out of undefined memory.
Therefore keep a reference on the provider device and module until
the last drm_pagemap reference is gone.
Note that in theory, the drm_gpusvm_helper module may be unloaded
as soon as the final module_put() of the provider driver module is
executed, so we need to add a module_exit() function that waits
for the work item executing the module_put() has completed.
v2:
- Better commit message (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-7-thomas.hellstrom@linux.intel.com
In situations where no system memory is migrated to devmem, and in
upcoming patches where another GPU is performing the migration to
the newly allocated devmem buffer, there is nothing to ensure any
ongoing clear to the devmem allocation or async eviction from the
devmem allocation is complete.
Address that by passing a struct dma_fence down to the copy
functions, and ensure it is waited for before migration is marked
complete.
v3:
- New patch.
v4:
- Update the logic used for determining when to wait for the
pre_migrate_fence.
- Update the logic used for determining when to warn for the
pre_migrate_fence since the scheduler fences apparently
can signal out-of-order.
v5:
- Fix a UAF (CI)
- Remove references to source P2P migration (Himal)
- Put the pre_migrate_fence after migration.
v6:
- Pipeline the pre_migrate_fence dependency (Matt Brost)
Fixes: c5b3eb5a90 ("drm/xe: Add GPUSVM device memory copy vfunc functions")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.15+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-4-thomas.hellstrom@linux.intel.com
If the page is part of a folio, DMA map the whole folio at once instead of
mapping individual pages one after the other. For example if 2MB folios
are used instead of 4KB pages, this reduces the number of DMA mappings by
512.
The folio order (and consequently, the size) is persisted in the struct
drm_pagemap_device_addr to be available at the time of unmapping.
v2:
- Initialize order variable (Matthew Brost)
- Set proto and dir for completeness (Matthew Brost)
- Do not populate drm_pagemap_addr, document it (Matthew Brost)
- Add and use macro NR_PAGES(order) (Matthew Brost)
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/20250805140028.599361-4-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
This struct embeds more information than just the DMA address. This will
help later to support folio orders greater than zero. At this point, there
is no functional change as the only struct member used is addr.
In Xe, adapt to the new drm_gpusvm_devmem_ops type signatures using struct
drm_pagemap_addr, as well as the internal xe SVM functions implementing
those operations. The use of this struct is propagated to xe_migrate as it
makes indexed accesses to the next DMA address but they are no longer
contiguous.
v2:
- Rename drm_pagemap_device_addr to drm_pagemap_addr (Matthew Brost)
- Squash with patch for Xe (Matthew Brost)
- Set proto and dir for completeness (Matthew Brost)
- Assess DMA map protocol (Matthew Brost)
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/20250805140028.599361-3-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
The migration functionality and track-keeping of per-pagemap VRAM
mapped to the CPU mm is not per GPU_vm, but rather per pagemap.
This is also reflected by the functions not needing the drm_gpusvm
structures. So move to drm_pagemap.
With this, drm_gpusvm shouldn't really access the page zone-device-data
since its meaning is internal to drm_pagemap. Currently it's used to
reject mapping ranges backed by multiple drm_pagemap allocations.
For now, make the zone-device-data a void pointer.
Alter the interface of drm_gpusvm_migrate_to_devmem() to ensure we don't
pass a gpusvm pointer.
Rename CONFIG_DRM_XE_DEVMEM_MIRROR to CONFIG_DRM_XE_PAGEMAP.
Matt is listed as author of this commit since he wrote most of the code,
and it makes sense to retain his git authorship.
Thomas mostly moved the code around.
v3:
- Kerneldoc fixes (CI)
- Don't update documentation about how the drm_pagemap
migration should be interpreted until upcoming
patches where the functionality is implemented.
(Matt Brost)
v4:
- More kerneldoc fixes around timeslice_ms
(Himal Ghimiray, Matt Brost)
v6:
- Fix an uninitialized pagemap pointer (CI)
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250619134035.170086-2-thomas.hellstrom@linux.intel.com
Introduce drm_pagemap ops to map and unmap dma to VRAM resources. In the
local memory case it's a matter of merely providing an offset into the
device's physical address. For future p2p the map and unmap functions may
encode as needed.
Similar to how dma-buf works, let the memory provider (drm_pagemap) provide
the mapping functionality.
v3:
- Move to drm level include
v4:
- Fix kernel doc (G.G.)
v5:
- s/map_dma/device_map (Thomas)
- s/unmap_dma/device_unmap (Thomas)
v7:
- Fix kernel doc (CI, Auld)
- Drop P2P define (Thomas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-5-matthew.brost@intel.com