In i915_gem_madvise_ioctl() we immediately purge the object is not
currently used, like when the mm.pages are NULL. With shmem the pages
might still be hanging around or are perhaps swapped out. Similarly with
ttm we might still have the pages hanging around on the ttm resource,
like with lmem or shmem, but here we need to be extra careful since
async unbinds are possible as well as in-progress kernel moves. In
i915_ttm_purge() we expect the pipeline-gutting to nuke the ttm resource
for us, however if it's busy the memory is only moved to a ghost object,
which then leads to broken behaviour when for example clearing the
i915_tt->filp, since the actual ttm_tt is still alive and populated,
even though it's been moved to the ghost object. When we later destroy
the ghost object we hit the following, since the filp is now NULL:
[ +0.006982] #PF: supervisor read access in kernel mode
[ +0.005149] #PF: error_code(0x0000) - not-present page
[ +0.005147] PGD 11631d067 P4D 11631d067 PUD 115972067 PMD 0
[ +0.005676] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.012962] Workqueue: events ttm_device_delayed_workqueue [ttm]
[ +0.006022] RIP: 0010:i915_ttm_tt_unpopulate+0x3a/0x70 [i915]
[ +0.005879] Code: 89 fb 48 85 f6 74 11 8b 55 4c 48 8b 7d 30 45 31 c0 31 c9 e8 18 6a e5 e0 80 7d 60 00 74 20 48 8b 45 68
8b 55 08 4c 89 e7 5b 5d <48> 8b 40 20 83 e2 01 41 5c 89 d1 48 8b 70
30 e9 42 b2 ff ff 4c 89
[ +0.018782] RSP: 0000:ffffc9000bf6fd70 EFLAGS: 00010202
[ +0.005244] RAX: 0000000000000000 RBX: ffff8883e12ae380 RCX: 0000000000000000
[ +0.007150] RDX: 000000008000000e RSI: ffffffff823559b4 RDI: ffff8883e12ae3c0
[ +0.007142] RBP: ffff888103b65d48 R08: 0000000000000001 R09: 0000000000000001
[ +0.007144] R10: 0000000000000001 R11: ffff88829c2c8040 R12: ffff8883e12ae3c0
[ +0.007148] R13: 0000000000000001 R14: ffff888115184140 R15: ffff888115184248
[ +0.007154] FS: 0000000000000000(0000) GS:ffff88844db00000(0000) knlGS:0000000000000000
[ +0.008108] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.005763] CR2: 0000000000000020 CR3: 000000013fdb4004 CR4: 00000000003706e0
[ +0.007152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ +0.007145] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ +0.007154] Call Trace:
[ +0.002459] <TASK>
[ +0.002126] ttm_tt_unpopulate.part.0+0x17/0x70 [ttm]
[ +0.005068] ttm_bo_tt_destroy+0x1c/0x50 [ttm]
[ +0.004464] ttm_bo_cleanup_memtype_use+0x25/0x40 [ttm]
[ +0.005244] ttm_bo_cleanup_refs+0x90/0x2c0 [ttm]
[ +0.004721] ttm_bo_delayed_delete+0x235/0x250 [ttm]
[ +0.004981] ttm_device_delayed_workqueue+0x13/0x40 [ttm]
[ +0.005422] process_one_work+0x248/0x560
[ +0.004028] worker_thread+0x4b/0x390
[ +0.003682] ? process_one_work+0x560/0x560
[ +0.004199] kthread+0xeb/0x120
[ +0.003163] ? kthread_complete_and_exit+0x20/0x20
[ +0.004815] ret_from_fork+0x1f/0x30
v2:
- Just use ttm_bo_wait() directly (Niranjana)
- Add testcase reference
Testcase: igt@gem_madvise@dontneed-evict-race
Fixes: 213d509277 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Nirmoy Das <Nirmoy.Das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115104620.120432-1-matthew.auld@intel.com
(cherry picked from commit 5524b5e52e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
swiotlb_max_segment used to return either the maximum size that swiotlb
could bounce, or for Xen PV PAGE_SIZE even if swiotlb could bounce buffer
larger mappings. This made i915 on Xen PV work as it bypasses the
coherency aspect of the DMA API and can't cope with bounce buffering
and this avoided bounce buffering for the Xen/PV case.
So instead of adding this hack back, check for Xen/PV directly in i915
for the Xen case and otherwise use the proper DMA API helper to query
the maximum mapping size.
Replace swiotlb_max_segment() calls with dma_max_mapping_size().
In i915_gem_object_get_pages_internal() no longer consider max_segment
only if CONFIG_SWIOTLB is enabled. There can be other (iommu related)
causes of specific max segment sizes.
Fixes: a2daa27c0c ("swiotlb: simplify swiotlb_max_segment")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[hch: added the Xen hack, rewrote the changelog]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020110308.1582518-1-hch@lst.de
(cherry picked from commit 78a07fe777)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Release all mmap mapping for all lmem objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.
Runtime resume the dgpu(when gem object lies in lmem).
This will transition the dgpu graphics function to D0
state if it was in D3 in order to access the mmap memory
mappings.
v2:
- Squashes the patches. [Matt Auld]
- Add adequate locking for lmem_userfault_list addition. [Matt Auld]
- Reused obj->userfault_count to avoid double addition. [Matt Auld]
- Added i915_gem_object_lock to check
i915_gem_object_is_lmem. [Matt Auld]
v3:
- Use i915_ttm_cpu_maps_iomem. [Matt Auld]
- Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
- Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
- Delete the mmaped obj from lmem_userfault_list in obj
destruction path. [Matt Auld]
- Get a wakeref for object destruction patch. [Matt Auld]
- Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
v4:
- Avoid using mmo offset to get the vma_node. [Matt Auld]
- Added comment to use the lmem_userfault_lock. [Matt Auld]
- Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
[Matt Auld]
- Fixed kernel test robot generated warning.
v5:
- Addressed the cosmetics comments. [Andi]
- Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
PCIe Specs 5.3.1.4.1
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-3-anshuman.gupta@intel.com
UAPI Changes:
- Revert "drm/i915/dg2: Add preemption changes for Wa_14015141709"
The intent of Wa_14015141709 was to inform us that userspace can no
longer control object-level preemption as it has on past platforms
(i.e., by twiddling register bit CS_CHICKEN1[0]). The description of
the workaround in the spec wasn't terribly well-written, and when we
requested clarification from the hardware teams we were told that on the
kernel side we should also probably stop setting
FF_SLICE_CS_CHICKEN1[14], which is the register bit that directs the
hardware to honor the settings in per-context register CS_CHICKEN1. It
turns out that this guidance about FF_SLICE_CS_CHICKEN1[14] was a
mistake; even though CS_CHICKEN1[0] is non-operational and useless to
userspace, there are other bits in the register that do still work and
might need to be adjusted by userspace in the future (e.g., to implement
other workarounds that show up). If we don't set
FF_SLICE_CS_CHICKEN1[14] in i915, then those future workarounds would
not take effect.
Even more details at:
https://lists.freedesktop.org/archives/intel-gfx/2022-September/305478.html
Driver Changes:
- Align GuC/HuC firmware versioning scheme to kernel practices (John)
- Fix#6639: h264 hardware video decoding broken in 5.19 on Intel(R)
Celeron(R) N3060 (Nirmoy)
- Meteorlake (MTL) enabling (Matt R)
- GuC SLPC improvements (Vinay, Rodrigo)
- Add thread execution tuning setting for ATS-M (Matt R)
- Don't start PXP without mei_pxp bind (Juston)
- Remove leftover verbose debug logging from GuC error capture (John)
- Abort suspend on low system memory conditions (Nirmoy, Matt A, Chris)
- Add DG2 Wa_16014892111 (Matt R)
- Rename ggtt_view as gtt_view (Niranjana)
- Consider HAS_FLAT_CCS() in needs_ccs_pages (Matt A)
- Don't try to disable host RPS when this was never enabled. (Rodrigo)
- Clear stalled GuC CT request after a reset (Daniele)
- Remove runtime info printing from GuC time stamp logging (Jani)
- Skip Bit12 fw domain reset for gen12+ (Sushma, Radhakrishna)
- Make GuC log sizes runtime configurable (John)
- Selftest improvements (Daniele, Matt B, Andrzej)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YxshfqUN+vDe92Zn@jlahtine-mobl.ger.corp.intel.com
If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.
v2:
- In the selftests make sure we grab the runtime pm around the reset.
Also make sure we grab the reset lock before checking if the device
is wedged, since the wedge might still be in-progress and hence the
bit might not be set yet.
- Don't wedge or put the object into an unknown state, if the request
construction fails (or similar). Just returning an error and
skipping the fallback should be safe here.
- Make sure we wedge each gt. (Thomas)
- Peek at the unknown_state in io_reserve, that way we don't have to
export or hand roll the fault_wait_for_idle. (Thomas)
- Add the missing read-side barriers for the unknown_state. (Thomas)
- Some kernel-doc fixes. (Thomas)
v3:
- Tweak the ordering of the set_wedged, also add FIXME.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
On Xe-HP and later devices, dedicated compression control state (CCS)
stored in local memory is used for each surface, to support the
3D and media compression formats.
The memory required for the CCS of the entire local memory is 1/256 of
the local memory size. So before the kernel boot, the required memory
is reserved for the CCS data and a secure register will be programmed
with the CCS base address
So when an object is allocated in local memory, dont need to explicitly
allocate the space for ccs data. But when the obj is evicted into the
smem, to hold the compression related data along with the obj extra space
is needed in smem. i.e obj_size + (obj_size/256).
Hence when a smem pages are allocated for an obj with lmem placement
possibility we create with the extra pages required for the ccs data for
the obj size.
v2:
Used imperative wording [Thomas]
v3:
Inflate the pages only when obj's placement is lmem only
v4:
GEM_BUG_ON if the ttm->num_pages > obj page size [Thomas]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
cc: Christian Koenig <christian.koenig@amd.com>
cc: Hellstrom Thomas <thomas.hellstrom@intel.com>
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220405150840.29351-9-ramalingam.c@intel.com
This way we finally fix the problem that new resource are
not immediately evict-able after allocation.
That has caused numerous problems including OOM on GDS handling
and not being able to use TTM as general resource manager.
v2: stop assuming in ttm_resource_fini that res->bo is still valid.
v3: cleanup kerneldoc, add more lockdep annotation
v4: consistently use res->num_pages
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-1-christian.koenig@amd.com
UAPI Changes:
- Weak parallel submission support for execlists
Minimal implementation of the parallel submission support for
execlists backend that was previously only implemented for GuC.
Support one sibling non-virtual engine.
Core Changes:
- Two backmerges of drm/drm-next for header file renames/changes and
i915_regs reorganization
Driver Changes:
- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)
- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)
- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)
- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
Don't wait sync while migrating, but rather make the GPU blit await the
dependencies and add a moving fence to the object.
This also enables asynchronous VRAM management in that on eviction,
rather than waiting for the moving fence to expire before freeing VRAM,
it is freed immediately and the fence is stored with the VRAM manager and
handed out to newly allocated objects to await before clears and swapins,
or for kernel objects before setting up gpu vmas or mapping.
To collect dependencies before migrating, add a set of utilities that
coalesce these to a single dma_fence.
What is still missing for fully asynchronous operation is asynchronous vma
unbinding, which is still to be implemented.
This commit substantially reduces execution time in the gem_lmem_swapping
test.
v2:
- Make a couple of functions static.
v4:
- Fix some style issues (Matthew Auld)
- Audit and add more checks for ghost objects (Matthew Auld)
- Add more documentation for the i915_deps utility (Mattew Auld)
- Simplify the i915_deps_sync() function
v6:
- Re-check for fence signaled before returning -EBUSY (Matthew Auld)
- Use dma_resv_iter_is_exclusive() (Matthew Auld)
- Await all dma-resv fences before a migration blit (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-6-thomas.hellstrom@linux.intel.com
With async migration, the shrinker may end up wanting to release the
pages of an object while the migration blit is still running, since
the GT migration code doesn't set up VMAs and the shrinker is thus
oblivious to the fact that the GPU is still using the pages.
Add waiting for gpu in the shrinker_release_pages() op and an
argument to that function indicating whether the shrinker expects it
to not wait for gpu. In the latter case the shrinker_release_pages()
op will return -EBUSY if the object is not idle.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-5-thomas.hellstrom@linux.intel.com
There is an interesting refcounting loop:
struct intel_memory_region has a struct ttm_resource_manager,
ttm_resource_manager->move may hold a reference to i915_request,
i915_request may hold a reference to intel_context,
intel_context may hold a reference to drm_i915_gem_object,
drm_i915_gem_object may hold a reference to intel_memory_region.
Break this loop by dropping region reference counting.
In addition, Have regions with a manager moving fence make sure
that all region objects are released before freeing the region.
v6:
- Fix a code comment.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-4-thomas.hellstrom@linux.intel.com
We are about to introduce failsafe- and asynchronous migration and
ttm moves.
This will add complexity and code to the TTM move code so it makes sense
to split it out to a separate file to make the i915 TTM code easer to
digest.
Split the i915 TTM move code out and since we will have to change the name
of the gpu_binds_iomem() and cpu_maps_iomem() functions anyway,
we alter the name of gpu_binds_iomem() to i915_ttm_gtt_binds_lmem() which
is more reflecting what it is used for.
With this we also add some more documentation. Otherwise there should be
no functional change.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-2-thomas.hellstrom@linux.intel.com
As we start to introduce asynchronous failsafe object migration,
where we update the object state and then submit asynchronous
commands we need to record what memory resources are actually used
by various part of the command stream. Initially for three purposes:
1) Error capture.
2) Asynchronous migration error recovery.
3) Asynchronous vma bind.
At the time where these happens, the object state may have been updated
to be several migrations ahead and object sg-tables discarded.
In order to make it possible to keep sg-tables with memory resource
information for these operations, introduce refcounted sg-tables that
aren't freed until the last user is done with them.
The alternative would be to reference information sitting on the
corresponding ttm_resources which typically have the same lifetime as
these refcountes sg_tables, but that leads to other awkward constructs:
Due to the design direction chosen for ttm resource managers that would
lead to diamond-style inheritance, the LMEM resources may sometimes be
prematurely freed, and finally the subclassed struct ttm_resource would
have to bleed into the asynchronous vma bind code.
v3:
- Address a number of style issues (Matthew Auld)
v4:
- Dont check for st->sgl being NULL in i915_ttm_tt__shmem_unpopulate(),
that should never happen. (Matthew Auld)
v5:
- Fix a Potential double-free (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211101122444.114607-1-thomas.hellstrom@linux.intel.com