This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
Initialize the eb.vma array with values of 0 when the eb structure is
first set up. In particular, this sets the eb->vma[i].vma pointers to
NULL, simplifying cleanup and getting rid of the bug described below.
During the execution of eb_lookup_vmas(), the eb->vma array is
successively filled up with struct eb_vma objects. This process includes
calling eb_add_vma(), which might fail; however, even in the event of
failure, eb->vma[i].vma is set for the currently processed buffer.
If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which
prompts a call to eb_release_vmas() to clean up the mess. Since
eb_lookup_vmas() might fail during processing any (possibly not first)
buffer, eb_release_vmas() checks whether a buffer's vma is NULL to know
at what point did the lookup function fail.
In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper
function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is
set to NULL in case i915_gem_object_userptr_submit_init() fails; the
current one needs to be cleaned up by eb_release_vmas() at this point,
so the next one is set. If eb_add_vma() fails, neither the current nor
the next vma is set to NULL, which is a source of a NULL deref bug
described in the issue linked in the Closes tag.
When entering eb_lookup_vmas(), the vma pointers are set to the slab
poison value, instead of NULL. This doesn't matter for the actual
lookup, since it gets overwritten anyway, however the eb_release_vmas()
function only recognizes NULL as the stopping value, hence the pointers
are being set to NULL as they go in case of intermediate failure. This
patch changes the approach to filling them all with NULL at the start
instead, rather than handling that manually during failure.
Reported-by: Gangmin Kim <km.kim1503@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15062
Fixes: 544460c338 ("drm/i915: Multi-BB execbuf")
Cc: stable@vger.kernel.org # 5.16.x
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20251216180900.54294-2-krzysztof.niemiec@intel.com
(cherry picked from commit 08889b706d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
drm-misc-next for v6.19-rc1:
UAPI Changes:
- Add userptr support to ivpu.
- Add IOCTL's for resource and telemetry data in amdxdna.
Core Changes:
- Improve some atomic state checking handling.
- drm/client updates.
- Use forward declarations instead of including drm_print.h
- RUse allocation flags in ttm_pool/device_init and allow specifying max
useful pool size and propagate ENOSPC.
- Updates and fixes to scheduler and bridge code.
- Add support for quirking DisplayID checksum errors.
Driver Changes:
- Assorted cleanups and fixes in rcar-du, accel/ivpu, panel/nv3052cf,
sti, imxm, accel/qaic, accel/amdxdna, imagination, tidss, sti,
panthor, vkms.
- Add Samsung S6E3FC2X01 DDIC/AMS641RW, Synaptics TDDI series DSI,
TL121BVMS07-00 (IL79900A) panels.
- Add mali MediaTek MT8196 SoC gpu support.
- Add etnaviv GC8000 Nano Ultra VIP r6205 support.
- Document powervr ge7800 support in the devicetree.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/5afae707-c9aa-4a47-b726-5e1f1aa7a106@linux.intel.com
The struct_mutex will be removed from the DRM subsystem, as it was a
legacy BKL that was only used by i915 driver. After review, it was
concluded that its usage was no longer necessary
This patch updates various comments in the i915/gem and i915/gt
codebase to either remove or clarify references to struct_mutex, in
order to prevent future misunderstandings.
* i915_gem_execbuffer.c: Replace reference to struct_mutex with
vm->mutex, as noted in the eb_reserve() function, which states that
vm->mutex handles deadlocks.
* i915_gem_object.c: Change struct_mutex by
drm_i915_gem_object->vma.lock. i915_gem_object_unbind() in i915_gem.c
states that this lock is who actually protects the unbind.
* i915_gem_shrinker.c: The correct lock is actually i915->mm.obj, as
already documented in its declaration.
* i915_gem_wait.c: The existing comment already mentioned that
struct_mutex was no longer necessary. Updated to refer to a generic
global lock instead.
* intel_reset_types.h: Cleaned up the comment text. Updated to refer to
a generic global lock instead.
Signed-off-by: Luiz Otavio Mello <luiz.mello@estudante.ufscar.br>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250908131518.36625-6-luiz.mello@estudante.ufscar.br
Acked-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Driver Changes:
- Fix SLPC wait boosting reference counting to avoid getting stuck on non-boost
frequency on power saving profile on DG1/DG2 (Vinay)
- Add 20ms delay to engine reset for robustness on HSW (Nitin)
- Use proper sleeping functions for timeouts shorter than 20ms (Andi)
- Fix fence not released on early probe errors for HuC (Janusz)
- Remove const from struct i915_wa list allocation (Kees)
- Apply SPDX license format where missing and use single-line format (Andi)
- Whitespace fixes (Dan, Andi)
- Selftest improvements (Mikolaj, Badal, Sk,
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aBxNYp0IviE23zy-@jlahtine-mobl
Driver Changes:
- Expose fan speed via hwmon (Raag)
- Correction to Wa_14019159160 on ARL (John H)
- Whitelist COMMON_SLICE_CHICKEN1 for UMD access on DG2/MTL/ARL (Dnyaneshwar)
- Do not attempt to load the GSC multiple times to avoid hanging GSC HW (Daniele)
- Populate /sys/class/drm/cardX/engines/ even if one engine fails (Andi)
- Use kmemdup_array instead of kmemdup for multiple allocation (Yu)
- Remove extra unlikely() (Hongbo)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ztrfr_Wuurfa-3Rv@jlahtine-mobl.ger.corp.intel.com
Need to take some Xe bo definition in here before
we can add the BMG display 64k aligned size restrictions.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
UAPI Changes:
- Limit the number of relocations to INT_MAX (Tvrtko)
Only impact should be synthetic tests.
Driver Changes:
- Fix for #11396: GPU Hang and rcs0 reset on Cherrytrail platform
- Fix Virtual Memory mapping boundaries calculation (Andi)
- Fix for #11255: Long hangs in buddy allocator with DG2/A380 without
Resizable BAR since 6.9 (David)
- Mark the GT as dead when mmio is unreliable (Chris, Andi)
- Workaround additions / fixes for MTL, ARL and DG2 (John H, Nitin)
- Enable partial memory mapping of GPU virtual memory (Andi, Chris)
- Prevent NULL deref on intel_memory_regions_hw_probe (Jonathan, Dan)
- Avoid UAF on intel_engines_release (Krzysztof)
- Don't update PWR_CLK_STATE starting Gen12 (Umesh)
- Code and dmesg cleanups (Andi, Jesus, Luca)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZshcfSqgfnl8Mh4P@jlahtine-mobl.ger.corp.intel.com
Kernel test robot reports i915 can hit a warn in kvmalloc_node which has
a purpose of dissalowing crazy size kernel allocations. This was added in
7661809d49 ("mm: don't allow oversized kvmalloc() calls"):
/* Don't even allow crazy sizes */
if (WARN_ON_ONCE(size > INT_MAX))
return NULL;
This would be kind of okay since i915 at one point dropped the need for
making a shadow copy of the relocation list, but then it got re-added in
fd1500fcd4 ("Revert "drm/i915/gem: Drop relocation slowpath".") a year
after Linus added the above warning.
It is plausible that the issue was not seen until now because to trigger
gem_exec_reloc test requires a combination of an relatively older
generation hardware but with at least 8GiB of RAM installed. Probably even
more depending on runtime checks.
Lets cap what we allow userspace to pass in using the matching limit.
There should be no issue for real userspace since we are talking about
"crazy" number of relocations which have no practical purpose.
*) Well IGT tests might get upset but they can be easily adjusted.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202405151008.6ddd1aaf-oliver.sang@intel.com
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521101201.18978-1-tursulin@igalia.com
There were several instances of the string "assocat" in the kernel, which
should have been spelled "associat", with the various endings of -ive,
-ed, -ion, and sometimes beginnging with dis-.
Add to the spelling dictionary the corrections so that future instances
will be caught by checkpatch, and fix the instances found.
Originally noticed by accident with a 'git grep socat'.
Link: https://lkml.kernel.org/r/20240612001247.356867-1-jesse.brandeburg@intel.com
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This reverts commit 1f33dc0c11.
There was a patch supposed to fix an issue of illegal attempts to free a
still active i915 VMA object when parking a GT believed to be idle,
reported by CI on 2-GT Meteor Lake. As a solution, an extra wakeref for
a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit
f56fe3e917 ("drm/i915: Fix a VMA UAF for multi-gt platform").
However, that fix occurred insufficient -- the issue was still reported by
CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs. Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.
Since that issue was fixed by another commit f3c71b2ded ("drm/i915/vma:
Fix UAF on destroy against retire race"), the changes introduced by that
insufficient fix were dropped as no longer useful. However, that series
resulted in another VMA UAF scenario now being triggered in CI.
<4> [260.290809] ------------[ cut here ]------------
<4> [260.290988] list_del corruption. prev->next should be ffff888118c5d990, but was ffff888118c5a510. (prev=ffff888118c5a510)
<4> [260.291004] WARNING: CPU: 2 PID: 1143 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xb7/0xe0
..
<4> [260.291055] CPU: 2 PID: 1143 Comm: kms_plane Not tainted 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.291058] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.291060] RIP: 0010:__list_del_entry_valid_or_report+0xb7/0xe0
...
<4> [260.291087] Call Trace:
<4> [260.291089] <TASK>
<4> [260.291124] i915_vma_reopen+0x43/0x80 [i915]
<4> [260.291298] eb_lookup_vmas+0x9cb/0xcc0 [i915]
<4> [260.291579] i915_gem_do_execbuffer+0xc9a/0x26d0 [i915]
<4> [260.291883] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [260.292301] </TASK>
...
<4> [260.292506] ---[ end trace 0000000000000000 ]---
<4> [260.292782] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ca3: 0000 [#1] PREEMPT SMP NOPTI
<4> [260.303575] CPU: 2 PID: 1143 Comm: kms_plane Tainted: G W 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.313851] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.326359] RIP: 0010:eb_validate_vmas+0x114/0xd80 [i915]
...
<4> [260.428756] Call Trace:
<4> [260.431192] <TASK>
<4> [639.283393] i915_gem_do_execbuffer+0xd05/0x26d0 [i915]
<4> [639.305245] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [639.411134] </TASK>
...
<4> [639.449979] ---[ end trace 0000000000000000 ]---
We defer actually closing, unbinding and destroying a VMA until next idle
point, or until the object is freed in the meantime. By postponing the
unbind, we allow for the VMA to be reopened by the client, avoiding the
work required to rebind the VMA.
Starting from commit b0647a5e79 ("drm/i915: Avoid live-lock with
i915_vma_parked()"), we assume that as long as a GT is held idle, no VMA
would be reopened while we destroy them. That assumption is no longer
true in multi-GT configurations, where a VMA we reopen may be handled by a
GT different from the one that we already keep active via its engine while
we set up an execbuf request.
Restoring the extra GT0 PM wakeref removed from i915_gem_do_execbuffer()
processing path seems to fix this issue.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10608
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Fixes: 1f33dc0c11 ("drm/i915: Remove extra multi-gt pm-references")
Link: https://patchwork.freedesktop.org/patch/msgid/20240506180253.96858-2-janusz.krzysztofik@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 749670a58d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This reverts commit 1f33dc0c11.
There was a patch supposed to fix an issue of illegal attempts to free a
still active i915 VMA object when parking a GT believed to be idle,
reported by CI on 2-GT Meteor Lake. As a solution, an extra wakeref for
a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit
f56fe3e917 ("drm/i915: Fix a VMA UAF for multi-gt platform").
However, that fix occurred insufficient -- the issue was still reported by
CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs. Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.
Since that issue was fixed by another commit f3c71b2ded ("drm/i915/vma:
Fix UAF on destroy against retire race"), the changes introduced by that
insufficient fix were dropped as no longer useful. However, that series
resulted in another VMA UAF scenario now being triggered in CI.
<4> [260.290809] ------------[ cut here ]------------
<4> [260.290988] list_del corruption. prev->next should be ffff888118c5d990, but was ffff888118c5a510. (prev=ffff888118c5a510)
<4> [260.291004] WARNING: CPU: 2 PID: 1143 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xb7/0xe0
..
<4> [260.291055] CPU: 2 PID: 1143 Comm: kms_plane Not tainted 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.291058] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.291060] RIP: 0010:__list_del_entry_valid_or_report+0xb7/0xe0
...
<4> [260.291087] Call Trace:
<4> [260.291089] <TASK>
<4> [260.291124] i915_vma_reopen+0x43/0x80 [i915]
<4> [260.291298] eb_lookup_vmas+0x9cb/0xcc0 [i915]
<4> [260.291579] i915_gem_do_execbuffer+0xc9a/0x26d0 [i915]
<4> [260.291883] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [260.292301] </TASK>
...
<4> [260.292506] ---[ end trace 0000000000000000 ]---
<4> [260.292782] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ca3: 0000 [#1] PREEMPT SMP NOPTI
<4> [260.303575] CPU: 2 PID: 1143 Comm: kms_plane Tainted: G W 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.313851] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.326359] RIP: 0010:eb_validate_vmas+0x114/0xd80 [i915]
...
<4> [260.428756] Call Trace:
<4> [260.431192] <TASK>
<4> [639.283393] i915_gem_do_execbuffer+0xd05/0x26d0 [i915]
<4> [639.305245] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [639.411134] </TASK>
...
<4> [639.449979] ---[ end trace 0000000000000000 ]---
We defer actually closing, unbinding and destroying a VMA until next idle
point, or until the object is freed in the meantime. By postponing the
unbind, we allow for the VMA to be reopened by the client, avoiding the
work required to rebind the VMA.
Starting from commit b0647a5e79 ("drm/i915: Avoid live-lock with
i915_vma_parked()"), we assume that as long as a GT is held idle, no VMA
would be reopened while we destroy them. That assumption is no longer
true in multi-GT configurations, where a VMA we reopen may be handled by a
GT different from the one that we already keep active via its engine while
we set up an execbuf request.
Restoring the extra GT0 PM wakeref removed from i915_gem_do_execbuffer()
processing path seems to fix this issue.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10608
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Fixes: 1f33dc0c11 ("drm/i915: Remove extra multi-gt pm-references")
Link: https://patchwork.freedesktop.org/patch/msgid/20240506180253.96858-2-janusz.krzysztofik@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
UAPI Changes:
- drm/i915/guc: Use context hints for GT frequency
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.
Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.
The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
- drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.
While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance
***
NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
default to mitigate a hardware bug. All the EUs will still remain
usable, and all the userspace drivers have been confirmed to be able
to dynamically detect the change in number of CCS engines and adjust.
For the smaller percent of applications that get perf benefit from
letting the userspace driver dispatch across all 4 CCS engines we will
be introducing a sysfs control as a later patch to choose 4 CCS each
with 25% EUs (or 50% if 2 CCS).
NOTE: A regression has been reported at
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
However Andi has been triaging the issue and we're closing in a fix
to the gap in the W/A implementation:
https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html
Driver Changes:
- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)
- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)
. Selftest improvements (Janusz, Nirmoy, Daniele)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com
There was an attempt to fix an issue of illegal attempts to free a still
active i915 VMA object when parking a GT believed to be idle, reported by
CI on 2-GT Meteor Lake. As a solution, an extra wakeref for a Primary GT
was acquired from i915_gem_do_execbuffer() -- see commit f56fe3e917
("drm/i915: Fix a VMA UAF for multi-gt platform").
However, that fix occurred insufficient -- the issue was still reported by
CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs. Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.
Since the issue has now been fixed by a preceding patch "drm/i915/vma: Fix
UAF on destroy against retire race", drop the no longer useful changes
introduced by that insufficient fix.
v3: Also drop the no longer used .wakeref_gt0 field from struct
i915_execbuffer.
v2: Avoid the word "revert" in commit message (Rodrigo),
- update commit description reusing relevant chunks dropped from the
description of the proper fix (Rodrigo).
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-7-janusz.krzysztofik@linux.intel.com
Never block for outstanding work on userptr object upon receipt of a
mmu-notifier. The reason we originally did so was to immediately unbind
the userptr and unpin its pages, but since that has been dropped in
commit b4b9731b02 ("drm/i915: Simplify userptr locking"), we never
return the pages to the system i.e. never drop our page->mapcount and so
do not allow the page and CPU PTE to be revoked. Based on this history,
we know we are safe to drop the wait entirely.
Upon return from mmu-notifier, we will still have the userptr pages
pinned preventing the following PTE operation (such as try_to_unmap)
adjusting the vm_area_struct, so it is safe to keep the pages around for
as long as we still have i/o pending.
We do not have any means currently to asynchronously revalidate the
userptr pages, that is always prior to next use.
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231128162505.3493942-1-jonathan.cavitt@intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults and preemption disables.
In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
kmap_atomic() in eb_relocate_entry(), and is unmapped by
kunmap_atomic() in reloc_cache_reset().
And this mapping/unmapping occurs in two places: one is in
eb_relocate_vma(), and another is in eb_relocate_vma_slow().
The function eb_relocate_vma() or eb_relocate_vma_slow() doesn't
need to disable pagefaults and preemption during the above mapping/
unmapping.
So it can simply use kmap_local_page() / kunmap_local() that can
instead do the mapping / unmapping regardless of the context.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-10-zhao1.liu@linux.intel.com
GCC 14 introduces a new -Walloc-size included in -Wextra which errors out
like:
```
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c: In function ‘eb_copy_relocations’:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1681:24: error: allocation of insufficient size ‘1’ for type ‘struct drm_i915_gem_relocation_entry’ with size ‘32’ [-Werror=alloc-size]
1681 | relocs = kvmalloc_array(size, 1, GFP_KERNEL);
| ^
```
So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 element of size `size`. GCC then sees we're not
doing anything wrong.
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107215538.1891359-1-sam@gentoo.org
Driver Changes:
- Avoid infinite GPU waits by avoidin premature release of request's
reusable memory (Chris, Janusz)
- Expose RPS thresholds in sysfs (Tvrtko)
- Apply GuC SLPC min frequency softlimit correctly (Vinay)
- Restore SLPC efficient freq earlier (Vinay)
- Consider OA buffer boundary when zeroing out reports (Umesh)
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL (Matt R)
- Fix context workarounds with non-masked regs on MTL/DG2 (Lucas)
- Enable the CCS_FLUSH bit in the pipe control and in the CS for MTL+ (Andi)
- Update MTL workarounds 14018778641, 22016122933 (Tejas, Zhanjun)
- Ensure memory quiesced before AUX CCS invalidation (Jonathan)
- Add a gsc_info debugfs (Daniele)
- Invalidate the TLBs on each GT on multi-GT device (Chris)
- Fix a VMA UAF for multi-gt platform (Nirmoy)
- Do not use stolen on MTL due to HW bug (Nirmoy)
- Check HuC and GuC version compatibility on MTL (Daniele)
- Dump perf_limit_reasons for slow GuC init debug (Vinay)
- Replace kmap() with kmap_local_page() (Sumitra, Ira)
- Add sentinel to xehp_oa_b_counters for KASAN (Andrzej)
- Add the gen12_needs_ccs_aux_inv helper (Andi)
- Fixes and updates for GSC memory allocation (Daniele)
- Fix one wrong caching mode enum usage (Tvrtko)
- Fixes for GSC wakeref (Alan)
- Static checker fixes (Harshit, Arnd, Dan, Cristophe, David, Andi)
- Rename flags with bit_group_X according to the datasheet (Andi)
- Use direct alias for i915 in requests (Andrzej)
- Replace i915->gt0 with to_gt(i915) (Andi)
- Use the i915_vma_flush_writes helper (Tvrtko)
- Selftest improvements (Alan)
- Remove dead code (Tvrtko)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# Conflicts:
# drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMy6kDd9npweR4uy@jlahtine-mobl.ger.corp.intel.com
Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.
From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity. PAT is expected to work much like MOCS already works today,
and by design userspace is expected to select the index that exactly
matches the desired behavior described in the hardware specification.
For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For kernel objects, cache_level is used
for simplicity and backward compatibility. For Pre-gen12 platforms PAT can
have 1:1 mapping to i915_cache_level, so these two are interchangeable. see
the use of LEGACY_CACHELEVEL.
One consequence of this change is that gen8_pte_encode is no longer working
for gen12 platforms due to the fact that gen12 platforms has different PAT
definitions. In the meantime the mtl_pte_encode introduced specfically for
MTL becomes generic for all gen12 platforms. This patch renames the MTL
PTE encode function into gen12_pte_encode and apply it to all gen12. Even
though this change looks unrelated, but separating them would temporarily
break gen12 PTE encoding, thus squash them in one patch.
Special note: this patch changes the way caching behavior is controlled in
the sense that some objects are left to be managed by userspace. For such
objects we need to be careful not to change the userspace settings.There
are kerneldoc and comments added around obj->cache_coherent, cache_dirty,
and how to bypass the checkings by i915_gem_object_has_cache_level. For
full understanding, these changes need to be looked at together with the
two follow-up patches, one disables the {set|get}_caching ioctl's and the
other adds set_pat extension to the GEM_CREATE uAPI.
Bspec: 63019
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
Pull drm updates from Dave Airlie:
"The biggest highlight is that the accel subsystem framework is merged.
Hopefully for 6.3 we will be able to line up a driver to use it.
In drivers land, i915 enables DG2 support by default now, and nouveau
has a big stability refactoring and initial ampere support, AMD
includes new hw IP support and should build on ARM again. There is
also an ofdrm driver to take over offb on platforms it's used.
Stuff outside my tree, the dma-buf patches hit a few places, the vc4
firmware changes also do, and i915 has some interactions with MEI for
discrete GPUs. I think all of those should have been acked/reviewed by
relevant parties.
New driver:
- ofdrm - replacement for offb
fbdev:
- add support for nomodeset
fourcc:
- add Vivante tiled modifier
core:
- atomic-helpers: CRTC primary plane test fixes, fb access hooks
- connector: TV API consistency, cmdline parser improvements
- send connector hotplug on cleanup
- sort makefile objects
tests:
- sort kunit tests
- improve DP-MST tests
- add kunit helpers to create a device
sched:
- module param for scheduling policy
- refcounting fix
buddy:
- add back random seed log
ttm:
- convert ttm_resource to size_t
- optimize pool allocations
edid:
- HFVSDB parsing support fixes
- logging/debug improvements
- DSC quirks
dma-buf:
- Add unlocked vmap and attachment mapping
- move drivers to common locking convention
- locking improvements
firmware:
- new API for rPI firmware and vc4
xilinx:
- zynqmp: displayport bridge support
- dpsub fix
bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support, sync improvements
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
panel:
- panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
- Jadard JD9365DA-H3
- NewVision NV3051D
amdgpu:
- DCN support on ARM
- DCN 2.1 secure display
- Sienna Cichlid mode2 reset fixes
- new GC 11.x firmware versions
- drop AMD specific DSC workarounds in favour of drm code
- clang warning fixes
- scheduler rework
- SR-IOV fixes
- GPUVM locking fixes
- fix memory leak in CS IOCTL error path
- flexible array updates
- enable new GC/PSP/SMU/NBIO IP
- GFX preemption support for gfx9
amdkfd:
- cache size fixes
- userptr fixes
- enable cooperative launch on gfx 10.3
- enable GC 11.0.4 KFD support
radeon:
- replace kmap with kmap_local_page
- ACPI ref count fix
- HDA audio notifier support
i915:
- DG2 enabled by default
- MTL enablement work
- hotplug refactoring
- VBT improvements
- Display and watermark refactoring
- ADL-P workaround
- temp disable runtime_pm for discrete-
- fix for A380 as a secondary GPU
- Wa_18017747507 for DG2
- CS timestamp support fixes for gen5 and earlier
- never purge busy TTM objects
- use i915_sg_dma_sizes for all backends
- demote GuC kernel contexts to normal priority
- gvt: refactor for new MDEV interface
- enable DC power states on eDP ports
- fix gen 2/3 workarounds
nouveau:
- fix page fault handling
- Ampere acceleration support
- driver stability improvements
- nva3 backlight support
msm:
- MSM_INFO_GET_FLAGS support
- DPU: XR30 and P010 image formats
- Qualcomm SM6115 support
- DSI PHY support for QCM2290
- HDMI: refactored dev init path
- remove exclusive-fence hack
- fix speed-bin detection
- enable clamp to idle on 7c3
- improved hangcheck detection
vmwgfx:
- fb and cursor refactoring
- convert to generic hashtable
- cursor improvements
etnaviv:
- hw workarounds
- softpin MMU fixes
ast:
- atomic gamma LUT support
- convert to SHMEM
lcdif:
- support YUV planes
- Increase DMA burst size
- FIFO threshold tuning
meson:
- fix return type of cvbs mode_valid
mgag200:
- fix PLL setup on some revisions
sun4i:
- A100 and D1 support
udl:
- modesetting improvements
- hot unplug support
vc4:
- support PAL-M
- fix regression preventing 4K @ 60Hz
- fix NULL ptr deref
v3d:
- switch to drm managed resources
renesas:
- RZ/G2L DSI support
- DU Kconfig cleanup
mediatek:
- fixup dpi and hdmi
- MT8188 dpi support
- MT8195 AFBC support
tegra:
- NVDEC hardware on Tegra234 SoC
hdlcd:
- switch to drm managed resources
ingenic:
- fix registration error path
hisilicon:
- convert to drm_mode_init
maildp:
- use managed resources
mtk:
- use drm_mode_init
rockchip:
- use drm_mode_copy"
* tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits)
drm/amdgpu: fix mmhub register base coding error
drm/amdgpu: add tmz support for GC IP v11.0.4
drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
drm/amdgpu: enable GFX IP v11.0.4 CG support
drm/amdgpu: Make amdgpu_ring_mux functions as static
drm/amdgpu: generally allow over-commit during BO allocation
drm/amd/display: fix array index out of bound error in DCN32 DML
drm/amd/display: 3.2.215
drm/amd/display: set optimized required for comp buf changes
drm/amd/display: Add debug option to skip PSR CRTC disable
drm/amd/display: correct DML calc error of UrgentLatency
drm/amd/display: correct static_screen_event_mask
drm/amd/display: Ensure commit_streams returns the DC return code
drm/amd/display: read invalid ddc pin status cause engine busy
drm/amd/display: Bypass DET swath fill check for max clocks
drm/amd/display: Disable uclk pstate for subvp pipes
drm/amd/display: Fix DCN2.1 default DSC clocks
drm/amd/display: Enable dp_hdmi21_pcon support
drm/amd/display: prevent seamless boot on displays that don't have the preferred dig
...