Currently all module parameters are handled by i915_param.c/h. This
is a problem for display parameters when Xe driver is used. Add
a mechanism to add parameters specific to the display. This is mainly
copied from i915_[debugfs]_params.[ch]. Parameters are not yet moved. This
is done by subsequent patches.
v2:
- Drop unused predefinition (dentry)
- Clarify need for empty INTEL_DISPLAY_PARAMS_FOR_EACH in comment
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231024124109.384973-2-jouni.hogander@intel.com
Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
get_file_rcu() goes into an infinite loop trying to carefully verify
that i915->gem.mmap_singleton hasn't changed - see the splat below.
So I stared at this code to figure out what it actually does. It seems
that the i915->gem.mmap_singleton pointer itself never had rcu semantics.
The i915->gem.mmap_singleton is replaced in
file->f_op->release::singleton_release():
static int singleton_release(struct inode *inode, struct file *file)
{
struct drm_i915_private *i915 = file->private_data;
cmpxchg(&i915->gem.mmap_singleton, file, NULL);
drm_dev_put(&i915->drm);
return 0;
}
The cmpxchg() is ordered against a concurrent update of
i915->gem.mmap_singleton from mmap_singleton(). IOW, when
mmap_singleton() fails to get a reference on i915->gem.mmap_singleton:
While mmap_singleton() does
rcu_read_lock();
file = get_file_rcu(&i915->gem.mmap_singleton);
rcu_read_unlock();
it allocates a new file via anon_inode_getfile() and does
smp_store_mb(i915->gem.mmap_singleton, file);
So, then what happens in the case of this bug is that at some point
fput() is called and drops the file->f_count to zero leaving the pointer
in i915->gem.mmap_singleton in tact.
Now, there might be delays until
file->f_op->release::singleton_release() is called and
i915->gem.mmap_singleton is set to NULL.
Say concurrently another task hits mmap_singleton() and does:
rcu_read_lock();
file = get_file_rcu(&i915->gem.mmap_singleton);
rcu_read_unlock();
When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
it will try the reload from i915->gem.mmap_singleton expecting it to be
NULL, assuming it has comparable semantics as we expect in
__fget_files_rcu().
But it hasn't so it reloads the same pointer again, trying the same
atomic_inc_not_zero() again and doing so until
file->f_op->release::singleton_release() of the old file has been
called.
So, in contrast to __fget_files_rcu() here we want to not retry when
atomic_inc_not_zero() has failed. We only want to retry in case we
managed to get a reference but the pointer did change on reload.
<3> [511.395679] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
<3> [511.395716] rcu: Tasks blocked on level-1 rcu_node (CPUs 0-9): P6238
<3> [511.395934] rcu: (detected by 16, t=65002 jiffies, g=123977, q=439 ncpus=20)
<6> [511.395944] task:i915_selftest state:R running task stack:10568 pid:6238 tgid:6238 ppid:1001 flags:0x00004002
<6> [511.395962] Call Trace:
<6> [511.395966] <TASK>
<6> [511.395974] ? __schedule+0x3a8/0xd70
<6> [511.395995] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396003] ? lockdep_hardirqs_on+0xc3/0x140
<6> [511.396013] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396029] ? get_file_rcu+0x10/0x30
<6> [511.396039] ? get_file_rcu+0x10/0x30
<6> [511.396046] ? i915_gem_object_mmap+0xbc/0x450 [i915]
<6> [511.396509] ? i915_gem_mmap+0x272/0x480 [i915]
<6> [511.396903] ? mmap_region+0x253/0xb60
<6> [511.396925] ? do_mmap+0x334/0x5c0
<6> [511.396939] ? vm_mmap_pgoff+0x9f/0x1c0
<6> [511.396949] ? rcu_is_watching+0x11/0x50
<6> [511.396962] ? igt_mmap_offset+0xfc/0x110 [i915]
<6> [511.397376] ? __igt_mmap+0xb3/0x570 [i915]
<6> [511.397762] ? igt_mmap+0x11e/0x150 [i915]
<6> [511.398139] ? __trace_bprintk+0x76/0x90
<6> [511.398156] ? __i915_subtests+0xbf/0x240 [i915]
<6> [511.398586] ? __pfx___i915_live_setup+0x10/0x10 [i915]
<6> [511.399001] ? __pfx___i915_live_teardown+0x10/0x10 [i915]
<6> [511.399433] ? __run_selftests+0xbc/0x1a0 [i915]
<6> [511.399875] ? i915_live_selftests+0x4b/0x90 [i915]
<6> [511.400308] ? i915_pci_probe+0x106/0x200 [i915]
<6> [511.400692] ? pci_device_probe+0x95/0x120
<6> [511.400704] ? really_probe+0x164/0x3c0
<6> [511.400715] ? __pfx___driver_attach+0x10/0x10
<6> [511.400722] ? __driver_probe_device+0x73/0x160
<6> [511.400731] ? driver_probe_device+0x19/0xa0
<6> [511.400741] ? __driver_attach+0xb6/0x180
<6> [511.400749] ? __pfx___driver_attach+0x10/0x10
<6> [511.400756] ? bus_for_each_dev+0x77/0xd0
<6> [511.400770] ? bus_add_driver+0x114/0x210
<6> [511.400781] ? driver_register+0x5b/0x110
<6> [511.400791] ? i915_init+0x23/0xc0 [i915]
<6> [511.401153] ? __pfx_i915_init+0x10/0x10 [i915]
<6> [511.401503] ? do_one_initcall+0x57/0x270
<6> [511.401515] ? rcu_is_watching+0x11/0x50
<6> [511.401521] ? kmalloc_trace+0xa3/0xb0
<6> [511.401532] ? do_init_module+0x5f/0x210
<6> [511.401544] ? load_module+0x1d00/0x1f60
<6> [511.401581] ? init_module_from_file+0x86/0xd0
<6> [511.401590] ? init_module_from_file+0x86/0xd0
<6> [511.401613] ? idempotent_init_module+0x17c/0x230
<6> [511.401639] ? __x64_sys_finit_module+0x56/0xb0
<6> [511.401650] ? do_syscall_64+0x3c/0x90
<6> [511.401659] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6> [511.401684] </TASK>
Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
Cc: Jann Horn <jannh@google.com>,
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231025-formfrage-watscheln-84526cd3bd7d@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
The steering control and semaphore registers are inside an "always on"
power domain with respect to RC6. However there are some issues if
higher-level platform sleep states are entering/exiting at the same time
these registers are accessed. Grabbing GT forcewake and holding it over
the entire lock/steer/unlock cycle ensures that those sleep states have
been fully exited before we access these registers.
This is expected to become a formally documented/numbered workaround
soon.
Note that this patch alone isn't expected to have an immediately
noticeable impact on MCR (mis)behavior; an upcoming pcode firmware
update will also be necessary to provide the other half of this
workaround.
v2:
- Move the forcewake inside the Xe_LPG-specific IP version check. This
should only be necessary on platforms that have a steering semaphore.
Fixes: 3100240bf8 ("drm/i915/mtl: Add hardware-level lock for steering")
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231019170241.2102037-2-matthew.d.roper@intel.com
(cherry picked from commit 8fa1c7cd1f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The steering control and semaphore registers are inside an "always on"
power domain with respect to RC6. However there are some issues if
higher-level platform sleep states are entering/exiting at the same time
these registers are accessed. Grabbing GT forcewake and holding it over
the entire lock/steer/unlock cycle ensures that those sleep states have
been fully exited before we access these registers.
This is expected to become a formally documented/numbered workaround
soon.
Note that this patch alone isn't expected to have an immediately
noticeable impact on MCR (mis)behavior; an upcoming pcode firmware
update will also be necessary to provide the other half of this
workaround.
v2:
- Move the forcewake inside the Xe_LPG-specific IP version check. This
should only be necessary on platforms that have a steering semaphore.
Fixes: 3100240bf8 ("drm/i915/mtl: Add hardware-level lock for steering")
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231019170241.2102037-2-matthew.d.roper@intel.com
When supporting OA for TGL, it was seen that the context valid bit in
the report ID was not defined, however revisiting the spec seems to have
this bit defined. The bit is used to determine if a context is valid on
a context switch and is essential to determine active and idle periods
for a context. Re-enable the context valid bit for gen12 platforms.
BSpec: 52196 (description of report_id)
v2: Include BSpec reference (Ashutosh)
Fixes: 00a7f0d715 ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230802202854.1224547-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 7eeaedf799)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Add new DG2 PCI IDs (Shekhar)
- Remove watchdog timers for PSR on Lunar Lake (Mika Kahola)
- DSB changes for proper handling of LUT programming (Ville)
- Store DSC DPCD capabilities in the connector (Imre)
- Clean up zero initializers (Ville)
- Remove Meteor Lake force_probe protection (RK)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZTFW4g6duLtp+Wy0@intel.com
In recent discussions around some performance improvements in the file
handling area we discussed switching the file cache to rely on
SLAB_TYPESAFE_BY_RCU which allows us to get rid of call_rcu() based
freeing for files completely. This is a pretty sensitive change overall
but it might actually be worth doing.
The main downside is the subtlety. The other one is that we should
really wait for Jann's patch to land that enables KASAN to handle
SLAB_TYPESAFE_BY_RCU UAFs. Currently it doesn't but a patch for this
exists.
With SLAB_TYPESAFE_BY_RCU objects may be freed and reused multiple times
which requires a few changes. So it isn't sufficient anymore to just
acquire a reference to the file in question under rcu using
atomic_long_inc_not_zero() since the file might have already been
recycled and someone else might have bumped the reference.
In other words, callers might see reference count bumps from newer
users. For this reason it is necessary to verify that the pointer is the
same before and after the reference count increment. This pattern can be
seen in get_file_rcu() and __files_get_rcu().
In addition, it isn't possible to access or check fields in struct file
without first aqcuiring a reference on it. Not doing that was always
very dodgy and it was only usable for non-pointer data in struct file.
With SLAB_TYPESAFE_BY_RCU it is necessary that callers first acquire a
reference under rcu or they must hold the files_lock of the fdtable.
Failing to do either one of this is a bug.
Thanks to Jann for pointing out that we need to ensure memory ordering
between reallocations and pointer check by ensuring that all subsequent
loads have a dependency on the second load in get_file_rcu() and
providing a fixup that was folded into this patch.
Cc: Jann Horn <jannh@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
When supporting OA for TGL, it was seen that the context valid bit in
the report ID was not defined, however revisiting the spec seems to have
this bit defined. The bit is used to determine if a context is valid on
a context switch and is essential to determine active and idle periods
for a context. Re-enable the context valid bit for gen12 platforms.
BSpec: 52196 (description of report_id)
v2: Include BSpec reference (Ashutosh)
Fixes: 00a7f0d715 ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230802202854.1224547-1-umesh.nerlige.ramappa@intel.com
The GuC firmware had defined the interface for Translation Look-Aside
Buffer (TLB) invalidation. We should use this interface when
invalidating the engine and GuC TLBs.
Add additional functionality to intel_gt_invalidate_tlb, invalidating
the GuC TLBs and falling back to GT invalidation when the GuC is
disabled.
The invalidation is done by sending a request directly to the GuC
tlb_lookup that invalidates the table. The invalidation is submitted as
a wait request and is performed in the CT event handler. This means we
cannot perform this TLB invalidation path if the CT is not enabled.
If the request isn't fulfilled in two seconds, this would constitute
an error in the invalidation as that would constitute either a lost
request or a severe GuC overload.
With this new invalidation routine, we can perform GuC-based GGTT
invalidations. GuC-based GGTT invalidation is incompatible with
MMIO invalidation so we should not perform MMIO invalidation when
GuC-based GGTT invalidation is expected.
The additional complexity incurred in this patch will be necessary for
range-based tlb invalidations, which will be platformed in the future.
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
CC: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231017180806.3054290-4-jonathan.cavitt@intel.com
If we can't find a free fence register to handle a fault in the GMADR
range just return VM_FAULT_NOPAGE without populating the PTE so that
userspace will retry the access and trigger another fault. Eventually
we should find a free fence and the fault will get properly handled.
A further improvement idea might be to reserve a fence (or one per CPU?)
for the express purpose of handling faults without having to retry. But
that would require some additional work.
Looks like this may have gotten broken originally by
commit 39965b3766 ("drm/i915: don't trash the gtt when running out of fences")
as that changed the errno to -EDEADLK which wasn't handle by the gtt
fault code either. But later in commit 2feeb52859 ("drm/i915/gt: Fix
-EDEADLK handling regression") I changed it again to -ENOBUFS as -EDEADLK
was now getting used for the ww mutex dance. So this fix only makes
sense after that last commit.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9479
Fixes: 2feeb52859 ("drm/i915/gt: Fix -EDEADLK handling regression")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231012132801.16292-1-ville.syrjala@linux.intel.com
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
(cherry picked from commit 7f403caabe)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>