UAPI Changes:
- drm/i915/guc: Use context hints for GT frequency
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.
Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.
The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
- drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.
While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance
***
NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
default to mitigate a hardware bug. All the EUs will still remain
usable, and all the userspace drivers have been confirmed to be able
to dynamically detect the change in number of CCS engines and adjust.
For the smaller percent of applications that get perf benefit from
letting the userspace driver dispatch across all 4 CCS engines we will
be introducing a sysfs control as a later patch to choose 4 CCS each
with 25% EUs (or 50% if 2 CCS).
NOTE: A regression has been reported at
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
However Andi has been triaging the issue and we're closing in a fix
to the gap in the W/A implementation:
https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html
Driver Changes:
- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)
- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)
. Selftest improvements (Janusz, Nirmoy, Daniele)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com
There was an attempt to fix an issue of illegal attempts to free a still
active i915 VMA object when parking a GT believed to be idle, reported by
CI on 2-GT Meteor Lake. As a solution, an extra wakeref for a Primary GT
was acquired from i915_gem_do_execbuffer() -- see commit f56fe3e917
("drm/i915: Fix a VMA UAF for multi-gt platform").
However, that fix occurred insufficient -- the issue was still reported by
CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs. Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.
Since the issue has now been fixed by a preceding patch "drm/i915/vma: Fix
UAF on destroy against retire race", drop the no longer useful changes
introduced by that insufficient fix.
v3: Also drop the no longer used .wakeref_gt0 field from struct
i915_execbuffer.
v2: Avoid the word "revert" in commit message (Rodrigo),
- update commit description reusing relevant chunks dropped from the
description of the proper fix (Rodrigo).
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-7-janusz.krzysztofik@linux.intel.com
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.
Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.
The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
v2: Rename flags as per review suggestions (Rodrigo, Tvrtko).
Also, use flag bits in intel_context as it allows finer control for
toggling per engine if needed (Tvrtko).
v3: Minor review comments (Tvrtko)
v4: Update comment (Sushma)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306012759.204938-1-vinay.belgaumkar@intel.com
drm/i915 feature pull for v6.9:
Features and functionality:
- Early transport for panel replay and PSR (Jouni)
- New ARL PCI IDs (Matt)
- DP TPS4 PHY test pattern support (Khaled)
Refactoring and cleanups:
- Unify and improve VSC SDP for PSR and non-PSR cases (Jouni)
- Refactor memory regions and improve debug logging (Ville)
- Rework global state serialization (Ville)
- Remove unused CDCLK divider fields (Gustavo)
- Unify HDCP connector logging format (Jani)
- Use display instead of graphics version in display code (Jani)
- Move VBT and opregion debugfs next to the implementation (Jani)
- Abstract opregion interface, use opaque type (Jani)
Fixes:
- Fix MTL stolen memory access (Ville)
- Fix initial display plane readout for MTL (Ville)
- Fix HPD handling during driver init/shutdown (Imre)
- Cursor vblank evasion fixes (Ville)
- Various VSC SDP fixes (Jouni)
- Allow PSR mode changes without full modeset (Jouni)
- Fix CDCLK sanitization on module load for Xe2_LPD (Gustavo)
- Fix the max DSC bpc supported by the source (Ankit)
- Add missing LNL ALPM AUX wake configuration (Jouni)
- Cx0 PHY state readout and verify fixes (Mika)
- Fix PSR (panel replay) debugfs for MST connectors (Imre)
- Fail HDCP repeater authentication if Type1 device not present (Suraj)
- Ratelimit debug logging in vm_fault_ttm (Nirmoy)
- Use a fake PCH for MTL because south display is not on the PCH (Haridhar)
- Disable DSB for Xe driver for now (José)
- Fix some LNL display register changes (Lucas)
- Fix build on ChromeOS (Paz Zcharya)
- Preserve current shared DPLL for fastsets on Type-C ports (Ville)
- Fix state checker warnings for MG/TC/TBT PLLs (Ville)
- Fix HDCP repeater ctl register value on errors (Jani)
- Allow FBC with CCS modifiers on SKL+ (Ville)
- Fix HDCP GGTT pinning (Ville)
DRM core changes:
- Add ratelimited drm dbg print (Nirmoy)
- DPCD PSR early transport macro (Jouni)
Merges:
- Backmerge drm-next to bring Xe driver to drm-intel-next (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87cyt8cxsh.fsf@intel.com
On MTL accessing stolen memory via the BARs is somehow borked,
and it can hang the machine. As a workaround let's bypass the
BARs and just go straight to DSMBASE/GSMBASE instead.
Note that on every other platform this itself would hang the
machine, but on MTL the system firmware is expected to relax
the access permission guarding stolen memory to enable this
workaround, and thus direct CPU accesses should be fine.
The raw stolen memory areas won't be passed to VMs so we'll
need to risk using the BAR there for the initial setup. Once
command submission is up we should switch to MI_UPDATE_GTT
which at least shouldn't hang the whole machine.
v2: Don't use direct GSM/DSM access on guests
Add w/a number
v3: Check register 0x138914 to see if pcode did its job
Add some debug prints
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Tested-by: Paz Zcharya <pazz@chromium.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-5-ville.syrjala@linux.intel.com
mem->region is a struct resource, but mem->io_start and
mem->io_size are not for whatever reason. Let's unify this
and convert the io stuff into a struct resource as well.
Should make life a little less annoying when you don't have
juggle between two different approaches all the time.
Mostly done using cocci (with manual tweaks at all the
places where we mutate io_size by hand):
@@
struct intel_memory_region *M;
expression START, SIZE;
@@
- M->io_start = START;
- M->io_size = SIZE;
+ M->io = DEFINE_RES_MEM(START, SIZE);
@@
struct intel_memory_region *M;
@@
- M->io_start
+ M->io.start
@@
struct intel_memory_region M;
@@
- M.io_start
+ M.io.start
@@
expression M;
@@
- M->io_size
+ resource_size(&M->io)
@@
expression M;
@@
- M.io_size
+ resource_size(&M.io)
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Tested-by: Paz Zcharya <pazz@chromium.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-2-ville.syrjala@linux.intel.com
Pull drm fixes from Dave Airlie:
"This is just a wrap up of fixes from the last few days. It has the
proper fix to the i915/xe collision, we can clean up what you did
later once rc1 lands.
Otherwise it's a few other i915, a v3d, rockchip and a nouveau fix to
make GSP load on some original Turing GPUs.
i915:
- Fixes for kernel-doc warnings enforced in linux-next
- Another build warning fix for string formatting of intel_wakeref_t
- Display fixes for DP DSC BPC and C20 PLL state verification
v3d:
- register readout fix
rockchip:
- two build warning fixes
nouveau:
- fix GSP loading on Turing with different nvdec configuration"
* tag 'drm-next-2024-01-15-1' of git://anongit.freedesktop.org/drm/drm:
nouveau/gsp: handle engines in runl without nonstall interrupts.
drm/i915/perf: reconcile Excess struct member kernel-doc warnings
drm/i915/guc: reconcile Excess struct member kernel-doc warnings
drm/i915/gt: reconcile Excess struct member kernel-doc warnings
drm/i915/gem: reconcile Excess struct member kernel-doc warnings
drm/i915/dp: Fix the max DSC bpc supported by source
drm/i915: don't make assumptions about intel_wakeref_t type
drm/i915/dp: Fix the PSR debugfs entries wrt. MST connectors
drm/i915/display: Fix C20 pll selection for state verification
drm/v3d: Fix support for register debugging on the RPi 4
drm/rockchip: vop2: Drop unused if_dclk_rate variable
drm/rockchip: vop2: Drop superfluous include
Pull drm updates from Dave Airlie:
"This contains two major new drivers:
- imagination is a first driver for Imagination Technologies devices,
it only covers very specific devices, but there is hope to grow it
- xe is a reboot of the i915 GPU (shares display) side using a more
upstream focused development model, and trying to maximise code
sharing. It's not enabled for any hw by default, and will hopefully
get switched on for Intel's Lunarlake.
This also drops a bunch of the old UMS ioctls. It's been dead long
enough.
amdgpu has a bunch of new color management code that is being used in
the Steam Deck.
amdgpu also has a new ACPI WBRF interaction to help avoid radio
interference.
Otherwise it's the usual lots of changes in lots of places.
Detailed summary:
new drivers:
- imagination - new driver for Imagination Technologies GPU
- xe - new driver for Intel GPUs using core drm concepts
core:
- add CLOSE_FB ioctl
- remove old UMS ioctls
- increase max objects to accomodate AMD color mgmt
encoder:
- create per-encoder debugfs directory
edid:
- split out drm_eld
- SAD helpers
- drop edid_firmware module parameter
format-helper:
- cache format conversion buffers
sched:
- move from kthread to workqueue
- rename some internals
- implement dynamic job-flow control
gpuvm:
- provide more features to handle GEM objects
client:
- don't acquire module reference
displayport:
- add mst path property documentation
fdinfo:
- alignment fix
dma-buf:
- add fence timestamp helper
- add fence deadline support
bridge:
- transparent aux-bridge for DP/USB-C
- lt8912b: add suspend/resume support and power regulator support
panel:
- edp: AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49
- chromebook panel support
- elida-kd35t133: rework pm
- powkiddy RK2023 panel
- himax-hx8394: drop prepare/unprepare and shutdown logic
- BOE BP101WX1-100, Powkiddy X55, Ampire AM8001280G
- Evervision VGG644804, SDC ATNA45AF01
- nv3052c: register docs, init sequence fixes, fascontek FS035VG158
- st7701: Anbernic RG-ARC support
- r63353 panel controller
- Ilitek ILI9805 panel controller
- AUO G156HAN04.0
simplefb:
- support memory regions
- support power domains
amdgpu:
- add new 64-bit sequence number infrastructure
- add AMD specific color management
- ACPI WBRF support for RF interference handling
- GPUVM updates
- RAS updates
- DCN 3.5 updates
- Rework PCIe link speed handling
- Document GPU reset types
- DMUB fixes
- eDP fixes
- NBIO 7.9/7.11 updates
- SubVP updates
- XGMI PCIe state dumping for aqua vanjaram
- GFX11 golden register updates
- enable tunnelling on high pri compute
amdkfd:
- Migrate TLB flushing logic to amdgpu
- Trap handler fixes
- Fix restore workers handling on suspend/resume
- Fix possible memory leak in pqm_uninit()
- support import/export of dma-bufs using GEM handles
radeon:
- fix possible overflows in command buffer checking
- check for errors in ring_lock
i915:
- reorg display code for reuse in xe driver
- fdinfo memory stats printing
- DP MST bandwidth mgmt improvements
- DP panel replay enabling
- MTL C20 phy state verification
- MTL DP DSC fractional bpp support
- Audio fastset support
- use dma_fence interfaces instead of i915_sw_fence
- Separate gem and display code
- AUX register macro refactoring
- Separate display module/device parameters
- Move display capabilities debugfs under display
- Makefile cleanups
- Register cleanups
- Move display lock inits under display/
- VLV/CHV DPIO PHY register and interface refactoring
- DSI VBT sequence refactoring
- C10/C20 PHY PLL hardware readout
- DPLL code cleanups
- Cleanup PXP plane protection checks
- Improve display debug msgs
- PSR selective fetch fixes/improvements
- DP MST fixes
- Xe2LPD FBC restrictions removed
- DGFX uses direct VBT pin mapping
- more MTL WAs
- fix MTL eDP bug
- eliminate use of kmap_atomic
habanalabs:
- sysfs entry to identify a device minor id with debugfs path
- sysfs entry to expose device module id
- add signed device info retrieval through INFO ioctl
- add Gaudi2C device support
- pcie reset prepare/done hooks
msm:
- Add support for SDM670, SM8650
- Handle the CFG interconnect to fix the obscure hangs / timeouts
- Kconfig fix for QMP dependency
- use managed allocators
- DPU: SDM670, SM8650 support
- DPU: Enable SmartDMA on SM8350 and SM8450
- DP: enable runtime PM support
- GPU: add metadata UAPI
- GPU: move devcoredumps to GPU device
- GPU: convert to drm_exec
ivpu:
- update FW API
- new debugfs file
- a new NOP job submission test mode
- improve suspend/resume
- PM improvements
- MMU PT optimizations
- firmware profile frequency support
- support for uncached buffers
- switch to gem shmem helpers
- replace kthread with threaded irqs
rockchip:
- rk3066_hdmi: convert to atomic
- vop2: support nv20 and nv30
- rk3588 support
mediatek:
- use devm_platform_ioremap_resource
- stop using iommu_present
- MT8188 VDOSYS1 display support
panfrost:
- PM improvements
- improve interrupt handling as poweroff
qaic:
- allow to run with single MSI
- support host/device time sync
- switch to persistent DRM devices
exynos:
- fix potential error pointer dereference
- fix wrong error checking
- add missing call to drm_atomic_helper_shutdown
omapdrm:
- dma-fence lockdep annotation fix
tidss:
- dma-fence lockdep annotation fix
- support for AM62A7
v3d:
- BCM2712 - rpi5 support
- fdinfo + gputop support
- uapi for CPU job handling
virtio-gpu:
- add context debug name"
* tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drm: (2340 commits)
drm/amd/display: Allow z8/z10 from driver
drm/amd/display: fix bandwidth validation failure on DCN 2.1
drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well
drm/amd/display: Move fixpt_from_s3132 to amdgpu_dm
drm/amd/display: Fix recent checkpatch errors in amdgpu_dm
Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"
drm/amd/display: avoid stringop-overflow warnings for dp_decide_lane_settings()
drm/amd/display: Fix power_helpers.c codestyle
drm/amd/display: Fix hdcp_log.h codestyle
drm/amd/display: Fix hdcp2_execution.c codestyle
drm/amd/display: Fix hdcp_psp.h codestyle
drm/amd/display: Fix freesync.c codestyle
drm/amd/display: Fix hdcp_psp.c codestyle
drm/amd/display: Fix hdcp1_execution.c codestyle
drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init
drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'
drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'
drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c
drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'
drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()'
...
If we are at the end of suspend or very early in resume
its possible an async fence signal (via rcu_call) is triggered
to free_engines which could lead us to the execution of
the context destruction worker (after a prior worker flush).
Thus, when suspending, insert rcu_barriers at the start
of i915_gem_suspend (part of driver's suspend prepare) and
again in i915_gem_suspend_late so that all such cases have
completed and context destruction list isn't missing anything.
In destroyed_worker_func, close the race against CT-loss
by checking that CT is enabled before calling into
deregister_destroyed_contexts.
Based on testing, guc_lrc_desc_unpin may still race and fail
as we traverse the GuC's context-destroy list because the
CT could be disabled right before calling GuC's CT send function.
We've witnessed this race condition once every ~6000-8000
suspend-resume cycles while ensuring workloads that render
something onscreen is continuously started just before
we suspend (and the workload is small enough to complete
and trigger the queued engine/context free-up either very
late in suspend or very early in resume).
In such a case, we need to unroll the entire process because
guc-lrc-unpin takes a gt wakeref which only gets released in
the G2H IRQ reply that never comes through in this corner
case. Without the unroll, the taken wakeref is leaked and will
cascade into a kernel hang later at the tail end of suspend in
this function:
intel_wakeref_wait_for_idle(>->wakeref)
(called by) - intel_gt_pm_wait_for_idle
(called by) - wait_for_suspend
Thus, do an unroll in guc_lrc_desc_unpin and deregister_destroyed_-
contexts if guc_lrc_desc_unpin fails due to CT send falure.
When unrolling, keep the context in the GuC's destroy-list so
it can get picked up on the next destroy worker invocation
(if suspend aborted) or get fully purged as part of a GuC
sanitization (end of suspend) or a reset flow.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Tested-by: Mousumi Jana <mousumi.jana@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231229215143.581619-1-alan.previn.teres.alexis@intel.com
Never block for outstanding work on userptr object upon receipt of a
mmu-notifier. The reason we originally did so was to immediately unbind
the userptr and unpin its pages, but since that has been dropped in
commit b4b9731b02 ("drm/i915: Simplify userptr locking"), we never
return the pages to the system i.e. never drop our page->mapcount and so
do not allow the page and CPU PTE to be revoked. Based on this history,
we know we are safe to drop the wait entirely.
Upon return from mmu-notifier, we will still have the userptr pages
pinned preventing the following PTE operation (such as try_to_unmap)
adjusting the vm_area_struct, so it is safe to keep the pages around for
as long as we still have i/o pending.
We do not have any means currently to asynchronously revalidate the
userptr pages, that is always prior to next use.
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231128162505.3493942-1-jonathan.cavitt@intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults and preemption disables.
In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
kmap_atomic() in eb_relocate_entry(), and is unmapped by
kunmap_atomic() in reloc_cache_reset().
And this mapping/unmapping occurs in two places: one is in
eb_relocate_vma(), and another is in eb_relocate_vma_slow().
The function eb_relocate_vma() or eb_relocate_vma_slow() doesn't
need to disable pagefaults and preemption during the above mapping/
unmapping.
So it can simply use kmap_local_page() / kunmap_local() that can
instead do the mapping / unmapping regardless of the context.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-10-zhao1.liu@linux.intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption.
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.
In drm/i915/gem/selftests/i915_gem_context.c, functions cpu_fill() and
cpu_check() mainly uses mapping to flush cache and check/assign the
value.
There're 2 reasons why cpu_fill() and cpu_check() don't need to disable
pagefaults and preemption for mapping:
1. The flush operation is safe. cpu_fill() and cpu_check() call
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.
2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.
Therefore, cpu_fill() and cpu_check() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-7-zhao1.liu@linux.intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration)..
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.
In drm/i915/gem/selftests/i915_gem_coherency.c, functions cpu_set()
and cpu_get() mainly uses mapping to flush cache and assign the value.
There're 2 reasons why cpu_set() and cpu_get() don't need to disable
pagefaults and preemption for mapping:
1. The flush operation is safe. cpu_set() and cpu_get() call
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.
2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.
Therefore, cpu_set() and cpu_get() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-6-zhao1.liu@linux.intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.
In drm/i915/gem/selftests/huge_pages.c, function __cpu_check_shmem()
mainly uses mapping to flush cache and check the value. There're
2 reasons why __cpu_check_shmem() doesn't need to disable pagefaults
and preemption for mapping:
1. The flush operation is safe. Function __cpu_check_shmem() calls
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.
2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.
Therefore, __cpu_check_shmem() is a function where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-5-zhao1.liu@linux.intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() + memcpy() to memcpy_[from/to]_page(), which use
kmap_local_page() to build local mapping and then do memcpy().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults and preemption disables.
In drm/i915/gem/i915_gem_phys.c, the functions
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
don't need to disable pagefaults and preemption for mapping because of
2 reasons:
1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c,
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.
2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.
Therefore, i915_gem_object_get_pages_phys() and
i915_gem_object_put_pages_phys() are two functions where the uses of
local mappings in place of atomic mappings are correctly suited.
Convert the calls of kmap_atomic() / kunmap_atomic() + memcpy() to
memcpy_from_page() and memcpy_to_page().
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-3-zhao1.liu@linux.intel.com
The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults and preemption disables.
There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
need to disable pagefaults and preemption for mapping:
1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c,
i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range() to
use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush
operation is global.
2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.
Therefore, i915_gem_object_read_from_page_kmap() is a function where
the use of kmap_local_page() in place of kmap_atomic() is correctly
suited.
Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
And remove the redundant variable that stores the address of the mapped
page since kunmap_local() can accept any pointer within the page.
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-2-zhao1.liu@linux.intel.com
Pull drm fixes from Daniel Vetter:
"Dave's VPN to the big machine died, so it's on me to do fixes pr this
and next week while everyone else is at plumbers.
- big pile of amd fixes, but mostly for hw support newly added in 6.7
- i915 fixes, mostly minor things
- qxl memory leak fix
- vc4 uaf fix in mock helpers
- syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE"
* tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm: (78 commits)
drm/amdgpu: fix error handling in amdgpu_vm_init
drm/amdgpu: Fix possible null pointer dereference
drm/amdgpu: move UVD and VCE sched entity init after sched init
drm/amdgpu: move kfd_resume before the ip late init
drm/amd: Explicitly check for GFXOFF to be enabled for s0ix
drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)
drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5)
drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support
drm/amdgpu: fix software pci_unplug on some chips
drm/amd/display: remove duplicated argument
drm/amdgpu: correct mca debugfs dump reg list
drm/amdgpu: correct acclerator check architecutre dump
drm/amdgpu: add pcs xgmi v6.4.0 ras support
drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3
drm/amdgpu: disable smu v13.0.6 mca debug mode by default
drm/amdgpu: Support multiple error query modes
drm/amdgpu: refine smu v13.0.6 mca dump driver
drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2)
drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV
drm: amd: Resolve Sphinx unexpected indentation warning
...
In order to show per client memory usage lets add some infrastructure
which enables tracking buffer objects owned by clients.
We add a per client list protected by a new per client lock and to support
delayed destruction (post client exit) we make tracked objects hold
references to the owning client.
Also, object memory region teardown is moved to the existing RCU free
callback to allow safe dereference from the fdinfo RCU read section.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107101806.608990-1-tvrtko.ursulin@linux.intel.com
GCC 14 introduces a new -Walloc-size included in -Wextra which errors out
like:
```
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c: In function ‘eb_copy_relocations’:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1681:24: error: allocation of insufficient size ‘1’ for type ‘struct drm_i915_gem_relocation_entry’ with size ‘32’ [-Werror=alloc-size]
1681 | relocs = kvmalloc_array(size, 1, GFP_KERNEL);
| ^
```
So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 element of size `size`. GCC then sees we're not
doing anything wrong.
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107215538.1891359-1-sam@gentoo.org
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series 'Fixes and cleanups to compaction'
- Joel Fernandes has a patchset ('Optimize mremap during mutual
alignment within PMD') which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i
the following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series 'Do not try to access unaccepted memory' Adrian
Hunter provides some fixups for the recently-added 'unaccepted
memory' feature. To increase the feature's checking coverage. 'Plug
a few gaps where RAM is exposed without checking if it is
unaccepted memory'
- In the series 'cleanups for lockless slab shrink' Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series 'use refcount+RCU method to
implement lockless slab shrink'
- David Hildenbrand contributes some maintenance work for the rmap
code in the series 'Anon rmap cleanups'
- Kefeng Wang does more folio conversions and some maintenance work
in the migration code. Series 'mm: migrate: more folio conversion
and unification'
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series 'Add and use bdev_getblk()'
- In the series 'Use nth_page() in place of direct struct page
manipulation' Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames
- In the series 'mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO' has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of
gigantic pages are in use
- Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
rationalization and folio conversions in the hugetlb code
- Yin Fengwei has improved mlock()'s handling of large folios in the
series 'support large folio for mlock'
- In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
added statistics for memcg v1 users which are available (and
useful) under memcg v2
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named 'MDWE
without inheritance'
- Kefeng Wang has provided the series 'mm: convert numa balancing
functions to use a folio' which does what it says
- In the series 'mm/ksm: add fork-exec support for prctl' Stefan
Roesch makes is possible for a process to propagate KSM treatment
across exec()
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use 'high
bandwidth memory' in addition to Optane Data Center Persistent
Memory Modules (DCPMM). The series is named 'memory tiering:
calculate abstract distance based on ACPI HMAT'
- In the series 'Smart scanning mode for KSM' Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in
the series 'mm: memcg: fix tracking of pending stats updates
values'
- In the series 'Implement IOCTL to get and optionally clear info
about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
which permits us to atomically read-then-clear page softdirty
state. This is mainly used by CRIU
- Hugh Dickins contributed the series 'shmem,tmpfs: general
maintenance', a bunch of relatively minor maintenance tweaks to
this code
- Matthew Wilcox has increased the use of the VMA lock over
file-backed page faults in the series 'Handle more faults under the
VMA lock'. Some rationalizations of the fault path became possible
as a result
- In the series 'mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()' David Hildenbrand has implemented some
cleanups and folio conversions
- In the series 'various improvements to the GUP interface' Lorenzo
Stoakes has simplified and improved the GUP interface with an eye
to providing groundwork for future improvements
- Andrey Konovalov has sent along the series 'kasan: assorted fixes
and improvements' which does those things
- Some page allocator maintenance work from Kemeng Shi in the series
'Two minor cleanups to break_down_buddy_pages'
- In thes series 'New selftest for mm' Breno Leitao has developed
another MM self test which tickles a race we had between madvise()
and page faults
- In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
and an optimization to the core pagecache code
- Nhat Pham has added memcg accounting for hugetlb memory in the
series 'hugetlb memcg accounting'
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series 'Abstract vma_merge() and split_vma()'
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series 'Fix page_owner's use of free timestamps'
- Lorenzo Stoakes has fixed the handling of new mappings of sealed
files in the series 'permit write-sealed memfd read-only shared
mappings'
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series 'Batch hugetlb vmemmap modification operations'
- Some buffer_head folio conversions and cleanups from Matthew Wilcox
in the series 'Finish the create_empty_buffers() transition'
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the
series 'mm: PCP high auto-tuning'
- Roman Gushchin has contributed the patchset 'mm: improve
performance of accounted kernel memory allocations' which improves
their performance by ~30% as measured by a micro-benchmark
- folio conversions from Kefeng Wang in the series 'mm: convert page
cpupid functions to folios'
- Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
kmemleak'
- Qi Zheng has improved our handling of memoryless nodes by keeping
them off the allocation fallback list. This is done in the series
'handle memoryless nodes more appropriately'
- khugepaged conversions from Vishal Moola in the series 'Some
khugepaged folio conversions'"
[ bcachefs conflicts with the dynamically allocated shrinkers have been
resolved as per Stephen Rothwell in
https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/
with help from Qi Zheng.
The clone3 test filtering conflict was half-arsed by yours truly ]
* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
mm/damon/sysfs: update monitoring target regions for online input commit
mm/damon/sysfs: remove requested targets when online-commit inputs
selftests: add a sanity check for zswap
Documentation: maple_tree: fix word spelling error
mm/vmalloc: fix the unchecked dereference warning in vread_iter()
zswap: export compression failure stats
Documentation: ubsan: drop "the" from article title
mempolicy: migration attempt to match interleave nodes
mempolicy: mmap_lock is not needed while migrating folios
mempolicy: alloc_pages_mpol() for NUMA policy without vma
mm: add page_rmappable_folio() wrapper
mempolicy: remove confusing MPOL_MF_LAZY dead code
mempolicy: mpol_shared_policy_init() without pseudo-vma
mempolicy trivia: use pgoff_t in shared mempolicy tree
mempolicy trivia: slightly more consistent naming
mempolicy trivia: delete those ancient pr_debug()s
mempolicy: fix migrate_pages(2) syscall return nr_failed
kernfs: drop shared NUMA mempolicy hooks
hugetlbfs: drop shared NUMA mempolicy pretence
mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
...
This workaround is primarily implemented by the BIOS. However if the
BIOS applies the workaround it will reserve a small piece of our DSM
(which should be at the top, right below the WOPCM); we just need to
keep that region reserved so that nothing else attempts to re-use it.
v2: Declare regs in intel_gt_regs.h (Matt Roper)
v3: Shift WA implementation before calculation of *base (Matt Roper)
v4:
- Change condition gscpmi base to be fall in DSM range.(Matt Roper)
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231027195052.3676632-1-dnyaneshwar.bhadane@intel.com