The ecc_irq is disabled while GPU mode2 reset suspending process,
but not be enabled during GPU mode2 reset resume process.
Changed from V1:
only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip
delete amdgpu_ras_late_resume function
Changed from V2:
check umc ras supported before put ecc_irq
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Variable remainder is being initialized with a value that is never read,
the assignment is redundant and can be removed. Also add a newline
after the declaration to clean up the coding style.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Correct all kernel-doc notation on pvr_device.h so that there are no
kernel-doc warnings remaining.
pvr_device.h:292: warning: Excess struct member 'active' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'idle' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'lock' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'work' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'old_kccb_cmds_executed' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'kccb_stall_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'ccb' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn_q' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn_obj' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'slot_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'reserved_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'waiters' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'fence_ctx' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'id' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'seqno' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'lock' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'active' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'idle' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'lock' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'work' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'old_kccb_cmds_executed' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'kccb_stall_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'ccb' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn_q' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn_obj' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'rtn' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'slot_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'reserved_count' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'waiters' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'fence_ctx' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'id' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'seqno' description in 'pvr_device'
pvr_device.h:292: warning: Excess struct member 'lock' description in 'pvr_device'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Donald Robson <donald.robson@imgtec.com>
Cc: Matt Coster <matt.coster@imgtec.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231231054910.31805-1-rdunlap@infradead.org
Marek Report a possible irq lock inversion dependency warning when
commit 81a06f1d02 ("Revert "drm/rockchip: vop2: Use regcache_sync()
to fix suspend/resume"") lands linux-next.
I can reproduce this warning with:
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y
It seems than when use regmap_reinit_cache at runtime whith Mark's
commit 3d59c22bbb ("drm/rockchip: vop2: Convert to use maple tree
register cache"), it will trigger a possible irq lock inversion dependency
warning.
One solution is switch back to REGCACHE_RBTREE, but it seems that
REGCACHE_MAPLE is the future, so I avoid using regmap_reinit_cache,
and drop all the regcache when vop is disabled, then we get a fresh
start at next enbable time.
Fixes: 81a06f1d02 ("Revert "drm/rockchip: vop2: Use regcache_sync() to fix suspend/resume"")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/98a9f15d-30ac-47bf-9b93-3aa2c9900f7b@samsung.com/
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
[dropped the large kernel log of the lockdep report from the message]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231217084415.2373043-1-andyshrk@163.com
If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl()
we proceed to unref the fb and then retry the whole thing from the top.
But we forget to reset the fb pointer back to NULL, and so if we then
get another error during the retry, before the fb lookup, we proceed
the unref the same fb again without having gotten another reference.
The end result is that the fb will (eventually) end up being freed
while it's still in use.
Reset fb to NULL once we've unreffed it to avoid doing it again
until we've done another fb lookup.
This turned out to be pretty easy to hit on a DG2 when doing async
flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I
saw that drm_closefb() simply got stuck in a busy loop while walking
the framebuffer list. Fortunately I was able to convince it to oops
instead, and from there it was easier to track down the culprit.
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-1-ville.syrjala@linux.intel.com
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms. The experimental support starts with Tiger Lake.
i915 will continue be the main production driver for the platforms
up to Meteor Lake and Alchemist. Then the goal is to make this Intel
Xe driver the primary driver for Lunar Lake and newer platforms.
It uses most, if not all, of the key drm concepts, in special: TTM,
drm-scheduler, drm-exec, drm-gpuvm/gpuva and others.
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface
change].
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com
"err" is not initialized when failing to create and add the freq0 sysfs
file. Remove it from the message. This fixes the following warning with
clang:
../drivers/gpu/drm/xe/xe_gt_freq.c:202:30: error: variable 'err' is uninitialized when used here [-Werror,-Wuninitialized]
kobject_name(gt->sysfs), err);
^~~
Fixes: bef52b5c7a ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20231220161923.3740489-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As part of the FW definitions, we declare each blob as required via the
MODULE_FIRMWARE() macro. This causes the initramfs update (or equivalent
process) to look for the blobs on disk when the kernel is installed;
therefore, we need to make sure that all FWs we define are available in
linux-firmware.
We currently don't plan to push the PVC blob to linux-firmware, while the
LNL one will only be pushed once we have machines in CI to test it, so we
need to remove them from the list for now.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The uAPI should stay generic in regarding to the bitmask. It is
the userspace responsibility to check for the type/class of the
memory, without any assumption.
Also add comments inside the code to explain how it is actually
constructed so we don't accidentally change the assignment of
the instance and the masks.
No functional change in this patch. It only explains and document
the memory_region masks. A further follow-up work with the
organization of all memory regions around struct xe_mem_regions
is desired, but not part of this patch.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
remove the num_engines/instances members from drm_xe_wait_user_fence
structure and add a exec_queue_id member
Right now this is only checking if the engine list is sane and nothing
else. In the end every operation with this IOCTL is a soft check.
So, let's formalize that and only use this IOCTL to wait on the fence.
exec_queue_id member will help to user space to get proper error code
from kernel while in exec_queue reset
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
When CONFIG_DRM_I915_DEBUG is not set, a dummy
__i915_inject_probe_error() is provided on the xe side. Use the same
logic as in drivers/gpu/drm/i915/i915_utils.c to ifdef it out. This
fixes the build with W=1 and without that config:
CC [M] drivers/gpu/drm/xe/display/ext/i915_utils.o
../drivers/gpu/drm/xe/display/ext/i915_utils.c:19:5: error: no previous prototype for ‘__i915_inject_probe_error’ [-Werror=missing-prototypes]
19 | int __i915_inject_probe_error(struct drm_i915_private *i915, int err,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
bo is not used since all the checks are against tbo. Fix warning:
../drivers/gpu/drm/xe/xe_bo.c: In function ‘xe_evict_flags’:
../drivers/gpu/drm/xe/xe_bo.c:250:23: error: variable ‘bo’ set but not used [-Werror=unused-but-set-variable]
250 | struct xe_bo *bo;
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fix warning:
../drivers/gpu/drm/xe/xe_ttm_vram_mgr.c: In function ‘__xe_ttm_vram_mgr_init’:
../drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:340:13: error: variable ‘err’ set but not used [-Werror=unused-but-set-variable]
340 | int err;
| ^~~
Check for the error return and return it, like done by other drivers.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Clear flat ccs during user bo creation.
- copy ccs meta data between flat ccs and bo during eviction and
restore.
- Add a bool field ccs_cleared in bo, true means ccs region of bo is
already cleared.
v2:
- Rebase.
v3:
- Maintain order of xe_bo_move_notify for ttm_bo_type_sg.
v4:
- xe_migrate_copy can be used to copy src to dst bo on igfx too.
Add a bool which handles only ccs metadata copy.
v5:
- on dgfx ccs should be cleared even if the bo is not compression enabled.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- The XY_CTRL_SURF_COPY_BLT instruction operating on ccs data expects
size in pages of main memory for which CCS data should be copied.
- The bitfield representing copy size in XY_CTRL_SURF_COPY_BLT has
shifted one bit higher in the instruction.
v2:
- Fix the num_pages for ccs size calculation.
- Address nits (Thomas)
v3:
- Use FIELD_PREP and FIELD_FIT instead of shifts and numbers.(Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Incase of bo move from PL_TT to PL_SYSTEM these pages will be used to
store ccs metadata from flat ccs. And during bo move to PL_TT from
PL_SYSTEM the metadata will be copied from extra pages to flat ccs. This
copy of ccs metadata ensures ccs remains unaltered between swapping out
of bo to disk and its restore to PL_TT.
Bspec:58796
v2:
- For dgfx ensure system bit is not set.
- Modify comments.(Thomas)
v3:
- Separate out patch to modify main memory to ccs memory ratio.(Matt)
v4:
- Update description for commit message.
- Make bo allocation routine more readable.(Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>