Marek Report a possible irq lock inversion dependency warning when
commit 81a06f1d02 ("Revert "drm/rockchip: vop2: Use regcache_sync()
to fix suspend/resume"") lands linux-next.
I can reproduce this warning with:
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y
It seems than when use regmap_reinit_cache at runtime whith Mark's
commit 3d59c22bbb ("drm/rockchip: vop2: Convert to use maple tree
register cache"), it will trigger a possible irq lock inversion dependency
warning.
One solution is switch back to REGCACHE_RBTREE, but it seems that
REGCACHE_MAPLE is the future, so I avoid using regmap_reinit_cache,
and drop all the regcache when vop is disabled, then we get a fresh
start at next enbable time.
Fixes: 81a06f1d02 ("Revert "drm/rockchip: vop2: Use regcache_sync() to fix suspend/resume"")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/98a9f15d-30ac-47bf-9b93-3aa2c9900f7b@samsung.com/
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
[dropped the large kernel log of the lockdep report from the message]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231217084415.2373043-1-andyshrk@163.com
If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl()
we proceed to unref the fb and then retry the whole thing from the top.
But we forget to reset the fb pointer back to NULL, and so if we then
get another error during the retry, before the fb lookup, we proceed
the unref the same fb again without having gotten another reference.
The end result is that the fb will (eventually) end up being freed
while it's still in use.
Reset fb to NULL once we've unreffed it to avoid doing it again
until we've done another fb lookup.
This turned out to be pretty easy to hit on a DG2 when doing async
flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I
saw that drm_closefb() simply got stuck in a busy loop while walking
the framebuffer list. Fortunately I was able to convince it to oops
instead, and from there it was easier to track down the culprit.
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-1-ville.syrjala@linux.intel.com
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms. The experimental support starts with Tiger Lake.
i915 will continue be the main production driver for the platforms
up to Meteor Lake and Alchemist. Then the goal is to make this Intel
Xe driver the primary driver for Lunar Lake and newer platforms.
It uses most, if not all, of the key drm concepts, in special: TTM,
drm-scheduler, drm-exec, drm-gpuvm/gpuva and others.
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface
change].
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com
We're currently sending non-blocking H2G messages using the EVENT type,
which suppresses all CTB protocol replies from the GuC, including the
failure cases. This might cause errors to slip through and manifest as
unexpected behavior (e.g. a context state might not be what the driver
thinks it is because the state change command was silently rejected by
the GuC). To avoid this kind of problems, we can use the FAST_REQUEST
type instead, which suppresses the reply only on success; this way we
still get the advantage of not having to wait for an ack from the GuC
(i.e. the H2G is still non-blocking) while still detecting errors.
Since we can't escalate to the caller when a non-blocking message
fails, we need to escalate to GT reset instead.
Note that FAST_REQUEST failures are NOT expected and are usually a sign
that the H2G was either malformed or requested an illegal operation.
v2: assign fence values to FAST_REQUEST messages, fix abi doc, use xe_gt
printers (Michal).
v3: fix doc alignment, fix and improve prints (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v2
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
The GFX doorbell solution provides a mechanism for submission of
workload to the graphics hardware by a ring3 application without
the penalty of ring transition for each workload submission.
This feature is not currently used by the Linux drivers, but in
SR-IOV mode the doorbells are treated as shared resource and the
PF driver must be able to provision exclusive range of doorbells
IDs across all enabled VFs.
Introduce simple GuC doorbell ID manager that will be used by the
PF driver for VFs provisioning and can later be used by submission
code once we are ready to switch from H2G based notifications to
doorbells mechanism.
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20231218190629.502-4-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
When interrupts are delivered using memory based mechanism, engines
will write status to the report page at the offset (in bytes) that
corresponds to their interrupt bit from the GT_INTR_DW register.
Add engine interrupt offset definitions to engine info as we will
need this to process memory based interrupts.
Bspec: 46149, 50829, 50844
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231214185955.1791-6-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
The RING_INT_SRC_RPT_PTR register points to a cacheline in memory
to which an engine must report as source of interrupt prior to
generating an interrupt to the host.
The RING_INT_STATUS_RPT_PTR register points to the first cacheline
of the Interrupt Status Report (ISR) page (4KB) in graphics memory
to which all engines report their interrupt status.
The RING_IMR register has the interrupt enables and interrupt masks
for an engine.
We will refer to these registers shortly.
Bspec: 45963, 45964, 45965
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231214185955.1791-3-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Pre-production hardware is anything before C0 (for DG2-G10), before B1
(for DG2-G11), or before A1 (for DG2-G12). Workarounds specific to such
hardware was already removed from i915 in commit eaeb4b3614
("drm/i915/dg2: Drop pre-production GT workarounds") and there's even
less value keeping these around in the Xe driver.
v2:
- Drop Wa_14011441408 from xe_mocs.c. (Gustavo)
- Drop Wa_14010648519, Wa_14010198302, and Wa_1608949956 which were
mis-implemented; they were only supposed to apply to early steppings
of DG2-G10, but were being applied unconditionally on all DG2.
(Gustavo)
- Drop reference to Wa_16011620976; the implementation stays because it
still matches Wa_22015475538. (Gustavo)
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231215214531.2576215-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
"err" is not initialized when failing to create and add the freq0 sysfs
file. Remove it from the message. This fixes the following warning with
clang:
../drivers/gpu/drm/xe/xe_gt_freq.c:202:30: error: variable 'err' is uninitialized when used here [-Werror,-Wuninitialized]
kobject_name(gt->sysfs), err);
^~~
Fixes: bef52b5c7a ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20231220161923.3740489-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As part of the FW definitions, we declare each blob as required via the
MODULE_FIRMWARE() macro. This causes the initramfs update (or equivalent
process) to look for the blobs on disk when the kernel is installed;
therefore, we need to make sure that all FWs we define are available in
linux-firmware.
We currently don't plan to push the PVC blob to linux-firmware, while the
LNL one will only be pushed once we have machines in CI to test it, so we
need to remove them from the list for now.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The uAPI should stay generic in regarding to the bitmask. It is
the userspace responsibility to check for the type/class of the
memory, without any assumption.
Also add comments inside the code to explain how it is actually
constructed so we don't accidentally change the assignment of
the instance and the masks.
No functional change in this patch. It only explains and document
the memory_region masks. A further follow-up work with the
organization of all memory regions around struct xe_mem_regions
is desired, but not part of this patch.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>