After noticing in logs there were still mentions to GEN6 registers, it
was clear commit d9b79ad275 ("drm/xe: Drop gen afixes from registers")
didn't take care of all the afixes. Some were added later, but there are
also constants and strings still using that. Continue the cleanup
removing the remaining ones.
To keep it consistent with code nearby, a few other changes are made:
- Remove prefix in INTEL_LEGACY_64B_CONTEXT
- Remove GEN8_CTX_L3LLC_COHERENT since it's unused
- Rename GEN9_FREQ_SCALER to GT_FREQUENCY_SCALER
v2: Use XELP_ as prefix for NUM_MOCS_ENTRIES and remove changes to
MOCS_ENTRIES as this is now done as part of a previous commit
(Matt Roper)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231117174049.527192-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There are only 4 scratch registers VF_SW_FLAG(0..3) on each GuC.
We shouldn't use non-existing register VF_SW_FLAG(4) for posting
read.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If GuC responds with the NO_RESPONSE_BUSY message, we extend
our timeout while waiting for the actual response, but we wrongly
assumed that the next message will be RESPONSE_SUCCESS, missing
that we still can get RESPONSE_FAILURE.
Change the condition for the expected message type, using only
common bits from RESPONSE_SUCCESS and RESPONSE_FAILURE (as they
differ, by ABI design, only by the last bit).
v2: add comment/checks to the code (Matt)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
While copying GuC response from the scratch registers to the buffer,
formula to identify next scratch register is broken. Fix it.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The current uC status tracking has a few issues:
1) the HuC is moved to "disabled" instead of "not supported"
2) the status is left uninitialized instead of "disabled" when the
modparam is used to disable support
3) due to #1, a number of checks are done against "disabled" instead of
the appropriate status.
Address all of those by making sure to follow the appropriate state
transition and checking against the required state.
v2: rebase on s/guc_submission_enabled/uc_enabled/
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The XE_WARN_ON macro maps to WARN_ON which is not justified
in many cases where only a simple debug check is needed.
Replace the use of the XE_WARN_ON macro with the new xe_assert
macros which relies on drm_*. This takes a struct drm_device
argument, which is one of the main changes in this commit. The
other main change is that the condition is reversed, as with
XE_WARN_ON a message is displayed if the condition is true,
whereas with xe_assert it is if the condition is false.
v2:
- Rebase
- Keep WARN splats in xe_wopcm.c (Matt Roper)
v3:
- Rebase
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On PVC unloading followed by reloading the module often results in a
completely dead machine (seems to be plaguing CI). Resetting the GuC
like we do at load seems to cure it at least when locally testing this.
v2:
- Move pc_fini into guc_fini. We want to do the GuC reset just after
calling pc_fini, otherwise we encounter communication failures. It
also seems like a good idea to do the reset before we start releasing
the various other GuC resources. In the case of pc_fini there is an
explicit stop, but for other stuff like logs, ads, ctb there is not.
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/542
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/597
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As with PVC, Xe2 platforms require that the index of an uncached MOCS
entry be programmed into the GUC_SHIM_CONTROL register. This will
likely be needed on future platforms as well.
Xe2 also extends the size of the MOCS index register field from two bits
to four bits. Since these extra bits were unused on PVC, it should be
safe to just increase the size of the mask.
Bspec: 60592
Cc: Haridhar Kalvala <haridhar.kalvala@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Replace calls to XE_BUG_ON() with calls XE_WARN_ON() which in turn calls
WARN() instead of BUG(). BUG() crashes the kernel and should only be
used when it is absolutely unavoidable in case of catastrophic and
unrecoverable failures, which is not the case here.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The order: 'offset, mask, val'; is more common in other
drivers and in special in i915, where any dev could copy
a sequence and end up with unexpected behavior.
Done with coccinelle:
@rule1@
expression gt, reg, val, mask, timeout, out, atomic;
@@
- xe_mmio_wait32(gt, reg, val, mask, timeout, out, atomic)
+ xe_mmio_wait32(gt, reg, mask, val, timeout, out, atomic)
spatch -sp_file mmio.cocci *.c *.h compat-i915-headers/intel_uncore.h \
--in-place
v2: Rebased after changes on xe_guc_mcr usage of xe_mmio_wait32.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Don't init pcode and restore VRAM objects in vain.
We can rely on primary GT GUC_STATUS to detect whether
card has really lost power even when d3cold is allowed by xe.
Adding d3cold.lost_power flag to avoid pcode init and vram
restoration.
Also cleaning up the TODO code comment.
v2:
- %s/xe_guc_has_lost_power()/xe_guc_in_reset().
- Used existing gt instead of new variable. [Rodrigo]
- Added kernel-doc function comment. [Rodrigo]
- xe_guc_in_reset() return true if failed to get fw.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-6-anshuman.gupta@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reduce the number of warnings reported by checkpatch.pl from 118 to 48 by
addressing those warnings types:
LEADING_SPACE
LINE_SPACING
BRACES
TRAILING_SEMICOLON
CONSTANT_COMPARISON
BLOCK_COMMENT_STYLE
RETURN_VOID
ONE_SEMICOLON
SUSPECT_CODE_INDENT
LINE_CONTINUATIONS
UNNECESSARY_ELSE
UNSPECIFIED_INT
UNNECESSARY_INT
MISORDERED_TYPE
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Although primary and media GuC share a single interrupt enable bit, they
each have distinct bits in the mask register. Although we always enable
interrupts for the primary GuC before the media GuC today (and never
disable either of them), this might not always be the case in the
future, so use a RMW when updating the mask register to ensure the other
GuC's mask doesn't get clobbered.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-24-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Port Wa_14014475959 to xe_wa fixing its condition. The workaround should
only be applied on the primary GT, not on media. So just checking by
MTL platform is not enough: checking GT is of the right type is also
needed.
Since the GRAPHICS_STEP() does checks the GT type, we could leave the
first check as a platform one: it'd would be easier to understand and
not go out of sync with the graphics_ip_map[] in
drivers/gpu/drm/xe/xe_pci.c. However it also means that new platforms
using the same IP wouldn't match. Prefer using the IP version.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-22-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Wa_16015675438 and Wa_18020744125 apply to DG2 using the same action and
conditions. Add both to the oob rules so they are both reported as
active. Note that previously they were not checking by platform or IP
version, hence making them not future-proof. Those workarounds should
only be active in PVC and DG2, besides the check for "no render engine".
v2: From current WA database, Wa_16015675438 applies to all DG2
subplatforms except G11. Migrate condition to use subplatform and
remove G11 from the match (Matt Roper)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-19-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Wa_22012727170 and Wa_22012727685 apply to DG2 using the same action and
conditions. Add both to the oob rules so they are both reported as
active.
Do not Wa_22012727170 to PVC and MTL since only early A* steppings are
affected.
v2: Remove DG2_G10 from Wa_22012727685 to match current WA database
(Matt Roper)
v3: GRAPHICS_STEP(A0, FOREVER) can be left alone for DG2 as this means
all steppings
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-18-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Both GUC_HOST_INTERRUPT and MED_GUC_HOST_INTERRUPT can pass
additional payload data to the GuC but this capability is not
used by the firmware yet.
Stop using value mandated by legacy GuC interrupt register and
use default notify value (zero) instead.
Bspec: 49813, 63363
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The goal is to allow for a snapshot capture to be taken at the time
of the crash, while the print out can happen at a later time through
the exposed devcoredump virtual device.
v2: Handle memory allocation failures. (Matthew)
Do not use GFP_ATOMIC on cases like debugfs prints. (Matthew)
v3: checkpatch fixes
v4: Do not use atomic in the g2h_worker_func (Matthew)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
These should replace the _MMIO() and MCR_REG() from i915, with the goal
of being more extensible, allowing to pass the additional fields for
struct xe_reg and struct xe_reg_mcr. Replace all uses of _MMIO() and
MCR_REG() in xe.
Since the RTP, reg-save-restore and WA infra are not ready to use the
new type, just undef the macro like was done for the i915 types
previously. That conversion will come later.
v2: Remove MEDIA_SOFT_SCRATCH_COUNT/MEDIA_SOFT_SCRATCH re-added by
mistake (Matt Roper)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The defines for the registers were brought over from i915 while
bootstrapping the driver. As xe supports TGL and later only, it doesn't
make sense to keep the GEN* prefixes and suffixes in the registers: TGL
is graphics version 12, previously called "GEN12". So drop the prefix
everywhere.
v2:
- Also drop _TGL suffix and reword commit message as suggested
by Matt Roper. While at it, rename VSUNIT_CLKGATE_DIS_TGL to
VSUNIT_CLKGATE2_DIS with the additional "2", so it doesn't clash
with the define for the other register
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There's no good reason to keep the GuC registers outside the regs/
directory: move the header with GuC registers under that.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
That register is a completely different register, it's not the same as
SOFT_SCRATCH for GEN11 and beyond. Rename to to the same name as the
bspec uses, including the new variant for media. Also, move the
definitions to the guc header.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reduce the use of i915_reg_defs.h so it can be encapsulated in a single
place.
1) If it was being included by mistake, remove
2) If it was included for FIELD_GET()/FIELD_PREP()/GENMASK() and the
like, just include <linux/bitfield.h>
3) If it was included to be able to define additional registers, move
the registers to the relavant headers (regs/xe_regs.h or
regs/xe_gt_regs.h)
v2:
- Squash commit fixing i915_reg_defs.h include and with the one
introducing regs/xe_reg_defs.h
- Remove more cases of i915_reg_defs.h being used when all it was
needed was linux/bitfield.h (Matt Roper)
- Move some registers to the corresponding regs/*.h file (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo squashed here the removal of the i915 include]
Create regs/xe_gt_regs.h file with all the registers and bit
definitions used by the xe driver. Eventually the registers may be
defined in a different way and since xe doesn't supported below gen12,
the number of registers touched is much smaller, so create a new header.
The definitions themselves are direct copy from the
gt/intel_gt_regs.h file, just sorting the registers by address.
Cleaning those up and adhering to a common coding style is left for
later.
v2: Make the change to MCR_REG location in a separate patch to go
through the i915 branch (Matt Roper / Rodrigo)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Sort includes and split them in blocks:
1) .h corresponding to the .c. Example: xe_bb.c should have a "#include
"xe_bb.h" first.
2) #include <linux/...>
3) #include <drm/...>
4) local includes
5) i915 includes
This is accomplished by running
`clang-format --style=file -i --sort-includes drivers/gpu/drm/xe/*.[ch]`
and ignoring all the changes after the includes. There are also some
manual tweaks to split the blocks.
v2: Also sort includes in headers
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
A few static functions not being declared like that break the build with
W=1, like e.g.
cc1: all warnings being treated as errors
make[2]: *** [../scripts/Makefile.build:250: drivers/gpu/drm/xe/xe_gt.o] Error 1
../drivers/gpu/drm/xe/xe_guc.c:240:6: error: no previous prototype for ‘guc_write_params’ [-Werror=missing-prototypes]
240 | void guc_write_params(struct xe_guc *guc)
| ^~~~~~~~~~~~~~~~
Make them static.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Only the GuC should be issuing TLB invalidations if it is enabled. Part
of this patch is sanitize the device on driver unload to ensure we do
not send GuC based TLB invalidations during driver unload.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>