xe_mmio_read64_2x32() was adjusting register addresses and then
calling xe_mmio_read32(), which applies the adjustment again.
This may shift accesses twice if adj_offset < adj_limit. There is
no issue currently, as for media gt, adj_offset > adj_limit, so
the 2nd adjust will be a no-op. But it may not work in future.
To fix it, replace the adjusted-address comparison with a direct
sanity check that ensures the MMIO address adjustment cutoff never
falls within the 8-byte range of a 64-bit register. And let
xe_mmio_read32() handle address translation.
v2: rewrite the sanity check in a more natural way. (Matt)
v3: Add Fixes tag. (Jani)
Fixes: 07431945d8 ("drm/xe: Avoid 64-bit register reads")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260130165621.471408-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit a30f999681)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
clangd reports many "unused header" warnings throughout the Xe driver.
Start working to clean this up by removing unnecessary includes in our
.c files and/or replacing them with explicit includes of other headers
that were previously being included indirectly.
By far the most common offender here was unnecessary inclusion of
xe_gt.h. That likely originates from the early days of xe.ko when
xe_mmio did not exist and all register accesses, including those
unrelated to GTs, were done with GT functions.
There's still a lot of additional #include cleanup that can be done in
the headers themselves; that will come as a followup series.
v2:
- Squash the 79-patch series down to a single patch. (MattB)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260115032803.4067824-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Instead of creating ad-hoc new register definitions with altered
register addresses to mimic the VF's access to these registers,
prepare new MMIO instance per required VF, with shifted internal
location of the register map. This will allow to use unmodified
register definitions in all calls to xe_mmio() functions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251024205826.4652-1-michal.wajdeczko@intel.com
The function mmio_multi_tile_setup() does not look like the proper
location for probing for the number of existing tiles in the hardware.
It should not be that function's responsibility and such information
should instead be already available when it gets called.
The fact that we need to adjust gt_count is a symptom of that.
Move probing of available tile count to a dedicated function named
xe_info_probe_tile_count() and call it from xe_info_init(), which seems
like a more appropriate place for that.
With that move, we no longer need to (and shouldn't) adjust gt_count as
a part of xe_info_probe_tile_count(), as that field will be initialized
later in xe_info_init().
v4:
- Only probe for tile count if the default tile_count != 1 (just like
was done in mmio_multi_tile_setup()). (CI)
v3:
- Unchanged.
v2:
- Use KUnit static stub so that we do not try to query hardware when
running KUnit tests. (CI)
- Tweak last paragraph of commit message to make it clearer.
(Jonathan)
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250818-gt_count-improvements-v4-1-ee12870c6f57@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Although "multi-tile" and "multiple GTs per tile" are mutually-exclusive
characteristics on all of our platforms today, this may not always be
true. Assign GT IDs according to xe->info.max_gt_per_tile in a way that
should work even if future platforms have different configurations.
This patch should not change the behavior of current platforms; it only
future-proofs for potential future designs.
v2:
- Re-calculate gt_count if tile count gets reduced by MTCFG. (PVC CI)
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://lore.kernel.org/r/20250701201320.2514369-14-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Allow to test if driver behaves correctly when xe_pcode_probe_early()
fails. Note that this is not sufficient for testing survivability mode
as it's still required to read the hw to check for errors, which doesn't
happen on an injected failure.
To complete the early probe coverage, allow injection in the other
functions as well: xe_mmio_probe_early() and xe_device_probe_early().
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250314-fix-survivability-v5-3-fdb3559ea965@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The "mmio_ext" and 'REG_EXT" code is currently unused on any existing
platform. Going forward, this also isn't the design we want to use for
any future platforms/features either, so we should just go ahead and
remove the dead code to avoid confusion.
mmio_ext was originally added in an attempt to hack around the early
(mis)design of the Xe driver, which used xe_gt as the target for all
register MMIO access, even those completely unrelated to the GT subunit
of the hardware. With the introduction of commit 34953ee349 ("drm/xe:
Create dedicated xe_mmio structure") and its follow-up patches, that
misdesign has been corrected and access to register MMIO regions
specific to hardware units is now done through xe_mmio structures which
encapsulate an iomap, region size, and some other metadata.
Although all of the registers used by the driver today happen to fall
within one specific PCI BAR region, and thus re-use a single device-wide
iomap, there's no requirement that this stay true for future platforms
or features. I.e., if a future platform adds a new 'foo' hardware unit
that exists at a different area in the BAR, or even in a completely
different BAR, then that would be handled by doing a separate iomap of
that unit's register region and wrapping it in its own 'struct xe_mmio
foo_regs' structure. The pointer to the new 'foo_regs' could be placed
within the xe_device, xe_tile, xe_gt, etc., according to where the new
hardware unit falls within the current hardware hierarchy.
This effectively reverts the following commits, although parts of these
commits had already vanished or changed with the earlier xe_mmio
refactor work:
- commit 399a13323f ("drm/xe: add 28-bit address support in struct
xe_reg")
- commit fdef72e02e ("drm/xe: add a flag to bypass multi-tile config
from MTCFG reg")
- commit 866b2b1764 ("drm/xe: add MMIO extension support flags")
- commit ef29b390c7 ("drm/xe: map MMIO BAR according to the num of
tiles in device desc")
- commit a4e2f3a299 ("drm/xe: refactor xe_mmio_probe_tiles to support
MMIO extension")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Koby Elbaz <kelbaz@habana.ai>
Acked-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Reviewed-by: Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106234312.2986065-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Since much of the MMIO register access done by the driver is to non-GT
registers, use of 'xe_gt' in these interfaces has been a long-standing
design flaw that's been hard to disentangle.
To avoid a flag day across the whole driver, munge the function names
and add temporary compatibility macros with the original function names
that can accept either the new xe_mmio or the old xe_gt structure as a
parameter. This will allow us to slowly convert parts of the driver
over to the new interface independently.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-54-matthew.d.roper@intel.com
xe_mmio currently has a size parameter that is assigned but never used
anywhere. The current values assigned appear to be the size of the BAR
region assigned for the tile (both for registers and other purposes such
as the GGTT). Since the current field isn't being used for anything,
change the assignments to 4MB (the size of the register region on all
current platform) and rename the field to 'regs_size' to more clearly
describe what it represents. We can use this value in later patches to
help ensure no register accesses accidentally go past the end of the
desired register space (which might not be caught easily if they still
fall within the iomap).
v2:
- s/regs_length/regs_size/ (Lucas)
- Clarify kerneldoc description (Lucas)
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-48-matthew.d.roper@intel.com
Wa_15015404425 asks us to perform four "dummy" writes to a
non-existent register offset before every real register read.
Although the specific offset of the writes doesn't directly
matter, the workaround suggests offset 0x130030 as a good target
so that these writes will be easy to recognize and filter out in
debugging traces.
V5(MattR):
- Avoid negating an equality comparison
V4(MattR):
- Use writel and remove xe_reg usage
V3(MattR):
- Define dummy reg local to function
- Avoid tracing dummy writes
- Update commit message
V2:
- Add WA to 8/16/32bit reads also - MattR
- Corrected dummy reg address - MattR
- Use for loop to avoid mental pause - JaniN
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240709155606.2998941-1-tejas.upadhyay@intel.com
GuC loading can take longer than it is supposed to for various
reasons. So add in the code to cope with that and to report it when it
happens. There are also many different reasons why GuC loading can
fail, so add in the code for checking for those and for reporting
issues in a meaningful manner rather than just hitting a timeout and
saying 'fail: status = %x'.
Also, remove the 'FIXME' comment about an i915 bug that has never been
applicable to Xe!
v2: Actually report the requested and granted frequencies rather than
showing granted twice (review feedback from Badal).
v3: Locally code all the timeout and end condition handling because a
helper function is not allowed (review feedback from Lucas/Rodrigo).
v4: Add more documentation comments and rename a define to add units
(review feedback from Lucas).
v5: Fix copy/paste error in xe_mmio_wait32_not (review feedback from
Lucas) and rebase (no more return value from guc_wait_ucode).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240518043700.3264362-3-John.C.Harrison@Intel.com
Lmem init check should be done only after pcode initialization
status is complete. Move lmem init check after pcode status
check. Also wait for a short while after pcode status check
to allow completion of the task.
Failing to do so, can lead to aborting the module load
leaving the system unusable. Wait until the lmem initialization
is complete within a timeout (60s) or till the user aborts.
v2: use bool as return type
re-order the code comment (Rodrigo)
add comment for deferring probe (Himal)
v3: rebase
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240410085005.1126343-3-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Shortly we will updating xe_mmio_read|write() functions with SR-IOV
specific features making those functions less suitable for inline.
Convert now those functions into regular ones, lowering driver
footprint, according to scripts/bloat-o-meter, by 6%
add/remove: 18/18 grow/shrink: 31/603 up/down: 2719/-79663 (-76944)
Function old new delta
Total: Before=1276633, After=1199689, chg -6.03%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Data old new delta
Total: Before=48990, After=48990, chg +0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
RO Data old new delta
Total: Before=115680, After=115680, chg +0.00%
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-7-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:
drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
109 | root_res->start > 0x100000000ull)
| ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.
Fixes: 9a6e6c14bf ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b")
Fixes: 237412e453 ("drm/xe: Enable 32bits build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
SR-IOV VF doesn't have access to MMIO registers used to determine
graphics/media ID. It can however communicate with GuC.
Introduce xe_device_probe_early, which initializes enough HW to use
MMIO GuC communication.
This will allow both VF and PF/native driver to have unified probe
ordering.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Both MMIO registers and GGTT for root tile will need to be used earlier
during probe. Don't rely on tile count to compute the mapping size.
Further more, there's no need to remap after figuring out the real
resource size.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Per device, set this flag to access the MTCFG register or to skip it.
This is done to standardise Xe driver naming if an access to any HW
should be avoided.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>