SR-IOV VF doesn't have access to MMIO registers used to determine
graphics/media ID. It can however communicate with GuC.
Introduce xe_device_probe_early, which initializes enough HW to use
MMIO GuC communication.
This will allow both VF and PF/native driver to have unified probe
ordering.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This function is big enough, let's move it to a shared compilation unit.
While at it, document it.
Here is the output of running bloat-o-metter on the new and old module
(execution provided by Lucas):
$ ./scripts/bloat-o-meter build64/drivers/gpu/drm/xe/xe.ko{.old,}
add/remove: 2/0 grow/shrink: 0/58 up/down: 554/-15645 (-15091)
(...) # Lines in between omitted
Total: Before=2181322, After=2166231, chg -0.69%
The overall reduction in the size is not that significant. Nevertheless,
keeping the function as inline arguably does not bring too much benefit
as well.
As noted by Lucas, we would probably benefit from an inline
function that did the fast-path check: do an optimistic first check
before entering the wait-logic, which itself would go to a compilation
unit. We might come back to implement this in the future if we have data
to justify it.
v2:
- Add note in documentation for @timeout_us regarding the exponential
backoff strategy. (Lucas)
- Share output of bloat-o-meter in the commit message. (Lucas)
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231116214000.70573-2-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After noticing in logs there were still mentions to GEN6 registers, it
was clear commit d9b79ad275 ("drm/xe: Drop gen afixes from registers")
didn't take care of all the afixes. Some were added later, but there are
also constants and strings still using that. Continue the cleanup
removing the remaining ones.
To keep it consistent with code nearby, a few other changes are made:
- Remove prefix in INTEL_LEGACY_64B_CONTEXT
- Remove GEN8_CTX_L3LLC_COHERENT since it's unused
- Rename GEN9_FREQ_SCALER to GT_FREQUENCY_SCALER
v2: Use XELP_ as prefix for NUM_MOCS_ENTRIES and remove changes to
MOCS_ENTRIES as this is now done as part of a previous commit
(Matt Roper)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231117174049.527192-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
During xe_mmio_probe_vram(), we already store the values returned from
xe_mmio_tile_vram_size() into the xe_tile structures.
There is no need to call xe_mmio_tile_vram_size() again later during
setup of the STOLEN region. Just use the values stored in the root tile.
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The only possible 64-bit register writes in the driver come from the
highly questionable MMIO ioctl. That ioctl's register write support
only operates for userspace running as root and cannot be used by any
real userspace; it exists solely to support the "xe_reg" debug tool in
IGT. Since the spec indicates that hardware does not officially support
64-bit register accesses, there's no reason to allow such 64-bit writes,
even for debugging.
Bspec: 60027
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://lore.kernel.org/r/20230823003312.1356779-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Intel hardware officially only supports GTTMMADR register accesses of
32-bits or less (although 64-bit accesses to device memory and PTEs in
the GSM are fine). Even though we do usually seem to get back
reasonable values when performing readq() operations on registers in
BAR0, we shouldn't rely on this violation of the spec working
consistently. It's likely that even when we do get proper register
values back the hardware is internally satisfying the request via a
non-atomic sequence of two 32-bit reads, which can be problematic for
timestamps and counters if rollover of the lower bits is not considered.
Replace xe_mmio_read64() with xe_mmio_read64_2x32() that implements
64-bit register reads as two 32-bit reads and attempts to ensure that
the upper dword has stabilized to avoid problematic rollovers for
counter and timestamp registers.
v2:
- Move function from xe_mmio.h to xe_mmio.c. (Lucas)
- Convert comment to kerneldoc and note that it shouldn't be used on
registers where reads may trigger side effects. (Lucas)
Bspec: 60027
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://lore.kernel.org/r/20230823003312.1356779-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The order: 'offset, mask, val'; is more common in other
drivers and in special in i915, where any dev could copy
a sequence and end up with unexpected behavior.
Done with coccinelle:
@rule1@
expression gt, reg, val, mask, timeout, out, atomic;
@@
- xe_mmio_wait32(gt, reg, val, mask, timeout, out, atomic)
+ xe_mmio_wait32(gt, reg, mask, val, timeout, out, atomic)
spatch -sp_file mmio.cocci *.c *.h compat-i915-headers/intel_uncore.h \
--in-place
v2: Rebased after changes on xe_guc_mcr usage of xe_mmio_wait32.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Each tile has its own register region in the BAR, containing instances
of all registers for the platform. In contrast, the multiple GTs within
a tile share the same MMIO space; there's just a small subset of
registers (the GSI registers) which have multiple copies at different
offsets (0x0 for primary GT, 0x380000 for media GT). Move the register
MMIO region size/pointers to the tile structure, leaving just the GSI
offset information in the GT structure.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-7-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The _total_vram_size helper is device based and is not complete.
Teach the helper to be tile aware and add the ability to size
DG1 correctly.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
writeq() and readq() and other functions working on 64 bit variables
are not provided by 32b arch. For that it's needed to choose between
linux/io-64-nonatomic-hi-lo.h and linux/io-64-nonatomic-lo-hi.h,
spliting the read/write in 2 accesses. For xe driver, it doesn't matter
much, so just choose one and include in xe_mmio.h.
This also removes some ifdef CONFIG_64BIT we had around because of the
missing 64bit functions.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This adds support for stolen memory, with the same allocator as
vram_mgr. This allows us to skip a whole lot of copy-paste,
by re-using parts of xe_ttm_vram_mgr.
The stolen memory may be bound using VM_BIND, so it performs like any
other memory region.
We should be able to map a stolen BO directly using the physical memory
location instead of through GGTT even on old platforms, but I don't know
what the effects are on coherency.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Expand xe_mmio_wait32 to accept atomic and then use
that directly when possible, and create own routine to
wait for the pcode status.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
This is already useful because it avoids some extra reads
where registers might have changed after the timeout decision.
But also, it will be important to end the kill of i915's wait_for.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Then, move the i915_utils.h include to its user.
The overall goal is to kill all the usages of the i915_utils
stuff.
Yes, wait_for also depends on <linux/delay.h>, so they go
together to where it is needed. It will be likely needed
anyway directly for udelay or usleep_range.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).
The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).
The new Xe driver leverages a lot from i915.
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.
This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>