At present the interrupts are enabled while initializing the last GT.
But this is incorrect for a Multi-GT platform, as root GT initialization
will fail with interrupt disabled. Interrupts are required for
the GuC submission triggered during initialization.
Enable the interrupt during the root GT initialization.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Some of the tests may benefit from running with ARCH=um, forgoing any
additional setup on the CI build side. Add min config for that.
Tested with:
./tools/testing/kunit/kunit.py build \
--kunitconfig drivers/gpu/drm/xe/.kunitconfig \
--jobs $(nproc) \
--build_dir build_kunit
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
mem_type field was added in commit d8b52a02cb ("drm/xe: Implement
stolen memory.") to designate the TTM memory type for that mgr. Add
kernel-doc with its description.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fix the following error while building for 32b:
In file included from ../drivers/gpu/drm/xe/xe_gt.c:6:
../drivers/gpu/drm/xe/xe_gt.c: In function ‘gt_ttm_mgr_init’:
../include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
| ^~
Cast it to u64 so size of the second operand matches the first one when
building it for 32 bits.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Leave the types as u64, but cast the pointers to unsigned long before
assigning so the compiler doesn't throw warning about casting a pointer
to integer of different size.
Also, size_t should use %zu, not %ld.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
writeq() and readq() and other functions working on 64 bit variables
are not provided by 32b arch. For that it's needed to choose between
linux/io-64-nonatomic-hi-lo.h and linux/io-64-nonatomic-lo-hi.h,
spliting the read/write in 2 accesses. For xe driver, it doesn't matter
much, so just choose one and include in xe_mmio.h.
This also removes some ifdef CONFIG_64BIT we had around because of the
missing 64bit functions.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On DG2 we are now getting:
[ 104.456607] xe 0000:03:00.0: [drm] *ERROR* PCODE timeout, retrying with preemption disabled
Looks like we just need to invert the error check for
xe_pcode_try_request(), which returns zero on success.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On DG1, BAR2 is not reliable for reporting Vram size, need to use GSMBASE.
Simplify xe_mmio_total_vram_size to report vram size and usable size.
Signed-off-by: Philippe Lecluse <philippe.lecluse@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The bo_create ioctl relied on the internal ordering of memory regions to
be the same, make sure we don't allocate stolen instead of VRAM0.
Also remove a debug warning left in.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Philippe Lecluse <philippe.lecluse1@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Starting in 70.6.* GuC firmware the CSS header includes the submission
version, pull this from the CSS header. Prior 70.* versions accidentally
omitted this informatio so hard code to the correct values. This
information will be used by VFs when communicating with the PF.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Philippe Lecluse <philippe.lecluse1@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The blob doesn't fully support this yet, so fake for now to ensure our
driver load order is correct.
Once the blob supports pulling gt->info.engine_mask from the blob, this
patch can be removed.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This adds support for stolen memory, with the same allocator as
vram_mgr. This allows us to skip a whole lot of copy-paste,
by re-using parts of xe_ttm_vram_mgr.
The stolen memory may be bound using VM_BIND, so it performs like any
other memory region.
We should be able to map a stolen BO directly using the physical memory
location instead of through GGTT even on old platforms, but I don't know
what the effects are on coherency.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When a job is inflight we may access memory to read the hardware seqno.
All user jobs have VM open which has a ref but kernel jobs do not
require VM so it is possible to not have memory ref. To avoid this, take
a memory ref on kernel job creation.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Expand xe_mmio_wait32 to accept atomic and then use
that directly when possible, and create own routine to
wait for the pcode status.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
We don't need any macro for a simple check we can do explicitly
and clear.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
To make it simpler, all of the status checks also waits and
times out.
Also, no ktime precision is needed in this case, and we
can use usleep_range because we are not in atomic paths here.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Rather than a constant check on proto and wait not busy,
let's wait for the expected success and then check the
protocol afterwards.
With this, we can now use the regular xe_mmio_wait32
and kill this local need for the wait_for.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Possible now that the wait function returns the last read value.
So we can remove the users of i915's wait_for one by one...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
This is already useful because it avoids some extra reads
where registers might have changed after the timeout decision.
But also, it will be important to end the kill of i915's wait_for.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Then, move the i915_utils.h include to its user.
The overall goal is to kill all the usages of the i915_utils
stuff.
Yes, wait_for also depends on <linux/delay.h>, so they go
together to where it is needed. It will be likely needed
anyway directly for udelay or usleep_range.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).
The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).
The new Xe driver leverages a lot from i915.
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.
This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>