Commit Graph

45 Commits

Author SHA1 Message Date
Michał Winiarski
a44bbace73 drm/xe/guc: Allocate GuC data structures in system memory for initial load
GuC load will need to happen at an earlier point in probe, where local
memory is not yet available. Use system memory for GuC data structures
used for initial "hwconfig" load, and realloc at a later,
"post-hwconfig" load if needed, when local memory is available.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-1-michal.winiarski@intel.com
2024-02-20 14:13:42 -05:00
Matthew Brost
d688b86a29 drm/xe/guc: Flush G2H handler when turning off CTs
Make sure G2H handler is not running when changing the CT state to drop
messages or disabled. This will help prevent races in the code ensuring
that G2H are not being processed after changing the state.

v2:
- s/flush_g2h_handler/stop_g2h_handler (Michal)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo remove the extra line while pushing]
Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-4-matthew.brost@intel.com
2024-01-26 15:14:01 -05:00
Matthew Brost
dc75d03716 drm/xe/guc: Add more GuC CT states
The Guc CT has more than enabled / disables states rather it has 4. The
4 states are not initialized, disabled, stopped, and enabled. Change the
code to reflect this. These states will enable proper return codes from
functions and therefore enable proper error messages.

v2:
- s/XE_GUC_CT_STATE_DROP_MESSAGES/XE_GUC_CT_STATE_STOPPED (Michal)
- Add assert for CT being initialized (Michal)
- Fix kernel for CT state enum (Michal)

v3:
- Kernel doc (Michal)
- s/reiecved/received (Michal)
- assert CT state not initialized in xe_guc_ct_init (Michal)
- add argument xe_guc_ct_set_state to clear g2h (Michal)

v4:
- Drop clear_outstanding_g2h argument (Michal)

v5:
- Move xa_destroy outside of fast lock (CI)

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-2-matthew.brost@intel.com
2024-01-26 15:12:28 -05:00
José Roberto de Souza
c65908c33b drm/xe: Remove double new lines in devcoredump
Right now devcoredump has a new line between '**** GuC CT ****' and
'H2G CTB (all sizes in DW):' while other sections don't have.

v2: remove double new line after IPEHR

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Cc: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-1-jose.souza@intel.com
2024-01-24 10:53:31 -08:00
Michal Wajdeczko
6af7ee0827 drm/xe/guc: Add kernel-doc for xe_guc_ct_send_recv()
Add initial documentation for recently updated xe_guc_ct_send_recv().

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240112102554.761-2-michal.wajdeczko@intel.com
2024-01-18 19:28:55 +01:00
Michal Wajdeczko
a54e016ace drm/xe/guc: Return CTB HXG response DATA0 if no buffer provided
Most of the synchronous GuC HXG action responses are defined in
such a way that only mandatory DATA0 from the HXG header is used
and only in few cases it is more than MBZ (must be zero).

For those cases where HXG action returns just DATA0, return that
value if caller didn't provide buffer for the full response.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240112102554.761-1-michal.wajdeczko@intel.com
2024-01-18 19:28:41 +01:00
Michal Wajdeczko
d4978a67ae drm/xe/guc: Use HXG definitions on HXG messages
While parsing and processing CTB G2H messages we should extract
underlying HXG message and use HXG definitions on such message.
Using outer CTB layer message in HXG definitions require use of
shifted dword index, which might be confusing:

	FIELD_GET(GUC_HXG_MSG_0_xxx, msg[1])

instead of:

	FIELD_GET(GUC_HXG_MSG_0_xxx, hxg[0])

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111210632.717-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2024-01-12 09:50:25 +01:00
Michal Wajdeczko
d898c2e555 drm/xe/guc: Return CTB response length
Not all CTB responses from the GuC are fixed size and we need to
pass response length to the caller, if there was a response_buffer.
Easiest solution is to return it as positive value from all
xe_guc_ct_send_recv() functions.  The CTB response length is always
between 1 and 254 (ie. GUC_HXG_MSG_MIN_LEN and GUC_CTB_MAX_DWORDS
- GUC_HXG_MSG_MIN_LEN).

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111152724.497-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2024-01-11 20:54:29 +01:00
Michal Wajdeczko
26d4481ac2 drm/xe/guc: Start handling GuC Relay event messages
GuC Relay infrastructure is ready, start handling relay messages
from the GuC to unblock testing on the live system.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-11-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2024-01-05 16:25:54 +01:00
Michal Wajdeczko
4469eae6bc drm/xe/kunit: Allow to replace xe_guc_ct_send_recv() with stub
We want to use replacement functions in upcoming kunit tests.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-9-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2024-01-05 16:25:53 +01:00
Daniele Ceraolo Spurio
0eb16fd267 drm/xe/guc: Use FAST_REQUEST for non-blocking H2G messages
We're currently sending non-blocking H2G messages using the EVENT type,
which suppresses all CTB protocol replies from the GuC, including the
failure cases. This might cause errors to slip through and manifest as
unexpected behavior (e.g. a context state might not be what the driver
thinks it is because the state change command was silently rejected by
the GuC). To avoid this kind of problems, we can use the FAST_REQUEST
type instead, which suppresses the reply only on success; this way we
still get the advantage of not having to wait for an ack from the GuC
(i.e. the H2G is still non-blocking) while still detecting errors.
Since we can't escalate to the caller when a non-blocking message
fails, we need to escalate to GT reset instead.

Note that FAST_REQUEST failures are NOT expected and are usually a sign
that the H2G was either malformed or requested an illegal operation.

v2: assign fence values to FAST_REQUEST messages, fix abi doc, use xe_gt
printers (Michal).

v3: fix doc alignment, fix and improve prints (Michal)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v2
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2023-12-21 16:31:30 -05:00
Michał Winiarski
0e1a47fcab drm/xe: Add a helper for DRM device-lifetime BO create
A helper for managed BO allocations makes it possible to remove specific
"fini" actions and will simplify the following patches adding ability to
execute a release action for specific BO directly.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:11 -05:00
Michal Wajdeczko
b67cb798e4 drm/xe/guc: Include only required GuC ABI headers
On i915 we were adding new GuC ABI headers directly to guc_fwif.h
file since we were replacing old definitions from that file.

On xe driver we could do more and better by including ABI headers
only in files that need those definitions.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/741
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20231128203203.1147-3-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:08 -05:00
Michal Wajdeczko
4a9b7d29c1 drm/xe/guc: Fix wrong assert about full_len
This variable holds full length of the message, including header
length so it should be checked against GUC_CTB_MSG_MAX_LEN.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:38 -05:00
Balasubramani Vivekanandan
8656ea9ae8 drm/xe: Add event tracing for CTB
Event tracing enabled for CTB submissions.

Additional minor refactor - Removed a unnecessary ct_to_xe() call.

v2: Remove a unwanted comment (Hari)
    Add missing change to commit message

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
Link: https://lore.kernel.org/intel-xe/20231019093140.1901665-2-balasubramani.vivekanandan@intel.com/
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:19 -05:00
Bommithi Sakeena
28b1d9155c drm/xe: Ensure mutex are destroyed
Add missing mutex_destroy calls to fini functions or convert to
drmm_mutex_init where fini function is not available.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Bommithi Sakeena <bommithi.sakeena@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:21 -05:00
Francois Dugast
c73acc1eeb drm/xe: Use Xe assert macros instead of XE_WARN_ON macro
The XE_WARN_ON macro maps to WARN_ON which is not justified
in many cases where only a simple debug check is needed.
Replace the use of the XE_WARN_ON macro with the new xe_assert
macros which relies on drm_*. This takes a struct drm_device
argument, which is one of the main changes in this commit. The
other main change is that the condition is reversed, as with
XE_WARN_ON a message is displayed if the condition is true,
whereas with xe_assert it is if the condition is false.

v2:
- Rebase
- Keep WARN splats in xe_wopcm.c (Matt Roper)

v3:
- Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:08 -05:00
Francois Dugast
5c0553cdc8 drm/xe: Replace XE_WARN_ON with drm_warn when just printing a string
Use the generic drm_warn instead of the driver-specific XE_WARN_ON
in cases where XE_WARN_ON is used to unconditionally print a debug
message.

v2: Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:07 -05:00
Matthew Auld
429d56a6b1 drm/xe/ct: fix resv_space print
Actually print the info.resv_space.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:27 -05:00
Francois Dugast
9b9529ce37 drm/xe: Rename engine to exec_queue
Engine was inappropriately used to refer to execution queues and it
also created some confusion with hardware engines. Where it applies
the exec_queue variable name is changed to q and comments are also
updated.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/162
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:20 -05:00
Matthew Brost
f82686ef74 drm/xe: remove header variable from parse_g2h_msg
The header variable is unused, remove it.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00
Francois Dugast
99fea68288 drm/xe: Prefer WARN() over BUG() to avoid crashing the kernel
Replace calls to XE_BUG_ON() with calls XE_WARN_ON() which in turn calls
WARN() instead of BUG(). BUG() crashes the kernel and should only be
used when it is absolutely unavoidable in case of catastrophic and
unrecoverable failures, which is not the case here.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00
Matthew Brost
8d7a91fe58 drm/xe: Remove ct->fence_context
This is unused, remove it.

Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Matthew Brost
757d9fdfe3 drm/xe: Remove XE_GUC_CT_SELFTEST
XE_GUC_CT_SELFTEST enabled a debugfs entry to which ran a very simple
selftest ensuring the GuC CT code worked. This was added before the
kunit framework was available and before submissions were working too.
This test isn't worth porting over to the kunit frame as if the GuC CT
didn't work, literally almost nothing would work so just remove this.

Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Matthew Auld
7eed01a926 drm/xe: drop xe_device_mem_access_get() from guc_ct_send
The callers should already be holding the mem_access reference, before
calling into this.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:36 -05:00
Matthew Auld
a00b8f1aae drm/xe: fix xe_device_mem_access_get() races
It looks like there is at least one race here, given that the
pm_runtime_suspended() check looks to return false if we are in the
process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED).  We
later also do xe_pm_runtime_get_if_active(), but since the device is
suspending or has now suspended, this doesn't do anything either.
Following from this we can potentially return from
xe_device_mem_access_get() with the device suspended or about to be,
leading to broken behaviour.

Attempt to fix this by always grabbing the runtime ref when our internal
ref transitions from 0 -> 1. The hard part is then dealing with the
runtime_pm callbacks also calling xe_device_mem_access_get() and
deadlocking, which the pm_runtime_suspended() check prevented.

v2:
 - ct->lock looks to be primed with fs_reclaim, so holding that and then
   allocating memory will cause lockdep to complain. Now that we
   unconditionally grab the mem_access.lock around mem_access_{get,put}, we
   need to change the ordering wrt to grabbing the ct->lock, since some of
   the runtime_pm routines can allocate memory (or at least that's what
   lockdep seems to suggest). Hopefully not a big deal.  It might be that
   there were already issues with this, just that the atomics where
   "hiding" the potential issues.
v3:
 - Use Thomas Hellström' idea with tracking the active task that is
   executing in the resume or suspend callback, in order to avoid
   recursive resume/suspend calls deadlocking on itself.
 - Split the ct->lock change.
v4:
 - Add smb_mb() around accessing the pm_callback_task for extra safety.
   (Thomas Hellström)
v5:
 - Clarify the kernel-doc for the mem_access.lock, given that it is quite
   strange in what it protects (data vs code). The real motivation is to
   aid lockdep. (Rodrigo Vivi)
v6:
 - Split out the lock change. We still want this as a lockdep aid but
   only for the xe_device_mem_access_get() path. Sticking a lock on the
   put() looks be a no-go, also the runtime_put() there is always async.
 - Now that the lock is gone move to atomics and rely on the pm code
   serialising multiple callers on the 0 -> 1 transition.
 - g2h_worker_func() looks to be the next issue, given that
   suspend-resume callbacks are using CT, so try to handle that.
v7:
 - Add xe_device_mem_access_get_if_ongoing(), and use it in
   g2h_worker_func().
v8 (Anshuman):
 - Just always grab the rpm, instead of just on the 0 -> 1 transition,
   which is a lot clearer and simplifies the code quite a bit.
v9:
 - Make sure we also adjust the CT fast-path with if-active.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Acked-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:35 -05:00
Francois Dugast
3e8e7ee6a3 drm/xe: Cleanup style warnings
Reduce the number of warnings reported by checkpatch.pl from 118 to 48 by
addressing those warnings types:

  LEADING_SPACE
  LINE_SPACING
  BRACES
  TRAILING_SEMICOLON
  CONSTANT_COMPARISON
  BLOCK_COMMENT_STYLE
  RETURN_VOID
  ONE_SEMICOLON
  SUSPECT_CODE_INDENT
  LINE_CONTINUATIONS
  UNNECESSARY_ELSE
  UNSPECIFIED_INT
  UNNECESSARY_INT
  MISORDERED_TYPE

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:31 -05:00
Matthew Auld
35c8a96439 drm/xe: handle TLB invalidations from CT fast-path
In various test cases that put the system under a heavy load, we can
sometimes see errors with missed TLB invalidations. In such cases we see
the interrupt arrive for the invalidation from the GuC, however the
actual processing of the completion is pushed onto a workqueue and
handled with all the other CT stuff, which might take longer than
expected. Since we expect TLB invalidations to complete within a
reasonable amount of time (at most ~250ms), and they do seem pretty
critical, allow handling directly from the CT fast-path.

v2 (José):
  - Actually use the correct spinlock/unlock_irq, since pending_lock is
    grabbed from IRQ.
v3:
  - Don't publish the TLB fence on the list until after we fully
    initialize it and successfully do the CT send. The list is now only
    protected by the spin_lock pending_lock and we can't hold that
    across the entire TLB send operation.
v4 (Matt Brost):
  - Be careful with racing against fast CT path writing the seqno,
    before we have actually published the fence.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/297
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/320
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/449
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:23 -05:00
Matthew Auld
0b688f9b28 drm/xe/ct: update g2h outstanding for CTB capture
Looks to always to be zero when inspecting the CTB dump.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:23 -05:00
Matthew Auld
a4d362bbed drm/xe/ct: serialise fast_lock during CT disable
The fast-path CT could be running as we enter a runtime-suspend or
potentially a GT reset, however here we only use the ct->fast_lock and
not the full ct->lock. Before disabling the CT, also serialise against
the fast_lock to ensure any in-progress work finishes before we start
nuking the CT related stuff. Once we disable ct->enabled and drop the
lock, any new work should fail gracefully, and anything that was in
progress should be finished.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:23 -05:00
Matthew Auld
dad33831d8 drm/xe/ct: hold fast_lock when reserving space for g2h
Reserving and checking for space on the g2h side relies on the
fast_lock, and not the CT lock since we need to release space from the
fast CT path. Make sure we hold it when checking for space and reserving
it. The main concern is calling __g2h_release_space() as we are reserving
something and since the info.space and info.g2h_outstanding operations
are not atomic we can get some nonsense values back.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:22 -05:00
Matthew Auld
c4bbc32e09 drm/xe: hold mem_access.ref for CT fast-path
Just checking xe_device_mem_access_ongoing() is not enough, we also need
to hold the reference otherwise the ref can transition from 1 -> 0 as we
enter g2h_read(), leading to warnings. While we can't do a full rpm sync
in the IRQ, we can keep the device awake if the ref is non-zero.
Introduce a new helper for this and set it to work in for the CT
fast-path.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:22 -05:00
Alan Previn
c7fac450dd drm/xe/guc: Fix h2g_write usage of GUC_CTB_MSG_MAX_LEN
In the ABI header, GUC_CTB_MSG_MIN_LEN is '1' because
GUC_CTB_HDR_LEN is 1. This aligns with H2G/G2H CTB specification
where all command formats are defined in units of dwords so that '1'
is a dword. Accordingly, GUC_CTB_MSG_MAX_LEN is 256-1 (i.e. 255
dwords). However, h2g_write was incorrectly assuming that
GUC_CTB_MSG_MAX_LEN was in bytes. Fix this.

v3: Fix nit on #define location.(Matt)
v2: By correctly treating GUC_CTB_MSG_MAX_LEN as dwords, it causes
    a local array to consume 4x the stack size. Rework the function
    to avoid consuming stack even if the action size is large. (Matt)

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:07 -05:00
Lucas De Marchi
90738d8665 drm/xe/guc: Fix typo s/enabled/enable/
Fix the log message when it fails to enable CT.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:34:14 -05:00
Matthew Brost
f1a5a9bf14 drm/xe/guc: Read HXG fields from DW1 of G2H response
The HXG fields are DW1 not DW0, fix this.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:35:20 -05:00
Matt Roper
876611c2b7 drm/xe: Memory allocations are tile-based, not GT-based
Since memory and address spaces are a tile concept rather than a GT
concept, we need to plumb tile-based handling through lots of
memory-related code.

Note that one remaining shortcoming here that will need to be addressed
before media GT support can be re-enabled is that although the address
space is shared between a tile's GTs, each GT caches the PTEs
independently in their own TLB and thus TLB invalidation should be
handled at the GT level.

v2:
 - Fix kunit test build.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-13-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:14 -05:00
Matthew Auld
565ce72e1c drm/xe: don't allocate under ct->lock
Seems to be a sensitive lock, where ct->lock looks to be primed with
fs_reclaim, so holding that and then allocating memory will cause
lockdep to complain. We need to change the ordering wrt to grabbing the
ct->lock and potentially grabbing the runtime_pm, since some of the
runtime_pm routines can allocate memory (or at least that's what lockdep
seems to suggest).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:09 -05:00
Rodrigo Vivi
513260dfd1 drm/xe: Convert GuC CT print to snapshot capture and print.
The goal is to allow for a snapshot capture to be taken at the time
of the crash, while the print out can happen at a later time through
the exposed devcoredump virtual device.

v2: Handle memory allocation failures. (Matthew)
    Do not use GFP_ATOMIC on cases like debugfs prints. (Matthew)
v3: checkpatch fixes
v4: Do not use atomic in the g2h_worker_func (Matthew)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-19 18:33:52 -05:00
Rodrigo Vivi
a7ca8157ec drm/xe: Extract non mapped regions out of GuC CTB into its own struct.
No functional change here. The goal is to have a clear split between
the mapped portions of the CTB and the static information, so we can
easily capture snapshots that will be used for later read out with
the devcoredump infrastructure.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-19 18:33:52 -05:00
Niranjana Vishwanathapura
2988cf02ee drm/xe: Fix memory use after free
The wait_event_timeout() on g2h_fence.wq which is declared on
stack can return before the wake_up() gets called, resulting in a
stack out of bound access when wake_up() accesses the g2h_fene.wq.

Do not declare g2h_fence related wait_queue_head_t on stack.

Fixes the below KASAN BUG and associated kernel crashes.

BUG: KASAN: stack-out-of-bounds in do_raw_spin_lock+0x6f/0x1e0
Read of size 4 at addr ffff88826252f4ac by task kworker/u128:5/467

CPU: 25 PID: 467 Comm: kworker/u128:5 Tainted: G  U 6.3.0-rc4-xe #1
Workqueue: events_unbound g2h_worker_func [xe]
Call Trace:
 <TASK>
 dump_stack_lvl+0x64/0xb0
 print_report+0xc2/0x600
 kasan_report+0x96/0xc0
 do_raw_spin_lock+0x6f/0x1e0
 _raw_spin_lock_irqsave+0x47/0x60
 __wake_up_common_lock+0xc0/0x150
 dequeue_one_g2h+0x20f/0x6a0 [xe]
 g2h_worker_func+0xa9/0x180 [xe]
 process_one_work+0x527/0x990
 worker_thread+0x2d1/0x640
 kthread+0x174/0x1b0
 ret_from_fork+0x29/0x50
 </TASK>

Tested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Lucas De Marchi
ea9f879d03 drm/xe: Sort includes
Sort includes and split them in blocks:

1) .h corresponding to the .c. Example: xe_bb.c should have a "#include
   "xe_bb.h" first.
2) #include <linux/...>
3) #include <drm/...>
4) local includes
5) i915 includes

This is accomplished by running
`clang-format --style=file -i --sort-includes drivers/gpu/drm/xe/*.[ch]`
and ignoring all the changes after the includes. There are also some
manual tweaks to split the blocks.

v2: Also sort includes in headers

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:20 -05:00
Matthew Brost
a9351846d9 drm/xe: Break of TLB invalidation into its own file
TLB invalidation is used by more than USM (page faults) so break this
code out into its own file.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:45 -05:00
Matthew Brost
5b64366087 drm/xe: Don't process TLB invalidation done in CT fast-path
We can't currently do this due to TLB invalidation done handler
expecting the seqno being received in-order, with the fast-path a TLB
invalidation done could pass one being processed in the slow-path in an
extreme corner case. Remove TLB invalidation done from the fast-path for
now and in a follow up reenable this once the TLB invalidation done
handler can deal with out of order seqno.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:45 -05:00
Matthew Brost
f900725af8 drm/xe/guc: s/xe_guc_send_mmio/xe_guc_mmio_send
Now aligns with the xe_guc_ct_send naming.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Philippe Lecluse <philippe.lecluse1@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-12 14:06:00 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00