Implement multi-lrc submission via a single workqueue entry and single
H2G. The workqueue entry contains an updated tail value for each
request, of all the contexts in the multi-lrc submission, and updates
these values simultaneously. As such, the tasklet and bypass path have
been updated to coalesce requests into a single submission.
v2:
(John Harrison)
- s/wqe/wqi
- Use FIELD_PREP macros
- Add GEM_BUG_ONs ensures length fits within field
- Add comment / white space to intel_guc_write_barrier
(Kernel test robot)
- Make need_tasklet a static function
v3:
(Docs)
- A comment for submission_stall_reason
v4:
(Kernel test robot)
- Initialize return value in bypass tasklt submit function
(John Harrison)
- Add comment near work queue defs
- Add BUILD_BUG_ON to ensure WQ_SIZE is a power of 2
- Update write_barrier comment to talk about work queue
v5:
(John Harrison)
- Fix typo in work queue comment
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-13-matthew.brost@intel.com
Assign contexts in parent-child relationship consecutive guc_ids. This
is accomplished by partitioning guc_id space between ones that need to
be consecutive (1/16 available guc_ids) and ones that do not (15/16 of
available guc_ids). The consecutive search is implemented via the bitmap
API.
This is a precursor to the full GuC multi-lrc implementation but aligns
to how GuC mutli-lrc interface is defined - guc_ids must be consecutive
when using the GuC multi-lrc interface.
v2:
(Daniel Vetter)
- Explicitly state why we assign consecutive guc_ids
v3:
(John Harrison)
- Bring back in spin lock
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-11-matthew.brost@intel.com
Taking a PM reference to prevent intel_gt_wait_for_idle from short
circuiting while a deregister context H2G is in flight. To do this must
issue the deregister H2G from a worker as context can be destroyed from
an atomic context and taking GT PM ref blows up. Previously we took a
runtime PM from this atomic context which worked but will stop working
once runtime pm autosuspend in enabled.
So this patch is two fold, stop intel_gt_wait_for_idle from short
circuting and fix runtime pm autosuspend.
v2:
(John Harrison)
- Split structure changes out in different patch
(Tvrtko)
- Don't drop lock in deregister_destroyed_contexts
v3:
(John Harrison)
- Flush destroyed contexts before destroying context reg pool
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-3-matthew.brost@intel.com
This feature hands over the control of HW RC6 to the GuC.
GuC decides when to put HW into RC6 based on it's internal
busyness algorithms.
GuCRC needs GuC submission to be enabled, and only
supported on Gen12+ for now.
When GuCRC is enabled, do not set HW RC6. Use a H2G message
to tell GuC to enable GuCRC. When disabling RC6, tell GuC to
revert RC6 control back to KMD. KMD is still responsible for
enabling everything related to Coarse Power Gating though.
v2: Address comments (Michal W)
v3: Don't set hysterisis values when GuCRC is used (Matt Roper)
v4: checkpatch()
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-15-vinay.belgaumkar@intel.com
In the case of a full GPU reset (e.g. because GuC has died or because
GuC's hang detection has been disabled), the driver can't rely on GuC
reporting the guilty context. Instead, the driver needs to scan all
active contexts and find one that is currently executing, as per the
execlist mode behaviour. In GuC mode, this scan is different to
execlist mode as the active request list is handled very differently.
Similarly, the request state dump in debugfs needs to be handled
differently when in GuC submission mode.
Also refactured some of the request scanning code to avoid duplication
across the multiple code paths that are now replicating it.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com
Reset implementation for new GuC interface. This is the legacy reset
implementation which is called when the i915 owns the engine hang check.
Future patches will offload the engine hang check to GuC but we will
continue to maintain this legacy path as a fallback and this code path
is also required if the GuC dies.
With the new GuC interface it is not possible to reset individual
engines - it is only possible to reset the GPU entirely. This patch
forces an entire chip reset if any engine hangs.
v2:
(Michal)
- Check for -EPIPE rather than -EIO (CT deadlock/corrupt check)
v3:
(John H)
- Split into a series of smaller patches
v4:
(John H)
- Fix typo
- Add braces around if statements in reset code
v5:
(Checkpatch)
- Fix warnings
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <john.c.harrison@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-9-matthew.brost@intel.com
When running the GuC the GPU can't be considered idle if the GuC still
has contexts pinned. As such, a call has been added in
intel_gt_wait_for_idle to idle the UC and in turn the GuC by waiting for
the number of unpinned contexts to go to zero.
v2: rtimeout -> remaining_timeout
v3: Drop unnecessary includes, guc_submission_busy_loop ->
guc_submission_send_busy_loop, drop negatie timeout trick, move a
refactor of guc_context_unpin to earlier path (John H)
v4: Add stddef.h back into intel_gt_requests.h, sort circuit idle
function if not in GuC submission mode
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-16-matthew.brost@intel.com
With GuC scheduling, it isn't safe to unpin a context while scheduling
is enabled for that context as the GuC may touch some of the pinned
state (e.g. LRC). To ensure scheduling isn't enabled when an unpin is
done, a call back is added to intel_context_unpin when pin count == 1
to disable scheduling for that context. When the response CTB is
received it is safe to do the final unpin.
Future patches may add a heuristic / delay to schedule the disable
call back to avoid thrashing on schedule enable / disable.
v2:
(John H)
- s/drm_dbg/drm_err
(Daneiel)
- Clean up sched state function
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-9-matthew.brost@intel.com
Implement GuC context operations which includes GuC specific operations
alloc, pin, unpin, and destroy.
v2:
(Daniel Vetter)
- Use msleep_interruptible rather than cond_resched in busy loop
(Michal)
- Remove C++ style comment
v3:
(Matthew Brost)
- Drop GUC_ID_START
(John Harrison)
- Fix a bunch of typos
- Use drm_err rather than drm_dbg for G2H errors
(Daniele)
- Fix ;; typo
- Clean up sched state functions
- Add lockdep for guc_id functions
- Don't call __release_guc_id when guc_id is invalid
- Use MISSING_CASE
- Add comment in guc_context_pin
- Use shorter path to rpm
(Daniele / CI)
- Don't call release_guc_id on an invalid guc_id in destroy
v4:
(Daniel Vetter)
- Add FIXME comment
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-7-matthew.brost@intel.com
Implement GuC submission tasklet for new interface. The new GuC
interface uses H2G to submit contexts to the GuC. Since H2G use a single
channel, a single tasklet is used for the submission path.
Also the per engine interrupt handler has been updated to disable the
rescheduling of the physical engine tasklet, when using GuC scheduling,
as the physical engine tasklet is no longer used.
In this patch the field, guc_id, has been added to intel_context and is
not assigned. Patches later in the series will assign this value.
v2:
(John Harrison)
- Clean up some comments
v3:
(John Harrison)
- More comment cleanups
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-5-matthew.brost@intel.com
Add non blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.
The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.
The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.
v2:
(Michal)
- Use define for H2G room calculations
- Move INTEL_GUC_SEND_NB define
(Daniel Vetter)
- Use msleep_interruptible rather than cond_resched
v3:
(Michal)
- Move includes to following patch
- s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
v4:
(John H)
- Update comment, add type local variable
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-5-matthew.brost@intel.com
Drop the variable guc->interrupts.enabled as this variable is just
leading to bugs creeping into the code.
e.g. A full GPU reset disables the GuC interrupts but forgot to clear
guc->interrupts.enabled, guc->interrupts.enabled being true suppresses
interrupts from getting re-enabled and now we are broken.
It is harmless to enable interrupt while already enabled so let's just
delete this variable to avoid bugs like this going forward.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-7-matthew.brost@intel.com
Delete GuC code unused in future patches that rewrite the GuC interface
to work with the new firmware. Most of the code deleted relates to
workqueues or execlist port. The code is safe to remove because we still
don't allow GuC submission to be enabled, even when overriding the
modparam, so it currently can't be reached.
The defines + structs for the process descriptor and workqueue remain.
Although the new GuC interface does not require either of these for the
normal submission path multi-lrc submission does. The usage of the
process descriptor and workqueue for multi-lrc will be quite different
from the code that is deleted in this patch. A future patch will
implement multi-lrc submission.
v2: add a code in the commit message about the code being safe to
remove (Chris)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-2-daniele.ceraolospurio@intel.com
To be able to setup GuC submission functions during engine init we need
to commit to using GuC as soon as possible.
Currently, the only thing that can stop us from using the
microcontrollers once we've fetched the blobs is a fundamental
error (e.g. OOM); given that if we hit such an error we can't really
fall-back to anything, we can "officialize" the FW fetching completion
as the moment at which we're committing to using GuC.
To better differentiate this case, the uses_guc check, which indicates
that GuC is supported and was selected in modparam, is renamed to
wants_guc and a new uses_guc is introduced to represent the case were
we're committed to using the GuC. Note that uses_guc does still not imply
that the blob is actually loaded on the HW (is_running is the check for
that). Also, since we need to have attempted the fetch for the result
of uses_guc to be meaningful, we need to make sure we've moved away
from INTEL_UC_FIRMWARE_SELECTED.
All the GuC changes have been mirrored on the HuC for coherency.
v2: split fetch return changes and new macros to their own patches,
support HuC only if GuC is wanted, improve "used" state
description (Michal)
v3: s/wants_huc/uses_huc in uc_init_wopcm
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Fernando Pacheco <fernando.pacheco@intel.com> #v1
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200218223327.11058-6-daniele.ceraolospurio@intel.com
Instead of relying on the workqueue, the upcoming reworked GuC
submission flow will offer the host driver indipendent control over
the execution status of each context submitted to GuC. As part of this,
the doorbell usage model has been reworked, with each doorbell being
paired to a single lrc and a doorbell ring representing new work
available for that specific context. This mechanism, however, limits
the number of contexts that can be registered with GuC to the number of
doorbells, which is an undesired limitation. To avoid this limitation,
we requested the GuC team to also provide a H2G that will allow the host
to notify the GuC of work available for a specified lrc, so we can use
that mechanism instead of relying on the doorbells. We can therefore drop
the doorbell code we currently have, also given the fact that in the
unlikely case we'd want to switch back to using doorbells we'd have to
heavily rework it.
The workqueue will still have a use in the new interface to pass special
commands, so that code has been retained for now.
With the doorbells gone and the GuC client becoming even simpler, the
existing GuC selftests don't give us any meaningful coverage so we can
remove them as well. Some selftests might come with the new code, but
they will look different from what we have now so if doesn't seem worth
it to keep the file around in the meantime.
v2: fix comments and commit message (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205220243.27403-3-daniele.ceraolospurio@intel.com