In commit 9632dfb0de ("drm/xe/vf: Don't run any save-restore
RTP actions if VF") we disabled processing of all RTP rules if
we were running as a VFs, since many of the RTP actions were
trying to access registers unaccessible for VFs.
This also included all of LRC WA rules, since some of them were
implemented in a way that required RMW pattern.
Now, as we can program LRC WAs without accessing such registers
from the driver, relying on the MI_MATH instruction instead, we
can unblock xe_rtp_process_to_sr() for VFs.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250311105221.1910-1-michal.wajdeczko@intel.com
There's a mismatch on API: while xe_rtp_process_to_sr() processes
entries until an entry without name, the active tracking with
xe_rtp_process_ctx_enable_active_tracking() needs to use the number of
elements. The number of elements is taken everywhere using ARRAY_SIZE(),
but that will have one entry too many. This leads to the following
warning, as reported by lkp:
drivers/gpu/drm/xe/xe_tuning.c: In function 'xe_tuning_dump':
>> include/drm/drm_print.h:228:31: warning: '%s' directive argument is null [-Wformat-overflow=]
228 | drm_printf((printer), "%.*s" fmt, (indent), "\t\t\t\t\tX", ##__VA_ARGS__)
| ^~~~~~
drivers/gpu/drm/xe/xe_tuning.c:226:17: note: in expansion of macro 'drm_printf_indent'
226 | drm_printf_indent(p, 1, "%s\n", engine_tunings[idx].name);
| ^~~~~~~~~~~~~~~~~
That's because it will still process the last entry when tracking the
active tunings. The same issue exists in the WAs. Change
xe_rtp_process_to_sr() to also take the number of elements so the empty
entry can be removed and the warning should go away. Fixing on the
active-tracking side would more fragile as the it would need a `- 1`
everywhere and continue to use a different approach for number of
elements.
Aside from the warning, it's a non-issue as there would always be enough
bits allocated and the last entry would never be active since
xe_rtp_process_to_sr() stops on the sentinel.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503021906.P2MwAvyK-lkp@intel.com/
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306-fix-print-warning-v1-1-979c3dc03c0d@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Now that rtp has OR rules, it's not needed to extend it to process OOB
WAs. Previously if an entry had no name, it was considered as "a set of
rules OR'ed with the last named entry".
Instead of generating new entries, add OR rules. The syntax for
xe_wa_oob.rules remains the same, with xe_gen_wa_oob generating the
slightly different table. Object sizes delta are negligible, but having
just one logic makes it easier to maintain:
add/remove: 0/0 grow/shrink: 1/2 up/down: 160/-269 (-109)
Function old new delta
__compound_literal 6104 6264 +160
xe_wa_dump 1839 1810 -29
oob_was 816 576 -240
Total: Before=17257, After=17148, chg -0.63%
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The OOB WAs use xe_rtp_process(), without passing an sr to save result
of the actions since there are none. They are also executed in a gt-only
context, making it harder to share the implementation. Thus, introduce a
new set of tests to check these RTP entries. The only check that can be
done is if the entry was marked as active.
Before commit fd6797ec50 ("drm/xe/rtp: Fix off-by-one when processing
rules") several of these tests were failing: the processing of OR'ed
entries would make the subsequent entry to be inadvertently enabled.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Gustavo noticed an odd "+ 2" in rtp_mark_active() while processing
rtp rules and pointed that it should be "+ 1". In fact, while processing
entries without actions (OOB workarounds), if the WA is activated and
has OR rules, it will also inadvertently activate the very next
workaround.
Test in a LNL B0 platform by moving 18024947630 on top of 16020292621,
makes the latter become active:
$ cat /sys/kernel/debug/dri/0/gt0/workarounds
...
OOB Workarounds
18024947630
16020292621
14018094691
16022287689
13011645652
22019338487_display
In future a kunit test will be added to cover the rtp checks for entries
without actions.
Fixes: fe19328b90 ("drm/xe/rtp: Add support for entries with no action")
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240726064337.797576-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Some workarounds started to depend on different set of conditions where
the action should be applied if any of them match. See e.g.
commit 24d0d98af1 ("drm/xe/xe2lpm: Fixup Wa_14020756599"). Add
XE_RTP_MATCH_OR that allows to implement a logical OR for the rules.
Normal precedence applies:
r1, r2, OR, r3
means
(r1 AND r2) OR r3
The check is shortcut as soon as a set of conditions match.
v2: Do not match on empty number of rules-other-than-OR evaluated
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240618050044.324454-4-lucas.demarchi@intel.com
It's not possible for the condition checking if we're running on
platform without geometry pipeline to ever be true, since
gt->fuse_topo.g_dss_mask is an array.
It also breaks the build:
../drivers/gpu/drm/xe/xe_rtp.c:183:50: error: address of array 'gt->fuse_topo.g_dss_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-2-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Using uninitialized variables leads to undefined behavior.
Moreover, it causes the compiler to complain with:
../drivers/gpu/drm/xe/xe_vm.c:3265:40: error: variable 'vma' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_rtp.c:118:36: error: variable 'i' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_mocs.c:449:3: error: variable 'flags' is uninitialized when used here [-Werror,-Wuninitialized]
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-1-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When running rules on MTL and beyond that have media as a standalone GT,
the rule should only match if the gt passed as parameter match the
version/range/stepping that the rule is checking. This allows
workarounds affecting only the media GT to be applied only on that GT
and vice-versa.
For platforms before MTL, the GT will not be of media type, even if it
includes media engines. Make sure to cover that case by checking if the
platforma has standalone media.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-21-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Start differentiating the media and graphics stepping as it will be
important for MTL. Note that RTP is still not prepared to handle the
different types of GT, i.e. checking for graphics version/range/stepping
on a media gt or vice versa still matches regardless of the gt being
passed as parameter. Changing it to accommodate MTL is left for a future
patch.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-10-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The selection between hwe and gt is exposed to the outside of rtp, by
the xe_rtp_process() function. However it doesn't make seense from the
caller point of view to pass a hwe and a gt as argument since the gt
should always be the one containing the hwe.
This clarifies the interface by separating the context creation into an
initializer. The initializer then passes the correct value and there
should never be a case with hwe and gt set: when hwe is passed, the gt
is the one containing it. Internally the functions continue receiving
the argument separately.
v2: Leave the device-only context to a separate patch if they are indeed
needed later
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Now that struct xe_reg and struct xe_reg_mcr are types that can be used
by xe, convert more of the driver to use them. Some notes about the
conversions:
- The RTP tables don't need the MASKED flags anymore in the
actions as that information now comes from the register
definition
- There is no need for the _XE_RTP_REG/_XE_RTP_REG_MCR macros
and the register types on RTP infra: that comes from the
register definitions.
- When declaring the RTP entries, there is no need anymore to
undef XE_REG and friends: the RTP macros deal with removing
the cast where needed due to not being able to use a compound
statement for initialization in the tables
- The index in the reg-sr xarray is the register offset only.
Otherwise we wouldn't catch mistakes about adding both a
MCR-style and normal-style registers. For that, the register
is now also part of the entry, so the options can be compared
to check for compatible entries.
In order to be able to accomplish this, some improvements are needed on
the RTP macros. Change its implementation to concentrate on "pasting a prefix
to each argument" rather than the more general "call any macro for each
argument". Hopefully this will avoid trying to extend this infra and
making it more complex. With the use of tuples for building the
arguments, it's not possible to pass additional register fields and
using xe_reg in the RTP tables.
xe_mmio_* still need to be converted, from u32 to xe_reg, but that is
left for another change.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-10-lucas.demarchi@intel.com
Link: https://lore.kernel.org/r/20230427223256.1432787-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add some basic unit tests for rtp. This is intended to prove the
functionality of the rtp itself, like coalescing entries, rejecting
non-disjoint values, etc.
Contrary to the other tests in xe, this is a unit test to test the
sw-side only, so it can be executed on any machine - it doesn't interact
with the real hardware. Running it produces the following output:
$ ./tools/testing/kunit/kunit.py run --raw_output-kunit \
--kunitconfig drivers/gpu/drm/xe/.kunitconfig xe_rtp
...
[01:26:27] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
KTAP version 1
# Subtest: xe_rtp
1..1
KTAP version 1
# Subtest: xe_rtp_process_tests
ok 1 coalesce-same-reg
ok 2 no-match-no-add
ok 3 no-match-no-add-multiple-rules
ok 4 two-regs-two-entries
ok 5 clr-one-set-other
ok 6 set-field
[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: no): ret=-22
ok 7 conflict-duplicate
[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000003, set: 00000000, masked: no): ret=-22
ok 8 conflict-not-disjoint
[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000002, set: 00000002, masked: no): ret=-22
[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: yes): ret=-22
ok 9 conflict-reg-type
# xe_rtp_process_tests: pass:9 fail:0 skip:0 total:9
ok 1 xe_rtp_process_tests
# Totals: pass:9 fail:0 skip:0 total:9
ok 1 xe_rtp
...
Note that the ERRORs in the kernel log are expected since it's testing
incompatible entries.
v2:
- Use parameterized table for tests (Michał Winiarski)
- Move everything to the xe_rtp_test.ko and only add a few exports to the
right namespace
- Add more tests to cover FIELD_SET, CLR, partially true rules, etc
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst<maarten.lankhorst@linux.intel.com> # v1
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-7-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add match helper to detect when the first gslice is fused off, as needed
by future workarounds.
v2:
- Add warning if called on a platform without geometry pipeline
(Matt Roper)
- Hardcode 4 as the number of gslices, which matches all the currently
supported platforms. PVC doesn't have geometry pipeline and
shouldn't use this function (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-2-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This allows to create WA/tuning rules that match the first engine that
is either of compute or render class. This matters for platforms that
don't have a render engine and that may have arbitrary compute engines
fused off: some register programming need to be added to one of those
engines.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Match functions are generally useful for other parts of the code (e.g.
xe_tuning.c). Move and rename the single one available to create a place
where similar match functions can be added.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Just like there is support for multiple rules per entry in an rtp table,
also support multiple actions. This makes it easier to add support for
workarounds that need to change multiple registers. It also makes it
slightly more readable as now the action part resembles the rule part.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Entry flags is meant for the whole entry, including the rule
evaluation. Action flags are for flags applied to the register or
action being taken. Since there's only one action per entry, the
distinction was not important and a u8 was spared. However more and more
workarounds are needing multiple actions. This prepares for multiple
action support.
Right now there are these action flags:
- XE_RTP_ACTION_FLAG_MASKED_REG: register in the action is a masked
register
- XE_RTP_ACTION_FLAG_ENGINE_BASE: the engine base should be added to
the register in order to form the real address
And this entry flag:
- XE_RTP_ENTRY_FLAG_FOREACH_ENGINE: the rules should be evaluated for
each engine on the gt. It also automatically implies
XE_RTP_ACTION_FLAG_ENGINE_BASE.
Since there are likely not that many rules, reduce n_rules to u8 so the
overall entry size doesn't increase more than needed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
It's true that the struct records the register and the value (in form of
2 masks) to restore, but it also records more fields important to
the application of workarounds/tuning, etc. One important part is what
is the macro used to record these fields: SET/CLR/WR/FIELD_SET/etc.
Thinking of the table as a set of rules + actions is more intuitive than
rules + regval.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).
The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).
The new Xe driver leverages a lot from i915.
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.
This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>