As more KUnit tests are introduced to evaluate the basic capabilities of
the `timedout_job()` hook, the test suite will continue to increase in
duration. To reduce the overall running time of the test suite, decrease
the scheduler's timeout for the timeout tests.
Before this commit:
[15:42:26] Elapsed time: 15.637s total, 0.002s configuring, 10.387s building, 5.229s running
After this commit:
[15:45:26] Elapsed time: 9.263s total, 0.002s configuring, 5.168s building, 4.037s running
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Acked-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-3-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Among the scheduler's statuses, the only one that indicates an error is
DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV
signifies that the operation succeeded and the GPU is in a nominal state.
However, to provide more information about the GPU's status, it is needed
to convey more information than just "OK".
Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this
status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has
hung, but it has been successfully reset and is now in a nominal state
again.
Reviewed-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
The GPU Scheduler now supports a new callback, cancel_job(), which lets
the scheduler cancel all jobs which might not yet be freed when
drm_sched_fini() runs. Using this callback allows for significantly
simplifying the mock scheduler teardown code.
Implement the cancel_job() callback and adjust the code where necessary.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250710125412.128476-5-phasta@kernel.org
Since the drm_mock_scheduler does not have real users in userspace, nor
does it have real hardware or firmware rings, it's not necessary to
signal timedout fences nor free jobs - from a functional standpoint.
Still, the dma_fence framework establishes the hard rule that all fences
must always get signaled.
The unit tests, moreover, should as much as possible represent the
intended usage of the scheduler API.
Furthermore, this later enables simplifying the mock scheduler's
teardown code path.
Make sure timed out hardware fences get signaled with the appropriate
error code.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250605134154.191764-2-phasta@kernel.org
There is no need for separate locks for single jobs and the entire
scheduler. The dma_fence context can be protected by the scheduler lock,
allowing for removing the jobs' locks. This simplifies things and
reduces the likelyhood of deadlocks etc.
Replace the jobs' locks with the mock scheduler lock.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250527101029.56491-2-phasta@kernel.org
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.
This requires changing all the users of drm_sched_job_init to
add an extra parameter.
The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One suggestion I received was to
merge those 2 fields - this can't be done right now as amdgpu uses
some special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; when it's done we should be able to squash owner/drm_client_id
together.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
Backmerging to get v6.15-rc1 into drm-misc-next. Also fixes a
build issue when enabling CONFIG_DRM_SCHED_KUNIT_TEST.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>