mirror of https://github.com/torvalds/linux.git (synced 2026-04-18 06:44:00 -04:00)

rv: Add nrp and sssw per-task monitors

Add 2 per-task monitors as part of the sched model:

* nrp: need-resched preempts
    Monitor to ensure preemption requires need resched.

* sssw: set state sleep and wakeup
    Monitor to ensure sched_set_state to sleepable leads to sleeping and
    sleeping tasks require wakeup.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/20250728135022.255578-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Acked-by: Nam Cao <namcao@linutronix.de>
Tested-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

committed by Steven Rostedt (Google)
parent d0096c2f9c
commit e8440a88e5

@@ -174,6 +174,173 @@ running one, no real task switch occurs but interrupts are disabled nonetheless:
| | irq_entry
+---------------+ irq_enable

Monitor nrp
-----------

The need resched preempts (nrp) monitor ensures preemption requires
``need_resched``. Only kernel preemption is considered, since preemption
while returning to userspace, for this monitor, is indistinguishable from
``sched_switch_yield`` (described in the sssw monitor).
A kernel preemption occurs whenever ``__schedule`` is called with the
preemption flag set to true (e.g. from ``preempt_enable`` or when exiting from
interrupts). This type of preemption occurs after the need for
``rescheduling`` has been set. This is not valid for the *lazy* variant of the
flag, which causes only userspace preemption.
A ``schedule_entry_preempt`` may or may not involve a task switch. In the
latter case, a task goes through the scheduler from a preemption context but
is picked as the next task to run. Since the scheduler runs, this clears the
need to reschedule. The ``any_thread_running`` state does not imply the
monitored task is not running, as this monitor does not track the outcome of
scheduling.

In theory, a preemption can only occur after the ``need_resched`` flag is set. In
practice, however, it is possible to see a preemption where the flag is not
set. This can happen in one specific condition::

  need_resched
                   preempt_schedule()
                                           preempt_schedule_irq()
                                                   __schedule()
  !need_resched
                           __schedule()

In the situation above, standard preemption starts (e.g. from
``preempt_enable`` when the flag is set), an interrupt occurs before
scheduling and, on its exit path, it schedules, which clears the
``need_resched`` flag.
When the preempted task runs again, the standard preemption started earlier
resumes, although the flag is no longer set. The monitor considers this a
nested preemption (the ``nested_preempt`` state), which allows another
preemption without re-setting the flag. This condition relaxes the monitor
constraints and may produce false negatives (i.e. the monitor may accept
preemptions that are not real nested preemptions), but it makes the monitor
more robust and able to validate other scenarios.
For simplicity, the monitor starts in ``preempt_irq``, although no interrupt
occurred, as the situation above is hard to pinpoint::

    schedule_entry
    irq_entry                 #===========================================#
  +-------------------------- H                                           H
  |                           H                                           H
  +-------------------------> H            any_thread_running             H
                              H                                           H
  +-------------------------> H                                           H
  |                           #===========================================#
  | schedule_entry              |                       ^
  | schedule_entry_preempt      | sched_need_resched    | schedule_entry
  |                             |                       | schedule_entry_preempt
  |                             v                       |
  |                           +----------------------+  |
  |                      +--- |                      |  |
  |   sched_need_resched |    |     rescheduling     | -+
  |                      +--> |                      |
  |                           +----------------------+
  |                             | irq_entry
  |                             v
  |                           +----------------------+
  |                           |                      | ---+
  |                      ---> |                      |    | sched_need_resched
  |                           |      preempt_irq     |    | irq_entry
  |                           |                      | <--+
  |                           |                      | <--+
  |                           +----------------------+    |
  |                             | schedule_entry          | sched_need_resched
  |                             | schedule_entry_preempt  |
  |                             v                         |
  |                           +-----------------------+   |
  +-------------------------- |    nested_preempt     | --+
                              +-----------------------+
                                ^ irq_entry         |
                                +-------------------+

Due to how the ``need_resched`` flag on the preemption count works on arm64,
this monitor is unstable on that architecture, as it often records preemption
when the flag is not set, even in the presence of the workaround above.
For the time being, the monitor is disabled by default on arm64.

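
The transitions in the nrp diagram above can also be read as a lookup table.
The following is a minimal Python sketch for experimenting with event traces,
not the kernel implementation: the ``NRP`` table is transcribed by hand from
the diagram, and the ``replay`` helper and the traces used with it are
hypothetical names introduced only for illustration.

.. code-block:: python

   # Transition table transcribed from the nrp diagram (illustrative only).
   NRP = {
       "any_thread_running": {
           "schedule_entry": "any_thread_running",
           "irq_entry": "any_thread_running",
           "sched_need_resched": "rescheduling",
       },
       "rescheduling": {
           "sched_need_resched": "rescheduling",
           "schedule_entry": "any_thread_running",
           "schedule_entry_preempt": "any_thread_running",
           "irq_entry": "preempt_irq",
       },
       "preempt_irq": {
           "sched_need_resched": "preempt_irq",
           "irq_entry": "preempt_irq",
           "schedule_entry": "nested_preempt",
           "schedule_entry_preempt": "nested_preempt",
       },
       "nested_preempt": {
           "irq_entry": "nested_preempt",
           "sched_need_resched": "preempt_irq",
           "schedule_entry": "any_thread_running",
           "schedule_entry_preempt": "any_thread_running",
       },
   }


   def replay(events, state="preempt_irq"):
       """Return the final state, or None at the first event the model rejects."""
       for event in events:
           state = NRP[state].get(event)
           if state is None:
               return None
       return state

For instance, replaying ``sched_need_resched``, ``irq_entry`` and two
``schedule_entry_preempt`` events from the initial ``preempt_irq`` state ends
in ``any_thread_running``: the second preemption is accepted through
``nested_preempt`` without the flag being set again. A
``schedule_entry_preempt`` seen in ``any_thread_running`` has no transition in
the table and is rejected.
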
Monitor sssw
------------

The set state sleep and wakeup (sssw) monitor ensures ``set_state`` to
sleepable leads to sleeping and sleeping tasks require wakeup. It includes the
following types of switch:

* ``switch_suspend``:
  a task puts itself to sleep. This can happen only after explicitly setting
  the task to ``sleepable``. After a task is suspended, it needs to be woken up
  (``waking`` state) before being switched in again.
  Setting the task's state to ``sleepable`` can be reverted before switching if
  it is woken up or set to ``runnable``.
* ``switch_blocking``:
  a special case of a ``switch_suspend`` where the task is waiting on a
  sleeping RT lock (``PREEMPT_RT`` only). It is common to see wakeup and set
  state events racing with each other, which leads the model to perceive this
  type of switch when the task is not set to sleepable. This is a limitation of
  the model on SMP systems, and workarounds may slow down the system.
* ``switch_preempt``:
  a task switch as a result of kernel preemption (``schedule_entry_preempt`` in
  the nrp model).
* ``switch_yield``:
  a task explicitly calls the scheduler or is preempted while returning to
  userspace. It can happen after a ``yield`` system call, from the idle task or
  if the ``need_resched`` flag is set. By definition, a task cannot yield while
  ``sleepable`` as that would be a suspension. A special case of a yield occurs
  when a task in ``TASK_INTERRUPTIBLE`` calls the scheduler while a signal is
  pending. The task doesn't go through the usual blocking/waking and is set
  back to runnable. The resulting switch (if any) looks like a yield to the
  ``signal_wakeup`` state and is followed by the signal delivery. From this
  state, the monitor expects a signal even if it sees a wakeup event. This is
  not strictly necessary, but it rules out false negatives.

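
The four switch types listed above can be summarised as a decision over what
the task was doing when it entered the scheduler. The following Python sketch
is only a reading aid under stated assumptions: ``classify_switch`` and its
boolean parameters are hypothetical and do not correspond to kernel symbols or
to how the monitor's handlers are actually written.

.. code-block:: python

   def classify_switch(kernel_preemption, sleepable,
                       rt_lock_wait=False, signal_pending=False):
       """Map a (hypothetical) description of a switch to its sssw event name."""
       if kernel_preemption:
           # Kernel preemption can hit a task in any desired state.
           return "switch_preempt"
       if rt_lock_wait:
           # PREEMPT_RT only: waiting on a sleeping RT lock. Because of races
           # this may be seen even when the task is not set to sleepable.
           return "switch_blocking"
       if sleepable and not signal_pending:
           # The task set itself sleepable and really goes to sleep.
           return "switch_suspend"
       # Explicit call to the scheduler, preemption on return to userspace,
       # or an interruptible sleep cut short by a pending signal.
       return "switch_yield"

The order of the checks mirrors the list: preemption is tested first because
it does not depend on the task's desired state, and the pending signal case
falls through to ``switch_yield`` as described for the ``signal_wakeup``
state.
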
This monitor doesn't include a running state: ``sleepable`` and ``runnable``
refer only to the task's desired state, and the task could be scheduled out
in either (e.g. due to preemption). However, it does include the event
``sched_switch_in`` to represent when a task is allowed to become running. This
can also be triggered by preemption, but cannot occur after the task got to
``sleeping`` before a ``wakeup`` occurs::

    +--------------------------------------------------------------------------+
    |                                                                          |
    |                                                                          |
    | switch_suspend           |                                               |
    | switch_blocking          |                                               |
    v                          v                                               |
  +----------+              #==========================#   set_state_runnable  |
  |          |              H                          H   wakeup              |
  |          |              H                          H   switch_in           |
  |          |              H                          H   switch_yield        |
  | sleeping |              H                          H   switch_preempt      |
  |          |              H                          H   signal_deliver      |
  |          |  switch_     H                          H ------+               |
  |          |  _blocking   H         runnable         H       |               |
  |          | <----------- H                          H <-----+               |
  +----------+              H                          H                       |
    |   wakeup              H                          H                       |
    +---------------------> H                          H                       |
                            H                          H                       |
                +---------> H                          H                       |
                |           #==========================#                       |
                |              |               ^                               |
                |              |               |   set_state_runnable          |
                |              |               |   wakeup                      |
                |    set_state_sleepable       |      +------------------------+
                |              v               |      |
                |           +--------------------------+  set_state_sleepable
                |           |                          |  switch_in
                |           |                          |  switch_preempt
         signal_deliver     |        sleepable         |  signal_deliver
                |           |                          | ------+
                |           |                          |       |
                |           |                          | <-----+
                |           +--------------------------+
                |              |               ^
                | switch_yield |               | set_state_sleepable
                |              v               |
                |           +---------------+  |
                +---------- | signal_wakeup | -+
                            +---------------+
                              ^           | switch_in
                              |           | switch_preempt
                              |           | switch_yield
                              +-----------+ wakeup

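
As with nrp, the sssw diagram above can be transcribed into a table to check
sample traces by hand. This is a minimal Python sketch and not the kernel
implementation: the ``SSSW`` table uses the short event names shown in the
diagram, the start state ``runnable`` is read from the diagram, and the
``first_violation`` helper is a hypothetical name introduced for illustration.

.. code-block:: python

   # Transition table transcribed from the sssw diagram (illustrative only).
   SSSW = {
       "runnable": {
           "set_state_runnable": "runnable",
           "wakeup": "runnable",
           "switch_in": "runnable",
           "switch_yield": "runnable",
           "switch_preempt": "runnable",
           "signal_deliver": "runnable",
           "set_state_sleepable": "sleepable",
           "switch_blocking": "sleeping",
       },
       "sleepable": {
           "set_state_sleepable": "sleepable",
           "switch_in": "sleepable",
           "switch_preempt": "sleepable",
           "signal_deliver": "sleepable",
           "set_state_runnable": "runnable",
           "wakeup": "runnable",
           "switch_suspend": "sleeping",
           "switch_blocking": "sleeping",
           "switch_yield": "signal_wakeup",
       },
       "sleeping": {
           "wakeup": "runnable",
       },
       "signal_wakeup": {
           "switch_in": "signal_wakeup",
           "switch_preempt": "signal_wakeup",
           "switch_yield": "signal_wakeup",
           "wakeup": "signal_wakeup",
           "signal_deliver": "runnable",
           "set_state_sleepable": "sleepable",
       },
   }


   def first_violation(events, state="runnable"):
       """Return the index of the first rejected event, or None if all pass."""
       for index, event in enumerate(events):
           next_state = SSSW[state].get(event)
           if next_state is None:
               return index
           state = next_state
       return None

A trace such as ``set_state_sleepable``, ``switch_suspend``, ``wakeup``,
``switch_in`` is accepted, while the same trace without the ``wakeup`` is
rejected at ``switch_in``, since a sleeping task must be woken up before it
can be switched in again.
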
References
----------
