workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism

Workqueue now automatically marks per-cpu work items that hog CPU for too
long as CPU_INTENSIVE, which excludes them from concurrency management and
prevents stalling other concurrency-managed work items. If a work function
keeps running over the thershold, it likely needs to be switched to use an
unbound workqueue.

This patch adds a debug mechanism which tracks the work functions which
trigger the automatic CPU_INTENSIVE mechanism and report them using
pr_warn() with exponential backoff.

v3: Documentation update.

v2: Drop bouncing to kthread_worker for printing messages. It was to avoid
    introducing circular locking dependency through printk but not effective
    as it still had pool lock -> wci_lock -> printk -> pool lock loop. Let's
    just print directly using printk_deferred().

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
This commit is contained in:
Tejun Heo
2023-05-17 17:02:08 -10:00
parent 616db8779b
commit 6363845005
3 changed files with 111 additions and 0 deletions

View File

@@ -1134,6 +1134,19 @@ config WQ_WATCHDOG
state. This can be configured through kernel parameter
"workqueue.watchdog_thresh" and its sysfs counterpart.
config WQ_CPU_INTENSIVE_REPORT
bool "Report per-cpu work items which hog CPU for too long"
depends on DEBUG_KERNEL
help
Say Y here to enable reporting of concurrency-managed per-cpu work
items that hog CPUs for longer than
workqueue.cpu_intensive_threshold_us. Workqueue automatically
detects and excludes them from concurrency management to prevent
them from stalling other per-cpu work items. Occassional
triggering may not necessarily indicate a problem. Repeated
triggering likely indicates that the work item should be switched
to use an unbound workqueue.
config TEST_LOCKUP
tristate "Test module to generate lockups"
depends on m