workqueue: add test_workqueue benchmark module

Add a kernel module that benchmarks queue_work() throughput on an
unbound workqueue to measure pool->lock contention under different
affinity scope configurations (e.g., cache vs cache_shard).

The module spawns N kthreads (default: num_online_cpus()), each bound
to a different CPU. All threads start simultaneously and queue work
items, measuring the latency of each queue_work() call. For each
affinity scope, the results are reported as throughput (items/sec)
and p50/p90/p95 latencies in nanoseconds.
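
As an illustration of how p50/p90/p95 figures can be derived from
per-call latency samples, here is a minimal user-space sketch. It
assumes a nearest-rank percentile over a sorted array; the function
names (sort_latencies, percentile) are hypothetical and the module
itself may use a different rounding rule or data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparator for qsort(): orders 64-bit nanosecond samples ascending. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/* Sort the samples in place so percentiles can be read by index. */
void sort_latencies(uint64_t *ns, size_t n)
{
	qsort(ns, n, sizeof(*ns), cmp_u64);
}

/*
 * Nearest-rank percentile on an already sorted array.
 * Assumption: this rounding rule is illustrative, not taken from
 * the module.
 */
uint64_t percentile(const uint64_t *sorted, size_t n, unsigned int pct)
{
	size_t rank;

	assert(n > 0 && pct > 0 && pct <= 100);
	rank = (n * pct + 99) / 100;	/* ceil(n * pct / 100) */
	return sorted[rank - 1];
}
```

Sorting once and indexing three times keeps the reporting cost at a
single O(n log n) pass per run, regardless of how many percentiles
are printed.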

The affinity scope is switched between runs via the workqueue's sysfs
affinity_scope attribute (WQ_SYSFS), avoiding the need for any new
exported symbols.
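
For reference, the same attribute can be written from user space. The
sketch below assumes the documented WQ_SYSFS location under
/sys/devices/virtual/workqueue/; the helper names and any workqueue
name passed in are hypothetical, and this is not how the module
performs the switch from kernel space.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Directory where WQ_SYSFS workqueues expose their attributes. */
#define WQ_SYSFS_DIR "/sys/devices/virtual/workqueue"

/*
 * Build the affinity_scope path for a workqueue name.
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
int wq_scope_path(char *buf, size_t len, const char *wq_name)
{
	int n;

	assert(buf && wq_name);
	n = snprintf(buf, len, "%s/%s/affinity_scope", WQ_SYSFS_DIR, wq_name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a scope name such as "cache" to the attribute at path.
 * Returns 0 on success, -1 on any error (typically needs root).
 */
int wq_set_scope(const char *path, const char *scope)
{
	FILE *f;
	int ok;

	if (strlen(scope) == 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(scope, f) >= 0;
	if (fclose(f) != 0)	/* sysfs write errors can surface at close */
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the path once with wq_scope_path() and then call
wq_set_scope() with each scope name before starting a run.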

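The benchmark runs entirely from the module's __init function, which
returns -EAGAIN so that the module is unloaded automatically once the
run completes. Running insmod again repeats the benchmark.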

Example of the output:

 running 50 threads, 50000 items/thread

   cpu              6806017 items/sec p50=2574    p90=5068    p95=5818 ns
   smt              6821040 items/sec p50=2624    p90=5168    p95=5949 ns
   cache_shard      1633653 items/sec p50=5337    p90=9694    p95=11207 ns
   cache            286069 items/sec p50=72509    p90=82304   p95=85009 ns
   numa             319403 items/sec p50=63745    p90=73480   p95=76505 ns
   system           308461 items/sec p50=66561    p90=75714   p95=78048 ns

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Breno Leitao
2026-04-01 06:03:56 -07:00
committed by Tejun Heo
parent 738390a532
commit 24b2e73f97
3 changed files with 305 additions and 0 deletions


@@ -2654,6 +2654,16 @@ config TEST_VMALLOC
 	  If unsure, say N.
 
+config TEST_WORKQUEUE
+	tristate "Test module for stress/performance analysis of workqueue"
+	default n
+	help
+	  This builds the "test_workqueue" module for benchmarking
+	  workqueue throughput under contention. Useful for evaluating
+	  affinity scope changes (e.g., cache_shard vs cache).
+
+	  If unsure, say N.
+
 config TEST_BPF
 	tristate "Test BPF filter functionality"
 	depends on m && NET