Mirror of https://github.com/torvalds/linux.git, synced 2026-04-18 14:53:58 -04:00
watchdog: softlockup: panic when lockup duration exceeds N thresholds
The softlockup_panic sysctl is currently a binary option: panic
immediately or never panic on soft lockups.

Panicking on any soft lockup, regardless of duration, can be overly
aggressive for brief stalls that may be caused by legitimate operations.
Conversely, never panicking may allow severe system hangs to persist
undetected.

Extend softlockup_panic to accept an integer threshold, allowing the
kernel to panic only when the normalized lockup duration exceeds N
watchdog threshold periods. This provides finer-grained control to
distinguish between transient delays and persistent system failures.

The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic when duration >= 1 * threshold (20s default, original behavior)
- N > 1: Panic when duration >= N * threshold (e.g., 2 = 40s, 3 = 60s)

The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility while allowing systems to tolerate brief lockups
while still catching severe, persistent hangs.

[lirongqing@baidu.com: v2]
Link: https://lkml.kernel.org/r/20251218074300.4080-1-lirongqing@baidu.com
Link: https://lkml.kernel.org/r/20251216074521.2796-1-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Committed by: Andrew Morton
Parent: b5bfcc1ffe
Commit: e700f5d156
@@ -1110,13 +1110,14 @@ config SOFTLOCKUP_DETECTOR_INTR_STORM
 	  the CPU stats and the interrupt counts during the "soft lockups".
 
 config BOOTPARAM_SOFTLOCKUP_PANIC
-	bool "Panic (Reboot) On Soft Lockups"
+	int "Panic (Reboot) On Soft Lockups"
 	depends on SOFTLOCKUP_DETECTOR
+	default 0
 	help
-	  Say Y here to enable the kernel to panic on "soft lockups",
-	  which are bugs that cause the kernel to loop in kernel
-	  mode for more than 20 seconds (configurable using the watchdog_thresh
-	  sysctl), without giving other tasks a chance to run.
+	  Set to a non-zero value N to enable the kernel to panic on "soft
+	  lockups", which are bugs that cause the kernel to loop in kernel
+	  mode for more than (N * 20 seconds) (configurable using the
+	  watchdog_thresh sysctl), without giving other tasks a chance to run.
 
 	  The panic can be used in combination with panic_timeout,
 	  to cause the system to reboot automatically after a
@@ -1124,7 +1125,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC
 	  high-availability systems that have uptime guarantees and
 	  where a lockup must be resolved ASAP.
 
-	  Say N if unsure.
+	  Say 0 if unsure.
 
 config HAVE_HARDLOCKUP_DETECTOR_BUDDY
 	bool
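The threshold rule described in the commit message (panic when duration >= N * threshold, never panic when N is 0) can be modelled in a few lines of userspace C. This is only a sketch of the decision, not the kernel's watchdog code: the function names, parameter names, and the use of whole seconds are assumptions made for illustration, and the 20 second threshold is the default quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. The function and parameter names are hypothetical
 * and do not come from the kernel source.
 *
 * panic_setting: value of softlockup_panic (0 disables the panic)
 * duration_s:    how long the CPU has been stuck, in seconds
 * threshold_s:   soft lockup threshold in seconds (20 by default)
 */
bool softlockup_should_panic(unsigned int panic_setting,
                             unsigned int duration_s,
                             unsigned int threshold_s)
{
	if (panic_setting == 0)
		return false;

	/* Widen before multiplying so that a large N cannot overflow. */
	return (unsigned long long)duration_s >=
	       (unsigned long long)panic_setting * threshold_s;
}

/* Spot checks taken from the values listed in the commit message. */
void softlockup_model_self_check(void)
{
	assert(!softlockup_should_panic(0, 600, 20)); /* 0: never panic */
	assert(softlockup_should_panic(1, 20, 20));   /* 1: panic at 20s */
	assert(!softlockup_should_panic(2, 39, 20));  /* 2: 39s is tolerated */
	assert(softlockup_should_panic(2, 40, 20));   /* 2: panic at 40s */
	assert(softlockup_should_panic(3, 60, 20));   /* 3: panic at 60s */
}
```

In this model a setting of 1 reproduces the old boolean behavior, since any lockup that reaches the threshold at all also satisfies duration >= 1 * threshold.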