On a 8-socket server the TSC is wrongly marked as 'unstable' and disabled
during boot time on about one out of 120 boot attempts:
clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns,
wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable
tsc: Marking TSC unstable due to clocksource watchdog
TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152)
clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896.
clocksource: Switched to clocksource hpet
The reason is that for platform with a large number of CPUs, there are
sporadic big or huge read latencies while reading the watchog/clocksource
during boot or when system is under stress work load, and the frequency and
maximum value of the latency goes up with the number of online CPUs.
The cCurrent code already has logic to detect and filter such high latency
case by reading the watchdog twice and checking the two deltas. Due to the
randomness of the latency, there is a low probabilty that the first delta
(latency) is big, but the second delta is small and looks valid. The
watchdog code retries the readouts by default twice, which is not
necessarily sufficient for systems with a large number of CPUs.
There is a command line parameter 'max_cswd_read_retries' which allows to
increase the number of retries, but that's not user friendly as it needs to
be tweaked per system. As the number of required retries is proportional to
the number of online CPUs, this parameter can be calculated at runtime.
Scale and enlarge the number of retries according to the number of online
CPUs and remove the command line parameter completely.
[ tglx: Massaged change log and comments ]
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jin Wang <jin1.wang@intel.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com
This commit does the long-overdue conversion of the parse-console.sh
file to use mktemp to create its temporary directory.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
This commit adds a --debug-info argument to kvm.sh in order to ease
interpretation of addresses printed on the console and the like.
This argument also disables KASLR.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
In torture.sh, the testing of refscale incorrectly used verbose_batched
as a kernel boot parameter, which causes this parameter to be passed
to the init process. This commit therefore prefixes it with refscale,
so that refscale.verbose_batched is passed to the kernel.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
When debugging, it can be difficult to quickly find the ftrace dump
within the console log, which in turn makes it difficult to process it
independent of the rest of the console output. This commit therefore
copies the contents of the buffers into its own file to make it easier
to locate and process the ftrace dump. The original ftrace dump is still
available in the console log in cases because it can be more convenient
to process it in situ, for example, for scripts that process console
output as well as ftrace-dump data.
Also handle the case of multiple ftrace dumps potentially showing up in the
log. Example for a file like [1], it will extract as [2].
[1]:
foo
foo
Dumping ftrace buffer:
---------------------------------
blah
blah
---------------------------------
more
bar
baz
Dumping ftrace buffer:
---------------------------------
blah2
blah2
---------------------------------
bleh
bleh
[2]:
Ftrace dump 1:
blah
blah
Ftrace dump 2:
blah2
blah2
[ paulmck: Fixed awk indentation, input up front. ]
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
This commit switches from the old "/tmp/kvm-recheck.sh.$$" approach to
the newer and now reliable "mktemp" approach.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Pull smp_call_function torture-test updates from Paul McKenney:
"This prevents some memory-exhaustion false-postitive failures in
scftorture testing"
* tag 'scftorture.2023.08.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
scftorture: Add CONFIG_PREEMPT_DYNAMIC=n to NOPREEMPT scenario
scftorture: Pause testing after memory-allocation failure
scftorture: Forgive memory-allocation failure if KASAN
torture: Scale scftorture memory based on number of CPUs
Currently, if the C program created by mkinitrd.sh has compile errors,
the errors are printed, but kvm.sh soldiers on, building kernels that
have init-less initrd setups. The kernels then fail on boot when they
attempt to mount non-existent root filesystems.
This commit therefore improves user friendliness by making mkinitrd.sh
return non-zero exit status on compile errors, which in turn causes kvm.sh
to take an early exit, with the compile errors still clearly visible.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit causes the init program generated by mkinitrd.sh dump out
its parameters. Although this is in some sense redundant given that
the kernel already dumps them out, confirmation can be a good thing.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit switches the qemu argument "-nographic" to "-display none",
aligning with the nolibc tests.
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds the __loongarch__, __loongarch_lp64, and
__loongarch_double_float targets to rcutorture's mkinitrd.sh
script in order to allow nolibc init programs for loongarch.
[ paulmck: Apply feedback from Feiyang Chen. ]
Cc: Feiyang Chen <chenfeiyang@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, the various torture tests sometimes react to an early-boot
bug by rebooting. This is almost always counterproductive, needlessly
consuming CPU time and bloating the console log. This commit therefore
adds the "-no-reboot" argument to qemu so that reboot requests will
cause qemu to exit.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds srcu_lockdep.sh to torture.sh, thus exercizing the
extended SRCU-aware lockdep-RCU functionality on a regular basis.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
KCSAN enables some Kconfig options unilaterally and unconditionally,
including CONFIG_PROVE_LOCKING. This in turn enables CONFIG_PROVE_RCU
and CONFIG_PREEMPT_COUNT, which conflicts with constraints in SRCU-T,
TRACE01, and TREE10, which in turn causes rcutorture to emit spurious
configuration complaints. This commit therefore forgives configuration
complaints involving CONFIG_PROVE_RCU and CONFIG_PREEMPT_COUNT.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
If some of the torture.sh runs had config and/or build errors, but all
runs for which kernels were built ran successfully to completion, then
torture.sh will incorrectly claim that all errors were KCSAN errors.
This commit therefore makes torture.sh print the number of runs with
config and build errors, and to refrain from claiming that all bugs were
KCSAN bugs in that case.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, the kernel boot parameters specified by the kvm.sh --bootargs
parameter are placed near the beginning of the -append list that is
passed to qemu. This means that in the not-uncommon case of a kernel
boot parameter where the last argument wins, the --bootargs list overrides
neither the list in the .boot file nor the additional parameters supplied
by the rcutorture scripting.
This commit therefore places the kernel boot parameters specified by
the kvm.sh --bootargs parameter at the end of qemu's -append list.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The mkinitrd.sh script no longer takes an argument, so this commit
therefore removes the code that checks for the parameter being present.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Currently, if the initial ssh fails, kvm-remote.sh gives up, printing a
message saying so. But it would be nice to get a better idea as to why
ssh failed. This commit therefore dumps out ssh's exit code, stdout,
and stderr upon ssh failure for diagnostic purposes.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds build tests of the individual RCU Tasks flavors in
order to detect inadvertent dependencies among the flavors.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, kvm-recheck.sh will print out any .config errors with messages
of the form:
:CONFIG_TASKS_TRACE_RCU=y: improperly set
However, if these are the only errors, the resulting exit code will
declare the run successful. This commit therefore causes kvm-recheck.sh
to record .config errors in the results directory in a file named
ConfigFragment.diags and also returns a non-zero error code in that case.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Testing building of a given RCU Tasks flavor with the other two
flavors disabled requires checking that the other two flavors are in
fact disabled. This commit therefore modifies the scripting to permit
things like "#CHECK#CONFIG_TASKS_TRACE_RCU=n" to be passed into the
kvm.sh script's --kconfig parameter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
In order to (for example) omit the real-time testing that torture.sh would
otherwise carry out, you put "--do-no-rt" on the torture.sh command line.
This works, but it is all too easy to instead type "--no-rt". This is
unambiguous and easier to type, so this commit therefore allows all
"--no-" arguments as synonyms for their "--do-no-" counterparts.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
As the number of CPUs increases, the number of outstanding no-wait
smp_call_function() handlers also increases, so that the default of
2G of memory is not always sufficient on 80-CPU systems. This commit
therefore scales the amount of memory specified to qemu based on the
number of CPUs specified to the scftorture test instance.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit prints out the CPU time consumed by the grace-period kthread,
if the specified RCU flavor supports this notion.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Pull mode documentation updates from Jonathan Corbet:
"A half-dozen late arriving docs patches. They are mostly fixes, but we
also have a kernel-doc tweak for enums and the long-overdue removal of
the outdated and redundant patch-submission comments at the top of the
MAINTAINERS file"
* tag 'docs-6.5-2' of git://git.lwn.net/linux:
scripts: kernel-doc: support private / public marking for enums
Documentation: KVM: SEV: add a missing backtick
Documentation: ACPI: fix typo in ssdt-overlays.rst
Fix documentation of panic_on_warn
docs: remove the tips on how to submit patches from MAINTAINERS
docs: fix typo in zh_TW and zh_CN translation
The kernel cmdline option panic_on_warn expects an integer, it is not a
plain option as documented. A number of uses in the tree figured this
already, and use panic_on_warn=1 for their purpose.
Adjust a comment which otherwise may mislead people in the future.
Fixes: 9e3961a097 ("kernel: add panic_on_warn")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The qemu argument -enable-kvm is duplicated because the qemu_args bash
variable in kvm-test-1-run.sh already provides it. This commit therefore
removes the ppc64-specific copy in functions.sh.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
This commit adds an srcu_lockdep.sh script that checks whether lockdep
correctly classifies SRCU-based, SRCU/mutex-based, and SRCU/rwsem-based
deadlocks.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ boqun: Fix "RCUTORTURE" with "$RCUTORTURE" ]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
This commit tests the "tsc=watchdog" kernel boot parameter when running
the clocksourcewd torture tests.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Currently, invoking kvm-again.sh without a --duration argument results
in a bash error message. This commit therefore adds quotes around the
$dur argument to kvm-transform.sh to allow a default duration to be
taken from the earlier run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Pull nolibc updates from Paul McKenney:
- Add s390 support
- Add support for the ARM Thumb1 instruction set
- Fix O_* flags definitions for open() and fcntl()
- Make errno a weak symbol instead of a static variable
- Export environ as a weak symbol
- Export _auxv as a weak symbol for auxilliary vector retrieval
- Implement getauxval() and getpagesize()
- Further improve self tests, including permitting userland testing of
the nolibc library
* tag 'nolibc.2023.02.06a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (28 commits)
selftests/nolibc: Add a "run-user" target to test the program in user land
selftests/nolibc: Support "x86_64" for arch name
selftests/nolibc: Add `getpagesize(2)` selftest
nolibc/sys: Implement `getpagesize(2)` function
nolibc/stdlib: Implement `getauxval(3)` function
tools/nolibc: add auxiliary vector retrieval for s390
tools/nolibc: add auxiliary vector retrieval for mips
tools/nolibc: add auxiliary vector retrieval for riscv
tools/nolibc: add auxiliary vector retrieval for arm
tools/nolibc: add auxiliary vector retrieval for arm64
tools/nolibc: add auxiliary vector retrieval for x86_64
tools/nolibc: add auxiliary vector retrieval for i386
tools/nolibc: export environ as a weak symbol on s390
tools/nolibc: export environ as a weak symbol on riscv
tools/nolibc: export environ as a weak symbol on mips
tools/nolibc: export environ as a weak symbol on arm
tools/nolibc: export environ as a weak symbol on arm64
tools/nolibc: export environ as a weak symbol on i386
tools/nolibc: export environ as a weak symbol on x86_64
tools/nolibc: make errno a weak symbol instead of a static one
...
Add the required values to identify_qemu() and
identify_bootimage().
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit upgrades the kvm.sh script's --kconfig parameter to accept
string-valued Kconfig options with double-quoted string values.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, the presence of any quoted-string Kconfig option in the
scenario files or the CFcommon file (aside from the special-cased
CONFIG_INITRAMFS_SOURCE option) will result in an "improperly set"
diagnostic. This commit updates configcheck.sh to strip double quotes
in order to permit string-valued Kconfig options to be handled correctly.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The latest version of grep is deprecating the egrep command, so that
its output contains warnings as follows:
egrep: warning: egrep is obsolescent; using grep -E
Fix this using "grep -E" instead.
sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/testing/selftests/rcutorture`
Here are the steps to install the latest grep:
wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
tar xf grep-3.8.tar.gz
cd grep-3.8 && ./configure && make
sudo make install
export PATH=/usr/local/bin:$PATH
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Under some conditions, a given run's vmlinux file will be compressed,
so that it is named vmlinux.xz rather than vmlinux. in such cases,
kvm-find-errors.sh will complain about the nonexistence of vmlinux.
This commit therefore causes kvm-find-errors.sh to check for vmlinux.xz
as well as for vmlinux.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, if the torture.sh allmodconfig step fails, this is counted as
an error (as it should be), but there is also an extraneous complaint
about a missing log file. This commit therefore adds that log file,
which is hoped to reduce confused reactions to the error report.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, torture.sh will compress the vmlinux files for KASAN and
KCSAN runs. But it will compress all of the files, including those
copied verbatim by the kvm-again.sh script. Compression takes around ten
minutes, so this is not a good thing. This commit therefore compresses
only one of a given set of identical vmlinux files, and then hard-links
it to the directories produced by kvm-again.sh.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit causes torture.sh to use the new --bootargs and --datestamp
parameters to kvm-again.sh in order to avoid redundant kernel builds
during rcuscale and refscale testing. This trims the better part of an
hour off of torture.sh runs that use --do-kasan.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds a --datestamp parameter to kvm-again.sh, which, in
contrast to the existing --rundir argument, specifies only the last
segments of the pathname. This addition enables torture.sh to use
kvm-again.sh in order to avoid redundant kernel builds.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
As it should, the kvm-recheck.sh script sets the TORTURE_SUITE bash
variable based on the type of rcutorture test being run. However,
it does not export it. Which is OK, at least until you try running
kvm-again.sh on either a rcuscale or a refscale test, at which point you
get false-positive "no success message, N successful version messages"
errors. This commit therefore causes the kvm-recheck.sh script to export
TORTURE_SUITE, suppressing these false positives.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kvm-again.sh script, when running locally, can place the QEMU output
into kvm-test-1-run-qemu.sh.out instead of kvm-test-1-run.sh.out. This
commit therefore makes kvm-test-1-run-qemu.sh check both locations.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit drags the rcutorture scripting kicking and screaming into the
twenty-first century by making use of the BSD-derived mktemp command to
create temporary files and directories. In happy contrast to many of its
ill-behaved predecessors, mktemp seems to actually work reasonably reliably!
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kvm-again.sh script can be used to repeat short boot-time tests,
but the kernel boot arguments cannot be changed. This means that every
change in kernel boot arguments currently necessitates a kernel build,
which greatly increases the duration of kernel-boot testing.
This commit therefore adds a --bootargs parameter to kvm-again.sh,
which allows a given kernel to be repeatedly booted, but overriding
old and adding new kernel boot parameters. This allows an old kernel
to be booted with new kernel boot parameters, avoiding the overhead of
rebuilding the kernel under test.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, kvm-check-branches.sh causes each kvm.sh invocation create a
separate date-stamped directory, then after that invocation completes,
moves it into the *-group/NNNN directory. This works, but makes it more
difficult to monitor an ongoing run. This commit therefore uses the
kvm.sh --datestamp argument to make kvm.sh put the output in the right
place to start with, and also dispenses with the additional level of
datestamping. (Those wanting datestamps can find them in the log files.)
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
A recent change to the DEBUG_INFO Kconfig option means that simply adding
CONFIG_DEBUG_INFO=y to the .config file and running "make oldconfig" no
longer works. It is instead necessary to add CONFIG_DEBUG_INFO_NONE=n
and (for example) CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y.
This combination will then result in CONFIG_DEBUG_INFO being selected.
This commit therefore updates the Kconfig options produced in response
to the kvm.sh --gdb, --kasan, and --kcsan Kconfig options.
Fixes: f9b3cd2457 ("Kconfig.debug: make DEBUG_INFO selectable from a choice")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>