thuge-gen.c defines SHM_HUGE_* macros that are provided by the uapi since
4.14. These macros get redefined when compiling with Android's bionic
because its sys/shm.h will import the uapi definitions.
However if linux/shm.h is included, with glibc, sys/shm.h will clash on
some struct definitions:
/usr/include/linux/shm.h:26:8: error: redefinition of ‘struct shmid_ds’
26 | struct shmid_ds {
| ^~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/bits/shm.h:45,
from /usr/include/x86_64-linux-gnu/sys/shm.h:30:
/usr/include/x86_64-linux-gnu/bits/types/struct_shmid_ds.h:24:8: note: originally defined here
24 | struct shmid_ds
| ^~~~~~~~
For now, guard the SHM_HUGE_* defines with ifndef to prevent redefinition
warnings on Android bionic.
Link: https://lkml.kernel.org/r/20240605223637.1374969-3-edliaw@google.com
Signed-off-by: Edward Liaw <edliaw@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
create_pagecache_thp_and_fd() in split_huge_page_test.c used the variable
dummy to perform mmap read.
However, this test was skipped even on XFS which has large folio support.
The issue was compiler (gcc 13.2.0) was optimizing out the dummy variable,
therefore, not creating huge page in the page cache.
Use asm volatile() trick to force the compiler not to optimize out the
loop where we read from the mmaped addr. This is similar to what is being
done in other tests (cow.c, etc)
As the variable is now used in the asm statement, remove the unused
attribute.
Link: https://lkml.kernel.org/r/20240606203619.677276-1-kernel@pankajraghav.com
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "cleanups, fixes, and progress towards avoiding "make
headers"", v3.
Eventually, once the build succeeds on a sufficiently old distro, the idea
is to delete $(KHDR_INCLUDES) from the selftests/mm build, and then after
that, from selftests/lib.mk and all of the other selftest builds.
For now, this series merely achieves a clean build of selftests/mm on a
not-so-old distro: Ubuntu 23.04. In other words, after this series is
applied, it is possible to delete $(KHDR_INCLUDES) from
selftests/mm/Makefile and the build will still succeed.
1. Add tools/uapi/asm/unistd_[32|x32|64].h files, which include
definitions of __NR_mseal, and include them (indirectly) from the files
that use __NR_mseal. The new files are copied from ./usr/include/asm,
which is how we have agreed to do this sort of thing, see [1].
2. Add fs.h, similarly created: it was copied directly from a snapshot
of ./usr/include/linux/fs.h after running "make headers".
3. Add a few selected prctl.h values that the ksm and mdwe tests require.
4. Factor out some common code from mseal_test.c and seal_elf.c, into
a new mseal_helpers.h file.
5. Remove local __NR_* definitions and checks.
[1] commit e076eaca59 ("selftests: break the dependency upon local
header files")
This patch (of 6):
The selftests/mm build isn't exactly "broken", according to the current
documentation, which still claims that one must run "make headers",
before building the kselftests. However, according to the new plan to
get rid of that requirement [1], they are future-broken: attempting to
build selftests/mm *without* first running "make headers" will fail due
to not finding __NR_mseal.
Therefore, include asm-generic/unistd.h, which has all of the system
call numbers that are needed, abstracted across the various CPU arches.
Some explanation in support of this "asm-generic" approach:
For most user space programs, the header file inclusion behaves as per
this microblaze example, which comes from David Hildenbrand (thanks!):
arch/microblaze/include/asm/unistd.h
-> #include <uapi/asm/unistd.h>
arch/microblaze/include/uapi/asm/unistd.h
-> #include <asm/unistd_32.h>
-> Generated during "make headers"
usr/include/asm/unistd_32.h is generated via
arch/microblaze/kernel/syscalls/Makefile with the syshdr command.
So we never end up including asm-generic/unistd.h directly on
microblaze... [2]
However, those programs are installed on a single computer that has a
single set of asm and kernel headers installed.
In contrast, the kselftests are quite special, because they must
provide a set of user space programs that:
a) Mostly avoid using the installed (distro) system header files.
b) Build (and run) on all supported CPU architectures
c) Occasionally use symbols that have so new that they have not
yet been included in the distro's header files.
Doing (a) creates a new problem: how to get a set of cross-platform
headers that works in all cases.
Fortunately, asm-generic headers solve that one. Which is why we need
to use them here--at least, for particularly difficult headers such as
unistd.h.
The reason this hasn't really come up yet, is that until now, the
kselftests requirement (which I'm trying to eventually remove) was that
"make headers" must first be run. That allowed the selftests to get a
snapshot of sufficiently new header files that looked just like (and
conflict with) the installed system headers.
And as an aside, this is also an improvement over past practices of
simply open-coding in a single (not per-arch) definition of a new
symbol, directly into the selftest code.
[1] commit e076eaca59 ("selftests: break the dependency upon local
header files")
[2] https://lore.kernel.org/all/0b152bea-ccb6-403e-9c57-08ed5e828135@redhat.com/
Link: https://lkml.kernel.org/r/20240618022422.804305-1-jhubbard@nvidia.com
Link: https://lkml.kernel.org/r/20240618022422.804305-2-jhubbard@nvidia.com
Fixes: 4926c7a52d ("selftest mm/mseal memory sealing")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Andrei Vagin <avagin@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 1b151e2435 ("block: Remove special-casing of compound pages")
caused a change in behaviour when releasing the pages if the buffer does
not start at the beginning of the page. This was because the calculation
of the number of pages to release was incorrect. This was fixed by commit
38b43539d6 ("block: Fix page refcounts for unaligned buffers in
__bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading to a
memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether the
offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
[donettom@linux.ibm.com: address review comments from Muhammad]
Link: https://lkml.kernel.org/r/20240604132801.23377-1-donettom@linux.ibm.com
[donettom@linux.ibm.com: add this test to run_vmtests.sh]
Link: https://lkml.kernel.org/r/20240607182000.6494-1-donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/20240523063905.3173-1-donettom@linux.ibm.com
Fixes: 38b43539d6 ("block: Fix page refcounts for unaligned buffers in __bio_release_pages()")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Restructure va_high_addr_switch".
The va_high_addr_switch memory selftest tests out some corner cases
related to allocation and page/hugepage faulting around the switch
boundary. Currently, the page size and hugepage size have been statically
defined. Post FEAT_LPA2, the Aarch64 Linux kernel adds support for 4k and
16k translation granules on higher addresses; we restructure the test to
support the same. In addition, we avoid invocation of the binary twice,
in the shell script, to reduce test noise.
This patch (of 2):
When invoking the binary with "--run-hugetlb" flag, the testcases
involving the base page are anyways going to be run. Therefore, remove
duplication by invoking the binary only once.
Link: https://lkml.kernel.org/r/20240522070435.773918-1-dev.jain@arm.com
Link: https://lkml.kernel.org/r/20240522070435.773918-2-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The openvswitch selftest is difficult to debug for anyone that isn't
directly familiar with the openvswitch module and the specifics of the
test cases. Many times when something fails, the debug log will be
sparsely populated and it takes some time to understand where a failure
occured.
Increase the amount of details logged to the debug log by trapping all
'info' logs, and all 'ovs_sbx' commands.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240702132830.213384-4-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Previously, the openvswitch.sh test suites would not attempt to autoload
the openvswitch module. The idea was that a user who is manually running
tests might not even have the OVS module loaded or configured for their
own development. However, if the kernel module is configured, and the
module can be autoloaded then we should just attempt to load it and run
the tests. This is especially true in the CI environments, where the CI
tests should be able to rely on auto loading to get the test suite running.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240702132830.213384-3-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As predicted by David running the test on a machine with a single
interface is a bit unreliable. We try to send 20k packets with
iperf and expect fewer than 10k packets on the default context.
The test isn't very quick, iperf will usually send 100k packets
by the time we stop it. So we're off by 5x on the number of iperf
packets but still expect default context to only get the hardcoded
10k. The intent is to make sure we get noticeably less traffic
on the default context. Use half of the resulting iperf traffic
instead of the hard coded 10k.
Link: https://patch.msgid.link/20240702233728.4183387-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull kselftest fixes from Shuah Khan:
"One single patch to fix the non-contiguous CBM resctrl:
- AMD supports non-contiguous CBM but does not report it via CPUID.
This test should not use CPUID on AMD to detect non-contiguous CBM
support. Fix the problem so the test uses CPUID to discover
non-contiguous CBM support only on Intel"
* tag 'linux_kselftest-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/resctrl: Fix non-contiguous CBM for AMD
Check that __sync_*() functions don't cause kernel panics when handling
freed arena pages.
x86_64 does not support some arena atomics yet, and aarch64 may or may
not support them, based on the availability of LSE atomics at run time.
Do not enable this test for these architectures for simplicity.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-12-iii@linux.ibm.com
Introduce dynamic adjustment capabilities for fill_size and comp_size
parameters to support larger batch sizes beyond the previous 2K limit.
Update HW_SW_MAX_RING_SIZE test cases to evaluate AF_XDP's robustness by
pushing hardware and software ring sizes to their limits. This test
ensures AF_XDP's reliability amidst potential producer/consumer throttling
due to maximum ring utilization.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240702055916.48071-3-tushar.vyavahare@intel.com
Pull cxl fixes from Dave Jiang:
- Fix no cxl_nvd during pmem region auto-assemble
- Avoid NULLL pointer dereference in region lookup
- Add missing checks to interleave capability
- Add cxl kdoc fix to address document compilation error
* tag 'cxl-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: documentation: add missing files to cxl driver-api
cxl/region: check interleave capability
cxl/region: Avoid null pointer dereference in region lookup
cxl/mem: Fix no cxl_nvd during pmem region auto-assembling
Building the sigaltstack test with GCC on 64-bit powerpc errors with:
gcc -Wall sas.c -o /home/michael/linux/.build/kselftest/sigaltstack/sas
In file included from sas.c:23:
current_stack_pointer.h:22:2: error: #error "implement current_stack_pointer equivalent"
22 | #error "implement current_stack_pointer equivalent"
| ^~~~~
sas.c: In function ‘my_usr1’:
sas.c:50:13: error: ‘sp’ undeclared (first use in this function); did you mean ‘p’?
50 | if (sp < (unsigned long)sstack ||
| ^~
This happens because GCC doesn't define __ppc__ for 64-bit builds, only
32-bit builds. Instead use __powerpc__ to detect powerpc builds, which
is defined by clang and GCC for 64-bit and 32-bit builds.
Fixes: 05107edc91 ("selftests: sigaltstack: fix -Wuninitialized")
Cc: stable@vger.kernel.org # v6.3+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240520062647.688667-1-mpe@ellerman.id.au
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next:
Patch #1 to #11 to shrink memory consumption for transaction objects:
struct nft_trans_chain { /* size: 120 (-32), cachelines: 2, members: 10 */
struct nft_trans_elem { /* size: 72 (-40), cachelines: 2, members: 4 */
struct nft_trans_flowtable { /* size: 80 (-48), cachelines: 2, members: 5 */
struct nft_trans_obj { /* size: 72 (-40), cachelines: 2, members: 4 */
struct nft_trans_rule { /* size: 80 (-32), cachelines: 2, members: 6 */
struct nft_trans_set { /* size: 96 (-24), cachelines: 2, members: 8 */
struct nft_trans_table { /* size: 56 (-40), cachelines: 1, members: 2 */
struct nft_trans_elem can now be allocated from kmalloc-96 instead of
kmalloc-128 slab.
Series from Florian Westphal. For the record, I have mangled patch #1
to add nft_trans_container_*() and use if for every transaction object.
I have also added BUILD_BUG_ON to ensure struct nft_trans always comes
at the beginning of the container transaction object. And few minor
cleanups, any new bugs are of my own.
Patch #12 simplify check for SCTP GSO in IPVS, from Ismael Luceno.
Patch #13 nf_conncount key length remains in the u32 bound, from Yunjian Wang.
Patch #14 removes unnecessary check for CTA_TIMEOUT_L3PROTO when setting
default conntrack timeouts via nfnetlink_cttimeout API, from
Lin Ma.
Patch #15 updates NFT_SECMARK_CTX_MAXLEN to 4096, SELinux could use
larger secctx names than the existing 256 bytes length.
Patch #16 adds a selftest to exercise nfnetlink_queue listeners leaving
nfnetlink_queue, from Florian Westphal.
Patch #17 increases hitcount from 255 to 65535 in xt_recent, from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
nolibc gained an implementation of strerror() recently.
Use it and drop the ifdeffery.
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
strerror() is commonly used.
For example in kselftest which currently needs to do an #ifdef NOLIBC to
handle the lack of strerror().
Keep it simple and reuse the output format of perror() for strerror().
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Some tests only make sense on nolibc. To avoid gaps in the test numbers
do to inline "#ifdef NOLIBC", add a condition to formally skip these
tests.
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
This implements what I was describing in [1]. When writing a test
author can schedule cleanup / undo actions right after the creation
completes, eg:
cmd("touch /tmp/file")
defer(cmd, "rm /tmp/file")
defer() takes the function name as first argument, and the rest are
arguments for that function. defer()red functions are called in
inverse order after test exits. It's also possible to capture them
and execute earlier (in which case they get automatically de-queued).
undo = defer(cmd, "rm /tmp/file")
# ... some unsafe code ...
undo.exec()
As a nice safety all exceptions from defer()ed calls are captured,
printed, and ignored (they do make the test fail, however).
This addresses the common problem of exceptions in cleanup paths
often being unhandled, leading to potential leaks.
There is a global action queue, flushed by ksft_run(). We could support
function level defers too, I guess, but there's no immediate need..
Link: https://lore.kernel.org/all/877cedb2ki.fsf@nvidia.com/ # [1]
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20240627185502.3069139-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add udelay() for x86 tests to allow busy waiting in the guest for a
specific duration, and to match ARM and RISC-V's udelay() in the hopes
of eventually making udelay() available on all architectures.
Get the guest's TSC frequency using KVM_GET_TSC_KHZ and expose it to all
VMs via a new global, guest_tsc_khz. Assert that KVM_GET_TSC_KHZ returns
a valid frequency, instead of simply skipping tests, which would require
detecting which tests actually need/want udelay(). KVM hasn't returned an
error for KVM_GET_TSC_KHZ since commit cc578287e3 ("KVM: Infrastructure
for software and hardware based TSC rate scaling"), which predates KVM
selftests by 6+ years (KVM_GET_TSC_KHZ itself predates KVM selftest by 7+
years).
Note, if the GUEST_ASSERT() in udelay() somehow fires and the test doesn't
check for guest asserts, then the test will fail with a very cryptic
message. But fixing that, e.g. by automatically handling guest asserts,
is a much larger task, and practically speaking the odds of a test afoul
of this wart are infinitesimally small.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/5aa86285d1c1d7fe1960e3fe490f4b22273977e6.1718214999.git.reinette.chatre@intel.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>