Commit Graph

3632 Commits

Author SHA1 Message Date
Jakub Kicinski
c49b292d03 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2023-12-18

This PR is larger than usual and contains changes in various parts
of the kernel.

The main changes are:

1) Fix kCFI bugs in BPF, from Peter Zijlstra.

End result: all forms of indirect calls from BPF into kernel
and from kernel into BPF work with CFI enabled. This allows BPF
to work with CONFIG_FINEIBT=y.

2) Introduce BPF token object, from Andrii Nakryiko.

It adds an ability to delegate a subset of BPF features from privileged
daemon (e.g., systemd) through special mount options for userns-bound
BPF FS to a trusted unprivileged application. The design accommodates
suggestions from Christian Brauner and Paul Moore.

Example:
$ sudo mkdir -p /sys/fs/bpf/token
$ sudo mount -t bpf bpffs /sys/fs/bpf/token \
             -o delegate_cmds=prog_load:MAP_CREATE \
             -o delegate_progs=kprobe \
             -o delegate_attachs=xdp

3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei.

 - Complete precision tracking support for register spills
 - Fix verification of possibly-zero-sized stack accesses
 - Fix access to uninit stack slots
 - Track aligned STACK_ZERO cases as imprecise spilled registers.
   It improves the verifier "instructions processed" metric from single
   digit to 50-60% for some programs.
 - Fix verifier retval logic

4) Support for VLAN tag in XDP hints, from Larysa Zaremba.

5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu.

End result: better memory utilization and lower I$ miss for calls to BPF
via BPF trampoline.

6) Fix race between BPF prog accessing inner map and parallel delete,
from Hou Tao.

7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu.

It allows BPF to interact with IPSEC infra. The intent is to support
software RSS (via XDP) for the upcoming ipsec pcpu work.
Experiments on AWS demonstrate single tunnel pcpu ipsec reaching
line rate on 100G ENA nics.

8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao.

9) BPF file verification via fsverity, from Song Liu.

It allows BPF progs to get fsverity digest.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits)
  bpf: Ensure precise is reset to false in __mark_reg_const_zero()
  selftests/bpf: Add more uprobe multi fail tests
  bpf: Fail uprobe multi link with negative offset
  selftests/bpf: Test the release of map btf
  s390/bpf: Fix indirect trampoline generation
  selftests/bpf: Temporarily disable dummy_struct_ops test on s390
  x86/cfi,bpf: Fix bpf_exception_cb() signature
  bpf: Fix dtor CFI
  cfi: Add CFI_NOSEAL()
  x86/cfi,bpf: Fix bpf_struct_ops CFI
  x86/cfi,bpf: Fix bpf_callback_t CFI
  x86/cfi,bpf: Fix BPF JIT call
  cfi: Flip headers
  selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
  selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
  selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
  bpf: Limit the number of kprobes when attaching program to multiple kprobes
  bpf: Limit the number of uprobes when attaching program to multiple uprobes
  bpf: xdp: Register generic_kfunc_set with XDP programs
  selftests/bpf: utilize string values for delegate_xxx mount options
  ...
====================

Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18 16:46:08 -08:00
Andrii Nakryiko
8e432e6197 bpf: Ensure precise is reset to false in __mark_reg_const_zero()
It is safe to always start with imprecise SCALAR_VALUE register.
Previously __mark_reg_const_zero() relied on caller to reset precise
mark, but it's very error prone and we already missed it in a few
places. So instead make __mark_reg_const_zero() reset precision always,
as it's a safe default for SCALAR_VALUE. Explanation is basically the
same as for why we are resetting (or rather not setting) precision in
current state. If necessary, precision propagation will set it to
precise correctly.

As such, also remove a big comment about forward precision propagation
in mark_reg_stack_read() and avoid unnecessarily setting precision to
true after reading from STACK_ZERO stack. Again, precision propagation
will correctly handle this, if that SCALAR_VALUE register ever needs
to be precise.

Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231218173601.53047-1-andrii@kernel.org
2023-12-18 23:54:21 +01:00
Jiri Olsa
f17d1a18a3 selftests/bpf: Add more uprobe multi fail tests
We fail to create uprobe if we pass negative offset. Add more tests
validating kernel-side error checking code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20231217215538.3361991-3-jolsa@kernel.org
2023-12-18 09:51:50 -08:00
Hou Tao
e58aac1a9a selftests/bpf: Test the release of map btf
When there is a bpf_list_head or bpf_rb_root field in the map value, the
free of map btf and the free of map value may run concurrently and there
may be a use-after-free problem, so add two test cases to demonstrate it.
The use-after-free problem can be easily reproduced by using bpf_next
tree and a KASAN-enabled kernel.

The first test case tests the racing between the free of map btf and the
free of array map. It constructs the racing by releasing the array map in
the end after other ref-counter of map btf has been released. To delay
the free of array map and make it be invoked after btf_free_rcu() is
invoked, it stresses system_unbound_wq by closing multiple percpu array
maps before it closes the array map.

The second case tests the racing between the free of map btf and the
free of inner map. Besides using a method similar to the first one,
it uses bpf_map_delete_elem() to delete the inner map and to defer
the release of inner map after one RCU grace period.

The reason for using two skeletons is to prevent the release of outer
map and inner map in map_in_map_btf.c from interfering with the release
of bpf map in normal_map_btf.c.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20231216035510.4030605-1-houtao@huaweicloud.com
2023-12-18 18:15:49 +01:00
Alexei Starovoitov
0c970ed2f8 s390/bpf: Fix indirect trampoline generation
The func_addr used to be NULL for indirect trampolines used by struct_ops.
Now func_addr is a valid function pointer.
Hence use BPF_TRAMP_F_INDIRECT flag to detect such condition.

Fixes: 2cd3e3772e ("x86/cfi,bpf: Fix bpf_struct_ops CFI")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20231216004549.78355-1-alexei.starovoitov@gmail.com
2023-12-18 12:00:37 +01:00
Alexei Starovoitov
42d45c4562 selftests/bpf: Temporarily disable dummy_struct_ops test on s390
Temporarily disable dummy_struct_ops test on s390.
The breakage is likely due to
commit 2cd3e3772e ("x86/cfi,bpf: Fix bpf_struct_ops CFI").

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-15 16:28:25 -08:00
Jens Axboe
ae1914174a cred: get rid of CONFIG_DEBUG_CREDENTIALS
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.

Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15 14:19:48 -08:00
Hou Tao
1467affd16 selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
If an abnormally huge cnt is used for multi-kprobes attachment, the
following warning will be reported:

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 392 at mm/util.c:632 kvmalloc_node+0xd9/0xe0
  Modules linked in: bpf_testmod(O)
  CPU: 1 PID: 392 Comm: test_progs Tainted: G ...... 6.7.0-rc3+ #32
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  ......
  RIP: 0010:kvmalloc_node+0xd9/0xe0
   ? __warn+0x89/0x150
   ? kvmalloc_node+0xd9/0xe0
   bpf_kprobe_multi_link_attach+0x87/0x670
   __sys_bpf+0x2a28/0x2bc0
   __x64_sys_bpf+0x1a/0x30
   do_syscall_64+0x36/0xb0
   entry_SYSCALL_64_after_hwframe+0x6e/0x76
  RIP: 0033:0x7fbe067f0e0d
  ......
   </TASK>
  ---[ end trace 0000000000000000 ]---

So add a test to ensure the warning is fixed.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231215100708.2265609-6-houtao@huaweicloud.com
2023-12-15 22:54:55 +01:00
Hou Tao
00cdcd2900 selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
Since libbpf v1.0, libbpf doesn't return error code embedded into the
pointer itself, so libbpf_get_error() is deprecated and is basically
the same as using -errno directly.

So replace the invocations of libbpf_get_error() by -errno in
kprobe_multi_test. For libbpf_get_error() in test_attach_api_fails(),
save -errno before invoking ASSERT_xx() macros in case errno is
overwritten by these macros. However, the invocation of
libbpf_get_error() in get_syms() should be kept intact, because
hashmap__new() still returns a pointer with embedded error code.
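
To illustrate what "a pointer with embedded error code" means, here is a minimal Python model of the convention. It is a sketch only, not libbpf or kernel code: the helper names err_ptr() and ptr_err() are hypothetical, a pointer is modelled as an unsigned 64-bit integer, and the 4095 limit is assumed from the kernel's MAX_ERRNO convention.

```python
MAX_ERRNO = 4095          # assumed: top 4095 values of the address space encode errors
PTR_MASK = (1 << 64) - 1  # model a pointer as an unsigned 64-bit integer


def err_ptr(err):
    """Encode a negative errno (e.g. -22) into a pointer-sized value."""
    return err & PTR_MASK


def ptr_err(ptr):
    """Return the embedded negative errno, or 0 for an ordinary pointer."""
    if ptr >= ((-MAX_ERRNO) & PTR_MASK):
        return ptr - (1 << 64)
    return 0
```

An API that reports failure as NULL plus errno needs no such decoding, which is why reading -errno directly is sufficient for those calls.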

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231215100708.2265609-5-houtao@huaweicloud.com
2023-12-15 22:54:55 +01:00
Hou Tao
0d83786f56 selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
If an abnormally huge cnt is used for multi-uprobes attachment, the
following warning will be reported:

  ------------[ cut here ]------------
  WARNING: CPU: 7 PID: 406 at mm/util.c:632 kvmalloc_node+0xd9/0xe0
  Modules linked in: bpf_testmod(O)
  CPU: 7 PID: 406 Comm: test_progs Tainted: G ...... 6.7.0-rc3+ #32
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
  RIP: 0010:kvmalloc_node+0xd9/0xe0
  ......
  Call Trace:
   <TASK>
   ? __warn+0x89/0x150
   ? kvmalloc_node+0xd9/0xe0
   bpf_uprobe_multi_link_attach+0x14a/0x480
   __sys_bpf+0x14a9/0x2bc0
   do_syscall_64+0x36/0xb0
   entry_SYSCALL_64_after_hwframe+0x6e/0x76
   ......
   </TASK>
  ---[ end trace 0000000000000000 ]---

So add a test to ensure the warning is fixed.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231215100708.2265609-4-houtao@huaweicloud.com
2023-12-15 22:54:55 +01:00
Andrii Nakryiko
f2d0ffee1f selftests/bpf: utilize string values for delegate_xxx mount options
Use both hex-based and string-based way to specify delegate mount
options for BPF FS.
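
As a rough illustration of how a string-based value could relate to a hex-based one, the sketch below maps colon-separated command names to a bitmask with one bit per command. It is a hypothetical model, not the kernel's parser: the function name delegate_mask(), the case-insensitive matching, and the small command table (values taken from my reading of enum bpf_cmd) are all assumptions.

```python
# Assumed subset of `enum bpf_cmd` values, with the BPF_ prefix dropped.
BPF_CMDS = {
    "map_create": 0,
    "map_lookup_elem": 1,
    "map_update_elem": 2,
    "map_delete_elem": 3,
    "map_get_next_key": 4,
    "prog_load": 5,
}


def delegate_mask(spec, table):
    """Turn a colon-separated name list or a hex literal into a bitmask."""
    spec = spec.strip()
    if spec.lower().startswith("0x"):
        return int(spec, 16)
    mask = 0
    for name in spec.split(":"):
        # Bit N set means "command number N is delegated".
        mask |= 1 << table[name.lower()]
    return mask


if __name__ == "__main__":
    print(hex(delegate_mask("prog_load:MAP_CREATE", BPF_CMDS)))
```

Under these assumptions, "prog_load:MAP_CREATE" and "0x21" describe the same mask.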

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231214225016.1209867-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:30:27 -08:00
Daniel Xu
2cd07b0eb0 bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
This commit extends test_tunnel selftest to test the new XDP xfrm state
lookup kfunc.

Co-developed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/e704e9a4332e3eac7b458e4bfdec8fcc6984cdb6.1702593901.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:12:49 -08:00
Daniel Xu
e7adc8291a bpf: selftests: Move xfrm tunnel test to test_progs
test_progs is better than a shell script b/c C is a bit easier to
maintain than shell. Also it's easier to use new infra like memory
mapped global variables from C via bpf skeleton.

Co-developed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/a350db9e08520c64544562d88ec005a039124d9b.1702593901.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:12:49 -08:00
Daniel Xu
02b4e126e6 bpf: selftests: test_tunnel: Use vmlinux.h declarations
vmlinux.h declarations are more ergonomic, especially when working with
kfuncs. The uapi headers are often incomplete for kfunc definitions.

This commit also switches bitfield accesses to use CO-RE helpers.
Switching to vmlinux.h definitions makes the verifier very
unhappy with raw bitfield accesses. The error is:

    ; md.u.md2.dir = direction;
    33: (69) r1 = *(u16 *)(r2 +11)
    misaligned stack access off (0x0; 0x0)+-64+11 size 2

Fix by using CO-RE-aware bitfield reads and writes.

Co-developed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/884bde1d9a351d126a3923886b945ea6b1b0776b.1702593901.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:12:49 -08:00
Daniel Xu
77a7a8220f bpf: selftests: test_tunnel: Setup fresh topology for each subtest
This helps with determinism b/c individual setup/teardown prevents
leaking state between different subtests.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/0fb59fa16fb58cca7def5239df606005a3e8dd0e.1702593901.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:12:49 -08:00
Yonghong Song
56925f389e selftests/bpf: Remove flaky test_btf_id test
With previous patch, one of subtests in test_btf_id becomes
flaky and may fail. The following is a failing example:

  Error: #26 btf
  Error: #26/174 btf/BTF ID
    Error: #26/174 btf/BTF ID
    btf_raw_create:PASS:check 0 nsec
    btf_raw_create:PASS:check 0 nsec
    test_btf_id:PASS:check 0 nsec
    ...
    test_btf_id:PASS:check 0 nsec
    test_btf_id:FAIL:check BTF lingersdo_test_get_info:FAIL:check failed: -1

The test tries to prove that a btf_id is not available after the map is
closed. But btf_id is freed only after workqueue and a rcu grace period,
compared to previous case just after a rcu grace period.
Depending on system workload, workqueue could take quite some time
to execute function bpf_map_free_deferred() which may cause the test failure.
Instead of adding arbitrary delays, let us remove the logic to
check btf_id availability after map is closed.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231214203820.1469402-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14 17:10:32 -08:00
Tushar Vyavahare
2e1d6a0411 selftests/xsk: Fix for SEND_RECEIVE_UNALIGNED test
Fix test broken by shared umem test and framework enhancement commit.

Correct the current implementation of pkt_stream_replace_half() by
ensuring that nb_valid_entries are not set to half, as this is not true
for all the tests. Ensure that the expected value for valid_entries for
the SEND_RECEIVE_UNALIGNED test equals the total number of packets sent,
which is 4096.

Create a new function called pkt_stream_pkt_set() that allows for packet
modification to meet specific requirements while ensuring the accurate
maintenance of the valid packet count to prevent inconsistencies in packet
tracking.

Fixes: 6d198a89c0 ("selftests/xsk: Add a test for shared umem feature")
Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20231214130007.33281-1-tushar.vyavahare@intel.com
2023-12-14 16:11:13 +01:00
John Fastabend
50d96f05af bpf: sockmap, test for unconnected af_unix sock
Add test to sockmap_basic to ensure af_unix sockets that are not connected
can not be added to the map. Ensure we keep DGRAM sockets working however
as these will not be connected typically.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20231201180139.328529-3-john.fastabend@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-13 16:33:01 -08:00
Larysa Zaremba
4c6612f610 selftests/bpf: Check VLAN tag and proto in xdp_metadata
Verify whether VLAN tag and proto are set correctly.

To simulate "stripped" VLAN tag on veth, send test packet from VLAN
interface.

Also, add TO_STR() macro for convenience.

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-19-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 16:16:41 -08:00
Larysa Zaremba
a3850af4ea selftests/bpf: Add AF_INET packet generation to xdp_metadata
The easiest way to simulate stripped VLAN tag in veth is to send a packet
from VLAN interface, attached to veth. Unfortunately, this approach is
incompatible with AF_XDP on TX side, because VLAN interfaces do not have
such feature.

Check both packets sent via AF_XDP TX and regular socket.

AF_INET packet will also have a filled-in hash type (XDP_RSS_TYPE_L4),
unlike AF_XDP packet, so more values can be checked.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231205210847.28460-18-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 16:16:41 -08:00
Larysa Zaremba
8e68a4beba selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata
Add VLAN hint to the xdp_hw_metadata program.

Also, to make metadata layout more straightforward, add a flags field
to pass information about the validity of each hint separately.

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-17-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 16:16:41 -08:00
Larysa Zaremba
e71a9fa7fd selftests/bpf: Allow VLAN packets in xdp_hw_metadata
Make VLAN c-tag and s-tag XDP hint testing more convenient
by not skipping VLAN-ed packets.

Allow both 802.1ad and 802.1Q headers.
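
For readers unfamiliar with the two header types, the following Python sketch walks the VLAN tags of a raw Ethernet frame. It is only an illustration of the on-wire layout, not the selftest's BPF code; the function name parse_vlan_tags() is hypothetical, and the frame is assumed to be well-formed and long enough.

```python
import struct

ETH_P_8021Q = 0x8100   # 802.1Q tag (C-tag)
ETH_P_8021AD = 0x88A8  # 802.1ad tag (S-tag)


def parse_vlan_tags(frame):
    """Return ([(tpid, vid), ...], inner_ethertype) for an Ethernet frame."""
    offset = 12  # skip destination and source MAC addresses
    tags = []
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    while ethertype in (ETH_P_8021Q, ETH_P_8021AD):
        # The TCI follows the TPID; its low 12 bits are the VLAN ID.
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        tags.append((ethertype, tci & 0x0FFF))
        offset += 4
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return tags, ethertype
```

A double-tagged frame yields the outer 802.1ad tag first, then the inner 802.1Q tag, then the payload's ethertype.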

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-16-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 16:16:41 -08:00
Andrii Nakryiko
322122bf8c selftests/bpf: add tests for LIBBPF_BPF_TOKEN_PATH envvar
Add new subtest validating LIBBPF_BPF_TOKEN_PATH envvar semantics.
Extend existing test to validate that LIBBPF_BPF_TOKEN_PATH allows
disabling implicit BPF token creation by setting envvar to empty string.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 15:47:05 -08:00
Andrii Nakryiko
18678cf0ee selftests/bpf: add tests for BPF object load with implicit token
Add a test to validate libbpf's implicit BPF token creation from default
BPF FS location (/sys/fs/bpf). Also validate that disabling this
implicit BPF token creation works.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 15:47:05 -08:00
Andrii Nakryiko
98e0eaa36a selftests/bpf: add BPF object loading tests with explicit token passing
Add a few tests that attempt to load BPF object containing privileged
map, program, and the one requiring mandatory BTF uploading into the
kernel (to validate token FD propagation to BPF_BTF_LOAD command).

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 15:47:05 -08:00
Daniel Xu
f04f2ce601 bpf: selftests: Add verifier tests for CO-RE bitfield writes
Add some tests that exercise BPF_CORE_WRITE_BITFIELD() macro. Since some
non-trivial bit fiddling is going on, make sure various edge cases (such
as adjacent bitfields and bitfields at the edge of structs) are
exercised.
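
The kind of bit fiddling involved can be sketched as a read-modify-write on the word that contains the field. The Python model below is not the BPF_CORE_WRITE_BITFIELD() implementation (which relies on CO-RE relocations for offsets and sizes); write_bitfield() and read_bitfield() are hypothetical helpers, and bit offsets are assumed to count from the least significant bit.

```python
def write_bitfield(word, bit_offset, bit_size, value, word_bits=64):
    """Store `value` into a bitfield without touching neighbouring bits."""
    field_mask = ((1 << bit_size) - 1) << bit_offset
    word_mask = (1 << word_bits) - 1
    cleared = word & ~field_mask                     # drop the old field contents
    inserted = (value << bit_offset) & field_mask    # truncate value to the field
    return (cleared | inserted) & word_mask


def read_bitfield(word, bit_offset, bit_size):
    """Extract a bitfield as an unsigned integer."""
    return (word >> bit_offset) & ((1 << bit_size) - 1)
```

The masking on both the old word and the new value is what protects adjacent bitfields and fields that end exactly at the edge of the containing word.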

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/72698a1080fa565f541d5654705255984ea2a029.1702325874.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-13 15:42:19 -08:00
Daniel Xu
7d19c00e9a bpf: selftests: test_loader: Support __btf_path() annotation
This commit adds support for per-prog btf_custom_path. This is necessary
for testing CO-RE relocations on non-vmlinux types using test_loader
infrastructure.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/660ea7f2fdbdd5103bc1af87c9fc931f05327926.1702325874.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-13 15:42:19 -08:00
Andrii Nakryiko
62d9a969f4 selftests/bpf: fix compiler warnings in RELEASE=1 mode
When compiling BPF selftests with RELEASE=1, we get two new
warnings, which are treated as errors. Fix them.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20231212225343.1723081-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13 10:27:16 -08:00
YiFei Zhu
e1ba7f64b1 selftests/bpf: Relax time_tai test for equal timestamps in tai_forward
We're observing test flakiness on an arm64 platform which might not
have timestamps as precise as x86. The test log looks like:

  test_time_tai:PASS:tai_open 0 nsec
  test_time_tai:PASS:test_run 0 nsec
  test_time_tai:PASS:tai_ts1 0 nsec
  test_time_tai:PASS:tai_ts2 0 nsec
  test_time_tai:FAIL:tai_forward unexpected tai_forward: actual 1702348135471494160 <= expected 1702348135471494160
  test_time_tai:PASS:tai_gettime 0 nsec
  test_time_tai:PASS:tai_future_ts1 0 nsec
  test_time_tai:PASS:tai_future_ts2 0 nsec
  test_time_tai:PASS:tai_range_ts1 0 nsec
  test_time_tai:PASS:tai_range_ts2 0 nsec
  #199     time_tai:FAIL

This patch changes ASSERT_GT to ASSERT_GE in the tai_forward assertion
so that equal timestamps are permitted.
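
The difference between the two assertions can be shown with a small Python stand-in. This is a sketch, not the selftest code: tai_forward_ok() is a hypothetical helper, and the timestamps are the equal values from the failing log in this commit message.

```python
def tai_forward_ok(ts1, ts2, allow_equal):
    """Check that the second timestamp did not go backwards."""
    return ts2 >= ts1 if allow_equal else ts2 > ts1


ts1 = ts2 = 1702348135471494160
strict = tai_forward_ok(ts1, ts2, allow_equal=False)   # models ASSERT_GT
relaxed = tai_forward_ok(ts1, ts2, allow_equal=True)   # models ASSERT_GE
```

With a clock coarse enough to return the same value twice, only the relaxed check passes, while a clock that actually moves backwards still fails both.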

Fixes: 64e15820b9 ("selftests/bpf: Add BPF-helper test for CLOCK_TAI access")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231212182911.3784108-1-zhuyifei@google.com
2023-12-12 15:53:34 -08:00
Manu Bretelle
f77d795618 selftests/bpf: Fixes tests for filesystem kfuncs
`fs_kfuncs.c`'s `test_xattr` would fail the test even when the
filesystem did not support xattr, for instance when /tmp is mounted as
tmpfs.

This change checks errno when setxattr fails. If the failure is due to
the operation being unsupported, we will skip the test (just like we
would if verity was not enabled on the FS).
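
The decision described here can be sketched in Python as follows. This is an illustration of the skip-versus-fail logic only, not the selftest itself; classify_setxattr() and its string results are hypothetical, and the errno 95 seen in the failing log of this commit message corresponds to EOPNOTSUPP on Linux.

```python
import errno


def classify_setxattr(err):
    """Map a setxattr errno (0 on success) to a test outcome."""
    if err == 0:
        return "OK"
    if err in (errno.ENOTSUP, errno.EOPNOTSUPP):
        # Filesystem cannot store xattrs (e.g. some tmpfs setups): not a test bug.
        return "SKIP"
    return "FAIL"
```

Any other errno still fails the test, so genuine regressions are not hidden by the skip path.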

Before the change, fs_kfuncs test would fail in test_xattr:

 $ vmtest -k $(make -s image_name) './tools/testing/selftests/bpf/test_progs -a fs_kfuncs'
 => bzImage
 ===> Booting
 [    0.000000] rcu:        RCU restricting CPUs from NR_CPUS=128 to
 nr_cpu_
 ===> Setting up VM
 ===> Running command
 [    4.157491] bpf_testmod: loading out-of-tree module taints kernel.
 [    4.161515] bpf_testmod: module verification failed: signature and/or
 required key missing - tainting kernel
 test_xattr:PASS:create_file 0 nsec
 test_xattr:FAIL:setxattr unexpected error: -1 (errno 95)
 #90/1    fs_kfuncs/xattr:FAIL
 #90/2    fs_kfuncs/fsverity:SKIP
 #90      fs_kfuncs:FAIL

 All error logs:
 test_xattr:PASS:create_file 0 nsec
 test_xattr:FAIL:setxattr unexpected error: -1 (errno 95)
 #90/1    fs_kfuncs/xattr:FAIL
 #90      fs_kfuncs:FAIL

 Summary: 0/0 PASSED, 1 SKIPPED, 1 FAILED

Test plan:

  $ touch tmpfs_file && truncate -s 1G tmpfs_file && mkfs.ext4 tmpfs_file
  # /tmp mounted as tmpfs
  $ vmtest -k $(make -s image_name) './tools/testing/selftests/bpf/test_progs -a fs_kfuncs'
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  Can't find bpf_testmod.ko kernel module: -2
  #90/1    fs_kfuncs/xattr:SKIP
  #90/2    fs_kfuncs/fsverity:SKIP
  #90      fs_kfuncs:SKIP
  Summary: 1/0 PASSED, 2 SKIPPED, 0 FAILED
  # /tmp mounted as ext4 with xattr enabled but not verity
  $ vmtest -k $(make -s image_name) 'mount -o loop tmpfs_file /tmp && \
    ./tools/testing/selftests/bpf/test_progs -a fs_kfuncs'
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  [    4.067071] loop0: detected capacity change from 0 to 2097152
  [    4.191882] EXT4-fs (loop0): mounted filesystem
  407ffa36-4553-4c8c-8c78-134443630f69 r/w with ordered data mode. Quota
  mode: none.
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  Can't find bpf_testmod.ko kernel module: -2
  #90/1    fs_kfuncs/xattr:OK
  #90/2    fs_kfuncs/fsverity:SKIP
  #90      fs_kfuncs:OK (SKIP: 1/2)
  Summary: 1/1 PASSED, 1 SKIPPED, 0 FAILED
  $ tune2fs -O verity tmpfs_file
  # /tmp as ext4 with both xattr and verity enabled
  $ vmtest -k $(make -s image_name) 'mount -o loop tmpfs_file /tmp && \
    ./tools/testing/selftests/bpf/test_progs -a fs_kfuncs'
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  [    4.291434] loop0: detected capacity change from 0 to 2097152
  [    4.460828] EXT4-fs (loop0): recovery complete
  [    4.468631] EXT4-fs (loop0): mounted filesystem
  7b4a7b7f-c442-4b06-9ede-254e63cceb52 r/w with ordered data mode. Quota
  mode: none.
  [    4.988074] fs-verity: sha256 using implementation "sha256-generic"
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  Can't find bpf_testmod.ko kernel module: -2
  #90/1    fs_kfuncs/xattr:OK
  #90/2    fs_kfuncs/fsverity:OK
  #90      fs_kfuncs:OK
  Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

Fixes: 341f06fddd ("selftests/bpf: Add tests for filesystem kfuncs")
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20231211180733.763025-1-chantr4@gmail.com
2023-12-12 09:51:47 -08:00
Andrii Nakryiko
e72c1ccfd4 selftests/bpf: validate eliminated global subprog is not freplaceable
Add selftest that establishes dead code-eliminated valid global subprog
(global_dead) and makes sure that it's not possible to freplace it, as
it's effectively not there. This test will fail with unexpected success
before 2afae08c9d ("bpf: Validate global subprogs lazily").

v2->v3:
  - add missing err assignment (Alan);
  - undo unnecessary signature changes in verifier_global_subprogs.c (Eduard);
v1->v2:
  - don't rely on assembly output in verifier log, which changes between
    compiler versions (CI).

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20231211174131.2324306-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-11 19:16:32 -08:00
Larysa Zaremba
15c79c6507 selftests/bpf: Increase invalid metadata size
Changed check expects passed data meta to be deemed invalid. After loosening
the requirement, the size of 36 bytes becomes valid. Therefore, increase
tested meta size to 256, so we do not get an unexpected success.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231206205919.404415-2-larysa.zaremba@intel.com
2023-12-11 16:09:24 +01:00
David Vernet
88f6047191 selftests/bpf: Add test for bpf_cpumask_weight() kfunc
The new bpf_cpumask_weight() kfunc can be used to count the number of
bits that are set in a struct cpumask* kptr. Let's add a selftest to
verify its behavior.
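
Conceptually, the weight of a cpumask is a population count. The Python sketch below models a cpumask as a non-negative integer with bit N standing for CPU N; it is not the kfunc itself, and cpumask_weight() here is a hypothetical stand-in.

```python
def cpumask_weight(mask):
    """Count the CPUs (set bits) in a cpumask modelled as a non-negative int."""
    return bin(mask).count("1")
```

For example, a mask with CPUs 0, 1 and 3 set (0b1011) has weight 3, and an empty mask has weight 0.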

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231207210843.168466-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-09 21:37:33 -08:00
Andrii Nakryiko
7d8ed51bcb selftests/bpf: validate fake register spill/fill precision backtracking logic
Add two tests validating that verifier's precision backtracking logic
handles BPF_ST_MEM instructions that produce fake register spill into
register slot. This is happening when non-zero constant is written
directly to a slot, e.g., *(u64 *)(r10 -8) = 123.

Add both full 64-bit register spill, as well as 32-bit "sub-spill".

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231209010958.66758-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-09 18:43:20 -08:00
Yafang Shao
a2c6380b17 selftests/bpf: Add selftests for cgroup1 local storage
Expand the test coverage from cgroup2 to include cgroup1. The results
are as follows:

Already existing test cases for cgroup2:
  #48/1    cgrp_local_storage/tp_btf:OK
  #48/2    cgrp_local_storage/attach_cgroup:OK
  #48/3    cgrp_local_storage/recursion:OK
  #48/4    cgrp_local_storage/negative:OK
  #48/5    cgrp_local_storage/cgroup_iter_sleepable:OK
  #48/6    cgrp_local_storage/yes_rcu_lock:OK
  #48/7    cgrp_local_storage/no_rcu_lock:OK

Expanded test cases for cgroup1:
  #48/8    cgrp_local_storage/cgrp1_tp_btf:OK
  #48/9    cgrp_local_storage/cgrp1_recursion:OK
  #48/10   cgrp_local_storage/cgrp1_negative:OK
  #48/11   cgrp_local_storage/cgrp1_iter_sleepable:OK
  #48/12   cgrp_local_storage/cgrp1_yes_rcu_lock:OK
  #48/13   cgrp_local_storage/cgrp1_no_rcu_lock:OK

Summary:
  #48      cgrp_local_storage:OK
  Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231206115326.4295-4-laoar.shao@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-08 17:08:18 -08:00
Yafang Shao
f4199271da selftests/bpf: Add a new cgroup helper open_classid()
This new helper allows us to obtain the fd of a net_cls cgroup, which will
be utilized in the subsequent patch.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231206115326.4295-3-laoar.shao@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-08 17:08:18 -08:00
Andrii Nakryiko
1720c42b90 selftests/bpf: fix timer/test_bad_ret subtest on test_progs-cpuv4 flavor
Because test_bad_ret main program is not written in assembly, we don't
control instruction indices in timer_cb_ret_bad() subprog. This bites us
in timer/test_bad_ret subtest, where we see difference between cpuv4 and
other flavors.

For now, make __msg() expectations not rely on instruction indices by
anchoring them around bpf_get_prandom_u32 call. Once we have regex/glob
support for __msg(), this can be expressed a bit more nicely, but for
now just mitigating the problem with available means.

Fixes: e02dea158d ("selftests/bpf: validate async callback return value check correctness")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231208233028.3412690-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-08 16:50:50 -08:00
Andrei Matei
6b4a64bafd bpf: Fix accesses to uninit stack slots
Privileged programs are supposed to be able to read uninitialized stack
memory (ever since 6715df8d5) but, before this patch, these accesses
were permitted inconsistently. In particular, accesses were permitted
above state->allocated_stack, but not below it. In other words, if the
stack was already "large enough", the access was permitted, but
otherwise the access was rejected instead of being allowed to "grow the
stack". This undesired rejection was happening in two places:
- in check_stack_slot_within_bounds()
- in check_stack_range_initialized()
This patch arranges for these accesses to be permitted. A bunch of tests
that were relying on the old rejection had to change; all of them were
changed to also run unprivileged, in which case the old behavior
persists. One test couldn't be updated - global_func16 - because it
can't run unprivileged for other reasons.

This patch also fixes the tracking of the stack size for variable-offset
reads. This second fix is bundled in the same commit as the first one
because they're inter-related. Before this patch, writes to the stack
using registers containing a variable offset (as opposed to registers
with fixed, known values) were not properly contributing to the
function's needed stack size. As a result, it was possible for a program
to verify, but then to attempt to read out-of-bounds data at runtime
because a too small stack had been allocated for it.

Each function tracks the size of the stack it needs in
bpf_subprog_info.stack_depth, which is maintained by
update_stack_depth(). For regular memory accesses, check_mem_access()
was calling update_stack_depth() but it was passing in only the fixed
part of the offset register, ignoring the variable offset. This was
incorrect; the minimum possible value of that register should be used
instead.

This tracking is now fixed by centralizing the tracking of stack size in
grow_stack_state(), and by lifting the calls to grow_stack_state() to
check_stack_access_within_bounds() as suggested by Andrii. The code is
now simpler and more convincingly tracks the correct maximum stack size.
check_stack_range_initialized() can now rely on enough stack having been
allocated for the access; this helps with the fix for the first issue.
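
The stack-size rule described above can be sketched in user-space C. This
is only an illustrative model, not the verifier code: struct stack_ptr,
struct func_stack, check_and_grow() and the 512-byte limit are assumptions
made for the example, and the sketch only models non-empty accesses.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_STACK 512	/* assumed stack limit for this sketch */
#define SLOT_SIZE 8	/* the tracked depth grows in 8-byte slots */

/* Hypothetical stand-in for a register holding a stack offset:
 * a fixed part plus a variable part bounded by [var_min, var_max].
 */
struct stack_ptr {
	int32_t fixed_off;
	int32_t var_min;
	int32_t var_max;
};

/* Per-function state: bytes needed below the frame pointer. */
struct func_stack {
	int32_t depth;
};

/* Round a positive byte count up to a whole number of slots. */
static int32_t round_up_slot(int32_t bytes)
{
	assert(bytes > 0);
	return (bytes + SLOT_SIZE - 1) / SLOT_SIZE * SLOT_SIZE;
}

/* Check an access of `size` bytes and grow the tracked depth.
 * Returns 0 on success, -1 if any possible offset is out of bounds.
 */
static int check_and_grow(struct func_stack *fs, const struct stack_ptr *p,
			  int32_t size)
{
	int32_t min_off = p->fixed_off + p->var_min; /* deepest start */
	int32_t max_off = p->fixed_off + p->var_max; /* shallowest start */

	if (size <= 0 || min_off < -MAX_STACK || max_off + size > 0)
		return -1;

	/* Size the stack from the minimum offset, not the fixed part alone. */
	int32_t needed = round_up_slot(-min_off);

	if (needed > fs->depth)
		fs->depth = needed;
	return 0;
}
```

With fixed_off = -8 and a variable part in [-16, 0], an 8-byte access
needs a depth of 24 in this model; looking only at the fixed part would
have recorded 8.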

A few tests were changed to also check the stack depth computation. The
one that fails without this patch is verifier_var_off:stack_write_priv_vs_unpriv.

Fixes: 01f810ace9 ("bpf: Allow variable-offset stack access")
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231208032519.260451-3-andreimatei1@gmail.com

Closes: https://lore.kernel.org/bpf/CABWLsev9g8UP_c3a=1qbuZUi20tGoUXoU07FPf-5FLvhOKOY+Q@mail.gmail.com/
2023-12-08 14:19:00 -08:00
Jakub Kicinski
2483e7f04c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/stmicro/stmmac/dwmac5.c
drivers/net/ethernet/stmicro/stmmac/dwmac5.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
drivers/net/ethernet/stmicro/stmmac/hwif.h
  37e4b8df27 ("net: stmmac: fix FPE events losing")
  c3f3b97238 ("net: stmmac: Refactor EST implementation")
https://lore.kernel.org/all/20231206110306.01e91114@canb.auug.org.au/

Adjacent changes:

net/ipv4/tcp_ao.c
  9396c4ee93 ("net/tcp: Don't store TCP-AO maclen on reqsk")
  7b0f570f87 ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-07 17:53:17 -08:00
Andrei Matei
e28bd359bc bpf: Add verifier regression test for previous patch
Add a regression test for var-off zero-sized reads.

Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20231207041150.229139-3-andreimatei1@gmail.com
2023-12-07 13:58:02 -08:00
Andrii Nakryiko
7065eefb38 bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistency
To stay consistent with the naming pattern used for similar cases in BPF
UAPI (__MAX_BPF_ATTACH_TYPE, etc), rename MAX_BPF_LINK_TYPE into
__MAX_BPF_LINK_TYPE.

Also similar to MAX_BPF_ATTACH_TYPE and MAX_BPF_REG, add:

  #define MAX_BPF_LINK_TYPE __MAX_BPF_LINK_TYPE

Not all __MAX_xxx enums have such #define, so I'm not sure if we should
add it or not, but I figured I'll start with a completely backwards
compatible way, and we can drop that, if necessary.
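
The naming pattern can be sketched as follows. The enum below is a
made-up stand-in (the demo_ names are not UAPI); it only shows how the
sentinel and the compatibility #define fit together.

```c
#include <assert.h>

/* Illustrative enum, not the real UAPI list of link types. */
enum demo_link_type {
	DEMO_LINK_TYPE_UNSPEC = 0,
	DEMO_LINK_TYPE_TRACING,
	DEMO_LINK_TYPE_XDP,
	__MAX_DEMO_LINK_TYPE,	/* sentinel: one past the last real value */
};

/* Backward-compatible alias for code that still uses the old name. */
#define MAX_DEMO_LINK_TYPE __MAX_DEMO_LINK_TYPE

static_assert(MAX_DEMO_LINK_TYPE == 3, "alias tracks the sentinel");

/* A table sized by the old name keeps compiling unchanged. */
static const char *const demo_link_names[MAX_DEMO_LINK_TYPE] = {
	[DEMO_LINK_TYPE_UNSPEC]  = "unspec",
	[DEMO_LINK_TYPE_TRACING] = "tracing",
	[DEMO_LINK_TYPE_XDP]     = "xdp",
};

/* Map a type to its name, rejecting out-of-range values. */
static const char *demo_link_name(int type)
{
	if (type < 0 || type >= MAX_DEMO_LINK_TYPE)
		return "unknown";
	return demo_link_names[type];
}
```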

Also adjust a selftest that used MAX_BPF_LINK_TYPE enum.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231206190920.1651226-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06 14:41:16 -08:00
Jiri Olsa
ffed24eff9 selftests/bpf: Add test for early update in prog_array_map_poke_run
Add a test that tries to trigger the BUG_ON during an early map update
in the prog_array_map_poke_run function.

The idea is to share prog array map between thread that constantly
updates it and another one loading a program that uses that prog
array.

Eventually we will hit a place where the program is ok to be updated
(poke->tailcall_target_stable check) but the address is still not
registered in kallsyms, so bpf_arch_text_poke returns -EINVAL and
causes an imbalance for the next tail call update check, which will
fail with -EBUSY in bpf_arch_text_poke as described in the previous fix.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20231206083041.1306660-3-jolsa@kernel.org
2023-12-06 22:40:43 +01:00
Andrii Nakryiko
dc5196fac4 selftests/bpf: add BPF token-enabled tests
Add a selftest that attempts to conceptually replicate intended BPF
token use cases inside user namespaced container.

Child process is forked. It is then put into its own userns and mountns.
Child creates BPF FS context object. This ensures child userns is
captured as the owning userns for this instance of BPF FS. Given that
setting delegation mount options is a privileged operation, we ensure
that the child cannot set them.

This context is passed back to privileged parent process through Unix
socket, where parent sets up delegation options, creates, and mounts it
as a detached mount. This mount FD is passed back to the child to be
used for BPF token creation, which allows otherwise privileged BPF
operations to succeed inside userns.
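
The FD hand-off between child and parent relies on passing descriptors
over a Unix socket. A minimal sketch of that mechanism with SCM_RIGHTS is
shown below; send_fd(), recv_fd() and fd_passing_demo() are hypothetical
names, and the demo passes a pipe FD within one process instead of a BPF
FS context or mount FD.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough (and aligned) for one descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one file descriptor over a connected Unix socket. 0 or -1. */
static int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	union fd_ctl ctl;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	union fd_ctl ctl;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Round trip: pass the read end of a pipe across a socketpair. */
static int fd_passing_demo(void)
{
	int sv[2], pfd[2], got;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) || pipe(pfd))
		return -1;
	if (send_fd(sv[0], pfd[0]))
		return -1;
	got = recv_fd(sv[1]);
	if (got < 0)
		return -1;
	assert(got != pfd[0]);	/* a new descriptor, much like dup() */

	/* The received fd refers to the same pipe as pfd[0]. */
	if (write(pfd[1], "x", 1) != 1 || read(got, &c, 1) != 1)
		return -1;
	close(got);
	close(pfd[0]);
	close(pfd[1]);
	close(sv[0]);
	close(sv[1]);
	return c == 'x' ? 0 : -1;
}
```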

We validate that all token-enabled privileged commands (BPF_BTF_LOAD,
BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only
succeed inside the userns if a) BPF token is provided with proper
allowed sets of commands and types; and b) namespaced CAP_BPF and other
privileges are set. Lacking a) or b) should lead to -EPERM failures.
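
The a)-and-b) rule can be modeled in a few lines. This is a sketch under
assumed names: struct demo_token, the DEMO_ command numbers and
demo_check() are invented for illustration and do not mirror the kernel's
data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command numbers for this sketch (not the UAPI values). */
enum demo_cmd {
	DEMO_MAP_CREATE,
	DEMO_PROG_LOAD,
	DEMO_BTF_LOAD,
	DEMO_CMD_MAX,
};

/* Hypothetical token: one bit per delegated command. */
struct demo_token {
	uint64_t allowed_cmds;
};

/* Condition a): a token is present and delegates this command. */
static bool token_allows(const struct demo_token *tok, enum demo_cmd cmd)
{
	assert(cmd < DEMO_CMD_MAX);
	return tok && (tok->allowed_cmds & (UINT64_C(1) << cmd));
}

/* Succeeds only when both a) the token delegates cmd and
 * b) the caller holds the namespaced capability.
 */
static int demo_check(const struct demo_token *tok, bool has_ns_cap,
		      enum demo_cmd cmd)
{
	if (!token_allows(tok, cmd) || !has_ns_cap)
		return -EPERM;
	return 0;
}
```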

Based on suggested workflow by Christian Brauner ([0]).

  [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231130185229.2688956-17-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06 10:03:00 -08:00
Andrii Nakryiko
e1cef620f5 bpf: add BPF token support to BPF_PROG_LOAD command
Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of
allowed BPF program types and attach types, derived from BPF FS at BPF
token creation time. Then make sure we perform bpf_token_capable()
checks everywhere where it's relevant.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06 10:02:59 -08:00
Andrii Nakryiko
688b7270b3 bpf: add BPF token support to BPF_MAP_CREATE command
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.

Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds,
allows creating a narrowly-focused BPF token (controlled by a privileged
agent) with a restrictive set of BPF maps that an application can attempt
to create.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06 10:02:59 -08:00
Andrii Nakryiko
064e0bea19 selftests/bpf: validate precision logic in partial_stack_load_preserves_zeros
Enhance partial_stack_load_preserves_zeros subtest with detailed
precision propagation log checks. We now expect fp-16 to be a spilled,
initially imprecise, zero const register, which is later marked as
precise even when a partial stack slot load is performed, even if it's
not a register fill (!).

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231205184248.1502704-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05 13:40:21 -08:00
Andrii Nakryiko
add1cd7f22 selftests/bpf: validate zero preservation for sub-slot loads
Validate that 1-, 2-, and 4-byte loads from stack slots not aligned on
8-byte boundary still preserve zero, when loading from all-STACK_ZERO
sub-slots, or when stack sub-slots are covered by spilled register with
known constant zero value.
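
The property under test can be stated as a small model. The sketch below
is an assumption-laden simplification (per-byte marks in one 8-byte slot,
with hypothetical names such as struct stack_slot and
load_is_known_zero()), not the verifier's actual state tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_SIZE 8

/* Simplified per-byte marks for one stack slot. */
enum byte_type { BYTE_MISC, BYTE_ZERO, BYTE_SPILL };

struct stack_slot {
	enum byte_type type[SLOT_SIZE];	/* one mark per byte */
	bool spill_is_const;		/* spilled register is a known constant */
	uint64_t spill_value;		/* its value, if spill_is_const */
};

/* A load of `size` bytes at byte offset `off` inside the slot yields a
 * known zero if every covered byte is either marked zero or belongs to
 * a spilled register whose value is the constant zero.
 */
static bool load_is_known_zero(const struct stack_slot *s, int off, int size)
{
	assert(off >= 0 && size > 0 && off + size <= SLOT_SIZE);

	for (int i = off; i < off + size; i++) {
		if (s->type[i] == BYTE_ZERO)
			continue;
		if (s->type[i] == BYTE_SPILL && s->spill_is_const &&
		    s->spill_value == 0)
			continue;
		return false;
	}
	return true;
}
```

In this model a 2-byte load at offset 1 of an all-zero slot is still a
known zero, even though it is not aligned on an 8-byte boundary.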

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231205184248.1502704-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05 13:40:21 -08:00
Andrii Nakryiko
b33ceb6a3d selftests/bpf: validate STACK_ZERO is preserved on subreg spill
Add tests validating that STACK_ZERO slots are preserved when slot is
partially overwritten with subregister spill.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231205184248.1502704-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05 13:40:20 -08:00
Andrii Nakryiko
876301881c selftests/bpf: add stack access precision test
Add a new selftest that validates precision tracking for stack access
instruction, using both r10-based and non-r10-based accesses. For
non-r10 ones we also make sure to have non-zero var_off to validate that
final stack offset is tracked properly in instruction history
information inside verifier.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231205184248.1502704-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05 13:40:20 -08:00
Andrii Nakryiko
41f6f64e69 bpf: support non-r10 register spill/fill to/from stack in precision tracking
Use instruction (jump) history to record instructions that performed
register spill/fill to/from stack, regardless if this was done through
read-only r10 register, or any other register after copying r10 into it
*and* potentially adjusting offset.

To make this work reliably, we push extra per-instruction flags into
instruction history, encoding stack slot index (spi) and stack frame
number in extra 10 bit flags we take away from prev_idx in instruction
history. We don't touch idx field for maximum performance, as it's
checked most frequently during backtracking.
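
One way to picture the encoding is the sketch below. The text above only
fixes the 10-bit flags budget taken from prev_idx; the 3/6/1 split between
frame number, spi and a marker bit, as well as every name used here, is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of the 10 flag bits: 3 bits of frame number,
 * 6 bits of stack slot index, 1 marker bit.
 */
#define F_FRAMENO_MASK	0x7u
#define F_SPI_MASK	0x3fu
#define F_SPI_SHIFT	3
#define F_STACK_ACCESS	(1u << 9)

struct hist_entry {
	uint32_t idx;		/* full width, left untouched */
	uint32_t prev_idx : 22;	/* narrowed to make room for flags */
	uint32_t flags : 10;
};

/* Pack a stack access into the flags field. */
static uint32_t hist_flags(uint32_t frameno, uint32_t spi)
{
	assert(frameno <= F_FRAMENO_MASK && spi <= F_SPI_MASK);
	return F_STACK_ACCESS | (spi << F_SPI_SHIFT) | frameno;
}

/* Unpack the frame number. */
static uint32_t flags_frameno(uint32_t flags)
{
	return flags & F_FRAMENO_MASK;
}

/* Unpack the stack slot index. */
static uint32_t flags_spi(uint32_t flags)
{
	return (flags >> F_SPI_SHIFT) & F_SPI_MASK;
}
```

Because idx keeps its own full-width field, the comparison done most
often during backtracking needs no masking in this layout.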

This change removes basically the last remaining practical limitation of
precision backtracking logic in BPF verifier. It fixes known
deficiencies, but also opens up new opportunities to reduce number of
verified states, explored in the subsequent patches.

There are only three differences in selftests' BPF object files
according to veristat, all in the positive direction (fewer states).

File                                    Program        Insns (A)  Insns (B)  Insns  (DIFF)  States (A)  States (B)  States (DIFF)
--------------------------------------  -------------  ---------  ---------  -------------  ----------  ----------  -------------
test_cls_redirect_dynptr.bpf.linked3.o  cls_redirect        2987       2864  -123 (-4.12%)         240         231    -9 (-3.75%)
xdp_synproxy_kern.bpf.linked3.o         syncookie_tc       82848      82661  -187 (-0.23%)        5107        5073   -34 (-0.67%)
xdp_synproxy_kern.bpf.linked3.o         syncookie_xdp      85116      84964  -152 (-0.18%)        5162        5130   -32 (-0.62%)

Note, I avoided renaming jmp_history to more generic insn_hist to
minimize number of lines changed and potential merge conflicts between
bpf and bpf-next trees.

Notice also cur_hist_entry pointer reset to NULL at the beginning of
instruction verification loop. This pointer avoids the problem of
relying on last jump history entry's insn_idx to determine whether we
already have entry for current instruction or not. It can happen that we
added jump history entry because current instruction is_jmp_point(), but
also we need to add instruction flags for stack access. In this case, we
don't want two entries, so we need to reuse the last added entry, if it is
present.
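
The reuse logic can be sketched as below. This is a toy model with
hypothetical names (struct hist, hist_begin_insn(), hist_push()) and a
fixed-size array; it only illustrates why a per-iteration pointer is less
ambiguous than comparing instruction indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_MAX 64

struct hist_entry {
	uint32_t idx;
	uint32_t flags;
};

struct hist {
	struct hist_entry entries[HIST_MAX];
	size_t cnt;
	struct hist_entry *cur;	/* entry for the current instruction, or NULL */
};

/* Call at the top of each iteration of the verification loop. */
static void hist_begin_insn(struct hist *h)
{
	h->cur = NULL;
}

/* Record flags for insn_idx, reusing this iteration's entry if any. */
static int hist_push(struct hist *h, uint32_t insn_idx, uint32_t flags)
{
	if (h->cur) {
		assert(h->cur->idx == insn_idx);
		h->cur->flags |= flags;
		return 0;
	}
	if (h->cnt == HIST_MAX)
		return -1;
	h->cur = &h->entries[h->cnt++];
	h->cur->idx = insn_idx;
	h->cur->flags = flags;
	return 0;
}
```

Two pushes for the same instruction within one iteration share an entry,
while a later visit to the same instruction index (for example in a loop)
gets a fresh one, which an index comparison alone could not tell apart.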

Relying on insn_idx comparison has the same ambiguity problem as the one
that was fixed recently in [0], so we avoid that.

  [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231110002638.4168352-3-andrii@kernel.org/

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Tao Lyu <tao.lyu@epfl.ch>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231205184248.1502704-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05 13:40:20 -08:00