4 Commits

Author SHA1 Message Date
Jiri Olsa
fe8e5a3215 selftests/bpf: Add 5-byte NOP uprobe trigger benchmark
Add a 5-byte NOP uprobe trigger benchmark (x86_64 specific) to measure
uprobes/uretprobes on top of NOP5 instructions.
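For reference, a function carrying a 5-byte NOP might look like the sketch below. `target_nop5` is a hypothetical name, not one taken from the selftest; the byte sequence is the standard x86 5-byte NOP encoding (`nopl 0x0(%rax,%rax,1)`), and the architecture guard is only there so the file still compiles elsewhere. Unlike a real benchmark target, nothing here forces the NOP to be the first instruction of the function.

```c
/* Hypothetical probe target; the name is illustrative only. */
__attribute__((noinline)) int target_nop5(void)
{
#if defined(__x86_64__)
	/* 0f 1f 44 00 00 is the 5-byte NOP: nopl 0x0(%rax,%rax,1) */
	__asm__ volatile(".byte 0x0f, 0x1f, 0x44, 0x00, 0x00");
#endif
	/* Return a value so callers can check that the target actually ran. */
	return 1;
}
```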

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20250414083647.1234007-2-jolsa@kernel.org
2025-04-18 09:03:45 +02:00
Andrii Nakryiko
208c439120 selftests/bpf: remove syscall-driven benchs, keep syscall-count only
Remove "legacy" benchmarks triggered by syscalls in favor of newly added
in-kernel/batched benchmarks. Drop -batched suffix now as well.
Next patch will restore "feature parity" by adding back
tp/raw_tp/fmodret benchmarks based on in-kernel kfunc approach.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-28 18:31:40 -07:00
Andrii Nakryiko
1175f8dea3 selftests/bpf: rename and clean up userspace-triggered benchmarks
Rename uprobe-base to more precise usermode-count (it will match other
baseline-like benchmarks, kernel-count and syscall-count). Also use
BENCH_TRIG_USERMODE() macro to define all usermode-based triggering
benchmarks, which include usermode-count and uprobe/uretprobe benchmarks.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-28 18:31:39 -07:00
Andrii Nakryiko
8f79870ec8 selftests/bpf: Extend uprobe/uretprobe triggering benchmarks
Settle on three "flavors" of uprobe/uretprobe, installed on different
kinds of instruction: nop, push, and ret. All three are testing
different internal code paths emulating or single-stepping instructions,
so are interesting to compare and benchmark separately.

To get a `push rbp` instruction, we make sure that uprobe_target_push()
is not a leaf function: it calls a (global __weak) noop function and
returns something afterwards (if we don't do that, the compiler will
just turn the call into a tail call).
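The non-leaf trick can be sketched roughly as follows. This is a minimal illustration rather than the selftest's code: `noop_callee`, `target_push`, and `trigger_n` are hypothetical names, and whether the prologue really begins with `push rbp` depends on building for x86_64 with frame pointers enabled.

```c
/*
 * Hypothetical helper. Marking it weak means the compiler cannot assume
 * this definition is the one used at link time, so the call is not inlined.
 */
__attribute__((weak)) void noop_callee(void)
{
	__asm__ volatile("");
}

/* Hypothetical probe target: non-leaf, with work left to do after the call. */
__attribute__((noinline)) int target_push(void)
{
	noop_callee();	/* a real call makes this a non-leaf function */
	return 1;	/* returning a value afterwards rules out a tail call */
}

/* Call the target n times, the way a trigger loop would. */
int trigger_n(int n)
{
	int hits = 0;

	for (int i = 0; i < n; i++)
		hits += target_push();
	return hits;
}
```

Disassembling the resulting object (for example with `objdump -d`) is the simplest way to confirm which instruction ends up at the function entry.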

Also, we need to make sure that the compiler isn't skipping frame pointer
generation, so let's add `-fno-omit-frame-pointer` to the Makefile.

Just to give an idea of where we currently stand in terms of relative
performance of different uprobe/uretprobe cases vs a cheap syscall
(getpgid()) baseline, here are results from my local machine:

$ benchs/run_bench_uprobes.sh
base           :    1.561 ± 0.020M/s
uprobe-nop     :    0.947 ± 0.007M/s
uprobe-push    :    0.951 ± 0.004M/s
uprobe-ret     :    0.443 ± 0.007M/s
uretprobe-nop  :    0.471 ± 0.013M/s
uretprobe-push :    0.483 ± 0.004M/s
uretprobe-ret  :    0.306 ± 0.007M/s
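To read these numbers as fractions of the baseline, a small helper along these lines can be used. It is only a sketch: the struct, array, and function names are made up for illustration, and the throughput values are copied from the run above (mean only, ignoring the ± spread).

```c
#include <stdio.h>

/* Hypothetical container for one benchmark line (mean throughput in M/s). */
struct bench_result {
	const char *name;
	double mops;
};

/* Values copied from the run_bench_uprobes.sh output above. */
static const struct bench_result results[] = {
	{ "base",           1.561 },
	{ "uprobe-nop",     0.947 },
	{ "uprobe-push",    0.951 },
	{ "uprobe-ret",     0.443 },
	{ "uretprobe-nop",  0.471 },
	{ "uretprobe-push", 0.483 },
	{ "uretprobe-ret",  0.306 },
};

#define NR_RESULTS ((int)(sizeof(results) / sizeof(results[0])))

/* Throughput of one benchmark as a fraction of the baseline. */
double relative_to_base(double mops, double base_mops)
{
	return mops / base_mops;
}

/* Print every result as a percentage of the first ("base") entry. */
void print_relative(void)
{
	for (int i = 0; i < NR_RESULTS; i++)
		printf("%-15s: %5.1f%% of base\n", results[i].name,
		       100.0 * relative_to_base(results[i].mops, results[0].mops));
}
```

By this measure uprobe-nop runs at roughly 61% of the baseline rate and uretprobe-ret at roughly 20%.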

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240301214551.1686095-1-andrii@kernel.org
2024-03-04 14:40:24 +01:00