linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-18 06:44:00 -04:00

Author	SHA1	Message	Date
Eduard Zingerman	ecdd4fd8a5	bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX BPF_STX through ARG_IMPRECISE dst should be recognized as a local spill and join at_stack with the written value. For example, consider the following situation: // r1 = ARG_IMPRECISE{mask=BIT(0)\|BIT(1)} (u64 )(r1 + 0) = r8 Here the analysis should produce an equivalent of at_stack[] = join(old, r8) BPF_ST through multi-offset or imprecise dst should join at_stack with none instead of overwriting the slots. For example, consider the following situation: // r1 = ARG_IMPRECISE{mask=BIT(0)\|BIT(1)} (u64 )(r1 + 0) = 0 Here the analysis should produce an equivalent of at_stack[r1] = join(old, none). Move the definition of the clear_overlapping_stack_slots() in order to have __arg_track_join() visible. Remove the OFF_IMPRECISE constant to avoid having two ways to express imprecise offset. Only 'offset-imprecise {frame=N, cnt=0}' remains. Fixes: `bf0c571f7f` ("bpf: introduce forward arg-tracking dataflow analysis") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260413-stacklive-fixes-v2-1-398e126e5cf3@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-15 08:40:47 -07:00
Alexei Starovoitov	fc150cddee	bpf: Move compute_insn_live_regs() into liveness.c verifier.c is huge. Move compute_insn_live_regs() into liveness.c. Mechanical move. No functional changes. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260412152936.54262-3-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 12:36:38 -07:00
Alexei Starovoitov	57205e2dd9	bpf: Delete unused variable 'cnt' is set, but not used. Delete it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604111401.eqzyF2kx-lkp@intel.com/ Fixes: `2c167d9177` ("bpf: change logging scheme for live stack analysis") Link: https://lore.kernel.org/r/20260411141447.45932-1-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 10:04:28 -07:00
Eduard Zingerman	2c167d9177	bpf: change logging scheme for live stack analysis Instead of breadcrumbs like: (d2,cs15) frame 0 insn 18 +live -16 (d2,cs15) frame 0 insn 17 +live -16 Print final accumulated stack use/def data per-func_instance per-instruction. printed func_instance's are ordered by callsite and depth. For example: stack use/def subprog#0 shared_instance_must_write_overwrite (d0,cs0): 0: (b7) r1 = 1 1: (7b) (u64 )(r10 -8) = r1 ; def: fp0-8 2: (7b) (u64 )(r10 -16) = r1 ; def: fp0-16 3: (bf) r1 = r10 4: (07) r1 += -8 5: (bf) r2 = r10 6: (07) r2 += -16 7: (85) call pc+7 ; use: fp0-8 fp0-16 8: (bf) r1 = r10 9: (07) r1 += -16 10: (bf) r2 = r10 11: (07) r2 += -8 12: (85) call pc+2 ; use: fp0-8 fp0-16 13: (b7) r0 = 0 14: (95) exit stack use/def subprog#1 forwarding_rw (d1,cs7): 15: (85) call pc+1 ; use: fp0-8 fp0-16 16: (95) exit stack use/def subprog#1 forwarding_rw (d1,cs12): 15: (85) call pc+1 ; use: fp0-8 fp0-16 16: (95) exit stack use/def subprog#2 write_first_read_second (d2,cs15): 17: (7a) (u64 )(r1 +0) = 42 18: (79) r0 = (u64 )(r2 +0) ; use: fp0-8 fp0-16 19: (95) exit For groups of three or more consecutive stack slots, abbreviate as follows: 25: (85) call bpf_loop#181 ; use: fp2-8..-512 fp1-8..-512 fp0-8..-512 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-10-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:37 -07:00
Eduard Zingerman	6762e3a0bc	bpf: simplify liveness to use (callsite, depth) keyed func_instances Rework func_instance identification and remove the dynamic liveness API, completing the transition to fully static stack liveness analysis. Replace callchain-based func_instance keys with (callsite, depth) pairs. The full callchain (all ancestor callsites) is no longer part of the hash key; only the immediate callsite and the call depth matter. This does not lose precision in practice and simplifies the data structure significantly: struct callchain is removed entirely, func_instance stores just callsite, depth. Drop must_write_acc propagation. Previously, must_write marks were accumulated across successors and propagated to the caller via propagate_to_outer_instance(). Instead, callee entry liveness (live_before at subprog start) is pulled directly back to the caller's callsite in analyze_subprog() after each callee returns. Since (callsite, depth) instances are shared across different call chains that invoke the same subprog at the same depth, must_write marks from one call may be stale for another. To handle this, analyze_subprog() records into a fresh_instance() when the instance was already visited (must_write_initialized), then merge_instances() combines the results: may_read is unioned, must_write is intersected. This ensures only slots written on ALL paths through all call sites are marked as guaranteed writes. This replaces commit_stack_write_marks() logic. Skip recursive descent into callees that receive no FP-derived arguments (has_fp_args() check). This is needed because global subprogram calls can push depth beyond MAX_CALL_FRAMES (max depth is 64 for global calls but only 8 frames are accommodated for FP passing). It also handles the case where a callback subprog cannot be determined by argument tracking: such callbacks will be processed by analyze_subprog() at depth 0 independently. Update lookup_instance() (used by is_live_before queries) to search for the func_instance with maximal depth at the corresponding callsite, walking depth downward from frameno to 0. This accounts for the fact that instance depth no longer corresponds 1:1 to bpf_verifier_state->curframe, since skipped non-FP calls create gaps. Remove the dynamic public liveness API from verifier.c: - bpf_mark_stack_{read,write}(), bpf_reset/commit_stack_write_marks() - bpf_update_live_stack(), bpf_reset_live_stack_callchain() - All call sites in check_stack_{read,write}_fixed_off(), check_stack_range_initialized(), mark_stack_slot_obj_read(), mark/unmark_stack_slots_{dynptr,iter,irq_flag}() - The per-instruction write mark accumulation in do_check() - The bpf_update_live_stack() call in prepare_func_exit() mark_stack_read() and mark_stack_write() become static functions in liveness.c, called only from the static analysis pass. The func_instance->updated and must_write_dropped flags are removed. Remove spis_single_slot(), spis_one_bit() helpers from bpf_verifier.h as they are no longer used. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Tested-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-9-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:20 -07:00
Eduard Zingerman	fed53dbcdb	bpf: record arg tracking results in bpf_liveness masks After arg tracking reaches a fixed point, perform a single linear scan over the converged at_in[] state and translate each memory access into liveness read/write masks on the func_instance: - Load/store instructions: FP-derived pointer's frame and offset(s) are converted to half-slot masks targeting per_frame_masks->{may_read,must_write} - Helper/kfunc calls: record_call_access() queries bpf_helper_stack_access_bytes() / bpf_kfunc_stack_access_bytes() for each FP-derived argument to determine access size and direction. Unknown access size (S64_MIN) conservatively marks all slots from fp_off to fp+0 as read. - Imprecise pointers (frame == ARG_IMPRECISE): conservatively mark all slots in every frame covered by the pointer's frame bitmask as fully read. - Static subprog calls with unresolved arguments: conservatively mark all frames as fully read. Instead of a call to clean_live_states(), start cleaning the current state continuously as registers and stack become dead since the static analysis provides complete liveness information. This makes clean_live_states() and bpf_verifier_state->cleaned unnecessary. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-8-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:06:14 -07:00
Eduard Zingerman	bf0c571f7f	bpf: introduce forward arg-tracking dataflow analysis The analysis is a basis for static liveness tracking mechanism introduced by the next two commits. A forward fixed-point analysis that tracks which frame's FP each register value is derived from, and at what byte offset. This is needed because a callee can receive a pointer to its caller's stack frame (e.g. r1 = fp-16 at the call site), then do (u64 )(r1 + 0) inside the callee — a cross-frame stack access that the callee's local liveness must attribute to the caller's stack. Each register holds an arg_track value from a three-level lattice: - Precise {frame=N, off=[o1,o2,...]} — known frame index and up to 4 concrete byte offsets - Offset-imprecise {frame=N, off_cnt=0} — known frame, unknown offset - Fully-imprecise {frame=ARG_IMPRECISE, mask=bitmask} — unknown frame, mask says which frames might be involved At CFG merge points the lattice moves toward imprecision (same frame+offset stays precise, same frame different offsets merges offset sets or becomes offset-imprecise, different frames become fully-imprecise with OR'd bitmask). The analysis also tracks spills/fills to the callee's own stack (at_stack_in/out), so FP derived values spilled and reloaded. This pass is run recursively per call site: when subprog A calls B with specific FP-derived arguments, B is re-analyzed with those entry args. The recursion follows analyze_subprog -> compute_subprog_args -> (for each call insn) -> analyze_subprog. Subprogs that receive no FP-derived args are skipped during recursion and analyzed independently at depth 0. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-7-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:05:05 -07:00
Eduard Zingerman	8d3219f64d	bpf: prepare liveness internal API for static analysis pass Move the `updated` check and reset from bpf_update_live_stack() into update_instance() itself, so callers outside the main loop can reuse it. Similarly, move write_insn_idx assignment out of reset_stack_write_marks() into its public caller, and thread insn_idx as a parameter to commit_stack_write_marks() instead of reading it from liveness->write_insn_idx. Drop the unused `env` parameter from alloc_frame_masks() and mark_stack_read(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-6-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:05:03 -07:00
Eduard Zingerman	be23266b4a	bpf: 4-byte precise clean_verifier_state Migrate clean_verifier_state() and its liveness queries from 8-byte SPI granularity to 4-byte half-slot granularity. In __clean_func_state(), each SPI is cleaned in two independent halves: - half_spi 2i (lo): slot_type[0..3] - half_spi 2i+1 (hi): slot_type[4..7] Slot types STACK_DYNPTR, STACK_ITER and STACK_IRQ_FLAG are never cleaned, as their slot type markers are required by destroy_if_dynptr_stack_slot(), is_iter_reg_valid_uninit() and is_irq_flag_reg_valid_uninit() for correctness. When only the hi half is dead, spilled_ptr metadata is destroyed and the lo half's STACK_SPILL bytes are downgraded to STACK_MISC or STACK_ZERO. When only the lo half is dead, spilled_ptr is preserved because the hi half may still need it for state comparison. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-5-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:04:59 -07:00
Eduard Zingerman	7ca5f68cda	bpf: make liveness.c track stack with 4-byte granularity Convert liveness bitmask type from u64 to spis_t, doubling the number of trackable stack slots from 64 to 128 to support 4-byte granularity. Each 8-byte SPI now maps to two consecutive 4-byte sub-slots in the bitmask: spi2 half and spi2+1 half. In verifier.c, check_stack_write_fixed_off() now reports 4-byte aligned writes of 4-byte writes as half-slot marks and 8-byte aligned 8-byte writes as two slots. Similar logic applied in check_stack_read_fixed_off(). Queries (is_live_before) are not yet migrated to half-slot granularity. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-4-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:04:56 -07:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Eduard Zingerman	e40f5a6bf8	bpf: correct stack liveness for tail calls This updates bpf_insn_successors() reflecting that control flow might jump over the instructions between tail call and function exit, verifier might assume that some writes to parent stack always happen, which is not the case. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Link: https://lore.kernel.org/r/20251119160355.1160932-4-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 17:45:30 -08:00
Anton Protopopov	493d9e0d60	bpf, x86: add support for indirect jumps Add support for a new instruction BPF_JMP\|BPF_X\|BPF_JA, SRC=0, DST=Rx, off=0, imm=0 which does an indirect jump to a location stored in Rx. The register Rx should have type PTR_TO_INSN. This new type assures that the Rx register contains a value (or a range of values) loaded from a correct jump table – map of type instruction array. For example, for a C switch LLVM will generate the following code: 0: r3 = r1 # "switch (r3)" 1: if r3 > 0x13 goto +0x666 # check r3 boundaries 2: r3 <<= 0x3 # adjust to an index in array of addresses 3: r1 = 0xbeef ll # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M 5: r1 += r3 # r1 inherits boundaries from r3 6: r1 = (u64 )(r1 + 0x0) # r1 now has type INSN_TO_PTR 7: gotox r1 # jit will generate proper code Here the gotox instruction corresponds to one particular map. This is possible however to have a gotox instruction which can be loaded from different maps, e.g. 0: r1 &= 0x1 1: r2 <<= 0x3 2: r3 = 0x0 ll # load from map M_1 4: r3 += r2 5: if r1 == 0x0 goto +0x4 6: r1 <<= 0x3 7: r3 = 0x0 ll # load from map M_2 9: r3 += r1 A: r1 = (u64 )(r3 + 0x0) B: gotox r1 # jump to target loaded from M_1 or M_2 During check_cfg stage the verifier will collect all the maps which point to inside the subprog being verified. When building the config, the high 16 bytes of the insn_state are used, so this patch (theoretically) supports jump tables of up to 2^16 slots. During the later stage, in check_indirect_jump, it is checked that the register Rx was loaded from a particular instruction array. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251105090410.1250500-9-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:53:23 -08:00
Anton Protopopov	2f69c56854	bpf: make bpf_insn_successors to return a pointer The bpf_insn_successors() function is used to return successors to a BPF instruction. So far, an instruction could have 0, 1 or 2 successors. Prepare the verifier code to introduction of instructions with more than 2 successors (namely, indirect jumps). To do this, introduce a new struct, struct bpf_iarray, containing an array of bpf instruction indexes and make bpf_insn_successors to return a pointer of that type. The storage for all instructions is allocated in the env->succ, which holds an array of size 2, to be used for all instructions. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251019202145.3944697-10-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-21 11:20:23 -07:00
Shardul Bankar	96d31dff3f	bpf: Clarify get_outer_instance() handling in propagate_to_outer_instance() propagate_to_outer_instance() calls get_outer_instance() and uses the returned pointer to reset and commit stack write marks. Under normal conditions, update_instance() guarantees that an outer instance exists, so get_outer_instance() cannot return an ERR_PTR. However, explicitly checking for IS_ERR(outer_instance) makes this code more robust and self-documenting. It reduces cognitive load when reading the control flow and silences potential false-positive reports from static analysis or automated tooling. No functional change intended. Signed-off-by: Shardul Bankar <shardulsb08@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251021080849.860072-1-shardulsb08@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-21 09:39:05 -07:00
Shardul Bankar	f6fddc6df3	bpf: Fix memory leak in __lookup_instance error path When __lookup_instance() allocates a func_instance structure but fails to allocate the must_write_set array, it returns an error without freeing the previously allocated func_instance. This causes a memory leak of 192 bytes (sizeof(struct func_instance)) each time this error path is triggered. Fix by freeing 'result' on must_write_set allocation failure. Fixes: `b3698c356a` ("bpf: callchain sensitive stack liveness tracking using CFG") Reported-by: BPF Runtime Fuzzer (BRF) Signed-off-by: Shardul Bankar <shardulsb08@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20251016063330.4107547-1-shardulsb08@gmail.com	2025-10-16 10:45:17 -07:00
Eduard Zingerman	79f047c7d9	bpf: table based bpf_insn_successors() Converting bpf_insn_successors() to use lookup table makes it ~1.5 times faster. Also remove unnecessary conditionals: - `idx + 1 < prog->len` is unnecessary because after check_cfg() all jump targets are guaranteed to be within a program; - `i == 0 \|\| succ[0] != dst` is unnecessary because any client of bpf_insn_successors() can handle duplicate edges: - compute_live_registers() - compute_scc() Moving bpf_insn_successors() to liveness.c allows its inlining in liveness.c:__update_stack_liveness(). Such inlining speeds up __update_stack_liveness() by ~40%. bpf_insn_successors() is used in both verifier.c and liveness.c. perf shows such move does not negatively impact users in verifier.c, as these are executed only once before main varification pass. Unlike __update_stack_liveness() which can be triggered multiple times. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-10-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	b3698c356a	bpf: callchain sensitive stack liveness tracking using CFG This commit adds a flow-sensitive, context-sensitive, path-insensitive data flow analysis for live stack slots: - flow-sensitive: uses program control flow graph to compute data flow values; - context-sensitive: collects data flow values for each possible call chain in a program; - path-insensitive: does not distinguish between separate control flow graph paths reaching the same instruction. Compared to the current path-sensitive analysis, this approach trades some precision for not having to enumerate every path in the program. This gives a theoretical capability to run the analysis before main verification pass. See cover letter for motivation. The basic idea is as follows: - Data flow values indicate stack slots that might be read and stack slots that are definitely written. - Data flow values are collected for each (call chain, instruction number) combination in the program. - Within a subprogram, data flow values are propagated using control flow graph. - Data flow values are transferred from entry instructions of callee subprograms to call sites in caller subprograms. In other words, a tree of all possible call chains is constructed. Each node of this tree represents a subprogram. Read and write marks are collected for each instruction of each node. Live stack slots are first computed for lower level nodes. Then, information about outer stack slots that might be read or are definitely written by a subprogram is propagated one level up, to the corresponding call instructions of the upper nodes. Procedure repeats until root node is processed. In the absence of value range analysis, stack read/write marks are collected during main verification pass, and data flow computation is triggered each time verifier.c:states_equal() needs to query the information. Implementation details are documented in kernel/bpf/liveness.c. Quantitative data about verification performance changes and memory consumption is in the cover letter. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-6-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00

18 Commits