linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-18 06:44:00 -04:00

Author	SHA1	Message	Date
Xu Kuohai	d9ef13f727	bpf: Pass bpf_verifier_env to JIT Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will use env->insn_aux_data in the JIT stage to detect indirect jump targets. Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c code without verifier, introduce helper __bpf_prog_select_runtime() to accept the env parameter. Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and switch to call __bpf_prog_select_runtime() in the verifier, with env variable passed. The original bpf_prog_select_runtime() is preserved for cbpf and lib/test_bpf.c, where env is NULL. Now all constants blinding calls are moved into the verifier, except the cbpf and lib/test_bpf.c cases. The instructions arrays are adjusted by bpf_patch_insn_data() function for normal cases, so there is no need to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it. Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8 Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12 Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14 Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260416064341.151802-3-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-16 07:03:40 -07:00
Xu Kuohai	d3e945223e	bpf: Move constants blinding out of arch-specific JITs During the JIT stage, constants blinding rewrites instructions but only rewrites the private instruction copy of the JITed subprog, leaving the global env->prog->insnsi and env->insn_aux_data untouched. This causes a mismatch between subprog instructions and the global state, making it difficult to use the global data in the JIT. To avoid this mismatch, and given that all arch-specific JITs already support constants blinding, move it to the generic verifier code, and switch to rewrite the global env->prog->insnsi with the global states adjusted, as other rewrites in the verifier do. This removes the constants blinding calls in each JIT, which are largely duplicated code across architectures. Since constants blinding is only required for JIT, and there are two JIT entry functions, jit_subprogs() for BPF programs with multiple subprogs and bpf_prog_select_runtime() for programs with no subprogs, move the constants blinding invocation into these two functions. In the verifier path, bpf_patch_insn_data() is used to keep global verifier auxiliary data in sync with patched instructions. A key question is whether this global auxiliary data should be restored on the failure path. Besides instructions, bpf_patch_insn_data() adjusts: - prog->aux->poke_tab - env->insn_array_maps - env->subprog_info - env->insn_aux_data For prog->aux->poke_tab, it is only used by JIT or only meaningful after JIT succeeds, so it does not need to be restored on the failure path. For env->insn_array_maps, when JIT fails, programs using insn arrays are rejected by bpf_insn_array_ready() due to missing JIT addresses. Hence, env->insn_array_maps is only meaningful for JIT and does not need to be restored. For subprog_info, if jit_subprogs fails and CONFIG_BPF_JIT_ALWAYS_ON is not enabled, kernel falls back to interpreter. In this case, env->subprog_info is used to determine subprogram stack depth. So it must be restored on failure. For env->insn_aux_data, it is freed by clear_insn_aux_data() at the end of bpf_check(). Before freeing, clear_insn_aux_data() loops over env->insn_aux_data to release jump targets recorded in it. The loop uses env->prog->len as the array length, but this length no longer matches the actual size of the adjusted env->insn_aux_data array after constants blinding. To address it, a simple approach is to keep insn_aux_data as adjusted after failure, since it will be freed shortly, and record its actual size for the loop in clear_insn_aux_data(). But since clear_insn_aux_data() uses the same index to loop over both env->prog->insnsi and env->insn_aux_data, this approach results in incorrect index for the insnsi array. So an alternative approach is adopted: clone the original env->insn_aux_data before blinding and restore it after failure, similar to env->prog. For classic BPF programs, constants blinding works as before since it is still invoked from bpf_prog_select_runtime(). Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8 Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> # powerpc jit Reviewed-by: Pu Lehui <pulehui@huawei.com> # riscv jit Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # loongarch jit Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260416064341.151802-2-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-16 07:03:40 -07:00
Christophe Leroy	e60adf5132	bpf: Take return from set_memory_rox() into account with bpf_jit_binary_lock_ro() set_memory_rox() can fail, leaving memory unprotected. Check return and bail out when bpf_jit_binary_lock_ro() returns an error. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> # LoongArch Reviewed-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> # MIPS Part Message-ID: <036b6393f23a2032ce75a1c92220b2afcb798d5d.1709850515.git.christophe.leroy@csgroup.eu> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-03-14 19:28:52 -07:00
Jiaxun Yang	bbefef2f07	bpf, mips: Implement DADDI workarounds for JIT For DADDI errata we just workaround by disable immediate operation for BPF_ADD / BPF_SUB to avoid generation of DADDIU. All other use cases in JIT won't cause overflow thus they are all safe. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Link: https://lore.kernel.org/bpf/20230228113305.83751-2-jiaxun.yang@flygoat.com	2023-02-28 14:52:40 +01:00
Johan Almbladh	72570224bb	mips, bpf: Add JIT workarounds for CPU errata This patch adds workarounds for the following CPU errata to the MIPS eBPF JIT, if enabled in the kernel configuration. - R10000 ll/sc weak ordering - Loongson-3 ll/sc weak ordering - Loongson-2F jump hang The Loongson-2F nop errata is implemented in uasm, which the JIT uses, so no additional mitigations are needed for that. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Link: https://lore.kernel.org/bpf/20211005165408.2305108-6-johan.almbladh@anyfinetworks.com	2021-10-06 12:28:25 -07:00
Johan Almbladh	eb63cfcd2e	mips, bpf: Add eBPF JIT for 32-bit MIPS This is an implementation of an eBPF JIT for 32-bit MIPS I-V and MIPS32. The implementation supports all 32-bit and 64-bit ALU and JMP operations, including the recently-added atomics. 64-bit div/mod and 64-bit atomics are implemented using function calls to math64 and atomic64 functions, respectively. All 32-bit operations are implemented natively by the JIT, except if the CPU lacks ll/sc instructions. Register mapping ================ All 64-bit eBPF registers are mapped to native 32-bit MIPS register pairs, and does not use any stack scratch space for register swapping. This means that all eBPF register data is kept in CPU registers all the time, and this simplifies the register management a lot. It also reduces the JIT's pressure on temporary registers since we do not have to move data around. Native register pairs are ordered according to CPU endiannes, following the O32 calling convention for passing 64-bit arguments and return values. The eBPF return value, arguments and callee-saved registers are mapped to their native MIPS equivalents. Since the 32 highest bits in the eBPF FP (frame pointer) register are always zero, only one general-purpose register is actually needed for the mapping. The MIPS fp register is used for this purpose. The high bits are mapped to MIPS register r0. This saves us one CPU register, which is much needed for temporaries, while still allowing us to treat the R10 (FP) register just like any other eBPF register in the JIT. The MIPS gp (global pointer) and at (assembler temporary) registers are used as internal temporary registers for constant blinding. CPU registers t6-t9 are used internally by the JIT when constructing more complex 64-bit operations. This is precisely what is needed - two registers to store an operand value, and two more as scratch registers when performing the operation. The register mapping is shown below. R0 - $v1, $v0 return value R1 - $a1, $a0 argument 1, passed in registers R2 - $a3, $a2 argument 2, passed in registers R3 - $t1, $t0 argument 3, passed on stack R4 - $t3, $t2 argument 4, passed on stack R5 - $t4, $t3 argument 5, passed on stack R6 - $s1, $s0 callee-saved R7 - $s3, $s2 callee-saved R8 - $s5, $s4 callee-saved R9 - $s7, $s6 callee-saved FP - $r0, $fp 32-bit frame pointer AX - $gp, $at constant-blinding $t6 - $t9 unallocated, JIT temporaries Jump offsets ============ The JIT tries to map all conditional JMP operations to MIPS conditional PC-relative branches. The MIPS branch offset field is 18 bits, in bytes, which is equivalent to the eBPF 16-bit instruction offset. However, since the JIT may emit more than one CPU instruction per eBPF instruction, the field width may overflow. If that happens, the JIT converts the long conditional jump to a short PC-relative branch with the condition inverted, jumping over a long unconditional absolute jmp (j). This conversion will change the instruction offset mapping used for jumps, and may in turn result in more branch offset overflows. The JIT therefore dry-runs the translation until no more branches are converted and the offsets do not change anymore. There is an upper bound on this of course, and if the JIT hits that limit, the last two iterations are run with all branches being converted. Tail call count =============== The current tail call count is stored in the 16-byte area of the caller's stack frame that is reserved for the callee in the o32 ABI. The value is initialized in the prologue, and propagated to the tail-callee by skipping the initialization instructions when emitting the tail call. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211005165408.2305108-4-johan.almbladh@anyfinetworks.com	2021-10-06 12:28:14 -07:00

6 Commits