Recently Andy changed the 64-bit syscall logic so that
pt_regs->ax is initially set to -ENOSYS, and on syscall exit,
it is updated with the actual return value. This simplified
the logic there.
This patch does the same for 32-bit syscall entry points.
The check for %rax being too big is moved to be just before
the call instruction which dispatches execution through the
syscall table.
There is no way to accidentally skip this check now by jumping
to a label after it. This allows us to remove redundant checks
after ptrace et al.
If %rax is too big, we just skip over the (call, write %rax to
pt_regs->ax) instruction pair. pt_regs->ax remains set to -ENOSYS,
and it gets returned to userspace.
Similar to 64-bit code, this eliminates the "ia32_badsys" code path.
Run-tested.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429632194-13445-2-git-send-email-dvlasenk@redhat.com
[ Changelog massage. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This change makes the check exact (no more false positives
on "negative" addresses).
Andy explains:
"Canonical addresses either start with 17 zeros or 17 ones.
In the old code, we checked that the top (64-47) = 17 bits were all
zero. We did this by shifting right by 47 bits and making sure that
nothing was left.
In the new code, we're shifting left by (64 - 48) = 16 bits and then
signed shifting right by the same amount, this propagating the 17th
highest bit to all positions to its left. If we get the same value we
started with, then we're good to go."
While it isn't really important to be fully correct here -
almost all addresses we'll ever see will be userspace ones,
but OTOH it looks to be cheap enough: the new code uses
two more ALU ops but preserves %rcx, allowing to not
reload it from pt_regs->cx again.
On disassembly level, the changes are:
cmp %rcx,0x80(%rsp) -> mov 0x80(%rsp),%r11; cmp %rcx,%r11
shr $0x2f,%rcx -> shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
mov 0x58(%rsp),%rcx -> (eliminated)
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429633649-20169-1-git-send-email-dvlasenk@redhat.com
[ Changelog massage. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The core_pmu does not define cpu_* callbacks, which handles
allocation of 'struct cpu_hw_events::shared_regs' data,
initialization of debug store and PMU_FL_EXCL_CNTRS counters.
While this probably won't happen on bare metal, virtual CPU can
define x86_pmu.extra_regs together with PMU version 1 and thus
be using core_pmu -> using shared_regs data without it being
allocated. That could could leave to following panic:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8152cd4f>] _spin_lock_irqsave+0x1f/0x40
SNIP
[<ffffffff81024bd9>] __intel_shared_reg_get_constraints+0x69/0x1e0
[<ffffffff81024deb>] intel_get_event_constraints+0x9b/0x180
[<ffffffff8101e815>] x86_schedule_events+0x75/0x1d0
[<ffffffff810586dc>] ? check_preempt_curr+0x7c/0x90
[<ffffffff810649fe>] ? try_to_wake_up+0x24e/0x3e0
[<ffffffff81064ba2>] ? default_wake_function+0x12/0x20
[<ffffffff8109eb16>] ? autoremove_wake_function+0x16/0x40
[<ffffffff810577e9>] ? __wake_up_common+0x59/0x90
[<ffffffff811a9517>] ? __d_lookup+0xa7/0x150
[<ffffffff8119db5f>] ? do_lookup+0x9f/0x230
[<ffffffff811a993a>] ? dput+0x9a/0x150
[<ffffffff8119c8f5>] ? path_to_nameidata+0x25/0x60
[<ffffffff8119e90a>] ? __link_path_walk+0x7da/0x1000
[<ffffffff8101d8f9>] ? x86_pmu_add+0xb9/0x170
[<ffffffff8101d7a7>] x86_pmu_commit_txn+0x67/0xc0
[<ffffffff811b07b0>] ? mntput_no_expire+0x30/0x110
[<ffffffff8119c731>] ? path_put+0x31/0x40
[<ffffffff8107c297>] ? current_fs_time+0x27/0x30
[<ffffffff8117d170>] ? mem_cgroup_get_reclaim_stat_from_page+0x20/0x70
[<ffffffff8111b7aa>] group_sched_in+0x13a/0x170
[<ffffffff81014a29>] ? sched_clock+0x9/0x10
[<ffffffff8111bac8>] ctx_sched_in+0x2e8/0x330
[<ffffffff8111bb7b>] perf_event_sched_in+0x6b/0xb0
[<ffffffff8111bc36>] perf_event_context_sched_in+0x76/0xc0
[<ffffffff8111eb3b>] perf_event_comm+0x1bb/0x2e0
[<ffffffff81195ee9>] set_task_comm+0x69/0x80
[<ffffffff81195fe1>] setup_new_exec+0xe1/0x2e0
[<ffffffff811ea68e>] load_elf_binary+0x3ce/0x1ab0
Adding cpu_(prepare|starting|dying) for core_pmu to have
shared_regs data allocated for core_pmu. AFAICS there's no harm
to initialize debug store and PMU_FL_EXCL_CNTRS either for
core_pmu.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20150421152623.GC13169@krava.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
During some code analysis I realized that atomic_add(), atomic_sub()
and friends are not necessarily inlined AND that each function
is defined multiple times:
atomic_inc: 544 duplicates
atomic_dec: 215 duplicates
atomic_dec_and_test: 107 duplicates
atomic64_inc: 38 duplicates
[...]
Each definition is exact equally, e.g.:
ffffffff813171b8 <atomic_add>:
55 push %rbp
48 89 e5 mov %rsp,%rbp
f0 01 3e lock add %edi,(%rsi)
5d pop %rbp
c3 retq
In turn each definition has one or more callsites (sure):
ffffffff81317c78: e8 3b f5 ff ff callq ffffffff813171b8 <atomic_add> [...]
ffffffff8131a062: e8 51 d1 ff ff callq ffffffff813171b8 <atomic_add> [...]
ffffffff8131a190: e8 23 d0 ff ff callq ffffffff813171b8 <atomic_add> [...]
The other way around would be to remove the static linkage - but
I prefer an enforced inlining here.
Before:
text data bss dec hex filename
81467393 19874720 20168704 121510817 73e1ba1 vmlinux.orig
After:
text data bss dec hex filename
81461323 19874720 20168704 121504747 73e03eb vmlinux.inlined
Yes, the inlining here makes the kernel even smaller! ;)
Linus further observed:
"I have this memory of having seen that before - the size
heuristics for gcc getting confused by inlining.
[...]
It might be a good idea to mark things that are basically just
wrappers around a single (or a couple of) asm instruction to be
always_inline."
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1429565231-4609-1-git-send-email-hagen@jauu.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The host's decision to enable machine check exceptions should remain
in force during non-root mode. KVM was writing 0 to cr4 on VCPU reset
and passed a slightly-modified 0 to the vmcs.guest_cr4 value.
Tested: Built.
On earlier version, tested by injecting machine check
while a guest is spinning.
Before the change, if guest CR4.MCE==0, then the machine check is
escalated to Catastrophic Error (CATERR) and the machine dies.
If guest CR4.MCE==1, then the machine check causes VMEXIT and is
handled normally by host Linux. After the change, injecting a machine
check causes normal Linux machine check handling.
Signed-off-by: Ben Serebrin <serebrin@google.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull char/misc driver updates from Greg KH:
"Here's the big char/misc driver patchset for 4.1-rc1.
Lots of different driver subsystem updates here, nothing major, full
details are in the shortlog.
All of this has been in linux-next for a while"
* tag 'char-misc-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (133 commits)
mei: trace: remove unused TRACE_SYSTEM_STRING
DTS: ARM: OMAP3-N900: Add lis3lv02d support
Documentation: DT: lis302: update wakeup binding
lis3lv02d: DT: add wakeup unit 2 and wakeup threshold
lis3lv02d: DT: use s32 to support negative values
Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case
Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case
mei: replace check for connection instead of transitioning
mei: use mei_cl_is_connected consistently
mei: fix mei_poll operation
hv_vmbus: Add gradually increased delay for retries in vmbus_post_msg()
Drivers: hv: hv_balloon: survive ballooning request with num_pages=0
Drivers: hv: hv_balloon: eliminate jumps in piecewiese linear floor function
Drivers: hv: hv_balloon: do not online pages in offline blocks
hv: remove the per-channel workqueue
hv: don't schedule new works in vmbus_onoffer()/vmbus_onoffer_rescind()
hv: run non-blocking message handlers in the dispatch tasklet
coresight: moving to new "hwtracing" directory
coresight-tmc: Adding a status interface to sysfs
coresight: remove the unnecessary configuration coresight-default-sink
...
Pull tty/serial updates from Greg KH:
"Here's the big tty/serial driver update for 4.1-rc1.
It was delayed for a bit due to some questions surrounding some of the
console command line parsing changes that are in here. There's still
one tiny regression for people who were previously putting multiple
console command lines and expecting them all to be ignored for some
odd reason, but Peter is working on fixing that. If not, I'll send a
revert for the offending patch, but I have faith that Peter can
address it.
Other than the console work here, there's the usual serial driver
updates and changes, and a buch of 8250 reworks to try to make that
driver easier to maintain over time, and have it support more devices
in the future.
All of these have been in linux-next for a while"
* tag 'tty-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (119 commits)
n_gsm: Drop unneeded cast on netdev_priv
sc16is7xx: expose RTS inversion in RS-485 mode
serial: 8250_pci: port failed after wakeup from S3
earlycon: 8250: Document kernel command line options
earlycon: 8250: Fix command line regression
earlycon: Fix __earlycon_table stride
tty: clean up the tty time logic a bit
serial: 8250_dw: only get the clock rate in one place
serial: 8250_dw: remove useless ACPI ID check
dmaengine: hsu: move memory allocation to GFP_NOWAIT
dmaengine: hsu: remove redundant pieces of code
serial: 8250_pci: add Intel Tangier support
dmaengine: hsu: add Intel Tangier PCI ID
serial: 8250_pci: replace switch-case by formula for Intel MID
serial: 8250_pci: replace switch-case by formula
tty: cpm_uart: replace CONFIG_8xx by CONFIG_CPM1
serial: jsm: some off by one bugs
serial: xuartps: Fix check in console_setup().
serial: xuartps: Get rid of register access macros.
serial: xuartps: Fix iobase use.
...
Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
"This is the final removal (after several years!) of the obsolete
cpus_* functions, prompted by their mis-use in staging.
With these function removed, all cpu functions should only iterate to
nr_cpu_ids, so we finally only allocate that many bits when cpumasks
are allocated offstack"
* tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
cpumask: remove __first_cpu / __next_cpu
cpumask: resurrect CPU_MASK_CPU0
linux/cpumask.h: add typechecking to cpumask_test_cpu
cpumask: only allocate nr_cpumask_bits.
Fix weird uses of num_online_cpus().
cpumask: remove deprecated functions.
mips: fix obsolete cpumask_of_cpu usage.
x86: fix more deprecated cpu function usage.
ia64: remove deprecated cpus_ usage.
powerpc: fix deprecated CPU_MASK_CPU0 usage.
CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
staging/lustre/o2iblnd: Don't use cpus_weight
staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
blackfin: fix up obsolete cpu function usage.
parisc: fix up obsolete cpu function usage.
tile: fix up obsolete cpu function usage.
arm64: fix up obsolete cpu function usage.
mips: fix up obsolete cpu function usage.
x86: fix up obsolete cpu function usage.
...
Pavel Machek reported the following compiler warning on
x86/32 CONFIG_HIGHMEM64G=y builds:
arch/x86/mm/ioremap.c:344:10: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Clean up the types in this function by using a single natural type for
internal calculations (unsigned long), to make it more apparent what's
happening, and also to remove fragile casts.
Reported-by: Pavel Machek <pavel@ucw.cz>
Cc: jgross@suse.com
Cc: roland@purestorage.com
Link: http://lkml.kernel.org/r/20150416080440.GA507@amd
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull turbostat update from Len Brown:
"Updates to the turbostat utility.
Just one kernel dependency in this batch -- added a #define to
msr-index.h"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: correct dumped pkg-cstate-limit value
tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL
tools/power turbostat: correct DRAM RAPL units on recent Xeon processors
tools/power turbostat: Initial Skylake support
tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile
tools/power turbostat: modprobe msr, if needed
tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2
tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names
x86 msr-index: define MSR_TURBO_RATIO_LIMIT,1,2
tools/power turbostat: label base frequency
tools/power turbostat: update PERF_LIMIT_REASONS decoding
tools/power turbostat: simplify default output
Skylake adds some additional residency counters.
Skylake supports a different mix of RAPL registers
from any previous product.
In most other ways, Skylake is like Broadwell.
Signed-off-by: Len Brown <len.brown@intel.com>
Pull PMEM driver from Ingo Molnar:
"This is the initial support for the pmem block device driver:
persistent non-volatile memory space mapped into the system's physical
memory space as large physical memory regions.
The driver is based on Intel code, written by Ross Zwisler, with fixes
by Boaz Harrosh, integrated with x86 e820 memory resource management
and tidied up by Christoph Hellwig.
Note that there were two other separate pmem driver submissions to
lkml: but apparently all parties (Ross Zwisler, Boaz Harrosh) are
reasonably happy with this initial version.
This version enables minimal support that enables persistent memory
devices out in the wild to work as block devices, identified through a
magic (non-standard) e820 flag and auto-discovered if
CONFIG_X86_PMEM_LEGACY=y, or added explicitly through manipulating the
memory maps via the "memmap=..." boot option with the new, special '!'
modifier character.
Limitations: this is a regular block device, and since the pmem areas
are not struct page backed, they are invisible to the rest of the
system (other than the block IO device), so direct IO to/from pmem
areas, direct mmap() or XIP is not possible yet. The page cache will
also shadow and double buffer pmem contents, etc.
Initial support is for x86"
* 'x86-pmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
drivers/block/pmem: Fix 32-bit build warning in pmem_alloc()
drivers/block/pmem: Add a driver for persistent memory
x86/mm: Add support for the non-standard protected e820 type
Pull x86 fixes from Ingo Molnar:
"This tree includes:
- an FPU related crash fix
- a ptrace fix (with matching testcase in tools/testing/selftests/)
- an x86 Kconfig DMA-config defaults tweak to better avoid
non-working drivers"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected
x86/fpu: Load xsave pointer *after* initialization
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal()
x86, selftests: Add single_step_syscall test
Pull perf updates from Ingo Molnar:
"This update has mostly fixes, but also other bits:
- perf tooling fixes
- PMU driver fixes
- Intel Broadwell PMU driver HW-enablement for LBR callstacks
- a late coming 'perf kmem' tool update that enables it to also
analyze page allocation data. Note, this comes with MM tracepoint
changes that we believe to not break anything: because it changes
the formerly opaque 'struct page *' field that uniquely identifies
pages to 'pfn' which identifies pages uniquely too, but isn't as
opaque and can be used for other purposes as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/pt: Fix and clean up error handling in pt_event_add()
perf/x86/intel: Add Broadwell support for the LBR callstack
perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units
perf/x86/intel: Fix Core2,Atom,NHM,WSM cycles:pp events
perf/x86: Fix hw_perf_event::flags collision
perf probe: Fix segfault when probe with lazy_line to file
perf probe: Find compilation directory path for lazy matching
perf probe: Set retprobe flag when probe in address-based alternative mode
perf kmem: Analyze page allocator events also
tracing, mm: Record pfn instead of pointer to struct page
Dan Carpenter reported that pt_event_add() has buggy
error handling logic: it returns 0 instead of -EBUSY when
it fails to start a newly added event.
Furthermore, the control flow in this function is messy,
with cleanup labels mixed with direct returns.
Fix the bug and clean up the code by converting it to
a straight fast path for the regular non-failing case,
plus a clear sequence of cascading goto labels to do
all cleanup.
NOTE: I materially changed the existing clean up logic in the
pt_event_start() failure case to use the direct
perf_aux_output_end() path, not pt_event_del(), because
perf_aux_output_end() is enough here.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150416103830.GB7847@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Until now, the EFI stub was only setting the 32 bit cmd_line_ptr in
the setup_header structure, so on 64 bit platforms this could be truncated.
This patch adds setting the upper bits of the buffer address in
ext_cmd_line_ptr. This case was likely never hit, as the allocation
for this buffer is done at the lowest available address. Only
x86_64 kernels have this problem, as the 1-1 mapping mandated
by EFI ensures that all memory is 32 bit addressable on 32 bit
platforms. The EFI stub does not support mixed mode, so the
32 bit kernel on 64 bit firmware case does not need to be handled.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Merge third patchbomb from Andrew Morton:
- various misc things
- a couple of lib/ optimisations
- provide DIV_ROUND_CLOSEST_ULL()
- checkpatch updates
- rtc tree
- befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs
- ptrace fixes
- fork() fixes
- seccomp cleanups
- more mmap_sem hold time reductions from Davidlohr
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (138 commits)
proc: show locks in /proc/pid/fdinfo/X
docs: add missing and new /proc/PID/status file entries, fix typos
drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
Documentation/spi/spidev_test.c: fix warning
drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
.gitignore: ignore *.tar
MAINTAINERS: add Mediatek SoC mailing list
tomoyo: reduce mmap_sem hold for mm->exe_file
powerpc/oprofile: reduce mmap_sem hold for exe_file
oprofile: reduce mmap_sem hold for mm->exe_file
mips: ip32: add platform data hooks to use DS1685 driver
lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
x86: switch to using asm-generic for seccomp.h
sparc: switch to using asm-generic for seccomp.h
powerpc: switch to using asm-generic for seccomp.h
parisc: switch to using asm-generic for seccomp.h
mips: switch to using asm-generic for seccomp.h
microblaze: use asm-generic for seccomp.h
arm: use asm-generic for seccomp.h
seccomp: allow COMPAT sigreturn overrides
...
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. The obsolete sigreturn syscall override
is retained in 32-bit mode, and the ia32 syscall overrides are used in
the compat case. Remaining definitions were identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So I was playing with gdb today and did this simple thing:
gdb /bin/ls
...
(gdb) run
Box exploded with this splat:
BUG: unable to handle kernel NULL pointer dereference at 00000000000001d0
IP: [<ffffffff8100fe5a>] xstateregs_get+0x7a/0x120
[...]
Call Trace:
ptrace_regset
ptrace_request
? wait_task_inactive
? preempt_count_sub
arch_ptrace
? ptrace_get_task_struct
SyS_ptrace
system_call_fastpath
... because we do cache &target->thread.fpu.state->xsave into the
local variable xsave but that pointer is NULL at that time and
it gets initialized later, in init_fpu(), see:
e7f180dcd8 ("x86/fpu: Change xstateregs_get()/set() to use ->xsave.i387 rather than ->fxsave")
The fix is simple: load xsave *after* init_fpu() has run.
Also do the same in xstateregs_set(), as suggested by Oleg Nesterov.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1429209697-5902-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
RAPL energy hardware unit can vary within a single CPU package, e.g.
HSW server DRAM has a fixed energy unit of 15.3 uJ (2^-16) whereas
the unit on other domains can be enumerated from power unit MSR.
There might be other variations in the future, this patch adds
per cpu model quirk to allow special handling of certain cpus.
hw_unit is also removed from per cpu data since it is not per cpu
and the sampling rate for energy counter is typically not high.
Without this patch, DRAM domain on HSW servers will be counted
4x higher than the real energy counter.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1427405325-780-1-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo reported that cycles:pp didn't work for him on some machines.
It turns out that in this commit:
af4bdcf675 perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events
Andi forgot to explicitly allow that event when he
disabled event flags for PEBS on those uarchs.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: af4bdcf675 ("perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull xen features and fixes from David Vrabel:
- use a single source list of hypercalls, generating other tables etc.
at build time.
- add a "Xen PV" APIC driver to support >255 VCPUs in PV guests.
- significant performance improve to guest save/restore/migration.
- scsiback/front save/restore support.
- infrastructure for multi-page xenbus rings.
- misc fixes.
* tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Try harder to get PXM information for Xen
xenbus_client: Extend interface to support multi-page ring
xen-pciback: also support disabling of bus-mastering and memory-write-invalidate
xen: support suspend/resume in pvscsi frontend
xen: scsiback: add LUN of restored domain
xen-scsiback: define a pr_fmt macro with xen-pvscsi
xen/mce: fix up xen_late_init_mcelog() error handling
xen/privcmd: improve performance of MMAPBATCH_V2
xen: unify foreign GFN map/unmap for auto-xlated physmap guests
x86/xen/apic: WARN with details.
x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
xen/pciback: Don't print scary messages when unsupported by hypervisor.
xen: use generated hypercall symbols in arch/x86/xen/xen-head.S
xen: use generated hypervisor symbols in arch/x86/xen/trace.c
xen: synchronize include/xen/interface/xen.h with xen
xen: build infrastructure for generating hypercall depending symbols
xen: balloon: Use static attribute groups for sysfs entries
xen: pcpu: Use static attribute groups for sysfs entry
When the TIF_SINGLESTEP tracee dequeues a signal,
handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but
leaves TIF_SINGLESTEP set.
If the tracer does PTRACE_SINGLESTEP again, enable_single_step()
sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the
subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the
tracee gets the wrong SIGTRAP.
Test-case (needs -O2 to avoid prologue insns in signal handler):
#include <unistd.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <assert.h>
#include <stddef.h>
void handler(int n)
{
asm("nop");
}
int child(void)
{
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
signal(SIGALRM, handler);
kill(getpid(), SIGALRM);
return 0x23;
}
void *getip(int pid)
{
return (void*)ptrace(PTRACE_PEEKUSER, pid,
offsetof(struct user, regs.rip), 0);
}
int main(void)
{
int pid, status;
pid = fork();
if (!pid)
return child();
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM);
assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0);
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
assert((getip(pid) - (void*)handler) == 0);
assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0);
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
assert((getip(pid) - (void*)handler) == 1);
assert(ptrace(PTRACE_CONT, pid, 0,0) == 0);
assert(wait(&status) == pid);
assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23);
return 0;
}
The last assert() fails because PTRACE_CONT wrongly triggers
another single-step and X86_EFLAGS_TF can't be cleared by
debugger until the tracee does sys_rt_sigreturn().
Change handle_signal() to do user_disable_single_step() if
stepping, we do not need to preserve TIF_SINGLESTEP because we
are going to do ptrace_notify(), and it is simply wrong to leak
this bit.
While at it, change the comment to explain why we also need to
clear TF unconditionally after setup_rt_frame().
Note: in the longer term we should probably change
setup_sigcontext() to use get_flags() and then just remove this
user_disable_single_step(). And, the state of TIF_FORCED_TF can
be wrong after restore_sigcontext() which can set/clear TF, this
needs another fix.
This fix fixes the 'single_step_syscall_32' testcase in
the x86 testsuite:
Before:
~/linux/tools/testing/selftests/x86> ./single_step_syscall_32
[RUN] Set TF and check nop
[OK] Survived with TF set and 9 traps
[RUN] Set TF and check int80
[OK] Survived with TF set and 9 traps
[RUN] Set TF and check a fast syscall
[WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0
Trace/breakpoint trap (core dumped)
After:
~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32
[RUN] Set TF and check nop
[OK] Survived with TF set and 9 traps
[RUN] Set TF and check int80
[OK] Survived with TF set and 9 traps
[RUN] Set TF and check a fast syscall
[OK] Survived with TF set and 39 traps
[RUN] Fast syscall with TF cleared
[OK] Nothing unexpected happened
Reported-by: Evan Teran <eteran@alum.rit.edu>
Reported-by: Pedro Alves <palves@redhat.com>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ Added x86 self-test info. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge second patchbomb from Andrew Morton:
- the rest of MM
- various misc bits
- add ability to run /sbin/reboot at reboot time
- printk/vsprintf changes
- fiddle with seq_printf() return value
* akpm: (114 commits)
parisc: remove use of seq_printf return value
lru_cache: remove use of seq_printf return value
tracing: remove use of seq_printf return value
cgroup: remove use of seq_printf return value
proc: remove use of seq_printf return value
s390: remove use of seq_printf return value
cris fasttimer: remove use of seq_printf return value
cris: remove use of seq_printf return value
openrisc: remove use of seq_printf return value
ARM: plat-pxa: remove use of seq_printf return value
nios2: cpuinfo: remove use of seq_printf return value
microblaze: mb: remove use of seq_printf return value
ipc: remove use of seq_printf return value
rtc: remove use of seq_printf return value
power: wakeup: remove use of seq_printf return value
x86: mtrr: if: remove use of seq_printf return value
linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
MAINTAINERS: CREDITS: remove Stefano Brivio from B43
.mailmap: add Ricardo Ribalda
CREDITS: add Ricardo Ribalda Delgado
...
Pull exec domain removal from Richard Weinberger:
"This series removes execution domain support from Linux.
The idea behind exec domains was to support different ABIs. The
feature was never complete nor stable. Let's rip it out and make the
kernel signal handling code less complicated"
* 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
arm64: Removed unused variable
sparc: Fix execution domain removal
Remove rest of exec domains.
arch: Remove exec_domain from remaining archs
arc: Remove signal translation and exec_domain
xtensa: Remove signal translation and exec_domain
xtensa: Autogenerate offsets in struct thread_info
x86: Remove signal translation and exec_domain
unicore32: Remove signal translation and exec_domain
um: Remove signal translation and exec_domain
tile: Remove signal translation and exec_domain
sparc: Remove signal translation and exec_domain
sh: Remove signal translation and exec_domain
s390: Remove signal translation and exec_domain
mn10300: Remove signal translation and exec_domain
microblaze: Remove signal translation and exec_domain
m68k: Remove signal translation and exec_domain
m32r: Remove signal translation and exec_domain
m32r: Autogenerate offsets in struct thread_info
frv: Remove signal translation and exec_domain
...
Pull UML updates from Richard Weinberger:
- hostfs saw a face lifting
- old/broken stuff was removed (SMP, HIGHMEM, SKAS3/4)
- random cleanups and bug fixes
* tag 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (26 commits)
um: Print minimum physical memory requirement
um: Move uml_postsetup in the init_thread stack
um: add a kmsg_dumper
x86, UML: fix integer overflow in ELF_ET_DYN_BASE
um: hostfs: Reduce number of syscalls in readdir
um: Remove broken highmem support
um: Remove broken SMP support
um: Remove SKAS3/4 support
um: Remove ppc cruft
um: Remove ia64 cruft
um: Remove dead code from stacktrace
hostfs: No need to box and later unbox the file mode
hostfs: Use page_offset()
hostfs: Set page flags in hostfs_readpage() correctly
hostfs: Remove superfluous initializations in hostfs_open()
hostfs: hostfs_open: Reset open flags upon each retry
hostfs: Remove superfluous test in hostfs_open()
hostfs: Report append flag in ->show_options()
hostfs: Use __getname() in follow_link
hostfs: Remove open coded strcpy()
...
Pull kbuild updates from Michal Marek:
"Here is the first round of kbuild changes for v4.1-rc1:
- kallsyms fix for ARM and cleanup
- make dep(end) removed (developers have no sense of nostalgia these
days...)
- include Makefiles by relative path
- stop useless rebuilds of asm-offsets.h and bounds.h"
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Kbuild: kallsyms: drop special handling of pre-3.0 GCC symbols
Kbuild: kallsyms: ignore veneers emitted by the ARM linker
kbuild: ia64: use $(src)/Makefile.gate rather than particular path
kbuild: include $(src)/Makefile rather than $(obj)/Makefile
kbuild: use relative path more to include Makefile
kbuild: use relative path to include Makefile
kbuild: do not add $(bounds-file) and $(offsets-file) to targets
kbuild: remove warning about "make depend"
kbuild: Don't reset timestamps in include/generated if not needed
Pull crypto update from Herbert Xu:
"Here is the crypto update for 4.1:
New interfaces:
- user-space interface for AEAD
- user-space interface for RNG (i.e., pseudo RNG)
New hashes:
- ARMv8 SHA1/256
- ARMv8 AES
- ARMv8 GHASH
- ARM assembler and NEON SHA256
- MIPS OCTEON SHA1/256/512
- MIPS img-hash SHA1/256 and MD5
- Power 8 VMX AES/CBC/CTR/GHASH
- PPC assembler AES, SHA1/256 and MD5
- Broadcom IPROC RNG driver
Cleanups/fixes:
- prevent internal helper algos from being exposed to user-space
- merge common code from assembly/C SHA implementations
- misc fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits)
crypto: arm - workaround for building with old binutils
crypto: arm/sha256 - avoid sha256 code on ARMv7-M
crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer
crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer
crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer
crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer
crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer
crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer
crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer
crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer
crypto: sha512-generic - move to generic glue implementation
crypto: sha256-generic - move to generic glue implementation
crypto: sha1-generic - move to generic glue implementation
crypto: sha512 - implement base layer for SHA-512
crypto: sha256 - implement base layer for SHA-256
crypto: sha1 - implement base layer for SHA-1
crypto: api - remove instance when test failed
crypto: api - Move alg ref count init to crypto_check_alg
...
Soft mmu uses direct shadow page to fill guest large mapping with small
pages if huge mapping is disallowed on host. So zapping direct shadow
page works well both for soft mmu and hard mmu, it's just less widely
applicable.
Fix the comment to reflect this.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Message-Id: <552C91BA.1010703@linux.intel.com>
[Fix comment wording further. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As Andres pointed out:
| I don't understand the value of this check here. Are we looking for a
| broken memslot? Shouldn't this be a BUG_ON? Is this the place to care
| about these things? npages is capped to KVM_MEM_MAX_NR_PAGES, i.e.
| 2^31. A 64 bit overflow would be caused by a gigantic gfn_start which
| would be trouble in many other ways.
This patch drops the memslot overflow check to make the codes more simple.
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Message-Id: <1429064694-3072-1-git-send-email-wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull power management and ACPI updates from Rafael Wysocki:
"These are mostly fixes and cleanups all over, although there are a few
items that sort of fall into the new feature category.
First off, we have new callbacks for PM domains that should help us to
handle some issues related to device initialization in a better way.
There also is some consolidation in the unified device properties API
area allowing us to use that inferface for accessing data coming from
platform initialization code in addition to firmware-provided data.
We have some new device/CPU IDs in a few drivers, support for new
chips and a new cpufreq driver too.
Specifics:
- Generic PM domains support update including new PM domain callbacks
to handle device initialization better (Russell King, Rafael J
Wysocki, Kevin Hilman)
- Unified device properties API update including a new mechanism for
accessing data provided by platform initialization code (Rafael J
Wysocki, Adrian Hunter)
- ARM cpuidle update including ARM32/ARM64 handling consolidation
(Daniel Lezcano)
- intel_idle update including support for the Silvermont Core in the
Baytrail SOC and for the Airmont Core in the Cherrytrail and
Braswell SOCs (Len Brown, Mathias Krause)
- New cpufreq driver for Hisilicon ACPU (Leo Yan)
- intel_pstate update including support for the Knights Landing chip
(Dasaratharaman Chandramouli, Kristen Carlson Accardi)
- QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)
- powernv cpufreq driver update (Shilpasri G Bhat)
- devfreq update including Tegra support changes (Tomeu Vizoso,
MyungJoo Ham, Chanwoo Choi)
- powercap RAPL (Running-Average Power Limit) driver update including
support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)
- ACPI device enumeration update related to the handling of the
special PRP0001 device ID allowing DT-style 'compatible' property
to be used for ACPI device identification (Rafael J Wysocki)
- ACPI EC driver update including limited _DEP support (Lan Tianyu,
Lv Zheng)
- ACPI backlight driver update including a new mechanism to allow
native backlight handling to be forced on non-Windows 8 systems and
a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)
- New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)
- Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)
- Fixes related to suspend-to-idle for the iTCO watchdog driver and
the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)
- PM tracing support for the suspend phase of system suspend/resume
transitions (Zhonghui Fu)
- Configurable delay for the system suspend/resume testing facility
(Brian Norris)
- PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"
* tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
ACPI / scan: Rework modalias creation when "compatible" is present
intel_idle: mark cpu id array as __initconst
powercap / RAPL: mark rapl_ids array as __initconst
powercap / RAPL: add ID for Broadwell server
intel_pstate: Knights Landing support
intel_pstate: remove MSR test
cpufreq: fix qoriq uniprocessor build
ACPI / scan: Take the PRP0001 position in the list of IDs into account
ACPI / scan: Simplify acpi_match_device()
ACPI / scan: Generalize of_compatible matching
device property: Introduce firmware node type for platform data
device property: Make it possible to use secondary firmware nodes
PM / watchdog: iTCO: stop watchdog during system suspend
cpufreq: hisilicon: add acpu driver
ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
cpufreq: powernv: Report cpu frequency throttling
intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
intel_idle: Update support for Silvermont Core in Baytrail SOC
PM / devfreq: tegra: Register governor on module init
...
Merge first patchbomb from Andrew Morton:
- arch/sh updates
- ocfs2 updates
- kernel/watchdog feature
- about half of mm/
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits)
Documentation: update arch list in the 'memtest' entry
Kconfig: memtest: update number of test patterns up to 17
arm: add support for memtest
arm64: add support for memtest
memtest: use phys_addr_t for physical addresses
mm: move memtest under mm
mm, hugetlb: abort __get_user_pages if current has been oom killed
mm, mempool: do not allow atomic resizing
memcg: print cgroup information when system panics due to panic_on_oom
mm: numa: remove migrate_ratelimited
mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
mm: split ET_DYN ASLR from mmap ASLR
s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
mm: expose arch_mmap_rnd when available
s390: standardize mmap_rnd() usage
powerpc: standardize mmap_rnd() usage
mips: extract logic for mmap_rnd()
arm64: standardize mmap_rnd() usage
x86: standardize mmap_rnd() usage
arm: factor out mmap ASLR into mmap_rnd
...
Memtest is a simple feature which fills the memory with a given set of
patterns and validates memory contents, if bad memory regions is detected
it reserves them via memblock API. Since memblock API is widely used by
other architectures this feature can be enabled outside of x86 world.
This patch set promotes memtest to live under generic mm umbrella and
enables memtest feature for arm/arm64.
It was reported that this patch set was useful for tracking down an issue
with some errant DMA on an arm64 platform.
This patch (of 6):
There is nothing platform dependent in the core memtest code, so other
platforms might benefit from this feature too.
[linux@roeck-us.net: MEMTEST depends on MEMBLOCK]
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes the "offset2lib" weakness in ASLR for arm, arm64, mips,
powerpc, and x86. The problem is that if there is a leak of ASLR from
the executable (ET_DYN), it means a leak of shared library offset as
well (mmap), and vice versa. Further details and a PoC of this attack
is available here:
http://cybersecurity.upv.es/attacks/offset2lib/offset2lib.html
With this patch, a PIE linked executable (ET_DYN) has its own ASLR
region:
$ ./show_mmaps_pie
54859ccd6000-54859ccd7000 r-xp ... /tmp/show_mmaps_pie
54859ced6000-54859ced7000 r--p ... /tmp/show_mmaps_pie
54859ced7000-54859ced8000 rw-p ... /tmp/show_mmaps_pie
7f75be764000-7f75be91f000 r-xp ... /lib/x86_64-linux-gnu/libc.so.6
7f75be91f000-7f75beb1f000 ---p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb1f000-7f75beb23000 r--p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb23000-7f75beb25000 rw-p ... /lib/x86_64-linux-gnu/libc.so.6
7f75beb25000-7f75beb2a000 rw-p ...
7f75beb2a000-7f75beb4d000 r-xp ... /lib64/ld-linux-x86-64.so.2
7f75bed45000-7f75bed46000 rw-p ...
7f75bed46000-7f75bed47000 r-xp ...
7f75bed47000-7f75bed4c000 rw-p ...
7f75bed4c000-7f75bed4d000 r--p ... /lib64/ld-linux-x86-64.so.2
7f75bed4d000-7f75bed4e000 rw-p ... /lib64/ld-linux-x86-64.so.2
7f75bed4e000-7f75bed4f000 rw-p ...
7fffb3741000-7fffb3762000 rw-p ... [stack]
7fffb377b000-7fffb377d000 r--p ... [vvar]
7fffb377d000-7fffb377f000 r-xp ... [vdso]
The change is to add a call the newly created arch_mmap_rnd() into the
ELF loader for handling ET_DYN ASLR in a separate region from mmap ASLR,
as was already done on s390. Removes CONFIG_BINFMT_ELF_RANDOMIZE_PIE,
which is no longer needed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>