There are two problems when reading symbol files:
* No error is reported when users specify symbol files which don't
exist with --kallsyms or --vmlinux. The result just shows addresses
without symbols, which is not what is expected. It is better to
report an error and exit the program.
* When running perf report --kallsyms=/proc/kallsyms as a non-root
user, symbols are resolved. But selecting a symbol and annotating it
reports the following error:
Can't annotate __clear_user: No vmlinux file with build id xxx was
found.
The problem is caused by reading /proc/kcore without access permission.
Reading /proc/kcore requires the CAP_SYS_RAWIO capability, so either the
access permission must be changed to allow a specific user to read
/proc/kcore, or the perf command must be run as root.
This patch reports an error when a symbol file specified by the user
doesn't exist, and checks the access permission of /proc/kcore when
reading it.
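A minimal sketch of the kind of check described above, assuming a hypothetical helper named check_symbol_file() (it is not the function added by the patch). It uses access(2), which only consults file permission bits; a capability check made when the file is opened would still have to be caught by checking the result of open().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 0 if 'path' exists and is readable by the
 * calling user, or a negative errno value otherwise.
 */
static int check_symbol_file(const char *path)
{
	assert(path != NULL);

	if (access(path, R_OK) != 0) {
		int err = errno;	/* save it before fprintf() can change it */

		fprintf(stderr, "Cannot read symbol file %s: %s\n",
			path, strerror(err));
		return -err;
	}
	return 0;
}
```

A caller would typically run this once for each file given on the command line and exit on the first negative return value.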
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1434704253-2632-1-git-send-email-zhlcindy@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the values of all tasks given via the -p option PID arguments
get aggregated and printed as single values.
Add the --per-thread option to print values per task.
$ perf stat -e cycles,instructions --per-thread -p 30190,30242
^C
Performance counter stats for process id '30190,30242':
cat-30190 0 cycles
yes-30242 3,842,525,421 cycles
cat-30190 0 instructions
yes-30242 10,370,817,010 instructions
1.143155657 seconds time elapsed
Also works under interval mode:
$ perf stat -e cycles,instructions --per-thread -p 30190,30242 -I 1000
# time comm-pid counts unit events
1.000073435 cat-30190 89,058 cycles
1.000073435 yes-30242 3,360,786,902 cycles (100.00%)
1.000073435 cat-30190 14,066 instructions
1.000073435 yes-30242 9,069,937,462 instructions
2.000204830 cat-30190 0 cycles
2.000204830 yes-30242 3,351,667,626 cycles
2.000204830 cat-30190 0 instructions
2.000204830 yes-30242 9,045,796,885 instructions
^C 2.771286639 cat-30190 0 cycles
2.771286639 yes-30242 2,593,884,166 cycles
2.771286639 cat-30190 0 instructions
2.771286639 yes-30242 7,001,171,191 instructions
It works only with the -t and -p options; otherwise the following error
is printed:
$ perf stat -e cycles --per-thread -I 1000 ls
The --per-thread option is only available when monitoring via -p -t options.
-p, --pid <pid> stat events on existing process id
-t, --tid <tid> stat events on existing thread id
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1435310967-14570-23-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and refactorings from Arnaldo Carvalho de Melo:
Infrastructure changes:
- Reference count the cpu_map and thread_map classes. (Jiri Olsa)
- Set evsel->{cpus,threads} from the evlist, if not set,
allowing the generalization of some 'perf stat' functions that
previously were accessing private static evlist variable. (Jiri Olsa)
- Delete an unnecessary check before the calling
free_event_desc() (Markus Elfring)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix failure to probe events on arm. The problem was introduced by commit
5a51fcd1f3 ("perf probe: Skip kernel symbols which is out of .text").
For some architectures, the '_etext' label is not in the .text section
(it is in the .notes section for arm/arm64). Labels outside the .text
section are not loaded as symbols, so we get a zero value when looking
up their addresses, which causes all events to be wrongly skipped.
This patch skips checking the text address range when failing to get the
address of '_etext' and thus fixes the problem.
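The guard described above can be sketched as follows. The function name and the 'stext' lower bound are assumptions made for illustration; only the idea of skipping the range check when the '_etext' lookup yields zero comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical range check. 'stext' and 'etext' stand for the looked-up
 * start and end addresses of the text range; etext == 0 means the lookup
 * of '_etext' failed.
 */
static bool addr_is_out_of_text(uint64_t addr, uint64_t stext, uint64_t etext)
{
	/* When the end bound is known, the bounds must be ordered. */
	assert(etext == 0 || stext <= etext);

	/* Unknown end of text: the range cannot be checked, so reject nothing. */
	if (etext == 0)
		return false;

	return addr < stext || addr >= etext;
}
```

Without the early return, every address would compare as greater than or equal to a zero 'etext' and be reported as out of range.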
The problem can be reproduced on arm as follows:
# perf probe --add='generic_perform_write'
generic_perform_write+0 is out of .text, skip it.
Probe point 'generic_perform_write' not found.
Error: Failed to add events.
After this patch:
# perf probe --add='generic_perform_write'
Added new event:
probe:generic_perform_write (on generic_perform_write)
You can now use it in all perf tools, such as:
perf record -e probe:generic_perform_write -aR sleep 1
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1434595750-129791-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
System-wide sampling such as 'perf top' or 'perf record -a' reads all
threads' /proc/xxx/maps before sampling. If any thread keeps generating
an ever-growing, huge map, perf loops forever during synthesizing and
nothing is sampled.
This patch fixes the issue by adding a per-thread timeout that forcibly
stops this kind of endless proc map processing.
PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is introduced to indicate that
the mmap records were truncated by the timeout. The user gets a warning
when truncated mmap records are detected.
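As an illustration of the timeout idea only, here is a sketch of a line reader that gives up after a deadline and tells the caller the result was truncated. The names read_maps_with_timeout() and now_ns() are hypothetical, and the real synthesizing code does much more than count lines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical reader: count the lines read from 'fp' until EOF or until
 * 'timeout_ns' has elapsed. *truncated is set when the deadline stopped
 * the loop.
 */
static size_t read_maps_with_timeout(FILE *fp, uint64_t timeout_ns,
				     bool *truncated)
{
	char line[4096];
	size_t nr = 0;
	uint64_t start = now_ns();

	assert(fp != NULL && truncated != NULL);
	*truncated = false;

	while (fgets(line, sizeof(line), fp)) {
		nr++;
		/* Check the deadline after each line so one thread cannot stall us. */
		if (now_ns() - start >= timeout_ns) {
			*truncated = true;
			break;
		}
	}
	return nr;
}
```

The caller can use the 'truncated' flag to mark the resulting records and warn the user, as the text describes.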
Reported-by: Ying Huang <ying.huang@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The thread-stack represents a thread's current stack. When a thread
exits there can still be many functions on the stack e.g. exit() can be
called many levels deep, so all the callers will never return. To get
that information output, the thread-stack must be flushed.
Previously it was assumed the thread-stack would be flushed when the
struct thread was deleted. With thread ref-counting it is no longer
clear when that will be, if ever. So instead explicitly flush all the
thread-stacks at the end of a session.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1432906425-9911-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I get the following crash on multiple systems and across several
releases (at least since v3.18).
Core was generated by `/tmp/perf trace sleep 0.2 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195
195 u64 head = ACCESS_ONCE(pc->data_head);
(gdb) bt
#0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195
#1 perf_evlist__mmap_read (evlist=0x10027f11910, idx=<optimized out>)
at util/evlist.c:637
#2 0x000000001003ce4c in trace__run (argv=<optimized out>,
argc=<optimized out>, trace=0x3fffd7b28288) at builtin-trace.c:2259
#3 cmd_trace (argc=<optimized out>, argv=<optimized out>,
prefix=<optimized out>) at builtin-trace.c:2799
#4 0x00000000100657b8 in run_builtin (p=0x10176798 <commands+480>, argc=3,
argv=0x3fffd7b2b550) at perf.c:370
#5 0x00000000100063e8 in handle_internal_command (argv=0x3fffd7b2b550, argc=3)
at perf.c:429
#6 run_argv (argv=0x3fffd7b2af70, argcp=0x3fffd7b2af7c) at perf.c:473
#7 main (argc=3, argv=0x3fffd7b2b550) at perf.c:588
The problem seems to be a race condition that occurs when the
application has just exited. Some or all fds associated with the perf
events (tracepoints) go into a POLLHUP/POLLERR state, and the mmap
regions associated with those events are unmapped (in
perf_evlist__filter_pollfd()).
But we then go back and call perf_evlist__mmap_read(), which assumes
that the mmaps are still valid, and we hit the crash.
If the mapping for an event is released, its refcnt is 0 (and ->base
is NULL), so ensure we have non-zero refcount before accessing the map.
Note that perf-record has similar logic but, unlike perf-trace,
record__mmap_read_all() checks evlist->mmap[i].base before accessing
the map.
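The refcount guard can be illustrated with a small sketch. The struct mmap_slot type and mmap_slot_read() are hypothetical stand-ins, and C11 atomic_load() is used here in place of the kernel-style atomic_read() mentioned in the fixup note.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-event ring buffer mapping. */
struct mmap_slot {
	atomic_int refcnt;	/* 0 once the mapping has been released */
	void *base;		/* NULL once the mapping has been released */
};

/* Return the mapped buffer, or NULL if the slot was already unmapped. */
static void *mmap_slot_read(struct mmap_slot *slot)
{
	assert(slot != NULL);

	/* A zero refcount means the region is gone; do not touch it. */
	if (atomic_load(&slot->refcnt) == 0)
		return NULL;

	return slot->base;
}
```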
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20150612060003.GA19913@us.ibm.com
[ Fixed it up to use atomic_read() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Speed up the "perf probe --list" by caching the last used debuginfo.
perf probe --list always opens and loads debuginfo for each entry in the
probe list. This takes a very long time.
E.g. with vfs_* events (total 96 probes)
[root@localhost perf]# time ./perf probe -l &> /dev/null
real 0m25.376s
user 0m24.381s
sys 0m1.012s
To solve this issue, add debuginfo_cache to cache the last used
debuginfo in memory.
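A last-used cache of this kind can be sketched as below. All names here (struct dbg, dbg_load(), dbg_cache_get()) are hypothetical, and the "load" is simulated by a counter rather than by actually opening debuginfo.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded debuginfo object. */
struct dbg {
	char *path;
};

static struct dbg *cached_dbg;	/* last object handed out, or NULL */
static int nr_loads;		/* how many times the expensive load ran */

/* Simulated expensive load: allocate the object and count the call. */
static struct dbg *dbg_load(const char *path)
{
	struct dbg *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->path = strdup(path);
	if (!d->path) {
		free(d);
		return NULL;
	}
	nr_loads++;
	return d;
}

static void dbg_free(struct dbg *d)
{
	if (d) {
		free(d->path);
		free(d);
	}
}

/* Return the cached object when 'path' matches, else reload and cache it. */
static struct dbg *dbg_cache_get(const char *path)
{
	assert(path != NULL);

	if (cached_dbg && strcmp(cached_dbg->path, path) == 0)
		return cached_dbg;

	dbg_free(cached_dbg);
	cached_dbg = dbg_load(path);
	return cached_dbg;
}
```

A single-entry cache is enough when consecutive list entries tend to refer to the same binary, which is the case the timings in this message exercise.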
With this fix, perf probe --list is significantly faster:
[root@localhost perf]# time ./perf probe -l &> /dev/null
real 0m0.161s
user 0m0.136s
sys 0m0.025s
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150617145854.19715.15314.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the last part of the converted events is blacklisted or
out-of-text, those events are skipped and perf probe doesn't show usage
examples. Fix it to show the example even if the last part of the event
list is skipped.
E.g. without this patch, events are added, but the output ends abruptly:
# perf probe vfs_*
vfs_caches_init_early is out of .text, skip it.
vfs_caches_init is out of .text, skip it.
Added new events:
probe:vfs_fallocate (on vfs_*)
probe:vfs_open (on vfs_*)
...
probe:vfs_dentry_acceptable (on vfs_*)
probe:vfs_load_quota_inode (on vfs_*)
#
With this fix:
# perf probe vfs_*
vfs_caches_init_early is out of .text, skip it.
vfs_caches_init is out of .text, skip it.
Added new events:
probe:vfs_fallocate (on vfs_*)
...
probe:vfs_load_quota_inode (on vfs_*)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_load_quota_inode -aR sleep 1
Note that this can be reproduced ONLY IF the vfs_caches_init* is the
last part of matched symbol list. I've checked this happens on
"3.19.0-generic #18-Ubuntu" kernel binary.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150616115057.19906.5502.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following error occurs when using 'perf report' on x86_64 to
cross-analyze a perf.data file generated by an old perf on a big-endian
machine:
# perf report
*** Error in `/home/w00229757/perf': free(): invalid next size (fast): 0x00000000032c99f0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x6eeef)[0x7ff6ff7e2eef]
/lib64/libc.so.6(+0x78cae)[0x7ff6ff7eccae]
/lib64/libc.so.6(+0x79987)[0x7ff6ff7ed987]
/path/to/perf[0x4ac734]
/path/to/perf[0x4ac829]
/path/to/perf(perf_header__process_sections+0x129)[0x4ad2c9]
/path/to/perf(perf_session__read_header+0x2e1)[0x4ad9e1]
/path/to/perf(perf_session__new+0x168)[0x4bd458]
/path/to/perf(cmd_report+0xfa0)[0x43eb70]
/path/to/perf[0x47adc3]
/path/to/perf(main+0x5f6)[0x42fd06]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ff6ff795bd5]
/path/to/perf[0x42fe35]
======= Memory map: ========
[SNIP]
The bug is in perf_event__attr_swap(). It swaps all fields in 'struct
perf_event_attr' without checking whether each swapped field exists. In
addition, read_event_desc() allocates memory for attr according to the
size read from perf.data.
Therefore, if the perf.data was collected by an old perf (without
aux_watermark, for example), swapping attr->aux_watermark in
perf_event__attr_swap() writes past the allocation and corrupts
malloc's metadata.
This patch introduces boundary checking in perf_event__attr_swap(). It
adds the macros bswap_field_64 and bswap_field_32 to
perf_event__attr_swap() so that it only swaps fields that exist.
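The boundary check can be sketched with a cut-down structure. struct demo_attr, the swap helpers and the macro names below are hypothetical and only illustrate the technique of comparing a field's end offset against the stored size; they are not the macros from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down attribute; 'size' is the byte count stored in the file. */
struct demo_attr {
	uint32_t type;
	uint32_t size;
	uint64_t config;
	uint32_t aux_watermark;	/* absent from files written by older tools */
	uint32_t reserved;
};

static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

static uint64_t swap64(uint64_t v)
{
	return ((uint64_t)swap32((uint32_t)v) << 32) | swap32((uint32_t)(v >> 32));
}

/* True when field 'f' lies entirely within the first 'sz' bytes. */
#define field_present(a, f, sz) \
	((sz) >= offsetof(struct demo_attr, f) + sizeof((a)->f))

#define swap_field_32(a, f, sz) \
	do { if (field_present(a, f, sz)) (a)->f = swap32((a)->f); } while (0)
#define swap_field_64(a, f, sz) \
	do { if (field_present(a, f, sz)) (a)->f = swap64((a)->f); } while (0)

static void demo_attr_swap(struct demo_attr *attr)
{
	assert(attr != NULL);

	/* The first two fields are assumed to exist in every version. */
	attr->type = swap32(attr->type);
	attr->size = swap32(attr->size);

	/* Later fields are swapped only if the stored size covers them. */
	swap_field_64(attr, config, attr->size);
	swap_field_32(attr, aux_watermark, attr->size);
}
```

With a stored size of 16 bytes, 'config' is swapped but 'aux_watermark' is left untouched, so nothing beyond the recorded size is written.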
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1434534999-85347-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix perf probe to return an error if no probe is added due to the given
probe point being on the blacklist.
To fix this problem, this moves the blacklist checking to right after
finding symbols/probe-points and marks them as skipped.
If all the symbols are skipped, "perf probe" returns an error as it
fails to find the corresponding probe address.
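The flow described above, marking blacklisted points as skipped and failing when nothing is left, can be sketched like this. struct probe_point, is_blacklisted() and filter_probe_points() are hypothetical names, and the choice of -ENOENT as the error value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe point found by symbol lookup. */
struct probe_point {
	const char *symbol;
	bool skipped;
};

static bool is_blacklisted(const char *symbol,
			   const char *const *blacklist, size_t nr_black)
{
	for (size_t i = 0; i < nr_black; i++)
		if (strcmp(symbol, blacklist[i]) == 0)
			return true;
	return false;
}

/*
 * Mark blacklisted points as skipped right after lookup. Return the number
 * of points kept, or -ENOENT when every point was skipped.
 */
static int filter_probe_points(struct probe_point *pts, size_t nr,
			       const char *const *blacklist, size_t nr_black)
{
	int kept = 0;

	assert(pts != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		pts[i].skipped = is_blacklisted(pts[i].symbol, blacklist, nr_black);
		if (!pts[i].skipped)
			kept++;
	}
	return kept > 0 ? kept : -ENOENT;
}
```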
E.g. currently if a blacklisted probe is given:
# perf probe do_trap && echo 'succeed'
Added new event:
Warning: Skipped probing on blacklisted function: sync_regs
succeed
No! It must fail! With this patch, it correctly fails:
# perf probe do_trap && echo 'succeed'
do_trap is blacklisted function, skip it.
Probe point 'do_trap' not found.
Error: Failed to add events.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150616115055.19906.31359.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>