tag name | perf-tools-for-v6.9-2024-03-13 (11364e6cccc9460054938bc43c277156d680fe6a) |
tag date | 2024-03-13 10:46:09 -0700 |
tagged by | Namhyung Kim <namhyung@kernel.org> |
tagged object | commit 0f66dfe7b9... |
download | perf-tools-next-perf-tools-for-v6.9-2024-03-13.tar.gz |
---|
perf tools changes for v6.9
perf stat
---------
* Support new 'cluster' aggregation mode for shared resources depending on the
hardware configuration.
$ sudo perf stat -a --per-cluster -e cycles,instructions sleep 1
Performance counter stats for 'system wide':
S0-D0-CLS0 2 85,051,822 cycles
S0-D0-CLS0 2 73,909,908 instructions # 0.87 insn per cycle
S0-D0-CLS2 2 93,365,918 cycles
S0-D0-CLS2 2 83,006,158 instructions # 0.89 insn per cycle
S0-D0-CLS4 2 104,157,523 cycles
S0-D0-CLS4 2 53,234,396 instructions # 0.51 insn per cycle
S0-D0-CLS6 2 65,891,079 cycles
S0-D0-CLS6 2 41,478,273 instructions # 0.63 insn per cycle
1.002407989 seconds time elapsed
* Various fixes and cleanups for event metrics including NaN handling.
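The per-cluster output above can be post-processed with a short script. The
following is a minimal sketch, not part of perf: the function names
(parse_per_cluster, ipc_per_cluster) are hypothetical, the regular expression
assumes the exact column layout shown in the example, and the second column is
assumed to be the number of CPUs aggregated into each cluster. It recomputes
the "insn per cycle" figure from the raw counts.

```python
import re
from collections import defaultdict

# Lines copied from the 'perf stat --per-cluster' example above.
SAMPLE = """\
S0-D0-CLS0 2 85,051,822 cycles
S0-D0-CLS0 2 73,909,908 instructions # 0.87 insn per cycle
S0-D0-CLS2 2 93,365,918 cycles
S0-D0-CLS2 2 83,006,158 instructions # 0.89 insn per cycle
S0-D0-CLS4 2 104,157,523 cycles
S0-D0-CLS4 2 53,234,396 instructions # 0.51 insn per cycle
S0-D0-CLS6 2 65,891,079 cycles
S0-D0-CLS6 2 41,478,273 instructions # 0.63 insn per cycle
"""

# Assumed layout: <cluster id> <cpu count> <value with commas> <event name>
LINE_RE = re.compile(r"^(S\d+-D\d+-CLS\d+)\s+(\d+)\s+([\d,]+)\s+(\S+)")


def parse_per_cluster(text):
    """Return {cluster_id: {event_name: count}} for every matching line."""
    counts = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # header, blank, or elapsed-time line
        cluster, _cpus, value, event = match.groups()
        counts[cluster][event] = int(value.replace(",", ""))
    return dict(counts)


def ipc_per_cluster(counts):
    """Compute instructions per cycle for clusters that have both events."""
    ipc = {}
    for cluster, events in counts.items():
        cycles = events.get("cycles")
        insns = events.get("instructions")
        if cycles and insns is not None:
            ipc[cluster] = insns / cycles
    return ipc


if __name__ == "__main__":
    for cluster, value in sorted(ipc_per_cluster(parse_per_cluster(SAMPLE)).items()):
        print(f"{cluster}: {value:.2f} insn per cycle")
```

For scripted use, a machine-readable output mode of 'perf stat' is likely a
more robust source than scraping the human-readable table.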
perf script
-----------
* Use libcapstone if available to disassemble the instructions. This enables
'perf script -F disasm' and 'perf script --insn-trace=disasm' (for Intel-PT).
$ perf script -F event,ip,disasm
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffa9839d25 movq %rax, %r14
cycles:P: ffffffffa9cdcaf0 endbr64
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffaa401f86 iretq
cycles:P: ffffffffa99c4de5 movq 0x30(%rcx), %r8
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffaa401f86 iretq
cycles:P: ffffffffa9907983 movl 0x68(%rbx), %eax
cycles:P: ffffffffa988d428 wrmsr
* Expose sample ID / stream ID to Python scripts.
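One way to use the disasm field is to tally which instructions the samples
landed on. The following is a rough sketch that works on the text output of
'perf script -F event,ip,disasm' as shown above. The helper name
mnemonic_histogram is hypothetical, and the parsing assumes exactly three
whitespace-separated columns (event, ip, disassembly) with the mnemonic as the
first word of the disassembly.

```python
from collections import Counter

# Lines copied from the 'perf script -F event,ip,disasm' example above.
SAMPLE = """\
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffa9839d25 movq %rax, %r14
cycles:P: ffffffffa9cdcaf0 endbr64
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffaa401f86 iretq
cycles:P: ffffffffa99c4de5 movq 0x30(%rcx), %r8
cycles:P: ffffffffa988d428 wrmsr
cycles:P: ffffffffaa401f86 iretq
cycles:P: ffffffffa9907983 movl 0x68(%rbx), %eax
cycles:P: ffffffffa988d428 wrmsr
"""


def mnemonic_histogram(text):
    """Count samples per instruction mnemonic."""
    hist = Counter()
    for line in text.splitlines():
        # Split into event, ip, and the rest of the line (the disassembly).
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _event, _ip, disasm = fields
        hist[disasm.split()[0]] += 1
    return hist


if __name__ == "__main__":
    for mnemonic, count in mnemonic_histogram(SAMPLE).most_common():
        print(f"{count:3d} {mnemonic}")
```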
perf test
---------
* Add more perf test cases from Red Hat internal test suites. This time it adds
the base infra and a few perf probe tests. More to come. :)
* Add 'perf test -p' for parallel execution and fix some issues found by the
parallel test.
* Support symbol test to print symbols in a given (active) module:
$ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/fs/ext4/ext4.ko
--- start ---
Testing /lib/modules/6.5.13-1rodete2-amd64/kernel/fs/ext4/ext4.ko
Overlapping symbols:
7a990-7a9a0 l __pfx_ext4_exit_fs
7a990-7a9a0 g __pfx_cleanup_module
Overlapping symbols:
7a9a0-7aa1c l ext4_exit_fs
7a9a0-7aa1c g cleanup_module
...
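The "Overlapping symbols" report above pairs up symbols that cover the same
address range. The following sketch illustrates the idea only; it is not the
code used by 'perf test'. The names parse_symbols and find_overlaps are
hypothetical, the address ranges are assumed to be half-open (start inclusive,
end exclusive), and the single-letter column is assumed to be the symbol
binding (l for local, g for global).

```python
# Symbol lines copied from the 'perf test ... Symbols' example above.
SAMPLE = """\
7a990-7a9a0 l __pfx_ext4_exit_fs
7a990-7a9a0 g __pfx_cleanup_module
7a9a0-7aa1c l ext4_exit_fs
7a9a0-7aa1c g cleanup_module
"""


def parse_symbols(text):
    """Parse '<start>-<end> <binding> <name>' lines into tuples."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or "-" not in parts[0]:
            continue  # skip headers such as 'Overlapping symbols:'
        start, end = (int(x, 16) for x in parts[0].split("-"))
        symbols.append((start, end, parts[1], parts[2]))
    return symbols


def find_overlaps(symbols):
    """Return pairs of symbols whose [start, end) ranges intersect."""
    ordered = sorted(symbols, key=lambda s: (s[0], s[1]))
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # Sorted by start, so once a symbol begins at or after the end
            # of 'first', no later symbol can overlap it either.
            if second[0] >= first[1]:
                break
            pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    for first, second in find_overlaps(parse_symbols(SAMPLE)):
        print("Overlapping symbols:")
        for start, end, binding, name in (first, second):
            print(f"  {start:x}-{end:x} {binding} {name}")
```

Sorting by start address keeps the scan short for typical symbol tables, since
each symbol is only compared against the neighbours that begin before it ends.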
JSON metric updates
-------------------
* A new round of Intel metric updates.
* Support Power11 PVR (compatible with Power10).
* Fix cache latency events on Zen 4 to set SliceId properly.
Internal
--------
* Fix reference counting for 'map' data structure, tireless work from Ian!
* More memory optimization for struct thread and annotate histogram. Now,
'perf report' (TUI) and 'perf annotate' should be much lighter-weight in
terms of memory footprint.
* Support cross-arch perf register access. Clean up the build configuration
so that arch-register support is detected at runtime. This allows parsing
register data in samples that were recorded on a different architecture.
Others
------
* Sync task state in 'perf sched' to kernel using trace event fields. The
task states have been changed so tools cannot assume a fixed encoding.
* Clean up 'perf mem' to generalize the arch-specific events.
* Add support for local and global variables to data type profiling. This
should increase the success rate of type resolution with DWARF.
* Add short option -H for --hierarchy in 'perf report' and 'perf top'.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZfHmfhQcbmFtaHl1bmdA
a2VybmVsLm9yZwAKCRCMstVUGiXMg5krAP9Es5KEhAHvTHo6y4OX9ktrNGB3j/FB
YgakrWSuJxJ+UAD8D49wUloO3yVDVOe6MxJrZrHcEDGDV6qVSr0aPwDpyw4=
=gPPl
-----END PGP SIGNATURE-----