Commit Graph

68 Commits

Author SHA1 Message Date
Zachary Leaf
268d7be66b perfmon: enable perfmon plugin for Arm
This patch enables statistics from the Arm PMUv3 through the perfmon
plugin.

In comparison to using the Linux "perf" tool, it allows obtaining
direct, per node level statistics (rather than per thread). By accessing
the PMU counter registers directly from userspace, we can avoid the
overhead of using a read() system call and get more accurate and fine
grained statistics about the running of individual nodes.

A demo of perfmon on Arm can be found at:
https://asciinema.org/a/egVNN1OF7JEKHYmfl5bpDYxfF

*Important Note*
Perfmon on Arm is dependent on and works only on Linux kernel versions
of v5.17+ as this is when userspace access to Arm perf counters was
included.

On most Arm systems, a maximum of 7 PMU events can be configured at once
(6x PMU events + 1x CPU_CYCLE counter). If some perf counters are in
use elsewhere by other applications, and there are insufficient counters
remaining to open the bundle, the perf_event_open call will fail
(provided the events are grouped with the group_fd param, which perfmon
currently utilises).

See arm/events.h for a list of PMUv3 events available, although it is
implementation defined whether most events are implemented or not. Only
a small set of 7 events is required to be implemented in Armv8.0, with
some additional events required in later versions. As such, depending on
the implementation, some statistics may not be available. See Arm
Architecture Reference Manual for Armv8-A, D7.10.2 "The PMU event number
space and common events" for more information.

arm/events.c:arm_init() gets information from the sysfs about what
events are implemented on a particular CPU at runtime. Arm's
implementation of the perfmon source callback .bundle_support uses this
information to disable unsupported events in a bundle, or in the case
no events are supported, disable the entire bundle.
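
As a rough illustration of the check described above, the sketch below
computes which events of a bundle are implemented. The type and function
names (demo_bundle_t, demo_bundle_support) are hypothetical and are not
the plugin's real data structures; the list of implemented events is
assumed to have been read from sysfs elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a perfmon bundle: a short list of
 * PMU event numbers. Not the plugin's real data structure. */
typedef struct
{
  uint32_t n_events;
  uint16_t events[8];
} demo_bundle_t;

/* Returns a bitmask with bit i set when bundle->events[i] appears in
 * implemented[]. A result of 0 means no event is supported, in which
 * case the whole bundle would be disabled. */
static uint32_t
demo_bundle_support (const demo_bundle_t *bundle,
		     const uint16_t *implemented, uint32_t n_implemented)
{
  uint32_t mask = 0;

  assert (bundle->n_events <= 8);
  for (uint32_t i = 0; i < bundle->n_events; i++)
    for (uint32_t j = 0; j < n_implemented; j++)
      if (bundle->events[i] == implemented[j])
	{
	  mask |= 1u << i;
	  break;
	}
  return mask;
}
```

A caller could keep the mask next to the bundle and consult it both when
opening events and when printing columns.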

Where a particular event in a bundle is not implemented, the statistic
for that event is shown as '-' in the 'show perfmon statistics' cli
output, by disabling the column.

There is additional code in perfmon.c to only open events which are
marked as implemented. Since we're only opening and reading events that
are implemented, some extra logic is required in cli.c to re-align
either perfmon_node_stats_t or perfmon_reading_t with the column
headings configured in each bundle, taking into account disabled
columns.
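
The re-alignment can be pictured as follows. This is a minimal sketch
under assumed names (demo_realign, DEMO_NO_VALUE), not the code in cli.c:
readings exist only for opened events, so they are spread back over the
configured columns and disabled columns receive a marker value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker for a disabled column (rendered as '-'). */
#define DEMO_NO_VALUE UINT64_MAX

/* Expands a compact array of readings (one per opened event) into one
 * slot per configured column. Bit i of enabled_mask is set when the event
 * behind column i was opened. Returns the number of readings consumed. */
static uint32_t
demo_realign (const uint64_t *readings, uint32_t enabled_mask,
	      uint32_t n_columns, uint64_t *columns)
{
  uint32_t next = 0;

  assert (n_columns <= 32);
  for (uint32_t i = 0; i < n_columns; i++)
    columns[i] =
      (enabled_mask & (1u << i)) ? readings[next++] : DEMO_NO_VALUE;
  return next;
}
```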

Userspace access to perf counters is disabled by default, and needs to
be enabled with 'sudo sysctl kernel/perf_user_access=1'.

There is a check built into the Arm event source init function
(arm/events.c:arm_init) to check that userspace reading of perf counters
is enabled in the /proc/sys/kernel/perf_user_access file.

If the above file does not exist, it means the kernel version is
unsupported. Users without a supported kernel will see a warning
message, and no Arm bundles will be registered to use in perfmon.
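
A check of this kind might look like the sketch below. The enum and
function names are hypothetical and this is not the code in
arm/events.c; the path is passed in so the logic can be exercised
against any file.

```c
#include <assert.h>
#include <stdio.h>

typedef enum
{
  DEMO_PMU_KERNEL_UNSUPPORTED, /* file could not be opened */
  DEMO_PMU_USER_ACCESS_OFF,    /* file present, value is not 1 */
  DEMO_PMU_USER_ACCESS_ON,     /* file present, value is 1 */
} demo_pmu_access_t;

/* Reads the integer stored in the given sysctl file and classifies it.
 * Any failure to open the file is treated as "kernel unsupported", which
 * is a simplification: fopen() can also fail for other reasons. */
static demo_pmu_access_t
demo_check_user_access (const char *path)
{
  int value = 0;
  FILE *f;

  assert (path != NULL);
  f = fopen (path, "r");
  if (f == NULL)
    return DEMO_PMU_KERNEL_UNSUPPORTED;
  if (fscanf (f, "%d", &value) != 1)
    value = 0;
  fclose (f);
  return value == 1 ? DEMO_PMU_USER_ACCESS_ON : DEMO_PMU_USER_ACCESS_OFF;
}
```

On a real system the argument would be
"/proc/sys/kernel/perf_user_access".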

Enabling/using plugin:
  - include the following in startup.conf:
    - plugins { plugin perfmon_plugin.so { enable } }
  - 'show perfmon bundle [verbose]' - show available statistics bundles
  - 'perfmon start bundle <bundle-name>' - enable and start logging
  - 'perfmon stop' - stop logging
  - 'show perfmon statistics' - show output

For a general guide on using and understanding Arm PMUv3 events, see
https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/arm-neoverse-n1-performance-analysis-methodology

Type: feature
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I0620fe5b1bbe78842dfb1d0b6a060bb99e777651
2022-07-12 15:29:23 +00:00
Zachary Leaf
c7d43a5eb1 perfmon: make less arch dependent
In preparation for enabling perfmon on Arm platforms, move some
Intel/arch specific logic into the /intel directory and update the CMake
files to split the common code from arch specific files.

Since the dispatch_wrapper code is very different on Arm/Intel,
each arch can provide their own implementation + conduct any additional
arch specific config e.g. on Intel, all indexes from the mmap pages are
cached. The new method intel_config_dispatch_wrapper conducts this
config and returns a pointer to the dispatch wrapper to use.

Similarly, is_bundle_supported() looks very different on Arm/Intel, so
each implementation is to provide their own arch specific checks.

Two new callbacks/function ptrs are added in PERFMON_REGISTER_SOURCE to
support this - .bundle_support and .config_dispatch_wrapper.

Type: refactor
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Change-Id: Idd121ddcfd1cc80a57c949cecd64eb2db0ac8be3
2022-07-12 15:29:23 +00:00
Ray Kinsella
53e575ce8a perfmon: fix order in cmakelists.txt
Fix ordering in CMakeLists.txt

Type: refactor

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I8e71e4fbc048a80c4b250c2a66cfd8a522bde5f4
2022-03-29 10:10:48 +00:00
Benoît Ganne
81878a9e3c perfmon: fix non-NULL terminated C-string
format() expects a NULL-terminated C-string as format string.

Type: fix

Change-Id: Ib428cf2debbf98850eed512907175f8ae8ba3c04
Signed-off-by: Benoît Ganne <bganne@cisco.com>
2022-03-29 10:10:24 +00:00
Damjan Marion
8296a1d043 perfmon: null-terminate string
Type: fix
Change-Id: I43ebb2c2922f3b8b8eddf26ccdf044f31d7b7a10
Signed-off-by: Damjan Marion <damarion@cisco.com>
2022-03-23 17:21:23 +00:00
Ray Kinsella
489d89c1cb perfmon: show distribution of uops delivered to frontend
Break down the distribution of uops delivered to the frontend.
Correlates directly with the source of the uops.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I93a57dbe56dfa0f378527844aa4e63f45a548e55
2022-02-18 14:50:07 +00:00
Jon Loeliger
fdbafb8ca1 perfmon: Fix typo in debug log messages
Signed-off-by: Jon Loeliger <jdl@netgate.com>
Type: style
Change-Id: I955c19ddbe06ef3651c03820fcc14054c63258b9
2022-02-06 11:44:49 +00:00
Ray Kinsella
9d0c638b0f perfmon: topdown level 1 and 2 for icx
Topdown level 1 and 2 for Intel Ice Lake (ICX). Limiting topdown support
to THREAD for the moment on Ice Lake, as NODE support is still
unreliable. Also removing Topdown Level 1 from Sapphire Rapids onwards,
as Topdown Level 2 also shows Level 1 on Sapphire, and it reduces the
overall number of bundles.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Iaa68b711dc8b6fb1090880b411debadb3c37f8bc
2022-01-30 15:08:18 +00:00
Ray Kinsella
7e8aeb876b perfmon: fix init of bundles with pseudo events
Previously Linux pseudo events were being counted as multiple fixed
events, such that a bundle with pseudo events could exceed the number of
available fixed counters. Reworked to ignore pseudo events in the
accounting for the moment.

Type: fix
Fixes: 0024e53ad
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ic938f8266fd04d7731afbd02e261c61ef22a8522
2022-01-30 15:08:18 +00:00
Ray Kinsella
0a0e711cce perfmon: check for duplicates after other checks
Move checking for duplicate bundle names after the other checks.

Type: fix

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I7fed5be758814e166eb8756b3df090130ac13bfd
2022-01-30 15:08:18 +00:00
Ray Kinsella
fe85d87235 perfmon: topdown backend bound core bundle
Add a bundle to measure topdown backend bound core cycles, which will
indicate if any given execution port has contention.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I37d1b38c101ac42d51c10fa4452b822d34b729c9
2022-01-30 14:43:34 +00:00
Ray Kinsella
4a6306aa69 perfmon: frontend and backend boundness bundles
Renamed memory stalls to topdown backend-bound-mem, added topdown
frontend-bound-latency and frontend-bound-bandwidth.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I70f42b6b63fe2502635cad4aed4271e2bbdda5f1
2022-01-27 20:02:24 +00:00
Ray Kinsella
0024e53ad0 perfmon: prune bundles by available pmu counters
Prune perfmon bundles that exceed the number of available pmu counters.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I70fec26bb8ca915f4b980963e06c2e43dfde5a23
2022-01-27 20:01:45 +00:00
Ray Kinsella
aedcfaf80c perfmon: add cli to show perf config
Added a cli to show Linux perf config for a given perfmon bundle. This
makes it easier to format Linux perf commands for next level analysis.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9adafa7d441b72120390d186e3c8f884b1bc9828
2022-01-27 15:54:02 +00:00
Ray Kinsella
b2bf388b81 perfmon: skipping bundle message
Change the skipping bundle message to debug

Type: refactor

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I942ff72bd9c26ccad923442fdedddf22ba75e117
2022-01-12 15:53:25 +00:00
Damjan Marion
e31c48a66b perfmon: compile dispatch wrapper once for each number of counters
A bit ugly, but generates faster and less noisy code which
should be important for this particular use case.

Type: improvement
Change-Id: If2bba947dac33ffedb4236a5b3fb50fc783668e1
Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-12-02 17:49:49 +00:00
Ray Kinsella
e893beab27 perfmon: refactor perf metric support
Refactoring perf metric support to remove branching on bundle type in
the dispatch wrapper. This change includes caching the rdpmc index at
perfmon_start(), so that the mmap_page.index doesn't need to be looked
up each time. It also excludes the effects of mmap_page.index.

This patch prepares the path for bundles that support general, fixed and
metrics counters simultaneously.

Type: refactor

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9c5b4917bd02fea960e546e8558452c4362eabc4
2021-12-02 15:02:39 +00:00
Klement Sekera
cbc81eae6e perfmon: fix coverity warning
Check for possible hash lookup failure to avoid NULL dereference.

Type: fix
Fixes: e15c999c30
Signed-off-by: Klement Sekera <ksekera@cisco.com>
Change-Id: Ib806b4d124be26fbccf36fe9d19af1aec63f487b
2021-11-16 16:19:40 +00:00
Ray Kinsella
e75084025a perfmon: rename bundle to memory stalls
Rename the memory bandwidth bundle to memory stalls, to differentiate it
from the bundle that measures memory controller bandwidth boundedness.

Type: refactor

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I828c73b6f769046e1ab592712bdf81ceefcd7911
2021-11-15 12:41:14 +00:00
Ray Kinsella
81865bc0e3 perfmon: fix iio-bw coverity issues
Fixes a number of coverity issues associated with the iio-bw feature.

Type: fix
Fixes: e15c999c3

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9ad2b336694132545d90a3483200a510226e9198
2021-11-08 09:33:25 +00:00
Xiaoming Jiang
e58c5c5bb4 perfmon: numa node list probing should use '/online' instead of '/has_cpu'
Type: fix
Signed-off-by: Xiaoming Jiang <jiangxiaoming@outlook.com>
Change-Id: I85e41d58884af71afba960d20604bb1b01876d26
2021-11-07 04:01:17 +00:00
Ray Kinsella
e15c999c30 perfmon: added bundle to measure pci bandwidth
Added Intel Ice Lake specific bundles to measure pci bandwidth through the
Intel IO PMU. The "PCI" bundle measures read/writes from pci devices. The "CPU"
bundle measures read/writes from cpus to pci devices.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Id48cef5988113e8dc4690b97d22243311bfa7961
2021-11-02 22:25:40 +00:00
Ray Kinsella
63081acb3f perfmon: added intel internal io pmu support
Added support for the Intel Internal IO Uncore PMU, along with the ability to
format PMU Unit specific names.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I2939f8ade5e5ed63ccf7f3ccd0279d7c72e95a6e
2021-11-02 22:25:40 +00:00
Klement Sekera
5d8072c059 perfmon: fix coverity warning
Check that cpumask is initialised properly to avoid possible NULL
pointer dereference.

Type: fix
Signed-off-by: Klement Sekera <ksekera@cisco.com>
Change-Id: I8df5a718104fe703d6baf3f1294b4a6d2ca01619
2021-10-28 10:58:47 +00:00
Klement Sekera
dec79ecf39 perfmon: properly unmap mmapped pages
Add missing array index so that actual mmapped pages are unmapped
instead of attempting to unmap the array holding those pages.

Type: fix
Signed-off-by: Klement Sekera <ksekera@cisco.com>
Change-Id: Ib8709cce1bcbfb505307c140266834b284af796c
2021-10-26 11:42:57 +02:00
Ray Kinsella
0d27e3e7a1 perfmon: topdown lvl 2 support on sapphire rapids
Added topdown level 2 support on sapphire rapids,
including the ability to identify a sapphire rapids cpu.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9f99a92fa0886b98bb5185cff32bebd5a094f329
2021-10-16 08:32:43 +00:00
Ray Kinsella
5bb0eb122f perfmon: additional perf counters on icelake
The Intel Icelake uArch supports measuring up to 12 counters,
comprised of 4 fixed and 8 general counters.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I68369ea55a0c95d6a4a280a464e69502bbf5474f
2021-10-16 08:32:43 +00:00
Ray Kinsella
12ba95bff5 perfmon: Topdown Level 1 support on Snowridge
Enable Topdown Level 1 support on Snowridge,
enabled with standard CPU events on small core.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I58ad09383de7464265ac1b69e683f253591e3b5e
2021-10-07 13:23:06 +00:00
Ray Kinsella
ce45b16156 perfmon: check bundle is supported
Add a check that the bundle is supported before further activation.
Enable different bundles with same name, supported on different platforms.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I73e8bbd1e07c05ebccd9146d48a234eb598a2388
2021-10-07 13:23:06 +00:00
Ray Kinsella
0d3914c026 perfmon: fix pseudo events
Fix pseudo events, missed populating "core" events with pseudo events.

Type: fix
Fixes: bf37bf6f7

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I569fa876f1b58540adac0b095be0ff4ade664dec
2021-10-07 13:23:06 +00:00
Ray Kinsella
ede7143386 perfmon: bundles with multiple types
Allow perfmon bundles to support more than one bundle type, either node
or thread. Only used for topdown bundle for the moment.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Iba3653a4deb39b0a8ee8ad448a7e8f954283ccd8
2021-10-05 10:44:39 +00:00
Ray Kinsella
bf37bf6f79 perfmon: topdown events as pseudo events
Topdown events are pseudo events exposed by linux,
and are only present on Intel platforms.
Change to clarify this.

Type: fix

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I6a3dcea5f43f53dbb96475329baf5e596a24d54f
2021-10-04 09:14:24 +00:00
Nathan Skrzypczak
0e6584014a vppinfra: move format_table from perfmon
This code seems really useful for reuse in
other plugins, for pretty table formatting

Type: feature

Change-Id: Ib5784a0dfc81b7d5a5d1f5ccdd02072e460a50fb
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
2021-09-17 20:10:59 +00:00
Damjan Marion
a274c3a2ed misc: put devtools plugins into separate component/package
Type: make
Change-Id: I2958e9eddadee6434766ecd3cdb3b9cea742ed64
Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-09-17 15:04:28 +00:00
Zachary Leaf
0c373fc928 perfmon: sort 'show perfmon bundle' output
This patch sorts 'show perfmon bundle' output in alphabetical order.

Type: improvement
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Change-Id: I26b379b5d6766b9f87f9a3a5013ea92b207fb5d4
2021-09-08 14:34:44 +00:00
Ray Kinsella
710bdef43c perfmon: add membw-bound bundle
Added memory bandwidth boundedness bundle, closely related to cache-hierarchy.
This bundle works on ICX only, due to an ICX specific counter.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Id385bd5f4e645ac020774e311c623afb64b79b1e
2021-09-08 14:30:03 +00:00
Ray Kinsella
c3cb2075de perfmon: adding support for papi TMAM
Adding support for Linux papi TMAM on Intel Snowridge. Adds the ability to
indicate that a bundle should be thread or node bundle type based on available
cpu features (rdpmc support).

Type: feature

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ib871b2644fdb2410fbb580e0d21c3a8e2be13aba
2021-09-08 14:30:03 +00:00
Benoît Ganne
4e3af51a66 perfmon: fix perf event user page read
When mmap()-ing perf event in userspace, we must adhere to the kernel
update protocol to read consistent values.
Also, 'offset' is an offset to add to the counter value, not to apply
to the PMC index.
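
The read protocol referred to above is the sequence-count loop
documented for the perf event mmap page. The sketch below illustrates it
with a hypothetical three-field struct (demo_user_page_t) in place of
the real struct perf_event_mmap_page, and a fake counter reader in place
of the hardware read instruction, so that it runs anywhere. It is not
the plugin's code and it omits the counter-width handling a real reader
needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the mmap page fields used by the read loop;
 * real code would use struct perf_event_mmap_page from
 * <linux/perf_event.h>. */
typedef struct
{
  volatile uint32_t lock;  /* sequence count, changed around updates */
  volatile uint32_t index; /* hardware counter index + 1, 0 if unusable */
  volatile int64_t offset; /* value to add to the raw counter */
} demo_user_page_t;

typedef uint64_t (demo_read_pmc_fn) (uint32_t idx);

/* Fake reader standing in for the hardware counter read. */
static uint64_t
demo_fake_pmc (uint32_t idx)
{
  return 1000 + idx;
}

/* Retries until 'lock' is unchanged across the reads, so index, offset
 * and the raw counter form a consistent snapshot. 'offset' is added to
 * the counter value; the PMC index is 'index - 1'. Returns 0 on success,
 * -1 if the counter cannot be read from userspace. */
static int
demo_read_counter (const demo_user_page_t *pg, demo_read_pmc_fn *read_pmc,
		   uint64_t *value)
{
  uint32_t seq, idx;
  int64_t offset;
  uint64_t raw = 0;

  assert (pg && read_pmc && value);
  do
    {
      seq = pg->lock;
      __asm__ volatile ("" ::: "memory"); /* compiler barrier */
      idx = pg->index;
      offset = pg->offset;
      if (idx)
	raw = read_pmc (idx - 1);
      __asm__ volatile ("" ::: "memory");
    }
  while (pg->lock != seq);

  if (idx == 0)
    return -1;
  *value = raw + (uint64_t) offset;
  return 0;
}
```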

Type: fix

Change-Id: I59106bb3a48185ff3fcb0d2f09097269a67bb6d6
Signed-off-by: Benoît Ganne <bganne@cisco.com>
2021-08-20 11:22:29 +00:00
Ray Kinsella
dbf90d499b perfmon: revert raw column support
Revert raw column from the perfmon plugin.

Type: refactor

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: If127f57ee2022cc1c0ea5177f1655a792f195f1d
2021-05-26 06:45:40 +00:00
mdr78
8e1384f7bf perfmon: top down level 1 support
Adding perfmon node TMAM support on ICX.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I48a9a9ff6a72efc28eaf0cb11ef39fb62cebb126
2021-04-27 09:22:35 +00:00
Ray Kinsella
7e3862927e perfmon: combined set and start command.
The original set, start, stop, reset, show etc. interface was somewhat
cumbersome; we can improve it slightly by combining set and start.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I7b865b2c29d2ab32adbd24d7f8a580da6990bb76
2021-04-01 13:07:09 +00:00
Ray Kinsella
0614c6240b perfmon: % power level per node
Show % time spent per graph node in power level 0, 1 and 2.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I678ee812fa993af39568e9f9dfbf2396fc13ad42
2021-04-01 12:44:56 +00:00
Ray Kinsella
7b9b19d7bb perfmon: add branch mispredictions
Add branches, branches taken (a metric for branchy code), and branch
misses.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: If92d4aaf9d0a6e3b99b8c19e6311cc08ca470590
2021-03-31 15:06:30 +00:00
Damjan Marion
6ffb7c6189 vlib: introduce vlib_get_main_by_index(), vlib_get_n_threads()
Type: improvement
Change-Id: If3da7d4338470912f37ff1794620418d928fb77f
Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-03-26 16:33:21 +01:00
Ray Kinsella
e4551ffc48 perfmon: fixes for cache hierarchy
Account for occasional instances where the miss rates between caches
are inconsistent.

Type: fix

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Idfb8bb7543401405cfe04291ad201c28be030cc9
2021-03-16 21:37:20 +00:00
Ray Kinsella
5e798bce42 perfmon: add support for raw and timestamps
Add perfmon plugin support to output raw counter and timestamps, both
are useful for debug.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ia5a73d1f05e3464c18991c2346f0ed8b7ef63099
2021-03-16 21:36:47 +00:00
Ray Kinsella
1e4309538d perfmon: added cache hits and misses
Added basic support for counting cache hits and misses per node.

Type: improvement

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ic566611fd3d4246ccaa2117d8f74a569a6862e80
2021-01-21 13:17:47 +00:00
Damjan Marion
8b60fb0fe6 perfmon: new perfmon plugin
Type: feature
Change-Id: I2c14f82393d11fc05c6d229f5c58603ab5c0f14d
Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-12-18 17:20:28 +00:00
Damjan Marion
f5b27cbcc7 misc: deprecate old perfmon
Type: refactor
Change-Id: I1303219f9f2a25d821737665903b0264edd3de32
Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-12-18 17:20:28 +00:00
Damjan Marion
b2c31b685f misc: move to new pool_foreach macros
Type: refactor
Change-Id: Ie67dc579e88132ddb1ee4a34cb69f96920101772
Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-12-14 12:14:21 +00:00