Performance Co-Pilot (PCP) Data

Elektron uses PCP to collect the performance metrics listed in the PCP config file.

Elektron retrieves the data with pmdumptext, using the command shown below (can also be found here).

pmdumptext -m -l -f '' -t 1.0 -d , -c <config file>

The logs are written to a file named <logFilePrefix>_<timestamp>.pcplog, where

  • logFilePrefix is the prefix provided using the -logPrefix option.
  • timestamp corresponds to the time when Elektron was run.
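Since the timestamp part of the name is not known in advance, a script that post-processes the logs can look them up by prefix. The following is a minimal sketch; the helper name `find_pcp_logs` is hypothetical, and it assumes only the `<logFilePrefix>_<timestamp>.pcplog` naming pattern described above.

```python
import glob
from pathlib import Path


def find_pcp_logs(directory, log_prefix):
    """Return all <log_prefix>_<timestamp>.pcplog files in a directory, sorted by name."""
    # Escape the prefix so characters such as '[' are matched literally.
    pattern = f"{glob.escape(log_prefix)}_*.pcplog"
    return sorted(Path(directory).glob(pattern))
```

Note that a prefix that is itself a prefix of another one followed by an underscore (for example `run` and `run_2`) would match both sets of files.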

Use pminfo to obtain information about the different performance metrics that can be monitored through Performance Co-Pilot. Please see the pminfo doc for usage and options.

Example PCP log

Assume we want to retrieve the following performance metrics collected from one host, myhost.

  • User CPU time
  • System CPU time

Then the PCP config file would be as shown below.

myhost:kernel.all.cpu.user
myhost:kernel.all.cpu.sys
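For more than a handful of hosts, the config file can be generated instead of written by hand. The sketch below is illustrative only; `build_pcp_config` is a hypothetical helper, and it assumes nothing beyond the `<hostname>:<metric>` line format shown above.

```python
def build_pcp_config(hosts, metrics):
    """Return PCP config text with one '<host>:<metric>' line per pair.

    Lines are grouped by metric, then by host.
    """
    lines = [f"{host}:{metric}" for metric in metrics for host in hosts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pcp_config(
        ["myhost"],
        ["kernel.all.cpu.user", "kernel.all.cpu.sys"],
    ), end="")
```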

When we run the pmdumptext command mentioned above for 5 seconds, the PCP log from Elektron would be as shown below: a header line with the metric names, followed by one line per one-second sample.

[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> myhost:kernel.all.cpu.user,myhost:kernel.all.cpu.sys
[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
[<loglevel>]: <yyyy-mm-dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
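A log in this layout can be read back with a few lines of Python. This is a minimal sketch under stated assumptions: the log level contains no spaces, the first non-empty line is the header, and any value that is not a number is treated as missing. `parse_pcp_log` is a hypothetical name, not an Elektron function.

```python
def _to_float(text):
    """Convert a field to float, returning None for non-numeric values."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_pcp_log(lines):
    """Parse '[level]: date time v1,v2,...' lines into (header, samples).

    Assumes the first non-empty line carries the metric names.
    """
    header = None
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split off log level, date and time; the rest is the CSV payload.
        _level, date, time, payload = line.split(" ", 3)
        fields = payload.split(",")
        if header is None:
            header = fields
            continue
        values = {name: _to_float(v) for name, v in zip(header, fields)}
        samples.append({"date": date, "time": time, "values": values})
    return header, samples
```

Each sample keeps the date and time as strings, so the caller can decide how to interpret them.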

Power Measurements

It is also possible to measure the power consumption of the CPU, DRAM, etc. through the use of RAPL (Running Average Power Limit) hardware counters.

When running the power capping strategies, Extrema and Progressive Extrema, the following performance metrics MUST be included in the PCP config file.

#RAPL CPU PKG
<hostname1>:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
<hostname2>:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
...
#RAPL DRAM
<hostname1>:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value
<hostname2>:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value
...
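Because these entries are mandatory for the power capping strategies, it can help to check a config file before starting a run. The sketch below is one possible check; `missing_rapl_entries` is a hypothetical helper, and it assumes that lines starting with `#` are comments, as in the listing above.

```python
REQUIRED_RAPL_METRICS = (
    "perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value",
    "perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value",
)


def missing_rapl_entries(config_text, hosts):
    """Return the '<host>:<metric>' RAPL lines absent from the config text."""
    present = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if line and not line.startswith("#"):
            present.add(line)
    return [
        f"{host}:{metric}"
        for metric in REQUIRED_RAPL_METRICS
        for host in hosts
        if f"{host}:{metric}" not in present
    ]
```

An empty result means every listed host has both the PKG and DRAM entries.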

Note that the power readings are retrieved for each processor on each worker node. For example, if you have two processors on a machine (hostname = myhost), then the PCP log for CPU and DRAM power readings would contain the following headers.

myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value["cpux"] myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value["cpuy"] myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value["cpux"] myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value["cpuy"]
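When processing such a log, each header column can be split into its host, metric name, and per-processor instance. The sketch below assumes the column layout shown above (`<host>:<metric>["<instance>"]`, with the instance part optional); `split_column` is a hypothetical helper.

```python
import re

# host ':' metric, optionally followed by ["instance"]
COLUMN_RE = re.compile(
    r'^(?P<host>[^:]+):(?P<metric>[^\[]+)(?:\["(?P<instance>[^"]+)"\])?$'
)


def split_column(name):
    """Split a header column into (host, metric, instance or None)."""
    match = COLUMN_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised column name: {name!r}")
    return match.group("host"), match.group("metric"), match.group("instance")
```

Grouping columns by the returned host and metric then makes it straightforward to combine the readings of all processors on one machine.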

Use pminfo and search for RAPL to get the list of RAPL packages from which data can be read.
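The search can also be scripted. The sketch below assumes that `pminfo` is on the `PATH` and that running it without arguments prints one metric name per line; both function names are hypothetical.

```python
import subprocess


def rapl_metric_names(pminfo_output):
    """Return the metric names in pminfo output that mention RAPL."""
    return [
        line.strip()
        for line in pminfo_output.splitlines()
        if "RAPL" in line
    ]


def list_rapl_metrics():
    """Run pminfo on the local host and filter its output for RAPL metrics."""
    result = subprocess.run(
        ["pminfo"], capture_output=True, text=True, check=True
    )
    return rapl_metric_names(result.stdout)
```

If no RAPL metrics are listed, the RAPL counters are likely not exposed through PCP on that host.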