-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathProfilingWithPerf.html
210 lines (167 loc) · 12.1 KB
/
ProfilingWithPerf.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Profiling with perf</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=03e43079" />
<link rel="stylesheet" type="text/css" href="_static/bootstrap-sphinx.css?v=fadd4351" />
<link rel="stylesheet" type="text/css" href="_static/custom.css?v=77160d70" />
<script src="_static/documentation_options.js?v=a8da1a53"></script>
<script src="_static/doctools.js?v=9bcbadda"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Profiling with Valgrind" href="ProfilingWithValgrind.html" />
<link rel="prev" title="Mantid Timers" href="Timers.html" />
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-59110517-1', 'auto');
ga('send', 'pageview');
</script>
</head><body>
<div id="navbar" class="navbar navbar-default ">
<div class="container">
<div class="navbar-header">
<!-- .btn-navbar is used as the toggle for collapsed navbar content -->
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".nav-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="http://www.mantidproject.org">
</a>
<span class="navbar-text navbar-version pull-left"><b>main</b></span>
</div>
<div class="collapse navbar-collapse nav-collapse">
<ul class="nav navbar-nav">
<li class="divider-vertical"></li>
<li><a href="index.html">Home</a></li>
<li><a href="https://download.mantidproject.org">Download</a></li>
<li><a href="https://docs.mantidproject.org">User Documentation</a></li>
<li><a href="http://www.mantidproject.org/contact">Contact Us</a></li>
</ul>
<form class="navbar-form navbar-right" action="search.html" method="get">
<div class="form-group">
<input type="text" name="q" class="form-control" placeholder="Search" />
</div>
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<p>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="nav-item nav-item-0"><a href="index.html">Documentation</a> »</li>
<li class="nav-item nav-item-this"><a href="">Profiling with perf</a></li>
</ul>
</div> </p>
</div>
<div class="container">
<div class="row">
<div class="body col-md-12 content" role="main">
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>These tools generally require <code class="docutils literal notranslate"><span class="pre">sudo</span></code> permissions to configure the system to allow them to get information.</p>
</div>
<section id="profiling-with-perf">
<h1>Profiling with perf<a class="headerlink" href="#profiling-with-perf" title="Link to this heading">¶</a></h1>
<p>Perf is a tool designed for use with the linux kernel, but can be used to profile user apps as well.
It is available on all linuxes, but requries root permission to enable.
Much of this information was originally gained from <a class="reference external" href="https://github.com/NAThompson/performance_tuning_tutorial">Nick Tompson’s Performance Tuning Tutorail</a> held at Oak Ridge National Laboratory.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>perf is a sampling based performance tool.
This means the results are percentages rather than absolute times.
However, many visualizations will associate times as well.
Disk access issues are almost completely invisible to perf-based tools.</p>
</div>
<section id="install-and-configure">
<h2>Install and configure<a class="headerlink" href="#install-and-configure" title="Link to this heading">¶</a></h2>
<p>To install perf on ubuntu one needs (this is <a class="reference external" href="https://www.howtoforge.com/how-to-install-perf-performance-analysis-tool-on-ubuntu-20-04/">inspired from here</a>)</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>sudo<span class="w"> </span>apt<span class="w"> </span>install<span class="w"> </span>linux-tools-common
sudo<span class="w"> </span>apt<span class="w"> </span>install<span class="w"> </span>linux-tools-generic
sudo<span class="w"> </span>apt<span class="w"> </span>install<span class="w"> </span>linux-tools-<span class="sb">`</span>uname<span class="w"> </span>-r<span class="sb">`</span>
</pre></div>
</div>
<p>the last command gets the kernel modules specific to your kernel.</p>
<p>The final step of configuration allows for getting more information from perf traces.
Any debug symbols that are found will aid in understanding the output.</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span><span class="ch">#!/bin/bash</span>
<span class="c1"># Taken from Milian Wolf's talk "Linux perf for Qt developers"</span>
sudo<span class="w"> </span>mount<span class="w"> </span>-o<span class="w"> </span>remount,mode<span class="o">=</span><span class="m">755</span><span class="w"> </span>/sys/kernel/debug
sudo<span class="w"> </span>mount<span class="w"> </span>-o<span class="w"> </span>remount,mode<span class="o">=</span><span class="m">755</span><span class="w"> </span>/sys/kernel/debug/tracing
<span class="nb">echo</span><span class="w"> </span><span class="s2">"0"</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>sudo<span class="w"> </span>tee<span class="w"> </span>/proc/sys/kernel/kptr_restrict
<span class="nb">echo</span><span class="w"> </span><span class="s2">"-1"</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>sudo<span class="w"> </span>tee<span class="w"> </span>/proc/sys/kernel/perf_event_paranoid
sudo<span class="w"> </span>chown<span class="w"> </span><span class="sb">`</span>whoami<span class="sb">`</span><span class="w"> </span>/sys/kernel/debug/tracing/uprobe_events
sudo<span class="w"> </span>chmod<span class="w"> </span>a+rw<span class="w"> </span>/sys/kernel/debug/tracing/uprobe_events
</pre></div>
</div>
<p>Python 3.12 will have <a class="reference external" href="https://docs.python.org/3.12/howto/perf_profiling.html">native support for perf</a>.</p>
</section>
<section id="running-perf">
<h2>Running perf<a class="headerlink" href="#running-perf" title="Link to this heading">¶</a></h2>
<p>To profile a single test (this starts with <code class="docutils literal notranslate"><span class="pre">time</span></code> to see how long the overall test takes)</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span><span class="nb">time</span><span class="w"> </span>perf<span class="w"> </span>record<span class="w"> </span>-g<span class="w"> </span>./bin/AlgorithmsTest<span class="w"> </span>FilterEventsTest
</pre></div>
</div>
<p>The report can be viewed in a couple of ways. Using the curses-based tool</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>perf<span class="w"> </span>report<span class="w"> </span>--no-children<span class="w"> </span>-s<span class="w"> </span>dso,sym,srcline
</pre></div>
</div>
<p>The report can also be viewed <a class="reference external" href="https://github.com/brendangregg/FlameGraph">FlameGraph</a> which generates an <code class="docutils literal notranslate"><span class="pre">.svg</span></code> that can be viewed in a web browser</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>perf<span class="w"> </span>script<span class="w"> </span><span class="p">|</span><span class="w"> </span>~/code/FlameGraph/stackcollapse-perf.pl<span class="w"> </span><span class="p">|</span><span class="w"> </span>~/code/FlameGraph/flamegraph.pl<span class="w"> </span>><span class="w"> </span>flame.svg
</pre></div>
</div>
</section>
</section>
<section id="profiling-with-intel-s-vtune">
<h1>Profiling with Intel’s VTune<a class="headerlink" href="#profiling-with-intel-s-vtune" title="Link to this heading">¶</a></h1>
<p>Intel’s VTune profiler (<a class="reference external" href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler-download.html">download link</a> and <a class="reference external" href="https://www.intel.com/content/www/us/en/docs/vtune-profiler/installation-guide/2023-0/package-managers.html">install instructions</a>) is part of the one-api suite of software that is available for open source projects.
This is part of the same suite that provides TBB (threaded building blocks) that are used in mantid.
After installing, one must configure (these <a class="reference external" href="https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/linux-targets.html">instructions are for ubuntu</a>) using the command</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>sudo<span class="w"> </span>sysctl<span class="w"> </span>-w<span class="w"> </span>kernel.yama.ptrace_scope<span class="o">=</span><span class="m">0</span>
</pre></div>
</div>
<p>This needs to be done at every system reboot, but can be configured in sysctl to be a permanent option as well.
Finally, the environment settings for vtune are in <code class="docutils literal notranslate"><span class="pre">/opt/intel/oneapi/vtune/latest/vtune-vars.sh</span></code> and the gui can be started using <code class="docutils literal notranslate"><span class="pre">vtune-gui</span></code>.</p>
<p>From the welcome screen, you will want to “Configure Analysis” (the play button).</p>
<figure class="align-default">
<img alt="Example configuration" src="_images/vtune_configure.png" />
</figure>
<p>This example takes advantage of how cxxtestgen works by running the command</p>
<div class="highlight-sh notranslate"><div class="highlight"><pre><span></span>bin/AlgorithmsTest<span class="w"> </span>FilterEventsTest<span class="w"> </span>test_tableSplitterHuge
</pre></div>
</div>
<p>which runs The <code class="docutils literal notranslate"><span class="pre">test_tableSplitterHuge</span></code> test of the <code class="docutils literal notranslate"><span class="pre">FilterEventsTest</span></code> suite, in the <code class="docutils literal notranslate"><span class="pre">AlgorithmsTest</span></code> binary.
It is suggested that one selects “User-Mode Sampling” to avoid seeing kernel methods and get the flame graph visualization.
Once the analysis is completed, you will see the summary.
It is recommended that you start with the “Flame Graph” and “Top-down Tree” visualizations first.</p>
</section>
</div>
</div>
</div>
<footer class="footer">
<div class="container">
<ul class="nav navbar-nav" style=" float: right;">
<li>
<a href="Timers.html" title="Previous Chapter: Mantid Timers"><span class="glyphicon glyphicon-chevron-left visible-sm"></span><span class="hidden-sm hidden-tablet">« Mantid Timers</span>
</a>
</li>
<li>
<a href="ProfilingWithValgrind.html" title="Next Chapter: Profiling with Valgrind"><span class="glyphicon glyphicon-chevron-right visible-sm"></span><span class="hidden-sm hidden-tablet">Profiling wit... »</span>
</a>
</li>
<li><a href="#">Back to top</a></li>
</ul>
<p>
</p>
</div>
</footer>
</body>
</html>