Commit 2f2f993

Merge pull request #657 from tiago-rodrigues/trodrigues/tracy_libunwind
Add support for using libunwind
2 parents 348be05 + c373647 commit 2f2f993

6 files changed, +51 -6 lines changed

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -84,6 +84,7 @@ set_option(TRACY_MANUAL_LIFETIME "Enable the manual lifetime management of the p
 set_option(TRACY_FIBERS "Enable fibers support" OFF)
 set_option(TRACY_NO_CRASH_HANDLER "Disable crash handling" OFF)
 set_option(TRACY_TIMER_FALLBACK "Use lower resolution timers" OFF)
+set_option(TRACE_CLIENT_LIBUNWIND_BACKTRACE "Use libunwind backtracing where supported" OFF)
 
 if(NOT TRACY_STATIC)
     target_compile_definitions(TracyClient PRIVATE TRACY_EXPORTS)

manual/techdoc.tex

Lines changed: 4 additions & 0 deletions
@@ -258,12 +258,16 @@ \subsubsection{Initialization}
 
 On some platforms a bit of setup work is required. This is done in the \texttt{InitCallstack()} function.
 
+On Windows, Tracy will attempt to preload symbols at \texttt{InitCallstack()} time. It does this for device drivers and process modules. As this process can be slow when a lot of pdbs are involved, you can set the \texttt{TRACY\_NO\_DBHELP\_INIT\_LOAD} environment variable to "1" to disable this behavior and rely on on-demand symbol loading.
+
 \subsubsection{Getting the frames}
 
 Call stack collection is initiated by calling the \texttt{Callstack()} procedure, with maximum stack depth to be collected passed as a parameter. Stack unwinding must be performed in the place in which call stack was queried, as further execution of the application will change the stack contents. The unfortunate part is that the stack unwinding on platforms other than x86 is not a fast operation.
 
 To perform unwinding various OS functions are used: \texttt{RtlWalkFrameChain()}, \texttt{\_Unwind\_Backtrace()}, \texttt{backtrace()}. A list of returned frame pointers is saved in a buffer, which will be later sent to the server. The maximum unwinding depth limit (63 entries) is due to the specifics of the underlying OS functionality.
 
+On some platforms you can define \texttt{TRACE\_CLIENT\_LIBUNWIND\_BACKTRACE} to use libunwind to perform callstack captures, as it might be a faster alternative than the default implementation. If you do, you must compile/link your client against libunwind. See \url{https://github.com/libunwind/libunwind} for more details.
+
 \subsubsection{Decoding stack frames}
 
 Unlike the always changing call stack, stack frames themselves are immutable pointers to a specific place in the executable code. As such, the decoding process can be performed at any time (even outside of the program execution, as exemplified by debuggers). Frame decoding is only performed when the server asks for the details of a frame (section~\ref{communicationsprotocol}).

manual/tracy.tex

Lines changed: 11 additions & 1 deletion
@@ -1698,6 +1698,14 @@ \subsection{Collecting call stacks}
 Tracy will prepare for call stack collection regardless of whether you use the functionality or not. In some cases, this may be unwanted or otherwise troublesome for the user. To disable support for collecting call stacks, define the \texttt{TRACY\_NO\_CALLSTACK} macro.
 \end{bclogo}
 
+\begin{bclogo}[
+noborder=true,
+couleur=black!5,
+logo=\bclampe
+]{libunwind}
+On some platforms you can define \texttt{TRACE\_CLIENT\_LIBUNWIND\_BACKTRACE} to use libunwind to perform callstack captures, as it might be a faster alternative than the default implementation. If you do, you must compile/link your client against libunwind. See \url{https://github.com/libunwind/libunwind} for more details.
+\end{bclogo}
+
 \subsubsection{Debugging symbols}
 
 You must compile the profiled application with debugging symbols enabled to have correct call stack information. You can achieve that in the following way:
@@ -1768,6 +1776,8 @@ \subsubsection{Debugging symbols}
 }
 \end{lstlisting}
 
+At initialization time, Tracy will attempt to preload symbols for device drivers and process modules. As this process can be slow when a lot of pdbs are involved, you can set the \texttt{TRACY\_NO\_DBHELP\_INIT\_LOAD} environment variable to "1" to disable this behavior and rely on on-demand symbol loading.
+
 \paragraph{Disabling resolution of inline frames}
 
 Inline frames retrieval on Windows can be multiple orders of magnitude slower than just performing essential symbol resolution. This manifests as profiler seemingly being stuck for a long time, having hundreds of thousands of query backlog entries queued, which are slowly trickling down. If your use case requires speed of operation rather than having call stacks with inline frames included, you may define the \texttt{TRACY\_NO\_CALLSTACK\_INLINES} macro, which will make the profiler stick to the basic but fast frame resolution mode.
@@ -2049,7 +2059,7 @@ \subsubsection{Privilege elevation}
 
 Some profiling data can only be retrieved using the kernel facilities, which are not available to users with normal privilege level. To collect such data, you will need to elevate your rights to the administrator level. You can do so either by running the profiled program from the \texttt{root} account on Unix or through the \emph{Run as administrator} option on Windows\footnote{To make this easier, you can run MSVC with admin privileges, which will be inherited by your program when you start it from within the IDE.}. On Android, you will need to have a rooted device (see section~\ref{androidlunacy} for additional information).
 
-As this system-level tracing functionality is part of the automated collection process, no user intervention is necessary to enable it (assuming that the program was granted the rights needed). However, if, for some reason, you would want to prevent your application from trying to access kernel data, you may recompile your program with the \texttt{TRACY\_NO\_SYSTEM\_TRACING} define.
+As this system-level tracing functionality is part of the automated collection process, no user intervention is necessary to enable it (assuming that the program was granted the rights needed). However, if, for some reason, you would want to prevent your application from trying to access kernel data, you may recompile your program with the \texttt{TRACY\_NO\_SYSTEM\_TRACING} define. If you want to disable this functionality dynamically at runtime instead, you can set the \texttt{TRACY\_NO\_SYS\_TRACE} environment variable to "1".
 
 \begin{bclogo}[
 noborder=true,

public/client/TracyCallstack.cpp

Lines changed: 13 additions & 2 deletions
@@ -157,9 +157,20 @@ void InitCallstack()
     SymInitialize( GetCurrentProcess(), nullptr, true );
     SymSetOptions( SYMOPT_LOAD_LINES );
 
+    // use TRACY_NO_DBHELP_INIT_LOAD=1 to disable preloading of driver
+    // and process module symbol loading at startup time - they will be loaded on demand later
+    // Sometimes this process can take a very long time and prevent resolving callstack frames
+    // symbols during that time.
+    const char* noInitLoadEnv = GetEnvVar( "TRACY_NO_DBHELP_INIT_LOAD" );
+    const bool initTimeModuleLoad = !( noInitLoadEnv && noInitLoadEnv[0] == '1' );
+    if ( !initTimeModuleLoad )
+    {
+        TracyDebug("TRACY: skipping init time dbghelper module load\n");
+    }
+
     DWORD needed;
     LPVOID dev[4096];
-    if( EnumDeviceDrivers( dev, sizeof(dev), &needed ) != 0 )
+    if( initTimeModuleLoad && EnumDeviceDrivers( dev, sizeof(dev), &needed ) != 0 )
     {
         char windir[MAX_PATH];
         if( !GetWindowsDirectoryA( windir, sizeof( windir ) ) ) memcpy( windir, "c:\\windows", 11 );
@@ -214,7 +225,7 @@ void InitCallstack()
 
     HANDLE proc = GetCurrentProcess();
     HMODULE mod[1024];
-    if( EnumProcessModules( proc, mod, sizeof( mod ), &needed ) != 0 )
+    if( initTimeModuleLoad && EnumProcessModules( proc, mod, sizeof( mod ), &needed ) != 0 )
     {
         const auto sz = needed / sizeof( HMODULE );
         for( size_t i=0; i<sz; i++ )
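
Note on the flag semantics in the hunks above: the check only looks at the first character of the variable's value, so any value starting with `1` disables the preload, and everything else (including an unset variable) leaves it enabled. The following is a minimal sketch of that behavior, not Tracy's code. The helper names `EnvFlagValueIsSet`, `EnvFlagIsSet`, and `EnvFlagExamples` are hypothetical, and `std::getenv` stands in for Tracy's `GetEnvVar()`.

```cpp
#include <cassert>
#include <cstdlib>

// Mirrors the check in the hunk: only the first character of the value is inspected.
// (Hypothetical helper name, not part of Tracy.)
inline bool EnvFlagValueIsSet( const char* value )
{
    return value != nullptr && value[0] == '1';
}

// Hypothetical convenience wrapper; std::getenv is an assumption standing in for GetEnvVar().
inline bool EnvFlagIsSet( const char* name )
{
    return EnvFlagValueIsSet( std::getenv( name ) );
}

// Usage sketch: values a user might export before launching the profiled program.
inline void EnvFlagExamples()
{
    assert( EnvFlagValueIsSet( "1" ) );
    assert( EnvFlagValueIsSet( "10" ) );      // accepted because the first character is '1'
    assert( !EnvFlagValueIsSet( "0" ) );
    assert( !EnvFlagValueIsSet( "true" ) );   // only '1' is recognized, not "true"
    assert( !EnvFlagValueIsSet( nullptr ) );  // variable not set
}
```

One consequence worth keeping in mind when documenting the variable: values such as `true` or `yes` are silently treated as "not set".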

public/client/TracyCallstack.hpp

Lines changed: 13 additions & 2 deletions
@@ -8,10 +8,15 @@
 #if TRACY_HAS_CALLSTACK == 2 || TRACY_HAS_CALLSTACK == 5
 # include <unwind.h>
 #elif TRACY_HAS_CALLSTACK >= 3
-# include <execinfo.h>
+# ifdef TRACE_CLIENT_LIBUNWIND_BACKTRACE
+// libunwind is, in general, significantly faster than execinfo based backtraces
+# define UNW_LOCAL_ONLY
+# include <libunwind.h>
+# else
+# include <execinfo.h>
+# endif
 #endif
 
-
 #ifndef TRACY_HAS_CALLSTACK
 
 namespace tracy
@@ -127,7 +132,13 @@ static tracy_force_inline void* Callstack( int depth )
     assert( depth >= 1 );
 
     auto trace = (uintptr_t*)tracy_malloc( ( 1 + (size_t)depth ) * sizeof( uintptr_t ) );
+
+#ifdef TRACE_CLIENT_LIBUNWIND_BACKTRACE
+    size_t num = unw_backtrace( (void**)(trace+1), depth );
+#else
     const auto num = (size_t)backtrace( (void**)(trace+1), depth );
+#endif
+
     *trace = num;
 
     return trace;
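
Note on the buffer layout in the hunk above: slot 0 of the returned buffer holds the number of captured frames and the frame addresses follow from slot 1. Both `backtrace()` and `unw_backtrace()` take a `void**` buffer and an `int` size and return an `int`, which is why the two can be swapped behind one `#ifdef`. The sketch below illustrates that layout under stated assumptions and is not Tracy's implementation: `CaptureFn`, `FakeCapture`, and `CaptureCallstack` are hypothetical names, a `std::vector` replaces `tracy_malloc`, and a stub unwinder is used so the example builds without libunwind or execinfo.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Signature shared by backtrace() (<execinfo.h>) and unw_backtrace() (<libunwind.h>).
using CaptureFn = int (*)( void** buffer, int size );

// Hypothetical stand-in for a real unwinder: pretends the stack is three frames deep.
inline int FakeCapture( void** buffer, int size )
{
    const int frames = size < 3 ? size : 3;
    for( int i = 0; i < frames; i++ )
    {
        buffer[i] = reinterpret_cast<void*>( static_cast<std::uintptr_t>( 0x1000 + i ) );
    }
    return frames;
}

// Builds the "count first, frames after" layout: result[0] is the frame count,
// result[1..count] are the frame addresses.
inline std::vector<std::uintptr_t> CaptureCallstack( CaptureFn capture, int depth )
{
    assert( depth >= 1 );
    std::vector<void*> frames( static_cast<std::size_t>( depth ) );
    const int got = capture( frames.data(), depth );
    // Defensive assumption of this sketch: treat a non-positive return as "no frames".
    const std::size_t num = got > 0 ? static_cast<std::size_t>( got ) : 0;
    std::vector<std::uintptr_t> trace( 1 + num );
    trace[0] = num;
    for( std::size_t i = 0; i < num; i++ )
    {
        trace[1 + i] = reinterpret_cast<std::uintptr_t>( frames[i] );
    }
    return trace;
}
```

On a system with the relevant library available, `backtrace` or `unw_backtrace` could be passed in place of `FakeCapture`. The latter requires linking against libunwind, as the manual text in this commit points out.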

public/client/TracyProfiler.cpp

Lines changed: 9 additions & 1 deletion
@@ -1439,7 +1439,15 @@ Profiler::Profiler()
 void Profiler::SpawnWorkerThreads()
 {
 #ifdef TRACY_HAS_SYSTEM_TRACING
-    if( SysTraceStart( m_samplingPeriod ) )
+    // use TRACY_NO_SYS_TRACE=1 to force disabling sys tracing (even if available in the underlying system)
+    // as it can have significant impact on the size of the traces
+    const char* noSysTrace = GetEnvVar( "TRACY_NO_SYS_TRACE" );
+    const bool disableSystrace = (noSysTrace && noSysTrace[0] == '1');
+    if( disableSystrace )
+    {
+        TracyDebug("TRACY: Sys Trace was disabled by 'TRACY_NO_SYS_TRACE=1'\n");
+    }
+    else if( SysTraceStart( m_samplingPeriod ) )
     {
         s_sysTraceThread = (Thread*)tracy_malloc( sizeof( Thread ) );
         new(s_sysTraceThread) Thread( SysTraceWorker, nullptr );
