Running Ubuntu 23.10 (kernel 6.5.0-44) on an Intel Xeon Gold 6230R (Cascade Lake), I’ve compiled Extrae 4.2.3 and linked it against the apt-provided libpapi 7.0 as well as a self-compiled libpapi 7.1, using gcc 13.2.0 and libgomp.
I’ve tested that:
- this happens on all tested OpenMP programs, even trivial ones
- libseqtrace does not seem to have the same issue and creates a trace with counter values
- PAPI works fine on its own, without instrumentation
Note that it doesn’t fail on single-thread executions but fails as soon as 2 threads appear. The segfault appears to happen inside `ioctl()`, which is called by `PAPI_add_event` internals, and the offending address is the stack pointer (`$rsp`) - 8.
Would appreciate any help you can give me in debugging/avoiding this crash.
Here’s Extrae’s configure summary:
```
Package configuration for Extrae 4.2.3
-----------------------
Installation prefix: /home/ljaulmes/.local
Cross compilation: no
CC: gcc
CXX: g++
Binary type: 64 bits
MPI instrumentation: no
GASPI instrumentation: no
OpenMP instrumentation: yes, through LD_PRELOAD
GNU OpenMP: yes
IBM OpenMP: no
Intel OpenMP: yes
OMPT: yes
OpenSHMEM instrumentation: no
pThread instrumentation: yes
Support for pthread_barrier_wait: yes
Support for pthread_cond_* calls: yes
CUDA instrumentation: no
OpenCL instrumentation: no
OPENACC instrumentation: no
Java instrumentation: unsupported
Performance counters: yes
Performance API: PAPI
PAPI home: /usr
Sampling support: yes
PEBS sampling: yes
libbfd available: yes (/usr/lib/x86_64-linux-gnu)
libiberty available: yes (/usr/lib/x86_64-linux-gnu)
zlib available: yes (/usr/local)
libxml2 available: yes (/usr)
BOOST available: no
callstack access: through libunwind (/usr)
Dynamic instrumentation: no
Optional features:
------------------
On-line analysis: no
Clock routine: POSIX / clock_gettime, but don't need to link against posix clock library explicitly
Heterogeneous support: no
Parallel merge: not available as MPI is not given
```
Here’s a simple example:
```c
#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel
    {
        int thread_id = omp_get_thread_num();
        printf("Hello from process: %d\n", thread_id);
    }
    return 0;
}
```
Output:
```
Extrae: WARNING! omp_get_thread_num_real is a NULL pointer. Did the initialization of this module trigger? Retrying initialization...
Welcome to Extrae 4.2.3
Extrae: Detected GOMP version is 4.5
Extrae: Detected and hooked OpenMP runtime: [GNU GOMP]
Extrae: OMP_NUM_THREADS set to 2
Extrae: Parsing the configuration file (/home/ljaulmes/tests/openmp/extrae.xml) begins
Extrae: Tracing package is located on /home/ljaulmes/.local
Extrae: Generating intermediate files for Paraver traces.
Extrae: PAPI domain set to ALL for HWC set 1
Extrae: HWC set 1 contains following counters < PAPI_TOT_CYC (0x8000003b) > - never changes
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (/home/ljaulmes/tests/openmp/extrae.xml) has ended
Extrae: Intermediate traces will be stored in /home/ljaulmes/tests/openmp
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 2 threads
Hello from process: 0
Segmentation fault (core dumped)
```
I tried Extrae 3.8.3, which did not crash, so I went ahead and ran a `git bisect`. The first bad commit appears to be 0df3e97. Reverting that commit on top of v4.2.3 fixes the issue.
This problem is also reproducible with the MareNostrum 5 installation available via the module extrae/4.2.3. When tracing OpenMP with PAPI counters, Extrae segfaults.
After reverting commit 0df3e97 and doing my own installation, I can successfully trace OpenMP with PAPI counters.
See #127 for a revert commit that solves the issue for me on MareNostrum 5.
Let me know any other info you need.