events values ids incorrectly reported to Extrae, int array c conversion problem #13

Open
clasqui opened this issue Feb 28, 2023 · 3 comments · May be fixed by #14

@clasqui
Contributor

clasqui commented Feb 28, 2023

When a trace is generated, the event values reported in the PCF file are invalid. Judging by the values, this looks like some kind of memory corruption when the array is converted or accessed on its way from Julia to C. The values generated by one execution are in the attached PCF file.
test-cfg.pcf.txt

During the merge step, Extrae also warns that some values are duplicated:

Extrae(paraver/labels.c,885): Warning! Ignoring duplicate definition "init_worker" for value type 400002,10!
Extrae(paraver/labels.c,885): Warning! Ignoring duplicate definition "start_worker" for value type 400002,10!
Extrae(paraver/labels.c,885): Warning! Ignoring duplicate definition "CallWaitMsg" for value type 400004,10!
Extrae(paraver/labels.c,885): Warning! Ignoring duplicate definition "RemoteDoMsg" for value type 400004,10!
Extrae(paraver/labels.c,885): Warning! Ignoring duplicate definition "JoinPGRPMsg" for value type 400004,0!

In the register function, a debug line reports the values correctly.

@clasqui clasqui added bug Something isn't working help wanted Extra attention is needed labels Feb 28, 2023
@mofeing mofeing self-assigned this Feb 28, 2023
@mofeing
Member

mofeing commented Mar 10, 2023

From what I see in the PCF file, some event values are also registered twice. In the following example, the Useful event value is registered twice:

EVENT_TYPE
0    400001    Distributed workload execution
VALUES
894271024      Useful
2      Not Useful
-1598911216      Useful
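
The duplicate in the excerpt above can also be found mechanically. The following is a minimal sketch, not part of Extrae or of this package: the function name `labels_with_multiple_values` is hypothetical, and the parser assumes only the layout visible in the excerpt (a `VALUES` line followed by `value label` rows, ended by a blank line or the next `EVENT_TYPE`).

```python
from collections import defaultdict

# Excerpt copied from the PCF file discussed in this issue.
PCF_EXCERPT = """\
EVENT_TYPE
0    400001    Distributed workload execution
VALUES
894271024      Useful
2      Not Useful
-1598911216      Useful
"""


def labels_with_multiple_values(pcf_text):
    """Return {label: [values]} for labels listed under more than one value.

    Hypothetical helper: labels are pooled across all VALUES blocks in the
    text, which is enough for a single-event excerpt like the one above.
    """
    values_by_label = defaultdict(list)
    in_values = False
    for line in pcf_text.splitlines():
        stripped = line.strip()
        if stripped == "VALUES":
            in_values = True
            continue
        if stripped == "EVENT_TYPE" or not stripped:
            # A new section header or a blank line ends the VALUES block.
            in_values = False
            continue
        if in_values:
            value, label = stripped.split(None, 1)
            values_by_label[label].append(int(value))
    return {label: vals for label, vals in values_by_label.items() if len(vals) > 1}


duplicates = labels_with_multiple_values(PCF_EXCERPT)
print(duplicates)
```

Running the same check on the full attached PCF file would show whether other labels are affected in the same way.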

@clasqui Should the PCF file be generated only by the master process in the application? Could it be that different processes are trying to register the event values in the same PCF file?

@clasqui
Contributor Author

clasqui commented Mar 10, 2023

The PCF is generated at the merge step, which runs separately from the application execution, using the julia2prv script. So that cannot be the cause...

@mofeing
Member

mofeing commented Mar 11, 2023

Okay, so it really looks like memory corruption. The duplicated event values appear because the events are initialized with different values on each process. Most probably, the values come from uninitialized memory.

I'm gonna write some tests to check that the conversion of the arrays passed through the C FFI is done correctly.
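
To illustrate the kind of round-trip check meant here, the following is a minimal sketch in Python's `ctypes` rather than in Julia, and it does not call Extrae. The helper names `as_c_array` and `read_back` are hypothetical, and the element types (`c_int32` on the caller side, `c_uint64` on the callee side) are assumptions chosen only to show a width mismatch; the issue does not state which types are actually involved.

```python
import ctypes


def as_c_array(values, ctype):
    """Copy Python ints into a freshly allocated C array of `ctype` elements."""
    return (ctype * len(values))(*values)


def read_back(c_array, ctype, count):
    """Read `count` elements of `ctype` from the start of the array's bytes.

    This mimics a C callee that trusts its own idea of the element type and
    length. Unlike C, ctypes raises ValueError instead of reading past the
    end of the buffer.
    """
    raw = bytes(c_array)[: count * ctypes.sizeof(ctype)]
    return list((ctype * count).from_buffer_copy(raw))


values = [1, 2, 3, 4]
packed = as_c_array(values, ctypes.c_int32)

# Matching element type: the round trip preserves the values.
assert read_back(packed, ctypes.c_int32, len(values)) == values

# Mismatched element width: the same bytes decode to different numbers.
mismatched = read_back(packed, ctypes.c_uint64, 2)
print(mismatched)

# Asking for all four 64-bit elements needs 32 bytes, but only 16 exist.
# A C callee would read whatever happens to follow the buffer in memory.
try:
    read_back(packed, ctypes.c_uint64, len(values))
    overread_detected = False
except ValueError:
    overread_detected = True
print(overread_detected)
```

A test of this shape, written against the real register function, would fail as soon as the element type or length seen by the C side differs from what the caller packed.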

@mofeing mofeing linked a pull request Mar 12, 2023 that will close this issue