Stress testing causes PDBGroupMonitor to segfault #64
Comments
Do you have any chance to test this setup against QSRV2/PVXS in the IOC?
Using above setup adapted for
Race conditions / ownership bugs like this one motivated the creation of PVXS and QSRV2. So I would recommend migrating your IOCs.
I think this makes the circumstance clear enough. If practical, please attach the result of running
File with stack traces from all threads: all_threads.log
Well this is puzzling. The associated There is also a second pair of threads for another client where both threads also appear to be idle. So whatever corruption has already occurred.
@akete If you are feeling adventurous, or are interested in learning about a neat Linux-only tool, the RR reverse execution debugger can be very effective in finding these sorts of use-after-free errors. Run to the point of the crash, set a watchpoint on I might also suggest
Background: this was first observed using a custom-built IOC with our custom record type support and Phoebus CSS as a client. Later, I was able to reproduce the problem entirely in softIocPVA with multiple pvget -m clients.
Problem
With QSRV grouping multiple PVs into an NTTable (complete setup attached) and either a CSS screen displaying all components in an X/Y Plot or multiple pvget -m clients (*), the IOC crashes with
Steps to reproduce
while true; do pvput ....; done) and continue to do so until the IOC crashes,
.bob file),
Test setup
NTTable_example.zip
softIocPVA with PV structure:
CSS GUI:
(*) I was able to reproduce this using 7 pvget -m clients and killing them (kill -9) at the same time after a while.
Initial analysis
By adding additional traces into modules/pva2pva/pdbApp/pdbgroup.cpp I was able to confirm that PDBGroupMonitor::release() is being called after PDBGroupMonitor::destroy() has already been called. This happens when the IOC becomes overwhelmed, and calls to PDBGroupMonitor::release() start occurring.
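To make the ordering problem concrete, here is a minimal, self-contained C++ sketch of a release() call arriving after destroy(). It is not the pva2pva code and not a proposed patch: SketchMonitor, releaseIfAlive() and demoLateRelease() are hypothetical names made up for illustration, and the real PDBGroupMonitor has different members and locking. The sketch only assumes that the late caller could hold a weak reference and that the monitor could remember it was destroyed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a monitor object; NOT the real PDBGroupMonitor.
class SketchMonitor {
public:
    // Called when the subscription is torn down (e.g. the client disappeared).
    void destroy() {
        std::lock_guard<std::mutex> guard(lock_);
        destroyed_ = true;
    }

    // Called when a queued update is handed back.
    // Returns false if the call arrived after destroy() and was ignored.
    bool release() {
        std::lock_guard<std::mutex> guard(lock_);
        if (destroyed_) {
            ++lateReleases_;  // count the late call instead of touching freed state
            return false;
        }
        ++released_;
        return true;
    }

    int released() const {
        std::lock_guard<std::mutex> guard(lock_);
        return released_;
    }

    int lateReleases() const {
        std::lock_guard<std::mutex> guard(lock_);
        return lateReleases_;
    }

private:
    mutable std::mutex lock_;
    bool destroyed_ = false;
    int released_ = 0;
    int lateReleases_ = 0;
};

// A caller holding only a weak reference cannot dereference a freed monitor:
// lock() yields an empty pointer once the last owner is gone.
bool releaseIfAlive(const std::weak_ptr<SketchMonitor>& weak) {
    if (std::shared_ptr<SketchMonitor> strong = weak.lock()) {
        return strong->release();
    }
    return false;
}

// Replays the observed ordering: release(), destroy(), release(), free, release().
// Returns the number of release() calls that arrived after destroy().
int demoLateRelease() {
    std::shared_ptr<SketchMonitor> mon = std::make_shared<SketchMonitor>();
    std::weak_ptr<SketchMonitor> weak(mon);

    const bool first = releaseIfAlive(weak);      // normal release
    mon->destroy();
    const bool second = releaseIfAlive(weak);     // late release, ignored
    const int late = mon->lateReleases();

    mon.reset();                                  // last owner drops the object
    const bool third = releaseIfAlive(weak);      // object gone, nothing touched

    assert(first && !second && !third);
    return late;
}
```

Calling demoLateRelease() walks through the same order of calls that the added traces showed. Whether a guard like this fits the actual ownership model in pdbgroup.cpp is an open question; the sketch only shows why a late release() is harmless when the callee can detect it and dangerous when it cannot.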