refactor zmqrunner to async #136
Conversation
CodSpeed Performance Report: merging this PR will degrade performance by 21.95%.
…ation, before any cleanup, so this is still messy as all hell, and the statefulness test is not working because the stateless nodes aren't returning anything
ok, the perf regressions look like they are in part from an error being emitted during benchmarking, which would be bad. it also looks like there is plenty of overhead from the ThreadPoolExecutor in the command node callbacks, which i think i can avoid by making the ZMQRunner callbacks private and async. in general i need to make the callbacks private on both the command node and the ZMQRunner; a rough sketch of the direction is below.
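A minimal sketch of what I mean, assuming an asyncio event loop and pyzmq's `zmq.asyncio` sockets; the class shape and names like `_on_command` are placeholders, not the actual runner API:

```python
import asyncio
import zmq
import zmq.asyncio


class ZMQRunner:
    """Sketch only: private async callbacks instead of executor-dispatched ones."""

    def __init__(self, address: str = "tcp://127.0.0.1:5555") -> None:
        self._address = address
        self._ctx = zmq.asyncio.Context.instance()
        self._running = False

    async def _on_command(self, frames: list[bytes]) -> None:
        # Private async callback: it runs directly on the event loop,
        # so there is no ThreadPoolExecutor handoff per message.
        sender, payload = frames[0], frames[-1]
        await self._socket.send_multipart([sender, b"ack:" + payload])

    async def run(self) -> None:
        # Create and bind the socket inside the running loop.
        self._socket = self._ctx.socket(zmq.ROUTER)
        self._socket.bind(self._address)
        self._running = True
        try:
            while self._running:
                frames = await self._socket.recv_multipart()
                await self._on_command(frames)
        finally:
            self._socket.close()

    def stop(self) -> None:
        self._running = False
```

Driving it would just be `asyncio.run(ZMQRunner().run())`; the point is that the callback hot path never leaves the event loop.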
ok, need to stop here for now, but just confirming: the ZMQRunner long-add regression is an artifact of different parts of the code being visible to the benchmarking instrumentation. the ZMQ kitchen sink test is weirdly slower, though, for reasons I think are related to the ThreadPoolExecutor. that is a reasonable trade because we should get perf gains when there is actual network I/O to do, which is the purpose of the ZMQ runner. will do profiling later on this; a toy timing sketch of the executor overhead is below.
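Purely to illustrate the overhead I'm attributing to the executor (this is a toy timing loop, not our benchmark suite, and `tiny_callback` is a made-up stand-in for a cheap command-node callback):

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_callback(x: int) -> int:
    # Stand-in for a cheap command-node callback with no real I/O.
    return x + 1


async def tiny_callback_async(x: int) -> int:
    return x + 1


async def main() -> None:
    loop = asyncio.get_running_loop()
    n = 10_000

    # Path 1: dispatch each callback through a thread pool, as the current
    # ZMQRunner callbacks effectively do.
    with ThreadPoolExecutor() as pool:
        start = time.perf_counter()
        for i in range(n):
            await loop.run_in_executor(pool, tiny_callback, i)
        executor_time = time.perf_counter() - start

    # Path 2: await a coroutine directly on the event loop.
    start = time.perf_counter()
    for i in range(n):
        await tiny_callback_async(i)
    direct_time = time.perf_counter() - start

    # The executor path pays a thread handoff per call; that fixed cost is
    # what dominates when the callback itself does almost no work.
    print(f"executor: {executor_time:.3f}s  direct await: {direct_time:.3f}s")


if __name__ == "__main__":
    asyncio.run(main())
```

The per-call handoff cost stops mattering once the callback actually awaits network I/O, which is the trade described above.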
the gains in the sync runner, on the other hand, are real: they come from removing the Condition objects from the scheduler (where they weren't being used by the sync runner anyway). the sketch below shows the shape of that change.
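Roughly the shape of it, with placeholder names (`_ready`, `enqueue`, `pop`) rather than the real scheduler API: the old version took a `threading.Condition` lock on every enqueue/dequeue even though the sync runner never waited on it.

```python
import threading
from collections import deque


class SchedulerWithCondition:
    """Before (sketch): every enqueue/dequeue acquires a Condition lock."""

    def __init__(self) -> None:
        self._ready = deque()
        self._cond = threading.Condition()

    def enqueue(self, node) -> None:
        with self._cond:
            self._ready.append(node)
            self._cond.notify()

    def pop(self):
        with self._cond:
            return self._ready.popleft() if self._ready else None


class SchedulerPlain:
    """After (sketch): the sync runner never waited on the Condition,
    so a plain deque avoids the per-operation locking overhead."""

    def __init__(self) -> None:
        self._ready = deque()

    def enqueue(self, node) -> None:
        self._ready.append(node)

    def pop(self):
        return self._ready.popleft() if self._ready else None
```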
alright, i can’t really replicate this locally, but if you take a look at the CodSpeed metrics, something like 40% of the time is spent in a set_dealloc function, which looks like an internal Python method used to deallocate memory used by frames when rendering a traceback? if i remove that from the calculations, the numbers match what i see locally. i don’t see any tracebacks printed, and there isn’t a place where they would be getting swallowed without being re-raised that i can see. i’m going to mark those regressions as acknowledged and move on, since it is actually faster and this looks like it’s just some artifact from mysterious profiler instrumentation. when measuring locally, this is about twice as fast as the prior implementation, and 4x as fast as 74cf737 for some reason; not sure why Poller is so slow. for reference, the two receive styles are sketched below.
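For reference only, a guess at what "going through Poller" vs awaiting the socket directly looks like in pyzmq's asyncio API; I haven't confirmed which style the earlier implementation used, and this is illustrative rather than a measurement:

```python
import asyncio
import zmq
import zmq.asyncio


async def recv_with_poller(socket: zmq.asyncio.Socket) -> bytes:
    # Style A: explicit Poller, an extra poll() round-trip per message.
    poller = zmq.asyncio.Poller()
    poller.register(socket, zmq.POLLIN)
    events = dict(await poller.poll())
    if socket in events:
        return await socket.recv()
    return b""


async def recv_direct(socket: zmq.asyncio.Socket) -> bytes:
    # Style B: just await recv(); pyzmq handles readiness internally.
    return await socket.recv()


async def main() -> None:
    ctx = zmq.asyncio.Context.instance()
    a = ctx.socket(zmq.PAIR)
    b = ctx.socket(zmq.PAIR)
    a.bind("inproc://demo")
    b.connect("inproc://demo")
    await a.send(b"hello")
    print(await recv_with_poller(b))
    await a.send(b"again")
    print(await recv_direct(b))


if __name__ == "__main__":
    asyncio.run(main())
```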
doing a merge since Raymond is busy!
fix #135
(just drafting; this is an awkward middle state, and i still need to pull out all the scheduler awaiting stuff. might be best to just take it out of the scheduler altogether)
📚 Documentation preview 📚: https://noob--136.org.readthedocs.build/en/136/