Scalability of the library #195
-
Hello,

For my application I have to run a large number of simulations, each with different data. I have access to compute servers with 96 cores each. I also have access to GPU-based clusters, but I have read that the GPU is used only for some parts of the simulation and might not offer a significant speedup, so I am using the Python wrapper.

However, when I run the code for a fixed amount of time and compare the number of simulation steps completed for different numbers of cores, performance improves up to 24 cores, degrades beyond that, and is significantly worse at 96 cores. Does the library in general scale well with the number of cores?

I also need to pass some variables to the 'time_step_callback' method. Currently I wrap the whole simulation code in a Python class and access the values through class attributes inside 'time_step_callback'. Could this be a potential bottleneck for the multithreading part? A sketch of the pattern follows below.

Thanks in advance.
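Roughly, the pattern I use looks like this (the `FakeSim` class and its methods are just stand-ins so the snippet runs on its own; the real code constructs the simulation through the binding, and the actual API names differ):

```python
import time

class FakeSim:
    """Stand-in for the real simulation object, only here so the sketch runs."""
    def __init__(self):
        self._callback = None

    def set_time_step_callback(self, cb):
        self._callback = cb

    def step(self):
        if self._callback is not None:
            self._callback()

class SimulationRunner:
    """Wraps one simulation run so per-run data is reachable from the callback."""
    def __init__(self, scene_params):
        self.scene_params = scene_params  # per-run data the callback needs
        self.step_count = 0

    def time_step_callback(self):
        # Called once per step; reads per-run state through self.
        self.step_count += 1

    def run_for(self, duration_s):
        sim = FakeSim()  # the real code creates the simulation via the binding
        sim.set_time_step_callback(self.time_step_callback)
        t_end = time.perf_counter() + duration_s
        while time.perf_counter() < t_end:
            sim.step()
        return self.step_count

if __name__ == "__main__":
    runner = SimulationRunner(scene_params={"viscosity": 0.01})
    print("steps completed:", runner.run_for(1.0))
```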
Replies: 1 comment
-
Hi,

Typically the library does not scale that badly. However, we have never measured this through the Python interface, and the Python part can be a bottleneck. A current issue with the Python binding is that we can only activate the SIMD optimizations on Windows; on Linux the Python version currently does not use SIMD optimizations. In the C++ version this works. It is on our todo list.

Hope that helps!
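One way to narrow down where the time goes is to run the same fixed-duration benchmark with different thread counts. Assuming the native core is parallelized with OpenMP (so it honors OMP_NUM_THREADS; this is an assumption about your build), a rough driver could look like the sketch below, where 'benchmark.py' is a placeholder for your own script that runs the simulation for a fixed time and prints the step count:

```python
import os
import subprocess
import sys

# OpenMP reads OMP_NUM_THREADS at startup, so set it in the child process
# environment rather than after the library has already been imported.
for threads in (1, 2, 4, 8, 16, 24, 48, 96):
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    result = subprocess.run(
        [sys.executable, "benchmark.py"],  # placeholder for your own benchmark script
        env=env, capture_output=True, text=True, check=True,
    )
    print(f"{threads:3d} threads -> {result.stdout.strip()}")
```

If the step rate also stops scaling in a pure C++ run, the limit is in the native code; if only the Python run degrades, the callback and the binding layer are the more likely culprits.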