Testing the two examples, comparing the standard map against 4 Ray workers. Plain map is much faster on the default examples (as expected, given Ray's overhead?).
Arbitrarily adding a loop to the eval function so that each individual's eval takes more time. Testing whether batching out via Ray ActorPool improves on the standard Python map, to make sure the implementation in ray_map.py is not inefficient (a rough sketch of the batched-map shape is below).
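For reference, a minimal sketch of what a batched ActorPool map can look like. This is not the actual ray_map.py; the actor name, batch size, and worker count are illustrative:

```python
import ray
from ray.util import ActorPool

@ray.remote
class EvalActor:
    """Worker that evaluates a whole batch per remote call,
    amortizing Ray's per-task scheduling overhead."""
    def eval_batch(self, func, batch):
        return [func(item) for item in batch]

def ray_map(func, iterable, num_workers=4, batch_size=32):
    """map()-like helper that fans fixed-size batches out to an ActorPool."""
    items = list(iterable)
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    pool = ActorPool([EvalActor.remote() for _ in range(num_workers)])
    # ActorPool.map yields results in submission order, so fitnesses
    # stay aligned with the population.
    nested = pool.map(lambda actor, batch: actor.eval_batch.remote(func, batch), batches)
    return [result for batch in nested for result in batch]
```

In DEAP this would plug in the same way the SCOOP examples register futures.map: `toolbox.register("map", ray_map)`.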
Hard to tell if the issue is just the simple examples being too cheap... the real benefit of Ray (SCOOP in the past), and the point of accepting the network and future-spin-up overhead, was to gain parallelism for heavy workloads.
For big-data cases and heavy eval functions, especially with large arrays and the ability to use Ray shared memory objects (sketched below), I'd expect the speedup to be much greater. Need to set up an example for this, or adjust the current examples to be genuinely heavy (change the fitness calc to really expensive operations), to show off the benefit of batching via Ray.
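The relevant Ray feature for the big-array case is the object store: `ray.put` the data once and pass the reference around, so every worker reads it via shared memory instead of receiving its own pickled copy per task. The fitness function below is a hypothetical stand-in just to show the wiring:

```python
import numpy as np
import ray

ray.init(num_cpus=4)

big_data = np.random.rand(5_000_000)  # large shared dataset
data_ref = ray.put(big_data)          # stored once in the Ray object store

@ray.remote
def evaluate(individual, data):
    # Ray resolves the ObjectRef before the call; workers get a zero-copy,
    # read-only view of the array rather than re-pickling it each task.
    return (float(np.sum(data[:len(individual)] * individual)),)

population = [np.random.randint(0, 2, size=100) for _ in range(8)]
fitnesses = ray.get([evaluate.remote(ind, data_ref) for ind in population])
```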
This was tested with the new manager update, which does not use a remote manager to fire off the remote batched map actors (so one fewer worker consuming the resources given to Ray).
symbreg_ray.py
Eval is made arbitrarily more expensive to show the difference (see the sketch after the timings).
`time python symbreg_ray.py` on my machine (8 processors) shows:
num_cpus=1 (map): 20.3 sec (real)
num_cpus=2 (ray): 14.1 sec (real)
num_cpus=4 (ray): 11.9 sec (real)
num_cpus=7 (ray): 13.1 sec (real)
num_cpus=8 (ray): 13.0 sec (real)
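The padding is just a repeat loop around the stock DEAP symbreg error measure (target x^4 + x^3 + x^2 + x). The sketch below takes the already-compiled GP function directly so it runs standalone, with the toolbox.compile step elided; the repeat count is illustrative, not the exact value behind the timings above:

```python
import math

def eval_symb_reg(func, points, repeats=200):
    """MSE against the DEAP symbreg target, repeated `repeats` times as
    artificial padding so eval cost dominates Ray's scheduling overhead."""
    mse = 0.0
    for _ in range(repeats):  # arbitrary extra work, as described above
        mse = math.fsum((func(x) - x**4 - x**3 - x**2 - x) ** 2
                        for x in points) / len(points)
    return (mse,)

# A perfect candidate scores 0.0:
print(eval_symb_reg(lambda x: x**4 + x**3 + x**2 + x,
                    [i / 10.0 for i in range(-10, 10)]))
```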
onemax_ray.py
Eval is made arbitrarily more expensive to show the difference. Tricky, as DeltaPenalty skips evals for infeasible individuals (see the sketch after the timings).
`time python onemax_ray.py` on my machine (8 processors) shows:
num_cpus=1 (map): 25.5 sec (real)
num_cpus=2 (ray): 17.5 sec (real)
num_cpus=4 (ray): 13.0 sec (real)
num_cpus=7 (ray): 13.3 sec (real)
num_cpus=8 (ray): 13.6 sec (real)
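The DeltaPenalty wrinkle: when the feasibility test fails, DEAP returns the penalty fitness without ever calling the wrapped evaluate, so the number of real evals (and the timing) varies run to run. The feasibility rule below matches the modified onemax (no 1 in the first index); the delta value is illustrative:

```python
from deap import base, creator, tools

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("evaluate", lambda ind: (sum(ind),))  # plain onemax fitness

def feasible(individual):
    # constraint from the modified onemax: first gene must be 0
    return individual[0] == 0

# Infeasible individuals receive the constant delta fitness and the wrapped
# evaluate() is never called for them -- hence the uneven eval counts.
toolbox.decorate("evaluate", tools.DeltaPenalty(feasible, 0.0))

print(toolbox.evaluate(creator.Individual([0, 1, 1])))  # feasible -> (2,)
print(toolbox.evaluate(creator.Individual([1, 1, 1])))  # infeasible -> (0.0,)
```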
I need to make better decisions about the examples' intentions.
Part of the problem: arbitrarily increasing their complexity via loops breaks the idea behind each example. Also, my onemax is not very onemax, as it forbids a 1 in the first index of the individual... this will confuse people looking for meaningful examples to learn from, especially when comparing against the original examples.
Symbreg is nearly OK, but it can be slowed down more meaningfully (to show the speedup on multiple CPUs) by adding big data. Then we can use Ray shared memory to improve the example further without making it senseless.
So, todo: go with bigger pops and bigger data to show off the batching speedup, rather than breaking the examples, and retime things using multiple runs to average (a harness for that is sketched below).
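A small harness for the retiming todo, averaging wall-clock time over several runs. Script names are the ones above; the run count is arbitrary:

```python
import statistics
import subprocess
import time

def time_script(path, runs=5):
    """Run a script `runs` times and return mean and stdev of wall-clock time."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(["python", path], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

for script in ("symbreg_ray.py", "onemax_ray.py"):
    mean, stdev = time_script(script)
    print(f"{script}: {mean:.1f}s +/- {stdev:.1f}s")
```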
Updated the symbreg examples with heavier loads and shared memory items to better illustrate the speedups.
Attempted to convert the onemax_island_scoop example, but its recursive approach spawned a ton of actors, which broke things. Need to find a way to handle that... (one possible shape is sketched below).
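One hedged way around the actor blowup, untested here: run each island as a single top-level Ray task with a plain serial map inside it, so nothing nested ever creates actors. The evolution loop below is a trivial stand-in for the real DEAP island algorithm:

```python
import random
import ray

ray.init(num_cpus=4)

@ray.remote
def run_island(pop, ngen):
    # Evolve one island entirely inside this top-level task; evals use the
    # builtin map, so no actors are created from within a worker.
    for _ in range(ngen):
        fits = list(map(sum, pop))    # stand-in for the real eval map
        best = pop[fits.index(max(fits))]
        pop = [best[:] for _ in pop]  # stand-in for select/mate/mutate
    return pop

islands = [[[random.randint(0, 1) for _ in range(20)] for _ in range(10)]
           for _ in range(4)]
islands = ray.get([run_island.remote(isl, 5) for isl in islands])
print([max(map(sum, isl)) for isl in islands])
```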