Currently we have no control over the random seed used by scikit-learn, and we don't even know how it is chosen. We need to add a command line parameter for setting the seed, falling back to numpy.random.randint(min_seed_value, max_seed_value) when none is given.
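A minimal sketch of what the command line side could look like, assuming argparse is used. The option name `--seed`, the helper names, and the bounds are all hypothetical; the issue does not say what min_seed_value and max_seed_value should be. The upper bound here is chosen only because legacy numpy seeds must fit in 32 bits.

```python
import argparse

import numpy as np

# Hypothetical bounds: the issue leaves min_seed_value / max_seed_value open.
MIN_SEED_VALUE = 0
MAX_SEED_VALUE = 2**32 - 1  # exclusive upper bound passed to randint


def parse_args(argv=None):
    """Parse the command line; --seed is an assumed option name."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--seed",
        type=int,
        default=None,
        help="random seed; a random one is drawn if omitted",
    )
    return parser.parse_args(argv)


def resolve_seed(seed=None):
    """Return the user's seed, or draw one so it can be reported later."""
    if seed is None:
        # dtype is set explicitly because the default integer width
        # of randint is platform dependent.
        seed = int(
            np.random.randint(MIN_SEED_VALUE, MAX_SEED_VALUE, dtype=np.int64)
        )
    return seed


if __name__ == "__main__":
    seed = resolve_seed(parse_args().seed)
    print(f"using seed {seed}")  # record it so the run can be repeated
    np.random.seed(seed)
```

Printing or logging the resolved seed matters as much as accepting one: a run that drew its own seed can only be repeated if that value was written down. The same integer can also be passed as the random_state argument that scikit-learn estimators accept.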
The only problem I can see with this is the use of pmap. Once we introduce parallelism, we no longer guarantee that the stars will be processed in any particular order. With the way pmap currently works, it would need processes=1 to produce the exact same results every time. Although the same sequence of random numbers is generated, which numbers go to which star depends on race conditions, so the same seed may produce different results. A solution would be to impose a strict mapping of random numbers to stars, associating them in a repeatable way that does not depend on processing order. That way it will be possible to publish a set of results along with the seed used, and others should be able to reproduce the exact same results.
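One way to get such a mapping, sketched under assumptions: derive a separate seed for each star from the base seed and a stable hash of the star's identifier, so the random numbers a star receives do not depend on when it is processed. The function names and star IDs below are hypothetical, and nothing here relies on how pmap is implemented.

```python
import zlib

import numpy as np


def seed_for_star(base_seed, star_id):
    """Derive a repeatable 32-bit seed from the base seed and a star ID."""
    # crc32 is stable across runs and processes, unlike the built-in
    # hash(), which is randomized per interpreter for strings.
    star_key = zlib.crc32(star_id.encode("utf-8"))
    seq = np.random.SeedSequence([base_seed, star_key])
    return int(seq.generate_state(1)[0])


def process_star(base_seed, star_id):
    """Stand-in for the real per-star work (hypothetical)."""
    rng = np.random.RandomState(seed_for_star(base_seed, star_id))
    return star_id, rng.normal(size=3)


# Hypothetical star IDs, processed in two different orders.
stars = ["star_a", "star_b", "star_c"]
forward = dict(process_star(42, s) for s in stars)
backward = dict(process_star(42, s) for s in reversed(stars))
```

Because each star builds its own generator from (base seed, star ID), `forward` and `backward` hold identical arrays for every star, and the same would hold for any number of worker processes. The derived integer could also be handed to scikit-learn as random_state instead of relying on the global numpy state.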