The following message directly preceding the exception suggests that the error occurred when training a BayesNet:
2021-06-01 17:22:03.846 [ORGraphSearch-worker-1] INFO executor - Fitting the learner (class: ai.libs.mlplan.core.TimeTrackingLearnerWrapper) ai.libs.mlplan.core.TimeTrackingLearnerWrapper -
2021-06-01 17:23:03.691 [Global Timer] INFO InterruptionTimerTask - Executing interruption task 1293092700 with descriptor "Timeout for timed computation with thread Thread[ORGraphSearch-wo
2021-06-01 17:23:03.693 [Global Timer] INFO Interrupter - Interrupting Thread[ORGraphSearch-worker-1,5,main] on behalf of Thread[Global Timer,10,main] with reason InterruptionTimerTask [thr
2021-06-01 17:23:03.694 [Global Timer] INFO Interrupter - Interrupt accomplished. Interrupt flag of Thread[ORGraphSearch-worker-1,5,main]: true
2021-06-01 17:23:03.833 [Global Timer] INFO InterruptionTimerTask - Executing interruption task 1024325039 with descriptor "Timeout for timed computation with thread Thread[ORGraphSearch-wo
2021-06-01 17:23:03.834 [Global Timer] INFO Interrupter - Interrupting Thread[ORGraphSearch-worker-1,5,main] on behalf of Thread[Global Timer,10,main] with reason InterruptionTimerTask [thr
2021-06-01 17:23:03.835 [Global Timer] INFO Interrupter - Interrupt accomplished. Interrupt flag of Thread[ORGraphSearch-worker-1,5,main]: true
The question is really whether this can be avoided without spawning external processes.
Thinking more about this, I believe there is really no solution to this problem except spawning a new process. The problem with new processes, though, is that one needs to reserve a good deal of memory for each of them to avoid problems, which can easily become a total waste of resources.
Probably the best solution is to introduce an option that allows running ML-Plan in a separate-process mode when there is an anticipated risk of memory overflows.
More generally, it would be nice to add to the process project of AILibs the ability to execute objects that implement both Callable&lt;T extends Serializable&gt; and Serializable in a separate process with specific resource limits. One could then have a general executor for such operations: it serializes the object to be executed and launches a new JVM with a generic runner that deserializes the object, calls it, and serializes the resulting T into an output file, which the original process can then deserialize again.
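As a rough illustration of that idea, here is a minimal sketch (not part of AILibs; all names are hypothetical) of such an executor: it writes the task to a temp file, re-invokes itself in a fresh JVM with an explicit heap limit, and reads the serialized result back. Error handling, timeouts, and classpath isolation are deliberately omitted.

```java
import java.io.*;
import java.nio.file.*;
import java.util.concurrent.Callable;

// Hypothetical sketch of the proposed executor: serialize a task, run it in a
// fresh JVM under its own resource limits, and deserialize the result.
public class ProcessTaskExecutor {

    // As proposed above, a task must be both Callable and Serializable.
    public interface SerializableTask<T extends Serializable>
            extends Callable<T>, Serializable { }

    // Runs the task in a separate JVM; maxHeap (e.g. "512m") caps its memory.
    public static <T extends Serializable> T runInNewJvm(
            SerializableTask<T> task, String maxHeap) throws Exception {
        Path in = Files.createTempFile("task", ".ser");
        Path out = Files.createTempFile("result", ".ser");
        try {
            try (ObjectOutputStream oos =
                    new ObjectOutputStream(Files.newOutputStream(in))) {
                oos.writeObject(task);
            }
            // Re-invoke this very class in a new JVM; it deserializes the
            // task, calls it, and writes the result to the output file.
            Process p = new ProcessBuilder(
                    System.getProperty("java.home") + "/bin/java",
                    "-Xmx" + maxHeap,
                    "-cp", System.getProperty("java.class.path"),
                    ProcessTaskExecutor.class.getName(),
                    in.toString(), out.toString())
                .inheritIO().start();
            int exit = p.waitFor();
            if (exit != 0) {
                throw new IllegalStateException("worker JVM exited with " + exit);
            }
            try (ObjectInputStream ois =
                    new ObjectInputStream(Files.newInputStream(out))) {
                @SuppressWarnings("unchecked")
                T result = (T) ois.readObject();
                return result;
            }
        } finally {
            Files.deleteIfExists(in);
            Files.deleteIfExists(out);
        }
    }

    // Entry point of the worker JVM: args[0] = task file, args[1] = result file.
    public static void main(String[] args) throws Exception {
        try (ObjectInputStream ois = new ObjectInputStream(
                Files.newInputStream(Paths.get(args[0])))) {
            Callable<?> task = (Callable<?>) ois.readObject();
            Serializable result = (Serializable) task.call();
            try (ObjectOutputStream oos = new ObjectOutputStream(
                    Files.newOutputStream(Paths.get(args[1])))) {
                oos.writeObject(result);
            }
        }
    }
}
```

If the worker JVM dies with an OutOfMemoryError, only that child process is lost; the parent sees a non-zero exit code instead of crashing itself, which is exactly the isolation the discussion above is after.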
Observing this error when running ML-Plan in cluster experiments:
Logs show that this stack trace is immediately followed by an indication of memory overflow:
One dataset where this occurred was the DNA dataset (https://www.openml.org/d/40670) using 24G memory.