Submitting tasks #300
Comments
Hi Simon, For submitting a large number of tasks, I'll look into increasing the throughput of task submission similar to the other Azure Batch SDKs. Thanks, |
Hi @brnleehng, thanks for the reply. It sounds like it could be a great enhancement to the package. |
Hi Brian, I was just wondering how complex this is to implement? Is it possible to get a rough ETA on when this could be expected to go live, assuming it's possible to increase the throughput in the first place? In addition, is this a back-end update that will just start working after submitting a job, rather than needing a package update within doAzureParallel? Many thanks, |
We should be able to get the task factory to call AddTaskCollection and handle its complexity by the end of October. That will give about a 100x improvement. |
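For a sense of where a figure like 100x could come from, here is a minimal sketch of the batching idea, not the doAzureParallel implementation. It assumes that a bulk "add task collection" request accepts at most 100 tasks, and that submission time is dominated by the number of HTTP round trips. The names `chunk`, `request_count`, and `MAX_TASKS_PER_REQUEST` are hypothetical and exist only for this illustration; no Azure SDK calls are made.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Assumption: one bulk request can carry at most 100 tasks.
MAX_TASKS_PER_REQUEST = 100


def chunk(tasks: Sequence[T], size: int = MAX_TASKS_PER_REQUEST) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` tasks."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(tasks), size):
        yield list(tasks[start:start + size])


def request_count(n_tasks: int, size: int = MAX_TASKS_PER_REQUEST) -> int:
    """Number of requests needed to submit n_tasks in batches of `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return -(-n_tasks // size)  # ceiling division


if __name__ == "__main__":
    # Hypothetical task IDs; a real client would send each batch in one request.
    task_ids = [f"task-{i}" for i in range(255_000)]
    one_by_one = request_count(len(task_ids), size=1)
    batched = sum(1 for _ in chunk(task_ids))
    print(f"one request per task: {one_by_one}, batched: {batched}")
```

Under those assumptions, submitting one task per request needs as many round trips as there are tasks, while batching needs about one hundredth of that. Real gains would also depend on retries, throttling, and payload size, none of which are modelled here.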
Wow, 100x is a serious improvement. I guess this leaves me to figure out whether waiting until the end of October and then running all my tasks will be quicker than letting everything chug along slowly as things currently stand. Currently about 75% of the time each iteration takes to run can be explained by the time taken to submit tasks, but if this gets improved 100x, there will essentially be no waiting. Decisions... |
Yeah it’s a bit mad that this throughput wasn’t included from day 1, really. I’m spending thousands of dollars more than I should be because, cumulatively, submitting tasks is taking hundreds and hundreds of hours.
From: Pullarg <notifications@github.com>
Sent: 27 September 2018 00:26
Subject: Re: [Azure/doAzureParallel] Submitting tasks (#300)
This looks great. It took about 10 hours for us to submit 255,000 jobs; once it had the jobs, it blitzed through them.
|
Hello, any news on the status of this feature? Thanks! |
It would be excellent to know. I'm running 800k jobs; 12 hours in, it is still submitting. At this point, spinning up a 64-core machine and using doParallel would be better. |
I reckon half of the $20,000 I've spent over the last 6 weeks has been on waiting for tasks to submit... |
Hello, is there a way to improve the speed at which tasks are submitted across nodes in a pool?
I have a cluster with 1024 cores but it takes about 03:30 to submit all tasks. There's very little data from my local R session which needs to be uploaded, so I don't think it's a bandwidth issue. I'm finding that 75% of the time to complete each iteration of my model is accounted for by submitting the tasks (merging is much, much faster) so any performance increases here will make a big difference to my workflow.
Additionally, will the speed at which tasks are submitted increase/decrease depending on where you're based and where your cluster is located? i.e. will it take longer for jobs to submit in 'South Central US' if you're based somewhere in Europe?
Thanks.