This repository has been archived by the owner on Feb 15, 2022. It is now read-only.
I am working on a backtesting platform for Zenbot (have posted here a couple of times before).
Since I made those posts I have decided it is not really suitable for open-sourcing / self-hosting, as there are just too many moving parts and several elements need expert configuration. I set it all up on AWS before Christmas and it was working pretty well ... but AWS is expensive, and no fun :) so I have now bought a server to host it at home. My goal is to build a small community around it.
So anyway, it's working pretty well and I am able to run around 100-200 simulations in parallel. For example, I was able to run a batch of 500 simulations against BTC/USD, covering 5 different strategies over a 2-month trading period. That took almost two days to run, and I would like to optimise it down to a few hours.
So - lots of stuff working! However, the next technical hurdle is eliminating bottlenecks to further optimise performance. I believe my current bottleneck is MongoDB. Using mongotop as a rough measure, I can see that once I get to around 15-20 simultaneous simulations, Mongo starts to struggle and the speed at which simulations run drops drastically. Mongo has plenty of cache space available, so that is not the problem. I think Zenbot is doing a very large number of reads on the trades collection.
If I have got this right, I see a few possible options:
1. Optimise the queries Zenbot is running (i.e. run fewer queries that each fetch a larger quantity of data).
2. Rewrite Zenbot to use Redis instead of Mongo.
3. Pay a lot of money for MongoDB Enterprise, which includes an in-memory storage engine (not gonna happen).
4. Rewrite Zenbot so that, instead of running many instances in parallel, one instance processes multiple strategies in its main loop (I haven't looked deeply into this, and it is a totally different approach from the route I have gone down).
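For what it's worth, the first option (fewer, larger queries) mostly comes down to batching the trade reads into large time windows instead of many small lookups. Here is a minimal sketch of the windowing logic; the `trades` collection and `time` field in the usage comment are guesses at Zenbot's schema, not confirmed names:

```python
from datetime import datetime, timedelta

def chunk_ranges(start, end, chunk):
    """Split [start, end) into consecutive windows of length `chunk`,
    so each window becomes one large query instead of many small ones."""
    ranges = []
    cursor = start
    while cursor < end:
        upper = min(cursor + chunk, end)
        ranges.append((cursor, upper))
        cursor = upper
    return ranges

# Hypothetical pymongo usage (collection and field names are assumptions):
# for lo, hi in chunk_ranges(t0, t1, timedelta(hours=6)):
#     trades = list(db.trades.find({"time": {"$gte": lo, "$lt": hi}}).sort("time", 1))
```

Each simulation would then iterate trades in memory within a window, which trades many round-trips for a few big sequential reads Mongo handles much better.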
None of these options seems particularly attractive, as they are all high effort with no guarantee of success.
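That said, the last option (one instance, many strategies) is cheap to prototype before committing: one pass over the trade stream fans each trade out to every strategy, so the trades collection is read once rather than once per simulation. A toy sketch, where the moving-average strategy and its signals are invented for illustration and bear no relation to Zenbot's real strategy API:

```python
class SmaStrategy:
    """Toy stand-in for a strategy: emits a signal against a simple
    moving average (the logic here is illustrative only)."""
    def __init__(self, period):
        self.period = period
        self.prices = []
        self.signals = []

    def on_trade(self, price):
        self.prices.append(price)
        if len(self.prices) >= self.period:
            sma = sum(self.prices[-self.period:]) / self.period
            self.signals.append("buy" if price < sma else "sell")

def run_batch(trades, strategies):
    # Single pass over the trade stream; every strategy sees each trade,
    # so N simulations cost one read of the data instead of N reads.
    for price in trades:
        for s in strategies:
            s.on_trade(price)
    return strategies
```

The appeal is that the I/O cost becomes independent of how many strategy/parameter combinations run per batch, which is exactly where the 15-20 simulation wall seems to come from.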
Anyway, just wondering if anybody has any comments or advice on this issue. Thanks!
Edit: I think I am CPU bound after all.
Am cross-posting this from Reddit, as there seems to be more activity here than there. https://www.reddit.com/r/zenbot/comments/s8sza7/running_many_simulations_in_parallel_causing/