Replies: 2 comments 1 reply
-
I hope things are going well for you too. Good luck with your prototyping.
-
Okay, good points.
From your first two points, I think I should likely shift away from having a controller on the VM. I think that was a mis-application on my part. So let's assume that there is one controller on the HPC gateway (fully accessible by all workers) that is colocated with a redis instance.
It looks as though
Thank you for the encouraging feedback, I'll let you know if I find anything. Any R confs in your future?
-
I'm hoping to get confirmation that crew would be a good fit for this use-case with minimal shoe-horning.
I do some real-time strategy-recommendation for car racing that involves 1000s of simultaneous models. The VM where the data and API are located is available via the internet (auth required), and the only access to the HPC is over SSH from the VM, no port-forwarding allowed, no other ports are exposed. The VM needs to get current-state (times, positions, etc) to the HPC every couple of laps, so I've set up a file-based mechanism that provides atomic transfer of data in both directions; this uses a dedicated process on both the VM and on the HPC (I'll call these processes "brokers"). This is the only way I have to get anything other than human-typed information between the VM and the HPC.
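For readers unfamiliar with the idea of an atomic file hand-off, here is a minimal sketch of one common pattern: write to a temporary file in the destination directory, then rename it into place. This is not the mechanism described above, whose details are not given; the function names, the JSON payload, and the file layout are all assumptions made for illustration. It also assumes a POSIX-style filesystem where a rename within one directory is atomic. How quickly the new file becomes visible on other nodes of a shared filesystem is a separate matter that would need checking.

```python
import json
import os
import tempfile
from pathlib import Path


def publish_atomically(payload: dict, dest: Path) -> None:
    """Write `payload` as JSON so a reader sees the old file or the complete new one."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the destination directory so the final rename
    # never crosses a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, dest)  # swap the finished file into place
    except BaseException:
        # Clean up the partial temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def read_if_present(src: Path):
    """Return the decoded JSON payload, or None if nothing has been published yet."""
    try:
        with open(src) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return None
```

A broker on either side could call `publish_atomically` for outgoing state and `read_if_present` for incoming state. Because readers only ever open the final name, they never see a half-written file.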
I'm able to SSH into the HPC (and use tmux) and start up a (personal?) redis instance, then I start the brokers. They poll redis and the filesystem, this part works well (enough). Once all that is started, I start up the HPC batch, and 1000s of processes periodically poll the redis instance for new tasks.
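The worker-side polling described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual worker code: `poll_for_tasks`, `fetch`, and `handle` are invented names, and the queue is abstracted behind a callable so the sketch runs without a Redis server. In a real setup, `fetch` would presumably wrap a non-blocking pop from a Redis list.

```python
import time
from typing import Callable, Optional


def poll_for_tasks(
    fetch: Callable[[], Optional[str]],
    handle: Callable[[str], None],
    idle_sleep: float = 1.0,
    max_idle_polls: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run tasks until `fetch` returns None `max_idle_polls` times in a row.

    Returns the number of tasks handled.
    """
    handled = 0
    idle = 0
    while idle < max_idle_polls:
        task = fetch()
        if task is None:
            # Nothing queued: count the empty poll and wait before retrying.
            idle += 1
            sleep(idle_sleep)
            continue
        idle = 0  # any real task resets the idle counter
        handle(task)
        handled += 1
    return handled
```

Passing `sleep` as a parameter keeps the loop easy to exercise in a test. A long-running worker would likely drop the `max_idle_polls` cut-off or set it very high.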
The legacy system (for the 2023 season) used Python and Pyomo on the HPC, and that is changing for this year, so a complete rewrite of that end of it is required. (The VM is running entirely R, no change.)
Are crew and rrq a good fit for this type of coordination? I think I'd use a single worker type (not sure if a group is required).
(... gpfs, and lag between nodes can be on the order of 10 seconds.) I think this is fine (the controller doesn't reach out to workers, it's all "poll" in a sense), but please confirm that I'm seeing this correctly.
Will, I met you several years ago at an R conference (UCLA, perhaps?) when you were still forming targets. I hope things are going well. I'm really intrigued by the notion of crew and want to wrap my brain around it. (FYI, my first "full-up run" with this system should be this upcoming weekend ... I'm feeling no pressure here ;-)