d1make is a simple make-based distributed compilation tool: the first-level sub-makefiles of a project are built on remote hosts.
Processes are run like this:

    d1make.py ------------------ssh------------------ d1make-server.py
        \                                            /       \
       make                                         /        make
         \                                         /            \
    d1make-client.py                           /              compile
         |                                   /
         \---------ssh--------- d1make-client.py --remote

An explained sequence of events
- d1make.py opens up sessions, configures PATH and other tool settings, and starts d1make-server.py. This is repeated several times, once for each host to be used.
- When a d1make-server.py process starts, it opens a FIFO to receive requests and reports that FIFO back, together with load information.
- d1make.py reads the makefile and replaces the locations where make is called with calls to d1make-client.py (see the sketch below).
- d1make.py starts make on the modified makefile.
- When d1make-client.py is started by make, it contacts d1make.py using a FIFO and gets information on which host to contact and where the d1make-server.py FIFO on that host is located.
- d1make-client.py starts an ssh session to that host, running d1make-client.py --remote on the remote side.
- d1make-client.py --remote orders d1make-server.py to start the make command and to deliver stderr, stdout and the exit code using FIFOs.
- The make executes and terminates. The termination is reported back to d1make-client.py --remote by d1make-server.py closing the FIFOs.
- d1make-client.py --remote exits with the exit code from make, which in turn causes the ssh session to exit and d1make-client.py to exit as a consequence.
- make can then continue with the next line in the recipe, or with the next possible goal, now that the job is finished.
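As an illustration of the rewriting step, this is roughly what happens to one recipe line in the top makefile. The subdirectory name and goal are made up, and the exact form of the substituted call is an assumption for illustration, not the tool's literal output:

```sh
# Before the rewrite: the top-level recipe runs the sub-make locally.
make -C lib1 all

# After d1make.py's rewrite (illustrative form): the client asks d1make.py over a
# FIFO which host to use, reuses the already-open ssh connection to that host, and
# the sub-make is executed there by d1make-server.py.
d1make-client.py make -C lib1 all
```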
d1make assumes the following:
- That you have a fairly large project to compile (using make).
- That you have a set of hosts available to use for compilation, roughly equal in processing capacity, memory, disc access speed and so on.
- That the hosts available to build on all have the same user ids, file systems and tools (including the d1make tool) installed.
- That you can contact the hosts using ssh (a quick check is sketched after this list).
- That the compile rules of the project are structured as one makefile at the top dispatching to many makefiles just one level down.
- That most of the compilation is done one level down, with very little arbitration and few rules in the top makefile.
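A quick way to check the ssh and tool assumptions above is a loop like the following sketch. The host names are placeholders for your own build hosts:

```sh
# Verify passwordless ssh and that make and the d1make tool are on the remote PATH.
for h in buildhost1 buildhost2 buildhost3; do
    ssh -o BatchMode=yes "$h" \
        'echo "$(hostname): make=$(command -v make) d1make-server.py=$(command -v d1make-server.py)"' \
        || echo "WARNING: cannot reach $h without a password"
done
```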
To use d1make:
- You set up the list of hosts to contact in the environment variable D1MAKE_HOSTS. Make sure that you can connect to all of them using ssh without a password.
- You set up the configuration of the tools used in the variable D1MAKE_SERVER_SETUP. The contents of this variable are inserted just before d1make-server.py is started, so if it is a command (as opposed to just a line of environment variable settings), end it with a semicolon (;).
- You set up the environment variable D1MAKE_CLIENT_MAKEARGS with parameters passed to make on the client side (typically -j and/or -l) to control how many simultaneous compilations each of the makes shall run. You can probably find the optimal values by running a series of test compilations with different settings on one of the hosts.
- You start d1make.py with a goal and the -j (--jobs) flag specifying the number of d1make-client.py jobs to run. If the hosts in D1MAKE_HOSTS are two-CPU hosts, a job count between 2 * #hosts and 4 * #hosts is a reasonable choice. A complete example follows this list.
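Putting the steps above together, a session could look roughly like the sketch below. The host names, toolchain script, goal and job counts are examples, and the exact list format of D1MAKE_HOSTS (space separated here) and the argument order of d1make.py are assumptions:

```sh
# Three two-CPU build hosts, all reachable by passwordless ssh.
export D1MAKE_HOSTS="buildhost1 buildhost2 buildhost3"

# Inserted just before d1make-server.py is started on each host; since this is a
# command rather than plain environment variable settings, it ends with a semicolon.
export D1MAKE_SERVER_SETUP="source /opt/toolchain/setup.sh;"

# Each remote make runs up to 4 jobs and backs off above a load average of 6.
export D1MAKE_CLIENT_MAKEARGS="-j4 -l6"

# With 3 two-CPU hosts, pick between 2*3 = 6 and 4*3 = 12 client jobs.
d1make.py --jobs 12 all
```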
- Jobs are distributed to the server with the lowest load, based on the last reported load on each of the hosts and the number of jobs currently running on each of them. This means that if some other user runs a job on one of the hosts, that host becomes less likely to be used until the other user's jobs are complete and their contribution to the load starts to drop. (A rough manual approximation of this host choice is sketched below.)
- The ssh connection between d1make-client.py and d1make-client.py --remote uses the same ssh connection as the one between d1make.py and d1make-server.py, via the ssh ctl_path mechanism (sketched below). If your ssh is not compatible with OpenSSH in this respect, it will not work.
- OpenSSH 7.2 (at least) seems to limit the number of sessions sharing the same ctl_path to 9. This is a problem if the top make is run with -j > 9 and the jobs all happen to start against the same ssh session.
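The load-based host choice described in the notes above is handled inside d1make.py, which also counts how many jobs it has already handed to each host. As a rough manual approximation (assuming Linux build hosts, so /proc/loadavg exists), the following sketch lists the hosts by their current 1-minute load average, lowest first:

```sh
# Print "load host" for every build host, sorted so the least loaded comes first.
for h in $D1MAKE_HOSTS; do
    load=$(ssh -o BatchMode=yes "$h" cat /proc/loadavg | cut -d' ' -f1)
    printf '%s %s\n' "$load" "$h"
done | sort -n
```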
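The connection sharing mentioned in the notes above is ordinary OpenSSH connection multiplexing. d1make sets this up by itself; the sketch below only reproduces the mechanism by hand so the ctl_path idea is easier to picture (the socket path and host name are examples). The 9-session ceiling most likely corresponds to the MaxSessions limit in sshd_config on the remote side (default 10), so raising it there is one possible workaround, but that is an assumption about your hosts rather than something d1make configures.

```sh
# Open a master connection that owns the control socket (-M) at a chosen path (-S).
ssh -f -N -M -S /tmp/d1make-ctl.%h buildhost1

# Later sessions reuse the same TCP connection through the socket; this is how the
# d1make-client.py --remote sessions ride on the d1make.py <-> d1make-server.py link.
ssh -S /tmp/d1make-ctl.%h buildhost1 uptime

# Check the shared connection, and shut it down when done.
ssh -S /tmp/d1make-ctl.%h -O check buildhost1
ssh -S /tmp/d1make-ctl.%h -O exit buildhost1
```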