cluster-ssh-tools - a collection of cluster ssh tools
I often work on clusters of machines where I need to do the same operation to many hosts at the same time. I started out with regular shell loops:
for i in `seq -f '%02g' 1 20`
do
ssh root@hostname$i.dc.domain.com reboot
done
This worked fine for about half a day. Then I looked at DSH and similar tools that were available and easy to find in 2007. Over time I built up cl-run.pl and a couple of copies of it, cl-rsync.pl and cl-psgrep.pl. Before long I split all the common bits out into a module and made it a bit more generic. After that, the rest of the tools were pretty trivial to throw together as I needed them.
That's all to say, these tools work well for me, but they are not good examples of Perl coding, nor are they good for everybody.
I've had good luck using most of these tools on 300+ hosts at a time from a bastion host with 8G of RAM and plenty of available CPU cycles. Currently I run these all the time from a smallish Linux VM (2G RAM, 2 vCPUs), my workstation, and my MacBook Air, and I haven't ever had a problem with performance.
cl-run.pl # run a command or script
cl-rsync.pl # parallel rsync
cl-sendfile.pl # push a file out
cl-gatherfile.pl # pull a file in (sorted by hostname)
cl-ping.pl # ping hosts
cl-killall.pl # kill a process on hosts with a regular expression
cl-psgrep.pl # look for processes across the cluster
cl-netstat.pl # a distributed network I/O display
nssh.rb # ssh wrapper that sets screen title & other things
$> cat > ~/.dsh/machines.nosqldb-dev <<EOF
nosqldb-dev12.tobert.org
nosqldb-dev11.tobert.org
nosqldb-dev10.tobert.org
nosqldb-dev9.tobert.org
nosqldb-dev8.tobert.org
nosqldb-dev7.tobert.org
nosqldb-dev6.tobert.org
nosqldb-dev5.tobert.org
nosqldb-dev4.tobert.org
nosqldb-dev3.tobert.org
nosqldb-dev2.tobert.org
nosqldb-dev1.tobert.org
EOF
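A list like the one above is just one hostname per line, so it can also be generated instead of typed. The following is a minimal sketch; `gen_hosts` is a hypothetical helper, not part of this repo, and it assumes the hosts are numbered consecutively from 1.

```shell
# Print hostnames of the form <prefix><N>.<domain>, counting down from
# <count> to 1, which is the same order as the hand-written list.
# gen_hosts is a hypothetical helper for illustration only.
gen_hosts() {
    prefix=$1
    domain=$2
    i=$3
    while [ "$i" -ge 1 ]; do
        printf '%s%d.%s\n' "$prefix" "$i" "$domain"
        i=$((i - 1))
    done
}

# Example (writes the same content as the heredoc):
#   mkdir -p ~/.dsh
#   gen_hosts nosqldb-dev tobert.org 12 > ~/.dsh/machines.nosqldb-dev
```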
# set a default list to save typing (I generally do not use this)
$> ln -sf ~/.dsh/machines.nosqldb-dev ~/.dsh/machines.list
# set up a user account (cheesy example, assumes user@localhost has root@remotehost keys set up)
$> cl-run.pl --root -c "useradd -m tobert"
$> cl-rsync.pl --root -l ~/.ssh -r /home/tobert
$> cl-run.pl --root -c "chown -R tobert /home/tobert"
$> cl-run.pl --root -c "(grep -q '^tobert' /etc/sudoers) || echo 'tobert ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers"
$> cl-run.pl --list nosqldb-dev -c "uname -a"
nosqldb-dev12.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev1.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev11.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev5.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev2.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev4.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #36-Ubuntu SMP Fri Jul 8 18:12:30 UTC 2011 x86_64 GNU/Linux
nosqldb-dev6.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #36-Ubuntu SMP Fri Jul 8 18:12:30 UTC 2011 x86_64 GNU/Linux
nosqldb-dev10.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev3.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #36-Ubuntu SMP Fri Jul 8 18:12:30 UTC 2011 x86_64 GNU/Linux
nosqldb-dev8.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev9.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
nosqldb-dev7.tobert.org: Linux ip-xx-xx-xx-xx 2.6.32-316-ec2 #31-Ubuntu SMP Wed May 18 14:10:36 UTC 2011 x86_64 GNU/Linux
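With a dozen or more hosts it is easier to spot the odd ones out when identical output is grouped. This is a sketch only: `summarize_output` is a hypothetical helper, and it assumes the `hostname: output` line format shown above arrives on stdout when piped.

```shell
# Read "hostname: output" lines on stdin, drop the hostname prefix,
# and count how many hosts produced each distinct output line,
# most common first. summarize_output is a hypothetical helper.
summarize_output() {
    sed 's/^[^:]*: //' | sort | uniq -c | sort -rn
}

# Example:
#   cl-run.pl --list nosqldb-dev -c "uname -rv" | summarize_output
```

Note that `uname -a` includes each host's own node name, so on real hosts every line would be unique; `uname -rv` prints only the kernel release and version, which groups cleanly.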
$> cl-run.pl -c "sudo nohup dd if=/dev/zero of=/dev/null bs=1M &"
$> cl-netstat.pl --list nosqldb-dev --device md3
hostname: eth0_total eth0_recv eth0_send read_iops write_iops 1min 5min 15min
--------------------------------------------------------------------------------------------------------------
nosqldb-dev12: 864 121 743 0/s 0/s 0.00 0.00 0.00
nosqldb-dev11: 864 121 743 0/s 0/s 0.00 0.00 0.00
nosqldb-dev10: 846 121 725 0/s 0/s 0.00 0.00 0.00
nosqldb-dev9: 840 128 712 0/s 111,608/s 1.00 1.01 1.00
nosqldb-dev8: 827 117 710 0/s 120,828/s 1.04 1.03 1.00
nosqldb-dev7: 999 175 824 0/s 93,479/s 1.02 1.03 1.00
nosqldb-dev6: 1,674 201 1,473 0/s 0/s 0.00 0.00 0.00
nosqldb-dev5: 947 136 811 0/s 0/s 0.00 0.00 0.00
nosqldb-dev4: 1,674 201 1,473 0/s 0/s 0.00 0.00 0.00
nosqldb-dev3: 961 136 825 0/s 0/s 0.00 0.00 0.00
nosqldb-dev2: 961 136 825 0/s 0/s 0.00 0.00 0.00
nosqldb-dev1: 967 136 831 0/s 0/s 0.00 0.00 0.00
Total: 12,424 Recv: 1,729 Send: 10,695 (0 mbit/s) | 0 read/s 325,915 write/s
Average: 12,828 Recv: 151 Send: 917 (0 mbit/s) | 0 read/s 129,660 write/s
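If a snapshot of this table is saved to a file, the busy hosts can be picked out with awk. This is a sketch under assumptions: `busy_writers` is a hypothetical helper, the columns are whitespace separated as shown above, and write_iops is the sixth field.

```shell
# Print "host iops" for every host row whose write_iops is non-zero.
# Only host rows have a sixth field ending in "/s", so the header,
# separator, Total and Average lines are skipped.
# busy_writers is a hypothetical helper for illustration only.
busy_writers() {
    awk '$6 ~ /\/s$/ {
        iops = $6
        gsub(/,|\/s/, "", iops)        # "111,608/s" becomes "111608"
        if (iops + 0 > 0) {
            host = $1
            sub(/:$/, "", host)        # strip the trailing colon
            print host, iops
        }
    }'
}

# Example:
#   busy_writers < netstat-snapshot.txt
```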
nssh.rb does a few nice things around sshing inside GNU screen. In the original version, all it did was set the screen title automatically by grabbing the hostname off the args. Now it does quite a bit more, including letting you create a bunch of new named sessions in screen without a lot of typing.
It also tries to flip CNAMEs to A names automatically while still setting your screen title to the CNAME. This can be pretty handy when working with lots of EC2 hosts where you may not necessarily have set up all the CNAMEs in ~/.ssh/config.
By default, GNU screen has MAXWIN at 40. I almost always run a rebuilt version from the git head with MAXWIN 512.
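The title-setting idea mentioned above can be sketched in a few lines of shell. This is not how nssh.rb is implemented; `title_from_args` and `set_screen_title` are hypothetical helpers, and using the short hostname as the title is an assumption. The escape sequence itself (ESC k, the title, ESC backslash) is the one GNU screen documents for naming a window.

```shell
# Derive a window title from ssh-style arguments: take the last
# argument, drop any "user@" prefix, and keep the short hostname.
# Both functions are hypothetical helpers, not part of nssh.rb.
title_from_args() {
    target=
    for arg in "$@"; do
        target=$arg
    done
    target=${target#*@}
    printf '%s\n' "${target%%.*}"
}

# Emit the GNU screen title escape: ESC k <title> ESC \
set_screen_title() {
    printf '\033k%s\033\\' "$1"
}

# Example:
#   set_screen_title "$(title_from_args root@hostname.tobert.org)"
#   ssh root@hostname.tobert.org
```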
$> nssh.rb hostname.tobert.org
# in screen, ctrl-a c, then
$> nssh.rb reset
$> nssh.rb next --list nosqldb-dev
# ctrl-a c
$> nssh.rb next --list nosqldb-dev
# ctrl-a c
# etc. ...
And finally, the latest incarnation of my screenrc generation script is included. At the moment, I hard-code a list of clusters I want to connect to at screen startup so I can do something like the following after reboots:
$> ssh-add
$> generate-screen-config.rb
$> screen -c ~/.screenrc-main -S main -T xterm-color -U
This repo includes .screenrc-main based on what I use all the time.
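To give a feel for what a generated config might contain, here is a rough sketch that turns a machine list into screenrc `screen` lines, one titled window per host. It is not the output of generate-screen-config.rb; `emit_screenrc` is a hypothetical helper, and opening each window with nssh.rb is an assumption.

```shell
# Emit one screenrc line per host: "screen -t <title> <n> nssh.rb <host>".
# $1 = cluster name, $2 = optional path to the list file
# (defaults to ~/.dsh/machines.<name>).
# emit_screenrc is a hypothetical helper for illustration only.
emit_screenrc() {
    list=${2:-$HOME/.dsh/machines.$1}
    n=1
    while IFS= read -r host; do
        [ -n "$host" ] || continue     # skip blank lines
        printf 'screen -t %s %d nssh.rb %s\n' "${host%%.*}" "$n" "$host"
        n=$((n + 1))
    done < "$list"
}

# Example:
#   emit_screenrc nosqldb-dev >> ~/.screenrc-main
```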
Most of the tools need only base Perl plus Tie::IxHash. cl-netstat.pl requires Net::SSH2 built against a fairly modern libssh2. The Ruby utilities are probably fine with a base system Ruby 1.8 or 1.9.
SSH agent support requires Net::SSH2 >= 0.40.
I usually symlink all these files into ~/bin, which my ~/.profile sets to be in my PATH.
$> mkdir ~/bin ~/src
$> cd ~/src
$> git clone https://github.com/tobert/perl-ssh-tools.git
$> ln -s ~/src/perl-ssh-tools/* ~/bin/
$> export PATH=~/bin:$PATH
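The export above only lasts for the current shell. One possible snippet for ~/.profile, shown as a sketch, prepends ~/bin only when it is not already in PATH, so sourcing the file repeatedly does not keep growing the variable.

```shell
# Prepend ~/bin to PATH unless it is already present.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                 # already there, nothing to do
    *) PATH="$HOME/bin:$PATH" ;;
esac
export PATH
```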
You'll need libssh2 and the Perl modules. If you're using MacPorts:
$> sudo port install perl
$> sudo port install libssh2
$> sudo /opt/local/bin/perl -MCPAN -e 'install Net::SSH2'
$> sudo /opt/local/bin/perl -MCPAN -e 'install Tie::IxHash'
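After installing, it can be worth checking that the modules load and that Net::SSH2 meets the 0.40 minimum needed for SSH agent support. This is a small sketch; `have_perl_module` is a hypothetical helper and `PERL` is a hypothetical override for pointing at a non-default interpreter such as /opt/local/bin/perl.

```shell
# Succeed if the given Perl module loads and is at least the given
# version. Relies on Perl's built-in "use Module VERSION" check.
# have_perl_module and the PERL variable are hypothetical.
have_perl_module() {
    "${PERL:-perl}" -e "use $1 $2;" >/dev/null 2>&1
}

# Example:
#   have_perl_module Net::SSH2 0.40   || echo "Net::SSH2 >= 0.40 not found" >&2
#   have_perl_module Tie::IxHash 0    || echo "Tie::IxHash not found" >&2
```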
The docs for common options are in DshPerlHostLoop.pm.
perldoc ~/bin/DshPerlHostLoop.pm
All of the utilities have their own POD and use Pod::Usage.
cl-run.pl
cl-rsync.pl
cl-sendfile.pl
cl-gatherfile.pl
cl-ping.pl
cl-killall.pl
cl-psgrep.pl
cl-netstat.pl
Al Tobey <tobert@gmail.com>
This software is copyright (c) 2007-2013 by Al Tobey.
This is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0. (Note that, unlike the Artistic License 1.0, version 2.0 is GPL compatible by itself, hence there is no benefit to having an Artistic 2.0 / GPL disjunction.) See the file LICENSE for details.