Managing 100s of Linux and Solaris machines with clusterit

I use numerous tools to perform my SysAdmin duties. One of my favorite tools is clusterit, which is a suite of programs that allows you to run commands across one or more machines in parallel. To begin using the awesomeness that is clusterit, you will first need to download and install the software. This is as easy as:

$ wget

$ tar xfvz clusterit*.gz

$ cd clusterit* && ./configure --prefix=/usr/local/clusterit && make && make install

Once the software is installed, you should have a set of binaries and manual pages in /usr/local/clusterit. To use the various tools in the clusterit/bin directory, you will first need to create one or more cluster files. Each cluster file contains a list of hosts you want to manage as a group, and each host is separated by a newline. Here is an example:

$ cat servers
foo1
foo2
foo3
foo4
foo5

The cluster file listed above contains 5 servers named foo1 through foo5. To tell clusterit you want to use this list of hosts, you will need to export the file's location via the $CLUSTER environment variable:

$ export CLUSTER=/home/matty/clusters/servers
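If you manage several groups of machines, it can be handy to keep one cluster file per group and point $CLUSTER at whichever group you are working on. A minimal sketch (the ~/clusters directory and the hostnames are just examples; adjust them to your environment):

```shell
# Hypothetical layout: one cluster file per group of machines,
# one hostname per line.
mkdir -p ~/clusters
printf '%s\n' foo1 foo2 foo3 foo4 foo5 > ~/clusters/servers
printf '%s\n' db1 db2 > ~/clusters/databases

# Point clusterit at whichever group you want to manage:
export CLUSTER=~/clusters/servers
cat "$CLUSTER"
```

Switching groups is then just a matter of re-exporting $CLUSTER.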

Once you specify the list of hosts you want to use in the $CLUSTER variable, you can start using the various tools. One of the handiest tools is dsh, which allows you to run commands across the hosts in parallel:

$ dsh uptime

foo1  :   2:17pm  up 8 day(s), 23:37,  1 user,  load average: 0.06, 0.06, 0.06
foo2  :   2:17pm  up 8 day(s), 23:56,  0 users,  load average: 0.03, 0.03, 0.02
foo3  :   2:17pm  up 7 day(s), 23:32,  1 user,  load average: 0.27, 2.04, 3.21
foo4  :   2:17pm  up 7 day(s), 23:33,  1 user,  load average: 3.98, 2.07, 0.96
foo5  :   2:17pm  up  5:06,  0 users,  load average: 0.08, 0.09, 0.09

In the example above I ran the uptime command across all the servers listed in the file referenced by the $CLUSTER variable! You can also do more complex activities through dsh:

$ dsh 'if uname -a | grep SunOS >/dev/null; then echo Solaris; fi'
foo1 : Solaris
foo2 : Solaris
foo3 : Solaris
foo4 : Solaris
foo5 : Solaris

This example uses dsh to run uname across a batch of servers, and prints the string Solaris if the keyword "SunOS" is found in the uname output. Clusterit also comes with a distributed scp command called pcp, which you can use to copy a file to a number of hosts in parallel:

$ pcp /etc/services /tmp

services                   100%  616KB 616.2KB/s   00:00    
services                   100%  616KB 616.2KB/s   00:00    
services                   100%  616KB 616.2KB/s   00:00    
services                   100%  616KB 616.2KB/s   00:00    
services                   100%  616KB 616.2KB/s   00:00    

$ openssl md5 /etc/services
MD5(/etc/services)= 14801984e8caa4ea3efb44358de3bb91

$ dsh openssl md5 /tmp/services
foo1 : MD5(/tmp/services)= 14801984e8caa4ea3efb44358de3bb91
foo2 : MD5(/tmp/services)= 14801984e8caa4ea3efb44358de3bb91
foo3 : MD5(/tmp/services)= 14801984e8caa4ea3efb44358de3bb91
foo4 : MD5(/tmp/services)= 14801984e8caa4ea3efb44358de3bb91
foo5 : MD5(/tmp/services)= 14801984e8caa4ea3efb44358de3bb91

In this example I am using pcp to copy the file /etc/services to each host, and then using dsh to create a checksum of the file that was copied. Clusterit also comes with a distributed top (dtop) and a distributed df (pdf), as well as a number of job control tools! If you are currently performing management operations with the old for-loop stanza:

for host in `cat hosts`; do
    ssh $host 'run_some_command'
done

You really owe it to yourself to set up clusterit. You will be glad you did!
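Each pass through that loop contacts one host at a time, while dsh fans the command out across hosts in parallel. Here is a runnable sketch of the serial pattern the loop implements, with a local stand-in for ssh so it needs no remote machines (the foo hostnames and the run_remote helper are illustrative only):

```shell
# run_remote is a local stand-in for `ssh $host uptime` so this sketch
# runs without any remote machines.
run_remote() {
    echo "up 8 day(s), load average: 0.06"
}

# Serial: each host is contacted only after the previous one finishes.
# dsh-style tools label output with the hostname, mimicked here with sed.
for host in foo1 foo2 foo3 foo4 foo5; do
    run_remote "$host" | sed "s/^/$host : /"
done
```

With five hosts the serial cost is tolerable; with hundreds, the parallel fan-out dsh provides is the whole point.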

4 thoughts on "Managing 100s of Linux and Solaris machines with clusterit"

  1. This is good, but is there any way to use clusterit for scp'ing files in parallel?

    Or any other tool that would do it?


  2. Found this very useful, thanks.

    I do have a question though. I use sudo quite extensively and often need to enter my password. dsh doesn't seem to allow interactive commands; any idea how to get around that without setting ALL: NOPASSWD: ALL in sudoers?

  3. Hey mharris45,

    Take a look at the sudo "-A" and "-S" options. I've used both of those successfully in the past.

    - Ryan
