
10.1 C3

Cluster Command and Control (C3) is a set of about a dozen command-line utilities used to execute common management tasks. These commands were designed to provide a look and feel similar to that of issuing commands on a single machine.[1] The commands are secure and scale reliably. Each command is actually a Python script. C3 was developed at Oak Ridge National Laboratory and is freely available.

[1] A Python/TK GUI known as C2G has also been developed.

10.1.1 Installing C3

There are two ways C3 can be installed. With the basic install, you'll do a full C3 installation on a single machine, typically the head node, and issue commands on that machine. With large clusters, this can be inefficient because that single machine must communicate with each of the other machines in the cluster. The alternative approach is referred to as a scalable installation. With this method, C3 is installed on all the machines and the configuration is changed so that a tree structure is used to distribute commands. That is, commands fan out through intermediate machines and are relayed across the cluster more efficiently. Both installations begin the same way; for a scalable install, you simply repeat the installation on the other machines and alter the configuration file. This description will stick to the simple install. The C3 distribution includes a file README.scale that describes the scalable installation.

Since the C3 tools are scripts, there is very little to do to install them. However, since they rely on several other common packages and services, you will need to be sure that all the prerequisites are met. On most systems this won't be a problem; everything you'll need will already be in place.

Before you can install C3, make sure that rsync, Perl, SSH, and Python are installed on your system and available. Name resolution, either through DNS or a host file, must be available as well. Additionally, if you want to use the C3 command pushimage, SystemImager must be installed. Installing SystemImager is discussed in Chapter 8.
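If you want to confirm the prerequisites before installing, a quick check such as the following will do (this assumes an RPM-based system; the package names may differ on your distribution):

[root@fanny root]# which rsync perl python ssh
[root@fanny root]# rpm -q rsync perl python openssh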

Once you have met the prerequisites, you can download, unpack, and install C3. To download it, go to http://www.csm.ornl.gov/torc/C3/ and follow the link to the download page. You can download sources or an RPM package. In this example, sources are used. If you install from RPMs, install the full install RPM and profile RPM on servers and the client RPM on clients. Note that with the simple installation you only need to install C3 on the head node of your cluster. However, you will need SSH and the like on every node.

Once you have unpacked the software and read the README files, you can run the install script Install-c3.

[root@fanny src]# gunzip c3-4.0.1.tar.gz
[root@fanny src]# tar -xvf c3-4.0.1.tar
[root@fanny src]# cd c3-4.0.1
[root@fanny c3-4.0.1]# ./Install-c3

The install script will copy the scripts to /opt/c3-4 (for Version 4 at least), set paths, and install man pages. There is nothing to compile.

The next step is creating a configuration file. The default file is /etc/c3.conf. However, you can use other configuration files if you wish by explicitly referencing them in C3 commands using the -f option with the file name.
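For example, to run a command against an alternate configuration file (the file name /etc/c3_test.conf is only a placeholder):

[root@fanny root]# cexec -f /etc/c3_test.conf date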

Here is a very simple configuration file:

cluster local {
        fanny.wofford.int
        george.wofford.int
        hector.wofford.int
        ida.wofford.int
        james.wofford.int
}

This example shows a configuration for a single cluster. In fact, the configuration file can contain information on multiple clusters. Each cluster has its own cluster description block, which begins with the identifier cluster followed by a name for the cluster. The name can be used in C3 commands to identify a specific cluster when you have multiple cluster description blocks. Next, the machines within the cluster are listed within curly braces. The first machine listed is the head node. To remove ambiguity, the head node entry can consist of two parts separated by a colon: the head node's external interface to the left of the colon and the head node's internal interface to the right of the colon. (Since fanny has a single interface, that format was not appropriate for this example.) The head node is followed by the compute nodes. In this example, the compute nodes are listed one per line. It is also possible to specify a range. For example, node[1-64] would specify 64 machines with the names node1, node2, and so on. The cluster definition block is closed with another curly brace. Of course, all machine names must resolve to IP addresses, typically via the /etc/hosts file. (The commands cname and cnum, described later in this section, can be helpful in sorting out the details surrounding node indices.)
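For instance, a description block for a cluster whose head node has separate external and internal interfaces, and whose compute nodes follow a naming pattern, might look like the following sketch (the machine names here are hypothetical):

cluster example {
        headnode.external.net:headnode.internal.net
        node[1-64]
}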

Within the compute node list, you can also use the qualifiers exclude and dead. exclude is applied to range qualifiers and immediately follows a range specification. dead applies to individual machines and precedes the machine name. For example,

node[1-64]
exclude 60
alice
dead bob
carol

In this list, node60 and bob are designated as unavailable. Starting with Version 3 of C3, it is possible to use ranges in C3 commands to restrict actions to just those machines within the range. The order of the machines in the configuration file determines their numerical position within the range. In the example, the 67 machines defined have list positions 0 through 66. If you deleted bob from the file instead of marking it as dead, carol's position would change from 66 to 65, which could cause confusion. By using exclude and dead, you effectively remove a machine from a cluster without renumbering the remaining machines. dead can also be used with a dummy machine to switch from 0-indexing to 1-indexing. For example, just add the following line to the beginning of the machine list:

dead place_holder

Once this is done, every real machine in the list shifts up one position, so numbering effectively starts at 1. For more details on the configuration file, see the c3.conf(5) and c3-scale(5) manpages.
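Putting these pieces together, a description block that starts compute-node numbering at 1 and marks node60 and bob as unavailable might look like the following sketch (it reuses fanny as the head node and the machine names from the examples above):

cluster local {
        fanny.wofford.int
        dead place_holder
        node[1-64]
        exclude 60
        alice
        dead bob
        carol
}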

Once you have created your configuration file, there is one last thing you need to do before C3 is ready to go. For the command ckill to work properly, the Perl script ckillnode must be installed on each individual machine. Fortunately, the rest of C3 is installed and functional, so you can use it to complete the installation. Just issue these commands:

[root@fanny root]# cexec mkdir /opt/c3-4
************************* local *************************
--------- george.wofford.int---------
...
[root@fanny root]# cpush /opt/c3-4/ckillnode
building file list ... building file list ... building file list ... building
file list ... done
...

The first command makes the directory /opt/c3-4 on each machine in your cluster and the second copies the file ckillnode to each machine. You should see a fair amount of output with each command. If you are starting SSH manually, you'll need to start it before you try this.
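If you rely on an SSH agent for password-free access, a typical way to start one and load your key is shown below (this assumes a standard OpenSSH setup; your environment may already handle this at login):

[root@fanny root]# eval `ssh-agent`
[root@fanny root]# ssh-add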

10.1.2 Using C3 Commands

Here is a brief description of C3's more useful utilities.

10.1.2.1 cexec

This command executes a command string on each node in a cluster. For example,

[root@fanny root]# cexec mkdir tmp
************************* local *************************
--------- george.wofford.int---------
--------- hector.wofford.int---------
--------- ida.wofford.int---------
--------- james.wofford.int---------

The directory tmp has been created on each machine in the local cluster. cexec has a serial version, cexecs, that can be used for testing. With the serial version, the command is executed to completion on each machine before it is executed on the next machine. If there is any question about where the parts of a command will be interpreted, you should enclose the command in double quotes. Consider:

[root@fanny root]# cexec "ps | grep a.out"

...

The quotes are needed here so grep will be run on each individual machine rather than have the full output from ps shipped to the head node.
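As for the serial version mentioned above, cexecs takes the same arguments as cexec. A minimal sketch, useful when you want to watch each node respond in turn:

[root@fanny root]# cexecs uptime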

10.1.2.2 cget

This command is used to retrieve a file from each machine in the cluster. Since the file has the same name on each machine, the cluster and host names are appended to the file name when it is copied over. Here is an example.

[root@fanny root]# cget /etc/motd
[root@fanny root]# ls
motd_local_george.wofford.int
motd_local_hector.wofford.int
motd_local_ida.wofford.int
motd_local_james.wofford.int

cget ignores links and subdirectories.

10.1.2.3 ckill

This script allows you to kill a process running on each node in your cluster. To use it, specify the process by name, not by number, because it is unlikely that the processes will have the same process ID on each node.

[root@fanny root]# ckill -u sloanjd a.out
uid selected is 500
uid selected is 500
uid selected is 500
uid selected is 500

You may also specify an owner as shown in the example. By default, the local user name will be used.

10.1.2.4 cpush

This command is used to copy a file to each node in the cluster.

[root@fanny root]# cpush /etc/motd /root/motd.bak
building file list ... done
building file list ... done
motd
motd
building file list ... done
motd
wrote 119 bytes  read 36 bytes  62.00 bytes/sec
total size is 39  speedup is 0.25
wrote 119 bytes  read 36 bytes  62.00 bytes/sec
total size is 39  speedup is 0.25
wrote 119 bytes  read 36 bytes  62.00 bytes/sec
total size is 39  speedup is 0.25
building file list ... done
motd
wrote 119 bytes  read 36 bytes  62.00 bytes/sec
total size is 39  speedup is 0.25

As you can see, statistics for each move are printed. If you only specify one file, it will use the same name and directory for the source and the destination.
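For example, the following would copy /etc/motd to /etc/motd on every node in the cluster (an illustrative invocation, not part of the session above):

[root@fanny root]# cpush /etc/motd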

10.1.2.5 crm

This routine deletes or removes files across the cluster.

[root@fanny root]# crm /root/motd.bak

Like its serial counterpart rm, crm accepts the -i, -r, and -v options for interactive, recursive, and verbose deletes, respectively. Please note, the -i option only prompts once, not once per node. Without options, crm silently deletes files.
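For instance, a recursive, verbose delete of a scratch directory (the path is only a placeholder) would look like this:

[root@fanny root]# crm -r -v /root/scratch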

10.1.2.6 cshutdown

This utility allows you to shut down the nodes in your cluster.

[root@fanny root]# cshutdown -r t 0

In this example, the time specified was 0 for an immediate reboot. (Note the absence of the hyphen for the t option.) Additional options are supported, e.g., to include a shutdown message.

10.1.2.7 clist, cname, and cnum

These three commands are used to query the configuration file to assist in determining the appropriate numerical ranges to use with C3 commands. clist lists the different clusters in the configuration file.

[root@amy root]# clist
cluster  oscar_cluster  is a direct local cluster
cluster  pvfs_clients  is a direct local cluster
cluster  pvfs_iod  is a direct local cluster

cname lists the names of machines for a specified range.

[root@fanny root]# cname local:0-1
nodes from cluster:  local
cluster:  local ; node name:  george.wofford.int
cluster:  local ; node name:  hector.wofford.int

Note the use of 0 indexing.

cnum determines the index of a machine given its name.

[root@fanny root]# cnum ida.wofford.int
nodes from cluster:  local
ida.wofford.int is at index 2 in cluster local

These can be very helpful because it is easy to lose track of which machine has which index.

10.1.2.8 Further examples and comments

Here is an example using a range:

[root@fanny root]# cpush local:2-3 data
...

local designates which cluster in your configuration file the command applies to. Because compute nodes are numbered from 0, this will push the file data to the third and fourth compute nodes in the cluster. (That is, it will send the file from fanny to ida and james, skipping over george and hector.) Is that what you expected? For more information on ranges, see the manpage c3-range(5).

Note that the name used in C3 commands must match the name used in the configuration file. For C3, ida and ida.wofford.int are not equal even if there is an alias ida that resolves to ida.wofford.int. For example,

[root@fanny root]# cnum ida.wofford.int
nodes from cluster:  local
ida.wofford.int is at index 2 in cluster local
[root@fanny root]# cnum ida
nodes from cluster:  local

When in doubt about what form to use, just refer back to /etc/c3.conf.

In addition to the commands just described, the C3 command cpushimage can be used with SystemImager to push an image from the server to the nodes. There are also several user-contributed utilities. While these are not installed by default, they can be found in the C3 source tree in the subdirectory contrib. User-contributed scripts can be used as examples for writing other scripts using C3 commands.
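As a rough sketch of the SystemImager integration, pushing an image named myimage might look like the following (the image name is hypothetical, and the exact argument syntax is an assumption; consult the cpushimage manpage before using it):

[root@fanny root]# cpushimage myimage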

C3 commands take a number of different options not discussed here. For a brief description of other options, use the --help option with individual commands. For greater detail, consult the manpage for the individual command.
