14.3 Managing Communicators

Collective communication simplifies the communication process but has the limitation that you must communicate with every process in the communicator or communication group. There are times when you may want to communicate with only a subset of the available processes. For example, you may want to divide your processes so that different groups of processes work on different tasks. Fortunately, the designers of MPI foresaw that possibility and included functions that allow you to define and manipulate new communicators. By creating new communicators that are subsets of your original communicator, you'll still be able to use collective communication. This ability to create and manipulate communicators has been described as the key feature that distinguishes MPI from other message-passing systems.

Communicators are composed of two parts: a group of processes and a context. New communicators can be built by manipulating an existing communicator or by taking the group from an existing communicator and, after modifying that group, building a new communicator based on that group. The default communicator MPI_COMM_WORLD is usually the starting point, but once you have other communicators, you can use them as well.^[1]

^[1] Although it sounds like there is only one default communicator, there are actually two. The other default communicator is MPI_COMM_SELF. Since this is defined for each process and contains only that process, it isn't all that useful when defining new communicators.

14.3.1 Communicator Commands

MPI provides a number of functions for manipulating groups. The simplest way to create a new group is to select processes from an existing group, either by explicitly including or excluding processes. In the following example, process 0 is excluded from the group associated with MPI_COMM_WORLD to create a new group. You might want to do this if you are organizing your program using one process as a master, typically process 0, and all remaining processes as workers. This is often called a master/slave algorithm. At times, the slave processes may need to communicate with each other without including process 0. By creating a new communicator (newComm in the following example), you can then carry out the communication using collective functions.

#include "mpi.h"
#include <stdio.h>

int main( int argc, char * argv[  ] )
{
   int processId, flag = 0;
   int processes[1] = {0};            /* ranks to exclude */
   MPI_Group  worldGroup, newGroup;
   MPI_Comm  newComm;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &processId);

   /* build a group without process 0, then a communicator from it */
   MPI_Comm_group(MPI_COMM_WORLD, &worldGroup);
   MPI_Group_excl(worldGroup, 1, processes, &newGroup);
   MPI_Comm_create(MPI_COMM_WORLD, newGroup, &newComm);

   fprintf(stderr, "Before: process: %d   Flag: %d\n", processId, flag);
   if (processId == 1) flag = 1;
   /* process 0 is not a member of newComm, so it must skip the call */
   if (processId != 0)
      MPI_Bcast(&flag, 1, MPI_INT, 0, newComm);
   fprintf(stderr, "After  process: %d   Flag: %d\n", processId, flag);

   if (processId != 0)
   {
      MPI_Comm_free(&newComm);
      MPI_Group_free(&newGroup);
   }
   MPI_Finalize( );
   return 0;
}

Relevant portions of this program appear in boldface.

The first things to notice about the program are the new type declarations using MPI_Group and MPI_Comm. MPI_Group allows us to define handles for manipulating process groups. In this example, we need two group handles: one for the existing group from MPI_COMM_WORLD and one for the new group we are defining. The MPI_Comm type is used to define a variable for the new communicator being created.

14.3.1.1 MPI_Comm_group

Next, we need to extract the group from MPI_COMM_WORLD, our starting point for the new group. We use the function MPI_Comm_group to do this. It takes two arguments: the first is the communicator; the second argument is a handle used to return the group for the specified communicator, in this case, MPI_COMM_WORLD's group.

14.3.1.2 MPI_Group_incl and MPI_Group_excl

Once we have an existing group, we can use it to create a new group. In this example, we exclude process 0 from the original group using the MPI_Group_excl command. Exclusion is the easiest way to handle this particular case since only one process needs to be specified. MPI_Group_incl should be used when it is simpler to list processes to include rather than exclude. The four arguments to MPI_Group_incl and MPI_Group_excl are the same: the first argument is the original group you are using as a starting point; the second argument is the number of processes that will be included or excluded; the third argument is an integer array giving the ranks of the processes to be included or excluded; and the last argument is the address of the new group's handle.

In this example, since process 0 is excluded, we have used the array processes to list the single rank that we want excluded. When the program is run with exactly four processes, we could have accomplished the same thing with the array

int processes[3] = {1, 2, 3};

and the call

MPI_Group_incl(worldGroup, 3, processes, &newGroup);

Either way works fine.
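The fixed array {1, 2, 3} only matches a run with four processes. As a minimal sketch of how the include list might be built for any process count, the following helper fills an array with every rank except one. The name build_include_list is hypothetical and is not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI function): fill ranks[] with every
   rank in 0..worldSize-1 except `excluded`, preserving order.
   Returns the number of ranks written. The caller must supply an
   array with room for at least worldSize - 1 entries. */
int build_include_list(int worldSize, int excluded, int ranks[])
{
   int count = 0;

   assert(worldSize > 0 && ranks != NULL);
   assert(excluded >= 0 && excluded < worldSize);

   for (int r = 0; r < worldSize; r++)
      if (r != excluded)
         ranks[count++] = r;

   return count;
}
```

With worldSize obtained from MPI_Comm_size, the result could then be passed along as MPI_Group_incl(worldGroup, count, ranks, &newGroup).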

14.3.1.3 MPI_Comm_create

Finally, we need to turn the new group into a communicator. This is done with the MPI_Comm_create command, which takes three arguments: the original communicator, the new group, and the address for the new communicator's handle. Once this call is made, we have our communicator.

In the code sample given above, the next block of code shows how the new communicator could be used. In the example, there is a variable flag initially set to 0. Process 1 sets it to 1 and then broadcasts it to the remaining processes within the new communicator. Here is what the output for four processes looks like.

[sloanjd@amy COMM]$ mpirun -np 4 comm
Process: 0   Flag: 0
Process: 0   Flag: 0
Process: 1   Flag: 0
Process: 2   Flag: 0
Process: 3   Flag: 0
Process: 1   Flag: 1
Process: 2   Flag: 1
Process: 3   Flag: 1

Note that the value changes for every process except process 0.

There are a couple of things worth noting about how the new communicator is used. First, notice that only the relevant processes are calling MPI_Bcast. Process 0 has been excluded. Had this not been done, the call in process 0 would have returned a null communicator error since it is not part of the communicator. The other thing to note is that the process with rank 1 in MPI_COMM_WORLD has a rank of 0 in the new communicator. Thus, the fourth argument to MPI_Bcast is 0, not 1.
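The renumbering can be worked out by hand: a process keeps its relative order, and its new rank is its old rank minus the number of excluded ranks below it. A minimal sketch of that arithmetic follows. The helper rank_after_excl is hypothetical, not an MPI function, and it assumes the excluded ranks are distinct.

```c
#include <assert.h>

/* Hypothetical helper (not an MPI function): given a rank in the
   original group and the list of excluded ranks, return the rank the
   process would have in the new group, or -1 if it was excluded.
   This mirrors the order-preserving behavior of MPI_Group_excl. */
int rank_after_excl(int oldRank, const int excluded[], int nExcluded)
{
   int below = 0;   /* number of excluded ranks smaller than oldRank */

   assert(oldRank >= 0 && nExcluded >= 0);

   for (int i = 0; i < nExcluded; i++) {
      if (excluded[i] == oldRank)
         return -1;              /* not a member of the new group */
      if (excluded[i] < oldRank)
         below++;
   }
   return oldRank - below;
}
```

With process 0 excluded, rank_after_excl(1, processes, 1) gives 0, which is why the broadcast root is 0. In a real program, MPI_Group_translate_ranks performs this mapping between two groups.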

14.3.1.4 MPI_Comm_free and MPI_Group_free

It is good housekeeping to release any communicators or groups you are no longer using. For these two functions, the handles will be set to MPI_COMM_NULL and MPI_GROUP_NULL, respectively. While releasing these isn't absolutely necessary, it can be helpful at times. For example, doing so may alert you to the inadvertent use of what should be defunct groups or communicators. Each of these two functions takes the address of the communicator or of the group as an argument, respectively. It doesn't matter which function you call first.
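MPI_Comm_create returns MPI_COMM_NULL to any process that is not in the group, so the handle itself can serve as the guard instead of a hard-coded rank. The following is a sketch of that alternative, not the approach used in the example above. It assumes the program is run with at least two processes.

```c
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   int processId, flag = 0;
   int processes[1] = {0};            /* ranks to exclude */
   MPI_Group worldGroup, newGroup;
   MPI_Comm newComm;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &processId);

   MPI_Comm_group(MPI_COMM_WORLD, &worldGroup);
   MPI_Group_excl(worldGroup, 1, processes, &newGroup);
   MPI_Comm_create(MPI_COMM_WORLD, newGroup, &newComm);

   /* Every process holds both group handles, so every process frees
      them. The communicator stays valid after its group is freed. */
   MPI_Group_free(&newGroup);
   MPI_Group_free(&worldGroup);

   /* Processes outside the group received MPI_COMM_NULL. */
   if (newComm != MPI_COMM_NULL) {
      if (processId == 1) flag = 1;
      MPI_Bcast(&flag, 1, MPI_INT, 0, newComm);
      MPI_Comm_free(&newComm);
   }
   fprintf(stderr, "Process: %d   Flag: %d\n", processId, flag);

   MPI_Finalize();
   return 0;
}
```

Testing the handle keeps the guard correct even if the list of excluded ranks changes later.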

Since process 0 is not part of the new communicator in the last example, we need to guard against using the new communicator within process 0. This isn't too difficult when a single process is involved but can be a bit of a problem when more processes are involved. So in some instances, splitting communicators is a better approach. Here is a simple example.

#include "mpi.h"
#include <stdio.h>

int main( int argc, char * argv[  ] )
{
   int processId, flag = 0, color = 0;
   MPI_Comm    newComm;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &processId);

   /* processes 0 and 1 get color 1; all others keep color 0 */
   if (processId == 0 || processId == 1) color = 1;
   MPI_Comm_split(MPI_COMM_WORLD, color, processId, &newComm);

   fprintf(stderr, "Process: %d   Flag: %d\n", processId, flag);
   if (processId == 0) flag = 1;
   /* each process broadcasts within its own newComm */
   MPI_Bcast(&flag, 1, MPI_INT, 0, newComm);
   fprintf(stderr, "Process: %d   Flag: %d\n", processId, flag);

   MPI_Comm_free(&newComm);
   MPI_Finalize( );
   return 0;
}

Notice that, in this example, the communicator is manipulated directly without resorting to dealing with groups.

14.3.1.5 MPI_Comm_split

The function MPI_Comm_split is at the heart of this example. It is used to break a communicator into any number of pieces. The first argument is the original communicator. The second argument, often referred to as the color, is used to determine which communicator a process will belong to. All processes that make the call to MPI_Comm_split with the same color will be in the same communicator. Processes with different values (or colors) will be in different communicators. In this example, processes 0 and 1 have a color of 1 so they are in one communicator while processes 2 and above have a color of 0 and are in a separate communicator. (If the color is MPI_UNDEFINED, the process is excluded from any of the new communicators.) The third argument, often called the key, is used to determine the rank ordering for processes within a communicator. When keys are the same, the original rank is used to break the tie. The last argument is the address of the new communicator.

Table 14-1 gives a slightly more complicated example of how this might work. Using the data in this table, three new communicators are created. The first communicator consists of processes A, C, and D with ranks in the new communicator of 1, 0, and 2, respectively. The second communicator consists of processes B and E with ranks 0 and 1, respectively. The last communicator consists of the single process F with a rank of 0. Process G is not included in any of the new communicators.

Table 14-1. Communicator assignments

Process         A   B   C   D   E   F   G
Original rank   0   1   2   3   4   5   6
Color           1   2   1   1   2   3   MPI_UNDEFINED
Key             3   3   2   3   3   a   0
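The rank assignments in Table 14-1 can be checked with a small simulation of the ordering rule: within a color, processes are ordered by key, and ties are broken by original rank. This is only a sketch of that rule in plain C. The helper split_rank and the constant NO_COLOR (a stand-in for MPI_UNDEFINED) are hypothetical names, not part of MPI.

```c
#include <assert.h>

#define NO_COLOR (-1)   /* stand-in for MPI_UNDEFINED in this sketch */

/* Hypothetical helper (not an MPI function): compute the rank that
   process `me` would receive from a split, given each process's
   color and key indexed by original rank. Returns -1 when the
   process's color is NO_COLOR. */
int split_rank(int me, int n, const int color[], const int key[])
{
   int rank = 0;

   assert(me >= 0 && me < n);

   if (color[me] == NO_COLOR)
      return -1;

   for (int p = 0; p < n; p++) {
      if (p == me || color[p] != color[me])
         continue;               /* only peers with the same color count */
      /* p precedes me if its key is smaller, or if the keys are
         equal and p has the lower original rank */
      if (key[p] < key[me] || (key[p] == key[me] && p < me))
         rank++;
   }
   return rank;
}
```

Applied to the colors and keys in the table, this gives C, A, and D ranks 0, 1, and 2 in the first communicator. Process F is alone in its communicator, so its key has no effect on its rank.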

Returning to the code given above, with four processes, two communicators will be created. Both will be called newComm. The first will have the original processes 0 and 1 with the same ranks in the new communicator. The second will have the original processes 2 and 3 with new ranks 0 and 1, respectively. Notice that a communicator is defined for every process, all with the same name.

These two examples should give you an idea of why communicators are useful and how they are used. Group management functions include functions to access groups (e.g., MPI_Group_size, MPI_Group_rank, and MPI_Group_compare) and functions to construct groups (e.g., MPI_Group_difference, MPI_Group_union, MPI_Group_incl, and MPI_Group_range_incl). There are also a number of different communicator management functions (e.g., MPI_Comm_size, MPI_Comm_dup, MPI_Comm_compare, and MPI_Comm_create).
