
14.4 Packaging Data

Since communication is expensive, the fewer messages your program sends, the better its performance will be. With this in mind, MPI provides several ways of packaging data so that you can maximize the amount of information exchanged in each message. There are three basic strategies.

Although we glossed over it, you've already seen one technique. You'll recall that the message passed to MPI_Send is described by a buffer address, a count, and a data type. Clearly, this mechanism can be used to send multiple pieces of information as a single message, provided they are all of the same type. For example, in our first interactive version of the numerical integration program, three calls to MPI_Send were used to distribute the values of numberRects, lowerLimit, and upperLimit to all the processes.

MPI_Send(&numberRects, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);        

MPI_Send(&lowerLimit, 1, MPI_DOUBLE, dest, 1, MPI_COMM_WORLD);

MPI_Send(&upperLimit, 1, MPI_DOUBLE, dest, 2, MPI_COMM_WORLD);

We could have eliminated one of these calls by putting lowerLimit and upperLimit in an array and sending it in a single call.

         params[0] = lowerLimit;

         params[1] = upperLimit;       

         MPI_Send(params, 2, MPI_DOUBLE, dest, 1, MPI_COMM_WORLD);

If you do this, don't forget to declare the array params and to make the corresponding changes to the call to MPI_Recv that retrieves the data from the array.
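On the receiving side, the change might look something like the following sketch. This is not taken from the program itself; it assumes that rank 0 is the sender and that the tag 1 used above is what the receiver matches.

   double      params[2];      /* matching buffer on the receiving side */
   MPI_Status  status;

   MPI_Recv(params, 2, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, &status);
   lowerLimit = params[0];     /* copy the two values back out of the array */
   upperLimit = params[1];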

For this to work, the items must be in contiguous locations in memory. While this is true for arrays, there are no guarantees for variables in general, so using an array was necessary. This is certainly a legitimate way to write code and, when sending blocks of data, is very reasonable and efficient. In this case, though, we've removed only one call, so the savings are somewhat dubious. Furthermore, we weren't able to include numberRects, since it is an integer rather than a double.

It might seem that a structure would be a logical way around this last problem since the elements in a structure are guaranteed to be in contiguous memory. Before a structure can be used in an MPI function, however, it is necessary to define a new MPI type. Fortunately, MPI provides a mechanism to do just that.

14.4.1 User-Defined Types

A user-defined data type can be used in place of the predefined data types included with MPI, and it can appear as the data type in any MPI communication function. MPI type-constructor functions describe the memory layout of these new types in terms of primitive types. A user-defined, or derived, data type is an opaque object that specifies a sequence of primitive data types and a corresponding sequence of displacements, or offsets.

Here is the numerical integration problem adapted to use a user-defined data type.

#include "mpi.h"

#include <stdio.h>

   

/* problem parameters */

#define f(x)            ((x) * (x))

   

int main(int argc, char *argv[])

{

   /* MPI variables */

   int            noProcesses, processId;

   int            blocklengths[3] = {1, 1, 1};

   MPI_Aint        displacements[3] = {0, sizeof(double), 2*sizeof(double)};

   MPI_Datatype    rectStruct; /* the new type */

   MPI_Datatype    types[3] = {MPI_DOUBLE, MPI_DOUBLE, MPI_INT};

 

   /* problem variables */

   int        i;

   double     area, at, height, lower, width, total, range;

   

   struct ParamStruct 

   {  double  lowerLimit;

     double  upperLimit;

     int     numberRects;

   } params;

   

   /* MPI setup */

   MPI_Init(&argc, &argv);

   MPI_Comm_size(MPI_COMM_WORLD, &noProcesses);

   MPI_Comm_rank(MPI_COMM_WORLD, &processId);

   

   /* define type */

   MPI_Type_struct(3, blocklengths, displacements, types, &rectStruct);

   MPI_Type_commit(&rectStruct);

      

   if (processId == 0)         /* if rank is 0, collect parameters */

   {   

      fprintf(stderr, "Enter number of steps:\n");

      scanf("%d", &params.numberRects);

      fprintf(stderr, "Enter low end of interval:\n");

      scanf("%lf", &params.lowerLimit);

      fprintf(stderr, "Enter high end of interval:\n");

      scanf("%lf", &params.upperLimit);

   }

      

   MPI_Bcast(&params, 1, rectStruct, 0, MPI_COMM_WORLD);

  

   /* adjust problem size for subproblem*/

   range = (params.upperLimit - params.lowerLimit) / noProcesses;

   width = range / params.numberRects;

   lower = params.lowerLimit + range * processId;

   

   /* calculate area for subproblem */

   area = 0.0;

   for (i = 0; i < params.numberRects; i++)

   {  at = lower + i * width + width / 2.0;

      height = f(at);

      area = area + width * height;

   }

   

   MPI_Reduce(&area, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

   

   /* collect information and print results */

   if (processId == 0)         /* if rank is 0, collect results */

   {  fprintf(stderr, "The area from %f to %f is: %f\n",

              params.lowerLimit, params.upperLimit, total );

   } 

 

   /* finish */

   MPI_Finalize();

   return 0;

}

To simplify the new MPI type definition, a structure type ParamStruct was defined; params is an instance of that structure. The individual elements of the structure are accessed with constructs such as params.numberRects. The remaining changes from the earlier version of the program are the declarations for the new type and the calls that define, commit, and use it.

14.4.1.1 MPI_Type_struct

MPI_Type_struct was used to define the new type. This function takes five arguments. The first four are input parameters while the last is the output parameter. The first is an integer that gives the number of blocks of elements in the type. In our example, we have three blocks of data. Our blocks are two doubles and an integer, but a block could be an aggregate data type such as an array. The next argument is an array giving the lengths of each block. In this example, because we've used scalars instead of arrays for our three blocks, the argument is just an array of 1's, one for each block. The third argument is an array of displacements. A displacement is determined by the size of the previous blocks, and the first displacement is always zero. Note the use of the type MPI_Aint. This bit of MPI magic is an integer type defined to be large enough to hold any address on the target architecture.[2] The fourth argument is an array of primitive data types. Basically, you can think of an MPI data type as a set of pairs, each pair defining the basic MPI type and its displacement in bytes.

[2] MPI also supplies a function MPI_Address that can be used to calculate an offset. It takes a variable and returns its byte address in memory.
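Hard-coding the displacements with sizeof works here because, on most platforms, this particular structure has no padding between its members. A more portable variation, sketched below, fills in the same array with the standard offsetof macro from <stddef.h>; this is not part of the original program.

#include <stddef.h>      /* for offsetof */

   MPI_Aint displacements[3];
   displacements[0] = offsetof(struct ParamStruct, lowerLimit);
   displacements[1] = offsetof(struct ParamStruct, upperLimit);
   displacements[2] = offsetof(struct ParamStruct, numberRects);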

14.4.1.2 MPI_Type_commit

Before a derived type can be used, it must be committed. You can think of this as "compiling" the new data type. The only argument to MPI_Type_commit is the type being defined.

As you can see from the example, the new type is used just like any existing type once defined. Keep in mind that this is a very simplistic example. Much more complicated structures can be built. MPI provides a rich feature set for user-defined data types.
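For instance, the committed type could be used in a point-to-point call just as readily as in the broadcast above, and it can be released once it is no longer needed. The sketch below is not part of the program; dest is assumed to hold the rank of some receiving process.

   /* send the whole parameter structure in one message */
   MPI_Send(&params, 1, rectStruct, dest, 0, MPI_COMM_WORLD);

   /* release the derived type when it is no longer needed */
   MPI_Type_free(&rectStruct);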

14.4.2 Packing Data

Another way to package data is to use the MPI functions MPI_Pack and MPI_Unpack. MPI_Pack allows you to store noncontiguous data in a contiguous block of memory, while MPI_Unpack is used to retrieve that data.

14.4.2.1 MPI_Pack

MPI_Pack takes seven arguments. The first three describe the message to be packed: the input buffer, the number of input components, and the data type of each component. The next two define the buffer into which the information is packed: the output buffer and its size in bytes. The next-to-last argument gives the current position in the buffer in bytes, while the last argument is the communicator for the message.

Here is an example of how data is packed.

position = 0;

MPI_Pack(&numberRects, 1, MPI_INT, buffer, 50, &position, MPI_COMM_WORLD);

MPI_Pack(&lowerLimit, 1, MPI_DOUBLE, buffer, 50, &position, MPI_COMM_WORLD);

MPI_Pack(&upperLimit, 1, MPI_DOUBLE, buffer, 50, &position, MPI_COMM_WORLD);

In this instance, buffer has been defined as an array of 50 chars and position is an int. Notice that the value of position is updated automatically as each item is packed.
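The supporting declarations would look something like the sketch below. The call to MPI_Pack_size is not part of the original example, but it is one way to compute a safe buffer size rather than guessing.

   char buffer[50];     /* staging area for the packed message */
   int  position;       /* running offset into buffer, in bytes */

   /* optional: let MPI compute an upper bound on the space needed */
   int  intSize, doubleSize;
   MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &intSize);
   MPI_Pack_size(2, MPI_DOUBLE, MPI_COMM_WORLD, &doubleSize);
   /* intSize + doubleSize bytes is enough to hold this message */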

14.4.2.2 MPI_Unpack

MPI_Unpack also takes seven arguments. The first is the input buffer, a contiguous storage area containing the number of bytes given by the second argument. The third argument is the position where unpacking should begin. The fourth and fifth arguments give the output buffer and the number of components to unpack. The next-to-last argument is the output data type, while the last argument is the communicator for the message. You need not unpack the entire message.

Here is an example of unpacking the data just packed.

if (processId != 0)

   {  position = 0;

      MPI_Unpack(buffer, 50, &position, &numberRects, 1, MPI_INT, 

      MPI_COMM_WORLD);  

      MPI_Unpack(buffer, 50, &position, &lowerLimit, 1, MPI_DOUBLE, 

      MPI_COMM_WORLD);     

      MPI_Unpack(buffer, 50, &position, &upperLimit, 1, MPI_DOUBLE, 

      MPI_COMM_WORLD);

   }

This is the call to MPI_Bcast used to distribute the packed data. Process 0 packs the buffer before the broadcast, and the remaining processes unpack it afterward.

MPI_Bcast(buffer, 50, MPI_PACKED,  0, MPI_COMM_WORLD);

As you can see, it's pretty straightforward. The most likely mistakes are getting the parameters in the wrong order or using the wrong data type.

In general, if you have an array of data, the first approach (using a count) is the easiest. If you have lots of different data scattered around your program, packing and unpacking is likely to be the best choice. If the data are stored at regular intervals and are all of the same type, e.g., a column of a matrix, a derived type is usually a good choice.
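As an illustration of that last case, the following sketch, which is not taken from this chapter's examples, uses MPI_Type_vector to describe one column of a row-major ROWS x COLS array of doubles. Consecutive elements of a column are COLS doubles apart in memory, so that value is used as the stride; dest is assumed to be the rank of a receiving process.

#define ROWS 4
#define COLS 5

   double       matrix[ROWS][COLS];
   MPI_Datatype colType;

   /* ROWS blocks of 1 double each, spaced COLS doubles apart */
   MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &colType);
   MPI_Type_commit(&colType);

   /* send column 2 of the matrix as a single message */
   MPI_Send(&matrix[0][2], 1, colType, dest, 0, MPI_COMM_WORLD);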

This chapter has only scratched the surface; there is a lot more to know about MPI. For more information, consult any of the books described in Appendix A.
