16.5 Tracing with printf
Printing information at key points is a way of tracing, or following,
the execution of the code. With C code, you scatter calls to printf
throughout the code to tell you that execution has reached a
particular point or to show you the value of a variable. By using
this approach, you can zero in on a crucial
point in the program and see the value of parameters that may affect
the execution of the code. This quick and dirty approach works best
when you already have an idea of what might be going wrong. But if
you are clueless as to where the problem is, you may need a lot of
print statements to zero in on the problem, particularly with large
programs. Moreover, it is very easy for the truly useful information
to get lost in the deluge of output you create.
On the other hand, there is certainly nothing wrong with printing
information that provides the user with some sense of progress and an
indication of how the program is working. We did this in our
numerical integration program when we printed the process number and
the individual areas calculated by each process.
It can be particularly helpful to echo values that are read into the
program to ensure that they didn't get garbled in the process. For
example, if you've inadvertently coerced a floating-point number into
an integer, the truncation that occurs will likely cause problems. By
printing the value, you may be alerted to the problem.
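As a minimal sketch of echoing input, consider the two functions below. The names read_step and read_step_truncated are hypothetical and used only for illustration; the explicit cast stands in for an accidental conversion to int. The echo goes to stderr so it doesn't mix with the program's normal output.

```c
#include <stdio.h>
#include <stdlib.h>

/* Correct version: keep the value as a double and echo what was parsed. */
double read_step(const char *text)
{
    double step = strtod(text, NULL);
    fprintf(stderr, "read_step: \"%s\" parsed as %g\n", text, step);
    return step;
}

/* Buggy version: the conversion to int discards the fractional part.
   The echoed value makes the truncation visible right away. */
int read_step_truncated(const char *text)
{
    int step = (int)strtod(text, NULL);
    fprintf(stderr, "read_step_truncated: \"%s\" stored as %d\n", text, step);
    return step;
}
```

Given the input "0.25", the first function echoes 0.25 while the second echoes 0, which points directly at the conversion.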
Including print statements can also be helpful when you are working
with complicated data structures since you will be able to format the
data in meaningful ways. Examining a large array with a symbolic
debugger can be challenging. Since it is straightforward to
conditionally print information, print statements can be helpful when
the data you are interested in is embedded within a large loop and
you want to examine it only under selective conditions.
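For instance, a loop over a large array might report only the elements that fall outside an expected range. The following is a sketch under that assumption; the function name trace_out_of_range and the range test are illustrative choices, not something taken from the programs in this book.

```c
#include <stdio.h>

/* Print only the elements of a[0..n-1] that lie outside [lo, hi].
   Returns the number of elements reported. */
int trace_out_of_range(const double *a, int n, double lo, double hi)
{
    int reported = 0;

    for (int i = 0; i < n; i++) {
        if (a[i] < lo || a[i] > hi) {
            fprintf(stderr, "a[%d] = %g is outside [%g, %g]\n",
                    i, a[i], lo, hi);
            reported++;
        }
    }
    return reported;
}
```

The same test is awkward to express in many debuggers but takes only a single if statement here.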
In developing code, programmers will frequently write large blocks of
diagnostic code that they will discard
once the code seems to be working. When the code has to be changed at
a later date, they will often find themselves rewriting similar code
as new problems arise. A better solution is to consider the
diagnostic code a key part of the development process and keep it in
your program. By using conditional compilation directives such as
#ifdef, you can disable the code in production versions so that
program efficiency isn't compromised, yet enable it easily should
the need arise.
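One common way to do this in C is to hide the diagnostic output behind a macro that is compiled away unless a symbol is defined. The sketch below assumes a symbol named DEBUG and a macro named TRACE; both names are arbitrary choices for illustration, as is the trapezoid_area function.

```c
#include <stdio.h>

/* With -DDEBUG, TRACE prints to stderr and flushes.
   Without it, TRACE expands to an empty statement and costs nothing. */
#ifdef DEBUG
#define TRACE(...) do { fprintf(stderr, __VA_ARGS__); fflush(stderr); } while (0)
#else
#define TRACE(...) do { } while (0)
#endif

/* Area of one trapezoid, with optional tracing of its inputs and result. */
double trapezoid_area(double left, double right, double width)
{
    double area = 0.5 * (left + right) * width;

    TRACE("trapezoid_area: left=%g right=%g width=%g area=%g\n",
          left, right, width, area);
    return area;
}
```

Compiling with the option -DDEBUG (as accepted by gcc and most other C compilers) turns the diagnostics on; compiling without it produces the production version from the same source.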
A technique that is often used with
printf is deleting extraneous
code. The idea is, after making a copy of your program, to start
deleting code and retesting to see whether the problem has
disappeared. The goal is to produce the smallest piece of code that
still exhibits the problem. This can be useful with some types of
problems, particularly when you are trying to piece together how some
feature of a language works. It can also be helpful when generating a
bug report.
With parallel code, the printf approach can be
problematic. Earlier in this book, you saw examples of how the output
from different processes could be printed in a seemingly arbitrary
order. Buffering further complicates matters. If your code is
crashing, a process may die before its output is displayed. That
output will be lost. Also, adding output changes timings, which
limits its usefulness if you are dealing with a race condition.
Finally, print statements can seriously degrade performance.
If you are going to use the printf approach with
parallel programs, there are two things you should do. First, if
there is any possibility of the source of the output being confused,
be sure to label the output with the process number or machine name.
Second, follow your calls to printf with a call to
fflush
so that the output is actually printed at the moment the program
generates it. For example,
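...
int processId;
int nameSize;   /* receives the length of the processor name */
char processName[MPI_MAX_PROCESSOR_NAME];
...
MPI_Comm_rank(MPI_COMM_WORLD, &processId);
MPI_Get_processor_name(processName, &nameSize);
...
/* Label the output with the rank and host, then flush it at once. */
fprintf(stdout, "Process %d on %s at checkpoint 1.\n", processId,
        processName);
fflush(stdout);
...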
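To keep the labeling and the flush together, the two steps can be wrapped in a small helper. This is only a sketch: format_trace and trace are hypothetical names, and the rank and host name are passed in as plain arguments so that the helper itself doesn't depend on MPI. In an MPI program they would be the values obtained from MPI_Comm_rank and MPI_Get_processor_name.

```c
#include <stdio.h>

/* Build one labeled trace line in buf. Returns what snprintf returns:
   the length the full line would have, excluding the terminating null. */
int format_trace(char *buf, size_t size, int rank, const char *host,
                 const char *msg)
{
    return snprintf(buf, size, "[rank %d on %s] %s\n", rank, host, msg);
}

/* Write a labeled trace line to stdout and flush it immediately. */
void trace(int rank, const char *host, const char *msg)
{
    char line[256];   /* longer messages are truncated by snprintf */

    format_trace(line, sizeof line, rank, host, msg);
    fputs(line, stdout);
    fflush(stdout);
}
```

Every checkpoint then becomes a single call such as trace(processId, processName, "checkpoint 1"), so no call site can forget the label or the flush.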
If you want to control the order of the output,
you'll need to have the master process coordinate
output.