Audience
This book is an introduction to building high-performance clusters.
It is written for the biologist, chemist, or physicist who has just
acquired two dozen recycled computers and is wondering how she might
combine them to perform that calculation that has always taken too
long to complete on her desktop machine. It is written for the
computer science student who needs help getting started building his
first cluster. It is not meant to be an exhaustive treatment of
clusters; rather, it introduces the basics needed to build
and begin using a cluster.
In writing this book, I have assumed that the reader is familiar with
the basics of setting up and administering a Linux system. At a
number of places in this book, I provide a very quick overview of
some of the issues. These sections are meant as a review, not an
exhaustive introduction. If you need help in this area, several
excellent books are available and are listed in the Appendix of this
book.
When introducing a subject as extensive as clusters, it is impossible
to discuss every relevant topic in detail without losing focus and
producing an unmanageable book. Thus, I have had to make a number of
hard decisions about what to include. There are many topics that,
while of no interest to most readers, are nonetheless important to
some. When faced with such topics, I have tried to briefly describe
alternatives and provide pointers to additional material. For
example, while computational grids are outside the scope of this
book, I have tried to provide pointers for those of you who wish to
know more about grids.
For the chapters dealing with programming, I have assumed a basic
knowledge of C. For high-performance computing, FORTRAN and C are
still the most common choices. For Linux-based systems, C seemed a
more reasonable choice.
I have limited the programming examples to MPI because I believe
this is the most appropriate parallel library for beginners. I have
made a particular effort to keep the programming examples as simple
as possible. There are a number of excellent books on MPI
programming. Unfortunately, the available books on MPI all tend to
use fairly complex problems as examples. Consequently, it is all too
easy to get lost in the details of an example and miss the point.
While you may become annoyed with my simplistic examples, I hope that
you won't miss the point. You can always turn to
these other books for more complex, real-world examples.
With any introductory book, there are things that must be omitted to
keep the book manageable. This problem is further compounded by the
time constraints of publication. I did not include a chapter on
diskless systems because I believe the complexities introduced by
using diskless systems are best avoided by people new to clusters.
Computational grids are not covered because doing so would have
lengthened this book considerably. There simply wasn't time or
space to cover some very worthwhile software, most notably PVM
and Condor. These were hard decisions.