Title: How to Build a Beowulf
Authors: Thomas Sterling, John Salmon, Donald Becker, Daniel Savarese
Publisher: MIT Press
ISBN: 026269218X
Reviewer: Joe Greenseid
How to Build a Beowulf attempts to walk the
reader through the process described in the title --
namely, building a Beowulf cluster, all the way from
choosing hardware to assembling the systems, installing
Linux, configuring the "Beowulf" type environment, and
even using a parallel programming environment to
take advantage of the new parallel computing
system.
The book succeeds in doing exactly what it sets out to
do. Because it was published in 1999, certain information
is dated (hardware pricing and the need for floppy disks
to install the cluster, for example), but aside from
these minor issues, the book
is extremely useful for someone relatively new to
clustering.
The first two chapters are overview. Chapter one is a
brief introduction to the history of Beowulf computing
and a bit about the book. Chapter two gives an
overview of a Beowulf system: hardware, software,
message passing, and systems management.
The next four chapters get into building the system.
First, chapter three describes the node hardware the reader
should consider getting. Chapter four
is an explanation of what Linux is, how it works, some
of the things the reader will be doing, and so on.
Chapter five is devoted entirely to networking hardware
and software. The hardware section looks at the
different types of network that can be set up. This
information is interesting as a brief overview of the
different types of networking, though these days
nearly everyone who wants to build their
own personal cluster uses Ethernet. The software
section of this chapter looks into aspects of networking
such as TCP/IP, sockets, high level protocols,
distributed objects (CORBA and the Java RMI),
distributed file systems, and remote access and command
execution.
After the previous three chapters lay out all the
pieces one will need to build a cluster, the last
chapter of this section, chapter six, titled
"Managing Ensembles," is about how to build and
configure a cluster (ensemble) system. The book does
something that I found refreshing: it essentially
says "install Linux and then do this..." leaving the
installation of Linux as an exercise for the reader.
It goes into more detail on topics such as cloning
your system, imaging your slave nodes, account
management, and some simple security topics (ssh,
restricting host access, IP masquerading). Again, some
of the information here is a little outdated. A sizable
segment of this chapter is about how to clone your slave
nodes. At the time, there were very few reliable
software packages that could do this, but today there
are a number of imaging tools that
can do everything described here in an automated
fashion. However, the basic idea behind what is done in
the book and what these software packages do is the
same, so if you are curious how they work, a quick read
will give you the basics.
The next three chapters are the last major section of
this book. Once the cluster is built, the next hurdle
is using it. Chapters seven through nine deal with
parallel computing and programming. Chapter seven is
titled "Parallel Applications." As explained in the
chapter introduction, "...a beowulf [may] be the user's
first experience with parallel computing. The purpose
of this chapter is to offer some guidelines, examples,
and advice in designing parallel applications suitable
to Beowulf systems." Time in this chapter is spent
covering categories of parallel algorithms,
process-level parallelism, and some overheads you may
encounter.
Chapter eight is devoted to explaining MPI, the
"Message Passing Interface." The authors do a
good job explaining MPI, and include examples and
sample code to help guide the reader through. Chapter
nine, "Programming with MPI - A
Detailed Example," then covers the
writing and analysis of MPI programs.
The last chapter offers conclusions, discussing
what the authors see in the future for Beowulf
computing.
Overall, I very much enjoyed this book. This is the
book I wish I had read when I first wanted to learn
about Linux clustering. Unfortunately, I did not; I did
not get the opportunity to read this book until I had
been playing around with clustering for a few years.
Consequently, much of the information provided was
material I had already learned, one way or
another. However, even though some of the information in
this book is dated, the concepts are sound, and for
someone just starting out, this book is an excellent
read. I would highly recommend it.