Lab
1: Grid of Resistors
(Note about the lab: The HPF compiler we are using is a little
buggy. Unfortunately, the installation of it here (at the IMA)
is a lot buggy. Portland Group technical people are working to
fix this RIGHT NOW. Maybe it will improve this afternoon!)
Part of the job here is to find out what does or does not work
and what, if anything, works well. Getting the code to work in
higher dimensions is relatively easy. Getting performance may
be hard.)
During the lab, you can contact me by email (schreibe), phone
(4-3829) or coming to my office (510 Vincent).
Life is too short to spend writing do loops.
-- Cleve Moler, 1993
The main goal of this assignment is to convince you of the ease
of use of the data parallel programming model. For those problems
that lend themselves to this model, programming a parallel machine
can be easier than programming serial machines. Of
course, data parallel languages can execute on serial machines,
but parallel computers have promoted their use.
Step 1: Let's get used to the SGI power challenge. At the IMA
this is a four processor, shared memory machine. Login to i7.ima.umn.edu.
Try to compile and run a "hello world" HPF program:
program p
print *, 'Running on ', number_of_processors(), ' processors.'
end
The
compiler is
/usr/pgi/bin/pghpf
and
I use the following in my .cshrc
alias hpf pghpf -Mfree -Mextend -Minline -Mnosequence
which
allows me to use freeform input (no special columns), long lines,
and causes the compiler to do some inlining. You can also use
the -O2 or (if feeling very brave) the -O3 flags to get a more
aggressively optimized executable program. I also put
setenv MANPATH /usr/pgi/man
in
my environment; this allows me to use
man pghpf
to
get the man page.
Step 2: We will be writing this code using High Peformance Fortran.
You can get online documentation that is set up at the parallel
computer center at
BU but you dont need to do the stuff with your .cshrc file
that they recommend. Other web sites that may be useful to you
are: the Portland Group
pghpf bugs list
and
the HPF home
page Next step is grabbing the code:
cp ~schreibe/HPF/resistors.hpf .
You'll
obtain an HPF program that computes the effective resistance
and voltages of a grid of resistors. Compile with the pghpf
command. Read the online
users guide to find out how to compie and run on one or
more processors. See
Chapter 3 specifically. The IMA machine has four processors,
so there is no point in trying more than that. As far as I can
tell, you need to set the number of processors before you compile,
by
setenv PGHPF_OPTS '-np 4'
The problem is to compute the voltages and the effective resistance
of a 2n+3 by 2n+2 grid of 1 ohm resistors if a
battery is connected to the two center points. This is a discrete
version of finding the lines of force using iron filings for
a magnet. We force a net current on one amp, calculate the voltage
at every point in the grid (assuming the edges of the grid are
connected to ground) and find the effective resistance as the
voltage drop across the terminals of the battery. The equations
are just conservation of charge -- Kirchoff's laws.
The method of solution that we will use here is successive
overrelaxation (SOR) with red-black ordering. This is certainly
not the fastest way to solve the problem, but it does illustrate
many important programming ideas.
The nodes are divided in half into red nodes and black nodes.
During the first pass, the red nodes obtain the voltages as
a weighted average of their original voltage, the input (if
any) and the four surrounding black nodes. During the
second pass, the black nodes obtain voltages from the four surrounding
red nodes. In the infinite limit, the process converges to the
correct answer.
Compile and run the program. Then use the Fortran intrinsic
SYSTEM_CLOCK to time the program (insert the calls to system
clock and print the execution time.) Do you get any speedup
with additional processors? How is runtime influenced by the
grid size? First assignment. Implement the three dimensional
version. What is the effective resistance in N dimensions? (Any
guesses?) Try changing the distribution. Try rewriting the program
to use FORALL instead of EOSHIFT for the finite difference stencil
computation. Does it work? Faster or slower?

Now
this gets harder. In our program we use the where(red)
statement. Though you might have imagined that the compiler
is smart enough to work in time proportional to the active part
(i.e. the "true" part) of the statement, unfortunately
it is not. In fact, the time spent is proportional to the entire
size and even more time for conditioning.
We would like you to improve the program by putting all the
red nodes into one array called red and all the black
nodes in the same size array black.
By setting the edge of the voltage array at every iteration
to be equal to the neighboring values, you would implement a
no-current (i.e. a Neumann) boundary condition. You could try
this. Or, you could try periodic boundary conditions. For these,
use the CSHIFT intrinsic instead of EOSHIFT.

Assignment
Checklist
- Copy
the file.
- Understand
and time the program (using eoshift.)
- Extend
to three and four dimensions.
- Try
to use FORALL and/or to use two arrays.
- Use
a homogeneous Neumann boundary condition (assume no current
at the edge). Or try periodic boundaries.
1996-1997
Mathematics in High Performance Computing