FMM on multicore processors

Tuesday, August 3, 2010 - 2:30pm - 3:00pm
Keller 3-180
Nikos Pitsianis (Duke University)
We present preliminary results of parallelizing the fast multipole
method (FMM) for multicore processors using POSIX threads.

Short-range interactions are straightforward to parallelize. We invoke
multiple threads per compute core to alleviate partition and load
imbalances. For the calculation of long-range interactions, we assign
the multipole subtrees below a certain level to compute threads with
affinity settings that conform to the interaction lists of the tree
nodes and exploit the memory hierarchy.
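
The short-range strategy above can be illustrated with a minimal sketch in C with POSIX threads. It is not the code from the talk. The names box_t, chunk_t, near_field, near_field_worker, and inv_r2 are hypothetical, the kernel is a stand-in rather than the Yukawa kernel, and only interactions inside each leaf box are summed (a real near-field pass would also visit neighboring boxes). The boxes are split into cores * threads_per_core contiguous chunks, so that with more than one thread per core the operating system scheduler can fill idle cores when some chunks finish early.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical leaf box: owns particles [first, first + count). */
typedef struct { int first, count; } box_t;

/* Pair kernel as a function of squared distance, supplied by the caller. */
typedef double (*kernel_fn)(double r2);

/* Stand-in kernel 1/r^2 (an assumption, not the Yukawa kernel). */
double inv_r2(double r2) { return 1.0 / r2; }

typedef struct {
    const box_t *boxes;
    int begin, end;        /* this thread's half-open range of boxes */
    const double *pos;     /* particle coordinates as xyz triples */
    const double *q;       /* charges */
    double *phi;           /* output potentials */
    kernel_fn kernel;
} chunk_t;

/* Direct sums inside each box of the chunk. Boxes own disjoint particle
   ranges, so no two threads write to the same phi entry. */
void *near_field_worker(void *arg)
{
    chunk_t *c = arg;
    for (int b = c->begin; b < c->end; ++b) {
        int lo = c->boxes[b].first, hi = lo + c->boxes[b].count;
        for (int i = lo; i < hi; ++i) {
            double sum = 0.0;
            for (int j = lo; j < hi; ++j) {
                if (i == j) continue;
                double r2 = 0.0;
                for (int d = 0; d < 3; ++d) {
                    double dx = c->pos[3 * i + d] - c->pos[3 * j + d];
                    r2 += dx * dx;
                }
                sum += c->q[j] * c->kernel(r2);
            }
            c->phi[i] = sum;
        }
    }
    return NULL;
}

/* Launch cores * threads_per_core workers. Returns 0 on success, -1 on
   a bad thread count, allocation failure, or thread creation failure. */
int near_field(const box_t *boxes, int nboxes, const double *pos,
               const double *q, double *phi, kernel_fn kernel,
               int cores, int threads_per_core)
{
    assert(nboxes >= 0);
    int nt = cores * threads_per_core;
    if (nt < 1) return -1;
    pthread_t *tid = malloc((size_t)nt * sizeof *tid);
    chunk_t *chunk = malloc((size_t)nt * sizeof *chunk);
    if (!tid || !chunk) { free(tid); free(chunk); return -1; }

    int started = 0, rc = 0;
    for (int t = 0; t < nt; ++t) {
        chunk[t] = (chunk_t){ boxes,
                              (int)((long long)nboxes * t / nt),
                              (int)((long long)nboxes * (t + 1) / nt),
                              pos, q, phi, kernel };
        if (pthread_create(&tid[t], NULL, near_field_worker, &chunk[t])) {
            rc = -1;
            break;
        }
        ++started;
    }
    for (int t = 0; t < started; ++t) pthread_join(tid[t], NULL);
    free(tid);
    free(chunk);
    return rc;
}
```

A call such as near_field(boxes, nboxes, pos, q, phi, inv_r2, 16, 2) would run two threads per core on a 16-core machine. The static split is the simplest choice; a shared work counter would balance uneven boxes more finely, at the cost of synchronization. Pinning threads to cores, as described for the long-range phase, is platform specific and is left out of this sketch.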

On a Sun SunFire X4600 with 64 GB of memory and 8 AMD Opteron 885
processors (16 cores) running at 2.6 GHz, we observe a better than 15x
speedup over the sequential version of FMM-Yukawa on two sample
benchmark problems, each requiring six-digit accuracy, with 10 to 100
million charges uniformly distributed inside a unit box or on the
surface of a unit sphere.
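
As a rough reading of the reported figure, the arithmetic below converts a speedup of more than 15 on 16 cores into a parallel efficiency and, under the assumption that Amdahl's law applies, into a bound on the serial fraction. The symbols S (speedup), p (core count), E (efficiency), and s (serial fraction) are introduced here for illustration and do not appear in the talk.

```latex
E = \frac{S}{p} > \frac{15}{16} \approx 0.94,
\qquad
S = \frac{1}{s + \dfrac{1 - s}{p}} > 15
\;\Longrightarrow\;
s + \frac{1 - s}{16} < \frac{1}{15}
\;\Longrightarrow\;
s < \frac{1}{225} \approx 0.0044 .
```

Under that model, less than about half a percent of the sequential run time could have remained serial; effects such as memory bandwidth limits are not captured by this estimate.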
