How much does the Altivec unit on Motorola's G4 chip improve floating-point performance?

On my modified HINT floating-point benchmark, comparing Altivec-enabled code with code that is not Altivec-enabled on the same machine (a Power Macintosh 7300 with a 450 MHz G4 processor upgrade) gives:

A 39% increase in overall (net) performance and a 58% improvement in peak performance.

Without Altivec, performance matches a Pentium III/600 MHz 'Coppermine.'
With Altivec, performance approaches that of an Alpha 21264/500 MHz.

It should be noted that the performance of the Altivec-enabled code depends heavily on my skill and knowledge of Altivec programming; this is the first code I have written for Altivec.
 
 
 
CPU type          G4, no Altivec                              G4, Altivec  Pentium III/Coppermine  Alpha 21264
net MQUIPS        15.31                                       21.30        15.42                   27.62
peak MQUIPS       2.22                                        3.51         2.21                    3.94
CPU clock speed   450 MHz                                     same         600 MHz                 500 MHz
L2 cache size     1024 kilobytes                              same         256 kilobytes           4 megabytes
L2 cache speed    225 MHz                                     same         600 MHz                 200 MHz
Operating System  Linux 2.2.15                                same         Linux 2.0.35            Linux 2.2.12
Compiler          gcc-vec-2.9.2 -fvec -O3 -funroll-all-loops  same         pgcc-3.1 -fast          ccc -fast
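
The 39% and 58% figures quoted earlier follow directly from the net and peak MQUIPS rows of the table above. As a small sketch of that arithmetic (the function name relative_gain is only an illustrative choice, not part of HINT):

```c
#include <assert.h>

/* Fractional improvement of `after` over `before`.
   A return value of 0.39 means "39% faster". */
double relative_gain(double before, double after)
{
    assert(before > 0.0);   /* a zero or negative baseline makes no sense here */
    return after / before - 1.0;
}
```

Calling relative_gain(15.31, 21.30) gives roughly 0.39 for net MQUIPS, and relative_gain(2.22, 3.51) gives roughly 0.58 for peak MQUIPS.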

For more information on the HINT benchmark, see the original web page at Ames Laboratory.

See my software download page to download the original source code for HINT (covered under the GPL), as well as my modified HINT benchmark.  Due to the nature of the computations, results from the original and modified versions may in fact be directly compared.

HINT Benchmark:
Most benchmarks measure either the number of operations that can be performed in a given time period, or the time required to perform a given fixed calculation.  HINT does neither; rather, it performs a particular calculation (estimating upper and lower bounds for the definite integral of a monotone-decreasing function) with ever-increasing accuracy.  The accuracy of the result at any given time is called the "Quality"; the rate at which quality improves is measured in "Quality Improvements per Second," or QUIPS.  As the computation progresses and the quality of the result improves, more memory and more operations are required to improve the answer.  The full HINT benchmark is a plot of QUIPS versus time for a given data type.  Net MQUIPS is the area under the HINT curve with time plotted logarithmically, in millions of QUIPS.  Higher is better.  HINT curves are a function of raw CPU processing power, L1 and L2 cache size and speed, and main-memory bandwidth.
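
To make the idea of bounding an integral concrete, here is a much-simplified sketch. It is not the HINT source: it assumes the integrand (1-x)/(1+x) on [0,1] purely for illustration, splits the interval into equal pieces instead of refining adaptively as the real benchmark does, and takes "quality" to be the reciprocal of the gap between the two bounds, which is also an assumption made here. The names integrand, Bounds and bound_integral are hypothetical.

```c
#include <assert.h>

/* Example integrand, monotone decreasing on [0, 1] (assumed for illustration). */
double integrand(double x)
{
    return (1.0 - x) / (1.0 + x);
}

typedef struct {
    double lower;    /* guaranteed lower bound on the integral */
    double upper;    /* guaranteed upper bound on the integral */
    double quality;  /* 1 / (upper - lower); an assumed definition */
} Bounds;

/* Bound the integral over [0, 1] using n equal subintervals.
   Because the integrand is decreasing, its value at the left edge of each
   subinterval overestimates it there and the right edge underestimates it. */
Bounds bound_integral(int n)
{
    Bounds b;
    double h, lower = 0.0, upper = 0.0;
    int i;

    assert(n > 0);
    h = 1.0 / n;
    for (i = 0; i < n; i++) {
        upper += integrand(i * h) * h;        /* left edge: too large  */
        lower += integrand((i + 1) * h) * h;  /* right edge: too small */
    }
    b.lower = lower;
    b.upper = upper;
    b.quality = 1.0 / (upper - lower);
    return b;
}
```

In this sketch, doubling n halves the gap between the bounds and so doubles the quality, but it also doubles the work and the number of values touched. Dividing the quality reached by the time taken to reach it gives a QUIPS-style figure, which is why the curve reflects cache and memory behavior as the working set grows.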

Vector processing:
The Motorola PowerPC 7400 (aka 'G4') has a special processing unit called Altivec.  This unit can perform a given computation (say, adding two numbers together) simultaneously on groups of input data 128 bits long.  Since a single-precision floating-point number is 32 bits long, the Altivec unit can theoretically perform four floating-point operations per cycle.  This is rarely achieved in practice.  However, one can still achieve a substantial improvement in performance by utilizing this unit.  For the HINT results above, I used a modified version of the HINT code, which I wrote to use the special Altivec instructions.
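
As a rough illustration of "four floats at a time" (this is not taken from my modified HINT code), the sketch below adds two arrays in groups of four. It assumes a compiler that provides the altivec.h intrinsics vec_ld, vec_add and vec_st and defines __ALTIVEC__ when Altivec is enabled; the compiler listed in the table above may use different syntax. On any other machine it falls back to four ordinary scalar additions, so the result is the same either way. The function name add_floats is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Compute c[i] = a[i] + b[i] for n floats.
   Assumptions: n is a multiple of 4, and on the Altivec path all three
   pointers are 16-byte aligned (vec_ld and vec_st require this). */
void add_floats(const float *a, const float *b, float *c, size_t n)
{
    size_t i;

    assert(n % 4 == 0);
    for (i = 0; i < n; i += 4) {
#ifdef __ALTIVEC__
        /* One 128-bit load per operand, one add, one store: four floats per step. */
        vector float va = vec_ld(0, a + i);
        vector float vb = vec_ld(0, b + i);
        vec_st(vec_add(va, vb), 0, c + i);
#else
        /* Portable fallback: the same four additions, done one at a time. */
        c[i]     = a[i]     + b[i];
        c[i + 1] = a[i + 1] + b[i + 1];
        c[i + 2] = a[i + 2] + b[i + 2];
        c[i + 3] = a[i + 3] + b[i + 3];
#endif
    }
}
```

Even in a loop this simple, the vector path still has to load and store every value, which is one reason the theoretical fourfold speedup is rarely reached.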

Questions? Comments? Send email to coult011@tc.umn.edu.
Last modified May 26, 2000.

[ Nicholas Coult's Home Page | Institute for Mathematics and Its Applications ]