<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="GENERATOR" content="Mozilla/4.7 [en] (X11; I; Linux 2.0.36 i686) [Netscape]">
   <title>Nicholas Coult's HINT Benchmark Page</title>
<!--#exec cmd="/home/coult/public_html/log.cgi"-->
</head>
<body text="#000000" bgcolor="#FFFFFF" link="#3366FF" vlink="#A4A430" alink="#FF0000">

<center><b><font size=+1>How much does the AltiVec unit on Motorola's G4
chip improve floating-point performance?</font></b></center>

<p><b><font size=+1>On my modified HINT floating-point benchmark, comparing
AltiVec-enabled code with non-AltiVec code on the same machine (a Power
Macintosh 7300 with a 450 MHz G4 processor upgrade) gives:</font></b>
<center>
<p><b><font color="#330000"><font size=+1>39% performance increase overall (net MQUIPS);
58% improvement in peak performance (peak MQUIPS).</font></font></b>
<p><b><font color="#000000"><font size=+1>Without AltiVec, performance
matches a 600 MHz Pentium III 'Coppermine.'</font></font></b>
<br><b><font color="#000000"><font size=+1>With AltiVec, performance approaches
that of a 500 MHz Alpha 21264.</font></font></b></center>

<p><img SRC="hint.gif" height=599 width=800>
<p>Note that the performance of the AltiVec-enabled code depends heavily on my skill
and knowledge of AltiVec programming; this is the first code I have written
for AltiVec.
<br>&nbsp;
<br>&nbsp;
<br>&nbsp;
<table BORDER COLS=5 WIDTH="100%" NOSAVE >
<tr>
<td><b>CPU type</b></td>

<td>
<center><font color="#330033"><font size=+2>G4, no AltiVec</font></font></center>
</td>

<td>
<center><font color="#330033"><font size=+2>G4, AltiVec</font></font></center>
</td>

<td>
<center><font color="#330033"><font size=+2>Pentium III/Coppermine</font></font></center>
</td>

<td>
<center><font color="#330033"><font size=+2>Alpha 21264</font></font></center>
</td>
</tr>

<tr>
<td><b>net MQUIPS</b></td>

<td>
<center><font size=+2>15.31</font></center>
</td>

<td>
<center><font size=+2>21.30</font></center>
</td>

<td>
<center><font size=+2>15.42</font></center>
</td>

<td>
<center><font size=+2>27.62</font></center>
</td>
</tr>

<tr>
<td><b>peak MQUIPS</b></td>

<td>
<center><font size=+2>2.22</font></center>
</td>

<td>
<center><font size=+2>3.51</font></center>
</td>

<td>
<center><font size=+2>2.21</font></center>
</td>

<td>
<center><font size=+2>3.94</font></center>
</td>
</tr>

<tr>
<td><b>CPU clock speed</b></td>

<td>
<center>450 MHz</center>
</td>

<td>
<center>&nbsp;same</center>
</td>

<td>
<center>600 MHz</center>
</td>

<td>
<center>500 MHz</center>
</td>
</tr>

<tr>
<td><b>L2 cache size</b></td>

<td>
<center>1024 kilobytes</center>
</td>

<td>
<center>&nbsp;same</center>
</td>

<td>
<center>256 kilobytes</center>
</td>

<td>
<center>4 megabytes</center>
</td>
</tr>

<tr>
<td><b>L2 cache speed</b></td>

<td>
<center>225 MHz</center>
</td>

<td>
<center>&nbsp;same</center>
</td>

<td>
<center>600 MHz</center>
</td>

<td>
<center>200 MHz</center>
</td>
</tr>

<tr>
<td><b>Operating System</b></td>

<td>
<center>Linux 2.2.15</center>
</td>

<td>
<center>&nbsp;same</center>
</td>

<td>
<center>Linux 2.0.35</center>
</td>

<td>
<center>Linux 2.2.12</center>
</td>
</tr>

<tr>
<td><b>Compiler</b></td>

<td>
<center>gcc-vec-2.9.2 -fvec -O3 -funroll-all-loops</center>
</td>

<td>
<center>&nbsp;same</center>
</td>

<td>
<center>pgcc-3.1 -fast</center>
</td>

<td>
<center>ccc -fast</center>
</td>
</tr>
</table>
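<p>As a quick check, the two headline percentages can be recomputed from the
G4 columns of the table above. The short Python sketch below does only that
arithmetic; the function name <b>percent_gain</b> is my own label for
illustration and is not part of HINT.

```python
# Values copied from the table above (MQUIPS on the 450 MHz G4).
net_no_altivec, net_altivec = 15.31, 21.30
peak_no_altivec, peak_altivec = 2.22, 3.51


def percent_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


net_gain = percent_gain(net_no_altivec, net_altivec)
peak_gain = percent_gain(peak_no_altivec, peak_altivec)

print(f"net gain:  {net_gain:.0f}%")   # net gain:  39%
print(f"peak gain: {peak_gain:.0f}%")  # peak gain: 58%
```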

<p>For more information on the HINT benchmark, see the <a href="http://www.scl.ameslab.gov/Projects/HINT">original
web page</a> at Ames Laboratory.
<p>See my <a href="http://www.ima.umn.edu/~coult/soft.html">software download
page</a> to download the original source code for HINT (covered under the
GPL), as well as my modified HINT benchmark.&nbsp; Due to the nature of
the computations, results from the original and modified versions may in
fact be directly compared.
<p><b><u>HINT Benchmark:</u></b>
<br>Most benchmarks measure either the number of operations that can be
performed in a given time period, or the time required to perform a given
fixed calculation.&nbsp; HINT does neither; rather, it performs a particular
calculation (estimating upper and lower bounds for the definite integral
of a monotone-decreasing function) with ever-increasing accuracy.&nbsp;
The accuracy of the result at any given time is called the "Quality"; we
may measure the improvement in quality at any given time as "Quality Improvements
per Second," or QUIPS.&nbsp; As the computation progresses and the quality
of the result improves, more memory and more operations are required to
improve the answer.&nbsp; The full HINT benchmark is a plot of QUIPS versus
time for a given data type.&nbsp; Net MQUIPS is the area under the HINT curve
with time plotted logarithmically, in millions of QUIPS.&nbsp; Higher
is better.&nbsp; HINT curves are a function of raw CPU processing
power, L1 and L2 cache size and speed, and main-memory bandwidth.
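<p>To make the idea concrete, here is a minimal Python sketch of this kind of
ever-improving computation. It is not the HINT source code. It rests on
several assumptions of mine: the integrand (1-x)/(1+x) on [0, 1], the rule of
always splitting the subinterval with the widest gap between its bounds, and
Quality taken as the reciprocal of the total gap between the upper and lower
bounds. The names <b>f</b>, <b>gap</b> and <b>refine</b> are illustrative only.

```python
import heapq
import math


def f(x):
    # Assumed integrand: monotone decreasing on [0, 1].
    return (1.0 - x) / (1.0 + x)


def gap(a, b):
    # For a decreasing f on [a, b]: f(b)*(b-a) <= integral <= f(a)*(b-a),
    # so the uncertainty contributed by this subinterval is the difference.
    return (f(a) - f(b)) * (b - a)


def refine(steps):
    """Return (lower, upper, quality) after `steps` subinterval splits."""
    # Max-heap on the gap (stored negated, since heapq is a min-heap).
    heap = [(-gap(0.0, 1.0), 0.0, 1.0)]
    for _ in range(steps):
        _, a, b = heapq.heappop(heap)   # subinterval with the widest gap
        mid = 0.5 * (a + b)
        heapq.heappush(heap, (-gap(a, mid), a, mid))
        heapq.heappush(heap, (-gap(mid, b), mid, b))
    lower = sum(f(b) * (b - a) for _, a, b in heap)
    upper = sum(f(a) * (b - a) for _, a, b in heap)
    return lower, upper, 1.0 / (upper - lower)


exact = 2.0 * math.log(2.0) - 1.0       # closed form for the assumed integrand
for steps in (0, 10, 100, 1000):
    lower, upper, quality = refine(steps)
    print(f"{steps:5d} splits: {lower:.6f} <= {exact:.6f} <= {upper:.6f}"
          f"  quality = {quality:.1f}")
```

<p>Each split adds one more subinterval to the list, so memory use grows along
with the quality, which is why the cache and main-memory effects described
above show up in a HINT curve. Dividing the quality by the elapsed time would
give a QUIPS-style figure; timing is left out here so that the output is
repeatable.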
<p><b><u>Vector processing:</u></b>
<br>The Motorola PowerPC 7400 (aka 'G4') has a special processing unit
called AltiVec.&nbsp; This unit can perform a given
computation (say, adding two numbers together) simultaneously on groups
of input data 128 bits long.&nbsp; Since a single-precision floating-point
number is 32 bits long, the AltiVec unit can theoretically
perform four floating-point operations per cycle.&nbsp; This is rarely
achieved in practice.&nbsp; However, one can still achieve a substantial
improvement in performance by using this unit.&nbsp; For the HINT results
above, I used a modified version of the HINT code, which I wrote to use
the special AltiVec instructions.
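<p>The lane arithmetic can be illustrated with a small Python emulation. This
is only a sketch of the data layout, not vector code and not the modified
HINT code: it packs four 32-bit floats into one 16-byte (128-bit) value and
adds two such values lane by lane. The helper names <b>pack</b> and
<b>vector_add</b> are hypothetical; real AltiVec code written in C would use
vector types and intrinsics such as vec_add, where all four lanes are handled
by a single instruction.

```python
import struct

VECTOR_BITS = 128                    # width of one vector register, per the text
FLOAT_BITS = 32                      # single-precision float
LANES = VECTOR_BITS // FLOAT_BITS    # 4 floats per register
FORMAT = f">{LANES}f"                # big-endian byte order, as on PowerPC


def pack(values):
    """Pack LANES floats into one 16-byte emulated register."""
    return struct.pack(FORMAT, *values)


def vector_add(reg_a, reg_b):
    """Emulate a lane-wise single-precision add of two packed registers."""
    a = struct.unpack(FORMAT, reg_a)
    b = struct.unpack(FORMAT, reg_b)
    # In hardware, these LANES additions are one operation, not a loop.
    return pack(x + y for x, y in zip(a, b))


x = pack([1.0, 2.0, 3.0, 4.0])
y = pack([10.0, 20.0, 30.0, 40.0])
result = vector_add(x, y)

print(LANES, "lanes,", len(result), "bytes per register")
print(struct.unpack(FORMAT, result))   # (11.0, 22.0, 33.0, 44.0)
```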
<center>
<p><font size=-1>Questions? Comments? Send email to <a href="mailto:coult011@tc.umn.edu">coult011@tc.umn.edu</a>.</font>
<br><font size=-1>Last modified May 26, 2000.</font>
<p>[ <a href="http://www.ima.umn.edu/~coult/">Nicholas Coult's Home Page</a>
|&nbsp; <a href="http://www.ima.umn.edu/">Institute for Mathematics and
Its Applications</a> ]</center>

<p><br>
<br>
</body>
</html>

