The best method of measuring a computer's performance
is to run benchmarks. Some suggestions from my
personal experience preparing a benchmark suite,
along with several updates and later benchmark
results, are presented in PDF format.
Lecture 2
Do not trust your computer's clock or the software
that reads and processes the time.
First: Test the wall clock time against your watch.
time_test.c
The program displays 0, 5, 10, 15 ... at 0 seconds,
5 seconds, 10 seconds etc.
Note the use of <time.h> and 'time()'
Then: Test CPU time; this should be just the time
used by the program that is running. With only
this program running, checking against your watch
should work.
time_cpu.c
The program displays 0, 5, 10, 15 ... at 0 seconds,
5 seconds, 10 seconds etc.
Note the use of <time.h> and
'(double)clock()/(double)CLOCKS_PER_SEC'
I have found one machine where the constant
CLOCKS_PER_SEC was completely wrong and
another machine with a value of 64 that should
have been 100. A computer used for real-time
applications could have a value of 1,000,000
or more.
More graphs of FFT benchmarks
The source code, C language, for the FFT benchmarks:
Note the check run to be sure the code works.
Note the non-uniform data to avoid special cases.
fft_time.c main program
fftc.h header file
FFT and inverse FFT for various numbers of complex data points
The same source code was used for all benchmark measurements.
These were optimized for embedded-computer use, where all
constants were burned into ROM.
fft16.c ifft16.c
fft32.c ifft32.c
fft64.c ifft64.c
fft128.c ifft128.c
fft256.c ifft256.c
fft512.c ifft512.c
fft1024.c ifft1024.c
fft2048.c ifft2048.c
fft4096.c ifft4096.c
Some of the result files:
P1-166MHz
P1-166MHz -O2
P2-266MHz
P2-266MHz -O2
Celeron-500MHz
P3-450MHz MS
P3-450MHz Linux
PPC-2.2GHz
PPC-2.5GHz
P4-2.53GHz XP
Alpha-533MHz XP
Xeon-2.8GHz
Athlon-1.4GHz MS
Athlon-1.4GHz XP
Athlon-1.4GHz SuSe
OK, since these runs are old and I did not want to change them,
they give some indication of performance on various machines
with various operating systems and compiler options.
To measure very short times, a higher-quality, double-difference
method is needed. The following program measures the time
to do a double-precision floating-point add. This time may be
smaller than 1 ns (10^-9 seconds).
A test harness is needed to calibrate the loops and to make sure
the compiler cannot apply dead-code elimination.
The item to be tested is placed in a copy of the test harness
to make the measurement.
The time of the empty test harness is its stop minus start time in seconds.
The time for the measurement is its stop minus start time in seconds.
The difference between the two, hence "double difference",
is the time for the item being measured.
Here A = A + B, with B not known to the compiler to be a
constant, is reasonably expected to compile to a single
instruction that adds B to a register. If not, we have timed
the full statement.
The double difference time must be divided by the total
number of iterations from the nested loops to get the
time for the computer to execute the item once.
An attempt is made to get a very stable time measurement.
Doubling the number of iterations should double the time.
Summary of double difference
t1 saved
run test harness
t2 saved
t3 saved
run measurement, test harness with item to be timed
t4 saved
tdiff = (t4-t3) - (t2-t1)
t_item = tdiff / number of iterations
check against the previous time; if not close, double the iterations and repeat
The source code is:
time_fadd.c
fadd on P4 2.53GHz
fadd on Xeon 2.66GHz
Some extra information for students wanting to explore their computer:
What is in my computer?
  Windows OS:  Start -> Control Panel -> System -> Device Manager -> Processor, etc.
  Linux OS:    cd /proc ; cat cpuinfo

What processes are running?
  Windows OS:  ctrl-alt-del -> Processes
  Linux OS:    ps -el  or  top

How do I easily time a program?
  Windows OS:  at a command prompt:
                 time
                 prog < input > output
                 time
  Linux OS:    time prog < input > output
The time available through normal software calls may be
updated anywhere from fewer than 30 times per second to more
than a million times per second. A general rule of thumb is to
have the time being measured be 10 seconds or more. This
will give a reasonably accurate time measurement on all
computers. Just repeat what is being measured if it does
not run for 10 seconds.
Some history about computer time reporting.
There were time-sharing systems where you bought time on
the computer by the CPU second. There is the CPU time
your program requires, usually called your process
time. There is also operating-system CPU time. When there
are multiple processes running, the operating system
time-slices, running each job for a short interval called
a quantum. The operating system must manage memory, devices,
scheduling, and related tasks. In the past we had to keep
a very close eye on how CPU time was charged to the user's
process versus the system's processes, and on whether "dead time",
the idle process, was charged to either. From a user's point
of view, the user did not request to be swapped out, thus
the user does not want any of the operating-system time
for stopping and restarting the user's process to be
charged to the user.
Another historic tidbit: some Unix systems would add
one microsecond to the time reported on each system
request for the time, never allowing the same time
to be reported twice even if the clock had not
updated. This ensured that all disk file times
were unique, so programs such as 'make' would
be reliable.
For more recent SPEC benchmarks, CPU 2000
For Intel Extreme X6800