$ time ../../x-povray +i chess2.pov +o jhpark.tga -L../../include/ +D -W320 -H200
$ pvmpov -i skyvase.pov +v1 -d +ft -x +a0.300 +r3 -q9 -w640 -h480 -mv2.0 +b1000 > & pvmpovray_13cpu.dat
Persistence of Vision(tm) Ray Tracer Version 3.02.Linux.gcc
This is an unofficial version compiled by:
Harald Deischinger - PVM-POV Version 3.1
The POV-Ray Team(tm) is not responsible for supporting this version.
Copyright 1997 POV-Ray Team(tm)
Initializing PVMPOV
Spawning /home/nurapt/jhpark/pvm3/bin/LINUX/pvmpov with 13 PVM tasks on 13 hosts...
...13 PVM tasks successfully spawned.
Waiting up to 120s for first slave to start...
Slave 0 successfully started.
Parsing Options
Input file: skyvase.pov (compatible to version 2.0)
Remove bounds........On Split unions........Off
Library paths:
Output Options
Image resolution 640 by 480 (rows 1 to 480, columns 1 to 640).
Output file: skyvase.tga, 24 bpp Targa, 1000 KByte buffer
Graphic display.....Off
Mosaic preview......Off
CPU usage histogram.Off
Continued trace.....Off Allow interruption..Off Pause when done.....Off
Verbose messages.....On
Tracing Options
Quality: 9
Bounding boxes.......On Bounding threshold: 25
Light Buffer.........On Vista Buffer.........On
Antialiasing.........On (Method 1, Threshold 0.300, Depth 3, Jitter 1.00)
Radiosity...........Off
Animation Options
Clock value.... 0.000 (Animation off)
PVM Options
Block Width.... 32 Block Height... 32
PVM Tasks...... 13
PVM Nice....... 5
PVM Arch.......
PVM Slave...... /home/nurapt/jhpark/pvm3/bin/LINUX/pvmpov
PVM WorkingDir. /home/nurapt/jhpark/SYSBENCH/povray3/include
Redirecting Options
All Streams to console..........On
Debug Stream to console.........On
Fatal Stream to console.........On
Render Stream to console........On
Statistics Stream to console....On
Warning Stream to console.......On
Starting frame 0...
Slave 1 at n05 successfully started.
Slave 2 at n02 successfully started.
Slave 3 at n03 successfully started.
Slave 4 at n04 successfully started.
Slave 5 at n06 successfully started.
Slave 6 at n07 successfully started.
Slave 7 at n08 successfully started.
Slave 8 at n09 successfully started.
Slave 9 at n10 successfully started.
Slave 10 at n11 successfully started.
Slave 11 at n12 successfully started.
Slave 12 at n01 successfully started.
Finishing frame 0...rtw. 480
Waiting for remaining slave stats.
PVM Task Distribution Statistics:
host name [ done ] [ late ] host name [ done ] [ late ]
n02 [ 8.00%] [ 0.17%] n03 [ 7.33%] [ 0.46%]
n04 [ 7.33%] [ 0.21%] n05 [ 8.00%] [ 0.00%]
n06 [ 8.00%] [ 0.88%] n07 [ 7.67%] [ 0.65%]
n08 [ 7.00%] [ 0.00%] n09 [ 8.33%] [ 0.00%]
n10 [ 8.00%] [ 0.08%] n11 [ 7.67%] [ 0.08%]
n12 [ 8.00%] [ 0.23%] beowulf [ 7.00%] [ 0.00%]
n01 [ 7.67%] [ 0.25%]
POV-Ray statistics for finished frames:
skyvase.pov Statistics (Partial Image Rendered), Resolution 640 x 480
----------------------------------------------------------------------------
Pixels: 28000 Samples: 36168 Smpls/Pxl: 1.29
Rays: 135186 Saved: 0 Max Level: 0/5
----------------------------------------------------------------------------
Ray->Shape Intersection Tests Succeeded Percentage
----------------------------------------------------------------------------
Cone/Cylinder 211840 124223 58.64
CSG Intersection 336063 34558 10.28
CSG Union 248446 72322 29.11
Plane 2366846 1312043 55.43
Quadric 248446 161830 65.14
Sphere 248446 74511 29.99
Bounding Object 211840 124223 58.64
----------------------------------------------------------------------------
Calls to Noise: 154566 Calls to DNoise: 279836
----------------------------------------------------------------------------
Shadow Ray Tests: 331300 Succeeded: 10262
Reflected Rays: 99018
----------------------------------------------------------------------------
Smallest Alloc: 12 bytes Largest: 1024008
Peak memory used: 2040599 bytes
----------------------------------------------------------------------------
Time For Trace: 0 hours 0 minutes 14.0 seconds (14 seconds)
Total Time: 0 hours 0 minutes 14.0 seconds (14 seconds)
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 132 : 3.39 : 1.11
STRING SORT : 13.752 : 6.14 : 0.95
BITFIELD : 4.2404e+07 : 7.27 : 1.52
FP EMULATION : 6.9951 : 3.36 : 0.77
FOURIER : 3140.4 : 3.57 : 2.01
ASSIGNMENT : 2.2157 : 8.43 : 2.19
IDEA : 247.62 : 3.79 : 1.12
HUFFMAN : 123.98 : 3.44 : 1.10
NEURAL NET : 2.9055 : 4.67 : 1.96
LU DECOMPOSITION : 79.464 : 4.12 : 2.97
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 4.765
FLOATING-POINT INDEX: 4.094
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : gcc version egcs-2.91.57 19980901 (egcs-1.1 release)
libc : unknown version
MEMORY INDEX : 1.467
INTEGER INDEX : 1.015
FLOATING-POINT INDEX: 2.271
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 3000000, Offset = 0
Total memory required = 68.7 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 199999 microseconds.
   (= 20 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         200.0000       0.2480       0.2400       0.2500
Scale:        200.0000       0.2480       0.2400       0.2500
Add:          240.0000       0.3070       0.3000       0.3100
Triad:        205.7143       0.3530       0.3500       0.3600
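The rates in the table above are consistent with the usual STREAM accounting, in which Copy and Scale touch two arrays per iteration, Add and Triad touch three, and the rate is computed from the minimum time. A minimal Python sketch of that arithmetic follows, assuming 1 MB = 10^6 bytes for the rate column; the function and dictionary names are illustrative and not part of STREAM itself.

```python
# Sketch: recompute the STREAM rates from the reported array size and best times.
# Assumption: 1 MB = 1e6 bytes for the rate column; all names here are illustrative.

ARRAY_SIZE = 3_000_000   # elements per array, from "Array size = 3000000"
BYTES_PER_WORD = 8       # from "8 bytes per DOUBLE PRECISION word"

# Number of arrays read or written per iteration by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# Best time in seconds, taken from the "Min time" column.
MIN_TIME = {"Copy": 0.24, "Scale": 0.24, "Add": 0.30, "Triad": 0.35}


def stream_rate_mb_per_s(kernel: str) -> float:
    """Bytes moved per iteration divided by the best time, in MB/s."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * ARRAY_SIZE
    return bytes_moved / MIN_TIME[kernel] / 1e6


if __name__ == "__main__":
    for kernel in ARRAYS_TOUCHED:
        print(f"{kernel:6s} {stream_rate_mb_per_s(kernel):10.4f} MB/s")
```

The "Total memory required" figure appears to use 2^20 bytes per MB instead: three arrays of 3,000,000 eight-byte words are 72,000,000 bytes, which is about 68.7 MB on that basis.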
Compare another processor results
NAS Parallel Benchmarks 2.3-serial version - BT Benchmark
Class = W
Size = 24x 24x 24
Iterations = 200
Time in seconds = 157.13
Mop/s total = 49.12
Operation type = floating point
Verification = SUCCESSFUL
Version = 2.3
Compile date = 07 Nov 1998
Compare another cluster results
NAS Parallel Benchmarks 2.3 -- BT Benchmark
Class = A
Size = 64x 64x 64
Iterations = 200
Time in seconds = 543.88
Total processes = 9
Compiled procs = 9
Mop/s total = 309.42
Mop/s/process = 34.38
Operation type = floating point
Verification = SUCCESSFUL
Version = 2.3
Compile date = 26 Dec 1998
[jhpark@n12 qcdmpi]# mpirun -np 8 qcd
Input Beta=
6.0
Input Number of Sweeps=
4
Input Random Number Key=
0
----------------------------------------------
QCDMPI program version 1.5, by S.Hioki, 98/7/8
beta = 6.00000
lattice sizes = 8 8 8 8
on 2 2 2 1
nsweep, iseed = 4 0
----------------------------------------------
sweep, plaq, t_total, t_comm 1 .772989 .279 .182
sweep, plaq, t_total, t_comm 2 .646477 .190 .094
sweep, plaq, t_total, t_comm 3 .618864 .159 .063
sweep, plaq, t_total, t_comm 4 .609579 .159 .063
***** QCD PERFORMANCE (from last sweep data)***
link update time = 9.717590 micro sec/link
comm bandwidth = 10.603260 Mega Byte/sec
***********************************************
[jhpark@n12 qcdmpi]#
8 CPU : 6 minutes 16 seconds
4 CPU : 12 minutes 31 seconds
2 CPU : 24 minutes 22 seconds
1 CPU : 48 minutes 55 seconds
8 CPU : 24 minutes 29 seconds
4 CPU : 45 minutes 45 seconds
2 CPU : 90 minutes 10 seconds
1 CPU : 173 minutes 30 seconds
(HP-C180: 170 minutes 25 seconds)
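The two timing lists above can be converted into speedup and parallel efficiency relative to the 1 CPU run. A minimal Python sketch follows, assuming speedup = T(1) / T(n) and efficiency = speedup / n; the names FIRST_SET, SECOND_SET, to_seconds and scaling are illustrative labels, since the lists themselves carry no titles.

```python
# Sketch: speedup and parallel efficiency from the wall-clock timings listed above.
# Assumption: speedup = T(1) / T(n) and efficiency = speedup / n.
# All names are illustrative; the source does not label the two lists.


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'M minutes S seconds' timing to seconds."""
    return minutes * 60 + seconds


# CPU count -> wall-clock time in seconds.
FIRST_SET = {
    8: to_seconds(6, 16),
    4: to_seconds(12, 31),
    2: to_seconds(24, 22),
    1: to_seconds(48, 55),
}
SECOND_SET = {
    8: to_seconds(24, 29),
    4: to_seconds(45, 45),
    2: to_seconds(90, 10),
    1: to_seconds(173, 30),
}


def scaling(timings: dict) -> dict:
    """Return {cpus: (speedup, efficiency)} relative to the 1 CPU time."""
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}


if __name__ == "__main__":
    for label, timings in (("first set", FIRST_SET), ("second set", SECOND_SET)):
        print(label)
        for n, (speedup, efficiency) in scaling(timings).items():
            print(f"  {n} CPU: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

On 8 CPUs this works out to a speedup of roughly 7.8 for the first list and roughly 7.1 for the second.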