Something seems to be making Dreamwidth respond really slowly, so I'll keep this short.
Fine-tuned the benchmark code and added Rexx scripting to execute and time it. Interesting results.
The benchmark estimates the value of PI to 10 decimal places, using Simpson's rule to find the area of one quarter of a unit circle. The same source code, compiled with SAS/C 6.3 on the Amiga and with gcc 4.4.3 on Linux, yields the times below. The results were identical in every case and correct to the tenth decimal place.
Amiga 3000T (actual hardware, AmigaOS 2.04): 12 minutes, 20 seconds
Amiga 3000 (E-UAE emulation, AmigaOS 3.1): 5.2 seconds (amazing but true)
Intel P4 2.4 GHz (Ubuntu 10.04 LTS, gcc): 0.22 seconds (even I can't believe this)
I should note that the Z80A TRS-80 4P emulation takes 70 minutes to achieve 6 decimal places. I haven't tried to push it any farther than that.
The Amiga 3000T has a Motorola 68020 CPU and 68881 math co-processor. Running the same code using software double precision or IEEE libraries is slower and yields less precision than the hardware floating point.
I expected an obvious difference in speed, but not to this degree. I'm both impressed and puzzled.
Addendum, July 27: This morning I booted the DEC Alpha (an old one, only 433 MHz, running VMS 8.3 and the HP/Compaq VMS C compiler) and tried the PI code on it. It reached 10 decimal places in about 3 seconds, so I pushed it up to 12 decimal places, which took 46 seconds to complete. On the Linux system with gcc, 12 decimal places took about 3 seconds. Going from 10 to 12 decimal places cost very roughly an order of magnitude in time on both machines.
It's in the nature of the algorithm that each additional power of ten of precision takes about twice as many calculations as the one before it, and of course on multiprocessing systems other activity can affect the timing, so this all seems to be in order.