Slowness

Jul. 26th, 2011 10:44 pm
[personal profile] altivo
Something seems to be making Dreamwidth respond really slowly, so I'll keep this short.

Fine-tuned the benchmark code and added a Rexx script to execute and time it. Interesting results.

The benchmark estimates the value of PI to 10 decimal places, using Simpson's rule to find the area of one quarter of a unit circle. The same source code, compiled with SAS/C 6.3 on the Amiga and gcc 4.4.3 on Linux, yields the times below. The results were the same in all cases, and correct to the tenth decimal place.
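For anyone who wants to try the same idea, here is a minimal sketch in Python rather than the C of the actual benchmark, whose source isn't shown here. The function names (`quarter_circle`, `simpson`, `estimate_pi`) and the step counts are my own choices for illustration, not taken from the original program.

```python
import math

def quarter_circle(x):
    # Height of the unit circle above the x axis.
    # max() guards against a tiny negative value from rounding near x = 1.
    return math.sqrt(max(0.0, 1.0 - x * x))

def simpson(f, a, b, n):
    # Composite Simpson's rule on [a, b] with n equal intervals (n must be even).
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points get weight 2.
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def estimate_pi(n):
    # The quarter circle has area pi/4, so multiply by 4.
    return 4.0 * simpson(quarter_circle, 0.0, 1.0, n)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        approx = estimate_pi(n)
        print(n, approx, abs(math.pi - approx))
```

Running it prints the estimate and its error for each step count, which makes it easy to see how slowly the digits arrive.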

Amiga 3000T (actual hardware Amiga system with AmigaOS 2.04): 12 minutes, 20 seconds.
Amiga 3000 (E-UAE emulation, with AmigaOS 3.1): 5.2 seconds (amazing but true).
Intel P4 2.4 GHz (Ubuntu 10.04 LTS, Linux/gcc): 0.22 seconds (even I can't believe this).

I should note that the Z80A TRS-80 4P emulation takes 70 minutes to achieve 6 decimal places. I haven't tried to push it any farther than that.

The Amiga 3000T has a Motorola 68020 CPU and 68881 math co-processor. Running the same code using software double precision or IEEE libraries is slower and yields less precision than the hardware floating point.

I expected an obvious difference in speed, but not to this degree. I'm both impressed and puzzled.

Addendum, July 27: This morning I booted the DEC Alpha (old one, only 433MHz, with VMS 8.3 and HP/Compaq/VMS C compiler) and tried the PI code on it. It ran the 10 decimal places in about 3 seconds, so I pushed it up to 12 decimal places which took it 46 seconds to complete. Tried the Linux system and gcc, and the 12 decimal places took about 3 seconds. This is very roughly an order of magnitude in time for both machines to go from 10 to 12 decimal places.

It's the nature of the algorithm that each additional decimal place takes several times as many calculations as the one before it (the timings above work out to a bit under four times per digit), and of course on multiprocessing systems there can be other things that affect the timing, so this all seems to be in order.
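As a quick check on the addendum's numbers, the growth in run time per extra decimal place can be worked out from the rounded timings quoted there. This is only back-of-the-envelope arithmetic in Python; the helper name `per_digit_factor` is made up for the example, and the inputs are the "about 3 seconds" style figures from the post, so the results are rough.

```python
def per_digit_factor(time_low, time_high, extra_digits):
    # If run time grows by the same factor for every added digit,
    # that factor is the extra_digits-th root of the overall ratio.
    return (time_high / time_low) ** (1.0 / extra_digits)

# DEC Alpha: about 3 s for 10 places, 46 s for 12 places.
alpha = per_digit_factor(3.0, 46.0, 2)

# Linux/gcc on the P4: 0.22 s for 10 places, about 3 s for 12 places.
linux = per_digit_factor(0.22, 3.0, 2)

if __name__ == "__main__":
    print("Alpha: x%.1f per digit" % alpha)
    print("Linux: x%.1f per digit" % linux)
```

Both machines come out somewhere between three and a half and four times the work per additional digit, which is consistent with "very roughly an order of magnitude" for two digits.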

Date: 2011-07-27 08:23 am (UTC)
From: [personal profile] baphnedia
Now I'm just waiting to see how technical support deals with you when you ask Electronic Arts for help on Deluxe Paint.

Date: 2011-07-27 01:13 pm (UTC)
From: [personal profile] lhexa
That's actually surprisingly slow for the P4, considering how efficient an algorithm Simpson's rule is. It's possible that you have a poorly optimized square root function in there, eating up most of the time. Newton's method would be the fast way to calculate a square root, but maybe your library uses brute force instead.
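For reference, Newton's method for a square root looks roughly like the sketch below. This is an illustration in Python, not what SAS/C, gcc's library, or any FPU actually does (many FPUs compute square roots in hardware); `newton_sqrt`, its tolerance, and its iteration cap are hypothetical choices for the example.

```python
import math

def newton_sqrt(value, tolerance=1e-15, max_iterations=50):
    # Newton's method applied to g(r) = r*r - value gives the update
    # r_next = (r + value / r) / 2, which converges quickly once close.
    if value < 0.0:
        raise ValueError("cannot take the square root of a negative number")
    if value == 0.0:
        return 0.0
    guess = value if value >= 1.0 else 1.0  # any positive start works
    for _ in range(max_iterations):
        next_guess = 0.5 * (guess + value / guess)
        if abs(next_guess - guess) <= tolerance * next_guess:
            return next_guess
        guess = next_guess
    return guess

if __name__ == "__main__":
    for v in (0.25, 2.0, 10.0):
        print(v, newton_sqrt(v), math.sqrt(v))
```

Each iteration costs one division, one addition, and one halving, so only a handful of passes are needed for double precision from a reasonable starting guess.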

Date: 2011-07-30 10:36 pm (UTC)
From: [personal profile] lhexa
Ah, I underestimated the number of steps needed. I'm seeing the expected dependence of the error on the square of the step size, but even with that, one more digit requires more than tripling the steps taken.
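One way to look at this directly is to double the number of steps repeatedly and watch how much the error shrinks each time. The sketch below does that in Python for the quarter-circle integrand; it is my own illustration, not the code used in the post or in this comment, and `simpson_pi` and `errors_when_doubling` are made-up names. Note that the slope of sqrt(1 - x*x) is unbounded at x = 1, so the textbook error bound for Simpson's rule on smooth functions does not apply here; the script just reports what it measures.

```python
import math

def simpson_pi(n):
    # Composite Simpson's rule for 4 * integral of sqrt(1 - x^2) on [0, 1].
    # n must be even; max() guards against rounding just below zero.
    h = 1.0 / n
    f = lambda x: math.sqrt(max(0.0, 1.0 - x * x))
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return 4.0 * total * h / 3.0

def errors_when_doubling(start=8, doublings=8):
    # Returns a list of (steps, absolute error), doubling steps each time.
    results = []
    n = start
    for _ in range(doublings):
        results.append((n, abs(math.pi - simpson_pi(n))))
        n *= 2
    return results

if __name__ == "__main__":
    previous = None
    for n, err in errors_when_doubling():
        ratio = "" if previous is None else "  (error shrank x%.2f)" % (previous / err)
        print("%6d steps: error %.3e%s" % (n, err, ratio))
        previous = err
```

The printed ratio shows how much each doubling of the work buys, which in turn says how many more steps one extra digit will cost.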
