20 years ago (1992), the first floating-point DSP with a dedicated C compiler, the Texas Instruments TMS320C30 (30 MHz), took 16 microseconds to run the same code.
10 years ago (2002), the state-of-the-art floating-point DSP, the Texas Instruments TMS320C6701 (167 MHz), took 0.82 microseconds to run the same code.
Today (2012), a modern Pentium laptop (2.4 GHz) can run the same code in 0.13 microseconds.
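As a rough way to compare the three results, the timings can be converted into clock cycles and speed-up factors. The sketch below uses only the clock rates and run times quoted above, and assumes each device ran at its quoted clock rate for the whole measurement. The names `clock_cycles` and `speedup` are illustrative and not part of any library.

```python
# Figures quoted in this post: (device, clock in MHz, run time in microseconds).
RESULTS = [
    ("TMS320C30 (1992)", 30.0, 16.0),
    ("TMS320C6701 (2002)", 167.0, 0.82),
    ("Pentium laptop (2012)", 2400.0, 0.13),
]


def clock_cycles(clock_mhz, time_us):
    """Approximate cycle count: MHz * microseconds (the 1e6 factors cancel)."""
    return clock_mhz * time_us


def speedup(baseline_us, time_us):
    """How many times faster a run is than the baseline run."""
    return baseline_us / time_us


if __name__ == "__main__":
    baseline_us = RESULTS[0][2]  # use the TMS320C30 as the reference
    for name, mhz, us in RESULTS:
        cycles = clock_cycles(mhz, us)
        ratio = speedup(baseline_us, us)
        print(f"{name}: ~{cycles:.0f} cycles, {ratio:.1f}x the C30's speed")
```

On these figures the C30 needs roughly 480 cycles, the C6701 about 137 and the Pentium about 312, so part of the gain comes from the higher clock rate and part from doing more work per cycle.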
I may add another comment in the future comparing the cost and power consumption of these devices.
Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/