On 07 Nov 2005 14:22:37 -0500, Greg Stark <gsstark@mit.edu> wrote:
> IIRC, floating point registers are actually longer than a double so if the
> entire calculation is done in registers and then the result rounded off to
> store in memory it may get the right answer. Whereas if it loses the extra
> bits on the intermediate values (the infinite repeating fractions) that might
> be where you get the imprecise results.
Hm. I thought -march=pentium4 -mcpu=pentium4 implies -mfpmath=sse.
SSE is a much better choice on P4 for performance reasons, and never
has excess precision. I'm guessing from the above that I'm incorrect,
in which case we should always be compiled with -mfpmath=sse -msse2
when we are complied -march=pentium4, this should remove problems
caused by excess precision. The same behavior can be had on non sse
platforms with -ffloat-store.