Gregory Maxwell wrote:
> On 07 Nov 2005 14:22:37 -0500, Greg Stark <gsstark@mit.edu> wrote:
>
>>IIRC, floating point registers are actually longer than a double so if the
>>entire calculation is done in registers and then the result rounded off to
>>store in memory it may get the right answer. Whereas if it loses the extra
>>bits on the intermediate values (the infinite repeating fractions) that might
>>be where you get the imprecise results.
>
>
> Hm. I thought -march=pentium4 -mcpu=pentium4 implies -mfpmath=sse.
> SSE is a much better choice on P4 for performance reasons, and never
> has excess precision. I'm guessing from the above that I'm incorrect,
> in which case we should always be compiled with -mfpmath=sse -msse2
> when we are compiled with -march=pentium4; this should remove problems
> caused by excess precision. The same behavior can be had on non-SSE
> platforms with -ffloat-store.

Just for the record (and for those interested): building with
'CFLAGS=-O2 -mcpu=pentium4 -march=pentium4 -mfpmath=sse -msse2' does
pass the regression tests.
Best Regards,
Michael Paesold