Thread: Floating-point software assist fault?

Floating-point software assist fault?

From: "Ed L."
We're seeing gobs of these via dmesg in PostgreSQL 8.3.3 on
ia64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404
(Red Hat 3.4.6-8), kernel 2.6.9-55.EL:

postmaster(13144): floating-point assist fault at ip
40000000003a9382, isr 0000040000000008

It appears to be an Itanium-specific issue with floating-point
normalization. Here is a document describing the issue:

http://i-cluster2.inrialpes.fr/doc/misc/fpswa.txt

“The Intel Itanium does not fully support IEEE denormals and
requires software assistance to handle them. Without further
informations, the ia64 GNU/Linux kernel triggers a fault when
denormals are computed. This is the "floating-point software
assist" fault (FPSWA) in the kernel messages. It is the user's
task to clearly design his program to prevent such cases.”

“To conclude, I'd like to stress the fact that the programmer has
to be careful when dealing with floating-point numbers. Even
with high precision, it is easy to produce denormals and get
strange behaviour.”

Any thoughts?

TIA.

Ed
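
For readers unfamiliar with the term: a denormal (subnormal) is a nonzero value smaller in magnitude than the smallest normalized number of its type. The following is a minimal C sketch, not taken from PostgreSQL, showing one way to produce and detect one using the standard C99 fpclassify() macro. The helper names is_denormal and make_denormal are made up for illustration.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns 1 if x is a subnormal (denormal) double. */
int is_denormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/*
 * Hypothetical helper: DBL_MIN is the smallest positive *normalized*
 * double, so halving it yields a subnormal value under default IEEE 754
 * arithmetic (i.e. without flush-to-zero modes such as -ffast-math).
 */
double make_denormal(void)
{
    volatile double x = DBL_MIN;   /* volatile discourages constant folding */
    return x / 2.0;
}
```

Calling is_denormal(make_denormal()) should report a subnormal, while is_denormal(DBL_MIN) and is_denormal(0.0) should not. On hardware that handles subnormals in software, as the quoted document says of the Itanium, the division in make_denormal is the kind of operation that could trigger an assist.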

Re: Floating-point software assist fault?

From: "Ed L."
On Thursday 08/07/08 @ 5:43 pm MDT, I received this from "Ed L."
<pgsql@bluepolka.net>:
> We're seeing gobs of these via dmesg in PostgreSQL 8.3.3 on
> ia64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6
> 20060404 (Red Hat 3.4.6-8), kernel 2.6.9-55.EL:
>
> postmaster(13144): floating-point assist fault at ip
> 40000000003a9382, isr 0000040000000008

These are coming lately exclusively from the writer process...

TIA.

Ed

Re: Floating-point software assist fault?

From: "Ed L."
On Thursday 08/07/08 @ 5:46 pm MDT, I received this from "Ed L."
> > postmaster(13144): floating-point assist fault at ip
> > 40000000003a9382, isr 0000040000000008
>
> These are coming lately exclusively from the writer process...

Actually, the machine has been up for 45 days and dmesg doesn't
have timestamps, so I'm not sure if those pids have any relation
to the ones currently in use.

Ed

Re: Floating-point software assist fault?

From: Tom Lane
"Ed L." <pgsql@bluepolka.net> writes:
> We're seeing gobs of these via dmesg in PostgreSQL 8.3.3 on
> ia64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404
> (Red Hat 3.4.6-8), kernel 2.6.9-55.EL:

> postmaster(13144): floating-point assist fault at ip
> 40000000003a9382, isr 0000040000000008

See if you can trace that instruction pointer address to any specific
part of the PG code (gdb will help, if the executable is built with
--enable-debug).

PG isn't intentionally dealing in denormals, but I wonder if you've
found an edge case in, say, the code to manage spread-out checkpoints.
I believe that does do FP arithmetic, and it's in the bgwriter ...

            regards, tom lane
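
As a generic illustration of how code that never intends to use denormals can still end up computing them, here is a hedged C sketch. It is not PostgreSQL's code and makes no claim about where the fault above originates. It assumes a hypothetical single-precision moving average that keeps being updated with zero-valued samples, so it decays geometrically toward zero and eventually enters the subnormal range. The function name and parameters are invented for this example.

```c
#include <math.h>

/*
 * Hypothetical helper: apply avg += (sample - avg) / smoothing_samples
 * with sample fixed at 0, and return the number of updates performed
 * when avg first becomes subnormal. Returns -1 if that does not happen
 * within max_steps updates or if avg reaches exactly zero first.
 */
int steps_until_denormal(float start, float smoothing_samples, int max_steps)
{
    volatile float avg = start;    /* volatile forces rounding to float */
    int i;

    for (i = 1; i <= max_steps; i++)
    {
        avg += (0.0f - avg) / smoothing_samples;
        if (fpclassify(avg) == FP_SUBNORMAL)
            return i;
        if (avg == 0.0f)
            return -1;
    }
    return -1;
}
```

For example, steps_until_denormal(100.0f, 16.0f, 100000) returns a positive step count under default IEEE 754 arithmetic: an idle period long enough for the average to decay is all it takes. A common guard in such code is to clamp the value to zero once it falls below some small threshold.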