Re: gcc versus division-by-zero traps - Mailing list pgsql-hackers

From David Fetter
Subject Re: gcc versus division-by-zero traps
Msg-id 20090903171522.GT8410@fetter.org
In response to gcc versus division-by-zero traps  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: gcc versus division-by-zero traps
List pgsql-hackers
On Thu, Sep 03, 2009 at 10:24:17AM -0400, Tom Lane wrote:
> We have seen several previous reports of regression test failures
> due to division by zero causing SIGFPE, even though the code should
> never reach the division command:
> 
> http://archives.postgresql.org/pgsql-bugs/2006-11/msg00180.php
> http://archives.postgresql.org/pgsql-bugs/2007-11/msg00032.php
> http://archives.postgresql.org/pgsql-bugs/2008-05/msg00148.php
> http://archives.postgresql.org/pgsql-general/2009-05/msg00774.php
> 
> It's always been on non-mainstream architectures so it was hard to
> investigate.  But I have finally been able to reproduce this:
> https://bugzilla.redhat.com/show_bug.cgi?id=520916
> 
> While s390x is still not quite mainstream, at least I can get
> access to one ;-).

Do you also have access to z/OS with Unix System Services?  IBM's
compiler, c89, is amazingly strict, and should help us flush out bugs. :)

> What turns out to be the case is that
> "simple" test cases like
>     if (y == 0)
>         single_function_call(...);
>     z = x / y;
> do not show the problem; you need something pretty complex in the
> if-command.  Like, say, an ereport() construct.  So that's why the gcc
> boys haven't already had visits from mobs of villagers about this.
> 
> I hope that the bug will get fixed in due course, but even if they
> respond pretty quickly it will be years before the problem disappears
> from every copy of gcc in the field.  So I'm thinking that it would
> behoove us to install a workaround, now that we've characterized the
> problem sufficiently.  What I am thinking is that in the three
> functions known to exhibit the bug (int24div, int28div, int48div)
> we should do something like this:
> 
> 
>     if (arg2 == 0)
> +    {
>         ereport(ERROR,
>                 (errcode(ERRCODE_DIVISION_BY_ZERO),
>                  errmsg("division by zero")));
> +        /* ensure compiler realizes we don't reach the division */
> +        PG_RETURN_NULL();
> +    }
>     /* No overflow is possible */
>     PG_RETURN_INT64((int64) arg1 / arg2);
> 
> Thoughts?
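
For anyone who wants to look at the shape of that workaround outside the PostgreSQL tree, here is a minimal standalone sketch. It makes several assumptions: report_error(), div_guarded() and try_div() are hypothetical names, and report_error() only imitates ereport(ERROR, ...) by doing a longjmp instead of going through the real error machinery. It shows the proposed construct (an explicit return after the error call, inside braces) and does not reproduce the gcc bug itself.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the backend's error recovery point. */
static jmp_buf error_context;

/*
 * Hypothetical stand-in for ereport(ERROR, ...): print the message and
 * jump back to the recovery point.  It never returns to its caller, but
 * the compiler is not told so.
 */
static void
report_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(error_context, 1);
}

/* Same shape as the proposed fix: int32 dividend, int64 divisor. */
static int64_t
div_guarded(int32_t arg1, int64_t arg2)
{
    if (arg2 == 0)
    {
        report_error("division by zero");
        /* ensure compiler realizes we don't reach the division */
        return 0;
    }
    /* No overflow is possible */
    return (int64_t) arg1 / arg2;
}

/*
 * Returns true and stores the quotient in *result, or returns false if
 * div_guarded() raised an error.
 */
bool
try_div(int32_t arg1, int64_t arg2, int64_t *result)
{
    if (setjmp(error_context) != 0)
        return false;           /* we got here via report_error() */
    *result = div_guarded(arg1, arg2);
    return true;
}
```

Calling try_div(84, 2, &q) should store 42 in q and return true, while try_div(1, 0, &q) should return false without executing the division.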

How big would this change be?  How would people know to use that
construct everywhere it's appropriate?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

