Re: Test suite fails on alpha architecture - Mailing list pgsql-bugs

From Martin Pitt
Subject Re: Test suite fails on alpha architecture
Date
Msg-id 20071204224340.GH6765@piware.de
In response to Re: Test suite fails on alpha architecture  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Test suite fails on alpha architecture  (Martin Pitt <mpitt@debian.org>)
List pgsql-bugs
Hi,

Tom Lane [2007-11-07 13:49 -0500]:
> All the other diffs that Martin showed are divide-by-zero failures,
> and I do not see any of them on Gentoo's machine.  I think that this
> must be a compiler bug.  The first example in his diffs is just
> "select 1/0", which executes this code:
>
>     int32        arg1 = PG_GETARG_INT32(0);
>     int32        arg2 = PG_GETARG_INT32(1);
>     int32        result;
>
>     if (arg2 == 0)
>         ereport(ERROR,
>                 (errcode(ERRCODE_DIVISION_BY_ZERO),
>                  errmsg("division by zero")));
>
>     result = arg1 / arg2;
>
> It looks to me like Debian's compiler must be allowing the division
> instruction to be speculatively executed before the if-test branch
> is taken.  Perhaps it is supposing that this is OK because control
> will return from ereport(), when in fact it will not (the routine
> throws a longjmp).  Since we've not seen such behavior on any other
> platform, however, I suspect this is just a bug and not intentional.

I tried this on a Debian Alpha porter box (thanks, Steve, for pointing
me at it) with Debian's gcc 4.2.2. The latest sid indeed still has this
bug (the floor() one is confirmed fixed), and not only on alpha but
also on sparc.

Since the simple test case did not reproduce the error, I tried to
write a more sophisticated one that more closely resembles what
PostgreSQL does (sigsetjmp/siglongjmp instead of exit(), some macros,
etc.). Unfortunately that was in vain: the test case still works
perfectly, both without any compiler options and with the ones used
for PostgreSQL. I attach it here nevertheless, just in case someone
has more luck than I did.
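
For readers without the attachment, a test case of the kind described
above might look roughly like the sketch below. It is not the attached
file: report_error(), REPORT_ERROR(), divide(), try_divide() and
error_context are hypothetical names chosen for illustration, and the
error path only imitates the "report, then siglongjmp" shape of
ereport(ERROR, ...).

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-ins for PostgreSQL's error machinery. None of these
 * names come from the PostgreSQL source or from the original attachment. */
static sigjmp_buf error_context;

/* Prints the message and jumps back to the sigsetjmp() point.
 * Control never returns to the caller. */
static void report_error(const char *msg)
{
    fprintf(stderr, "ERROR:  %s\n", msg);
    siglongjmp(error_context, 1);
}

/* Assumed macro layer, mimicking the fact that ereport() is a macro. */
#define REPORT_ERROR(msg) report_error(msg)

/* Same shape as the quoted code: test for zero, report, then divide. */
static int divide(int arg1, int arg2)
{
    int result;

    if (arg2 == 0)
        REPORT_ERROR("division by zero");

    result = arg1 / arg2;
    return result;
}

/* Returns 0 and stores the quotient in *out, or returns -1 if the
 * error path was taken and control came back through siglongjmp(). */
int try_divide(int arg1, int arg2, int *out)
{
    if (sigsetjmp(error_context, 1) != 0)
        return -1;

    *out = divide(arg1, arg2);
    return 0;
}
```

Calling try_divide() from a main() that takes both operands from the
command line keeps the compiler from folding the division at compile
time. If the division were hoisted above the zero check, a call with a
zero divisor would be expected to trap instead of returning -1.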

So I tried to approach it from the other side: building PostgreSQL
with CFLAGS="-O0 -g" or "-O1 -g" works correctly, but with "-O2 -g" I
get the bug described above.

So I guess I'll build with -O1 for the time being on sparc and alpha
to get correct binaries until this is sorted out. Any idea what else I
could try?

Thanks,

Martin

--
Martin Pitt        http://www.piware.de
Ubuntu Developer   http://www.ubuntu.com
Debian Developer   http://www.debian.org
