Thread: OpenBSD/Sparc status

OpenBSD/Sparc status

From
Andrew Dunstan
Date:
The fix for unflushed changes to pg_database records seems to have fixed 
the problem we were seeing on spoonbill ... but it is now seeing 
problems with the seg module:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58

cheers

andrew


Re: OpenBSD/Sparc status

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The fix for unflushed changes to pg_database records seems to have fixed 
> the problem we were seeing on spoonbill ... but it is now seeing 
> problems with the seg module:

> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58

Don't tell me that just started happening?  We haven't touched seg in
weeks...

I'm unsure how this could fail when float4 passes, because it's using
float4in to convert the strings.
        regards, tom lane


Re: OpenBSD/Sparc status

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>The fix for unflushed changes to pg_database records seems to have fixed 
>>the problem we were seeing on spoonbill ... but it is now seeing 
>>problems with the seg module:
>>
>>http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58
>
>Don't tell me that just started happening?  We haven't touched seg in
>weeks...
>
>I'm unsure how this could fail when float4 passes, because it's using
>float4in to convert the strings.

We're only seeing it now because up to now the run on this platform was 
bombing out on the error you so brilliantly fixed last night.

You might recall I wanted to patch contrib/Makefile to force 
installcheck on all modules regardless of error - if we had that we'd 
have seen this before.

cheers

andrew


Re: OpenBSD/Sparc status

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> We're only seeing it now because up to now the run on this platform was 
> bombing out on the error you so brilliantly fixed last night.

Consistently?  I'd have thought that problem would only fail once in a
while.  It's hard to believe the timing would work out to make it a 100%
failure.
        regards, tom lane


Re: OpenBSD/Sparc status

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>We're only seeing it now because up to now the run on this platform was 
>>bombing out on the error you so brilliantly fixed last night.
>
>Consistently?  I'd have thought that problem would only fail once in a
>while.  It's hard to believe the timing would work out to make it a 100%
>failure.

You can see the history of the latest build runs here:

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD

cheers

andrew


Re: OpenBSD/Sparc status

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Consistently?  I'd have thought that problem would only fail once in a
>> while.  It's hard to believe the timing would work out to make it a 100%
>> failure.

> You can see the history of the latest build runs here:
> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD

Remarkable.  There is one run (2004-11-15) where it got past the rtree
test (and did indeed fail at seg) but the failure rate is certainly
upwards of 90%.  Curious.  There must be some effect that is
synchronizing the bgwriter's actions with the test sequence.

Back at the ranch, I am even more surprised to note that the bogus
seg output in the 11-15 run is different from what it is in today's.
There's not much I can do about it without access to a machine where
it's failing though.  Can we get personal accounts on the buildfarm
machines?
        regards, tom lane


Re: OpenBSD/Sparc status

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Can we get personal accounts on the buildfarm
>machines?

That's up to the owner of each machine - it's a distributed system.

I've sent email to the owner of this one.

When I get a few minutes soon I hope to start some discussion on 
-hackers about what members we want in the buildfarm and what our 
expectations are about help with solving problems.

cheers

andrew




Re: OpenBSD/Sparc status

From
Tom Lane
Date:
The answer is: it's a gcc bug.  The attached program should print
x = 12.3
y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

$ gcc -O  ftest.c
$ ./a.out
x = 12.3
y = 1.47203e-39
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
Configured with:
Thread model: single
gcc version 3.3.2 (propolice)
$
        regards, tom lane


#include <stdio.h>

float
returnfloat(float *x)
{
    return *x;
}

int
main()
{
    float x = 12.3;
    union {
        float f;
        char *t;
    } y;

    y.f = returnfloat(&x);

    printf("x = %g\n", x);
    printf("y = %g\n", y.f);

    return 0;
}


Re: OpenBSD/Sparc status

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> The answer is: it's a gcc bug.  The attached program should print
> x = 12.3
> y = 12.3
> 
> but if compiled with -O or -O2 on Stefan's machine, I get garbage:
> 
> $ gcc -O  ftest.c
> $ ./a.out
> x = 12.3
> y = 1.47203e-39

woa - scary. I will report that to the OpenBSD-folks upstream - many 
thanks for the nice testcase!


Stefan


Re: OpenBSD/Sparc status

From
Andrew Dunstan
Date:

Stefan Kaltenbrunner wrote:

> Tom Lane wrote:
>
>> The answer is: it's a gcc bug.  The attached program should print
>> x = 12.3
>> y = 12.3
>>
>> but if compiled with -O or -O2 on Stefan's machine, I get garbage:
>>
>> $ gcc -O  ftest.c
>> $ ./a.out
>> x = 12.3
>> y = 1.47203e-39
>
> woa - scary. I will report that to the OpenBSD-folks upstream - many 
> thanks for the nice testcase!

very scary.

Meanwhile, what do we do? Turn off -O in src/template/openbsd for 
some/all releases?

cheers

andrew


Re: OpenBSD/Sparc status

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Meanwhile, what do we do? Turn off -O in src/template/openbsd for 
> some/all releases?

Certainly not.  This problem is only known to exist in one gcc version
for one architecture, and besides it's only affecting (so far as we can
tell) one rather inessential contrib module.  I'd say ignore the test
failure until Stefan can get a fixed gcc.
        regards, tom lane


Re: OpenBSD/Sparc status

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> 
>>Meanwhile, what do we do? Turn off -O in src/template/openbsd for 
>>some/all releases?
> 
> 
> Certainly not.  This problem is only known to exist in one gcc version
> for one architecture, and besides it's only affecting (so far as we can
> tell) one rather inessential contrib module.  I'd say ignore the test
> failure until Stefan can get a fixed gcc.

FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC; it 
looks like at least OpenBSD 3.6-STABLE and OpenBSD-current on Sparc64 
with the stock system compiler are affected.


Stefan


Re: OpenBSD/Sparc status

From
"Andrew Dunstan"
Date:
Stefan Kaltenbrunner said:
> Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>>Meanwhile, what do we do? Turn off -O in src/template/openbsd for
>>>some/all releases?
>>
>>
>> Certainly not.  This problem is only known to exist in one gcc version
>> for one architecture, and besides it's only affecting (so far as we
>> can tell) one rather inessential contrib module.  I'd say ignore the
>> test failure until Stefan can get a fixed gcc.
>
> FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC,
> it  looks that at least OpenBSD 3.6-STABLE and OpenBSD-current on
> Sparc64  with the stock system compiler are affected.


I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is
exposed by the seg tests but might well occur elsewhere and bite us in
various unpleasant ways.

I have no idea how many people out there are using this combination. Of
course, even if it's only one (and I suspect that's the right order of
magnitude) we should want to be careful with their data.

cheers

andrew






Re: OpenBSD/Sparc status

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is
> exposed by the seg tests but might well occur elsewhere and bite us in
> various unpleasant ways.

The experimentation I did to develop the test case suggested that the
problem only occurs when the result of a function returning float is
stored directly into a union member.  That's a sufficiently weird case
that I'm reasonably confident it doesn't occur elsewhere in the backend.
It might be worth Stefan's time to vary the test case a bit (eg try
double instead of float, struct instead of union, etc) and see just how
general the bug is.
        regards, tom lane


Re: OpenBSD/Sparc status

From
Darcy Buskermolen
Date:
On November 19, 2004 10:55 am, you wrote:
> The answer is: it's a gcc bug.  The attached program should print
> x = 12.3
> y = 12.3
>
> but if compiled with -O or -O2 on Stefan's machine, I get garbage:
>
> $ gcc -O  ftest.c
> $ ./a.out
> x = 12.3
> y = 1.47203e-39
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
> Configured with:
> Thread model: single
> gcc version 3.3.2 (propolice)
> $

I can confirm this behavior on Solaris 8/sparc 64 as well.

bash-2.03$ gcc -O -m64 test.c
bash-2.03$ ./a.out 
x = 12.3
y = 2.51673e-42
bash-2.03$ file a.out 
a.out:          ELF 64-bit MSB executable SPARCV9 Version 1, dynamically 
linked, not stripped
bash-2.03$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/3.3.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as 
--with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.3.2
bash-2.03$ gcc -m64 test.c
bash-2.03$ ./a.out 
x = 12.3
y = 12.3
bash-2.03$ gcc -m64 -02 test.c
gcc: unrecognized option `-02'
bash-2.03$ gcc -m64 -O2 test.c
bash-2.03$ ./a.out 
x = 12.3
y = 2.51673e-42
bash-2.03$ gcc -m64 -O3 test.c
bash-2.03$ ./a.out 
x = 12.3
y = 12.3
bash-2.03$ 


>
>             regards, tom lane
>
>
> #include <stdio.h>
>
> float
> returnfloat(float *x)
> {
>     return *x;
> }
>
> int
> main()
> {
>     float x = 12.3;
>     union {
>         float f;
>         char *t;
>     } y;
>
>     y.f = returnfloat(&x);
>
>     printf("x = %g\n", x);
>     printf("y = %g\n", y.f);
>
>     return 0;
> }

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com


Re: OpenBSD/Sparc status

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> Darcy Buskermolen <darcy@wavefire.com> writes:
> 
>>I can confirm this behavior on Solaris 8/sparc 64 as well.
> 
> 
>>bash-2.03$ gcc -m64 -O2 test.c
>>bash-2.03$ ./a.out 
>>x = 12.3
>>y = 2.51673e-42
>>bash-2.03$ gcc -m64 -O3 test.c
>>bash-2.03$ ./a.out 
>>x = 12.3
>>y = 12.3
>>bash-2.03$ 
> 
> 
> Hmm.  I hadn't bothered to try -O3 ... interesting that it works
> correctly again at that level.

-O3 works on my box too

> 
> Anyway, this proves that it is an upstream gcc bug and not something
> OpenBSD broke.

I just tried on solaris9 with gcc 3.4.2 - seems the bug is fixed in this 
version. Unfortunately it is quite problematic to change the compiler, 
at least on OpenBSD: gcc 3.3.2 is quite heavily modified on that platform 
and switching the base system compiler might screw a boatload of other 
tools.
The actual recommendation I got from the OpenBSD-folks was to add 
"-mfaster-structs" to the compiler flags, which seems to work around the 
issue - I'm currently doing a full build to verify that though ...


Stefan


Re: OpenBSD/Sparc status

From
Michael Fuhr
Date:
On Tue, Nov 23, 2004 at 09:57:03AM -0800, Darcy Buskermolen wrote:

> I can confirm this behavior on Solaris 8/sparc 64 as well.

gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

% gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.9/3.4.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.4.2
% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% file a.out
a.out:          ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Re: OpenBSD/Sparc status

From
Stefan Kaltenbrunner
Date:
Darcy Buskermolen wrote:
> On November 19, 2004 10:55 am, you wrote:
> 
>>The answer is: it's a gcc bug.  The attached program should print
>>x = 12.3
>>y = 12.3
>>
>>but if compiled with -O or -O2 on Stefan's machine, I get garbage:
>>
>>$ gcc -O  ftest.c
>>$ ./a.out
>>x = 12.3
>>y = 1.47203e-39
>>$ gcc -v
>>Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
>>Configured with:
>>Thread model: single
>>gcc version 3.3.2 (propolice)
>>$
> 
> 
> I can confirm this behavior on Solaris 8/sparc 64 as well.

some more datapoints:

solaris 2.9 with gcc 3.1 is broken (-O3 does not help here)
linux/sparc64 (debian) with gcc 3.3.5 is broken too

So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64 
on all operating systems.


Stefan


Re: OpenBSD/Sparc status

From
jseymour@linxnet.com (Jim Seymour)
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> 
> Darcy Buskermolen wrote:
> > On November 19, 2004 10:55 am, you wrote:
> > 
> >>The answer is: it's a gcc bug.  The attached program should print
> >>x = 12.3
> >>y = 12.3
> >>
> >>but if compiled with -O or -O2 on Stefan's machine, I get garbage:
> >>
> >>$ gcc -O  ftest.c
> >>$ ./a.out
> >>x = 12.3
> >>y = 1.47203e-39
> >>$ gcc -v
> >>Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
> >>Configured with:
> >>Thread model: single
> >>gcc version 3.3.2 (propolice)
> >>$
> > 
> > 
> > I can confirm this behavior on Solaris 8/sparc 64 as well.
> 
> some more datapoints:
> 
> solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
> linux/sparc64 (debian) with gcc 3.3.5 is broken too
> 
> So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64 
> on all operating systems.

Yet Another Datapoint:

$ uname -a
SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
$ gcc -v
...
gcc version 3.3.1
$ gcc -O -m64 test.c
$ a.out
x = 12.3
y = 2.55036e-42

Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1
at work.

Looks like it's time for a gcc upgrade.

Jim


Re: OpenBSD/Sparc status

From
Darcy Buskermolen
Date:
On November 23, 2004 11:37 am, Jim Seymour wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> > Darcy Buskermolen wrote:
> > > On November 19, 2004 10:55 am, you wrote:
> > >>The answer is: it's a gcc bug.  The attached program should print
> > >>x = 12.3
> > >>y = 12.3
> > >>
> > >>but if compiled with -O or -O2 on Stefan's machine, I get garbage:
> > >>
> > >>$ gcc -O  ftest.c
> > >>$ ./a.out
> > >>x = 12.3
> > >>y = 1.47203e-39
> > >>$ gcc -v
> > >>Reading specs from
> > >> /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs Configured
> > >> with:
> > >>Thread model: single
> > >>gcc version 3.3.2 (propolice)
> > >>$
> > >
> > > I can confirm this behavior on Solaris 8/sparc 64 as well.
> >
> > some more datapoints:
> >
> > solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
> > linux/sparc64 (debian) with gcc 3.3.5 is broken too
> >
> > So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64
> > on all operating systems.
>
> Yet Another Datapoint:
>
> $ uname -a
> SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
> $ gcc -v
> ...
> gcc version 3.3.1
> $ gcc -O -m64 test.c
> $ a.out
> x = 12.3
> y = 2.55036e-42
>
> Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1
> at work.
>
> Looks like it's time for a gcc upgrade.
>
> Jim

The following compilers work fine producing 12.3 at all optimization levels:

Sun C 5.5 2003/03/12
and 
sparc-sun-solaris2.9-gcc (GCC) 3.4.1


I'm guessing we need to add some more configure logic to detect gcc versions 
3.4 on sparc trying to produce 64bit code and disable optimizations, or else 
bail out and ask them to upgrade.



-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com


Re: OpenBSD/Sparc status

From
Michael Fuhr
Date:
On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote:

> I'm guessing we need to add some more configure logic to detect gcc versions 
> 3.4 on sparc trying to produce 64bit code and disable optimizations, or else 
> bail out and ask them to upgrade.

Shouldn't that be gcc versions 3.3?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Re: OpenBSD/Sparc status

From
Michael Fuhr
Date:
On Tue, Nov 23, 2004 at 11:34:44AM -0700, Michael Fuhr wrote:
> 
> gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

But gcc 3.3.2 on Solaris 9/sparc 64 isn't.

% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3

% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42

% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42

% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3

% file a.out
a.out:          ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Re: OpenBSD/Sparc status

From
Darcy Buskermolen
Date:
On November 23, 2004 06:18 pm, Michael Fuhr wrote:
> On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote:
> > I'm guessing we need to add some more configure logic to detect gcc
> > versions 3.4 on sparc trying to produce 64bit code and disable
> > optimizations, or else bail out and ask them to upgrade.
>
> Shouldn't that be gcc versions 3.3?

My bad, It should have read prior to 3.4.

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com