Thread: OpenBSD/Sparc status
The fix for unflushed changes to pg_database records seems to have fixed the problem we were seeing on spoonbill ... but it is now seeing problems with the seg module: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58 cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > The fix for unflushed changed to pg_database records seems to have fixed > the problem we were seeing on spoonbill ... but it is now seeing > problems with the seg module: > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58 Don't tell me that just started happening? We haven't touched seg in weeks... I'm unsure how this could fail when float4 passes, because it's using float4in to convert the strings. regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>The fix for unflushed changed to pg_database records seems to have fixed >>the problem we were seeing on spoonbill ... but it is now seeing >>problems with the seg module: >> >> > > > >>http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2004-11-18%2016:02:58 >> >> > >Don't tell me that just started happening? We haven't touched seg in >weeks... > >I'm unsure how this could fail when float4 passes, because it's using >float4in to convert the strings. > > > > We're only seeing it now because up to now the run on this platform was bombing out on the error you so brilliantly fixed last night. You might recall I wanted to patch contrib/Makefile to force installcheck on all modules regardless of error - if we had that we'd have seen this before. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > We're only seeing it now because up to now the run on this platform was > bombing out on the error you so brilliantly fixed last night. Consistently? I'd have thought that problem would only fail once in a while. It's hard to believe the timing would work out to make it a 100% failure. regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>We're only seeing it now because up to now the run on this platform was >>bombing out on the error you so brilliantly fixed last night. >> >> > >Consistently? I'd have thought that problem would only fail once in a >while. It's hard to believe the timing would work out to make it a 100% >failure. > > > > You can see the history of the latest build runs here: http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> Consistently? I'd have thought that problem would only fail once in a >> while. It's hard to believe the timing would work out to make it a 100% >> failure. > You can see the history of the latest build runs here: > http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&br=HEAD Remarkable. There is one run (2004-11-15) where it got past the rtree test (and did indeed fail at seg) but the failure rate is certainly upwards of 90%. Curious. There must be some effect that is synchronizing the bgwriter's actions with the test sequence. Back at the ranch, I am even more surprised to note that the bogus seg output in the 11-15 run is different from what it is in today's. There's not much I can do about it without access to a machine where it's failing though. Can we get personal accounts on the buildfarm machines? regards, tom lane
Tom Lane wrote: >Can we get personal accounts on the buildfarm >machines? > > > > That's up to the owner of each machine - it's a distributed system. I've sent email to the owner of this one. When I get a few minutes soon I hope to start some discussion on -hackers about what members we want in the buildfarm and what our expectations are about help with solving problems. cheers andrew
The answer is: it's a gcc bug. The attached program should print

    x = 12.3
    y = 12.3

but if compiled with -O or -O2 on Stefan's machine, I get garbage:

    $ gcc -O ftest.c
    $ ./a.out
    x = 12.3
    y = 1.47203e-39
    $ gcc -v
    Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
    Configured with:
    Thread model: single
    gcc version 3.3.2 (propolice)
    $

regards, tom lane

#include <stdio.h>

float
returnfloat(float *x)
{
    return *x;
}

int
main()
{
    float x = 12.3;
    union {
        float f;
        char *t;
    } y;

    y.f = returnfloat(&x);

    printf("x = %g\n", x);
    printf("y = %g\n", y.f);

    return 0;
}
Tom Lane wrote: > The answer is: it's a gcc bug. The attached program should print > x = 12.3 > y = 12.3 > > but if compiled with -O or -O2 on Stefan's machine, I get garbage: > > $ gcc -O ftest.c > $ ./a.out > x = 12.3 > y = 1.47203e-39 woa - scary. I will report that to the OpenBSD-folks upstream - many thanks for the nice testcase! Stefan
Stefan Kaltenbrunner wrote: > Tom Lane wrote: > >> The answer is: it's a gcc bug. The attached program should print >> x = 12.3 >> y = 12.3 >> >> but if compiled with -O or -O2 on Stefan's machine, I get garbage: >> >> $ gcc -O ftest.c >> $ ./a.out >> x = 12.3 >> y = 1.47203e-39 > > > woa - scary. I will report that to the OpenBSD-folks upstream - many > thanks for the nice testcase! > > > very scary. Meanwhile, what do we do? Turn off -O in src/template/openbsd for some/all releases? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Meanwhile, what do we do? Turn off -O in src/template/openbsd for > some/all releases? Certainly not. This problem is only known to exist in one gcc version for one architecture, and besides it's only affecting (so far as we can tell) one rather inessential contrib module. I'd say ignore the test failure until Stefan can get a fixed gcc. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >>Meanwhile, what do we do? Turn off -O in src/template/openbsd for >>some/all releases? > > > Certainly not. This problem is only known to exist in one gcc version > for one architecture, and besides it's only affecting (so far as we can > tell) one rather inessential contrib module. I'd say ignore the test > failure until Stefan can get a fixed gcc. FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC; it looks like at least OpenBSD 3.6-STABLE and OpenBSD-current on Sparc64 with the stock system compiler are affected. Stefan
Stefan Kaltenbrunner said: > Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >> >>>Meanwhile, what do we do? Turn off -O in src/template/openbsd for >>>some/all releases? >> >> >> Certainly not. This problem is only known to exist in one gcc version >> for one architecture, and besides it's only affecting (so far as we >> can tell) one rather inessential contrib module. I'd say ignore the >> test failure until Stefan can get a fixed gcc. > > FWIW: I got the bug confirmed by Miod Vallat (OpenBSD hacker) on IRC, > it looks that at least OpenBSD 3.6-STABLE and OpenBSD-current on > Sparc64 with the stock system compiler are affected. I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is exposed by the seg tests but might well occur elsewhere and bite us in various unpleasant ways. I have no idea how many people out there are using this combination. Of course, even if it's only one (and I suspect that's the right order of magnitude) we should want to be careful with their data. cheers andrew
"Andrew Dunstan" <andrew@dunslane.net> writes: > I guess my concern is that on Sparc64/OpenBSD-3.6* at least, this bug is > exposed by the seg tests but might well occur elsewhere and bite us in > various unpleasant ways. The experimentation I did to develop the test case suggested that the problem only occurs when the result of a function returning float is stored directly into a union member. That's a sufficiently weird case that I'm reasonably confident it doesn't occur elsewhere in the backend. It might be worth Stefan's time to vary the test case a bit (eg try double instead of float, struct instead of union, etc) and see just how general the bug is. regards, tom lane
On November 19, 2004 10:55 am, you wrote:
> The answer is: it's a gcc bug. The attached program should print
> x = 12.3
> y = 12.3
>
> but if compiled with -O or -O2 on Stefan's machine, I get garbage:
>
> $ gcc -O ftest.c
> $ ./a.out
> x = 12.3
> y = 1.47203e-39
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
> Configured with:
> Thread model: single
> gcc version 3.3.2 (propolice)
> $

I can confirm this behavior on Solaris 8/sparc 64 as well.

bash-2.03$ gcc -O -m64 test.c
bash-2.03$ ./a.out
x = 12.3
y = 2.51673e-42
bash-2.03$ file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped
bash-2.03$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/3.3.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.3.2
bash-2.03$ gcc -m64 test.c
bash-2.03$ ./a.out
x = 12.3
y = 12.3
bash-2.03$ gcc -m64 -02 test.c
gcc: unrecognized option `-02'
bash-2.03$ gcc -m64 -O2 test.c
bash-2.03$ ./a.out
x = 12.3
y = 2.51673e-42
bash-2.03$ gcc -m64 -O3 test.c
bash-2.03$ ./a.out
x = 12.3
y = 12.3
bash-2.03$

>
> regards, tom lane
>
>
> #include <stdio.h>
>
> float
> returnfloat(float *x)
> {
>     return *x;
> }
>
> int
> main()
> {
>     float x = 12.3;
>     union {
>         float f;
>         char *t;
>     } y;
>
>     y.f = returnfloat(&x);
>
>     printf("x = %g\n", x);
>     printf("y = %g\n", y.f);
>
>     return 0;
> }

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com
Tom Lane wrote:
> Darcy Buskermolen <darcy@wavefire.com> writes:
>
>>I can confirm this behavior on Solaris 8/sparc 64 as well.
>
>>bash-2.03$ gcc -m64 -O2 test.c
>>bash-2.03$ ./a.out
>>x = 12.3
>>y = 2.51673e-42
>>bash-2.03$ gcc -m64 -O3 test.c
>>bash-2.03$ ./a.out
>>x = 12.3
>>y = 12.3
>>bash-2.03$
>
> Hmm. I hadn't bothered to try -O3 ... interesting that it works
> correctly again at that level.

-O3 works on my box too.

> Anyway, this proves that it is an upstream gcc bug and not something
> OpenBSD broke.

I just tried on Solaris 9 with gcc 3.4.2 - it seems the bug is fixed in that version. Unfortunately it is quite problematic to change the compiler, at least on OpenBSD: gcc 3.3.2 is quite heavily modified on that platform, and switching the base system compiler might screw up a boatload of other tools. The actual recommendation I got from the OpenBSD folks was to add "-mfaster-structs" to the compiler flags, which seems to work around the issue - I'm currently doing a full build to verify that, though ...

Stefan
On Tue, Nov 23, 2004 at 09:57:03AM -0800, Darcy Buskermolen wrote:
> I can confirm this behavior on Solaris 8/sparc 64 as well.

gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

% gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.9/3.4.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.4.2
% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Darcy Buskermolen wrote: > On November 19, 2004 10:55 am, you wrote: > >>The answer is: it's a gcc bug. The attached program should print >>x = 12.3 >>y = 12.3 >> >>but if compiled with -O or -O2 on Stefan's machine, I get garbage: >> >>$ gcc -O ftest.c >>$ ./a.out >>x = 12.3 >>y = 1.47203e-39 >>$ gcc -v >>Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs >>Configured with: >>Thread model: single >>gcc version 3.3.2 (propolice) >>$ > > > I can confirm this behavior on Solaris 8/sparc 64 as well. some more datapoints: solaris 2.9 with gcc 3.1 is broken(-O3 does not help here) linux/sparc64 (debian) with gcc 3.3.5 is broken too So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64 on all operating systems. Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote: > > Darcy Buskermolen wrote: > > On November 19, 2004 10:55 am, you wrote: > > > >>The answer is: it's a gcc bug. The attached program should print > >>x = 12.3 > >>y = 12.3 > >> > >>but if compiled with -O or -O2 on Stefan's machine, I get garbage: > >> > >>$ gcc -O ftest.c > >>$ ./a.out > >>x = 12.3 > >>y = 1.47203e-39 > >>$ gcc -v > >>Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs > >>Configured with: > >>Thread model: single > >>gcc version 3.3.2 (propolice) > >>$ > > > > > > I can confirm this behavior on Solaris 8/sparc 64 as well. > > some more datapoints: > > solaris 2.9 with gcc 3.1 is broken(-O3 does not help here) > linux/sparc64 (debian) with gcc 3.3.5 is broken too > > So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64 > on all operating systems. Yet Another Datapoint: $ uname -a SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine $ gcc -v ... gcc version 3.3.1 $ gcc -O -m64 test.c $ a.out x = 12.3 y = 2.55036e-42 Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1 at work. Looks like it's time for a gcc upgrade. Jim
On November 23, 2004 11:37 am, Jim Seymour wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> > Darcy Buskermolen wrote:
> > > On November 19, 2004 10:55 am, you wrote:
> > >>The answer is: it's a gcc bug. The attached program should print
> > >>x = 12.3
> > >>y = 12.3
> > >>
> > >>but if compiled with -O or -O2 on Stefan's machine, I get garbage:
> > >>
> > >>$ gcc -O ftest.c
> > >>$ ./a.out
> > >>x = 12.3
> > >>y = 1.47203e-39
> > >>$ gcc -v
> > >>Reading specs from /usr/lib/gcc-lib/sparc64-unknown-openbsd3.6/3.3.2/specs
> > >>Configured with:
> > >>Thread model: single
> > >>gcc version 3.3.2 (propolice)
> > >>$
> > >
> > > I can confirm this behavior on Solaris 8/sparc 64 as well.
> >
> > some more datapoints:
> >
> > solaris 2.9 with gcc 3.1 is broken(-O3 does not help here)
> > linux/sparc64 (debian) with gcc 3.3.5 is broken too
> >
> > So it looks like at least gcc 3.1 and gcc 3.3.x are affected on Sparc64
> > on all operating systems.
>
> Yet Another Datapoint:
>
> $ uname -a
> SunOS jimsun 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
> $ gcc -v
> ...
> gcc version 3.3.1
> $ gcc -O -m64 test.c
> $ a.out
> x = 12.3
> y = 2.55036e-42
>
> Same on a "real" UltraSparc box, running Solaris 8 and gcc 3.3.1
> at work.
>
> Looks like it's time for a gcc upgrade.
>
> Jim

The following compilers work fine, producing 12.3 at all optimization levels:

Sun C 5.5 2003/03/12
sparc-sun-solaris2.9-gcc (GCC) 3.4.1

I'm guessing we need to add some more configure logic to detect gcc versions 3.4 on sparc trying to produce 64bit code and disable optimizations, or else bail out and ask them to upgrade.

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com
On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote: > I'm guessing we need to add some more configure logic to detect gcc versions > 3.4 on sparc trying to produce 64bit code and disable optimizations, or else > bail out and ask them to upgrade. Shouldn't that be gcc versions 3.3? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Tue, Nov 23, 2004 at 11:34:44AM -0700, Michael Fuhr wrote:
>
> gcc 3.4.2 on Solaris 9/sparc 64 appears to be okay.

But gcc 3.3.2 on Solaris 9/sparc 64 isn't.

% gcc -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% gcc -O -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42
% gcc -O2 -m64 test.c
% ./a.out
x = 12.3
y = 2.51673e-42
% gcc -O3 -m64 test.c
% ./a.out
x = 12.3
y = 12.3
% file a.out
a.out: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
On November 23, 2004 06:18 pm, Michael Fuhr wrote: > On Tue, Nov 23, 2004 at 12:47:28PM -0800, Darcy Buskermolen wrote: > > I'm guessing we need to add some more configure logic to detect gcc > > versions 3.4 on sparc trying to produce 64bit code and disable > > optimizations, or else bail out and ask them to upgrade. > > Shouldn't that be gcc versions 3.3? My bad, It should have read prior to 3.4. -- Darcy Buskermolen Wavefire Technologies Corp. ph: 250.717.0200 fx: 250.763.1759 http://www.wavefire.com