Thread: gcc 4.3 breaks ContribCheck in 8.2 and older.
Hi, I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. 8.3 and head don't have a problem. All others failed in the ContribCheck state. You can see the results on buildfarm member panda. Kurt
Kurt Roeckx <kurt@roeckx.be> writes: > I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. > 8.3 and head don't have a problem. All others failed in the > ContribCheck state. > You can see the results on buildfarm member panda. Bizarre. There doesn't seem to be any significant difference in the seg code between 8.2 and 8.3, so why is 8.2 failing there? The cube code got a significant overhaul (V0 to V1 call conventions) between 8.1 and 8.2, so that might explain why it's OK in 8.2 and up. Or not. If that is it, we'd likely just write it off as not-gonna-fix, but it could stand investigation to pinpoint the cause. Can you poke into it with a debugger? The first failure case in seg ought to be easy enough to investigate. regards, tom lane
On Thu, Mar 20, 2008 at 06:53:27PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. > > 8.3 and head don't have a problem. All others failed in the > > ContribCheck state. > > > You can see the results on buildfarm member panda. > > Bizarre. There doesn't seem to be any significant difference in the seg > code between 8.2 and 8.3, so why is 8.2 failing there? > > The cube code got a significant overhaul (V0 to V1 call conventions) > between 8.1 and 8.2, so that might explain why it's OK in 8.2 and up. > Or not. If that is it, we'd likely just write it off as not-gonna-fix, > but it could stand investigation to pinpoint the cause. > > Can you poke into it with a debugger? The first failure case in seg > ought to be easy enough to investigate. I currently don't have the time to look into it. Kurt
On Thu, Mar 20, 2008 at 06:53:27PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. > > 8.3 and head don't have a problem. All others failed in the > > ContribCheck state. > > > You can see the results on buildfarm member panda. > > Bizarre. There doesn't seem to be any significant difference in the seg > code between 8.2 and 8.3, so why is 8.2 failing there? > > The cube code got a significant overhaul (V0 to V1 call conventions) > between 8.1 and 8.2, so that might explain why it's OK in 8.2 and up. > Or not. If that is it, we'd likely just write it off as not-gonna-fix, > but it could stand investigation to pinpoint the cause. > > Can you poke into it with a debugger? The first failure case in seg > ought to be easy enough to investigate. It seems this can even be reproduced at -O0. What I see is that in seg.c we have: typedef char bool; bool seg_same(SEG * a, SEG * b); And in fmgr_oldstyle() you have: typedef char *(*func_ptr) (); func_ptr user_fn; char *returnValue; user_fn = fnextra->func returnValue = (*user_fn) (fcinfo->arg[0], fcinfo->arg[1]); With fnextra->func set to seg_same. Somewhere seg_same is converted from returning a bool to returning a char pointer. returnValue has the value 0xffffff00 for me, which of course is an invalid pointer. Kurt
Kurt Roeckx wrote: > On Thu, Mar 20, 2008 at 06:53:27PM -0400, Tom Lane wrote: > > Kurt Roeckx <kurt@roeckx.be> writes: > > > I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. > > > 8.3 and head don't have a problem. All others failed in the > > > ContribCheck state. > > > > Bizarre. There doesn't seem to be any significant difference in the seg > > code between 8.2 and 8.3, so why is 8.2 failing there? > Somewhere seg_same is converted from returning a bool to > returning a char pointer. > > returnValue has the value 0xffffff00 for me, which of course is an > invalid pointer. So the difference is in CFLAGS? I do recall reading something about these kind of things in the GCC 4.3.0 release notes. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Mar 24, 2008 at 02:05:14PM -0300, Alvaro Herrera wrote: > Kurt Roeckx wrote: > > On Thu, Mar 20, 2008 at 06:53:27PM -0400, Tom Lane wrote: > > > Kurt Roeckx <kurt@roeckx.be> writes: > > > > I did some tests with gcc 4.3 on the branches from 7.4 to 8.3 and head. > > > > 8.3 and head don't have a problem. All others failed in the > > > > ContribCheck state. > > > > > > Bizarre. There doesn't seem to be any significant difference in the seg > > > code between 8.2 and 8.3, so why is 8.2 failing there? > > > Somewhere seg_same is converted from returning a bool to > > returning a char pointer. > > > > returnValue has the value 0xffffff00 for me, which of course is an > > invalid pointer. > > So the difference is in CFLAGS? I do recall reading something about > these kind of things in the GCC 4.3.0 release notes. No, this has nothing to do with CFLAGS. It's calling a function which returns something other than it actually returns. The code basicly does: char foo() { return 0; } char *bar() { char *p; char (*f)() = (char *(*)())foo; p = f();return p; } foo is: char foo() But we called it like it's an: char *foo() In our case, foo() contains a function call, and is then checked for == 0. gcc-4.3 generates such code for that: 0x000000000040052f <foo+19>: test %eax,%eax 0x0000000000400531 <foo+21>: sete %al While gcc-4.2 generates: 0x000000000040052f <foo+19>: test %eax,%eax 0x0000000000400531 <foo+21>: sete %al 0x0000000000400534 <foo+24>: movzbl %al,%eax Since it's only returning a char which is 8 bit, it only should change the lowest part of the register, so it's perfectly valid to do that. seg_cmp() happened to return -1 (0xffffffff). So seg_same (foo) happened to change eax to 0xffffff00. And then that gets interpreted as a pointer, which doesn't make much sense anymore. Kurt
Kurt Roeckx wrote: > No, this has nothing to do with CFLAGS. It's calling a function which > returns something other than it actually returns. Yeah but apparently gcc 4.3 is working in 8.3 and later. What happens to your sample program if you compile it with the CFLAGS used in 8.3 versus those used in 8.2? pg_config --configure will show these. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Kurt Roeckx wrote: > > >> No, this has nothing to do with CFLAGS. It's calling a function which >> returns something other than it actually returns. >> > > Yeah but apparently gcc 4.3 is working in 8.3 and later. What happens > to your sample program if you compile it with the CFLAGS used in 8.3 > versus those used in 8.2? pg_config --configure will show these. > > I am seeing the same results as Kurt. The CFLAGS as reported by pg_config are identical: CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g CFLAGS_SL = -fpic cheers andrew
On Mon, Mar 24, 2008 at 03:05:31PM -0300, Alvaro Herrera wrote: > Kurt Roeckx wrote: > > > No, this has nothing to do with CFLAGS. It's calling a function which > > returns something other than it actually returns. > > Yeah but apparently gcc 4.3 is working in 8.3 and later. What happens > to your sample program if you compile it with the CFLAGS used in 8.3 > versus those used in 8.2? pg_config --configure will show these. 8.2 and 8.3 actually get the same value in fmgr_oldstyle(), it just seems they do something else with it somewhere else. Both are wrong, it's just that we don't see it in 8.3. I think it should return a Datum, and use PG_RETURN_BOOL. Kurt
Kurt Roeckx <kurt@roeckx.be> writes: > 8.2 and 8.3 actually get the same value in fmgr_oldstyle(), it just > seems they do something else with it somewhere else. Doh. I'll bet the difference is this fix: http://archives.postgresql.org/pgsql-committers/2007-07/msg00131.php Please try that patch and see what it fixes pre-8.3. regards, tom lane
On Mon, Mar 24, 2008 at 02:52:18PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > 8.2 and 8.3 actually get the same value in fmgr_oldstyle(), it just > > seems they do something else with it somewhere else. > > Doh. I'll bet the difference is this fix: > > http://archives.postgresql.org/pgsql-committers/2007-07/msg00131.php > > Please try that patch and see what it fixes pre-8.3. No, returnValue contains the same value for both 8.2 and 8.3. There is no reason why casting from a char * to a char * should result into something else. Kurt
Kurt Roeckx <kurt@roeckx.be> writes: > On Mon, Mar 24, 2008 at 02:52:18PM -0400, Tom Lane wrote: >> Please try that patch and see what it fixes pre-8.3. > No, returnValue contains the same value for both 8.2 and 8.3. There is > no reason why casting from a char * to a char * should result into > something else. Please try the patch rather than assuming you know the answer. regards, tom lane
On Mon, Mar 24, 2008 at 03:52:09PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > On Mon, Mar 24, 2008 at 02:52:18PM -0400, Tom Lane wrote: > >> Please try that patch and see what it fixes pre-8.3. > > > No, returnValue contains the same value for both 8.2 and 8.3. There is > > no reason why casting from a char * to a char * should result into > > something else. > > Please try the patch rather than assuming you know the answer. I did try the patch. It fails just the same way. With or without the patch, with both 8.2 and 8.3, the 0xffffff00 gets passed from seg_same end ends up atleast in ExecutePlan(): ExecutePlan (estate=0xc1c468, planstate=0xc1c698, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection,dest=0xc03250) at execMain.c:1239 1239 if (TupIsNull(planSlot)) (gdb) p /x planSlot->tts_values[0] $4 = 0xffffff00 From there on I have no idea what happens with it, but 8.2 turns it a true, 8.3 into a false. Kurt
Kurt Roeckx <kurt@roeckx.be> writes: > I did try the patch. It fails just the same way. Hmph. So we still don't know why 8.2 and 8.3 behave differently ... [ pokes around ... ] Hah, maybe this is it: http://archives.postgresql.org/pgsql-committers/2007-03/msg00292.php regards, tom lane
On Mon, Mar 24, 2008 at 05:59:33PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > I did try the patch. It fails just the same way. > > Hmph. So we still don't know why 8.2 and 8.3 behave differently ... > [ pokes around ... ] Hah, maybe this is it: > > http://archives.postgresql.org/pgsql-committers/2007-03/msg00292.php The comment atleast seems to describe what I'm seeing. I'll try this tomorrow. Kurt
On Mon, Mar 24, 2008 at 05:59:33PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > I did try the patch. It fails just the same way. > > Hmph. So we still don't know why 8.2 and 8.3 behave differently ... > [ pokes around ... ] Hah, maybe this is it: > > http://archives.postgresql.org/pgsql-committers/2007-03/msg00292.php This patch atleast solves the problems with 8.2. Kurt
Kurt Roeckx <kurt@roeckx.be> writes: > On Mon, Mar 24, 2008 at 05:59:33PM -0400, Tom Lane wrote: >> http://archives.postgresql.org/pgsql-committers/2007-03/msg00292.php > This patch atleast solves the problems with 8.2. Excellent, I'll go back-patch that and we can see what else there is. You still have panda set up to build with gcc 4.3, correct? regards, tom lane
On Tue, Mar 25, 2008 at 02:11:30PM -0400, Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > > On Mon, Mar 24, 2008 at 05:59:33PM -0400, Tom Lane wrote: > >> http://archives.postgresql.org/pgsql-committers/2007-03/msg00292.php > > > This patch atleast solves the problems with 8.2. > > Excellent, I'll go back-patch that and we can see what else there is. > You still have panda set up to build with gcc 4.3, correct? Yes, I just disabled some cron jobs for those that failed. kurt
Kurt Roeckx <kurt@roeckx.be> writes: > On Tue, Mar 25, 2008 at 02:11:30PM -0400, Tom Lane wrote: >> Excellent, I'll go back-patch that and we can see what else there is. >> You still have panda set up to build with gcc 4.3, correct? > Yes, I just disabled some cron jobs for those that failed. Patches committed, please re-enable the back branches so we can see what happens. regards, tom lane
Tom Lane wrote: > Kurt Roeckx <kurt@roeckx.be> writes: > >> On Tue, Mar 25, 2008 at 02:11:30PM -0400, Tom Lane wrote: >> >>> Excellent, I'll go back-patch that and we can see what else there is. >>> You still have panda set up to build with gcc 4.3, correct? >>> > > >> Yes, I just disabled some cron jobs for those that failed. >> > > Patches committed, please re-enable the back branches so we can > see what happens. > > > I have tested this back as far as 8.0, and all seems OK. cheers andrew
On Tue, Mar 25, 2008 at 06:03:39PM -0400, Andrew Dunstan wrote: >> >> Patches committed, please re-enable the back branches so we can >> see what happens. > > I have tested this back as far as 8.0, and all seems OK. 7.4 passed too. Kurt