Thread: ECPG failure on BF member Vaquita (Windows Vista)

ECPG failure on BF member Vaquita (Windows Vista)

From
Dave Page
Date:
I'm seeing an ECPG-Check failure on Windows Vista - any ideas what might
be causing this?

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=vaquita&dt=2007-04-24%2020:00:05

The only other Vista buildfarm member (baiji, on the same physical box)
is running MSVC builds which don't yet test ECPG from what I can see.

Regards, Dave


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Wed, Apr 25, 2007 at 10:47:57AM +0100, Dave Page wrote:
> I'm seeing an ECPG-Check failure on Windows Vista - any ideas what might
> be causing this?

Hmm, first glance suggests some permission problems. 

I never touched a Vista system so far, so I'm at a loss as far as
details are concerned.

I also saw that wombat is segfaulting in ecpg tests but not only with
CVS HEAD but also trying to test 8.2. Any idea what's going on with this
machine?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Dave Page
Date:
Michael Meskes wrote:
> On Wed, Apr 25, 2007 at 10:47:57AM +0100, Dave Page wrote:
>> I'm seeing an ECPG-Check failure on Windows Vista - any ideas what might
>> be causing this?
> 
> Hmm, first glance suggests some permission problems. 

Yes, that was my thought as well, however I ran cacls down the entire
build tree, granting full control to Everyone on everything, but the
problem persists.

Additionally, all the other tests pass, including make check and make
installcheck, so unless the ECPG tests are trying to create something
outside of the buildtree, I don't think it is a permissions issue.

Is there anything I can try building/running manually to help debug this?

> I also saw that wombat is segfaulting in ecpg tests but not only with
> CVS HEAD but also trying to test 8.2. Any idea what's going on with this
> machine?
> 
> Michael

Can't help with that one I'm afraid.

Regards, Dave


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Andrew Dunstan
Date:
Dave Page wrote:
> Michael Meskes wrote:
>   
>> On Wed, Apr 25, 2007 at 10:47:57AM +0100, Dave Page wrote:
>>     
>>> I'm seeing an ECPG-Check failure on Windows Vista - any ideas what might
>>> be causing this?
>>>       
>> Hmm, first glance suggests some permission problems. 
>>     
>
> Yes, that was my thought as well, however I ran cacls down the entire
> build tree, granting full control to Everyone on everything, but the
> problem persists.
>   

Please don't do that on your buildfarm repo copy (if that's what you 
did). You should not touch *anything* inside it. If need to you do this, 
make a copy (see later) and alter that.

If you did do this to the buildfarm repo copy, please blow it away so 
that buildfarm will get a fresh clean copy next time it runs.

> Additionally, all the other tests pass, including make check and make
> installcheck, so unless the ECPG tests are trying to create something
> outside of the buildtree, I don't think it is a permissions issue.
>
> Is there anything I can try building/running manually to help debug this?
>   


To recreate the failing buildfarm environment (including making an 
alterable repo copy), run the following (with suitable local mods):

run_build.pl --test --keepall

At the end of that you should have the build tree that was actually 
used, and the install tree too if it got that far. Then you can start 
playing manually.

Remember to move or remove the kept trees before your next buildfarm run.

cheers

andrew




Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Dave Page
Date:
Andrew Dunstan wrote:
> Please don't do that on your buildfarm repo copy (if that's what you
> did). You should not touch *anything* inside it. If need to you do this,
> make a copy (see later) and alter that.
> 
> If you did do this to the buildfarm repo copy, please blow it away so
> that buildfarm will get a fresh clean copy next time it runs.

No, I reset the permissions on /msys. Thats entirely necessary on Vista
where the permissions are so locked down by default that you can't even
create it without some measure of pain.

Regards, Dave.


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Andrew Dunstan
Date:
Dave Page wrote:
> Andrew Dunstan wrote:
>   
>> Please don't do that on your buildfarm repo copy (if that's what you
>> did). You should not touch *anything* inside it. If need to you do this,
>> make a copy (see later) and alter that.
>>
>> If you did do this to the buildfarm repo copy, please blow it away so
>> that buildfarm will get a fresh clean copy next time it runs.
>>     
>
> No, I reset the permissions on /msys. Thats entirely necessary on Vista
> where the permissions are so locked down by default that you can't even
> create it without some measure of pain.
>
>   

Yes, my Vista experience so far has been very unpleasant indeed. It 
strikes me as the OS equivalent of jumping the shark. I haven't even 
thought about Msys ...

cheers

andrew


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
"Mark Wong"
Date:
On 4/25/07, Michael Meskes <meskes@postgresql.org> wrote:
> I also saw that wombat is segfaulting in ecpg tests but not only with
> CVS HEAD but also trying to test 8.2. Any idea what's going on with this
> machine?

I generated a stack trace for REL8_2_STABLE, but I'm not sure how
helpful it is.  Let me know what other information I can provide...
Looks like I don't have symbols for libc.

Core was generated by `sql/update '.
Program terminated with signal 11, Segmentation fault.
#0  ECPGget_variable (ap=<value optimized out>, type=<value optimized out>,   var=0x10028730, indicator=1 '\001') at
execute.c:131
131                             var->ind_value = *((char **)
(var->ind_pointer));
(gdb) bt
#0  ECPGget_variable (ap=<value optimized out>, type=<value optimized out>,   var=0x10028730, indicator=1 '\001') at
execute.c:131
#1  0x0000040000048948 in ECPGdo (lineno=28, compat=<value optimized out>,   force_indicator=<value optimized out>,
connection_name=<valueoptimized out>, query=<value optimized out>)   at execute.c:195
 
#2  0x0000000010000d20 in main (argc=<value optimized out>,   argv=<value optimized out>) at update.pgc:28
#3  0x000004000063ce4c in .generic_start_main () from /lib/libc.so.6
#4  0x000004000063d0f8 in .__libc_start_main () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()


Regards,
Mark


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Andrew Dunstan
Date:
Mark Wong wrote:
> On 4/25/07, Michael Meskes <meskes@postgresql.org> wrote:
>> I also saw that wombat is segfaulting in ecpg tests but not only with
>> CVS HEAD but also trying to test 8.2. Any idea what's going on with this
>> machine?
>
> I generated a stack trace for REL8_2_STABLE, but I'm not sure how
> helpful it is.  Let me know what other information I can provide...
> Looks like I don't have symbols for libc.
>
> Core was generated by `sql/update '.
> Program terminated with signal 11, Segmentation fault.
> #0  ECPGget_variable (ap=<value optimized out>, type=<value optimized 
> out>,
>    var=0x10028730, indicator=1 '\001') at execute.c:131
> 131                             var->ind_value = *((char **)
> (var->ind_pointer));
> (gdb) bt
> #0  ECPGget_variable (ap=<value optimized out>, type=<value optimized 
> out>,
>    var=0x10028730, indicator=1 '\001') at execute.c:131
> #1  0x0000040000048948 in ECPGdo (lineno=28, compat=<value optimized 
> out>,
>    force_indicator=<value optimized out>,
>    connection_name=<value optimized out>, query=<value optimized out>)
>    at execute.c:195
> #2  0x0000000010000d20 in main (argc=<value optimized out>,
>    argv=<value optimized out>) at update.pgc:28
> #3  0x000004000063ce4c in .generic_start_main () from /lib/libc.so.6
> #4  0x000004000063d0f8 in .__libc_start_main () from /lib/libc.so.6
> #5  0x0000000000000000 in ?? ()
>
>
>

I think you'll need to compile with optimisation turned off and then try 
running the test under debugger control, putting a breakpoint in 
ECPGget_variable() and then stepping through it. I wonder what value of 
var->ind_pointer it is getting?

cheers

andrew



Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> I think you'll need to compile with optimisation turned off and then try 
> running the test under debugger control, putting a breakpoint in 
> ECPGget_variable() and then stepping through it. I wonder what value of 
> var->ind_pointer it is getting?

You could probably inspect the contents of *var in that dump without
having to recompile.  Given that this is PPC64, I'm betting on a pointer
size or alignment problem ...
        regards, tom lane


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
"Mark Wong"
Date:
On 4/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > I think you'll need to compile with optimisation turned off and then try
> > running the test under debugger control, putting a breakpoint in
> > ECPGget_variable() and then stepping through it. I wonder what value of
> > var->ind_pointer it is getting?
>
> You could probably inspect the contents of *var in that dump without
> having to recompile.  Given that this is PPC64, I'm betting on a pointer
> size or alignment problem ...

Does this help?

(gdb) p var->type
$1 = 4267828624
(gdb) p var->value
$2 = (void *) 0x1
(gdb) p var->pointer
$3 = (void *) 0x1
(gdb) p var->varcharsize
$4 = 3
(gdb) p var->arrsize
$5 = 4
(gdb) p var->offset
$6 = 29
(gdb) p var->ind_type
$7 = 0
(gdb) p var->ind_pointer
$8 = (void *) 0x0
(gdb) p var->ind_varcharsize
$9 = 0
(gdb) p var->ind_arrsize
$10 = 0
(gdb) p var->ind_offset
$11 = 5
(gdb) p var->next
$12 = (struct variable *) 0x0

Regards,
Mark


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Tom Lane
Date:
"Mark Wong" <markwkm@gmail.com> writes:
> Does this help?

> (gdb) p var->ind_pointer
> $8 = (void *) 0x0

Well, that seems to be the reason why it's failing to indirect through
ind_pointer ... but why is it only failing on your machine and not
everyone else's?  I think this indicates something unportable about
ecpg's usage of va_list.

Hmm, and I don't have to look far to find a smoking gun:

#if defined(__GNUC__) && (defined (__powerpc__) || defined(__amd64__) || defined(__x86_64__))if
(create_statement(lineno,compat, force_indicator, con, &stmt, query, args) == false)
 
#elseif (create_statement(lineno, compat, force_indicator, con, &stmt, query, &args) == false)
#endif

Why in the world is that like that?  We don't have such a kluge
anyplace else we use va_list.  stringinfo.c for instance has
never needed any such thing.

Mark, does your gcc define __powerpc__, or only __powerpc64__?
        regards, tom lane


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
"Mark Wong"
Date:
On 4/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Mark Wong" <markwkm@gmail.com> writes:
> > Does this help?
>
> > (gdb) p var->ind_pointer
> > $8 = (void *) 0x0
>
> Well, that seems to be the reason why it's failing to indirect through
> ind_pointer ... but why is it only failing on your machine and not
> everyone else's?  I think this indicates something unportable about
> ecpg's usage of va_list.
>
> Hmm, and I don't have to look far to find a smoking gun:
>
> #if defined(__GNUC__) && (defined (__powerpc__) || defined(__amd64__) || defined(__x86_64__))
>         if (create_statement(lineno, compat, force_indicator, con, &stmt, query, args) == false)
> #else
>         if (create_statement(lineno, compat, force_indicator, con, &stmt, query, &args) == false)
> #endif
>
> Why in the world is that like that?  We don't have such a kluge
> anyplace else we use va_list.  stringinfo.c for instance has
> never needed any such thing.
>
> Mark, does your gcc define __powerpc__, or only __powerpc64__?

$ touch foo.c; gcc -E -dM foo.c | grep __p ; rm foo.c
#define __powerpc64__ 1
#define __powerpc__ 1

Regards,
Mark


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Andrew Dunstan
Date:
Tom Lane wrote:
>
> Hmm, and I don't have to look far to find a smoking gun:
>
> #if defined(__GNUC__) && (defined (__powerpc__) || defined(__amd64__) || defined(__x86_64__))
>     if (create_statement(lineno, compat, force_indicator, con, &stmt, query, args) == false)
> #else
>     if (create_statement(lineno, compat, force_indicator, con, &stmt, query, &args) == false)
> #endif
>
>
>   
I also see:

#if defined(__GNUC__) && (defined (__powerpc__) || defined(__amd64__) || 
defined(__x86_64__))
#define APREF ap
#else
#define APREF *ap
#endif


But I also see that my amd64/FC6 machine does pass these tests with gcc.

I would certainly be nice if we could simplify all this. And if not, we 
should have a note about why it's needed.

cheers

andrew



Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> But I also see that my amd64/FC6 machine does pass these tests with gcc.

Yeah, but the typedef represented by va_list can and probably does vary
between amd64 and ppc64.  I haven't an easy way to check, but I wonder
whether it's not an array type on ppc.

I'm of the opinion that ecpg is trying to do something here that's not
portable.  The C99 draft I have says
      [#3] The type declared is
              va_list
      which is an object type  suitable  for  holding  information      needed  by the macros va_start, va_arg, va_end,
andva_copy.      If access to the varying arguments is  desired,  the  called      function  shall declare an object
(referredto as ap in this      subclause) having type va_list.  The object ap may be passed      as an argument to
anotherfunction; if that function invokes      the va_arg macro with parameter ap, the value of ap  in  the
callingfunction is indeterminate and shall be passed to the      va_end macro prior to any further reference to ap.
(198)
      ____________________
     198 It is permitted to create a pointer to a va_list and pass         that  pointer  to  another  function,  in
whichcase the         original function may make further use  of  the  original         list after the other function
returns.

The footnote seems to say that what the code is doing is OK ... but
there isn't any such footnote in the Single Unix Spec:
http://www.opengroup.org/onlinepubs/007908799/xsh/stdarg.h.html
which makes me wonder just how portable it really is.

My recommendation is to get rid of the APREF hack, deal only in
va_list not &va_list, and inline ECPGget_variable into the two
places it's used to avoid the question of passing va_lists around
after they've been modified.  The routine's not that big (especially
seeing that only half of it is actually shared by the two callers)
and it's just not worth the notational effort, let alone any portability
risks, to factor it out.
        regards, tom lane


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Wed, Apr 25, 2007 at 03:17:19PM -0400, Tom Lane wrote:
> Why in the world is that like that?  We don't have such a kluge
> anyplace else we use va_list.  stringinfo.c for instance has
> never needed any such thing.

I don't remember the exact details but this was added a long time ago
before 8.0 because we had some problems with one of the archs. I was
suprised about the differences back then too, but haven't checked
since.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Wed, Apr 25, 2007 at 04:38:30PM -0400, Tom Lane wrote:
> My recommendation is to get rid of the APREF hack, deal only in
> va_list not &va_list, and inline ECPGget_variable into the two
> places it's used to avoid the question of passing va_lists around
> after they've been modified.  The routine's not that big (especially
> seeing that only half of it is actually shared by the two callers)
> and it's just not worth the notational effort, let alone any portability
> risks, to factor it out.

Having spend countless hours debugging this stuff I fully agree with
you. It's not just ECPGget_variable though. I also had to inline
create_statement. This is only called once, so no big deal, but the
calling function gets bigger by quite a margin that way. It was
definitely easier to read with these functions, but again all this
hassle isn't worth it.

Attached you'll find a patch that should inline both functions and
remove the APREF stuff. This successfully runs the regression suite on
my Linux box. Please test it on those archs that needed special
treatment before I commit.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
ARGH!

This time with patch.

Michael
--
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

Attachment

Re: ECPG failure on BF member Vaquita (Windows Vista)

From
"Andrew Dunstan"
Date:
Michael Meskes wrote:
>
> Attached you'll find a patch that should inline both functions and
> remove the APREF stuff. This successfully runs the regression suite on
> my Linux box. Please test it on those archs that needed special
> treatment before I commit.
>

If you commit to HEAD it will be automatically tested on the buildfarm.

cheers

andrew



Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Thu, Apr 26, 2007 at 06:28:29AM -0500, Andrew Dunstan wrote:
> If you commit to HEAD it will be automatically tested on the buildfarm.

True. But it might also break a lot of other archs without helping on
those troubled ones. I thought this way would be better.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
"Mark Wong"
Date:
On 4/26/07, Michael Meskes <meskes@postgresql.org> wrote:
> On Wed, Apr 25, 2007 at 04:38:30PM -0400, Tom Lane wrote:
> > My recommendation is to get rid of the APREF hack, deal only in
> > va_list not &va_list, and inline ECPGget_variable into the two
> > places it's used to avoid the question of passing va_lists around
> > after they've been modified.  The routine's not that big (especially
> > seeing that only half of it is actually shared by the two callers)
> > and it's just not worth the notational effort, let alone any portability
> > risks, to factor it out.
>
> Having spend countless hours debugging this stuff I fully agree with
> you. It's not just ECPGget_variable though. I also had to inline
> create_statement. This is only called once, so no big deal, but the
> calling function gets bigger by quite a margin that way. It was
> definitely easier to read with these functions, but again all this
> hassle isn't worth it.
>
> Attached you'll find a patch that should inline both functions and
> remove the APREF stuff. This successfully runs the regression suite on
> my Linux box. Please test it on those archs that needed special
> treatment before I commit.

I applied it to REL8_2_STABLE/pgsql and ran 'run_build.pl --test
--keepall REL8_2_STABLE'.  Looks like it passed everything for me. :)

Regards,
Mark


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Tom Lane
Date:
Michael Meskes <meskes@postgresql.org> writes:
> Having spend countless hours debugging this stuff I fully agree with
> you. It's not just ECPGget_variable though. I also had to inline
> create_statement.

AFAICS you do not need to inline create_statement.  The risk factor
is where you call a routine that does something with a va_list, and
then you want to do something else (other than va_end) with that va_list
after it returns.  The one use of create_statement doesn't do that,
hence no problem.  (We know this works, because stringinfo.c does it.)

BTW, I think there is a small bug there: you fail to call va_end() in the
failure path.  I'm not sure if there are any modern machines where
va_end() isn't a no-op, but nonetheless the code isn't meeting the spec.
        regards, tom lane


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Thu, Apr 26, 2007 at 01:35:42PM -0400, Tom Lane wrote:
> AFAICS you do not need to inline create_statement.  The risk factor
> is where you call a routine that does something with a va_list, and
> then you want to do something else (other than va_end) with that va_list
> after it returns.  The one use of create_statement doesn't do that,
> hence no problem.  (We know this works, because stringinfo.c does it.)

Given all the problems I had so far with this code I'd prefer to keep it
inlined. :-)

> BTW, I think there is a small bug there: you fail to call va_end() in the
> failure path.  I'm not sure if there are any modern machines where
> va_end() isn't a no-op, but nonetheless the code isn't meeting the spec.

Fixed. Thanks.

Michael

-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG failure on BF member Vaquita (Windows Vista)

From
Michael Meskes
Date:
On Thu, Apr 26, 2007 at 10:29:47AM -0700, Mark Wong wrote:
> I applied it to REL8_2_STABLE/pgsql and ran 'run_build.pl --test
> --keepall REL8_2_STABLE'.  Looks like it passed everything for me. :)

Thanks for this test. I just committed the changes to CVS.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!