Re: Solaris ecpg program doesn't work - pulling my hair - Mailing list pgsql-general

From Jan Wieck
Subject Re: Solaris ecpg program doesn't work - pulling my hair
Date
Msg-id 4063360A.3040103@Yahoo.com
In response to Re: Solaris ecpg program doesn't work - pulling my hair  (<wespvp@syntegra.com>)
Responses Re: Solaris ecpg program doesn't work - pulling my hair
Re: Solaris ecpg program doesn't work - pulling my hair
List pgsql-general
wespvp@syntegra.com wrote:

>> We had this in the past. I'm not sure, and would have to search the
>> archives, but I vaguely remember that this was a threading bug in
>> the Solaris version. Could you please try 7.4.2 or CVS head, where
>> this should be fixed? Alternatively, you could try with threading
>> disabled.
>
> I verified last night that this problem also occurs with 7.4.2.  I did some
> more extensive testing on the solution in my previous follow-up email.  That
> is definitely the problem - configure is setting "-pthread" instead of
> "-lpthread" in config.status.  After manually correcting this in
> config.status, everything works properly.

As stated before, this is not true. If you don't compile with
-D_REENTRANT, /usr/include/errno.h declares errno as

     extern int errno;

instead of the thread safe

     extern int *___errno();
     #define errno *(___errno())

At least it does so here on Solaris 8. That leads to libpq using the
global errno variable, which in a multithreaded program might or might
not be the one that holds "your" error. I mailed the correct solution
earlier today, as a patch against 7.4.2, in a follow-up to the other thread.
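
To make the difference concrete, here is a minimal sketch (it is not the
patch mentioned above) that checks whether errno is per-thread. The names
record_errno_address and errno_is_per_thread are hypothetical, and the
sketch assumes POSIX threads and that defining _REENTRANT before the first
system header has the same effect as passing -D_REENTRANT on the command
line.

```c
/* Assumption: equivalent to compiling with -D_REENTRANT. */
#ifndef _REENTRANT
#define _REENTRANT 1
#endif

#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Worker thread: store the address that errno resolves to in this thread. */
static void *record_errno_address(void *arg)
{
    int **slot = arg;

    *slot = &errno;
    return NULL;
}

/*
 * Returns 1 if the worker thread saw a different errno location than the
 * calling thread (thread-safe errno), 0 if both share one global variable,
 * and -1 if the worker thread could not be started or joined.
 */
int errno_is_per_thread(void)
{
    pthread_t worker;
    int *caller_addr = &errno;
    int *worker_addr = NULL;

    if (pthread_create(&worker, NULL, record_errno_address, &worker_addr) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;

    return worker_addr != caller_addr;
}
```

Calling errno_is_per_thread() from main() and printing the result should
show whether a given set of compiler flags selects the thread-safe
definition. With the plain "extern int errno;" declaration, both threads
take the address of the same variable and the function returns 0.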

>
> I don't know enough about configure to know how to fix configure.  It is
> properly setting -lpthread on linux.

Just linking against the right libraries does not do it here. Solaris is
not Linux.


Jan

>
>
> It's also not clear why the symptoms occur, since the build does not abort
> with an unsatisfied external.  It must be picking up the pthread externals
> from somewhere else?  The only difference I can see in the ldd output is the
> order of the libraries.  An ldd of ecpglib shows:
>
> Good:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % ldd libecpg.so
>         libpgtypes.so.1 =>
> /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
>         libpq.so.3 =>    /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
>         libssl.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
>         libcrypto.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
>         libm.so.1 =>     /usr/lib/libm.so.1
>         libpthread.so.1 =>       /usr/lib/libpthread.so.1
>         libresolv.so.2 =>        /usr/lib/libresolv.so.2
>         libsocket.so.1 =>        /usr/lib/libsocket.so.1
>         libnsl.so.1 =>   /usr/lib/libnsl.so.1
>         libdl.so.1 =>    /usr/lib/libdl.so.1
>         libc.so.1 =>     /usr/lib/libc.so.1
>         libmp.so.2 =>    /usr/lib/libmp.so.2
>         libthread.so.1 =>        /usr/lib/libthread.so.1
>         /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
> Bad:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> gcc: unrecognized option `-pthread'
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % !ldd
> ldd libecpg.so
>         libpgtypes.so.1 =>
> /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
>         libpq.so.3 =>    /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
>         libssl.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
>         libcrypto.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
>         libm.so.1 =>     /usr/lib/libm.so.1
>         libresolv.so.2 =>        /usr/lib/libresolv.so.2
>         libsocket.so.1 =>        /usr/lib/libsocket.so.1
>         libnsl.so.1 =>   /usr/lib/libnsl.so.1
>         libpthread.so.1 =>       /usr/lib/libpthread.so.1
>         libdl.so.1 =>    /usr/lib/libdl.so.1
>         libc.so.1 =>     /usr/lib/libc.so.1
>         libmp.so.2 =>    /usr/lib/libmp.so.2
>         libthread.so.1 =>        /usr/lib/libthread.so.1
>         /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
>
>
> I realize it isn't entirely meaningful without the source code to know
> exactly where I put the print statements, but here is my debug output
> running the previously enclosed test program.  You can see that it is
> allocating a new sqlca structure when it shouldn't be.
>
>
> Good:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGINIT: address of sqlca = 0x23b98
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23b98
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x23b98
> After ECPGraise, sqlca->sqlcode = -402
> ECPGconnect: address of sqlca = 0x23b98
> Before return false, sqlca->sqlcode = -402
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> Connect failure: -402
>
>
>
> Bad:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23900
> ECPGget_sqlca: before return: address of sqlca = 0x23900
> ECPGINIT: address of sqlca = 0x23900
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23900
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23900
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x251b0
> ECPGget_sqlca: before return: address of sqlca = 0x251b0
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x251b0
> After ECPGraise, sqlca->sqlcode = 0
> ECPGconnect: address of sqlca = 0x23900
> Before return false, sqlca->sqlcode = 0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25248
> ECPGget_sqlca: before return: address of sqlca = 0x25248
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: before return: address of sqlca = 0x252e0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25378
> ECPGget_sqlca: before return: address of sqlca = 0x25378
> In error.c - code = -220
> ECPGraise: address of sqlca = 0x25378
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25410
> ECPGget_sqlca: before return: address of sqlca = 0x25410
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x254a8
> ECPGget_sqlca: before return: address of sqlca = 0x254a8
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25540
> ECPGget_sqlca: before return: address of sqlca = 0x25540
> SELECT error code: 0
> systemNum = -4261248
>
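
The Good trace is what lazily initialized thread-specific storage looks
like when it works: the first pthread_getspecific() returns 0x0, a struct
is allocated, and every later lookup returns that same address. In the Bad
trace every lookup returns 0x0, so ECPGraise stores -402 in a struct that
the caller never looks at. A minimal sketch of the pattern follows. It is
not the ecpglib source; struct demo_sqlca, demo_make_key and
demo_get_sqlca are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real sqlca structure. */
struct demo_sqlca {
    long sqlcode;
};

static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* Runs once per process; free() releases a thread's struct when it exits. */
static void demo_make_key(void)
{
    (void) pthread_key_create(&demo_key, free);
}

/*
 * Return the calling thread's struct, allocating it on first use.
 * Repeated calls from the same thread must return the same address.
 * Returns NULL if allocation or pthread_setspecific() fails.
 */
struct demo_sqlca *demo_get_sqlca(void)
{
    struct demo_sqlca *sqlca;

    pthread_once(&demo_once, demo_make_key);

    sqlca = pthread_getspecific(demo_key);
    if (sqlca == NULL) {
        sqlca = calloc(1, sizeof(*sqlca));
        if (sqlca != NULL && pthread_setspecific(demo_key, sqlca) != 0) {
            free(sqlca);
            sqlca = NULL;
        }
    }
    return sqlca;
}
```

If pthread_setspecific() or pthread_getspecific() silently does nothing,
each call allocates a fresh struct, which matches the ever-changing
addresses in the Bad trace.
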
> I just got this in response to a post to pgsql-general on a different
> Solaris problem.  This sounds like the same problem as I'm seeing.  I've
> sent him my solution.  Hopefully it will solve his symptoms also.
>
>>> One other problem I am looking into (and why I tried to compile with
>>> thread safety in the first place) is that this somehow did not turn on
>>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
>>> the threadsafe definition of errno, leading to serious communication
>>> trouble in the end (pqReadData() failing with ENOENT while the real
>>> error is a harmless EAGAIN from a nonblocking recv()).
>>>
>>>
>>> Jan
>
>
> Wes
>
>


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

