Re: Solaris ecpg program doesn't work - pulling my hair - Mailing list pgsql-general
From | Jan Wieck |
---|---|
Subject | Re: Solaris ecpg program doesn't work - pulling my hair |
Date | |
Msg-id | 4063360A.3040103@Yahoo.com |
In response to | Re: Solaris ecpg program doesn't work - pulling my hair (<wespvp@syntegra.com>) |
Responses |
Re: Solaris ecpg program doesn't work - pulling my hair
Re: Solaris ecpg program doesn't work - pulling my hair |
List | pgsql-general |
wespvp@syntegra.com wrote:
>> We had this in the past. I'm not sure and would have to search the
>> archives but I vaguely remember that this has been a threading bug in
>> the Solaris version. Could you please try using 7.4.2 or cvs head where
>> this should be fixed. Alternatively you could try with threading
>> disabled.
>
> I verified last night that this problem also occurs with 7.4.2. I did some
> more extensive testing on the solution in my previous follow-up email. That
> is definitely the problem - configure is setting "-pthread" instead of
> "-lpthread" in config.status. After manually correcting this in
> config.status, everything works properly.

As stated before, this is not true. If you don't compile with
-D_REENTRANT, /usr/include/errno.h declares errno as

    extern int errno;

instead of the thread-safe

    extern int *___errno();
    #define errno *(___errno())

At least it does so here on Solaris 8. That leads to libpq using the
global errno variable, which might or might not be the one where "your"
error is in a multithreaded program.

I mailed the correct solution as a follow-up to the other thread earlier
today as a patch against 7.4.2.

>
> I don't know enough about configure to know how to fix configure. It is
> properly setting -lpthread on linux.

Just linking against the right libraries does not do it here. Solaris is
not Linux.


Jan

>
>
> It's also not clear why the symptoms occur since the build does not abort
> with an unsatisfied external. It must be picking up the pthread externals
> from somewhere else? The only difference I can see in the ldd's is the order
> of the libraries.
> An ldd of ecpglib shows:
>
> Good:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % ldd libecpg.so
>         libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
>         libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
>         libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
>         libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
>         libm.so.1 => /usr/lib/libm.so.1
>         libpthread.so.1 => /usr/lib/libpthread.so.1
>         libresolv.so.2 => /usr/lib/libresolv.so.2
>         libsocket.so.1 => /usr/lib/libsocket.so.1
>         libnsl.so.1 => /usr/lib/libnsl.so.1
>         libdl.so.1 => /usr/lib/libdl.so.1
>         libc.so.1 => /usr/lib/libc.so.1
>         libmp.so.2 => /usr/lib/libmp.so.2
>         libthread.so.1 => /usr/lib/libthread.so.1
>         /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
> Bad:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> gcc: unrecognized option `-pthread'
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % !ldd
> ldd libecpg.so
>         libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
>         libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
>         libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
>         libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
>         libm.so.1 => /usr/lib/libm.so.1
>         libresolv.so.2 => /usr/lib/libresolv.so.2
>         libsocket.so.1 => /usr/lib/libsocket.so.1
>         libnsl.so.1 => /usr/lib/libnsl.so.1
>         libpthread.so.1 => /usr/lib/libpthread.so.1
>         libdl.so.1 => /usr/lib/libdl.so.1
>         libc.so.1 => /usr/lib/libc.so.1
>         libmp.so.2 => /usr/lib/libmp.so.2
>         libthread.so.1 => /usr/lib/libthread.so.1
>         /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
>
>
> I realize it isn't entirely meaningful without the source code to know
> exactly where I put the print statements, but here is my debug output
> running the previously enclosed test program. You can see that it is
> allocating a new sqlca structure when it shouldn't be.
>
>
> Good:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGINIT: address of sqlca = 0x23b98
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23b98
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x23b98
> After ECPGraise, sqlca->sqlcode = -402
> ECPGconnect: address of sqlca = 0x23b98
> Before return false, sqlca->sqlcode = -402
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> Connect failure: -402
>
>
>
> Bad:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23900
> ECPGget_sqlca: before return: address of sqlca = 0x23900
> ECPGINIT: address of sqlca = 0x23900
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23900
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23900
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x251b0
> ECPGget_sqlca: before return: address of sqlca = 0x251b0
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x251b0
> After ECPGraise, sqlca->sqlcode = 0
> ECPGconnect: address of sqlca = 0x23900
> Before return false, sqlca->sqlcode = 0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25248
> ECPGget_sqlca: before return: address of sqlca = 0x25248
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: before return: address of sqlca = 0x252e0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25378
> ECPGget_sqlca: before return: address of sqlca = 0x25378
> In error.c - code = -220
> ECPGraise: address of sqlca = 0x25378
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25410
> ECPGget_sqlca: before return: address of sqlca = 0x25410
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x254a8
> ECPGget_sqlca: before return: address of sqlca = 0x254a8
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25540
> ECPGget_sqlca: before return: address of sqlca = 0x25540
> SELECT error code: 0
> systemNum = -4261248
>
> I just got this in response to a post to pgsql-general on a different
> Solaris problem. This sounds like the same problem as I'm seeing. I've
> sent him my solution. Hopefully it will solve his symptoms also.
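For comparison with the quoted debug output, where pthread_getspecific keeps returning 0x0 in the bad build, here is a minimal sketch of how thread-specific data is expected to behave once the first value has been stored. This is not the ecpg source. The names demo_sqlca, demo_key and demo_get_sqlca are made up for illustration; only the pthread calls are the standard POSIX ones.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real sqlca structure. */
struct demo_sqlca
{
    long sqlcode;
};

static pthread_key_t  demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; free() releases a thread's copy on thread exit. */
static void demo_key_init(void)
{
    int rc = pthread_key_create(&demo_key, free);

    assert(rc == 0);
    (void) rc;
}

/*
 * Return the calling thread's private structure, allocating it on first
 * use. Every later call from the same thread must return the same
 * address, because pthread_getspecific finds the stored pointer.
 */
struct demo_sqlca *demo_get_sqlca(void)
{
    struct demo_sqlca *p;

    pthread_once(&demo_once, demo_key_init);

    p = pthread_getspecific(demo_key);
    if (p == NULL)
    {
        p = calloc(1, sizeof(*p));
        if (p == NULL)
            return NULL;
        if (pthread_setspecific(demo_key, p) != 0)
        {
            free(p);
            return NULL;
        }
    }
    return p;
}
```

Two consecutive calls to demo_get_sqlca() from one thread should hand back the same pointer, which is the pattern in the "Good" run. A fresh address on every call, as in the "Bad" run, means the stored value is never found again.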
>
>>> One other problem I am looking into (and why I tried to compile with
>>> thread safety in the first place) is that this somehow did not turn on
>>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
>>> the threadsafe definition of errno, leading to serious communication
>>> trouble in the end (pqReadData() failing with ENOENT while the real
>>> error is a harmless EAGAIN from a nonblocking recv()).
>>>
>>>
>>> Jan
>
>
> Wes
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
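
To make the errno point concrete, here is a small sketch that checks whether errno is per-thread in a given build. It is an illustration only, not part of the patch mentioned above. The function names are made up, and it assumes a compiler with C11 <stdatomic.h>, which the compilers of that era did not ship. On Solaris 8 the file would have to be compiled with -D_REENTRANT, as described in the message, to get the thread-safe definition.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int go;   /* main -> worker: main has stored its errno  */
static atomic_int done; /* worker -> main: worker has written errno   */

/* Second thread: overwrite errno as seen by this thread. */
static void *clobber_errno(void *arg)
{
    (void) arg;
    while (!atomic_load(&go))
        ;               /* wait until main has stored EAGAIN */
    errno = ENOENT;
    atomic_store(&done, 1);
    return NULL;
}

/*
 * Returns 1 if the caller's errno survives a write from another thread
 * (thread-safe errno), 0 if the other thread's ENOENT shows up instead
 * (one global errno shared by all threads).
 */
int errno_is_per_thread(void)
{
    pthread_t tid;
    int       rc;
    int       seen;

    atomic_store(&go, 0);
    atomic_store(&done, 0);

    rc = pthread_create(&tid, NULL, clobber_errno, NULL);
    assert(rc == 0);
    (void) rc;

    errno = EAGAIN;     /* the "harmless" value the caller expects */
    atomic_store(&go, 1);
    while (!atomic_load(&done))
        ;               /* no library call between store and read */
    seen = errno;

    pthread_join(tid, NULL);
    return seen == EAGAIN;
}
```

The busy-wait flags are there so that no library function runs in the calling thread between storing EAGAIN and reading errno back. With a shared global errno the function would be expected to return 0, which corresponds to the ENOENT-instead-of-EAGAIN symptom described for pqReadData().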