Re: Solaris ecpg program doesn't work - pulling my hair - Mailing list pgsql-general
From | Bruce Momjian |
---|---|
Subject | Re: Solaris ecpg program doesn't work - pulling my hair |
Date | |
Msg-id | 200406100234.i5A2Yj826547@candle.pha.pa.us |
In response to | Re: Solaris ecpg program doesn't work - pulling my hair (Jan Wieck <JanWieck@Yahoo.com>) |
List | pgsql-general |
Jan, is this fixed in current CVS and 7.4.X CVS?

---------------------------------------------------------------------------

Jan Wieck wrote:
> wespvp@syntegra.com wrote:
>
> >> We had this in the past. I'm not sure and would have to search the
> >> archives but I vaguely remember that this has been a threading bug in
> >> the Solaris version. Could you please try using 7.4.2 or cvs head where
> >> this should be fixed. Alternatively you could try with threading
> >> disabled.
> >
> > I verified last night that this problem also occurs with 7.4.2. I did some
> > more extensive testing on the solution in my previous follow-up email. That
> > is definitely the problem - configure is setting "-pthread" instead of
> > "-lpthread" in config.status. After manually correcting this in
> > config.status, everything works properly.
>
> As stated before, this is not true. If you don't compile with
> -D_REENTRANT, /usr/include/errno.h declares errno as
>
>     extern int errno;
>
> instead of the thread safe
>
>     extern int *___errno();
>     #define errno *(___errno())
>
> At least it does so here on Solaris 8. That leads to libpq using the
> global errno variable, which might or might not be the one where "your"
> error is in a multithreaded program. I mailed the correct solution as a
> follow up to the other thread earlier today as a patch against 7.4.2.
>
> >
> > I don't know enough about configure to know how to fix configure. It is
> > properly setting -lpthread on linux.
>
> Just linking against the right libraries does not do it here. Solaris is
> not Linux.
>
>
> Jan
>
> >
> > It's also not clear why the symptoms occur since the build does not abort
> > with an unsatisfied external. It must be picking up the pthread externals
> > from somewhere else? The only difference I can see in the ldd's is the order
> > of the libraries. An ldd of ecpglib shows:
> >
> > Good:
> >
> > gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> > prepare.o memory.o connect.o misc.o -L../../../../src/port
> > -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> > -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
> > -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> > rm -f libecpg.so.4
> > ln -s libecpg.so.4.1 libecpg.so.4
> > rm -f libecpg.so
> > ln -s libecpg.so.4.1 libecpg.so
> >
> > % ldd libecpg.so
> >     libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> >     libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> >     libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> >     libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> >     libm.so.1 => /usr/lib/libm.so.1
> >     libpthread.so.1 => /usr/lib/libpthread.so.1
> >     libresolv.so.2 => /usr/lib/libresolv.so.2
> >     libsocket.so.1 => /usr/lib/libsocket.so.1
> >     libnsl.so.1 => /usr/lib/libnsl.so.1
> >     libdl.so.1 => /usr/lib/libdl.so.1
> >     libc.so.1 => /usr/lib/libc.so.1
> >     libmp.so.2 => /usr/lib/libmp.so.2
> >     libthread.so.1 => /usr/lib/libthread.so.1
> >     /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
> >
> > Bad:
> >
> > gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> > prepare.o memory.o connect.o misc.o -L../../../../src/port
> > -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> > -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
> > -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> > gcc: unrecognized option `-pthread'
> > rm -f libecpg.so.4
> > ln -s libecpg.so.4.1 libecpg.so.4
> > rm -f libecpg.so
> > ln -s libecpg.so.4.1 libecpg.so
> >
> > % !ldd
> > ldd libecpg.so
> >     libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> >     libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> >     libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> >     libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> >     libm.so.1 => /usr/lib/libm.so.1
> >     libresolv.so.2 => /usr/lib/libresolv.so.2
> >     libsocket.so.1 => /usr/lib/libsocket.so.1
> >     libnsl.so.1 => /usr/lib/libnsl.so.1
> >     libpthread.so.1 => /usr/lib/libpthread.so.1
> >     libdl.so.1 => /usr/lib/libdl.so.1
> >     libc.so.1 => /usr/lib/libc.so.1
> >     libmp.so.2 => /usr/lib/libmp.so.2
> >     libthread.so.1 => /usr/lib/libthread.so.1
> >     /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
> >
> >
> > I realize it isn't entirely meaningful without the source code to know
> > exactly where I put the print statements, but here is my debug output
> > running the previously enclosed test program. You can see that it is
> > allocating a new sqlca structure when it shouldn't be.
> >
> >
> > Good:
> >
> >
> > % ./testit
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > ECPGINIT: address of sqlca = 0x23b98
> > In ECPGconnect
> > ECPGconnect: address of sqlca = 0x23b98
> > Before connection check
> > bad connection
> > ECPGconnect: address of sqlca = 0x23b98
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > In error.c - code = -402
> > ECPGraise: address of sqlca = 0x23b98
> > After ECPGraise, sqlca->sqlcode = -402
> > ECPGconnect: address of sqlca = 0x23b98
> > Before return false, sqlca->sqlcode = -402
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > Connect failure: -402
> >
> >
> >
> > Bad:
> >
> >
> > % ./testit
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x23900
> > ECPGget_sqlca: before return: address of sqlca = 0x23900
> > ECPGINIT: address of sqlca = 0x23900
> > In ECPGconnect
> > ECPGconnect: address of sqlca = 0x23900
> > Before connection check
> > bad connection
> > ECPGconnect: address of sqlca = 0x23900
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x251b0
> > ECPGget_sqlca: before return: address of sqlca = 0x251b0
> > In error.c - code = -402
> > ECPGraise: address of sqlca = 0x251b0
> > After ECPGraise, sqlca->sqlcode = 0
> > ECPGconnect: address of sqlca = 0x23900
> > Before return false, sqlca->sqlcode = 0
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25248
> > ECPGget_sqlca: before return: address of sqlca = 0x25248
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x252e0
> > ECPGget_sqlca: before return: address of sqlca = 0x252e0
> > ECPGINIT: address of sqlca = 0x252e0
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25378
> > ECPGget_sqlca: before return: address of sqlca = 0x25378
> > In error.c - code = -220
> > ECPGraise: address of sqlca = 0x25378
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25410
> > ECPGget_sqlca: before return: address of sqlca = 0x25410
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x254a8
> > ECPGget_sqlca: before return: address of sqlca = 0x254a8
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25540
> > ECPGget_sqlca: before return: address of sqlca = 0x25540
> > SELECT error code: 0
> > systemNum = -4261248
> >
> > I just got this in response to a post to pgsql-general on a different
> > Solaris problem. This sounds like the same problem as I'm seeing. I've
> > sent him my solution. Hopefully it will solve his symptoms also.
> >
> >>> One other problem I am looking into (and why I tried to compile with
> >>> thread safety in the first place) is that this somehow did not turn on
> >>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
> >>> the threadsafe definition of errno, leading to serious communication
> >>> trouble in the end (pqReadData() failing with ENOENT while the real
> >>> error is a harmless EAGAIN from a nonblocking recv()).
> >>>
> >>>
> >>> Jan
> >
> >
> > Wes
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
>
>
> --
> #======================================================================#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me.                                  #
> #================================================== JanWieck@Yahoo.com #
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
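
For anyone who wants to check which errno declaration their own build picks up, here is a minimal sketch of the point Jan makes about -D_REENTRANT. It assumes a POSIX threads environment; worker() and errno_is_per_thread() are hypothetical names made up for this illustration and are not part of libpq or ecpg. The idea is only to compare the address of errno in two threads: with the thread-safe definition each thread gets its own location, while with a plain "extern int errno;" both threads would see the same address.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper, not PostgreSQL code: runs in a second thread and
 * reports the address that "errno" resolves to in that thread. */
static void *worker(void *arg)
{
    int **slot = arg;

    *slot = &errno;
    return NULL;
}

/* Returns 1 if the two threads see different errno locations (thread-safe
 * definition in effect), 0 if they share one global errno, and -1 if the
 * thread could not be created or joined. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    int *worker_errno = NULL;
    int *main_errno = &errno;

    if (pthread_create(&tid, NULL, worker, &worker_errno) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return worker_errno != main_errno ? 1 : 0;
}
```

Called from a small main() and built with the same flags as libpq, a result of 0 would suggest the non-thread-safe declaration is in use, which is the situation Jan describes on Solaris 8 when -D_REENTRANT is missing. Whether a given platform needs that define at all is something to verify locally.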