Thread: PostgreSQL libraries - PThread Support, but not use...
On a slightly related note to the other threads thread [sic] going on... Over the Christmas/New Year break i've been looking into making the PostgreSQL client libraries (in particular libpq and ecpg) thread-safe - that is they can safely be used by a program which itself is using mutliple threads. I'm largely there with ecpg (globals are protected, a sqlca for each thread), but of course it relies on libpq which needs work to share a connection between thrreads. Relying on thread experience from a few years back on Solaris I had assumed I could just use the pthread_* routines in the shared libraries and they would reference the weak symbols in libc, rather than explicitly pull in libpthread. If the application using the library linked to libpthreads then the 'real' thread routines would be activated, single threaded apps wouldn't need link to libpthread... However there seems to be no standard to which pthread_* routines are available as weak symbols in libc (Linux and Solaris differ). So a couple of questions to decide the course of this work: 1. It's looking likely that the libraries will need to link to libpthread, and as a result anything linking to the libraries will need to link to libpthread too. Will this be accepted in a patch? A similar problem has cropt up with the perl integration recently too (i.e. the Perl developers have decided to link in libpthread). 2. Is their any mileage in using an abstraction layer - ACE, npr? A Bit OTT for what i'm doing, but... 3. Am I wasting my time? I think making the PostgreSQL libraries thread-safe/aware is very worthwhile, but a lot of hurdles... Thanks, Lee.
Lee Kindness <lkindness@csl.co.uk> writes: > On a slightly related note to the other threads thread [sic] going > on... Over the Christmas/New Year break i've been looking into making > the PostgreSQL client libraries (in particular libpq and ecpg) > thread-safe - that is they can safely be used by a program which > itself is using mutliple threads. I'm largely there with ecpg (globals > are protected, a sqlca for each thread), but of course it relies on > libpq which needs work to share a connection between thrreads. AFAIK, libpq is thread-safe already, it's just not thread-aware. What you'd presumably want is a wrapper layer that adds a mutex to prevent multiple threads from manipulating a PGconn at the same time. Couldn't this be done without touching libpq at all? > 1. It's looking likely that the libraries will need to link to > libpthread, and as a result anything linking to the libraries will > need to link to libpthread too. Will this be accepted in a patch? If the patch requires pthread and not any other form of thread support, probably not. See nearby threading thread ;-) In general I'd be unhappy with doing anything to libpq that forces non-threaded clients to start depending on libpthread (or other thread libraries). Thus the idea of a wrapper seems more appealing to me. regards, tom lane
On Mon, Jan 06, 2003 at 11:58:17AM -0500, Tom Lane wrote: > > AFAIK, libpq is thread-safe already, it's just not thread-aware. > What you'd presumably want is a wrapper layer that adds a mutex to > prevent multiple threads from manipulating a PGconn at the same time. > Couldn't this be done without touching libpq at all? In fact it'd probably be better to do it without touching libpq at all, so the application can tune for the level of thread-safety it needs. There's no point in locking a PGresult for every time the application wants to read a field--it'd be unacceptably slow yet some applications may want to do it. But I'm sure this has been discussed in that other threading thread... Having a thread-aware wrapper multiplex a PGconn between multiple client threads using the nonblocking functions probably isn't going to wash either, because nonblocking operation isn't quite the same as fully asynchronous. And even if the backend protocol allows for it, there's no great benefit since the threads would still be waiting on the same socket and the same backend. Better to give each thread its own PGconn and its own file descriptor to block on, if needed. Jeroen
Tom Lane writes:> Lee Kindness <lkindness@csl.co.uk> writes:> > On a slightly related note to the other threads thread [sic]going> > on... Over the Christmas/New Year break i've been looking into making> > the PostgreSQL client libraries (inparticular libpq and ecpg)> > thread-safe - that is they can safely be used by a program which> > itself is using mutliplethreads. I'm largely there with ecpg (globals> > are protected, a sqlca for each thread), but of course it relieson> > libpq which needs work to share a connection between thrreads.> > AFAIK, libpq is thread-safe already, it's justnot thread-aware.> What you'd presumably want is a wrapper layer that adds a mutex to> prevent multiple threads frommanipulating a PGconn at the same time.> Couldn't this be done without touching libpq at all? Certainly, it could. I've not fully investigated the current failings of libpq WRT to threading yet. But it's certainly a bit more than I stated above. I don't know where the statement that libpq is thread safe originated from (I see it's on the website). but at a quick glance I believe that krb4, krb5, PQoidStatus(), PQsetClientEncoding(), winsock_strerror() need more though investigation and non-thread-safe fuctions are also being used (at a glance gethostbyname() and strtok()). > > 1. It's looking likely that the libraries will need to link to> > libpthread, and as a result anything linking to thelibraries will> > need to link to libpthread too. Will this be accepted in a patch?> If the patch requires pthread andnot any other form of thread support,> probably not. See nearby threading thread ;-)> In general I'd be unhappy withdoing anything to libpq that forces> non-threaded clients to start depending on libpthread (or other thread> libraries). Thus the idea of a wrapper seems more appealing to me. Okay, but what about ecpg? Thread-local sqlca instances would be a real benefit, guess Michael and Christof are the guys to talk to? I suppose the real problem is the needed baggage with this - the autoconf macros will be a nightmare! Thanks, Lee.
We have definatly had requests for improved thread-safeness for libpq and ecpg in the past, so whatever you can do would be a help. We say libpq is thread-safe, but specifically mention the non-threadsafe calls in the libpq documentation, or at least we should. We have been making marginal thread-safe improvements over the years. --------------------------------------------------------------------------- Lee Kindness wrote: > Tom Lane writes: > > Lee Kindness <lkindness@csl.co.uk> writes: > > > On a slightly related note to the other threads thread [sic] going > > > on... Over the Christmas/New Year break i've been looking into making > > > the PostgreSQL client libraries (in particular libpq and ecpg) > > > thread-safe - that is they can safely be used by a program which > > > itself is using mutliple threads. I'm largely there with ecpg (globals > > > are protected, a sqlca for each thread), but of course it relies on > > > libpq which needs work to share a connection between thrreads. > > > > AFAIK, libpq is thread-safe already, it's just not thread-aware. > > What you'd presumably want is a wrapper layer that adds a mutex to > > prevent multiple threads from manipulating a PGconn at the same time. > > Couldn't this be done without touching libpq at all? > > Certainly, it could. I've not fully investigated the current failings > of libpq WRT to threading yet. But it's certainly a bit more than I > stated above. I don't know where the statement that libpq is thread > safe originated from (I see it's on the website). but at a quick > glance I believe that krb4, krb5, PQoidStatus(), > PQsetClientEncoding(), winsock_strerror() need more though > investigation and non-thread-safe fuctions are also being used (at a > glance gethostbyname() and strtok()). > > > > 1. It's looking likely that the libraries will need to link to > > > libpthread, and as a result anything linking to the libraries will > > > need to link to libpthread too. Will this be accepted in a patch? > > If the patch requires pthread and not any other form of thread support, > > probably not. See nearby threading thread ;-) > > In general I'd be unhappy with doing anything to libpq that forces > > non-threaded clients to start depending on libpthread (or other thread > > libraries). Thus the idea of a wrapper seems more appealing to me. > > Okay, but what about ecpg? Thread-local sqlca instances would be a > real benefit, guess Michael and Christof are the guys to talk to? > > I suppose the real problem is the needed baggage with this - the > autoconf macros will be a nightmare! > > Thanks, Lee. > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > We have definatly had requests for improved thread-safeness for libpq > and ecpg in the past, so whatever you can do would be a help. We say > libpq is thread-safe, but specifically mention the non-threadsafe calls > in the libpq documentation, or at least we should. We do: http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html But Lee's point about depending on possibly-unsafe libc routines is a good one. I don't think anyone's gone through the code with an eye to that. regards, tom lane
On Monday 06 January 2003 12:28, Lee Kindness wrote: > Tom Lane writes: > > Lee Kindness <lkindness@csl.co.uk> writes: > > > are protected, a sqlca for each thread), but of course it relies on > > > libpq which needs work to share a connection between thrreads. > > AFAIK, libpq is thread-safe already, it's just not thread-aware. > > Couldn't this be done without touching libpq at all? > Certainly, it could. I've not fully investigated the current failings > of libpq WRT to threading yet. But it's certainly a bit more than I > stated above. I don't know where the statement that libpq is thread > safe originated from (I see it's on the website). but at a quick > glance I believe that krb4, krb5, PQoidStatus(), > PQsetClientEncoding(), winsock_strerror() need more though > investigation and non-thread-safe fuctions are also being used (at a > glance gethostbyname() and strtok()). Lee, see the AOLserver source code. AOLserver (www.aolserver.com) is a multithreaded dynamic web server that supports pooled database connections. One of the supported databases is PostgreSQL (I maintain the driver, currently). It's dual licensed (AOLserver Public License and GPL). I've not yet seen a problem that could be traced to libpq not being threadsafe. And AOLserver certainly would show a non-threadsafe problem, particularly with sites running OpenACS, which can easily beat the database to death. BTW: thanks for the Bison RPMs. And I believe the PGDG is appropriate as well. :-) -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
Tom Lane writes:> Bruce Momjian <pgman@candle.pha.pa.us> writes:> > We have definatly had requests for improved thread-safenessfor libpq> > and ecpg in the past, so whatever you can do would be a help. We say> > libpq is thread-safe,but specifically mention the non-threadsafe calls> > in the libpq documentation, or at least we should.> We do:>http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html> But Lee's point about depending onpossibly-unsafe libc routines is a> good one. I don't think anyone's gone through the code with an eye to> that. Right, so a reasonable angle for me to take is to go through the libpq source looking for potential problem areas and use of "known bad" functions. I can add autoconf checks for the replacement *_r() functions, and use these in place of the traditional ones where available. If any function is found to be not thread-safe and cannot be made so using standard library calls then it needs to be documented as such both in the source and the aforementioned documentation. This approach avoids any thread library dependencies and documents the current state of play WRT thread safety (i.e it's a good, and needed, basis for any later work). ECPG is a separate issue, and best handled as such (it will need thread calls). I'll post a patch for it at a later date so the changes are available to anyone who wants to play with ECPG and threads. Ta, Lee.
Lee Kindness wrote: > Tom Lane writes: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > We have definatly had requests for improved thread-safeness for libpq > > > and ecpg in the past, so whatever you can do would be a help. We say > > > libpq is thread-safe, but specifically mention the non-threadsafe calls > > > in the libpq documentation, or at least we should. > > We do: > > http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html > > But Lee's point about depending on possibly-unsafe libc routines is a > > good one. I don't think anyone's gone through the code with an eye to > > that. > > Right, so a reasonable angle for me to take is to go through the libpq > source looking for potential problem areas and use of "known bad" > functions. I can add autoconf checks for the replacement *_r() > functions, and use these in place of the traditional ones where > available. I am a little confused by the *_r functions. Are they for all functions? BSD/OS doesn't have them, but all our libc functions are threadsafe except for things like strtok, where they recommend strsep, and gethostbyname, where they would suggest getaddrinfo, I guess. > If any function is found to be not thread-safe and cannot be made so > using standard library calls then it needs to be documented as such > both in the source and the aforementioned documentation. Ideally we will find we can get them all fixed in some way. > This approach avoids any thread library dependencies and documents the > current state of play WRT thread safety (i.e it's a good, and needed, > basis for any later work). Yes, good idea. > ECPG is a separate issue, and best handled as such (it will need > thread calls). I'll post a patch for it at a later date so the changes > are available to anyone who wants to play with ECPG and threads. Yes, needs to be done too. Someone was complaining about ecpg not being thread safe several months ago. I don't remember the details. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian writes:> Lee Kindness wrote:> > Right, so a reasonable angle for me to take is to go through the libpq> > sourcelooking for potential problem areas and use of "known bad"> > functions. I can add autoconf checks for the replacement*_r()> > functions, and use these in place of the traditional ones where> > available.> I am a little confusedby the *_r functions. Are they for all> functions? BSD/OS doesn't have them, but all our libc functions are> threadsafeexcept for things like strtok, where they recommend strsep,> and gethostbyname, where they would suggest getaddrinfo,I guess. Some functions in the C library (and other common system libraries) are defined in such a way to make their implementation non-reentrant. Normally this is due to return values being supplied in a static buffer which is overwritten by the subsequent call, or the calls relying on static data between calls. A list such functions would include: asctime crypt ctime drand48 ecvt encrypt erand48 fcvt fgetgrentfgetpwent fgetspent getaliasbyname getaliasent getdate getgrentgetgrgidgetgrnam gethostbyaddr gethostbyname gethostbyname2gethostent getlogin getnetbyaddr getnetbyname getnetentgetnetgrentgetprotobyname getprotobynumber getprotoent getpwent getpwnam getpwuidgetservbyname getservbyport getserventgetspent getspnam getutentgetutid getutline gmtime hcreate hdestroy hsearch initstate jrand48lcong48 localtimelrand48 mrand48 nrand48 ptsname qecvt qfcvt randrandom readdir readdir64 seed48 setkey setstate sgetspent srand48srandomstrerror strtok tmpnam ttyname to one degree or another. The important ones to watch for are: ctime, localtime, asctime, gmtime, readdir, strtok and tmpnam. Now these functions are often augmented by a _r partner which fixes their API to allow their implementations to be reentrant. After a quick grep libpq could be using crypt, gethostbyname, random, strerror, encrypt, getpwuid, rand and strtok. As you rightly note, ins ome cases the correct fix is to use alternative functions and not the _r versions - this avoids lots of ifdefs! L.
Guys, just posted patches for libpq to address thread-safe issues. Now uses strerror_r(), gethostbyname_r(), getpwuid_r() and crypt_r() where available. Passes all tests on Linux and Solaris. Any comments let me know - my first major(ish) outing in the PostgreSQL source, in particular I'm not use in the configure stuff is 100% right... Patches also at: http://services.csl.co.uk/postgresql/diffs-20030109-configure.txt.gzhttp://services.csl.co.uk/postgresql/diffs-20030109-libpq.txt.gz Thanks, Lee. Lee Kindness writes:> Tom Lane writes:> > Bruce Momjian <pgman@candle.pha.pa.us> writes:> > > We have definatly had requestsfor improved thread-safeness for libpq> > > and ecpg in the past, so whatever you can do would be a help. We say> > > libpq is thread-safe, but specifically mention the non-threadsafe calls> > > in the libpq documentation, or atleast we should.> > We do:> > http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html> > ButLee's point about depending on possibly-unsafe libc routines is a> > good one. I don't think anyone's gone through thecode with an eye to> > that.> > Right, so a reasonable angle for me to take is to go through the libpq> source lookingfor potential problem areas and use of "known bad"> functions. I can add autoconf checks for the replacement *_r()>functions, and use these in place of the traditional ones where> available.> > If any function is found to be not thread-safeand cannot be made so> using standard library calls then it needs to be documented as such> both in the sourceand the aforementioned documentation.> > This approach avoids any thread library dependencies and documents the> currentstate of play WRT thread safety (i.e it's a good, and needed,> basis for any later work).> > ECPG is a separate issue,and best handled as such (it will need> thread calls). I'll post a patch for it at a later date so the changes> areavailable to anyone who wants to play with ECPG and threads.> > Ta, Lee.
Tom, Tom Lane writes: > Lee Kindness <lkindness@csl.co.uk> writes: > > + #define _THREAD_SAFE > > + #define _REENTRANT > > + #define _POSIX_PTHREAD_SEMANTICS > What is this stuff, and why isn't it wrapped in any sort of > platform-specific test? If it's needed, why is it in only one .c > file? It's actually in libpq-int.h too... The correct way for this is to compile with the compilers specific thread flags, however the downside to this has already been discussed. Depending on the system one, or a combination of those flags will turn on some magic such as errno being a function call rather than a global variable. This is needed to make the library thread safe. On a second look libpq-int.h isn't the best place for this (hence it also appears in one of the C files), it needs to be done in each C file before any of the system headers are included - a libpq-threads.h header? Would this be ok? Do do things 100% right we'd need to detect compiler thread flags and compile with them... > Also, haven't you broken SOCK_STRERROR for the Windows case? Sorry, I seem to have forgotton to update the prototype in win32.h to match the updated function. Updated diff attached (and online). Lee.