Thread: PostgreSQL libraries - PThread Support, but not use...

PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
On a slightly related note to the other threads thread [sic] going
on... Over the Christmas/New Year break i've been looking into making
the PostgreSQL client libraries (in particular libpq and ecpg)
thread-safe - that is they can safely be used by a program which
itself is using mutliple threads. I'm largely there with ecpg (globals
are protected, a sqlca for each thread), but of course it relies on
libpq which needs work to share a connection between thrreads.

Relying on thread experience from a few years back on Solaris I had
assumed I could just use the pthread_* routines in the shared
libraries and they would reference the weak symbols in libc, rather
than explicitly pull in libpthread. If the application using the
library linked to libpthreads then the 'real' thread routines would be
activated, single threaded apps wouldn't need link to
libpthread... However there seems to be no standard to which pthread_*
routines are available as weak symbols in libc (Linux and Solaris
differ).

So a couple of questions to decide the course of this work:

1. It's looking likely that the libraries will need to link to
libpthread, and as a result anything linking to the libraries will
need to link to libpthread too. Will this be accepted in a patch? A
similar problem has cropt up with the perl integration recently too
(i.e. the Perl developers have decided to link in libpthread).

2. Is their any mileage in using an abstraction layer - ACE, npr? A
Bit OTT for what i'm doing, but...

3. Am I wasting my time?

I think making the PostgreSQL libraries thread-safe/aware is very
worthwhile, but a lot of hurdles...

Thanks, Lee.


Re: PostgreSQL libraries - PThread Support, but not use...

From
Tom Lane
Date:
Lee Kindness <lkindness@csl.co.uk> writes:
> On a slightly related note to the other threads thread [sic] going
> on... Over the Christmas/New Year break i've been looking into making
> the PostgreSQL client libraries (in particular libpq and ecpg)
> thread-safe - that is they can safely be used by a program which
> itself is using mutliple threads. I'm largely there with ecpg (globals
> are protected, a sqlca for each thread), but of course it relies on
> libpq which needs work to share a connection between thrreads.

AFAIK, libpq is thread-safe already, it's just not thread-aware.
What you'd presumably want is a wrapper layer that adds a mutex to
prevent multiple threads from manipulating a PGconn at the same time.
Couldn't this be done without touching libpq at all?

> 1. It's looking likely that the libraries will need to link to
> libpthread, and as a result anything linking to the libraries will
> need to link to libpthread too. Will this be accepted in a patch?

If the patch requires pthread and not any other form of thread support,
probably not.  See nearby threading thread ;-)

In general I'd be unhappy with doing anything to libpq that forces
non-threaded clients to start depending on libpthread (or other thread
libraries).  Thus the idea of a wrapper seems more appealing to me.
        regards, tom lane


Re: PostgreSQL libraries - PThread Support, but not use...

From
"Jeroen T. Vermeulen"
Date:
On Mon, Jan 06, 2003 at 11:58:17AM -0500, Tom Lane wrote:
> 
> AFAIK, libpq is thread-safe already, it's just not thread-aware.
> What you'd presumably want is a wrapper layer that adds a mutex to
> prevent multiple threads from manipulating a PGconn at the same time.
> Couldn't this be done without touching libpq at all?
In fact it'd probably be better to do it without touching libpq at all,
so the application can tune for the level of thread-safety it needs.
There's no point in locking a PGresult for every time the application
wants to read a field--it'd be unacceptably slow yet some applications 
may want to do it.  But I'm sure this has been discussed in that other
threading thread...

Having a thread-aware wrapper multiplex a PGconn between multiple
client threads using the nonblocking functions probably isn't going to 
wash either, because nonblocking operation isn't quite the same as fully 
asynchronous.  And even if the backend protocol allows for it, there's
no great benefit since the threads would still be waiting on the same
socket and the same backend.  Better to give each thread its own PGconn 
and its own file descriptor to block on, if needed.


Jeroen



Re: PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
Tom Lane writes:> Lee Kindness <lkindness@csl.co.uk> writes:> > On a slightly related note to the other threads thread
[sic]going> > on... Over the Christmas/New Year break i've been looking into making> > the PostgreSQL client libraries
(inparticular libpq and ecpg)> > thread-safe - that is they can safely be used by a program which> > itself is using
mutliplethreads. I'm largely there with ecpg (globals> > are protected, a sqlca for each thread), but of course it
relieson> > libpq which needs work to share a connection between thrreads.> > AFAIK, libpq is thread-safe already, it's
justnot thread-aware.> What you'd presumably want is a wrapper layer that adds a mutex to> prevent multiple threads
frommanipulating a PGconn at the same time.> Couldn't this be done without touching libpq at all?
 

Certainly, it could. I've not fully investigated the current failings
of libpq WRT to threading yet. But it's certainly a bit more than I
stated above. I don't know where the statement that libpq is thread
safe originated from (I see it's on the website). but at a quick
glance I believe that krb4, krb5, PQoidStatus(),
PQsetClientEncoding(), winsock_strerror() need more though
investigation and non-thread-safe fuctions are also being used (at a
glance gethostbyname() and strtok()).
> > 1. It's looking likely that the libraries will need to link to> > libpthread, and as a result anything linking to
thelibraries will> > need to link to libpthread too. Will this be accepted in a patch?> If the patch requires pthread
andnot any other form of thread support,> probably not.  See nearby threading thread ;-)> In general I'd be unhappy
withdoing anything to libpq that forces> non-threaded clients to start depending on libpthread (or other thread>
libraries). Thus the idea of a wrapper seems more appealing to me.
 

Okay, but what about ecpg? Thread-local sqlca instances would be a
real benefit, guess Michael and Christof are the guys to talk to?

I suppose the real problem is the needed baggage with this - the
autoconf macros will be a nightmare!

Thanks, Lee.


Re: PostgreSQL libraries - PThread Support, but not use...

From
Bruce Momjian
Date:
We have definatly had requests for improved thread-safeness for libpq
and ecpg in the past, so whatever you can do would be a help.  We say
libpq is thread-safe, but specifically mention the non-threadsafe calls
in the libpq documentation, or at least we should.  We have been making
marginal thread-safe improvements over the years.

---------------------------------------------------------------------------

Lee Kindness wrote:
> Tom Lane writes:
>  > Lee Kindness <lkindness@csl.co.uk> writes:
>  > > On a slightly related note to the other threads thread [sic] going
>  > > on... Over the Christmas/New Year break i've been looking into making
>  > > the PostgreSQL client libraries (in particular libpq and ecpg)
>  > > thread-safe - that is they can safely be used by a program which
>  > > itself is using mutliple threads. I'm largely there with ecpg (globals
>  > > are protected, a sqlca for each thread), but of course it relies on
>  > > libpq which needs work to share a connection between thrreads.
>  > 
>  > AFAIK, libpq is thread-safe already, it's just not thread-aware.
>  > What you'd presumably want is a wrapper layer that adds a mutex to
>  > prevent multiple threads from manipulating a PGconn at the same time.
>  > Couldn't this be done without touching libpq at all?
> 
> Certainly, it could. I've not fully investigated the current failings
> of libpq WRT to threading yet. But it's certainly a bit more than I
> stated above. I don't know where the statement that libpq is thread
> safe originated from (I see it's on the website). but at a quick
> glance I believe that krb4, krb5, PQoidStatus(),
> PQsetClientEncoding(), winsock_strerror() need more though
> investigation and non-thread-safe fuctions are also being used (at a
> glance gethostbyname() and strtok()).
> 
>  > > 1. It's looking likely that the libraries will need to link to
>  > > libpthread, and as a result anything linking to the libraries will
>  > > need to link to libpthread too. Will this be accepted in a patch?
>  > If the patch requires pthread and not any other form of thread support,
>  > probably not.  See nearby threading thread ;-)
>  > In general I'd be unhappy with doing anything to libpq that forces
>  > non-threaded clients to start depending on libpthread (or other thread
>  > libraries).  Thus the idea of a wrapper seems more appealing to me.
> 
> Okay, but what about ecpg? Thread-local sqlca instances would be a
> real benefit, guess Michael and Christof are the guys to talk to?
> 
> I suppose the real problem is the needed baggage with this - the
> autoconf macros will be a nightmare!
> 
> Thanks, Lee.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: PostgreSQL libraries - PThread Support, but not use...

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> We have definatly had requests for improved thread-safeness for libpq
> and ecpg in the past, so whatever you can do would be a help.  We say
> libpq is thread-safe, but specifically mention the non-threadsafe calls
> in the libpq documentation, or at least we should.

We do:
http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html

But Lee's point about depending on possibly-unsafe libc routines is a
good one.  I don't think anyone's gone through the code with an eye to
that.
        regards, tom lane


Re: PostgreSQL libraries - PThread Support, but not use...

From
Lamar Owen
Date:
On Monday 06 January 2003 12:28, Lee Kindness wrote:
> Tom Lane writes:
>  > Lee Kindness <lkindness@csl.co.uk> writes:
>  > > are protected, a sqlca for each thread), but of course it relies on
>  > > libpq which needs work to share a connection between thrreads.

>  > AFAIK, libpq is thread-safe already, it's just not thread-aware.
>  > Couldn't this be done without touching libpq at all?

> Certainly, it could. I've not fully investigated the current failings
> of libpq WRT to threading yet. But it's certainly a bit more than I
> stated above. I don't know where the statement that libpq is thread
> safe originated from (I see it's on the website). but at a quick
> glance I believe that krb4, krb5, PQoidStatus(),
> PQsetClientEncoding(), winsock_strerror() need more though
> investigation and non-thread-safe fuctions are also being used (at a
> glance gethostbyname() and strtok()).

Lee, see the AOLserver source code.  AOLserver (www.aolserver.com) is a 
multithreaded dynamic web server that supports pooled database connections.  
One of the supported databases is PostgreSQL (I maintain the driver, 
currently).  It's dual licensed (AOLserver Public License and GPL).  I've not 
yet seen a problem that could be traced to libpq not being threadsafe.  And 
AOLserver certainly would show a non-threadsafe problem, particularly with 
sites running OpenACS, which can easily beat the database to death.

BTW: thanks for the Bison RPMs.  And I believe the PGDG is appropriate as 
well. :-)
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11



Re: PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
Tom Lane writes:> Bruce Momjian <pgman@candle.pha.pa.us> writes:> > We have definatly had requests for improved
thread-safenessfor libpq> > and ecpg in the past, so whatever you can do would be a help.  We say> > libpq is
thread-safe,but specifically mention the non-threadsafe calls> > in the libpq documentation, or at least we should.> We
do:>http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html> But Lee's point about depending
onpossibly-unsafe libc routines is a> good one.  I don't think anyone's gone through the code with an eye to> that.
 

Right, so a reasonable angle for me to take is to go through the libpq
source looking for potential problem areas and use of "known bad"
functions. I can add autoconf checks for the replacement *_r()
functions, and use these in place of the traditional ones where
available.

If any function is found to be not thread-safe and cannot be made so
using standard library calls then it needs to be documented as such
both in the source and the aforementioned documentation.

This approach avoids any thread library dependencies and documents the
current state of play WRT thread safety (i.e it's a good, and needed,
basis for any later work).

ECPG is a separate issue, and best handled as such (it will need
thread calls). I'll post a patch for it at a later date so the changes
are available to anyone who wants to play with ECPG and threads.

Ta, Lee.


Re: PostgreSQL libraries - PThread Support, but not use...

From
Bruce Momjian
Date:
Lee Kindness wrote:
> Tom Lane writes:
>  > Bruce Momjian <pgman@candle.pha.pa.us> writes:
>  > > We have definatly had requests for improved thread-safeness for libpq
>  > > and ecpg in the past, so whatever you can do would be a help.  We say
>  > > libpq is thread-safe, but specifically mention the non-threadsafe calls
>  > > in the libpq documentation, or at least we should.
>  > We do:
>  > http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html
>  > But Lee's point about depending on possibly-unsafe libc routines is a
>  > good one.  I don't think anyone's gone through the code with an eye to
>  > that.
> 
> Right, so a reasonable angle for me to take is to go through the libpq
> source looking for potential problem areas and use of "known bad"
> functions. I can add autoconf checks for the replacement *_r()
> functions, and use these in place of the traditional ones where
> available.

I am a little confused by the *_r functions.  Are they for all
functions?  BSD/OS doesn't have them, but all our libc functions are
threadsafe except for things like strtok, where they recommend strsep,
and gethostbyname, where they would suggest getaddrinfo, I guess.

> If any function is found to be not thread-safe and cannot be made so
> using standard library calls then it needs to be documented as such
> both in the source and the aforementioned documentation.

Ideally we will find we can get them all fixed in some way.

> This approach avoids any thread library dependencies and documents the
> current state of play WRT thread safety (i.e it's a good, and needed,
> basis for any later work).

Yes, good idea.

> ECPG is a separate issue, and best handled as such (it will need
> thread calls). I'll post a patch for it at a later date so the changes
> are available to anyone who wants to play with ECPG and threads.

Yes, needs to be done too.  Someone was complaining about ecpg not being
thread safe several months ago.  I don't remember the details.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
Bruce Momjian writes:> Lee Kindness wrote:> > Right, so a reasonable angle for me to take is to go through the libpq> >
sourcelooking for potential problem areas and use of "known bad"> > functions. I can add autoconf checks for the
replacement*_r()> > functions, and use these in place of the traditional ones where> > available.> I am a little
confusedby the *_r functions.  Are they for all> functions?  BSD/OS doesn't have them, but all our libc functions are>
threadsafeexcept for things like strtok, where they recommend strsep,> and gethostbyname, where they would suggest
getaddrinfo,I guess.
 

Some functions in the C library (and other common system libraries)
are defined in such a way to make their implementation
non-reentrant. Normally this is due to return values being supplied in
a static buffer which is overwritten by the subsequent call, or the
calls relying on static data between calls. A list such functions
would include:
asctime crypt ctime drand48 ecvt encrypt erand48 fcvt fgetgrentfgetpwent fgetspent getaliasbyname getaliasent getdate
getgrentgetgrgidgetgrnam gethostbyaddr gethostbyname gethostbyname2gethostent getlogin getnetbyaddr getnetbyname
getnetentgetnetgrentgetprotobyname getprotobynumber getprotoent getpwent getpwnam getpwuidgetservbyname getservbyport
getserventgetspent getspnam getutentgetutid getutline gmtime hcreate hdestroy hsearch initstate jrand48lcong48
localtimelrand48 mrand48 nrand48 ptsname qecvt qfcvt randrandom readdir readdir64 seed48 setkey setstate sgetspent
srand48srandomstrerror strtok tmpnam ttyname
 

to one degree or another. The important ones to watch for are: ctime,
localtime, asctime, gmtime, readdir, strtok and tmpnam. Now these
functions are often augmented by a _r partner which fixes their API to
allow their implementations to be reentrant.

After a quick grep libpq could be using crypt, gethostbyname, random,
strerror, encrypt, getpwuid, rand and strtok. As you rightly note, ins
ome cases the correct fix is to use alternative functions and not the
_r versions - this avoids lots of ifdefs!

L.


Re: PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
Guys, just posted patches for libpq to address thread-safe issues. Now
uses strerror_r(), gethostbyname_r(), getpwuid_r() and crypt_r() where
available. Passes all tests on Linux and Solaris.

Any comments let me know - my first major(ish) outing in the
PostgreSQL source, in particular I'm not use in the configure stuff is
100% right...

Patches also at:

http://services.csl.co.uk/postgresql/diffs-20030109-configure.txt.gzhttp://services.csl.co.uk/postgresql/diffs-20030109-libpq.txt.gz

Thanks, Lee.

Lee Kindness writes:> Tom Lane writes:>  > Bruce Momjian <pgman@candle.pha.pa.us> writes:>  > > We have definatly had
requestsfor improved thread-safeness for libpq>  > > and ecpg in the past, so whatever you can do would be a help.  We
say> > > libpq is thread-safe, but specifically mention the non-threadsafe calls>  > > in the libpq documentation, or
atleast we should.>  > We do:>  > http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/libpq-threading.html>  >
ButLee's point about depending on possibly-unsafe libc routines is a>  > good one.  I don't think anyone's gone through
thecode with an eye to>  > that.> > Right, so a reasonable angle for me to take is to go through the libpq> source
lookingfor potential problem areas and use of "known bad"> functions. I can add autoconf checks for the replacement
*_r()>functions, and use these in place of the traditional ones where> available.> > If any function is found to be not
thread-safeand cannot be made so> using standard library calls then it needs to be documented as such> both in the
sourceand the aforementioned documentation.> > This approach avoids any thread library dependencies and documents the>
currentstate of play WRT thread safety (i.e it's a good, and needed,> basis for any later work).> > ECPG is a separate
issue,and best handled as such (it will need> thread calls). I'll post a patch for it at a later date so the changes>
areavailable to anyone who wants to play with ECPG and threads.> > Ta, Lee.
 


Re: [PATCHES] PostgreSQL libraries - PThread Support, but not use...

From
Lee Kindness
Date:
Tom,

Tom Lane writes:
 > Lee Kindness <lkindness@csl.co.uk> writes:
 > > + #define _THREAD_SAFE
 > > + #define _REENTRANT
 > > + #define _POSIX_PTHREAD_SEMANTICS
 > What is this stuff, and why isn't it wrapped in any sort of
 > platform-specific test?  If it's needed, why is it in only one .c
 > file?

It's actually in libpq-int.h too... The correct way for this is to
compile with the compilers specific thread flags, however the downside
to this has already been discussed. Depending on the system one, or a
combination of those flags will turn on some magic such as errno being
a function call rather than a global variable. This is needed to make
the library thread safe.

On a second look libpq-int.h isn't the best place for this (hence it
also appears in one of the C files), it needs to be done in each C
file before any of the system headers are included - a libpq-threads.h
header? Would this be ok?

Do do things 100% right we'd need to detect compiler thread flags and
compile with them...

 > Also, haven't you broken SOCK_STRERROR for the Windows case?

Sorry, I seem to have forgotton to update the prototype in win32.h to
match the updated function. Updated diff attached (and online).

Lee.


Attachment