Thread: [HACKERS] fork()-safety, thread-safety

[HACKERS] fork()-safety, thread-safety

From
Nico Williams
Date:
A thread on parallelization made me wonder so I took a look:
- src/bin/*/parallel.c uses threads on WIN32
- src/bin/*/parallel.c uses fork() on not-WIN32

  (Ditto src/bin/pg_basebackup/pg_basebackup.c and
  src/backend/postmaster/syslogger.c.)

  A quick look at the functions called on the child side of fork()
  makes me think that it's unlikely that the children here use
  async-signal-safe functions only.
 
  Why not use threads on all systems where threads are available when
  we'd use threads on some such systems?  If this code is thread-safe
  on WIN32, why wouldn't it be thread-safe on POSIX?  (Well, naturally
  there may be calls to, e.g., getpwnam() and such that would not be
  thread-safe on POSIX, and which might not exist on WIN32.  But I
  mean, aside from that, if synchronization is done correctly on WIN32,
  what would stop that from being true on POSIX?)
 
- fork() is used in a number of places where execl() or execv() are
  called immediately after (and exit() if the exec fails).

  It would be better to use vfork() where available and _exit() instead
  of exit().

  Alternatively posix_spawn() should be used (which generally uses
  vfork() or equivalent under the covers).

  vfork() is widely demonized, but it's actually quite superior
  (performance-wise) to fork() when all you want to do is exec-or-exit
  since no page copying (COW or otherwise) needs be done when using
  vfork().
 
  It's actually safer to use vfork() because POSIX limits one to
  async-signal-safe functions between fork() and exec-or-exit... With
  fork(), where neither the parent nor the child immediately
  execs-or-exits, it's too easy to fail to make sure that the code they
  execute is fork-safe.  Whereas with vfork() the fact that the parent
  (just the one thread, incidentally, not all of them[*]) blocks until
  the child execs-or-exits means it's impossible to fail to notice a
  long-running child that does lots of fork-unsafe work.
 
  It's safer still to use posix_spawn(), naturally.
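
  A minimal sketch of the posix_spawn() approach, for illustration only.
  spawn_and_wait() is a hypothetical helper name, not something in the
  PostgreSQL tree, and the error handling is deliberately simple (no
  retry on EINTR, no file actions or spawn attributes).

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start argv[0] (searched for in PATH) with
 * posix_spawnp() and wait for it to finish.  Returns the child's exit
 * status, or -1 if spawning or waiting failed or the child was killed
 * by a signal.
 */
static int
spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int   status;
    int   rc;

    /* posix_spawnp() returns an error number directly, not -1. */
    rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0)
    {
        fprintf(stderr, "posix_spawnp(%s): %s\n", argv[0], strerror(rc));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

  Calling it with an argv of {"sh", "-c", "exit 3", NULL} would return 3.
  No user code runs between process creation and exec, so there is
  nothing for the caller to get wrong about async-signal-safety there.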

In Unix-land it's standard practice to ignore the async-signal-safe
requirement when using fork() early on in a daemon's life to start
worker processes.  This is fine, of course, though if we're using
CreateProcess*()/_spawn() on WIN32 anyways, it might be best to do the
equivalent on Unix and just spawn the children -- if nothing else, this
would reduce the likelihood of unintended divergence between WIN32 and
Unix.

Nico

[*] Actually, I do believe that on Solaris/Illumos vfork() stops all
    threads in the parent, if I remember correctly anyways.  Linux's and
    NetBSD's vfork() only stops the one thread in the parent that called
    it.  I haven't checked other BSDs.  There was a patch for NetBSD to
    stop all threads in the parent, but I convinced the NetBSD community
    to discard that patch.
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] fork()-safety, thread-safety

From
Andres Freund
Date:
Hi,

On 2017-10-05 17:02:22 -0500, Nico Williams wrote:
>    A quick look at the functions called on the child side of fork()
>    makes me think that it's unlikely that the children here use
>    async-signal-safe functions only.

That's not a requirement unless you're using fork *and* threads. At
least by my last reading of posix and common practice.



>  - fork() is used in a number of places where execl() or execv() are
>    called immediately after (and exit() if the exec fails).
> 
>    It would be better to use vfork() where available and _exit() instead
>    of exit().

vfork is less portable, and doesn't really win us anything on common
platforms. On most it's pretty much the same implementation.


>    vfork() is widely demonized, but it's actually quite superior
>    (performance-wise) to fork() when all you want to do is exec-or-exit
>    since no page copying (COW or otherwise) needs be done when using
>    vfork().

Not on linux, at least not as of a year or two back.


I do think it'd be good to move more towards threads, but not at all for
the reasons mentioned here.

Greetings,

Andres Freund



Re: [HACKERS] fork()-safety, thread-safety

From
Nico Williams
Date:
On Thu, Oct 05, 2017 at 03:13:07PM -0700, Andres Freund wrote:
> On 2017-10-05 17:02:22 -0500, Nico Williams wrote:
> >    A quick look at the functions called on the child side of fork()
> >    makes me think that it's unlikely that the children here use
> >    async-signal-safe functions only.
> 
> That's not a requirement unless you're using fork *and* threads. At
> least by my last reading of posix and common practice.

True, yes.  One still has to be careful to fflush() all open FILEs (that
might be used on both sides of fork()) and such though.
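
A small sketch of that fflush() caveat, under stated assumptions:
fork_flushed() and exit_with_code() are hypothetical names used only for
illustration, and the process is assumed to be single-threaded at the
time of the fork.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child body: print a line, return the code passed in. */
static int
exit_with_code(void *arg)
{
    printf("child running\n");
    return *(int *) arg;
}

/*
 * Hypothetical helper: fork a child that runs child_fn(arg) and exits
 * with its return value.  Returns the child's pid to the parent, or -1
 * if fork() failed.
 */
static pid_t
fork_flushed(int (*child_fn) (void *), void *arg)
{
    pid_t pid;

    /*
     * Flush all open output streams first.  Otherwise the child inherits
     * a copy of any unwritten stdio buffer contents, and the same bytes
     * can end up being written by both processes.
     */
    fflush(NULL);

    pid = fork();
    if (pid == 0)
    {
        int rc = child_fn(arg);

        /* Flush the child's own output, then leave without running the
         * parent's atexit() handlers. */
        fflush(NULL);
        _exit(rc);
    }
    return pid;
}
```

The parent would then waitpid() on the returned pid as usual.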

> >  - fork() is used in a number of places where execl() or execv() are
> >    called immediately after (and exit() if the exec fails).
> > 
> >    It would be better to use vfork() where available and _exit() instead
> >    of exit().
> 
> vfork is less portable, and doesn't really win us anything on common
> platforms. On most it's pretty much the same implementation.

It's trivial to use it where available, and fork() otherwise.  Mind you,
all current versions of Solaris/Illumos, *BSD, OS X, and Linux w/glibc
(and even Windows with WSL!) have a true vfork().
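
Roughly what "use it where available" could look like, as a sketch only.
HAVE_VFORK is an assumed configure-style macro and run_program() is a
hypothetical helper; neither is taken from the PostgreSQL sources.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * HAVE_VFORK is an assumed build-time macro.  This is a macro rather
 * than a wrapper function because the function that calls vfork() must
 * not return in the child.
 */
#ifdef HAVE_VFORK
#define fork_for_exec() vfork()
#else
#define fork_for_exec() fork()
#endif

/*
 * Hypothetical helper: run the program at argv[0] (a full path) and
 * wait for it.  Returns its exit status, 127 if the exec failed, or -1
 * on fork/wait failure or death by signal.
 */
static int
run_program(char *const argv[])
{
    pid_t pid;
    int   status;

    pid = fork_for_exec();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /*
         * Child: with vfork() we may be borrowing the parent's memory,
         * so do nothing here except exec or _exit().  _exit() rather
         * than exit() avoids flushing stdio buffers and running
         * atexit() handlers that belong to the parent.
         */
        execv(argv[0], argv);
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With argv set to {"/bin/sh", "-c", "exit 4", NULL} this would return 4,
whichever of the two calls the macro expands to.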

> >    vfork() is widely demonized, but it's actually quite superior
> >    (performance-wise) to fork() when all you want to do is exec-or-exit
> >    since no page copying (COW or otherwise) needs be done when using
> >    vfork().
> 
> Not on linux, at least not as of a year or two back.

glibc has it.  Other Linux C libraries might also; I've not checked them
all.

> I do think it'd be good to move more towards threads, but not at all for
> the reasons mentioned here.

You don't think eliminating a large difference between handling of WIN32
vs. POSIX is a good reason?

Nico

Re: [HACKERS] fork()-safety, thread-safety

From
Andres Freund
Date:
Hi,

On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
> > >    vfork() is widely demonized, but it's actually quite superior
> > >    (performance-wise) to fork() when all you want to do is exec-or-exit
> > >    since no page copying (COW or otherwise) needs be done when using
> > >    vfork().
> > 
> > Not on linux, at least not as of a year or two back.
> 
> glibc has it.  Other Linux C libraries might also; I've not checked them
> all.

It has it, but it's not more efficient.


> > I do think it'd be good to move more towards threads, but not at all for
> > the reasons mentioned here.
> 
> You don't think eliminating a large difference between handling of WIN32
> vs. POSIX is a good reason?

It seems like you'd not really get a much reduced set of differences,
just a *different* set of differences. After investing time.

Greetings,

Andres Freund



Re: [HACKERS] fork()-safety, thread-safety

From
Nico Williams
Date:
On Thu, Oct 05, 2017 at 03:34:41PM -0700, Andres Freund wrote:
> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
> > > >    vfork() is widely demonized, but it's actually quite superior
> > > >    (performance-wise) to fork() when all you want to do is exec-or-exit
> > > >    since no page copying (COW or otherwise) needs be done when using
> > > >    vfork().
> > > 
> > > Not on linux, at least not as of a year or two back.
> > 
> > glibc has it.  Other Linux C libraries might also; I've not checked them
> > all.
> 
> It has it, but it's not more efficient.

Because of signal-blocking issues?

> > > I do think it'd be good to move more towards threads, but not at all for
> > > the reasons mentioned here.
> > 
> > You don't think eliminating a large difference between handling of WIN32
> > vs. POSIX is a good reason?
> 
> It seems like you'd not really get a much reduced set of differences,
> just a *different* set of differences. After investing time.

Fair enough.



Re: [HACKERS] fork()-safety, thread-safety

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
>> You don't think eliminating a large difference between handling of WIN32
>> vs. POSIX is a good reason?

> It seems like you'd not really get a much reduced set of differences,
> just a *different* set of differences. After investing time.

Yeah -- unless we're prepared to drop threadless systems altogether,
this doesn't seem like it does much for maintainability.  It might even
be a net negative on that score, due to reducing the amount of testing
the now-legacy code path would get.

If there were reason to think we'd get a large performance benefit,
or some other concrete win, it might be worth putting time into this.
But I see no reason to believe that.

(There's certainly an argument to be made that no-one cares about
platforms without thread support anymore.  But I'm unconvinced that
rewriting existing code that works fine is the most productive
way to exploit such a choice if we were to make it.)
        regards, tom lane



Re: [HACKERS] fork()-safety, thread-safety

From
Andres Freund
Date:
On 2017-10-05 18:49:22 -0400, Tom Lane wrote:
> (There's certainly an argument to be made that no-one cares about
> platforms without thread support anymore.  But I'm unconvinced that
> rewriting existing code that works fine is the most productive
> way to exploit such a choice if we were to make it.)

Yea, that's pretty much what I'm thinking too.

- Andres



Re: [HACKERS] fork()-safety, thread-safety

From
Craig Ringer
Date:
On 6 October 2017 at 06:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
>>> You don't think eliminating a large difference between handling of WIN32
>>> vs. POSIX is a good reason?
>
>> It seems like you'd not really get a much reduced set of differences,
>> just a *different* set of differences. After investing time.
>
> Yeah -- unless we're prepared to drop threadless systems altogether,
> this doesn't seem like it does much for maintainability.  It might even
> be a net negative on that score, due to reducing the amount of testing
> the now-legacy code path would get.
>
> If there were reason to think we'd get a large performance benefit,
> or some other concrete win, it might be worth putting time into this.
> But I see no reason to believe that.
>
> (There's certainly an argument to be made that no-one cares about
> platforms without thread support anymore.  But I'm unconvinced that
> rewriting existing code that works fine is the most productive
> way to exploit such a choice if we were to make it.)

The only thing that gets me excited about a threaded postgres is the
ability to have a PL/Java, PL/Mono etc that don't suck. We could do
some really cool things that just aren't practical right now.

Not compelling to a wide audience, really.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] fork()-safety, thread-safety

From
Andres Freund
Date:
On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
> The only thing that gets me excited about a threaded postgres is the
> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
> some really cool things that just aren't practical right now.

Faster parallelism with a lot less reinventing the wheel. Easier backend
/ session separation. Shared caches.



Re: [HACKERS] fork()-safety, thread-safety

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>> The only thing that gets me excited about a threaded postgres is the
>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>> some really cool things that just aren't practical right now.

> Faster parallelism with a lot less reinventing the wheel. Easier backend
> / session separation. Shared caches.

What you guys are talking about here is a threaded backend, which is a
whole different matter from replacing the client-side threading that Nico
was looking at.  That would surely offer far higher rewards, but the costs
to get there are likewise orders of magnitude greater.
        regards, tom lane



Re: [HACKERS] fork()-safety, thread-safety

From
Andres Freund
Date:

On October 5, 2017 5:15:41 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>>> The only thing that gets me excited about a threaded postgres is the
>>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>>> some really cool things that just aren't practical right now.
>
>> Faster parallelism with a lot less reinventing the wheel. Easier backend
>> / session separation. Shared caches.
>
> What you guys are talking about here is a threaded backend, which is a
> whole different matter from replacing the client-side threading that Nico
> was looking at.  That would surely offer far higher rewards, but the costs
> to get there are likewise orders of magnitude greater.

No disagreement there. Don't really see much need for it client side though.

Andres

Re: [HACKERS] fork()-safety, thread-safety

From
Craig Ringer
Date:
On 6 October 2017 at 08:06, Andres Freund <andres@anarazel.de> wrote:
> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>> The only thing that gets me excited about a threaded postgres is the
>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>> some really cool things that just aren't practical right now.
>
> Faster parallelism with a lot less reinventing the wheel. Easier backend
> / session separation. Shared caches.

Yeah. We have a pretty major NIH problem in PostgreSQL, and I agree
that adopting threading and some commonplace tools would sure help us
reduce that burden a bit.

I would really miss shared-nothing-by-default though.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

