Thread: Threading in BGWorkers (!)

Threading in BGWorkers (!)

From
James Sewell
Date:
Hi hackers,

I was talking about PostgreSQL and threading on IRC the other day - which I know is a frowned upon topic - and just wanted to frame the same questions here and hopefully get a discussion going.

On IRC RhodiumToad offered the following advice (after a standard there be dragons here disclaimer, as well as noting this may not be a complete list):

Threading (may) be safe if:
  1. all signals will be delivered on the main thread and nowhere else
  2. no postgres function will ever be called from anything that's not the main thread
  3. it's safe for postgres to call any system library function, even ones explicitly marked as not thread safe
  4. it's safe for postgres to call sigprocmask()
I can live with 1. and 2 - they are fairly easy as long as you know the rules.

3. needs to be converted to a list of possible calls - which can be done and checked, I suppose against the POSIX standards?

4. is not fine (I suppose this is a specific example of 3.), as I think Postgres would need to be using  pthread_sigmask() instead - given looks like a one line change could  pthread_sigmask be used when available?

Are there any other rules which need to be adhered to?

Any thoughts, comments, dire warnings, hand waving? On IRC the general thought was that any changes could be seen as encouraging threading which is a bad thing - I would argue if you're writing BGWorkers which have SPI access you've already got a pretty large area to screw things up in anyway (if you aren't following the standards / code comments).

James



The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
Tom Lane
Date:
James Sewell <james.sewell@jirotech.com> writes:
> I was talking about PostgreSQL and threading on IRC the other day - which I
> know is a frowned upon topic - and just wanted to frame the same questions
> here and hopefully get a discussion going.

I think the short answer about threading in bgworkers (or any other
backend process) is "we don't support it; if you try it and it breaks,
which it likely will, you get to keep both pieces".  I'm not sure that
there's any merit in making small dents in that policy.  I suspect that
at some point, somebody will try to move those goalposts a long way,
but it will be a large and controversial patch.

Why do you want threads in a bgworker anyway?  You could spawn multiple
bgworkers, or you could dispatch the threaded work to a non-Postgres-ish
process as PL/Java does.  The only advantage I can see of doing work in a
process that's not at arm's-length is to have access to PG computational
or IPC facilities, and none of that is likely to work safely in a threaded
context.

            regards, tom lane



Re: Threading in BGWorkers (!)

From
Eric Ridge
Date:
dispatch the threaded work to a non-Postgres-ish process


I’m no expert here but all your solid points about threading with Postgres notwithstanding....


I think there’s some issues around interrupt handling and general syscalls that doesn’t otherwise play nice with “non-Postgres-ish” *threads* when Postgres is still the main thread. 


This is all purely hypothetical, but it seems that Postgres’ use of sigprocmask can cause problems with threads that are otherwise 100% “disconnected” from Postgres. 


How can we start a dialog about this kind of situation?  Nobody here is trying to make Postgres thread-safe, maybe only thread-friendly. 


I think Mr. Sewell, has a better handle around these topics.  But he ain’t the only one interested. 


eric 


On Mon, Jun 22, 2020 at 9:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
James Sewell <james.sewell@jirotech.com> writes:
> I was talking about PostgreSQL and threading on IRC the other day - which I
> know is a frowned upon topic - and just wanted to frame the same questions
> here and hopefully get a discussion going.

I think the short answer about threading in bgworkers (or any other
backend process) is "we don't support it; if you try it and it breaks,
which it likely will, you get to keep both pieces".  I'm not sure that
there's any merit in making small dents in that policy.  I suspect that
at some point, somebody will try to move those goalposts a long way,
but it will be a large and controversial patch.

Why do you want threads in a bgworker anyway?  You could spawn multiple
bgworkers, or you could dispatch the threaded work to a non-Postgres-ish
process as PL/Java does.  The only advantage I can see of doing work in a
process that's not at arm's-length is to have access to PG computational
or IPC facilities, and none of that is likely to work safely in a threaded
context.

                        regards, tom lane


Re: Threading in BGWorkers (!)

From
James Sewell
Date:
On Tue, 23 Jun 2020 at 13:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
James Sewell <james.sewell@jirotech.com> writes:
> I was talking about PostgreSQL and threading on IRC the other day - which I
> know is a frowned upon topic - and just wanted to frame the same questions
> here and hopefully get a discussion going.

I think the short answer about threading in bgworkers (or any other
backend process) is "we don't support it; if you try it and it breaks,
which it likely will, you get to keep both pieces". 

I'm hoping that from this a set of rules rather than a blanket ban can be agreed upon.  
 
I'm not sure that
there's any merit in making small dents in that policy.  I suspect that
at some point, somebody will try to move those goalposts a long way,
but it will be a large and controversial patch.

Understood, and I do agree with keeping the policy simple - but it looks like (potentially) the only blocker is a one line change to swap out sigprocmask. From my perspective this is a very large win - I'll do some testing.

Why do you want threads in a bgworker anyway?  You could spawn multiple
bgworkers, or you could dispatch the threaded work to a non-Postgres-ish
process as PL/Java does.  The only advantage I can see of doing work in a
process that's not at arm's-length is to have access to PG computational
or IPC facilities, and none of that is likely to work safely in a threaded
context.

I'm writing the workers in Rust - it would be nice to be able to safely access Rust crates which make use of threads.

cheers,
James


The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
Konstantin Knizhnik
Date:

On 23.06.2020 06:38, Tom Lane wrote:
> James Sewell <james.sewell@jirotech.com> writes:
>> I was talking about PostgreSQL and threading on IRC the other day - which I
>> know is a frowned upon topic - and just wanted to frame the same questions
>> here and hopefully get a discussion going.
> I think the short answer about threading in bgworkers (or any other
> backend process) is "we don't support it; if you try it and it breaks,
> which it likely will, you get to keep both pieces".  I'm not sure that
> there's any merit in making small dents in that policy.  I suspect that
> at some point, somebody will try to move those goalposts a long way,
> but it will be a large and controversial patch.
>
> Why do you want threads in a bgworker anyway?  You could spawn multiple
> bgworkers, or you could dispatch the threaded work to a non-Postgres-ish
> process as PL/Java does.  The only advantage I can see of doing work in a
> process that's not at arm's-length is to have access to PG computational
> or IPC facilities, and none of that is likely to work safely in a threaded
> context.

Just an example of using threads in bgworker: right now I am working on 
FDW for RocksDB.
RocksDB LSM  implementation shows quite promising performance advantages 
comparing with classical Postgres B-Tree
(almost no degrade of insert speed with increasing number of records).

RocksDB is multithreded database. May be it is possible to port it to 
multiprocess model
but it will be very non trivial task. And even if such fork of RocksDB 
will be created,  somebody have to permanently back patch changes from 
main trunk.

Alternative solution is to launch multithreded RocksDB worker and let 
backends redirect requests to this worker.
Client-server architecture inside server:)

Using multithreading in bgworker is possible if you do not use any 
Postgres runtime inside thread procedures or do it in exclusive critical 
section.
It is not so convenient but possible. The most difficult thing from my 
point of view is error reporting.




Re: Threading in BGWorkers (!)

From
James Sewell
Date:
Using multithreading in bgworker is possible if you do not use any
Postgres runtime inside thread procedures or do it in exclusive critical
section.
It is not so convenient but possible. The most difficult thing from my
point of view is error reporting.

Happy to be proved wrong, but I don't think this is correct. 

PostgreSQL can call sigprocmask() in your BGWorker whenever it wants, and  "The use of sigprocmask() is unspecified in a multithreaded process" [1]




The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
James Sewell
Date:


On Tue, 23 Jun 2020 at 17:15, James Sewell <james.sewell@jirotech.com> wrote:
Using multithreading in bgworker is possible if you do not use any
Postgres runtime inside thread procedures or do it in exclusive critical
section.
It is not so convenient but possible. The most difficult thing from my
point of view is error reporting.

Happy to be proved wrong, but I don't think this is correct. 

PostgreSQL can call sigprocmask() in your BGWorker whenever it wants, and  "The use of sigprocmask() is unspecified in a multithreaded process" [1]




The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
Konstantin Knizhnik
Date:


On 23.06.2020 10:15, James Sewell wrote:
Using multithreading in bgworker is possible if you do not use any
Postgres runtime inside thread procedures or do it in exclusive critical
section.
It is not so convenient but possible. The most difficult thing from my
point of view is error reporting.

Happy to be proved wrong, but I don't think this is correct.
PostgreSQL can call sigprocmask() in your BGWorker whenever it wants, and  "The use of sigprocmask() is unspecified in a multithreaded process" [1]

Sorry, may be I missed something.
But in my bgworker I am not using Postgres runtime at all (except initial bgworker startup code).
So I am not using latches (which are based on signals), snapshots,...
In my case bgworker has no connection to Postgres at all.
Yes, it can still receives signals from Postmaster (SIGTERM, SIGHUP). But their handler are trivial and do not need to mask any signals.

So may be in general case combination of signals and threads may cause some problems,
but it doesn't mean that you can't create multithreaded bgworker.



Re: Threading in BGWorkers (!)

From
James Sewell
Date:

On Tue, 23 Jun 2020 at 17:26, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 23.06.2020 10:15, James Sewell wrote:
Using multithreading in bgworker is possible if you do not use any
Postgres runtime inside thread procedures or do it in exclusive critical
section.
It is not so convenient but possible. The most difficult thing from my
point of view is error reporting.

Happy to be proved wrong, but I don't think this is correct.
PostgreSQL can call sigprocmask() in your BGWorker whenever it wants, and  "The use of sigprocmask() is unspecified in a multithreaded process" [1]

Sorry, may be I missed something.
But in my bgworker I am not using Postgres runtime at all (except initial bgworker startup code).
So I am not using latches (which are based on signals), snapshots,...
In my case bgworker has no connection to Postgres at all.
Yes, it can still receives signals from Postmaster (SIGTERM, SIGHUP). But their handler are trivial and do not need to mask any signals.

So may be in general case combination of signals and threads may cause some problems,
but it doesn't mean that you can't create multithreaded bgworker.

Ah yes - sorry *I* missed something. 

A multi threaded BGWorker which accesses shared memory and database via SPI. 


The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
Chapman Flack
Date:
On 06/22/20 23:38, Tom Lane wrote:
> bgworkers, or you could dispatch the threaded work to a non-Postgres-ish
> process as PL/Java does.  The only advantage I can see of doing work in a
> process that's not at arm's-length is to have access to PG computational
> or IPC facilities, and none of that is likely to work safely in a threaded
> context.

You might be thinking of Dave Cramer's PL/JVM, which runs a JVM in another
process and does IPC to it.

PL/Java, by contrast, runs the JVM in-process and keeps both pieces.
It only lets one thread downcall into PostgreSQL.


On 06/23/20 00:46, Eric Ridge wrote:
> How can we start a dialog about this kind of situation?  Nobody here is
> trying to make Postgres thread-safe, maybe only thread-friendly.

There are just a couple of things I've been wanting to suggest, based on
PL/Java experience.

1) It would be nice to be able to ereport from an arbitrary thread. There
   is already support in core to forward messages from parallel workers:
   the worker signals the lead process after adding a message to a shm_mq
   referenced from its ParallelWorkerInfo struct. The signal handler
   asynchronously sets ParallelMessagePending, which ProcessInterrupts
   will check at some convenient point and ereport the message.

   It seems like it would be no sweat for another thread in the same
   process to add something to an mq (could be the same structure as
   shm_mq but would not need to really be in shared memory) and do a
   volatile write of ParallelMessagePending. The rest is already there.
   Only missing ingredient would be a way for an extension to allocate
   something like a ParallelWorkerInfo struct good for the life of the
   backend (the current parallel worker infrastructure makes them all
   go away at the completion of a parallel query).

2) It would be nice to be able to request service. If J Random thread
   in PL/Java generates a bit of work requiring some PostgreSQL API,
   at present that bit of work has to queue up until the next time a
   call into PL/Java is occasioned by a query, which might be never.

   It would be nice to be able to also asynchronously set some flag
   like ExtensionServiceRequested, which could be checked as part of
   CHECK_FOR_INTERRUPTS or even at more limited times, such as idle.
   An extension could populate an ExtensionServiceInfo struct with
   a service entry point an flag indicating that extension has work
   pending.


Those are the two thread-friendlier ideas I have been thinking of.

Regards,
-Chap



Re: Threading in BGWorkers (!)

From
Andres Freund
Date:
Hi,

On 2020-06-23 09:19:36 -0400, Chapman Flack wrote:
> 1) It would be nice to be able to ereport from an arbitrary thread. There
>    is already support in core to forward messages from parallel workers:
>    the worker signals the lead process after adding a message to a shm_mq
>    referenced from its ParallelWorkerInfo struct. The signal handler
>    asynchronously sets ParallelMessagePending, which ProcessInterrupts
>    will check at some convenient point and ereport the message.
> 
>    It seems like it would be no sweat for another thread in the same
>    process to add something to an mq (could be the same structure as
>    shm_mq but would not need to really be in shared memory) and do a
>    volatile write of ParallelMessagePending. The rest is already there.
>    Only missing ingredient would be a way for an extension to allocate
>    something like a ParallelWorkerInfo struct good for the life of the
>    backend (the current parallel worker infrastructure makes them all
>    go away at the completion of a parallel query).

I think that's way harder than what you make it sound here. The locking
for shm_mq doesn't really work inside a process. In contrast to the
single threaded case something like a volatile write to
ParallelMessagePending doesn't guarantee much, because there's no
guaranteed memory ordering between threads. And more.


I'm very doubtful this is a good direction to go in. Kinda maybe
somewhat partially converting tiny parts of the backend code into
threadsafe code will leave us with some baroque code.

Greetings,

Andres Freund



Re: Threading in BGWorkers (!)

From
Chapman Flack
Date:
On 06/23/20 21:44, Andres Freund wrote:

> I think that's way harder than what you make it sound here. The locking
> for shm_mq doesn't really work inside a process. In contrast to the
> single threaded case something like a volatile write to
> ParallelMessagePending doesn't guarantee much, because there's no
> guaranteed memory ordering between threads. And more.

It occurred to me after I sent the message this morning that my suggestion
(2) could subsume (1). And requires nothing more than a single volatile
write of a boolean, and getting called back at a convenient time on the
single main thread.

So perhaps I shouldn't have suggested (1) at all - just muddies the waters.

Regards,
-Chap



Re: Threading in BGWorkers (!)

From
Andres Freund
Date:
Hi,

On 2020-06-23 21:50:26 -0400, Chapman Flack wrote:
> On 06/23/20 21:44, Andres Freund wrote:
> 
> > I think that's way harder than what you make it sound here. The locking
> > for shm_mq doesn't really work inside a process. In contrast to the
> > single threaded case something like a volatile write to
> > ParallelMessagePending doesn't guarantee much, because there's no
> > guaranteed memory ordering between threads. And more.
> 
> It occurred to me after I sent the message this morning that my suggestion
> (2) could subsume (1). And requires nothing more than a single volatile
> write of a boolean, and getting called back at a convenient time on the
> single main thread.

A single volatile write wouldn't guarantee you much in the presence of
multiple threads. You could very well end up with a concurrent
CHECK_FOR_INTERRUPTS() in the main thread unsetting InterruptPending,
but not yet seeing / processing ParallelMessagePending.  Nor would it
wake up the main process if it's currently waiting on a latch.

Greetings,

Andres Freund



Re: Threading in BGWorkers (!)

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-06-23 09:19:36 -0400, Chapman Flack wrote:
>> 1) It would be nice to be able to ereport from an arbitrary thread.

> I think that's way harder than what you make it sound here.

Indeed.  Just for starters:

1. elog.c is in itself not-thread-safe, because it uses a static data
structure to track the message being built.

2. It uses palloc, another large pile of not-thread-safe infrastructure.

3. What exactly would the semantics of elog(ERROR) be?  You can't make it
something other than "abort the transaction" without mind-boggling levels
of breakage.  But how are you going to enforce a transaction abort across
multiple threads?  What if one of the other threads reports an independent
error concurrently, or worse tries to COMMIT concurrently?

So that's already two fundamental, and non-trivial, subsystems that have
to be made fully thread-safe before you can get anything off the ground;
plus basic architectural issues to be settled.  I imagine that somebody
will take a run at this at some point, but the idea that it's an easy
problem to bite off seems nonsensical.

I'm not sure whether the other idea

>>   It would be nice to be able to also asynchronously set some flag
>>   like ExtensionServiceRequested, which could be checked as part of
>>   CHECK_FOR_INTERRUPTS or even at more limited times, such as idle.

is much easier.  In the barest terms, we already have things like that
(such as NOTIFY interrupts), so it doesn't sound hard at first.  The
problem is to figure out whether action X that you wish to do is safe
to do at CHECK_FOR_INTERRUPTS call site Y.  The answer is certainly not
always "yes", but how would we build an infrastructure for deciding?
(NOTIFY largely punts on this, by decreeing that it won't do anything
till we reach an idle state.  That's surely not adequate for a lot
of likely actions X.)

            regards, tom lane



Re: Threading in BGWorkers (!)

From
Chapman Flack
Date:
On 06/23/20 22:13, Tom Lane wrote:
> 1. elog.c is in itself not-thread-safe, because it uses a static data
> structure to track the message being built.
> 
> 2. It uses palloc, another large pile of not-thread-safe infrastructure.
...

I'm sure now that I shouldn't have mentioned (1) - muddied waters. The idea
in my head had been to make what the PG code sees as close to the parallel-
message case as possible: "here is an error structure that your elog code
did not build, in a region of memory your palloc code did not manage." But
leave that aside, because a way to request a service callback would clearly
allow the regular elog and the regular palloc to do their regular thing
on the regular thread, and be the more general and desirable idea anyway.

> I'm not sure whether the other idea
> 
>>>   It would be nice to be able to also asynchronously set some flag
>>>   like ExtensionServiceRequested, which could be checked as part of
>>>   CHECK_FOR_INTERRUPTS or even at more limited times, such as idle.
> 
> is much easier.  In the barest terms, we already have things like that
> (such as NOTIFY interrupts), so it doesn't sound hard at first.  The
> problem is to figure out whether action X that you wish to do is safe
> to do at CHECK_FOR_INTERRUPTS call site Y.  The answer is certainly not
> always "yes", but how would we build an infrastructure for deciding?
> (NOTIFY largely punts on this, by decreeing that it won't do anything
> till we reach an idle state.  That's surely not adequate for a lot
> of likely actions X.)

I think it could be adequate for a lot of them. (I even said "more
limited times, such as idle" up there.) In PL/Java's case,
there clearly aren't people running code now that functionally depends
on this ability, because it wouldn't work. Even if the JVM uses multiple
threads to accomplish something, if it is something the Java function
result depends on, it obviously has to happen before the function returns,
while PL/Java has the main thread and can just serialize the work onto it.

The likeliest cases where something might want to happen after the
function has returned are resource releases, which can sometimes
be discovered by the garbage collector a little after the fact, and if
the Java resource that's being collected is the dual of some palloc'd
or reference-counted PostgreSQL object, it would be nice to not have to
enqueue that cleanup for the next time some query calls into PL/Java.
Even an "only in an idle state" rule would be an improvement over
"who knows when and maybe never".

Regards,
-Chap



Re: Threading in BGWorkers (!)

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> On 06/23/20 22:13, Tom Lane wrote:
>> I'm not sure whether the other idea

>>> It would be nice to be able to also asynchronously set some flag
>>> like ExtensionServiceRequested, which could be checked as part of
>>> CHECK_FOR_INTERRUPTS or even at more limited times, such as idle.

>> is much easier.  In the barest terms, we already have things like that
>> (such as NOTIFY interrupts), so it doesn't sound hard at first.  The
>> problem is to figure out whether action X that you wish to do is safe
>> to do at CHECK_FOR_INTERRUPTS call site Y.  The answer is certainly not
>> always "yes", but how would we build an infrastructure for deciding?

> I think it could be adequate for a lot of them.

I dunno.  It's not even adequate for the use-case of reporting an
error, because waiting till after the current transaction commits
is surely not what should happen in that case.  It happens to be
okay for NOTIFY, because that's reporting an outside event that
did occur regardless of the local transaction's success ... but
really, how many use-cases does that argument apply to?

I'm not trying to be completely negative here, but I think these
issues are a lot harder than they might seem at first glance.

            regards, tom lane



Re: Threading in BGWorkers (!)

From
Chapman Flack
Date:
On 06/23/20 23:08, Tom Lane wrote:

> I dunno.  It's not even adequate for the use-case of reporting an
> error, because waiting till after the current transaction commits
> is surely not what should happen in that case.

In the case of the kind of exuberantly-threaded language runtime of
which Java's an example, most of the threads running at any given time
are doing somewhat obscure things that the language runtime knows about
but might not be directly relevant to whether your current transaction
commits. (The garbage collector thread was my earlier example because it
routinely discovers reclaimable things, which can have implications for
resources in PG but usually not for whether a commit should succeed.)

If you're going to write a function and explicitly use threads in your
computation, those are threads you're going to manage, catch exceptions
in, and forward those back to the main thread to be ereported at the
appropriate time.

In other cases where some thread you're but dimly aware of has encountered
some problem, generally what happens now is a default message and stacktrace
get directly written to the backend's stderr and you don't otherwise
find out anything happened. If something doesn't work later
because that thread got wedged, then you piece together the story.
If the logging_collector is running then at least the stuff written to
stderr ends up in the log anyway, though without any log prefix info added.
If the collector isn't running, then the messages went somewhere else,
maybe the systemd journal, maybe the floor.

Regards,
-Chap



Re: Threading in BGWorkers (!)

From
James Sewell
Date:
Hackers,

In the hope of this not being derailed by larger/more unpopular pieces of work, I'm attaching a tiny patch which I don't believe will have any negative impact - but will remove one blocker for $subject (sigprocmask usage is "unspecified" in multithreaded code [1]).

The patch replaces sigprocmask with pthread_sigmask. They have identical APIs ("[pthread_sigmask] shall be equivalent to sigprocmask(), without the restriction that the call be made in a single-threaded process"[1])

The rationale here is that as far as I can tell this is the *only* blocker to using multithreaded code in a BGWorker which can't be avoided by adhering to strict code rules (eg: no PG calls from non-main threads, no interaction with signals from non-main threads).

Before this went in the rules would need to be agreed upon and documented - but hopefully it's at least a way forward / a way to progress this discussion.

Cheers,
James









The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.
Attachment

Re: Threading in BGWorkers (!)

From
Thomas Munro
Date:
On Thu, Jul 2, 2020 at 6:39 PM James Sewell <james.sewell@jirotech.com> wrote:
> The patch replaces sigprocmask with pthread_sigmask. They have identical APIs ("[pthread_sigmask] shall be equivalent
tosigprocmask(), without the restriction that the call be made in a single-threaded process"[1]) 

-#define PG_SETMASK(mask)    sigprocmask(SIG_SETMASK, mask, NULL)
+#define PG_SETMASK(mask)    pthread_sigmask(SIG_SETMASK, mask, NULL)

So you're assuming that <signal.h> declares pthread_sigmask().  I was
trying to understand what POSIX's "The functionality described is
optional" means; could there be <signal.h> headers without the
declaration?  I mean, I know the practical answer: we support all the
remaining Unixes, you can count them on two hands, and they all have
pthreads, so it doesn't matter, and like Dr Stonebraker said, the plan
is "converting POSTGRES to use lightweight processes available in the
operating systems we are using. These include PRESTO for the Sequent
Symmetry and threads in Version 4 of Sun/OS." so we'll actually
*require* that stuff eventually anyway.

One practical problem with this change is that some systems have a
stub definition of pthread_sigmask() that does nothing, when you don't
link against the threading library.  Realistically, most *useful*
builds of PostgreSQL bring it in indirectly (via SSL, LDAP, blah
blah), but it so happens that a bare bones build and make check on
this here FreeBSD box hangs in CHECK DATABASE waiting for the
checkpointer to signal it.  I can fix that by putting -lthr into
LDFLAGS, so we'd probably have to figure out how to do that on our
supported systems.

> The rationale here is that as far as I can tell this is the *only* blocker to using multithreaded code in a BGWorker
whichcan't be avoided by adhering to strict code rules (eg: no PG calls from non-main threads, no interaction with
signalsfrom non-main threads). 

I guess you'd have to mask at least all the signals we care about
before every call to pthread_create() and trust that the threads never
unmask them.  I guess you could interpose a checker function to abort
if something tries to break the programming rule.



Re: Threading in BGWorkers (!)

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Thu, Jul 2, 2020 at 6:39 PM James Sewell <james.sewell@jirotech.com> wrote:
>> The patch replaces sigprocmask with pthread_sigmask. They have identical APIs ("[pthread_sigmask] shall be
equivalentto sigprocmask(), without the restriction that the call be made in a single-threaded process"[1]) 

> -#define PG_SETMASK(mask)    sigprocmask(SIG_SETMASK, mask, NULL)
> +#define PG_SETMASK(mask)    pthread_sigmask(SIG_SETMASK, mask, NULL)

> So you're assuming that <signal.h> declares pthread_sigmask().

If we were going to accept this patch, I'd say it should be conditional
on a configure test for pthread_sigmask being present.  We could allow
that to require an additional library, or not.

>> The rationale here is that as far as I can tell this is the *only* blocker to using multithreaded code in a BGWorker
whichcan't be avoided by adhering to strict code rules (eg: no PG calls from non-main threads, no interaction with
signalsfrom non-main threads). 

TBH, though, I do not buy this argument for a millisecond.  I don't
think that anything is going to come out of multithreading a bgworker
but blood and tears.  Perhaps someday we'll make a major push to
make the backend code (somewhat(?)) thread safe ... but I'm not on
board with making one-line-at-a-time changes in hopes of getting
partway there.  We need some kind of concrete plan for what is a
usable amount of functionality and what has to be done to get it.

            regards, tom lane



Re: Threading in BGWorkers (!)

From
James Sewell
Date:
We need some kind of concrete plan for what is a
usable amount of functionality and what has to be done to get it.

This is exactly the type of discussion I'm after. 

I'll start.

A usable amount of functionality would be the ability to start threads from:
  • a background worker
These cases should be bound by *at least* the following rules:
  • no signal handling from threads
  • no calls into PostgreSQL functions from threads

The patch I supplied is one of the requirements to get there - I would love help to discover the others.


The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

Re: Threading in BGWorkers (!)

From
Andres Freund
Date:
Hi,

On 2020-07-29 13:41:02 +1200, Thomas Munro wrote:
> One practical problem with this change is that some systems have a
> stub definition of pthread_sigmask() that does nothing, when you don't
> link against the threading library.  Realistically, most *useful*
> builds of PostgreSQL bring it in indirectly (via SSL, LDAP, blah
> blah), but it so happens that a bare bones build and make check on
> this here FreeBSD box hangs in CHECK DATABASE waiting for the
> checkpointer to signal it.  I can fix that by putting -lthr into
> LDFLAGS, so we'd probably have to figure out how to do that on our
> supported systems.

Couldn't this be driven by --disable-thread-safety?

Greetings,

Andres Freund



Re: Threading in BGWorkers (!)

From
Andres Freund
Date:
Hi,

On 2020-07-28 21:52:20 -0400, Tom Lane wrote:
> >> The rationale here is that as far as I can tell this is the *only* blocker to using multithreaded code in a
BGWorkerwhich can't be avoided by adhering to strict code rules (eg: no PG calls from non-main threads, no interaction
withsignals from non-main threads).
 
> 
> TBH, though, I do not buy this argument for a millisecond.  I don't
> think that anything is going to come out of multithreading a bgworker
> but blood and tears.  Perhaps someday we'll make a major push to
> make the backend code (somewhat(?)) thread safe ... but I'm not on
> board with making one-line-at-a-time changes in hopes of getting
> partway there.  We need some kind of concrete plan for what is a
> usable amount of functionality and what has to be done to get it.

Why not? Our answer to threading inside functions has been, for quite a
while, that it's kinda ok if the threads never call into postgres and
can never escape the lifetime of a function. But that's not actually
really safe due to the signal handler issue. Whether it's a normal
backend or a bgworker doesn't really play a role here, no?

Greetings,

Andres Freund



Re: Threading in BGWorkers (!)

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-07-28 21:52:20 -0400, Tom Lane wrote:
>> TBH, though, I do not buy this argument for a millisecond.  I don't
>> think that anything is going to come out of multithreading a bgworker
>> but blood and tears.  Perhaps someday we'll make a major push to
>> make the backend code (somewhat(?)) thread safe ... but I'm not on
>> board with making one-line-at-a-time changes in hopes of getting
>> partway there.  We need some kind of concrete plan for what is a
>> usable amount of functionality and what has to be done to get it.

> Why not? Our answer to threading inside functions has been, for quite a
> while, that it's kinda ok if the threads never call into postgres and
> can never escape the lifetime of a function. But that's not actually
> really safe due to the signal handler issue.

In other words, it's *not* safe and never has been.  I see no good reason
to believe that the signal handler issue is the only one.  Even if it is,
not being able to call any postgres infrastructure is a pretty huge
handicap.

So I stand by the position that we need an actual plan here, not just
chipping away at one-liner things that might or might not improve
matters noticeably.

            regards, tom lane



Re: Threading in BGWorkers (!)

From
Robert Haas
Date:
On Thu, Jul 30, 2020 at 2:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Why not? Our answer to threading inside functions has been, for quite a
> > while, that it's kinda ok if the threads never call into postgres and
> > can never escape the lifetime of a function. But that's not actually
> > really safe due to the signal handler issue.
>
> In other words, it's *not* safe and never has been.  I see no good reason
> to believe that the signal handler issue is the only one.  Even if it is,
> not being able to call any postgres infrastructure is a pretty huge
> handicap.

I find this line of argument really unfair. It's true that there might
be issues other than the signal handler one, but so what? That is not
a principled argument against fixing the signal handler part of the
problem, which is the only *known* problem with the use case Andres
describes. It is also true that it would be more useful to enable
additional use cases rather than just this one, but that is not a
principled argument against enabling this one.

My only present concern about the proposal actually in front of us --
that is to say, use pthread_sigmask() rather than sigprocmask() -- is
Thomas's observation that on his system doing so breaks the world.
That seems to be quite a serious problem. If we are deciding whether
to use one function or another some purpose and they are equally good
for the core code but one is better for people who want to play around
with extensions, we may as well use the one that's better for that
purpose. We need not give such experimentation our unqualified
endorsement; we need only decide against obstructing it unnecessarily.
But when such a substitution risks breaking things that work today,
the calculus gets a lot more complicated. Unless we can find a way to
avoid that risk, I don't think this is a good trade-off.

But more broadly I think it is well past time that we look into making
the backend more thread-friendly. The fact that "it's *not* safe and
never has been" has not prevented people from doing it. We don't end
up with people going "oh, well sigprocmask uh oh so I better give up
now." What we end up with is people just going right ahead and doing
it, probably without even thinking about sigprocmask, and ending up
with low-probability failure conditions, which seems likely to hurt
PostgreSQL's reputation for reliability with no compensating benefit.
Or alternatively they hack core, which sucks, or they switch to some
non-PG database, which also sucks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Threading in BGWorkers (!)

From
James Sewell
Date:
I see no good reason to believe that the signal handler issue is the only one.  Even if it is,
not being able to call any postgres infrastructure is a pretty huge
handicap.

(changed emails to get rid of the nasty employer notice...)

It's at least a workable handicap that I'm happy to deal with.

I can say with 100% confidence that people coming from non C languages will be using threading in Postgres backends as interop matures (and it's happening fast now). A lot of the time they won't even know they are using threads as it will be done by libraries they make use of transparently. 

Let's help them to avoid unsafe code now, not wait until they show up on this list with a critical failure and tap at the big sign that says "NO THREADING".

- james