Thread: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
Hi,

While looking at a patch I noticed that SubPostmasterMain() for bgworkers
unconditionally does

        /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
        InitProcess();

which presents a problem, because StartBackgroundWorker() then does

    /*
     * If we're not supposed to have shared memory access, then detach from
     * shared memory.  If we didn't request shared memory access, the
     * postmaster won't force a cluster-wide restart if we exit unexpectedly,
     * so we'd better make sure that we don't mess anything up that would
     * require that sort of cleanup.
     */
    if ((worker->bgw_flags & BGWORKER_SHMEM_ACCESS) == 0)
    {
        ShutdownLatchSupport();
        dsm_detach_all();
        PGSharedMemoryDetach();
    }

which presents a problem: We've initialized all kind of references to shared
memory, own a PGPROC, but have detached from shared memory.

In practice this will lead pretty quickly to a segfault, because process exit
will run proc_exit callbacks, which in turn will try to do a ProcKill(). Or
logging dereferences MyProc, or ...

It seems the above code block would need to at least do shmem_exit() before
the PGSharedMemoryDetach()?

This code has been introduced in

commit 4d155d8b08fe08c1a1649fdbad61c6dcf4a8671f
Author: Robert Haas <rhaas@postgresql.org>
Date:   2014-05-07 14:54:43 -0400

    Detach shared memory from bgworkers without shmem access.

    Since the postmaster won't perform a crash-and-restart sequence
    for background workers which don't request shared memory access,
    we'd better make sure that they can't corrupt shared memory.

    Patch by me, review by Tom Lane.

but before that things were just slightly differently broken...


I tested both the crash and that a shmem_exit() fixes it with an ugly hack in
regress.c. I don't really know how to write a good testcase for this, given
that the communication facilities of a unconnected bgworker are quite
limited... I guess the bgworker could create files or something :(.

Greetings,

Andres Freund



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
Hi,

On 2021-08-01 23:51:16 -0700, Andres Freund wrote:
> In practice this will lead pretty quickly to a segfault, because process exit
> will run proc_exit callbacks, which in turn will try to do a ProcKill(). Or
> logging dereferences MyProc, or ...
> 
> It seems the above code block would need to at least do shmem_exit() before
> the PGSharedMemoryDetach()?

> I tested both the crash and that a shmem_exit() fixes it with an ugly hack in
> regress.c. I don't really know how to write a good testcase for this, given
> that the communication facilities of a unconnected bgworker are quite
> limited... I guess the bgworker could create files or something :(.

Hm. There may be another way: Because BackgroundWorkerEntry() does not
actually need a lock, we could transport BackgroundWorkerData via
backend_save_variables(). To make the connected case work, we'd of course
continue to call CreateSharedMemoryAndSemaphores (and thus InitProcess()) for
those.

But that doesn't really seem better? Sure, it's nice to not need a pgproc
entry unnecessarily, but it's not like unconnected bgworkers are commonly
used and it does seem like it'd end up with more complicated code?

Greetings,

Andres Freund



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Robert Haas
Date:
On Mon, Aug 2, 2021 at 2:51 AM Andres Freund <andres@anarazel.de> wrote:
> which presents a problem: We've initialized all kind of references to shared
> memory, own a PGPROC, but have detached from shared memory.
>
> In practice this will lead pretty quickly to a segfault, because process exit
> will run proc_exit callbacks, which in turn will try to do a ProcKill(). Or
> logging dereferences MyProc, or ...
>
> It seems the above code block would need to at least do shmem_exit() before
> the PGSharedMemoryDetach()?
>
> This code has been introduced in
>
> commit 4d155d8b08fe08c1a1649fdbad61c6dcf4a8671f
> Author: Robert Haas <rhaas@postgresql.org>
> Date:   2014-05-07 14:54:43 -0400
>
>     Detach shared memory from bgworkers without shmem access.
>
>     Since the postmaster won't perform a crash-and-restart sequence
>     for background workers which don't request shared memory access,
>     we'd better make sure that they can't corrupt shared memory.
>
>     Patch by me, review by Tom Lane.
>
> but before that things were just slightly differently broken...

If you're saying that this code has been 100% broken for 7 years and
nobody's noticed until now, then that suggests that nobody actually
uses non-shmem-connected bgworkers. I sort of hate to give up on that
concept but if we've really gone that many years without anyone
noticing obvious breakage then maybe we should.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> If you're saying that this code has been 100% broken for 7 years and
> nobody's noticed until now, then that suggests that nobody actually
> uses non-shmem-connected bgworkers. I sort of hate to give up on that
> concept but if we've really gone that many years without anyone
> noticing obvious breakage then maybe we should.

Well, the problem only exists on Windows so maybe this indeed
escaped notice.  Still, this is good evidence that the case isn't
used *much*, and TBH I don't see many applications for it.
I can't say I'm excited about putting effort into fixing it.

            regards, tom lane



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
Hi,

On 2021-08-02 11:00:49 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > If you're saying that this code has been 100% broken for 7 years and
> > nobody's noticed until now, then that suggests that nobody actually
> > uses non-shmem-connected bgworkers. I sort of hate to give up on that
> > concept but if we've really gone that many years without anyone
> > noticing obvious breakage then maybe we should.
> 
> Well, the problem only exists on Windows so maybe this indeed
> escaped notice.

Right. I did briefly look around and I didn't find bgworkers without
shmem attachement...


> Still, this is good evidence that the case isn't used *much*, and TBH
> I don't see many applications for it.  I can't say I'm excited about
> putting effort into fixing it.

Yea, I don't think it adds that much - without e.g. sharing a file
descriptor with the unconnected bgworker one can't implement something
like syslogger.

Greetings,

Andres Freund



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Alvaro Herrera
Date:
On 2021-Aug-02, Tom Lane wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > If you're saying that this code has been 100% broken for 7 years and
> > nobody's noticed until now, then that suggests that nobody actually
> > uses non-shmem-connected bgworkers. I sort of hate to give up on that
> > concept but if we've really gone that many years without anyone
> > noticing obvious breakage then maybe we should.
> 
> Well, the problem only exists on Windows so maybe this indeed
> escaped notice.  Still, this is good evidence that the case isn't
> used *much*, and TBH I don't see many applications for it.
> I can't say I'm excited about putting effort into fixing it.

When I included this case I was thinking in tasks which would just run
stuff not directly connected to data.  Something like a sub-daemon: say
a connection pooler, which is a bgworker just so that it starts and
stops together with postmaster, and share facilities like GUC
configuration and SIGHUP handling, etc.  It doesn't look like anybody
has had an interest in developing such a thing, so if this is
obstructing your work, I don't object to removing the no-shmem case.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
"Andres Freund"
Date:
Hi,

On Mon, Aug 2, 2021, at 11:08, Alvaro Herrera wrote:
> On 2021-Aug-02, Tom Lane wrote:
> 
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > If you're saying that this code has been 100% broken for 7 years and
> > > nobody's noticed until now, then that suggests that nobody actually
> > > uses non-shmem-connected bgworkers. I sort of hate to give up on that
> > > concept but if we've really gone that many years without anyone
> > > noticing obvious breakage then maybe we should.
> > 
> > Well, the problem only exists on Windows so maybe this indeed
> > escaped notice.  Still, this is good evidence that the case isn't
> > used *much*, and TBH I don't see many applications for it.
> > I can't say I'm excited about putting effort into fixing it.
> 
> When I included this case I was thinking in tasks which would just run
> stuff not directly connected to data.  Something like a sub-daemon: say
> a connection pooler, which is a bgworker just so that it starts and
> stops together with postmaster, and share facilities like GUC
> configuration and SIGHUP handling, etc.

I think nearly all such cases are going to want some monitoring from within the database - which then needs shared
memory.

And even if not - it's not really that useful to avoid a crash-restart if your worker dies with a segfault. There's no
reallylegitimate reasons for completely unhandled errors even if not connected to shmem.
 


> It doesn't look like anybody
> has had an interest in developing such a thing, so if this is
> obstructing your work, I don't object to removing the no-shmem case.

It's not obstructing me right now. I noticed it'd crash on EXEC_BACKEND when I tried to understand some code (see the
nearbythread about straightening out process startup). 
 

I do think there's some potential gains in simplicity and robustness that are made mildly harder by a subprocess that
firstattaches and detaches from shm (it's the only case where we can't easily unify the place InitProcess() is called
betweenEB and ! EB right now). There's several ways that could be tackled. Removing the need to have that if obviously
oneof them.
 

Regards,

Andres



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Alvaro Herrera
Date:
On 2021-Aug-02, Andres Freund wrote:

> > When I included this case I was thinking in tasks which would just run
> > stuff not directly connected to data.  Something like a sub-daemon: say
> > a connection pooler, which is a bgworker just so that it starts and
> > stops together with postmaster, and share facilities like GUC
> > configuration and SIGHUP handling, etc.
> 
> I think nearly all such cases are going to want some monitoring from
> within the database - which then needs shared memory.

True.  Observability for such things is critical (pgbouncer goes quite
some trouble to offer SQL-queryable views into its metrics), which kills
the argument.

> I do think there's some potential gains in simplicity and robustness
> that are made mildly harder by a subprocess that first attaches and
> detaches from shm (it's the only case where we can't easily unify the
> place InitProcess() is called between EB and ! EB right now). There's
> several ways that could be tackled. Removing the need to have that if
> obviously one of them.

Hmm, I don't remember that an shmem-unconnected bgworker first connected
to it and then let go.  It seems weird to do it that way.  My intention,
as far as I recall, is that they would just never connect to shmem,
period.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"I think my standards have lowered enough that now I think 'good design'
is when the page doesn't irritate the living f*ck out of me." (JWZ)



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
"Andres Freund"
Date:
Hi,

On Mon, Aug 2, 2021, at 12:12, Alvaro Herrera wrote:
> On 2021-Aug-02, Andres Freund wrote:
> > I do think there's some potential gains in simplicity and robustness
> > that are made mildly harder by a subprocess that first attaches and
> > detaches from shm (it's the only case where we can't easily unify the
> > place InitProcess() is called between EB and ! EB right now). There's
> > several ways that could be tackled. Removing the need to have that if
> > obviously one of them.
> 
> Hmm, I don't remember that an shmem-unconnected bgworker first connected
> to it and then let go.  It seems weird to do it that way.  My intention,
> as far as I recall, is that they would just never connect to shmem,
> period.

They currently do for EXEC_BACKEND. See SubPostmasterMain(). There the definition of the worker is passed via shared
memory.So it does the full reattach thing, which requires lwlock, which requires PGPROC. We could get away without that
bypassing more through the variables file (either the worker definition or the address of the bgworker shmem piece).
 

Greetings,

Andres



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Alvaro Herrera
Date:
On 2021-Aug-02, Andres Freund wrote:

> On Mon, Aug 2, 2021, at 12:12, Alvaro Herrera wrote:

> > Hmm, I don't remember that an shmem-unconnected bgworker first connected
> > to it and then let go.  It seems weird to do it that way.  My intention,
> > as far as I recall, is that they would just never connect to shmem,
> > period.
> 
> They currently do for EXEC_BACKEND. See SubPostmasterMain(). There the
> definition of the worker is passed via shared memory. So it does the
> full reattach thing, which requires lwlock, which requires PGPROC. We
> could get away without that by passing more through the variables file
> (either the worker definition or the address of the bgworker shmem
> piece).

Ah, that makes sense.  That doesn't sound super fragile, but it is odd
and it's probably a good argument for removing the feature, particularly
since nobody seems to be using it.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Ninguna manada de bestias tiene una voz tan horrible como la humana" (Orual)



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
Hi,

On 2021-08-02 15:34:07 -0400, Alvaro Herrera wrote:
> Ah, that makes sense.  That doesn't sound super fragile, but it is odd
> and it's probably a good argument for removing the feature, particularly
> since nobody seems to be using it.

ISTM we concluded that we should remove unconnected workers. Writing a patch
to do so left me with two questions:

First, what do we want to do with BGWORKER_SHMEM_ACCESS? I'm inclined to treat
it as a required flag going forward. That way we don't silently start being
attached to shared memory in case somebody actually has a unattached
worker. And if we ever wanted to add the ability to have unattached workers
back, it'll also be easier this way.  Perhaps it also has a small amount of
signalling value reminding people that they need to be careful...

The second question is what we want to do in the backbranches. I think the
reasonable options are to do nothing, or to make !BGWORKER_SHMEM_ACCESS an
error in SanityCheckBackgroundWorker() if EXEC_BACKEND is used.

Greetings,

Andres Freund



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> First, what do we want to do with BGWORKER_SHMEM_ACCESS? I'm inclined to treat
> it as a required flag going forward.

+1

> The second question is what we want to do in the backbranches. I think the
> reasonable options are to do nothing, or to make !BGWORKER_SHMEM_ACCESS an
> error in SanityCheckBackgroundWorker() if EXEC_BACKEND is used.

I think doing nothing is fine.  Given the lack of complaints, we're
more likely to break something than fix anything useful.

            regards, tom lane



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
Hi,

On 2021-08-05 20:02:02 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > First, what do we want to do with BGWORKER_SHMEM_ACCESS? I'm inclined to treat
> > it as a required flag going forward.
> 
> +1
> 
> > The second question is what we want to do in the backbranches. I think the
> > reasonable options are to do nothing, or to make !BGWORKER_SHMEM_ACCESS an
> > error in SanityCheckBackgroundWorker() if EXEC_BACKEND is used.
> 
> I think doing nothing is fine.  Given the lack of complaints, we're
> more likely to break something than fix anything useful.

Done in the attached patch. I don't think we need to add more to the docs than
the flag being required?

Greetings,

Andres Freund

Attachment

Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Robert Haas
Date:
On Thu, Aug 5, 2021 at 8:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think doing nothing is fine.  Given the lack of complaints, we're
> more likely to break something than fix anything useful.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Michael Paquier
Date:
On Mon, Aug 09, 2021 at 11:07:14AM -0400, Robert Haas wrote:
> On Thu, Aug 5, 2021 at 8:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think doing nothing is fine.  Given the lack of complaints, we're
>> more likely to break something than fix anything useful.
>
> +1.

FWIW, the only interesting case I have in my plugin box for a
background worker that does not attach to shared memory is a template
of worker able to catch signals, to be used as a base for simple
actions.  So that's not really interesting.  Making the SHMEM flag be
something mandatory on HEAD while doing nothing in the back-branches
sounds good to me, so +1.
--
Michael

Attachment

Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS

From
Andres Freund
Date:
On 2021-08-05 19:56:49 -0700, Andres Freund wrote:
> Done in the attached patch. I don't think we need to add more to the docs than
> the flag being required?

Pushed that patch now. If we want further additions to the docs we can
do so separately.