Thread: straightening out backend process startup

straightening out backend process startup

From
Andres Freund
Date:
Hi,

I've previously complained ([1]) that process initialization has gotten
very complicated. I hit this once more last week when trying to commit
one of the shared memory stats pieces...

I think there's quite a few different issues around this - here I'm just
trying to tackle a few of the most glaring (to me):

- AuxiliaryProcessMain() is used for two independent tasks: Start bootstrap /
  checker mode and starting auxiliary processes. In HEAD there's maybe 5 lines
  out 250 that are actually common to both uses.

  A related oddity is that we reserve shared memory resources for bootstrap &
  checker aux processes, despite those never existing.

  This is addressed in patches 1-7

- The order of invocation of InitProcess()/InitAuxiliaryProcess() and
  BaseInit() depends on EXEC_BACKEND. Due to that there often is no single
  place initialization code can be put if it needs any locks.

  This is addressed in patches 8-9

- PostgresMain() has code for single user and multi user interleaved, making
  it unnecessarily hard to understand what's going on.

  This is addressed in patches 10


This isn't a patch series ready to commit, there's a bunch of polishing that
needs to be done if there's agreement.


Questions I have:

- What exactly to do with checker mode: Keep it as part of bootstrap, separate
  it out completely? What commandline flags?

- I used a separate argv entry to pass the aux proc type - do we rather want
  to go for the approach that e.g. bgworker went for? Adding code for string
  splitting seems a bit unnecessary to me.

- PostgresMainSingle() should probably not be in postgres.c. We could put it
  into postinit.c or ..?

- My first attempt at PostgresMainSingle() separated the single/multi user
  cases a bit more than the code right now, by having a PostgresMainCommon()
  which was called by PostgresMainSingle(), PostgresMain(). *Common only
  started with the MessageContext allocation, which did have the advantage of
  splitting out a few of the remaining conditional actions in PostgresMain()
  (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
  more duplication. I don't really have an opinion on what's better.

- I had to move the PgStartTime computation to a bit earlier for single user
  mode. That seems to make sense to me anyway, given that postmaster does so
  fairly early too.

  Any reason that'd be a bad idea?

  Arguably it should even be a tad earlier to be symmetric.


There's one further issue that I think is big enough to be worth
tackling in the near future: Too many things depend on BackendIds.

Aux processes need procsignal and backend status reporting, which use
BackendId for indexing. But they don't use sinval, so they don't have a
BackendId - so we have hacks to work around that in a few places. If we
instead make those places use pgprocno for indexing the whole issue
vanishes.

In fact, I think there's a good argument to be made that we should
entirely remove the concept of BackendIds and just use pgprocnos. We
have a fair number of places like SignalVirtualTransaction() that need
to search the procarray just to find the proc to signal based on the
BackendId. If we used pgprocno instead, that'd not be needed.

But perhaps that's a separate thread.

Greetings,

Andres Freund

[1] https://postgr.es/m/20210402002240.56cuz3uo3alnqwae%40alap3.anarazel.de

Attachment

Re: straightening out backend process startup

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I think there's quite a few different issues around this - here I'm just
> trying to tackle a few of the most glaring (to me):

No opinion yet about most of this, but I did want to react to

> In fact, I think there's a good argument to be made that we should
> entirely remove the concept of BackendIds and just use pgprocnos. We
> have a fair number of places like SignalVirtualTransaction() that need
> to search the procarray just to find the proc to signal based on the
> BackendId. If we used pgprocno instead, that'd not be needed.

If I understand what you're suggesting, it'd result in unused slots
in sinvaladt.c's procState[] array, which could create an annoying
drag on performance.  However, I think it might be reasonable to
switch other things over to using pgprocnos, with an eye to
eventually making BackendIds private to the sinval mechanism.
There's certainly no strong reason why sinval's array indexes
need to be used as identifiers by other modules.

            regards, tom lane



Re: straightening out backend process startup

From
Andres Freund
Date:
Hi,

On 2021-08-02 12:57:36 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think there's quite a few different issues around this - here I'm just
> > trying to tackle a few of the most glaring (to me):
> 
> No opinion yet about most of this, but I did want to react to
> 
> > In fact, I think there's a good argument to be made that we should
> > entirely remove the concept of BackendIds and just use pgprocnos. We
> > have a fair number of places like SignalVirtualTransaction() that need
> > to search the procarray just to find the proc to signal based on the
> > BackendId. If we used pgprocno instead, that'd not be needed.
> 
> If I understand what you're suggesting, it'd result in unused slots
> in sinvaladt.c's procState[] array, which could create an annoying
> drag on performance.

Yea, I was looking into that, which is why I don't yet have a patch. I think
it might be reasonable to address this by making pgprocnos more dense. We
right now actually have a kind of maximally bad allocation pattern for
pgprocnos - InitProcGlobal puts them onto the free lists in reverse
order. Which means that ProcArrayAdd() will have to move all procs...


But, as you say:

> However, I think it might be reasonable to switch other things over to
> using pgprocnos, with an eye to eventually making BackendIds private
> to the sinval mechanism.  There's certainly no strong reason why
> sinval's array indexes need to be used as identifiers by other
> modules.

I think it'd entirely be reasonable to switch over at least backend_status.c,
procsignal.c to pgprocnos without doing anything about the density of
allocation. We'd likely would want to do that as independent steps anyway,
even if we were to switch over sinval as well.


Another approach to deal with this could be to simply not do the O(n) work in
SIInsertDataEntries(). It's not obvious that ->hasMessages is actually
necessary - we could atomically read maxMsgNum without acquiring a lock
instead of needing the per-backend ->hasMessages.  I don't the density would
be a relevant factor in SICleanupQueue().

Greetings,

Andres Freund



Re: straightening out backend process startup

From
Kyotaro Horiguchi
Date:
At Mon, 2 Aug 2021 09:41:24 -0700, Andres Freund <andres@anarazel.de> wrote in 
> Hi,
> 
> I've previously complained ([1]) that process initialization has gotten
> very complicated. I hit this once more last week when trying to commit
> one of the shared memory stats pieces...
> 
> I think there's quite a few different issues around this - here I'm just
> trying to tackle a few of the most glaring (to me):
> 
> - AuxiliaryProcessMain() is used for two independent tasks: Start bootstrap /
>   checker mode and starting auxiliary processes. In HEAD there's maybe 5 lines
>   out 250 that are actually common to both uses.
> 
>   A related oddity is that we reserve shared memory resources for bootstrap &
>   checker aux processes, despite those never existing.
> 
>   This is addressed in patches 1-7
> 
> - The order of invocation of InitProcess()/InitAuxiliaryProcess() and
>   BaseInit() depends on EXEC_BACKEND. Due to that there often is no single
>   place initialization code can be put if it needs any locks.
> 
>   This is addressed in patches 8-9
> 
> - PostgresMain() has code for single user and multi user interleaved, making
>   it unnecessarily hard to understand what's going on.
> 
>   This is addressed in patches 10
> 
> 
> This isn't a patch series ready to commit, there's a bunch of polishing that
> needs to be done if there's agreement.
> 
> 
> Questions I have:
> 
> - What exactly to do with checker mode: Keep it as part of bootstrap, separate
>   it out completely? What commandline flags?

Checker tries to attach shared memory just to make sure it is actually
attachable with a set of parameters.  It is similar to bootstrap as it
is not run under postmaster but similar to auxiliary process as it
needs to attach shared memory.  If we are going to get rid of
shared-memory access by bootstrap, or get rid of the bootstrap itself,
checker should be separated out from bootstrap.

> - I used a separate argv entry to pass the aux proc type - do we rather want
>   to go for the approach that e.g. bgworker went for? Adding code for string
>   splitting seems a bit unnecessary to me.

It seems to me separate entry is cleaner and robust.

> - PostgresMainSingle() should probably not be in postgres.c. We could put it
>   into postinit.c or ..?

PostgresMainSingle() looks like the single-process version of
PostgresMain so it is natural that they are placed together in
postgres.c.  If PostgresMainSingle is constructed as initializing
standalone first then calling PostgresMain, it might be right that
PostgresMain calls the initialization function resides in postinit.c
if !IsUnderPostmaster.

PostgresMain()
{
  if (!IsUnderPostmaster)
      InitSinglePostgres(argv[0]);
  ...

> - My first attempt at PostgresMainSingle() separated the single/multi user
>   cases a bit more than the code right now, by having a PostgresMainCommon()
>   which was called by PostgresMainSingle(), PostgresMain(). *Common only
>   started with the MessageContext allocation, which did have the advantage of
>   splitting out a few of the remaining conditional actions in PostgresMain()
>   (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
>   more duplication. I don't really have an opinion on what's better.

I'm not sure how it looked like, but isn't it reasonable that quickdie
and log_disconnections(). handle IsUnderPostmaster instead?  Or for
log_disconnections, Log_disconnections should be turned off at
standalone-initialization?

> - I had to move the PgStartTime computation to a bit earlier for single user
>   mode. That seems to make sense to me anyway, given that postmaster does so
>   fairly early too.
> 
>   Any reason that'd be a bad idea?
> 
>   Arguably it should even be a tad earlier to be symmetric.

Why don't you move the code for multiuser as earlier as standalone does?

> There's one further issue that I think is big enough to be worth
> tackling in the near future: Too many things depend on BackendIds.
> 
> Aux processes need procsignal and backend status reporting, which use
> BackendId for indexing. But they don't use sinval, so they don't have a
> BackendId - so we have hacks to work around that in a few places. If we
> instead make those places use pgprocno for indexing the whole issue
> vanishes.
> 
> In fact, I think there's a good argument to be made that we should
> entirely remove the concept of BackendIds and just use pgprocnos. We
> have a fair number of places like SignalVirtualTransaction() that need
> to search the procarray just to find the proc to signal based on the
> BackendId. If we used pgprocno instead, that'd not be needed.
> 
> But perhaps that's a separate thread.
> 
> Greetings,
> 
> Andres Freund
> 
> [1] https://postgr.es/m/20210402002240.56cuz3uo3alnqwae%40alap3.anarazel.de

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: straightening out backend process startup

From
Robert Haas
Date:
On Mon, Aug 2, 2021 at 12:41 PM Andres Freund <andres@anarazel.de> wrote:
> - AuxiliaryProcessMain() is used for two independent tasks: Start bootstrap /
>   checker mode and starting auxiliary processes. In HEAD there's maybe 5 lines
>   out 250 that are actually common to both uses.
>
>   A related oddity is that we reserve shared memory resources for bootstrap &
>   checker aux processes, despite those never existing.
>
>   This is addressed in patches 1-7

This all looks pretty mechanical and, I would guess, not very controversial.

> - The order of invocation of InitProcess()/InitAuxiliaryProcess() and
>   BaseInit() depends on EXEC_BACKEND. Due to that there often is no single
>   place initialization code can be put if it needs any locks.
>
>   This is addressed in patches 8-9
>
> - PostgresMain() has code for single user and multi user interleaved, making
>   it unnecessarily hard to understand what's going on.
>
>   This is addressed in patches 10

This stuff I'd need to study more in order to have an intelligent opinion.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: straightening out backend process startup

From
Andres Freund
Date:
Hi,

Thanks Robert, Horiguchi-san for looking.

On 2021-08-04 16:34:52 -0400, Robert Haas wrote:
> On Mon, Aug 2, 2021 at 12:41 PM Andres Freund <andres@anarazel.de> wrote:
> > - AuxiliaryProcessMain() is used for two independent tasks: Start bootstrap /
> >   checker mode and starting auxiliary processes. In HEAD there's maybe 5 lines
> >   out 250 that are actually common to both uses.
> >
> >   A related oddity is that we reserve shared memory resources for bootstrap &
> >   checker aux processes, despite those never existing.
> >
> >   This is addressed in patches 1-7
>
> This all looks pretty mechanical and, I would guess, not very controversial.

Pushed these patches.


> > - The order of invocation of InitProcess()/InitAuxiliaryProcess() and
> >   BaseInit() depends on EXEC_BACKEND. Due to that there often is no single
> >   place initialization code can be put if it needs any locks.
> >
> >   This is addressed in patches 8-9
> >
> > - PostgresMain() has code for single user and multi user interleaved, making
> >   it unnecessarily hard to understand what's going on.
> >
> >   This is addressed in patches 10
>
> This stuff I'd need to study more in order to have an intelligent opinion.

Unless somebody protests soon I plan to push at least the
"process startup: Always call Init[Auxiliary]Process() before BaseInit()."
portion, as the inconsistent order between EXEC_BACKEND/!EB is making life
hard for me in other patches.

I don't have a dependency on
"process startup: Split single user code out of PostgresMain()."
but it does make the code a good bit more readable imo...

Greetings,

Andres Freund



Re: straightening out backend process startup

From
Andres Freund
Date:
Hi,

On 2021-08-03 16:50:24 +0900, Kyotaro Horiguchi wrote:
> At Mon, 2 Aug 2021 09:41:24 -0700, Andres Freund <andres@anarazel.de> wrote in
> > - PostgresMainSingle() should probably not be in postgres.c. We could put it
> >   into postinit.c or ..?
>
> PostgresMainSingle() looks like the single-process version of
> PostgresMain so it is natural that they are placed together in
> postgres.c.  If PostgresMainSingle is constructed as initializing
> standalone first then calling PostgresMain, it might be right that
> PostgresMain calls the initialization function resides in postinit.c
> if !IsUnderPostmaster.
>
> PostgresMain()
> {
>   if (!IsUnderPostmaster)
>       InitSinglePostgres(argv[0]);
>   ...

I find passing argc/argv to functions that don't actually need them, like
PostgresMain during normal operation, confusing. Especially when the argc/argv
values are just manufactured stuff like in the case of PostgresMain(). So I
think it's better to have have the order work the other way round.


> > - My first attempt at PostgresMainSingle() separated the single/multi user
> >   cases a bit more than the code right now, by having a PostgresMainCommon()
> >   which was called by PostgresMainSingle(), PostgresMain(). *Common only
> >   started with the MessageContext allocation, which did have the advantage of
> >   splitting out a few of the remaining conditional actions in PostgresMain()
> >   (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
> >   more duplication. I don't really have an opinion on what's better.
>
> I'm not sure how it looked like, but isn't it reasonable that quickdie
> and log_disconnections(). handle IsUnderPostmaster instead?  Or for
> log_disconnections, Log_disconnections should be turned off at
> standalone-initialization?

I was wondering about log_disconnections too. The conditional addition of the
callback is all that forces log_disconnections to be PGC_SU_BACKEND rather
than PGC_SUSET too. So I agree that moving a check for Log_disconnections and
IsUnderPostmaster into log_disconnections is a good idea.

I don't understand your reference to quickdie() though?


> > - I had to move the PgStartTime computation to a bit earlier for single user
> >   mode. That seems to make sense to me anyway, given that postmaster does so
> >   fairly early too.
> >
> >   Any reason that'd be a bad idea?
> >
> >   Arguably it should even be a tad earlier to be symmetric.
>
> Why don't you move the code for multiuser as earlier as standalone does?

I think it's the other way round - right now the standalone case is much later
than the multiuser case. Postmaster determines PgStartTime after creating
shared memory, just before starting checkpointer / startup process - whereas
single user mode in HEAD does it just before accepting input for the first
time.

Greetings,

Andres Freund



Re: straightening out backend process startup

From
Andres Freund
Date:
Hi,

On 2021-08-05 12:50:15 -0700, Andres Freund wrote:
> On 2021-08-03 16:50:24 +0900, Kyotaro Horiguchi wrote:
> > > - My first attempt at PostgresMainSingle() separated the single/multi user
> > >   cases a bit more than the code right now, by having a PostgresMainCommon()
> > >   which was called by PostgresMainSingle(), PostgresMain(). *Common only
> > >   started with the MessageContext allocation, which did have the advantage of
> > >   splitting out a few of the remaining conditional actions in PostgresMain()
> > >   (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
> > >   more duplication. I don't really have an opinion on what's better.
> >
> > I'm not sure how it looked like, but isn't it reasonable that quickdie
> > and log_disconnections(). handle IsUnderPostmaster instead?  Or for
> > log_disconnections, Log_disconnections should be turned off at
> > standalone-initialization?
> 
> I was wondering about log_disconnections too. The conditional addition of the
> callback is all that forces log_disconnections to be PGC_SU_BACKEND rather
> than PGC_SUSET too. So I agree that moving a check for Log_disconnections and
> IsUnderPostmaster into log_disconnections is a good idea.

I did that, and it didn't seem a clear improvement..


> > > - I had to move the PgStartTime computation to a bit earlier for single user
> > >   mode. That seems to make sense to me anyway, given that postmaster does so
> > >   fairly early too.
> > >
> > >   Any reason that'd be a bad idea?
> > >
> > >   Arguably it should even be a tad earlier to be symmetric.
> >
> > Why don't you move the code for multiuser as earlier as standalone does?
> 
> I think it's the other way round - right now the standalone case is much later
> than the multiuser case. Postmaster determines PgStartTime after creating
> shared memory, just before starting checkpointer / startup process - whereas
> single user mode in HEAD does it just before accepting input for the first
> time.

Did that in the attached.


I've attached the three remaining patches, after some more polish. Unless
somebody argues against I plan to commit these soon-ish.

Greetings,

Andres Freund

Attachment

Re: straightening out backend process startup

From
Kyotaro Horiguchi
Date:
At Mon, 13 Sep 2021 20:11:29 -0700, Andres Freund <andres@anarazel.de> wrote in 
> Hi,
> 
> On 2021-08-05 12:50:15 -0700, Andres Freund wrote:
> > On 2021-08-03 16:50:24 +0900, Kyotaro Horiguchi wrote:
> > > > - My first attempt at PostgresMainSingle() separated the single/multi user
> > > >   cases a bit more than the code right now, by having a PostgresMainCommon()
> > > >   which was called by PostgresMainSingle(), PostgresMain(). *Common only
> > > >   started with the MessageContext allocation, which did have the advantage of
> > > >   splitting out a few of the remaining conditional actions in PostgresMain()
> > > >   (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
> > > >   more duplication. I don't really have an opinion on what's better.
> > >
> > > I'm not sure how it looked like, but isn't it reasonable that quickdie
> > > and log_disconnections(). handle IsUnderPostmaster instead?  Or for
> > > log_disconnections, Log_disconnections should be turned off at
> > > standalone-initialization?
> > 
> > I was wondering about log_disconnections too. The conditional addition of the
> > callback is all that forces log_disconnections to be PGC_SU_BACKEND rather
> > than PGC_SUSET too. So I agree that moving a check for Log_disconnections and
> > IsUnderPostmaster into log_disconnections is a good idea.
> 
> I did that, and it didn't seem a clear improvement..

Sorry for bothering you about that..

> > > > - I had to move the PgStartTime computation to a bit earlier for single user
> > > >   mode. That seems to make sense to me anyway, given that postmaster does so
> > > >   fairly early too.
> > > >
> > > >   Any reason that'd be a bad idea?
> > > >
> > > >   Arguably it should even be a tad earlier to be symmetric.
> > >
> > > Why don't you move the code for multiuser as earlier as standalone does?
> > 
> > I think it's the other way round - right now the standalone case is much later
> > than the multiuser case. Postmaster determines PgStartTime after creating
> > shared memory, just before starting checkpointer / startup process - whereas
> > single user mode in HEAD does it just before accepting input for the first
> > time.
> 
> Did that in the attached.
> 
> 
> I've attached the three remaining patches, after some more polish. Unless
> somebody argues against I plan to commit these soon-ish.

0001 looks fine.

0002: Looks fine in conjunction with 0003.

0003:

PostgresSingleUserMain doesn't set processing mode. It is fine as it
is initialied to InitProcessing at process start.  On the other hand,
PostgresMain sets it to InitProcessing but it seems to me it is always
InitProcessing when entering the function. In the first place isn't
InitProcessing the initial state and won't be transitioned from other
states?  However, it would be another issue even if it is right.

So, everything looks fine to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: straightening out backend process startup

From
Andres Freund
Date:
Hi,

On September 13, 2021 9:56:52 PM PDT, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>At Mon, 13 Sep 2021 20:11:29 -0700, Andres Freund <andres@anarazel.de> wrote in
>> Hi,
>>
>> On 2021-08-05 12:50:15 -0700, Andres Freund wrote:
>> > On 2021-08-03 16:50:24 +0900, Kyotaro Horiguchi wrote:
>> > > > - My first attempt at PostgresMainSingle() separated the single/multi user
>> > > >   cases a bit more than the code right now, by having a PostgresMainCommon()
>> > > >   which was called by PostgresMainSingle(), PostgresMain(). *Common only
>> > > >   started with the MessageContext allocation, which did have the advantage of
>> > > >   splitting out a few of the remaining conditional actions in PostgresMain()
>> > > >   (PostmasterContext, welcome banner, Log_disconnections). But lead to a bit
>> > > >   more duplication. I don't really have an opinion on what's better.
>> > >
>> > > I'm not sure how it looked like, but isn't it reasonable that quickdie
>> > > and log_disconnections(). handle IsUnderPostmaster instead?  Or for
>> > > log_disconnections, Log_disconnections should be turned off at
>> > > standalone-initialization?
>> >
>> > I was wondering about log_disconnections too. The conditional addition of the
>> > callback is all that forces log_disconnections to be PGC_SU_BACKEND rather
>> > than PGC_SUSET too. So I agree that moving a check for Log_disconnections and
>> > IsUnderPostmaster into log_disconnections is a good idea.
>>
>> I did that, and it didn't seem a clear improvement..
>
>Sorry for bothering you about that..

I thought it might look better too. Might still be worth later, just to make the guc SUSET.


>> > > > - I had to move the PgStartTime computation to a bit earlier for single user
>> > > >   mode. That seems to make sense to me anyway, given that postmaster does so
>> > > >   fairly early too.
>> > > >
>> > > >   Any reason that'd be a bad idea?
>> > > >
>> > > >   Arguably it should even be a tad earlier to be symmetric.
>> > >
>> > > Why don't you move the code for multiuser as earlier as standalone does?
>> >
>> > I think it's the other way round - right now the standalone case is much later
>> > than the multiuser case. Postmaster determines PgStartTime after creating
>> > shared memory, just before starting checkpointer / startup process - whereas
>> > single user mode in HEAD does it just before accepting input for the first
>> > time.
>>
>> Did that in the attached.
>>
>>
>> I've attached the three remaining patches, after some more polish. Unless
>> somebody argues against I plan to commit these soon-ish.
>
>0001 looks fine.
>
>0002: Looks fine in conjunction with 0003.
>
>0003:
>
>PostgresSingleUserMain doesn't set processing mode. It is fine as it
>is initialied to InitProcessing at process start.  On the other hand,
>PostgresMain sets it to InitProcessing but it seems to me it is always
>InitProcessing when entering the function. In the first place isn't
>InitProcessing the initial state and won't be transitioned from other
>states?  However, it would be another issue even if it is right.

I don't think it should be accessed during the stuff that PostgresSingleUserMain() does. The catalog access / cache
initializationcontinues to happen below PostgresMain(). I guess a comment wouldn't me amiss. 

I think we have way too many different process type & state variables :(

Andres



--
Sent from my Android device with K-9 Mail. Please excuse my brevity.