Thread: What are exactly bootstrap processes, auxiliary processes, standalone backends, normal backends(user sessions)?

Hi,

I've always had a hard time distinguishing various types of
processes/terms used in postgres. I look at the source code every time
to understand them, yet I don't feel satisfied with my understanding.
I request any hacker (having a better idea than me) to help me with
what each different process does and how they are different from each
other? Of course, I'm clear with normal backends (user sessions), bg
workers, but the others need a bit more understanding.

I appreciate the help.

Regards,
Bharath Rupireddy.



Greetings,

* Bharath Rupireddy (bharath.rupireddyforpostgres@gmail.com) wrote:
> I've always had a hard time distinguishing various types of
> processes/terms used in postgres. I look at the source code every time
> to understand them, yet I don't feel satisfied with my understanding.
> I request any hacker (having a better idea than me) to help me with
> what each different process does and how they are different from each
> other? Of course, I'm clear with normal backends (user sessions), bg
> workers, but the others need a bit more understanding.

There was an effort to try to pull these things together because, yeah,
it seems a bit messy.

I'd suggest you take a look at:

https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com

Thanks,

Stephen

On Tue, Jul 13, 2021 at 3:00 AM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Bharath Rupireddy (bharath.rupireddyforpostgres@gmail.com) wrote:
> > I've always had a hard time distinguishing various types of
> > processes/terms used in postgres. I look at the source code every time
> > to understand them, yet I don't feel satisfied with my understanding.
> > I request any hacker (having a better idea than me) to help me with
> > what each different process does and how they are different from each
> > other? Of course, I'm clear with normal backends (user sessions), bg
> > workers, but the others need a bit more understanding.
>
> There was an effort to try to pull these things together because, yeah,
> it seems a bit messy.
>
> I'd suggest you take a look at:
>
> https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com

Thanks. I will check it.

Regards,
Bharath Rupireddy.



On Fri, Jul 09, 2021 at 09:24:19PM +0530, Bharath Rupireddy wrote:
> I've always had a hard time distinguishing various types of
> processes/terms used in postgres. I look at the source code every time
> to understand them, yet I don't feel satisfied with my understanding.
> I request any hacker (having a better idea than me) to help me with
> what each different process does and how they are different from each
> other? Of course, I'm clear with normal backends (user sessions), bg
> workers, but the others need a bit more understanding.

It sounds like something that should be in the glossary, which currently refers
to but doesn't define "auxiliary processes".

 * Background writer, checkpointer, WAL writer and archiver run during normal
 * operation.  Startup process and WAL receiver also consume 2 slots, but WAL
 * writer is launched only after startup has exited, so we only need 5 slots.
 */
#define NUM_AUXILIARY_PROCS             5
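
The arithmetic in that comment can be spelled out as a small sketch. This is
only an illustration of the comment's reasoning, not code from the tree; the
macro and function names below are hypothetical.

```c
#include <assert.h>

/* Hypothetical names for this sketch; they are not PostgreSQL identifiers. */
#define NORMAL_OPERATION_PROCS  4   /* bgwriter, checkpointer, WAL writer, archiver */
#define RECOVERY_PROCS          2   /* startup process, WAL receiver */
#define NEVER_CONCURRENT        1   /* WAL writer starts only after startup exits */

/* Slots needed at once, following the reasoning in the quoted comment. */
int
auxiliary_slots_needed(void)
{
    int slots = NORMAL_OPERATION_PROCS + RECOVERY_PROCS - NEVER_CONCURRENT;

    /* Sanity check: the overlap can only reduce the naive total of 6. */
    assert(slots <= NORMAL_OPERATION_PROCS + RECOVERY_PROCS);
    return slots;
}
```

The result matches the value of NUM_AUXILIARY_PROCS shown above.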

Bootstrap is run by initdb:
src/bin/initdb/initdb.c:                                 "\"%s\" --boot -x0 %s %s "

Standalone backend is run by --single, right ?

-- 
Justin



On Thu, Jul 15, 2021 at 8:17 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Jul 09, 2021 at 09:24:19PM +0530, Bharath Rupireddy wrote:
> > I've always had a hard time distinguishing various types of
> > processes/terms used in postgres. I look at the source code every time
> > to understand them, yet I don't feel satisfied with my understanding.
> > I request any hacker (having a better idea than me) to help me with
> > what each different process does and how they are different from each
> > other? Of course, I'm clear with normal backends (user sessions), bg
> > workers, but the others need a bit more understanding.
>
> It sounds like something that should be in the glossary, which currently refers
> to but doesn't define "auxiliary processes".

Thanks. I strongly feel that it should be documented somewhere. I will
be happy if someone with a clear idea about these various processes
does it.

>  * Background writer, checkpointer, WAL writer and archiver run during normal
>  * operation.  Startup process and WAL receiver also consume 2 slots, but WAL
>  * writer is launched only after startup has exited, so we only need 5 slots.
>  */
> #define NUM_AUXILIARY_PROCS             5
>
> Bootstrap is run by initdb:
> src/bin/initdb/initdb.c:                                 "\"%s\" --boot -x0 %s %s "
>
> Standalone backend is run by --single, right ?

Maybe(?). I found another snippet below:

    if (argc > 1 && strcmp(argv[1], "--boot") == 0)
        AuxiliaryProcessMain(argc, argv);   /* does not return */
    else if (argc > 1 && strcmp(argv[1], "--describe-config") == 0)
        GucInfoMain();          /* does not return */
    else if (argc > 1 && strcmp(argv[1], "--single") == 0)
        PostgresMain(argc, argv,
                     NULL,      /* no dbname */
                     strdup(get_user_name_or_exit(progname)));  /* does not return */
    else
        PostmasterMain(argc, argv); /* does not return */
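
To make the mapping between the switches and the process types easier to see,
here is a minimal standalone sketch of the same dispatch. It is not the actual
PostgreSQL code: the enum and the function name are hypothetical, and it only
classifies argv[1] in the same order as the snippet above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical labels for this sketch; they are not PostgreSQL identifiers. */
typedef enum
{
    MODE_BOOTSTRAP,         /* --boot: bootstrap mode, as run by initdb */
    MODE_DESCRIBE_CONFIG,   /* --describe-config: handled by GucInfoMain() */
    MODE_SINGLE,            /* --single: standalone backend via PostgresMain() */
    MODE_POSTMASTER         /* anything else: PostmasterMain() */
} StartupMode;

/* Classify an invocation by looking at argv[1] only. */
StartupMode
classify_startup_mode(int argc, char *argv[])
{
    assert(argc >= 1 && argv != NULL);

    if (argc > 1 && strcmp(argv[1], "--boot") == 0)
        return MODE_BOOTSTRAP;
    if (argc > 1 && strcmp(argv[1], "--describe-config") == 0)
        return MODE_DESCRIBE_CONFIG;
    if (argc > 1 && strcmp(argv[1], "--single") == 0)
        return MODE_SINGLE;
    return MODE_POSTMASTER;
}
```

Note that the switch is only recognized as the first argument, so something
like "postgres -D datadir --single" would fall through to the postmaster branch
in this sketch.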

Regards,
Bharath Rupireddy.



On 2021-Jul-17, Bharath Rupireddy wrote:

> On Thu, Jul 15, 2021 at 8:17 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > On Fri, Jul 09, 2021 at 09:24:19PM +0530, Bharath Rupireddy wrote:
> > > I've always had a hard time distinguishing various types of
> > > processes/terms used in postgres. I look at the source code every time
> > > to understand them, yet I don't feel satisfied with my understanding.
> > > I request any hacker (having a better idea than me) to help me with
> > > what each different process does and how they are different from each
> > > other? Of course, I'm clear with normal backends (user sessions), bg
> > > workers, but the others need a bit more understanding.
> >
> > It sounds like something that should be in the glossary, which currently refers
> > to but doesn't define "auxiliary processes".
> 
> Thanks. I strongly feel that it should be documented somewhere. I will
> be happy if someone with a clear idea about these various processes
> does it.

    Auxiliary process

    Process of an <glossterm linkend="glossary-instance">instance</glossterm> in
    charge of some specific, hardcoded background task.  Examples are
    the startup process,
    the WAL receiver (but not the WAL senders),
    the WAL writer,
    the archiver,
    the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
    the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
    the <glossterm linkend="glossary-stats-collector">statistics collector</glossterm>,
    and the <glossterm linkend="glossary-logger">logger</glossterm>.

We should probably include individual glossary entries for those that
don't already have one.  Maybe revise the entries for ones that do so
that they start with "An auxiliary process that ..."
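
As a rough way to read the draft entry above, the classification can be written
down as a lookup table. This is only a sketch of the glossary wording, not of
how the server decides anything; the struct, table, and function names are
hypothetical, and names the table does not know are simply reported as not
auxiliary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for this sketch; not a PostgreSQL identifier. */
typedef struct
{
    const char *name;
    bool        auxiliary;
} ProcessKind;

/* Entries follow the draft glossary text quoted above. */
static const ProcessKind process_kinds[] = {
    {"startup process", true},
    {"WAL receiver", true},
    {"WAL writer", true},
    {"archiver", true},
    {"background writer", true},
    {"checkpointer", true},
    {"statistics collector", true},
    {"logger", true},
    {"WAL sender", false},      /* explicitly excluded in the draft */
};

/* Return true if the draft entry lists the named process as auxiliary. */
bool
is_auxiliary_process(const char *name)
{
    size_t n = sizeof(process_kinds) / sizeof(process_kinds[0]);

    assert(name != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(process_kinds[i].name, name) == 0)
            return process_kinds[i].auxiliary;
    }
    return false;               /* unknown names: not listed in the draft */
}
```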

I just realized that the autovac launcher is not nominally an auxiliary
process (per SubPostmasterMain), which is odd since notionally it
clearly is.  Maybe we should include it in the list anyway, with
something like

"and the autovacuum launcher (but not the autovacuum workers)".

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."                               (Fotis)
               (http://archives.postgresql.org/pgsql-sql/2006-06/msg00265.php)



Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sat, Jul 17, 2021 at 10:45:52AM -0400, Alvaro Herrera wrote:
>> Process of an <glossterm linkend="glossary-instance">instance</glossterm> in
>> charge of some specific, hardcoded background task.  Examples are

> And I think "hardcoded" doesn't mean anything beyond what "specific" means.

> Maybe you'd say: .. process which handles a specific, central task for the
> cluster instance.

Meh.  "Specific" and "background" both seem to be useful terms here.
I do not think "central" is a useful adjective.

            regards, tom lane



On Tue, Sep 7, 2021 at 5:48 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Aug-14, Justin Pryzby wrote:
>
> > I elaborated on your definition and added here.
> > https://commitfest.postgresql.org/34/3285/
>
> Thanks!  This works for me.  After looking at it, it seemed to me that
> listing the autovacuum launcher is perfectly appropriate, so we might as
> well do it; and add verbiage about it to the autovacuum entry.  (I was
> first adding a whole new glossary entry for it, but it seemed overkill.)
>
> I also ended up adding an entry for WAL sender -- seems to round things
> nicely.
>
> ... In doing so I noticed that the definition for startup process and
> WAL receiver is slightly wrong.  WAL receiver only receives, it doesn't
> replay; it is always the startup process that replays.  So I
> changed that too.

Thanks for the v2 patch, here are some comments on it:

1) How about
    A set of background processes (<firstterm>autovacuum launcher</firstterm>
    and <firstterm>autovacuum workers</firstterm>) that routinely perform
instead of
    A set of background processes that routinely perform
?

2) On what basis do we call the autovacuum launcher an auxiliary process
but not the autovacuum workers? An autovacuum worker isn't a background
worker, right? So why can't we call it an auxiliary process too?
+     (but not the autovacuum workers),

3) Isn't it "WAL sender" instead of "WAL senders"?
+     (but not the <glossterm linkend="glossary-wal-sender">WAL senders</glossterm>),


4) "Replays WAL during replication"? Shouldn't it be "replays WAL during
crash recovery or in standby mode"?
+     An auxiliary process that replays WAL during replication and
+     crash recovery.

5) Should we mention that the WAL archiver is optional too, like the
logger process? Also, let us rearrange the text a bit so the two entries
are in sync.
+     An auxiliary process which (if enabled) saves copies of
+     <glossterm linkend="glossary-wal-file">WAL files</glossterm>

+     An auxiliary process which (if enabled)
      writes information about database events into the current

6) Shouldn't we use
"<glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>"
instead of just plain "auxiliary process"?

7) Shouldn't we use
"<glossterm linkend="glossary-primary-server">primary</glossterm>"
instead of "primary server"?
+     to receive WAL from the primary server for replay by the

8) I agree with not calling walsender an auxiliary process, because it is
a type of <glossterm linkend="glossary-backend">backend</glossterm>
process that understands only replication commands. Rather than just
saying "A process that runs...", why can't we mention that in the
description?
+     A process that runs on a server that streams WAL over a
+     network.  The receiving end can be a
+     <glossterm linkend="glossary-wal-receiver">WAL receiver</glossterm>

Regards,
Bharath Rupireddy.



Thanks Bharath and Justin -- I think I took all the suggestions and made
a few other changes of my own.  Here's the result.

I'm not 100% happy with the historical note in "startup process", mostly
because it uses the word "name" three times too close to each other.
Didn't quickly see an obvious way to reword it to avoid that.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."                               (Nathaniel Smith)

On 2021-Sep-13, Alvaro Herrera wrote:

> Thanks Bharath and Justin -- I think I took all the suggestions and made
> a few other changes of my own.  Here's the result.

Pushed this with very minor additional changes, thanks.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/