Re: Using WaitEventSet in the postmaster - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Using WaitEventSet in the postmaster
Date
Msg-id 20221205180939.2p6zcrtevukgyay7@awork3.anarazel.de
Whole thread Raw
In response to Re: Using WaitEventSet in the postmaster  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Using WaitEventSet in the postmaster  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2022-12-05 22:45:57 +1300, Thomas Munro wrote:
> On Sat, Dec 3, 2022 at 10:41 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > Here's an iteration like that.  Still WIP grade.  It passes, but there
> > must be something I don't understand about this computer program yet,
> > because if I move the "if (pending_..." section up into the block
> > where WL_LATCH_SET has arrived (instead of testing those variables
> > every time through the loop), a couple of tests leave zombie
> > (unreaped) processes behind, indicating that something funky happened
> > to the state machine that I haven't yet grokked.  Will look more next
> > week.
> 
> Duh.  The reason for that was the pre-existing special case for
> PM_WAIT_DEAD_END, which used a sleep(100ms) loop to wait for children
> to exit, which I needed to change to a latch wait.  Fixed in the next
> iteration, attached.
> 
> The reason for the existing sleep-based approach was that we didn't
> want to accept any more connections (or spin furiously because the
> listen queue was non-empty).  So in this version I invented a way to
> suppress socket events temporarily with WL_SOCKET_IGNORE, and then
> reactivate them after crash reinit.

That seems like an odd special flag. Why do we need it? Is it just because we
want to have assertions ensuring that something is being queried?


>  * WL_SOCKET_ACCEPT is a new event for an incoming connection (on Unix,
>    this is just another name for WL_SOCKET_READABLE, but Window has a
>    different underlying event; this mirrors WL_SOCKET_CONNECTED on the
>    other end of a connection)

Perhaps worth committing separately and soon? Seems pretty uncontroversial
from here.


> +/*
> + * Object representing the state of a postmaster.
> + *
> + * XXX Lots of global variables could move in here.
> + */
> +typedef struct
> +{
> +    WaitEventSet    *wes;
> +} Postmaster;
> +

Seems weird to introduce this but then basically have it be unused. I'd say
either have a preceding patch move at least a few members into it, or just
omit it for now.


> +    /* This may configure SIGURG, depending on platform. */
> +    InitializeLatchSupport();
> +    InitLocalLatch();

I'm mildly preferring InitProcessLocalLatch(), but not sure why - there's not
really a conflicting meaning of "local" here.


> +/*
> + * Initialize the WaitEventSet we'll use in our main event loop.
> + */
> +static void
> +InitializeWaitSet(Postmaster *postmaster)
> +{
> +    /* Set up a WaitEventSet for our latch and listening sockets. */
> +    postmaster->wes = CreateWaitEventSet(CurrentMemoryContext, 1 + MAXLISTEN);
> +    AddWaitEventToSet(postmaster->wes, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
> +    for (int i = 0; i < MAXLISTEN; i++)
> +    {
> +        int            fd = ListenSocket[i];
> +
> +        if (fd == PGINVALID_SOCKET)
> +            break;
> +        AddWaitEventToSet(postmaster->wes, WL_SOCKET_ACCEPT, fd, NULL, NULL);
> +    }
> +}

The naming seems like it could be confused with latch.h
infrastructure. InitPostmasterWaitSet()?


> +/*
> + * Activate or deactivate the server socket events.
> + */
> +static void
> +AdjustServerSocketEvents(Postmaster *postmaster, bool active)
> +{
> +    for (int pos = 1; pos < GetNumRegisteredWaitEvents(postmaster->wes); ++pos)
> +        ModifyWaitEvent(postmaster->wes,
> +                        pos, active ? WL_SOCKET_ACCEPT : WL_SOCKET_IGNORE,
> +                        NULL);
> +}

This seems to hardcode the specific wait events we're waiting for based on
latch.c infrastructure. Not really convinced that's a good idea.

> +        /*
> +         * Latch set by signal handler, or new connection pending on any of our
> +         * sockets? If the latter, fork a child process to deal with it.
> +         */
> +        for (int i = 0; i < nevents; i++)
>          {
> -            if (errno != EINTR && errno != EWOULDBLOCK)
> +            if (events[i].events & WL_LATCH_SET)
>              {
> -                ereport(LOG,
> -                        (errcode_for_socket_access(),
> -                         errmsg("select() failed in postmaster: %m")));
> -                return STATUS_ERROR;
> +                ResetLatch(MyLatch);
> +
> +                /* Process work scheduled by signal handlers. */
> +                if (pending_action_request)
> +                    process_action_request(postmaster);
> +                if (pending_child_exit)
> +                    process_child_exit(postmaster);
> +                if (pending_reload_request)
> +                    process_reload_request();
> +                if (pending_shutdown_request)
> +                    process_shutdown_request(postmaster);
>              }

Is the order of operations here quite right? Shouldn't we process a shutdown
request before the others? And a child exit before the request to start an
autovac worker etc?

ISTM it should be 1) shutdown request 2) child exit 3) config reload 4) action
request.


>  /*
> - * pmdie -- signal handler for processing various postmaster signals.
> + * pg_ctl uses SIGTERM, SIGINT and SIGQUIT to request different types of
> + * shutdown.
>   */
>  static void
> -pmdie(SIGNAL_ARGS)
> +handle_shutdown_request_signal(SIGNAL_ARGS)
>  {
> -    int            save_errno = errno;
> -
> -    ereport(DEBUG2,
> -            (errmsg_internal("postmaster received signal %d",
> -                             postgres_signal_arg)));
> +    int save_errno = errno;
>  
>      switch (postgres_signal_arg)
>      {
>          case SIGTERM:
> +            pending_shutdown_request = SmartShutdown;
> +            break;
> +        case SIGINT:
> +            pending_shutdown_request = FastShutdown;
> +            break;
> +        case SIGQUIT:
> +            pending_shutdown_request = ImmediateShutdown;
> +            break;
> +    }
> +    SetLatch(MyLatch);
> +
> +    errno = save_errno;
> +}

Hm, not having the "postmaster received signal" message anymore seems like a
loss when debugging things. I think process_shutdown_request() should emit
something like it.

I wonder if we should have a elog_sighand() that's written to be signal
safe. I've written versions of that numerous times for debugging, and it's a
bit silly to do that over and over again.



> @@ -2905,23 +2926,33 @@ pmdie(SIGNAL_ARGS)
>               * Now wait for backends to exit.  If there are none,
>               * PostmasterStateMachine will take the next step.
>               */
> -            PostmasterStateMachine();
> +            PostmasterStateMachine(postmaster);
>              break;

I'm by now fairly certain that it's a bad idea to have this change mixed in
with the rest of this large-ish change.


>  static void
> -PostmasterStateMachine(void)
> +PostmasterStateMachine(Postmaster *postmaster)
>  {
>      /* If we're doing a smart shutdown, try to advance that state. */
>      if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
> @@ -3819,6 +3849,9 @@ PostmasterStateMachine(void)
>              Assert(AutoVacPID == 0);
>              /* syslogger is not considered here */
>              pmState = PM_NO_CHILDREN;
> +
> +            /* re-activate server socket events */
> +            AdjustServerSocketEvents(postmaster, true);
>          }
>      }
>  
> @@ -3905,6 +3938,9 @@ PostmasterStateMachine(void)
>          pmState = PM_STARTUP;
>          /* crash recovery started, reset SIGKILL flag */
>          AbortStartTime = 0;
> +
> +        /* start accepting server socket connection events again */
> +        reenable_server_socket_events = true;
>      }
>  }

I don't think reenable_server_socket_events does anything as the patch stands
- I don't see it being checked anywhere?  And in the path above, you're using
AdjustServerSocketEvents() directly.


> @@ -4094,6 +4130,7 @@ BackendStartup(Port *port)
>      /* Hasn't asked to be notified about any bgworkers yet */
>      bn->bgworker_notify = false;
>  
> +    PG_SETMASK(&BlockSig);
>  #ifdef EXEC_BACKEND
>      pid = backend_forkexec(port);
>  #else                            /* !EXEC_BACKEND */

There are other calls to fork_process() - why don't they need the same
treatment?

Perhaps we should add an assertion to fork_process() ensuring that signals are
masked?


> diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
> index eb3a569aae..3bfef592eb 100644
> --- a/src/backend/storage/ipc/latch.c
> +++ b/src/backend/storage/ipc/latch.c
> @@ -283,6 +283,17 @@ InitializeLatchSupport(void)
>  #ifdef WAIT_USE_SIGNALFD
>      sigset_t    signalfd_mask;
>  
> +    if (IsUnderPostmaster)
> +    {
> +        if (signal_fd != -1)
> +        {
> +            /* Release postmaster's signal FD; ignore any error */
> +            (void) close(signal_fd);
> +            signal_fd = -1;
> +            ReleaseExternalFD();
> +        }
> +    }
> +

Hm - arguably it's a bug that we don't do this right now, correct?

> @@ -1201,6 +1214,7 @@ WaitEventAdjustKqueue(WaitEventSet *set, WaitEvent *event, int old_events)
>             event->events == WL_POSTMASTER_DEATH ||
>             (event->events & (WL_SOCKET_READABLE |
>                               WL_SOCKET_WRITEABLE |
> +                             WL_SOCKET_IGNORE |
>                               WL_SOCKET_CLOSED)));
>  
>      if (event->events == WL_POSTMASTER_DEATH)
> @@ -1312,6 +1326,8 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event)
>              flags |= FD_WRITE;
>          if (event->events & WL_SOCKET_CONNECTED)
>              flags |= FD_CONNECT;
> +        if (event->events & WL_SOCKET_ACCEPT)
> +            flags |= FD_ACCEPT;
>  
>          if (*handle == WSA_INVALID_EVENT)
>          {

I wonder if the code would end up easier to understand if we handled
WL_SOCKET_CONNECTED, WL_SOCKET_ACCEPT explicitly in the !WIN32 cases, rather
than redefining it to WL_SOCKET_READABLE.



> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 3082093d1e..655e881688 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -24,7 +24,6 @@
>  #include <signal.h>
>  #include <unistd.h>
>  #include <sys/resource.h>
> -#include <sys/select.h>
>  #include <sys/socket.h>
>  #include <sys/time.h>

Do you know why this include even existed?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: ANY_VALUE aggregate
Next
From: Andres Freund
Date:
Subject: Re: [PATCH] Add native windows on arm64 support