Re: [HACKERS] Unportable implementation of background worker start - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Unportable implementation of background worker start
Date
Msg-id 31599.1492730608@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Unportable implementation of background worker start  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Unportable implementation of background worker start
List pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> Hm.  Do you have a more-portable alternative?

>> I was thinking in a WaitEventSet from latch.c.

> My first reaction was that that sounded like a lot more work than removing
> two lines from maybe_start_bgworker and adjusting some comments.  But on
> closer inspection, the slow-bgworker-start issue isn't the only problem
> here.  On a machine that restarts select()'s timeout after an interrupt,
> as (at least) HPUX does, the postmaster will actually never iterate
> ServerLoop's loop except immediately after receiving a new connection
> request.

I had a go at making the postmaster use a WaitEventSet, and immediately
ran into problems: latch.c is completely unprepared to be used anywhere
except a postmaster child process.  I think we can get around the issue
for the self-pipe, as per the attached untested patch.  But there remains
a problem: we should do a FreeWaitEventSet() after forking a child
process to ensure that postmaster children aren't running around with
open FDs for the postmaster's stuff.  This is no big problem in a regular
Unix build; we can give ClosePostmasterPorts() the responsibility.
It *is* a problem in EXEC_BACKEND children, which won't have inherited
the WaitEventSet data structure.  Maybe we could ignore the problem for
Unix EXEC_BACKEND builds, since we consider those to be for debug
purposes only, not for production.  But I don't think we can get away
with it for Windows --- or are the HANDLEs in a Windows WaitEventSet
not inheritable resources?

            regards, tom lane

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 4798370..d4ac54c 100644
*** a/src/backend/storage/ipc/latch.c
--- b/src/backend/storage/ipc/latch.c
*************** static volatile sig_atomic_t waiting = f
*** 129,134 ****
--- 129,136 ----
  /* Read and write ends of the self-pipe */
  static int    selfpipe_readfd = -1;
  static int    selfpipe_writefd = -1;
+ /* Process owning the self-pipe --- needed for checking purposes */
+ static int    selfpipe_owner_pid = 0;

  /* Private function prototypes */
  static void sendSelfPipeByte(void);
*************** InitializeLatchSupport(void)
*** 158,164 ****
  #ifndef WIN32
      int            pipefd[2];

!     Assert(selfpipe_readfd == -1);

      /*
       * Set up the self-pipe that allows a signal handler to wake up the
--- 160,202 ----
  #ifndef WIN32
      int            pipefd[2];

!     if (IsUnderPostmaster)
!     {
!         /*
!          * We might have inherited connections to a self-pipe created by the
!          * postmaster.  It's critical that child processes create their own
!          * self-pipes, of course, and we really want them to close the
!          * inherited FDs for safety's sake.
!          */
!         if (selfpipe_owner_pid != 0)
!         {
!             /* Assert we go through here but once in a child process */
!             Assert(selfpipe_owner_pid != MyProcPid);
!             /* Release postmaster's pipe */
!             close(selfpipe_readfd);
!             close(selfpipe_writefd);
!             /* Clean up, just for safety's sake; we'll set these in a bit */
!             selfpipe_readfd = selfpipe_writefd = -1;
!             selfpipe_owner_pid = 0;
!         }
!         else
!         {
!             /*
!              * Postmaster didn't create a self-pipe ... or else we're in an
!              * EXEC_BACKEND build, and don't know because we didn't inherit
!              * values for the static variables.  (In that case we'll fail to
!              * close the inherited FDs, but that seems acceptable since
!              * EXEC_BACKEND builds aren't meant for production on Unix.)
!              */
!             Assert(selfpipe_readfd == -1);
!         }
!     }
!     else
!     {
!         /* In postmaster or standalone backend, assert we do this but once */
!         Assert(selfpipe_readfd == -1);
!         Assert(selfpipe_owner_pid == 0);
!     }

      /*
       * Set up the self-pipe that allows a signal handler to wake up the
*************** InitializeLatchSupport(void)
*** 176,188 ****

      selfpipe_readfd = pipefd[0];
      selfpipe_writefd = pipefd[1];
  #else
      /* currently, nothing to do here for Windows */
  #endif
  }

  /*
!  * Initialize a backend-local latch.
   */
  void
  InitLatch(volatile Latch *latch)
--- 214,227 ----

      selfpipe_readfd = pipefd[0];
      selfpipe_writefd = pipefd[1];
+     selfpipe_owner_pid = MyProcPid;
  #else
      /* currently, nothing to do here for Windows */
  #endif
  }

  /*
!  * Initialize a process-local latch.
   */
  void
  InitLatch(volatile Latch *latch)
*************** InitLatch(volatile Latch *latch)
*** 193,199 ****

  #ifndef WIN32
      /* Assert InitializeLatchSupport has been called in this process */
!     Assert(selfpipe_readfd >= 0);
  #else
      latch->event = CreateEvent(NULL, TRUE, FALSE, NULL);
      if (latch->event == NULL)
--- 232,238 ----

  #ifndef WIN32
      /* Assert InitializeLatchSupport has been called in this process */
!     Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid);
  #else
      latch->event = CreateEvent(NULL, TRUE, FALSE, NULL);
      if (latch->event == NULL)
*************** OwnLatch(volatile Latch *latch)
*** 256,262 ****

  #ifndef WIN32
      /* Assert InitializeLatchSupport has been called in this process */
!     Assert(selfpipe_readfd >= 0);
  #endif

      if (latch->owner_pid != 0)
--- 295,301 ----

  #ifndef WIN32
      /* Assert InitializeLatchSupport has been called in this process */
!     Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid);
  #endif

      if (latch->owner_pid != 0)
*************** DisownLatch(volatile Latch *latch)
*** 289,295 ****
   * is incurred when WL_TIMEOUT is given, so avoid using a timeout if possible.
   *
   * The latch must be owned by the current process, ie. it must be a
!  * backend-local latch initialized with InitLatch, or a shared latch
   * associated with the current process by calling OwnLatch.
   *
   * Returns bit mask indicating which condition(s) caused the wake-up. Note
--- 328,334 ----
   * is incurred when WL_TIMEOUT is given, so avoid using a timeout if possible.
   *
   * The latch must be owned by the current process, ie. it must be a
!  * process-local latch initialized with InitLatch, or a shared latch
   * associated with the current process by calling OwnLatch.
   *
   * Returns bit mask indicating which condition(s) caused the wake-up. Note
*************** FreeWaitEventSet(WaitEventSet *set)
*** 597,603 ****
   * used to modify previously added wait events using ModifyWaitEvent().
   *
   * In the WL_LATCH_SET case the latch must be owned by the current process,
!  * i.e. it must be a backend-local latch initialized with InitLatch, or a
   * shared latch associated with the current process by calling OwnLatch.
   *
   * In the WL_SOCKET_READABLE/WRITEABLE case, EOF and error conditions are
--- 636,642 ----
   * used to modify previously added wait events using ModifyWaitEvent().
   *
   * In the WL_LATCH_SET case the latch must be owned by the current process,
!  * i.e. it must be a process-local latch initialized with InitLatch, or a
   * shared latch associated with the current process by calling OwnLatch.
   *
   * In the WL_SOCKET_READABLE/WRITEABLE case, EOF and error conditions are

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] WITH clause in CREATE STATISTICS
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] Unportable implementation of background worker start