Re: [HACKERS] Reducing pg_ctl's reaction time - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Reducing pg_ctl's reaction time
Date
Msg-id 15323.1498589958@sss.pgh.pa.us
In response to Re: [HACKERS] Reducing pg_ctl's reaction time  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Reducing pg_ctl's reaction time
Re: [HACKERS] Reducing pg_ctl's reaction time
List pgsql-hackers
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>>> Hm.  Take that a bit further, and we could drop the connection probes
>>> altogether --- just put the whole responsibility on the postmaster to
>>> show in the pidfile whether it's ready for connections or not.

>> Yea, that seems quite appealing, both from an architectural, simplicity,
>> and log noise perspective. I wonder if there's some added reliability by
>> the connection probe though? Essentially wondering if it'd be worthwhile
>> to keep a single connection test at the end. I'm somewhat disinclined
>> though.

> I agree --- part of the appeal of this idea is that there could be a net
> subtraction of code from pg_ctl.  (I think it wouldn't have to link libpq
> anymore at all, though maybe I forgot something.)  And you get rid of a
> bunch of can't-connect failure modes, eg kernel packet filter in the way,
> which probably outweighs any hypothetical reliability gain from confirming
> the postmaster state the old way.

Here's a draft patch for that.  I quite like the results --- this seems
way simpler and more reliable than what pg_ctl has done up to now.
However, it's certainly arguable that this is too much change for an
optional post-beta patch.  If we decide that it has to wait for v11,
I'd address Jeff's complaint by hacking the loop behavior in
test_postmaster_connection, which'd be ugly but not many lines of code.

Note that I followed the USE_SYSTEMD patch's lead as to where to report
postmaster state changes.  Arguably, in the standby-server case, we
should not report the postmaster is ready until we've reached consistency.
But that would require additional signaling from the startup process
to the postmaster, so it seems like a separate change if we want it.
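
To make the intended pg_ctl-side check concrete, here is a minimal sketch of
interpreting the status line from an in-memory copy of postmaster.pid. It is
not part of the patch. The names PmStatusGuess, find_line and
classify_pidfile are hypothetical; only the line number (8) and the status
strings come from the patch below. Like the patch, it uses prefix comparison
because the status may be blank-padded.

```c
#include <stdio.h>
#include <string.h>

/* Same value as LOCK_FILE_LINE_PM_STATUS in the patch */
#define STATUS_LINE 8

/* Hypothetical result type, for illustration only */
typedef enum
{
    PM_READY,                   /* "ready" or "standby" seen */
    PM_NOT_READY,               /* status line present, but something else */
    PM_NO_STATUS                /* file has no status line yet */
} PmStatusGuess;

/*
 * Return a pointer to the start of 1-based line "lineno" in buf,
 * or NULL if the buffer does not contain that line.
 */
static const char *
find_line(const char *buf, int lineno)
{
    const char *p = buf;
    int         i;

    for (i = 1; i < lineno; i++)
    {
        const char *eol = strchr(p, '\n');

        if (eol == NULL)
            return NULL;        /* not enough lines */
        p = eol + 1;
    }
    return (*p != '\0') ? p : NULL;
}

/* Classify pidfile contents by looking only at the status line. */
static PmStatusGuess
classify_pidfile(const char *contents)
{
    const char *status = find_line(contents, STATUS_LINE);

    if (status == NULL)
        return PM_NO_STATUS;

    /* status may be blank-padded, so compare prefixes only */
    if (strncmp(status, "ready", 5) == 0 ||
        strncmp(status, "standby", 7) == 0)
        return PM_READY;

    return PM_NOT_READY;
}
```

A caller would read the file into a NUL-terminated buffer and pass it to
classify_pidfile(). The PID and start-time sanity checks that the patch
performs before trusting the file are deliberately left out of this sketch.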

Thoughts?

            regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 2874f63..38f534f 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** PostmasterMain(int argc, char *argv[])
*** 1341,1346 ****
--- 1341,1354 ----
  #endif

      /*
+      * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
+      * see what's happening.  Note that all strings written to the status line
+      * must be the same length, per comments for AddToDataDirLockFile().  We
+      * currently make them all 8 bytes, padding with spaces.
+      */
+     AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");
+
+     /*
       * We're ready to rock and roll...
       */
      StartupPID = StartupDataBase();
*************** pmdie(SIGNAL_ARGS)
*** 2608,2613 ****
--- 2616,2624 ----
              Shutdown = SmartShutdown;
              ereport(LOG,
                      (errmsg("received smart shutdown request")));
+
+             /* Report status */
+             AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
              sd_notify(0, "STOPPING=1");
  #endif
*************** pmdie(SIGNAL_ARGS)
*** 2663,2668 ****
--- 2674,2682 ----
              Shutdown = FastShutdown;
              ereport(LOG,
                      (errmsg("received fast shutdown request")));
+
+             /* Report status */
+             AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
              sd_notify(0, "STOPPING=1");
  #endif
*************** pmdie(SIGNAL_ARGS)
*** 2727,2732 ****
--- 2741,2749 ----
              Shutdown = ImmediateShutdown;
              ereport(LOG,
                      (errmsg("received immediate shutdown request")));
+
+             /* Report status */
+             AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
              sd_notify(0, "STOPPING=1");
  #endif
*************** reaper(SIGNAL_ARGS)
*** 2872,2877 ****
--- 2889,2896 ----
              ereport(LOG,
                      (errmsg("database system is ready to accept connections")));

+             /* Report status */
+             AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
              sd_notify(0, "READY=1");
  #endif
*************** sigusr1_handler(SIGNAL_ARGS)
*** 5005,5014 ****
          if (XLogArchivingAlways())
              PgArchPID = pgarch_start();

! #ifdef USE_SYSTEMD
          if (!EnableHotStandby)
              sd_notify(0, "READY=1");
  #endif

          pmState = PM_RECOVERY;
      }
--- 5024,5041 ----
          if (XLogArchivingAlways())
              PgArchPID = pgarch_start();

!         /*
!          * If we aren't planning to enter hot standby mode later, treat
!          * RECOVERY_STARTED as meaning we're out of startup, and report status
!          * accordingly.
!          */
          if (!EnableHotStandby)
+         {
+             AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "standby ");
+ #ifdef USE_SYSTEMD
              sd_notify(0, "READY=1");
  #endif
+         }

          pmState = PM_RECOVERY;
      }
*************** sigusr1_handler(SIGNAL_ARGS)
*** 5024,5029 ****
--- 5051,5058 ----
          ereport(LOG,
                  (errmsg("database system is ready to accept read only connections")));

+         /* Report status */
+         AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
          sd_notify(0, "READY=1");
  #endif
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 49a6afa..216bcc7 100644
*** a/src/backend/utils/init/miscinit.c
--- b/src/backend/utils/init/miscinit.c
*************** TouchSocketLockFiles(void)
*** 1149,1156 ****
   *
   * Note: because we don't truncate the file, if we were to rewrite a line
   * with less data than it had before, there would be garbage after the last
!  * line.  We don't ever actually do that, so not worth adding another kernel
!  * call to cover the possibility.
   */
  void
  AddToDataDirLockFile(int target_line, const char *str)
--- 1149,1157 ----
   *
   * Note: because we don't truncate the file, if we were to rewrite a line
   * with less data than it had before, there would be garbage after the last
!  * line.  While we could fix that by adding a truncate call, that would make
!  * the file update non-atomic, which we'd rather avoid.  Therefore, callers
!  * should endeavor never to shorten a line once it's been written.
   */
  void
  AddToDataDirLockFile(int target_line, const char *str)
*************** AddToDataDirLockFile(int target_line, co
*** 1193,1211 ****
      srcptr = srcbuffer;
      for (lineno = 1; lineno < target_line; lineno++)
      {
!         if ((srcptr = strchr(srcptr, '\n')) == NULL)
!         {
!             elog(LOG, "incomplete data in \"%s\": found only %d newlines while trying to add line %d",
!                  DIRECTORY_LOCK_FILE, lineno - 1, target_line);
!             close(fd);
!             return;
!         }
!         srcptr++;
      }
      memcpy(destbuffer, srcbuffer, srcptr - srcbuffer);
      destptr = destbuffer + (srcptr - srcbuffer);

      /*
       * Write or rewrite the target line.
       */
      snprintf(destptr, destbuffer + sizeof(destbuffer) - destptr, "%s\n", str);
--- 1194,1219 ----
      srcptr = srcbuffer;
      for (lineno = 1; lineno < target_line; lineno++)
      {
!         char       *eol = strchr(srcptr, '\n');
!
!         if (eol == NULL)
!             break;                /* not enough lines in file yet */
!         srcptr = eol + 1;
      }
      memcpy(destbuffer, srcbuffer, srcptr - srcbuffer);
      destptr = destbuffer + (srcptr - srcbuffer);

      /*
+      * Fill in any missing lines before the target line, in case lines are
+      * added to the file out of order.
+      */
+     for (; lineno < target_line; lineno++)
+     {
+         if (destptr < destbuffer + sizeof(destbuffer))
+             *destptr++ = '\n';
+     }
+
+     /*
       * Write or rewrite the target line.
       */
      snprintf(destptr, destbuffer + sizeof(destbuffer) - destptr, "%s\n", str);
diff --git a/src/bin/pg_ctl/Makefile b/src/bin/pg_ctl/Makefile
index f5ec088..46f30bd 100644
*** a/src/bin/pg_ctl/Makefile
--- b/src/bin/pg_ctl/Makefile
*************** subdir = src/bin/pg_ctl
*** 16,29 ****
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global

- override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-
  OBJS=    pg_ctl.o $(WIN32RES)

  all: pg_ctl

! pg_ctl: $(OBJS) | submake-libpq submake-libpgport
!     $(CC) $(CFLAGS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

  install: all installdirs
      $(INSTALL_PROGRAM) pg_ctl$(X) '$(DESTDIR)$(bindir)/pg_ctl$(X)'
--- 16,27 ----
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global

  OBJS=    pg_ctl.o $(WIN32RES)

  all: pg_ctl

! pg_ctl: $(OBJS) | submake-libpgport
!     $(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

  install: all installdirs
      $(INSTALL_PROGRAM) pg_ctl$(X) '$(DESTDIR)$(bindir)/pg_ctl$(X)'
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index ad2a16f..81b276c 100644
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
***************
*** 34,42 ****
  #include "catalog/pg_control.h"
  #include "common/controldata_utils.h"
  #include "getopt_long.h"
- #include "libpq-fe.h"
  #include "miscadmin.h"
- #include "pqexpbuffer.h"

  /* PID can be negative for standalone backend */
  typedef long pgpid_t;
--- 34,40 ----
*************** typedef enum
*** 49,54 ****
--- 47,58 ----
      IMMEDIATE_MODE
  } ShutdownMode;

+ typedef enum
+ {
+     POSTMASTER_READY,
+     POSTMASTER_STILL_STARTING,
+     POSTMASTER_FAILED
+ } WaitPMResult;

  typedef enum
  {
*************** static int    CreateRestrictedProcess(char
*** 147,158 ****
  #endif

  static pgpid_t get_pgpid(bool is_status_request);
! static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static pgpid_t start_postmaster(void);
  static void read_post_opts(void);

! static PGPing test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint);
  static bool postmaster_is_alive(pid_t pid);

  #if defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE)
--- 151,162 ----
  #endif

  static pgpid_t get_pgpid(bool is_status_request);
! static char **readfile(const char *path, int *numlines);
  static void free_readfile(char **optlines);
  static pgpid_t start_postmaster(void);
  static void read_post_opts(void);

! static WaitPMResult wait_for_postmaster(pgpid_t pm_pid, bool do_checkpoint);
  static bool postmaster_is_alive(pid_t pid);

  #if defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE)
*************** get_pgpid(bool is_status_request)
*** 304,312 ****

  /*
   * get the lines from a text file - return NULL if file can't be opened
   */
  static char **
! readfile(const char *path)
  {
      int            fd;
      int            nlines;
--- 308,319 ----

  /*
   * get the lines from a text file - return NULL if file can't be opened
+  *
+  * *numlines is set to the number of line pointers returned; there is
+  * also an additional NULL pointer after the last real line.
   */
  static char **
! readfile(const char *path, int *numlines)
  {
      int            fd;
      int            nlines;
*************** readfile(const char *path)
*** 318,323 ****
--- 325,332 ----
      int            len;
      struct stat statbuf;

+     *numlines = 0;                /* in case of failure or empty file */
+
      /*
       * Slurp the file into memory.
       *
*************** readfile(const char *path)
*** 367,372 ****
--- 376,382 ----

      /* set up the result buffer */
      result = (char **) pg_malloc((nlines + 1) * sizeof(char *));
+     *numlines = nlines;

      /* now split the buffer into lines */
      linebegin = buffer;
*************** start_postmaster(void)
*** 509,515 ****


  /*
!  * Find the pgport and try a connection
   *
   * On Unix, pm_pid is the PID of the just-launched postmaster.  On Windows,
   * it may be the PID of an ancestor shell process, so we can't check the
--- 519,525 ----


  /*
!  * Wait for the postmaster to become ready.
   *
   * On Unix, pm_pid is the PID of the just-launched postmaster.  On Windows,
   * it may be the PID of an ancestor shell process, so we can't check the
*************** start_postmaster(void)
*** 522,689 ****
   * Note that the checkpoint parameter enables a Windows service control
   * manager checkpoint, it's got nothing to do with database checkpoints!!
   */
! static PGPing
! test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
  {
-     PGPing        ret = PQPING_NO_RESPONSE;
-     char        connstr[MAXPGPATH * 2 + 256];
      int            i;

-     /* if requested wait time is zero, return "still starting up" code */
-     if (wait_seconds <= 0)
-         return PQPING_REJECT;
-
-     connstr[0] = '\0';
-
      for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
      {
!         /* Do we need a connection string? */
!         if (connstr[0] == '\0')
!         {
!             /*----------
!              * The number of lines in postmaster.pid tells us several things:
!              *
!              * # of lines
!              *        0    lock file created but status not written
!              *        2    pre-9.1 server, shared memory not created
!              *        3    pre-9.1 server, shared memory created
!              *        5    9.1+ server, ports not opened
!              *        6    9.1+ server, shared memory not created
!              *        7    9.1+ server, shared memory created
!              *
!              * This code does not support pre-9.1 servers.  On Unix machines
!              * we could consider extracting the port number from the shmem
!              * key, but that (a) is not robust, and (b) doesn't help with
!              * finding out the socket directory.  And it wouldn't work anyway
!              * on Windows.
!              *
!              * If we see less than 6 lines in postmaster.pid, just keep
!              * waiting.
!              *----------
!              */
!             char      **optlines;

!             /* Try to read the postmaster.pid file */
!             if ((optlines = readfile(pid_file)) != NULL &&
!                 optlines[0] != NULL &&
!                 optlines[1] != NULL &&
!                 optlines[2] != NULL)
!             {
!                 if (optlines[3] == NULL)
!                 {
!                     /* File is exactly three lines, must be pre-9.1 */
!                     write_stderr(_("\n%s: -w option is not supported when starting a pre-9.1 server\n"),
!                                  progname);
!                     return PQPING_NO_ATTEMPT;
!                 }
!                 else if (optlines[4] != NULL &&
!                          optlines[5] != NULL)
!                 {
!                     /* File is complete enough for us, parse it */
!                     pgpid_t        pmpid;
!                     time_t        pmstart;

!                     /*
!                      * Make sanity checks.  If it's for the wrong PID, or the
!                      * recorded start time is before pg_ctl started, then
!                      * either we are looking at the wrong data directory, or
!                      * this is a pre-existing pidfile that hasn't (yet?) been
!                      * overwritten by our child postmaster.  Allow 2 seconds
!                      * slop for possible cross-process clock skew.
!                      */
!                     pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
!                     pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
!                     if (pmstart >= start_time - 2 &&
  #ifndef WIN32
!                         pmpid == pm_pid
  #else
!                     /* Windows can only reject standalone-backend PIDs */
!                         pmpid > 0
  #endif
!                         )
!                     {
!                         /*
!                          * OK, seems to be a valid pidfile from our child.
!                          */
!                         int            portnum;
!                         char       *sockdir;
!                         char       *hostaddr;
!                         char        host_str[MAXPGPATH];
!
!                         /*
!                          * Extract port number and host string to use. Prefer
!                          * using Unix socket if available.
!                          */
!                         portnum = atoi(optlines[LOCK_FILE_LINE_PORT - 1]);
!                         sockdir = optlines[LOCK_FILE_LINE_SOCKET_DIR - 1];
!                         hostaddr = optlines[LOCK_FILE_LINE_LISTEN_ADDR - 1];
!
!                         /*
!                          * While unix_socket_directories can accept relative
!                          * directories, libpq's host parameter must have a
!                          * leading slash to indicate a socket directory.  So,
!                          * ignore sockdir if it's relative, and try to use TCP
!                          * instead.
!                          */
!                         if (sockdir[0] == '/')
!                             strlcpy(host_str, sockdir, sizeof(host_str));
!                         else
!                             strlcpy(host_str, hostaddr, sizeof(host_str));
!
!                         /* remove trailing newline */
!                         if (strchr(host_str, '\n') != NULL)
!                             *strchr(host_str, '\n') = '\0';
!
!                         /* Fail if couldn't get either sockdir or host addr */
!                         if (host_str[0] == '\0')
!                         {
!                             write_stderr(_("\n%s: -w option cannot use a relative socket directory specification\n"),
!                                          progname);
!                             return PQPING_NO_ATTEMPT;
!                         }
!
!                         /*
!                          * Map listen-only addresses to counterparts usable
!                          * for establishing a connection.  connect() to "::"
!                          * or "0.0.0.0" is not portable to OpenBSD 5.0 or to
!                          * Windows Server 2008, and connect() to "::" is
!                          * additionally not portable to NetBSD 6.0.  (Cygwin
!                          * does handle both addresses, though.)
!                          */
!                         if (strcmp(host_str, "*") == 0)
!                             strcpy(host_str, "localhost");
!                         else if (strcmp(host_str, "0.0.0.0") == 0)
!                             strcpy(host_str, "127.0.0.1");
!                         else if (strcmp(host_str, "::") == 0)
!                             strcpy(host_str, "::1");

!                         /*
!                          * We need to set connect_timeout otherwise on Windows
!                          * the Service Control Manager (SCM) will probably
!                          * timeout first.
!                          */
!                         snprintf(connstr, sizeof(connstr),
!                                  "dbname=postgres port=%d host='%s' connect_timeout=5",
!                                  portnum, host_str);
!                     }
                  }
              }
-
-             /*
-              * Free the results of readfile.
-              *
-              * This is safe to call even if optlines is NULL.
-              */
-             free_readfile(optlines);
          }

!         /* If we have a connection string, ping the server */
!         if (connstr[0] != '\0')
!         {
!             ret = PQping(connstr);
!             if (ret == PQPING_OK || ret == PQPING_NO_ATTEMPT)
!                 break;
!         }

          /*
           * Check whether the child postmaster process is still alive.  This
--- 532,599 ----
   * Note that the checkpoint parameter enables a Windows service control
   * manager checkpoint, it's got nothing to do with database checkpoints!!
   */
! static WaitPMResult
! wait_for_postmaster(pgpid_t pm_pid, bool do_checkpoint)
  {
      int            i;

      for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
      {
!         char      **optlines;
!         int            numlines;

!         /*
!          * Try to read the postmaster.pid file.  If it's not valid, or if the
!          * status line isn't there yet, just keep waiting.
!          */
!         if ((optlines = readfile(pid_file, &numlines)) != NULL &&
!             numlines >= LOCK_FILE_LINE_PM_STATUS)
!         {
!             /* File is complete enough for us, parse it */
!             pgpid_t        pmpid;
!             time_t        pmstart;

!             /*
!              * Make sanity checks.  If it's for the wrong PID, or the recorded
!              * start time is before pg_ctl started, then either we are looking
!              * at the wrong data directory, or this is a pre-existing pidfile
!              * that hasn't (yet?) been overwritten by our child postmaster.
!              * Allow 2 seconds slop for possible cross-process clock skew.
!              */
!             pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
!             pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
!             if (pmstart >= start_time - 2 &&
  #ifndef WIN32
!                 pmpid == pm_pid
  #else
!             /* Windows can only reject standalone-backend PIDs */
!                 pmpid > 0
  #endif
!                 )
!             {
!                 /*
!                  * OK, seems to be a valid pidfile from our child.  Check the
!                  * status line (this assumes a v10 or later server).
!                  */
!                 char       *pmstatus = optlines[LOCK_FILE_LINE_PM_STATUS - 1];

!                 /* status line may be blank-padded */
!                 if (strncmp(pmstatus, "ready", 5) == 0 ||
!                     strncmp(pmstatus, "standby", 7) == 0)
!                 {
!                     /* postmaster is done starting up */
!                     free_readfile(optlines);
!                     return POSTMASTER_READY;
                  }
              }
          }

!         /*
!          * Free the results of readfile.
!          *
!          * This is safe to call even if optlines is NULL.
!          */
!         free_readfile(optlines);

          /*
           * Check whether the child postmaster process is still alive.  This
*************** test_postmaster_connection(pgpid_t pm_pi
*** 697,710 ****
              int            exitstatus;

              if (waitpid((pid_t) pm_pid, &exitstatus, WNOHANG) == (pid_t) pm_pid)
!                 return PQPING_NO_RESPONSE;
          }
  #else
          if (WaitForSingleObject(postmasterProcess, 0) == WAIT_OBJECT_0)
!             return PQPING_NO_RESPONSE;
  #endif

!         /* No response, or startup still in process; wait */
          if (i % WAITS_PER_SEC == 0)
          {
  #ifdef WIN32
--- 607,620 ----
              int            exitstatus;

              if (waitpid((pid_t) pm_pid, &exitstatus, WNOHANG) == (pid_t) pm_pid)
!                 return POSTMASTER_FAILED;
          }
  #else
          if (WaitForSingleObject(postmasterProcess, 0) == WAIT_OBJECT_0)
!             return POSTMASTER_FAILED;
  #endif

!         /* Startup still in process; wait, printing a dot once per second */
          if (i % WAITS_PER_SEC == 0)
          {
  #ifdef WIN32
*************** test_postmaster_connection(pgpid_t pm_pi
*** 729,736 ****
          pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);
      }

!     /* return result of last call to PQping */
!     return ret;
  }


--- 639,646 ----
          pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);
      }

!     /* out of patience; report that postmaster is still starting up */
!     return POSTMASTER_STILL_STARTING;
  }


*************** read_post_opts(void)
*** 764,777 ****
          if (ctl_command == RESTART_COMMAND)
          {
              char      **optlines;

!             optlines = readfile(postopts_file);
              if (optlines == NULL)
              {
                  write_stderr(_("%s: could not read file \"%s\"\n"), progname, postopts_file);
                  exit(1);
              }
!             else if (optlines[0] == NULL || optlines[1] != NULL)
              {
                  write_stderr(_("%s: option file \"%s\" must have exactly one line\n"),
                               progname, postopts_file);
--- 674,688 ----
          if (ctl_command == RESTART_COMMAND)
          {
              char      **optlines;
+             int            numlines;

!             optlines = readfile(postopts_file, &numlines);
              if (optlines == NULL)
              {
                  write_stderr(_("%s: could not read file \"%s\"\n"), progname, postopts_file);
                  exit(1);
              }
!             else if (numlines != 1)
              {
                  write_stderr(_("%s: option file \"%s\" must have exactly one line\n"),
                               progname, postopts_file);
*************** do_start(void)
*** 917,944 ****
      {
          print_msg(_("waiting for server to start..."));

!         switch (test_postmaster_connection(pm_pid, false))
          {
!             case PQPING_OK:
                  print_msg(_(" done\n"));
                  print_msg(_("server started\n"));
                  break;
!             case PQPING_REJECT:
                  print_msg(_(" stopped waiting\n"));
                  print_msg(_("server is still starting up\n"));
                  break;
!             case PQPING_NO_RESPONSE:
                  print_msg(_(" stopped waiting\n"));
                  write_stderr(_("%s: could not start server\n"
                                 "Examine the log output.\n"),
                               progname);
                  exit(1);
                  break;
-             case PQPING_NO_ATTEMPT:
-                 print_msg(_(" failed\n"));
-                 write_stderr(_("%s: could not wait for server because of misconfiguration\n"),
-                              progname);
-                 exit(1);
          }
      }
      else
--- 828,850 ----
      {
          print_msg(_("waiting for server to start..."));

!         switch (wait_for_postmaster(pm_pid, false))
          {
!             case POSTMASTER_READY:
                  print_msg(_(" done\n"));
                  print_msg(_("server started\n"));
                  break;
!             case POSTMASTER_STILL_STARTING:
                  print_msg(_(" stopped waiting\n"));
                  print_msg(_("server is still starting up\n"));
                  break;
!             case POSTMASTER_FAILED:
                  print_msg(_(" stopped waiting\n"));
                  write_stderr(_("%s: could not start server\n"
                                 "Examine the log output.\n"),
                               progname);
                  exit(1);
                  break;
          }
      }
      else
*************** do_status(void)
*** 1319,1329 ****
              {
                  char      **optlines;
                  char      **curr_line;

                  printf(_("%s: server is running (PID: %ld)\n"),
                         progname, pid);

!                 optlines = readfile(postopts_file);
                  if (optlines != NULL)
                  {
                      for (curr_line = optlines; *curr_line != NULL; curr_line++)
--- 1225,1236 ----
              {
                  char      **optlines;
                  char      **curr_line;
+                 int            numlines;

                  printf(_("%s: server is running (PID: %ld)\n"),
                         progname, pid);

!                 optlines = readfile(postopts_file, &numlines);
                  if (optlines != NULL)
                  {
                      for (curr_line = optlines; *curr_line != NULL; curr_line++)
*************** pgwin32_ServiceMain(DWORD argc, LPTSTR *
*** 1634,1640 ****
      if (do_wait)
      {
          write_eventlog(EVENTLOG_INFORMATION_TYPE, _("Waiting for server startup...\n"));
!         if (test_postmaster_connection(postmasterPID, true) != PQPING_OK)
          {
              write_eventlog(EVENTLOG_ERROR_TYPE, _("Timed out waiting for server startup\n"));
              pgwin32_SetServiceStatus(SERVICE_STOPPED);
--- 1541,1547 ----
      if (do_wait)
      {
          write_eventlog(EVENTLOG_INFORMATION_TYPE, _("Waiting for server startup...\n"));
!         if (wait_for_postmaster(postmasterPID, true) != POSTMASTER_READY)
          {
              write_eventlog(EVENTLOG_ERROR_TYPE, _("Timed out waiting for server startup\n"));
              pgwin32_SetServiceStatus(SERVICE_STOPPED);
*************** pgwin32_ServiceMain(DWORD argc, LPTSTR *
*** 1655,1661 ****
              {
                  /*
                   * status.dwCheckPoint can be incremented by
!                  * test_postmaster_connection(), so it might not start from 0.
                   */
                  int            maxShutdownCheckPoint = status.dwCheckPoint + 12;

--- 1562,1568 ----
              {
                  /*
                   * status.dwCheckPoint can be incremented by
!                  * wait_for_postmaster(), so it might not start from 0.
                   */
                  int            maxShutdownCheckPoint = status.dwCheckPoint + 12;

diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 21a7728..0a07a02 100644
*** a/src/include/miscadmin.h
--- b/src/include/miscadmin.h
*************** extern char *shared_preload_libraries_st
*** 432,438 ****
  extern char *local_preload_libraries_string;

  /*
!  * As of 9.1, the contents of the data-directory lock file are:
   *
   * line #
   *        1    postmaster PID (or negative of a standalone backend's PID)
--- 432,438 ----
  extern char *local_preload_libraries_string;

  /*
!  * As of Postgres 10, the contents of the data-directory lock file are:
   *
   * line #
   *        1    postmaster PID (or negative of a standalone backend's PID)
*************** extern char *local_preload_libraries_str
*** 441,452 ****
   *        4    port number
   *        5    first Unix socket directory path (empty if none)
   *        6    first listen_address (IP address or "*"; empty if no TCP port)
!  *        7    shared memory key (not present on Windows)
   *
   * Lines 6 and up are added via AddToDataDirLockFile() after initial file
!  * creation.
   *
!  * The socket lock file, if used, has the same contents as lines 1-5.
   */
  #define LOCK_FILE_LINE_PID            1
  #define LOCK_FILE_LINE_DATA_DIR        2
--- 441,455 ----
   *        4    port number
   *        5    first Unix socket directory path (empty if none)
   *        6    first listen_address (IP address or "*"; empty if no TCP port)
!  *        7    shared memory key (empty on Windows)
!  *        8    postmaster status ("starting", "stopping", "ready   ", "standby ")
   *
   * Lines 6 and up are added via AddToDataDirLockFile() after initial file
!  * creation; also, line 5 is initially empty and is changed after the first
!  * Unix socket is opened.
   *
!  * Socket lock file(s), if used, have the same contents as lines 1-5, with
!  * line 5 being their own directory.
   */
  #define LOCK_FILE_LINE_PID            1
  #define LOCK_FILE_LINE_DATA_DIR        2
*************** extern char *local_preload_libraries_str
*** 455,460 ****
--- 458,464 ----
  #define LOCK_FILE_LINE_SOCKET_DIR    5
  #define LOCK_FILE_LINE_LISTEN_ADDR    6
  #define LOCK_FILE_LINE_SHMEM_KEY    7
+ #define LOCK_FILE_LINE_PM_STATUS    8

  extern void CreateDataDirLockFile(bool amPostmaster);
  extern void CreateSocketLockFile(const char *socketfile, bool amPostmaster,
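
For readers who want to experiment with the line-rewriting logic outside the
backend, the following is a simplified in-memory sketch of the approach the
AddToDataDirLockFile() hunk takes: copy the lines before the target, pad with
empty lines if the file is too short, write the target line, then append
whatever followed the old target line. The function name set_line and its
signature are hypothetical. It assumes every existing line ends with a
newline, and it does no file I/O at all.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build in "dest" a copy of "src" in which 1-based
 * line "target_line" is set to "str".  Returns 0 on success, or -1 if
 * the result does not fit in destsz bytes.
 */
static int
set_line(const char *src, int target_line, const char *str,
         char *dest, size_t destsz)
{
    const char *srcptr = src;
    char       *destptr = dest;
    char       *destend = dest + destsz;
    int         lineno;
    int         n;

    /* Advance over existing lines that precede the target line. */
    for (lineno = 1; lineno < target_line; lineno++)
    {
        const char *eol = strchr(srcptr, '\n');

        if (eol == NULL)
            break;              /* not enough lines in the input yet */
        srcptr = eol + 1;
    }
    if ((size_t) (srcptr - src) >= destsz)
        return -1;
    memcpy(destptr, src, srcptr - src);
    destptr += srcptr - src;

    /* Fill in any missing lines before the target line. */
    for (; lineno < target_line; lineno++)
    {
        if (destptr + 1 >= destend)
            return -1;
        *destptr++ = '\n';
    }

    /* Write or rewrite the target line. */
    n = snprintf(destptr, destend - destptr, "%s\n", str);
    if (n < 0 || (size_t) n >= (size_t) (destend - destptr))
        return -1;
    destptr += n;

    /* Append any lines that followed the old target line. */
    srcptr = strchr(srcptr, '\n');
    if (srcptr != NULL)
    {
        srcptr++;
        n = snprintf(destptr, destend - destptr, "%s", srcptr);
        if (n < 0 || (size_t) n >= (size_t) (destend - destptr))
            return -1;
    }
    return 0;
}
```

Rewriting "starting" as "ready   " leaves the total length unchanged, which
is why the status strings are padded to one width: the file is overwritten
without truncation, so a shorter replacement would leave stale bytes at the
end.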

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
