Thread: Fix pg_ctl restart bug

Fix pg_ctl restart bug

From
Bruce Momjian
Date:
We document 'pg_ctl restart' to preserve any command-line arguments
passed when the server was started:

    Restarting the server is almost equivalent to stopping the
    server and starting it again except that <command>pg_ctl</command>
    saves and reuses the command line options that were passed
    to the previously running instance.  To restart

However, as of 2004-10-15, this has not worked.  :-(  The attached patch
is the one that caused the bug --- on non-Unix systems, SYSTEMQUOTE is
"", meaning zero-length string.  I should have seen the bug when I
applied the contributed patch in 2004.

The second attached applied patch fixes the problem.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.433
retrieving revision 1.434
diff -c -r1.433 -r1.434
*** src/backend/postmaster/postmaster.c    14 Oct 2004 20:23:45 -0000    1.433
--- src/backend/postmaster/postmaster.c    15 Oct 2004 04:54:31 -0000    1.434
***************
*** 3301,3307 ****

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " '%s'", argv[i]);
      fputs("\n", fp);

      if (fclose(fp))
--- 3301,3307 ----

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " %s%s%s", SYSTEMQUOTE, argv[i], SYSTEMQUOTE);
      fputs("\n", fp);

      if (fclose(fp))
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.37
retrieving revision 1.38
diff -c -r1.37 -r1.38
*** src/bin/pg_ctl/pg_ctl.c    15 Oct 2004 04:32:14 -0000    1.37
--- src/bin/pg_ctl/pg_ctl.c    15 Oct 2004 04:54:33 -0000    1.38
***************
*** 501,507 ****
              {
                  char       *arg1;

!                 arg1 = strchr(optline, '\'');
                  if (arg1 == NULL || arg1 == optline)
                      post_opts = "";
                  else
--- 501,507 ----
              {
                  char       *arg1;

!                 arg1 = strchr(optline, *SYSTEMQUOTE);
                  if (arg1 == NULL || arg1 == optline)
                      post_opts = "";
                  else
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.560
diff -c -c -r1.560 postmaster.c
*** src/backend/postmaster/postmaster.c    26 Jun 2008 01:35:45 -0000    1.560
--- src/backend/postmaster/postmaster.c    26 Jun 2008 02:41:04 -0000
***************
*** 4184,4190 ****

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " " SYSTEMQUOTE "%s" SYSTEMQUOTE, argv[i]);
      fputs("\n", fp);

      if (fclose(fp))
--- 4184,4190 ----

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " \"%s\"", argv[i]);
      fputs("\n", fp);

      if (fclose(fp))
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.100
diff -c -c -r1.100 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c    26 Jun 2008 01:35:45 -0000    1.100
--- src/bin/pg_ctl/pg_ctl.c    26 Jun 2008 02:41:04 -0000
***************
*** 573,583 ****
  {
      if (post_opts == NULL)
      {
-         char      **optlines;
-
          post_opts = "";        /* defatult */
          if (ctl_command == RESTART_COMMAND)
          {
              optlines = readfile(postopts_file);
              if (optlines == NULL)
              {
--- 573,583 ----
  {
      if (post_opts == NULL)
      {
          post_opts = "";        /* defatult */
          if (ctl_command == RESTART_COMMAND)
          {
+             char      **optlines;
+
              optlines = readfile(postopts_file);
              if (optlines == NULL)
              {
***************
*** 593,612 ****
              else
              {
                  int            len;
!                 char       *optline = NULL;
                  char       *arg1;

                  optline = optlines[0];
                  len = strcspn(optline, "\r\n");
                  optline[len] = '\0';

!                 arg1 = strchr(optline, *SYSTEMQUOTE);
!                 if (arg1 == NULL || arg1 == optline)
!                     post_opts = "";
!                 else
                  {
!                     *(arg1 - 1) = '\0'; /* this should be a space */
!                     post_opts = arg1;
                  }
                  if (postgres_path != NULL)
                      postgres_path = optline;
--- 593,618 ----
              else
              {
                  int            len;
!                 char       *optline;
                  char       *arg1;

                  optline = optlines[0];
+                 /* trim off line endings */
                  len = strcspn(optline, "\r\n");
                  optline[len] = '\0';

!                 for (arg1 = optline; *arg1; arg1++)
                  {
!                     /*
!                      * Are we at the first option, as defined by space,
!                      * double-quote, and a dash?
!                      */
!                     if (*arg1 == ' ' && *(arg1+1) == '"' && *(arg1+2) == '-')
!                     {
!                         *arg1 = '\0';    /* terminate so we get only program name */
!                         post_opts = arg1 + 1; /* point past whitespace */
!                         break;
!                     }
                  }
                  if (postgres_path != NULL)
                      postgres_path = optline;

Re: Fix pg_ctl restart bug

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> However, as of 2004-10-15, this has not worked.  :-(  The attached patch
> is the one that caused the bug --- on non-Unix systems, SYSTEMQUOTE is
> "", meaning zero-length string.  I should have seen the bug when I
> applied the contributed patch in 2004.

So, shouldn't this fix be back-patched?

            regards, tom lane

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > However, as of 2004-10-15, this has not worked.  :-(  The attached patch
> > is the one that caused the bug --- on non-Unix systems, SYSTEMQUOTE is
> > "", meaning zero-length string.  I should have seen the bug when I
> > applied the contributed patch in 2004.
>
> So, shouldn't this fix be back-patched?

Well, no one has actually complained about the breakage, and it has been
a few years.  Also there is always the risk of a new bug being
introduced, so I am unsure.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Fix pg_ctl restart bug

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > However, as of 2004-10-15, this has not worked.  :-(  The attached patch
> > > is the one that caused the bug --- on non-Unix systems, SYSTEMQUOTE is
> > > "", meaning zero-length string.  I should have seen the bug when I
> > > applied the contributed patch in 2004.
> >
> > So, shouldn't this fix be back-patched?
>
> Well, no one has actually complained about the breakage, and it has been
> a few years.  Also there is always the risk of a new bug being
> introduced, so I am unsure.

Why do we need someone to complain?  We know the bug is there.  Has the
code changed a lot in that area?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian <bruce@momjian.us> writes:
> > > > However, as of 2004-10-15, this has not worked.  :-(  The attached patch
> > > > is the one that caused the bug --- on non-Unix systems, SYSTEMQUOTE is
> > > > "", meaning zero-length string.  I should have seen the bug when I
> > > > applied the contributed patch in 2004.
> > >
> > > So, shouldn't this fix be back-patched?
> >
> > Well, no one has actually complained about the breakage, and it has been
> > a few years.  Also there is always the risk of a new bug being
> > introduced, so I am unsure.
>
> Why do we need someone to complain?  We know the bug is there.  Has the
> code changed a lot in that area?

Do we have the policy of backpatching every fix?  I thought it was only
the major bugs we fixed in back branches.  If someone wants to backpatch
it, feel free to do so.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> > > > > "", meaning zero-length string.  I should have seen the bug when I
> > > > > applied the contributed patch in 2004.
> > > >
> > > > So, shouldn't this fix be back-patched?
> > >
> > > Well, no one has actually complained about the breakage, and it has been
> > > a few years.  Also there is always the risk of a new bug being
> > > introduced, so I am unsure.
> >
> > Why do we need someone to complain?  We know the bug is there.  Has the
> > code changed a lot in that area?
>
> Do we have the policy of backpatching every fix?  I thought it was only
> the major bugs we fixed in back branches.  If someone wants to backpatch
> it, feel free to do so.

OK, I started looking at what it would take to backpatch this and found
another bug I have fixed in CVS HEAD.  What back branchs (8.0-8.3.X) are
doing is pretty odd.  On non-Win32 systems, it is looking for the null
byte, then putting a null byte before it, and passing a NULL back as the
options and binary location.  The test:

                if (postgres_path != NULL)
                    postgres_path = optline;

is backwards, which means that if in 8.3.X you start the server with any
arguments, like:

    /usr/var/local/postgres/bin/postgres -i -o -d5

and you use pg_ctl to specify the binary location:

    pg_ctl -p /u/pg/bin/postmaster restart

the server actually fails to restart because it chops off the last byte
(a bug) and the test above is wrong (another bug), and it thinks the
binary name is the full string, in quotes:

    /usr/var/local/postgres/bin/postgres -i -o -d

and you get this error from pg_ctl:

    sh: /usr/var/local/postgres/bin/postgres -i -o -d: not found

This is more than just ignoring the documentation, it is a failure.

I am attaching a minimal patch that will fix the bug in back branches.
Keep in mind that a patched pg_ctl will not be able to restart a backend
that was not patched.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.551
diff -c -c -r1.551 postmaster.c
*** src/backend/postmaster/postmaster.c    11 Jan 2008 00:54:09 -0000    1.551
--- src/backend/postmaster/postmaster.c    26 Jun 2008 18:53:42 -0000
***************
*** 4163,4169 ****

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " %s%s%s", SYSTEMQUOTE, argv[i], SYSTEMQUOTE);
      fputs("\n", fp);

      if (fclose(fp))
--- 4163,4169 ----

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " \"%s\"", argv[i]);
      fputs("\n", fp);

      if (fclose(fp))
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.92.2.3
diff -c -c -r1.92.2.3 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c    29 Feb 2008 23:31:42 -0000    1.92.2.3
--- src/bin/pg_ctl/pg_ctl.c    26 Jun 2008 18:53:43 -0000
***************
*** 613,627 ****
              {
                  char       *arg1;

!                 arg1 = strchr(optline, *SYSTEMQUOTE);
!                 if (arg1 == NULL || arg1 == optline)
!                     post_opts = "";
!                 else
                  {
!                     *(arg1 - 1) = '\0'; /* this should be a space */
!                     post_opts = arg1;
                  }
!                 if (postgres_path != NULL)
                      postgres_path = optline;
              }
              else
--- 613,628 ----
              {
                  char       *arg1;

!                 /*
!                  * Are we at the first option, as defined by space and
!                  * double-quote?
!                  */
!                 if ((arg1 = strstr(optline, " \"")) != NULL)
                  {
!                     *arg1 = '\0';    /* terminate so we get only program name */
!                     post_opts = arg1 + 1; /* point past whitespace */
                  }
!                 if (postgres_path == NULL)
                      postgres_path = optline;
              }
              else

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> I am attaching a minimal patch that will fix the bug in back branches.
> Keep in mind that a patched pg_ctl will not be able to restart a backend
> that was not patched.

I think this patch will work for unpatched backends as well.  I am still
uncertain if it should be backpatched.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.551
diff -c -c -r1.551 postmaster.c
*** src/backend/postmaster/postmaster.c    11 Jan 2008 00:54:09 -0000    1.551
--- src/backend/postmaster/postmaster.c    26 Jun 2008 19:11:37 -0000
***************
*** 4163,4169 ****

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " %s%s%s", SYSTEMQUOTE, argv[i], SYSTEMQUOTE);
      fputs("\n", fp);

      if (fclose(fp))
--- 4163,4169 ----

      fprintf(fp, "%s", fullprogname);
      for (i = 1; i < argc; i++)
!         fprintf(fp, " \"%s\"", argv[i]);
      fputs("\n", fp);

      if (fclose(fp))
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.92.2.3
diff -c -c -r1.92.2.3 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c    29 Feb 2008 23:31:42 -0000    1.92.2.3
--- src/bin/pg_ctl/pg_ctl.c    26 Jun 2008 19:11:37 -0000
***************
*** 613,627 ****
              {
                  char       *arg1;

!                 arg1 = strchr(optline, *SYSTEMQUOTE);
!                 if (arg1 == NULL || arg1 == optline)
!                     post_opts = "";
!                 else
                  {
!                     *(arg1 - 1) = '\0'; /* this should be a space */
!                     post_opts = arg1;
                  }
!                 if (postgres_path != NULL)
                      postgres_path = optline;
              }
              else
--- 613,629 ----
              {
                  char       *arg1;

!                 /*
!                  * Are we at the first option, as defined by space and
!                  * double-quote?
!                  */
!                 if ((arg1 = strstr(optline, " \"")) != NULL ||
!                     (arg1 = strstr(optline, " -")) != NULL)
                  {
!                     *arg1 = '\0';    /* terminate so we get only program name */
!                     post_opts = arg1 + 1; /* point past whitespace */
                  }
!                 if (postgres_path == NULL)
                      postgres_path = optline;
              }
              else

Re: Fix pg_ctl restart bug

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > Why do we need someone to complain?  We know the bug is there.  Has the
> > code changed a lot in that area?
>
> Do we have the policy of backpatching every fix?  I thought it was only
> the major bugs we fixed in back branches.  If someone wants to backpatch
> it, feel free to do so.

I think the policy is "we fix the bugs in supported releases".  If you
start making exceptions, it becomes needlessly complex.

I've always assumed that I'm supposed to backpatch the bugs I fix in
HEAD, however far is reasonable.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Alvaro Herrera wrote:
>
> > > Why do we need someone to complain?  We know the bug is there.  Has the
> > > code changed a lot in that area?
> >
> > Do we have the policy of backpatching every fix?  I thought it was only
> > the major bugs we fixed in back branches.  If someone wants to backpatch
> > it, feel free to do so.
>
> I think the policy is "we fix the bugs in supported releases".  If you
> start making exceptions, it becomes needlessly complex.
>
> I've always assumed that I'm supposed to backpatch the bugs I fix in
> HEAD, however far is reasonable.

I thought we only backatched major bugs to prevent possible instability
when fixing minor bugs.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Fix pg_ctl restart bug

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Alvaro Herrera wrote:
>> I've always assumed that I'm supposed to backpatch the bugs I fix in
>> HEAD, however far is reasonable.

> I thought we only backatched major bugs to prevent possible instability
> when fixing minor bugs.

Actually, Bruce, this *is* a minor bug; if it were major we'd have heard
about it from the field.

My take on it is that "pg_ctl restart" must be practically unused.
Given that we now know it's completely broken, the only way that
patching it could make the situation worse would be if the patch
affected some other code path that people actually do use.  As
long as you're sure it doesn't do that, I see no downside to an
attempted fix, even if it fails.

            regards, tom lane

Re: Fix pg_ctl restart bug

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Alvaro Herrera wrote:
> >> I've always assumed that I'm supposed to backpatch the bugs I fix in
> >> HEAD, however far is reasonable.
>
> > I thought we only backatched major bugs to prevent possible instability
> > when fixing minor bugs.
>
> Actually, Bruce, this *is* a minor bug; if it were major we'd have heard
> about it from the field.
>
> My take on it is that "pg_ctl restart" must be practically unused.
> Given that we now know it's completely broken, the only way that
> patching it could make the situation worse would be if the patch
> affected some other code path that people actually do use.  As
> long as you're sure it doesn't do that, I see no downside to an
> attempted fix, even if it fails.

OK, done; backpatched from 8.0.X to 8.3.X.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +