Thread: hba load error and silent mode

hba load error and silent mode

From
Andrew Dunstan
Date:
Last night I had to deal with a puzzled user of version 8.4  who found 
postgres refused to start but didn't log any error.  It turned out that 
there was an error in the pg_hba.conf file, and the client was running 
in silent mode (the SUSE default).

This seems like a bug, and it's certainly not very postgres-like behaviour.

Can we move the call to hba_load() in postmaster.c down a bit so it 
occurs after the SysLogger is set up? ISTM we really need an absolute 
minimum of code between the call to pmdaemonize() and SysLogger_Start().

(Maybe there's a good case for deprecating silent mode. I'm not sure why 
Suse uses it. Other distros don't seem to feel the need.)

cheers

andrew


Re: hba load error and silent mode

From
Magnus Hagander
Date:
On Mon, Aug 24, 2009 at 14:39, Andrew Dunstan<andrew@dunslane.net> wrote:
>
> Last night I had to deal with a puzzled user of version 8.4  who found
> postgres refused to start but didn't log any error.  It turned out that
> there was an error in the pg_hba.conf file, and the client was running in
> silent mode (the SUSE default).
>
> This seems like a bug, and it's certainly not very postgres-like behaviour.
>
> Can we move the call to hba_load() in postmaster.c down a bit so it occurs
> after the SysLogger is set up? ISTM we really need an absolute minimum of
> code between the call to pmdaemonize() and SysLogger_Start().

I can see other reasons that this would be good, so +1 from me unless
there is any specific reason we can't start it earlier.


> (Maybe there's a good case for deprecating silent mode. I'm not sure why
> Suse uses it. Other distros don't seem to feel the need.)

Could be, but even with silent_mode=off that would be a problem, no?
as in, the log wouldn't go where you'd expect it to go.


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: hba load error and silent mode

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
>> (Maybe there's a good case for deprecating silent mode. I'm not sure why
>> Suse uses it. Other distros don't seem to feel the need.)
>>     
>
> Could be, but even with silent_mode=off that would be a problem, no?
> as in, the log wouldn't go where you'd expect it to go.
>
>   

It would go to stderr, and then I would have seen it. Silent mode 
apparently sends stderr to the bit bucket until the syslogger is set up. 
I found where the error was by running postmaster under strace, but 
that's really rather icky.

cheers

andrew


Re: hba load error and silent mode

From
Joshua Tolley
Date:
On Mon, Aug 24, 2009 at 02:57:02PM +0200, Magnus Hagander wrote:
> On Mon, Aug 24, 2009 at 14:39, Andrew Dunstan<andrew@dunslane.net> wrote:
> >
> > Last night I had to deal with a puzzled user of version 8.4  who found
> > postgres refused to start but didn't log any error.  It turned out that
> > there was an error in the pg_hba.conf file, and the client was running in
> > silent mode (the SUSE default).
> >
> > This seems like a bug, and it's certainly not very postgres-like behaviour.
> >
> > Can we move the call to hba_load() in postmaster.c down a bit so it occurs
> > after the SysLogger is set up? ISTM we really need an absolute minimum of
> > code between the call to pmdaemonize() and SysLogger_Start().
>
> I can see other reasons that this would be good, so +1 from me unless
> there is any specific reason we can't start it earlier.

+1 from me, too, under the same condition.  Since logging in really
interesting ways depends on a separate process, any logging before then will
be "abnormal", and any logs we create will probably show up in a relatively
unexpected place. The Principle of Least Surprise suggests we minimize that
possibility.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Re: hba load error and silent mode

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> (Maybe there's a good case for deprecating silent mode.

+1.  The only reason to use it is that an init-script writer is too
lazy to deal with things properly --- the thing in question here being
exactly to think of a place for early failure messages to go.

You can *not* just move the syslogger start call up; it's dependent
on having done some of the other initialization steps.  (chdir and
signal setup being obvious candidates.)  In general, there will always
be messages that come out before the syslogger can start, and thus a
robust setup has got to provide some fallback place for them.

It might be that a reasonable solution on our end would be for
pmdaemonize to point stdout/stderr someplace other than /dev/null,
perhaps "$PGDATA/postmaster.log"?  Of course, it's not clear what
we're supposed to do if that open() fails ...
        regards, tom lane


Re: hba load error and silent mode

From
Andrew Dunstan
Date:

Tom Lane wrote:
> It might be that a reasonable solution on our end would be for
> pmdaemonize to point stdout/stderr someplace other than /dev/null,
> perhaps "$PGDATA/postmaster.log"?  Of course, it's not clear what
> we're supposed to do if that open() fails ...
>
>         
>   

Well, yes, but that is at least a comparatively low risk, certainly much 
much lower than the risk having a faulty hba file, for example.

This sounds like a reasonable approach.

cheers

andrew


Re: hba load error and silent mode

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> It might be that a reasonable solution on our end would be for
>> pmdaemonize to point stdout/stderr someplace other than /dev/null,
>> perhaps "$PGDATA/postmaster.log"?  Of course, it's not clear what
>> we're supposed to do if that open() fails ...

> Well, yes, but that is at least a comparatively low risk, certainly much 
> much lower than the risk having a faulty hba file, for example.

> This sounds like a reasonable approach.

Actually, if people are happy with that basic behavior, I think we could
make it robust: open /dev/null and $PGDATA/postmaster.log *before* we
fork away from the terminal session.  On failure, report that and exit(1).
On success, go ahead and fork.  Failure of the subsequent dup2() calls
should be just about impossible --- in fact, so far as I can tell from
the SUS documents, if we put in a loop for EINTR then there is no
documented way for them to fail.

If no objections, I'll be happy to make this happen.
        regards, tom lane


Re: hba load error and silent mode

From
Magnus Hagander
Date:
On Mon, Aug 24, 2009 at 16:31, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> (Maybe there's a good case for deprecating silent mode.
>
> +1.  The only reason to use it is that an init-script writer is too
> lazy to deal with things properly --- the thing in question here being
> exactly to think of a place for early failure messages to go.
>
> You can *not* just move the syslogger start call up; it's dependent
> on having done some of the other initialization steps.  (chdir and
> signal setup being obvious candidates.)  In general, there will always
> be messages that come out before the syslogger can start, and thus a
> robust setup has got to provide some fallback place for them.

Agreed.

I don't see why we couldn't move the hba call specifically, though.
That's a fairly common error, so it would be good if the output went
to the place that is actually configured in postgresql.conf. It's at
least a lot more likely than most other things that are prior to
syslogger startup.

-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: hba load error and silent mode

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> I don't see why we couldn't move the hba call specifically, though.
> That's a fairly common error, so it would be good if the output went
> to the place that is actually configured in postgresql.conf. It's at
> least a lot more likely than most other things that are prior to
> syslogger startup.

Oh, you mean move load_hba *down*, past the syslogger startup?
Yeah, that would probably be all right.
        regards, tom lane


Re: hba load error and silent mode

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>   
>> I don't see why we couldn't move the hba call specifically, though.
>> That's a fairly common error, so it would be good if the output went
>> to the place that is actually configured in postgresql.conf. It's at
>> least a lot more likely than most other things that are prior to
>> syslogger startup.
>>     
>
> Oh, you mean move load_hba *down*, past the syslogger startup?
> Yeah, that would probably be all right.
>
>   

Well, that's what I originally said, yes ;-)

But I don't think that precludes your more general suggestion regarding 
startup errors. In particular, I think moving the hba load down would be 
reasonable to backpatch to 8.4, whereas I doubt the general fix would.

cheers

andrew


Re: hba load error and silent mode

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Oh, you mean move load_hba *down*, past the syslogger startup?
>> Yeah, that would probably be all right.

> Well, that's what I originally said, yes ;-)

> But I don't think that precludes your more general suggestion regarding 
> startup errors. In particular, I think moving the hba load down would be 
> reasonable to backpatch to 8.4, whereas I doubt the general fix would.

Well, the change I had in mind is only a few lines of code, and is
fixing a behavior that you yourself are arguing is unusably broken.
It seems like a reasonable back-patch candidate to me if we think this
is a serious bug.  But I personally wasn't seeing any of this as due for
back-patching.  The -S behavior has been like it is since forever, and
nobody's complained before.
        regards, tom lane


Re: hba load error and silent mode

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Tom Lane wrote:
>>     
>>> Oh, you mean move load_hba *down*, past the syslogger startup?
>>> Yeah, that would probably be all right.
>>>       
>
>   
>> Well, that's what I originally said, yes ;-)
>>     
>
>   
>> But I don't think that precludes your more general suggestion regarding 
>> startup errors. In particular, I think moving the hba load down would be 
>> reasonable to backpatch to 8.4, whereas I doubt the general fix would.
>>     
>
> Well, the change I had in mind is only a few lines of code, and is
> fixing a behavior that you yourself are arguing is unusably broken.
> It seems like a reasonable back-patch candidate to me if we think this
> is a serious bug.  But I personally wasn't seeing any of this as due for
> back-patching.  The -S behavior has been like it is since forever, and
> nobody's complained before.
>
>             
>   

We didn't check HBA validity at startup time before, did we? I would not 
be surprised to get more complaints now.

cheers

andrew


Re: hba load error and silent mode

From
Magnus Hagander
Date:
On Mon, Aug 24, 2009 at 20:51, Andrew Dunstan<andrew@dunslane.net> wrote:
>
>
> Tom Lane wrote:
>>
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>>
>>> Tom Lane wrote:
>>>
>>>>
>>>> Oh, you mean move load_hba *down*, past the syslogger startup?
>>>> Yeah, that would probably be all right.
>>>>
>>
>>
>>>
>>> Well, that's what I originally said, yes ;-)
>>>
>>
>>
>>>
>>> But I don't think that precludes your more general suggestion regarding
>>> startup errors. In particular, I think moving the hba load down would be
>>> reasonable to backpatch to 8.4, whereas I doubt the general fix would.
>>>
>>
>> Well, the change I had in mind is only a few lines of code, and is
>> fixing a behavior that you yourself are arguing is unusably broken.
>> It seems like a reasonable back-patch candidate to me if we think this
>> is a serious bug.  But I personally wasn't seeing any of this as due for
>> back-patching.  The -S behavior has been like it is since forever, and
>> nobody's complained before.
>>
>>
>>
>
> We didn't check HBA validity at startup time before, did we? I would not be
> surprised to get more complaints now.

We checked some of it, but we check it a whole lot more now.

+1 for backpatching at least the move of the load_hba call.

-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: hba load error and silent mode

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Mon, Aug 24, 2009 at 20:51, Andrew Dunstan<andrew@dunslane.net> wrote:
>> We didn't check HBA validity at startup time before, did we? I would not be
>> surprised to get more complaints now.

Good point.

> We checked some of it, but we check it a whole lot more now.
> +1 for backpatching at least the move of the load_hba call.

OK, in that case I think it's reasonable to backpatch into 8.4.
Attached is my work-in-progress changes to pmdaemonize --- I think
the code is good now, but I need to go say something in the docs.
I haven't moved the load_hba call yet.

            regards, tom lane

Index: postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.589
diff -c -r1.589 postmaster.c
*** postmaster.c    24 Aug 2009 18:09:37 -0000    1.589
--- postmaster.c    24 Aug 2009 19:22:28 -0000
***************
*** 191,197 ****

  /* still more option variables */
  bool        EnableSSL = false;
! bool        SilentMode = false; /* silent mode (-S) */

  int            PreAuthDelay = 0;
  int            AuthenticationTimeout = 60;
--- 191,197 ----

  /* still more option variables */
  bool        EnableSSL = false;
! bool        SilentMode = false; /* silent_mode */

  int            PreAuthDelay = 0;
  int            AuthenticationTimeout = 60;
***************
*** 744,750 ****
      }

      /*
!      * Fork away from controlling terminal, if -S specified.
       *
       * Must do this before we grab any interlock files, else the interlocks
       * will show the wrong PID.
--- 744,750 ----
      }

      /*
!      * Fork away from controlling terminal, if silent_mode specified.
       *
       * Must do this before we grab any interlock files, else the interlocks
       * will show the wrong PID.
***************
*** 1204,1218 ****


  /*
!  * Fork away from the controlling terminal (-S option)
   */
  static void
  pmdaemonize(void)
  {
  #ifndef WIN32
!     int            i;
      pid_t        pid;

      pid = fork_process();
      if (pid == (pid_t) -1)
      {
--- 1204,1249 ----


  /*
!  * Fork away from the controlling terminal (silent_mode option)
!  *
!  * Since this requires disconnecting from stdin/stdout/stderr (in case they're
!  * linked to the terminal), we re-point stdin to /dev/null and stdout/stderr
!  * to "postmaster.log" in the data directory, where we're already chdir'd.
   */
  static void
  pmdaemonize(void)
  {
  #ifndef WIN32
!     const char *pmlogname = "postmaster.log";
!     int            dvnull;
!     int            pmlog;
      pid_t        pid;
+     int            res;

+     /*
+      * Make sure we can open the files we're going to redirect to.  If this
+      * fails, we want to complain before disconnecting.  Mention the full path
+      * of the logfile in the error message, even though we address it by
+      * relative path.
+      */
+     dvnull = open(DEVNULL, O_RDONLY, 0);
+     if (dvnull < 0)
+     {
+         write_stderr("%s: could not open file \"%s\": %s\n",
+                      progname, DEVNULL, strerror(errno));
+         ExitPostmaster(1);
+     }
+     pmlog = open(pmlogname, O_CREAT | O_WRONLY | O_APPEND, 0600);
+     if (pmlog < 0)
+     {
+         write_stderr("%s: could not open log file \"%s/%s\": %s\n",
+                      progname, DataDir, pmlogname, strerror(errno));
+         ExitPostmaster(1);
+     }
+
+     /*
+      * Okay to fork.
+      */
      pid = fork_process();
      if (pid == (pid_t) -1)
      {
***************
*** 1231,1238 ****
      MyStartTime = time(NULL);

      /*
!      * GH: If there's no setsid(), we hopefully don't need silent mode. Until
!      * there's a better solution.
       */
  #ifdef HAVE_SETSID
      if (setsid() < 0)
--- 1262,1269 ----
      MyStartTime = time(NULL);

      /*
!      * Some systems use setsid() to dissociate from the TTY's process group,
!      * while on others it depends on stdin/stdout/stderr.  Do both if possible.
       */
  #ifdef HAVE_SETSID
      if (setsid() < 0)
***************
*** 1242,1255 ****
          ExitPostmaster(1);
      }
  #endif
!     i = open(DEVNULL, O_RDWR, 0);
!     dup2(i, 0);
!     dup2(i, 1);
!     dup2(i, 2);
!     close(i);
  #else                            /* WIN32 */
      /* not supported */
!     elog(FATAL, "SilentMode not supported under WIN32");
  #endif   /* WIN32 */
  }

--- 1273,1298 ----
          ExitPostmaster(1);
      }
  #endif
!
!     /*
!      * Reassociate stdin/stdout/stderr.  fork_process() cleared any pending
!      * output, so this should be safe.  The only plausible error is EINTR,
!      * which just means we should retry.
!      */
!     do {
!         res = dup2(dvnull, 0);
!     } while (res < 0 && errno == EINTR);
!     close(dvnull);
!     do {
!         res = dup2(pmlog, 1);
!     } while (res < 0 && errno == EINTR);
!     do {
!         res = dup2(pmlog, 2);
!     } while (res < 0 && errno == EINTR);
!     close(pmlog);
  #else                            /* WIN32 */
      /* not supported */
!     elog(FATAL, "silent_mode is not supported under Windows");
  #endif   /* WIN32 */
  }