Re: Shutting down a warm standby database in 8.2beta3 - Mailing list pgsql-general

From Stephen Harris
Subject Re: Shutting down a warm standby database in 8.2beta3
Date
Msg-id 20061118022205.GA5465@pugwash.spuddy.org
In response to Re: Shutting down a warm standby database in 8.2beta3  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Shutting down a warm standby database in 8.2beta3  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-general
On Fri, Nov 17, 2006 at 05:03:44PM -0500, Tom Lane wrote:
> Stephen Harris <lists@spuddy.org> writes:
> > Doing a shutdown "immediate" isn't too clever because it actually leaves
> > the recovery threads running
>
> > LOG:  restored log file "00000001000000010000003E" from archive
> > LOG:  received immediate shutdown request
> > LOG:  restored log file "00000001000000010000003F" from archive
>
> Hm, that should work --- AFAICS the startup process should abort on
> SIGQUIT the same as any regular backend.
>
> [ thinks... ]  Ah-hah, "man system(3)" tells the tale:
>
>      system() ignores the SIGINT and SIGQUIT signals, and blocks the
>      SIGCHLD signal, while waiting for the command to terminate.  If this
>      might cause the application to miss a signal that would have killed
>      it, the application should examine the return value from system() and
>      take whatever action is appropriate to the application if the command
>      terminated due to receipt of a signal.
>
> So the SIGQUIT went to the recovery script command and was missed by the
> startup process.  It looks to me like your script actually ignored the
> signal, which you'll need to fix, but it also looks like we are not

My script was just a ksh script and didn't do anything special with signals.
Essentially it does:
  #!/bin/ksh -p

  [...variable setup...]
  # Poll until the wanted log file shows up.
  while [ ! -f "$wanted_file" ]
  do
    # An abort file tells the script to give up.
    if [ -f "$abort_file" ]
    then
      exit 1
    fi
    sleep 5
  done
  cat "$wanted_file"

I know signals can be deferred in scripts (if a trap handler has been set
for a signal that arrives during the sleep, the handler won't run until the
sleep finishes) but they _do_ get delivered.

However, it seems the signal wasn't sent at all.  Once the wanted file
appeared, the recovery thread from the postmaster started a _new_ script for
the next log.  I'll rewrite the script in Perl (probably Monday, when
I'm back in the office) and stick in lots of signal traps to see if
anything does get sent to the script.
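
A minimal sketch of that kind of instrumentation, written in POSIX shell
rather than Perl so it stays close to the script above. The names
wait_for_wal, log_signal, install_traps and SIGNAL_LOG, and the log path, are
hypothetical and not taken from the original script. Each trapped signal is
appended to a log file before the script exits with the conventional 128+N
status.

```shell
#!/bin/sh
# Sketch only: all function names and SIGNAL_LOG are hypothetical.

SIGNAL_LOG=${SIGNAL_LOG:-/tmp/restore_signals.log}

# Record which signal arrived, then exit with the conventional 128+N status.
log_signal() {
  echo "$(date) pid $$ received SIG$1" >> "$SIGNAL_LOG"
  exit "$2"
}

# Install a logging handler for each signal of interest.
install_traps() {
  trap 'log_signal INT 130' INT
  trap 'log_signal QUIT 131' QUIT
  trap 'log_signal TERM 143' TERM
}

# wait_for_wal WANTED_FILE ABORT_FILE [POLL_SECONDS]
# Same polling loop as the original script, with quoted variables.
wait_for_wal() {
  wanted_file=$1
  abort_file=$2
  poll=${3:-5}
  while [ ! -f "$wanted_file" ]
  do
    if [ -f "$abort_file" ]
    then
      return 1
    fi
    sleep "$poll"
  done
  cat "$wanted_file"
}

# Run only when both file arguments are supplied.
if [ "$#" -ge 2 ]
then
  install_traps
  wait_for_wal "$@"
fi
```

Because the shell runs a trap handler only after the foreground sleep
returns, a log entry can appear up to one polling interval after the signal
was sent. An empty log after a shutdown request would suggest the signal
never reached the script.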

> As the code stands, if the recovery script is killed by a signal, we'd
> take that as normal termination of the recovery and proceed to come up,
> which is definitely the Wrong Thing.

Oh good; that means I'm not mad :-)
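
The distinction Tom describes, between a command that exited normally and one
that was killed by a signal, can be illustrated at the shell level. This is
only a sketch: classify_status is a hypothetical helper, and it assumes the
common sh/bash convention of reporting a signal death as 128 plus the signal
number (other shells may encode it differently). It is not what the
PostgreSQL code does; in C the equivalent check would be WIFSIGNALED() on the
value returned by system().

```shell
#!/bin/sh
# classify_status STATUS
# Hypothetical helper: interpret an exit status under the 128+N convention.
classify_status() {
  status=$1
  if [ "$status" -eq 0 ]
  then
    echo "success"
  elif [ "$status" -gt 128 ]
  then
    # Assumption: statuses above 128 mean "terminated by signal N".
    echo "killed by signal $((status - 128))"
  else
    echo "failed with exit code $status"
  fi
}
```

For example, running `sh -c 'kill -TERM $$'` and then `classify_status $?`
reports a death by signal 15 rather than an ordinary failure. Note that a
command which deliberately exits with a status above 128 looks the same from
the shell, which is one reason the C-level check is more reliable.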

--

rgds
Stephen
