Thread: Immediate shutdown during recovery
Hi,

The immediate shutdown (pg_ctl -m i stop) might not be able to kill the
startup process during archive recovery. This is because the startup
process calls system() to execute restore_command, and system() ignores
SIGQUIT while the command runs. So only the startup process might
survive the immediate shutdown and continue redo up to the end.
Is this desirable behavior? This sounds odd to me.

To prevent the startup process from surviving, I think it should
periodically check whether the postmaster is still alive. The archiver
process, which also calls system() to execute archive_command, already
does this.

What is your opinion?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
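
[Editor's illustration, not part of the original message.] The liveness check proposed above could look roughly like the sketch below. It is not PostgreSQL source: parent_is_alive() is a hypothetical helper, and the sketch assumes a Unix system where an orphaned child is re-parented, so getppid() stops matching the parent PID recorded at startup.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * report whether the process that started us is still our parent.
 * expected_parent is the parent PID the child recorded when it started.
 */
static bool
parent_is_alive(pid_t expected_parent)
{
    /* Once the parent exits, the kernel re-parents us and this differs. */
    return getppid() == expected_parent;
}
```

A long-running child would call such a check between attempts to run the external command and exit as soon as it returns false, instead of relying on a signal that system() may have ignored.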
Hi,

On Fri, Nov 28, 2008 at 6:56 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hi,
>
> The immediate shutdown (pg_ctl -m i stop) might not be able to kill the
> startup process during archive recovery. This is because the startup
> process calls system() to execute restore_command, and system() ignores
> SIGQUIT while the command runs. So only the startup process might
> survive the immediate shutdown and continue redo up to the end.
> Is this desirable behavior? This sounds odd to me.

In RestoreArchivedFile(), the following code acts as a safeguard
against restore_command being terminated by a signal. But the
safeguard might not work if restore_command installs its own signal
handler for SIGQUIT, as pg_standby does.

> signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;
>
> ereport(signaled ? FATAL : DEBUG2,
>         (errmsg("could not restore file \"%s\" from archive: return code %d",
>                 xlogfname, rc)));

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
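
[Editor's illustration, not part of the original message.] The weakness of the safeguard can be seen in a small sketch that applies the same test to a status returned by system(). command_was_signaled() is a hypothetical wrapper around the quoted expression, and it assumes rc is a real wait status (system() did not return -1). A command that dies from a signal is detected, but a command that catches the signal and exits with an ordinary code looks like a plain failure.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical wrapper around the test quoted above. rc is assumed to be
 * a wait status returned by system(), not -1.
 * Exit codes above 125 are treated as signal-like because shells report
 * "command not found" and "killed by signal N" (128 + N) in that range.
 */
static bool
command_was_signaled(int rc)
{
    return WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;
}
```

For example, the status of system("kill -TERM $$") counts as signaled, while system("trap 'exit 2' TERM; kill -TERM $$") does not, because the shell handles the signal itself and exits with code 2. SIGTERM is used here only to keep the example free of core dumps; the same reasoning applies to SIGQUIT.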
On Fri, 2008-11-28 at 19:53 +0900, Fujii Masao wrote:
> Hi,
>
> On Fri, Nov 28, 2008 at 6:56 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > Hi,
> >
> > The immediate shutdown (pg_ctl -m i stop) might not be able to kill the
> > startup process during archive recovery. This is because the startup
> > process calls system() to execute restore_command, and system() ignores
> > SIGQUIT while the command runs. So only the startup process might
> > survive the immediate shutdown and continue redo up to the end.
> > Is this desirable behavior? This sounds odd to me.
>
> In RestoreArchivedFile(), the following code acts as a safeguard
> against restore_command being terminated by a signal. But the
> safeguard might not work if restore_command installs its own signal
> handler for SIGQUIT, as pg_standby does.
>
> > signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;
> >
> > ereport(signaled ? FATAL : DEBUG2,
> >         (errmsg("could not restore file \"%s\" from archive: return code %d",
> >                 xlogfname, rc)));

Agree there is an existing problem.

Suggest we fix it after the main patches are committed.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hello,

On Sat, Nov 29, 2008 at 12:40 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Fri, 2008-11-28 at 19:53 +0900, Fujii Masao wrote:
>> Hi,
>>
>> On Fri, Nov 28, 2008 at 6:56 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> > Hi,
>> >
>> > The immediate shutdown (pg_ctl -m i stop) might not be able to kill the
>> > startup process during archive recovery. This is because the startup
>> > process calls system() to execute restore_command, and system() ignores
>> > SIGQUIT while the command runs. So only the startup process might
>> > survive the immediate shutdown and continue redo up to the end.
>> > Is this desirable behavior? This sounds odd to me.
>>
>> In RestoreArchivedFile(), the following code acts as a safeguard
>> against restore_command being terminated by a signal. But the
>> safeguard might not work if restore_command installs its own signal
>> handler for SIGQUIT, as pg_standby does.
>>
>> > signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;
>> >
>> > ereport(signaled ? FATAL : DEBUG2,
>> >         (errmsg("could not restore file \"%s\" from archive: return code %d",
>> >                 xlogfname, rc)));
>
> Agree there is an existing problem.
>
> Suggest we fix it after the main patches are committed.

OK, thanks.

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center