Re: Endless recovery - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Endless recovery
Date
Msg-id 200803051700.m25H0vu24693@momjian.us
In response to Re: Endless recovery  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-patches
Simon Riggs wrote:
> On Mon, 2008-02-11 at 09:29 +0100, Hans-Juergen Schoenig wrote:
> > Last week we saw a problem with a horribly configured
> > machine.
> > The disk filled up (bad FSM ;) ) and once this happened the sysadmin
> > killed the system (-9).
> > After two days PostgreSQL had still not started up, and they tried to
> > restart it again and again, so that the consistency check was
> > started over and over again (thus causing more and more downtime).
> > From the admin's point of view there was no way to find out whether the
> > machine was actually dead or still recovering.
>
> I'm sorry to hear about this problem.
>
> Not sure we need a LOG message to warn people about the possible length
> of recovery time. The chances of a recovery taking that much time seem
> very low for normal Postgres, even with checkpoint parameters set at
> their maximum values.
>
> I note that the configuration section does not mention the likely
> increase in recovery time that will result from setting those parameters
> higher. That needs a patch. ISTM a serious omission that should be
> treated as a bug and backpatched.

Patch attached and applied, and backpatched to 8.3.X.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.166
diff -c -c -r1.166 config.sgml
*** doc/src/sgml/config.sgml    16 Feb 2008 21:14:08 -0000    1.166
--- doc/src/sgml/config.sgml    5 Mar 2008 16:57:47 -0000
***************
*** 1584,1592 ****
        </indexterm>
        <listitem>
         <para>
!         Maximum distance between automatic WAL checkpoints, in log
!         file segments (each segment is normally 16 megabytes). The
!         default is three segments.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
         </para>
--- 1584,1593 ----
        </indexterm>
        <listitem>
         <para>
!         Maximum number of log file segments between automatic WAL
!         checkpoints (each segment is normally 16 megabytes). The default
!         is three segments.  Increasing this parameter can increase the
!         amount of time needed for crash recovery.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
         </para>
***************
*** 1602,1607 ****
--- 1603,1610 ----
         <para>
          Maximum time between automatic WAL checkpoints, in
          seconds. The default is five minutes (<literal>5min</>).
+         Increasing this parameter can increase the amount of time needed
+         for crash recovery.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
         </para>
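
As a rough illustration of why the new wording matters, the sketch below
estimates how much WAL can accumulate between two segment-triggered
checkpoints, using only the 16 MB segment size and the default of three
segments quoted in the patch. The function name is hypothetical, and the
result is only a ballpark for the WAL that crash recovery may have to
replay; real replay time also depends on I/O speed and on checkpoints
triggered by the timeout instead.

```python
# Back-of-the-envelope estimate, not a PostgreSQL API.
SEGMENT_MB = 16  # normal WAL segment size, as stated in the patched docs


def wal_mb_between_checkpoints(checkpoint_segments, segment_mb=SEGMENT_MB):
    """Approximate WAL volume (MB) written between two automatic
    checkpoints that are triggered by the segment limit."""
    if checkpoint_segments < 1:
        raise ValueError("checkpoint_segments must be at least 1")
    return checkpoint_segments * segment_mb


if __name__ == "__main__":
    # Default of three segments versus a much larger, assumed setting.
    for segments in (3, 64):
        print(segments, "segments ->", wal_mb_between_checkpoints(segments), "MB")
```

With the default of three segments this comes to about 48 MB; raising the
setting scales the figure linearly, which is the recovery-time cost the
patch now documents.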
