Re: Postgres Hot Standby. How or when does the recovery db move recovery.conf to recovery.done? - Mailing list pgsql-general

From Dhaval Shah
Subject Re: Postgres Hot Standby. How or when does the recovery db move recovery.conf to recovery.done?
Msg-id 565237760703210842g529cb62ax5b92678edaf4e18d@mail.gmail.com
In response to Re: Postgres Hot Standby. How or when does the recovery db move recovery.conf to recovery.done?  ("Merlin Moncure" <mmoncure@gmail.com>)
Responses Re: Postgres Hot Standby. How or when does the recovery db move recovery.conf to recovery.done?
List pgsql-general
I looked at the pg_standby utility and would have liked to use it,
but some customer-driven external constraints rule it out.

What I am looking at is this:

1. I can detect that the primary has gone down and have my restore
script return a non-zero exit code so that the standby comes out of
recovery.

2. Since I can detect that I am out of standby mode, I can shut down
Postgres, manually move recovery.conf to recovery.done, and then
restart the db.
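
For illustration, the detection in step 1 could be sketched as a
restore_command wrapper along the following lines. This is a minimal
sketch and not the script used in this thread: the function name
restore_segment, the trigger-file path, and the polling interval are
all hypothetical. It assumes the archive directory is readable from the
standby and that segments only appear there once they are complete.
Answering requests for missing .history files immediately is also an
assumption, not something stated in this thread.

```python
import os
import shutil
import sys
import time


def restore_segment(archive_dir, wal_name, dest_path, trigger_path,
                    poll_seconds=5.0, sleep=time.sleep):
    """Copy wal_name from archive_dir to dest_path and return 0.

    Return 1 instead when the segment is missing and the (hypothetical)
    trigger file exists, which tells the standby that no more WAL is
    coming and that it should finish recovery.
    """
    source = os.path.join(archive_dir, wal_name)
    while True:
        if os.path.exists(source):
            shutil.copy(source, dest_path)
            return 0
        # Assumption: a missing timeline history file is reported at once
        # instead of being waited for.
        if wal_name.endswith(".history"):
            return 1
        # The trigger file is how "the primary has gone down" is signalled
        # in this sketch; any other detection method could be used here.
        if os.path.exists(trigger_path):
            return 1
        sleep(poll_seconds)


if __name__ == "__main__" and len(sys.argv) == 5:
    # Arguments: archive directory, %f, %p, trigger file.
    sys.exit(restore_segment(sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4]))
```

In recovery.conf this might be wired up as, for example,
restore_command = 'python /path/to/restore_wrapper.py /archive %f %p /tmp/failover.trigger'
where the script name and both paths are placeholders.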

Even if I do step 2, I still get the following in the server log:

=====
Main: Triggering Recovery!!!  <- my script is returning a non-zero code here ...

PANIC:  could not open file "pg_xlog/00000001000000000000001B" (log
file 0, segment 27): No such file or directory
LOG:  startup process (PID 32167) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  database system was interrupted while in recovery at log time
2007-03-20 13:04:28 PDT
HINT:  If this has occurred more than once some data may be corrupted
and you may need to choose an earlier recovery target.
LOG:  could not open file "pg_xlog/000000010000000000000006" (log file
0, segment 6): No such file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file "pg_xlog/000000010000000000000005" (log file
0, segment 5): No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 4676) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  database system was interrupted while in recovery at log time
2007-03-20 13:04:28 PDT
====

The question I have is: how do I get out of the above state and make
sure that the db is up and ready? What do I need to clear? A previous
cache or something? Am I missing something here? I went to the docs,
and they say the following:

"Start the postmaster. The postmaster will go into recovery mode and
proceed to read through the archived WAL files it needs. Upon
completion of the recovery process, the postmaster will rename
recovery.conf to recovery.done (to prevent accidentally re-entering
recovery mode in case of a crash later) and then commence normal
database operations."

However, I do not see recovery.conf being renamed to recovery.done
automatically.
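
Going by the documentation passage quoted above, whether the rename has
happened can be checked from the file names in the data directory. The
following is a minimal sketch: recovery_state and its return labels are
hypothetical, and the presence of recovery.conf only shows that the
rename has not happened yet, not that recovery is actually running.

```python
import os


def recovery_state(data_dir):
    """Classify a data directory by which recovery file it contains."""
    if os.path.exists(os.path.join(data_dir, "recovery.conf")):
        # Not yet renamed: recovery has not completed (or never started).
        return "pending"
    if os.path.exists(os.path.join(data_dir, "recovery.done")):
        # Renamed, either by the postmaster or by hand.
        return "finished"
    return "none"
```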

Dhaval


On 3/21/07, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 3/21/07, Dhaval Shah <dhaval.shah.m@gmail.com> wrote:
> > Resending.
> >
> > I have a "hot" standby. Now, if the primary fails,
> > how do I tell the secondary to come out of recovery mode, move
> > recovery.conf to recovery.done, and start the db? I mean, what
> > error code shall I return?
>
> did you look at the pg_standby utility? it has a kill file mechanism
> that automates this for you.
>
> merlin
>


--
Dhaval Shah
