Re: odd output in restore mode - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: odd output in restore mode
Msg-id 1210630440.29684.249.camel@ebony.site
In response to odd output in restore mode  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: odd output in restore mode  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Mon, 2008-05-12 at 16:57 -0400, Andrew Dunstan wrote:

> I have just been working on setting up a continuous recovery failover 
> system, and noticed some odd log lines, shown below. (Using 8.3).

Hmmm, well, the first time you use something complex, there are some
surprising features, I guess. Most especially the log lines are there to
allow production issues to be diagnosed, not to create a beautiful log.

Many of the things that look somewhat strange are there for a reason,
since a wide range of options and save-your-customers-ass scenarios are
covered by the recovery code.

Suggestions for improvement are always welcome and you are welcome to
suggest doc changes, as many people do. 

> First note that our parsing of recovery.conf in xlog.c is pretty bad, 
> and at least we need to document the quirks if it's not going to be 
> fixed. log_restartpoints is said to be boolean, but when I set it to an 
> unquoted true I got a fatal error, while a quoted 'on' sets it to false, 
> as seen. Ick. 

Yes, some improvements are definitely due there.
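
As a rough illustration of the behaviour one would expect for a boolean setting such as log_restartpoints, here is a minimal Python sketch. It is not the xlog.c parser; the helper name parse_recovery_bool and the set of accepted spellings are assumptions made for the example.

```python
def parse_recovery_bool(raw: str) -> bool:
    """Parse a boolean setting value, quoted or unquoted (hypothetical helper)."""
    value = raw.strip()
    # Strip one pair of matching single quotes, if present.
    if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        value = value[1:-1]
    value = value.strip().lower()
    # Accepted spellings are an assumption for this sketch.
    if value in ("true", "on", "yes", "1"):
        return True
    if value in ("false", "off", "no", "0"):
        return False
    # Reject anything else loudly instead of silently treating it as false.
    raise ValueError(f"invalid boolean value: {raw!r}")
```

With this approach an unquoted true and a quoted 'on' both yield True, and an unrecognised value raises an error rather than quietly becoming false.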

> What is more, I apparently managed to get the recovery 
> server to lose a WAL file and hang totally by having a bad 
> recovery.conf. Triple ick.

Sounds like a bug you should report in the normal way. Correctness is
paramount. Or are you confusing the message in the log for file AA with
an error? 
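
For reference, the file name in that log message can be decoded with a few lines of Python. This is only a sketch, assuming the 24-character naming seen in the log (8 hex digits each for timeline, log file, and segment); decode_wal_name is a hypothetical helper, not a PostgreSQL function.

```python
def decode_wal_name(name: str):
    """Split a 24-hex-digit WAL file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment file name: {name!r}")
    timeline = int(name[0:8], 16)   # first 8 hex digits
    log = int(name[8:16], 16)       # middle 8 hex digits
    segment = int(name[16:24], 16)  # last 8 hex digits
    return timeline, log, segment


print(decode_wal_name("0000000100000000000000AA"))  # prints (1, 0, 170)
```

The result matches the "log file 0, segment 170" text in the message for file AA.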

> Second, what is all this about .history files? My understanding is that 
> they are not necessary, so surely we should try to stat them to see if 
> they are present before trying to copy them. These lines are going to 
> confuse a lot of people, I suspect (including me).

I try to keep it as simple as possible, since much of this code only
gets run when you really need it to work. The request for the .history
file and the cp are examples of that.
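
To make the suggestion concrete, a restore helper that checks for the file before copying might look like the following Python sketch. It is not pg_standby or the server's code; restore_file and its arguments are hypothetical, and the only behaviour taken from the documentation is that a restore command must return nonzero when the requested file is not in the archive.

```python
import os
import shutil


def restore_file(archive_dir: str, file_name: str, dest_path: str) -> int:
    """Copy file_name from archive_dir to dest_path; return a shell-style exit code."""
    src = os.path.join(archive_dir, file_name)
    if not os.path.isfile(src):
        # Missing file (for example an absent .history file):
        # report failure without printing a "cannot stat" error.
        return 1
    shutil.copy(src, dest_path)
    return 0
```

The trade-off is the one described above: the check adds a code path to something that only runs during recovery, in exchange for a quieter log.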

> Lastly, not quite related to this output, but in the same general area, 
> should we have an option on pg_standby to allow removing the archive 
> file after it has been restored?

There already is one (%r), but it's more complex than that.
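
As a sketch of why it is more complex: %r is replaced by the name of the file containing the last valid restart point, so only archive files that sort before that name are safe to drop. The Python below illustrates the selection only; it is not pg_standby's implementation, removable_segments is a hypothetical helper, and plain name comparison assumes a single timeline.

```python
import re

# A WAL segment file name is 24 upper-case hex digits.
WAL_SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def removable_segments(archive_files, restart_file):
    """Return archived WAL segment names that sort before restart_file."""
    return sorted(
        name
        for name in archive_files
        # Skip .history and .backup files; keep the restart file and anything newer.
        if WAL_SEGMENT_NAME.fullmatch(name) and name < restart_file
    )
```

For example, with A5, A6 and A7 in the archive and a restart file of A6, only A5 would be selected; the file just restored is not necessarily removable.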

> LOG:  database system was interrupted; last known up at 2008-05-12 
> 15:18:23 EDT
> LOG:  starting archive recovery
> LOG:  log_restartpoints = false
> LOG:  restore_command = '../bin/pg_standby -t ../common_archive/failover 
> ../common_archive %f %p %r '
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> LOG:  restored log file "0000000100000000000000A5.00000068.backup" from 
> archive
> LOG:  restored log file "0000000100000000000000A5" from archive
> LOG:  automatic recovery in progress
> LOG:  redo starts at 0/A50000B0
> LOG:  restored log file "0000000100000000000000A6" from archive
> LOG:  restored log file "0000000100000000000000A7" from archive
> LOG:  restored log file "0000000100000000000000A8" from archive
> LOG:  restored log file "0000000100000000000000A9" from archive
> trigger file found
> LOG:  could not open file "pg_xlog/0000000100000000000000AA" (log file 
> 0, segment 170): No such file or directory
> LOG:  redo done at 0/A9000068
> LOG:  restored log file "0000000100000000000000A9" from archive
> cp: cannot stat `../common_archive/00000002.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000002.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000002.history': No such file or 
> directory
> LOG:  selected new timeline ID: 2
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or 
> directory
> LOG:  archive recovery complete
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started

There is an outstanding Windows issue with pg_standby, listed on the
latest commitfest page, that I would appreciate your help with, since I
don't maintain a Windows dev environment.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com


