On Thu, Oct 6, 2011 at 3:47 AM, Frank Lanitz
<frank@frank.uvena.de> wrote:
Hi folks,
I want to refer to a question Rob did back in 2008 at
http://archives.postgresql.org/pgsql-general/2008-07/msg01167.php as we
are currently running into a similar question:
We are using warm standby via PITR using a shared drive between master
and slave node.
Our setup currently is set to archive_timeout = 60s and
checkpoint_timeout = 600s.
We expected that a WAL file would now be written to the share every minute,
but we may have misunderstood some part of the documentation, because in
periods of low traffic on the database the interval between WAL files is
more than 1 minute, up to ten minutes.
The 8.4 docs lack this detail, but the 9.0 docs explain it. I don't believe it's a behavior change; I think the documentation was simply clarified
( http://www.postgresql.org/docs/9.0/interactive/runtime-config-wal.html#GUC-ARCHIVE-TIMEOUT ):
"When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, ***and there has been any database activity, including a single checkpoint.***" (emphasis mine)
Tom said something similar in the thread you referenced:
http://archives.postgresql.org/pgsql-general/2008-07/msg01166.php
"One possible connection is that an xlog file switch will not actually happen unless some xlog output has been generated since the last switch. If you were watching an otherwise-idle system then maybe the checkpoint records are needed to make it look like a switch is needed. OTOH if it's *that* idle then the checkpoints should be no-ops too."
However, the goal was to have a WAL file every minute so disaster
recovery can be done fast with a minimum of lost data.
If there were any data, its existence in the transaction log would trigger the archive_timeout behavior. With no database activity, you aren't missing anything.
Question is: What did we miss? Do we need to set checkpoint_timeout to
60s as well, and does this make sense at all?
You are getting what you need (at most 60s between data being written and the corresponding WAL segment being handed to archive_command), just not exactly what you thought you asked for.
If you absolutely must have a file every minute in order to sleep well, you can lower checkpoint_timeout. Keep in mind the cost of checkpoints.
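
To make the rule concrete, here is a toy Python model of it. This is only a sketch, not PostgreSQL code: the function simulate_switches and its parameters are made up for illustration, and it assumes that every timed checkpoint writes a WAL record even on an idle server, which is what your observed ten-minute gaps suggest.

```python
def simulate_switches(archive_timeout, checkpoint_timeout, duration, write_times=()):
    """Return the seconds at which a toy server would switch WAL segments.

    Hypothetical model: all arguments are in whole seconds, and
    write_times lists the seconds at which a client writes some data.
    """
    writes = set(write_times)
    switches = []
    last_switch = 0
    wal_since_switch = False
    for now in range(1, duration + 1):
        # Assumption: a timed checkpoint always emits a WAL record.
        if now % checkpoint_timeout == 0 or now in writes:
            wal_since_switch = True
        # Rule from the quoted docs: the timeout has elapsed AND there
        # has been some activity since the last segment switch.
        if wal_since_switch and now - last_switch >= archive_timeout:
            switches.append(now)
            last_switch = now
            wal_since_switch = False
    return switches


# Idle server with your current settings: one segment per checkpoint.
print(simulate_switches(60, 600, 1800))           # [600, 1200, 1800]
# Idle server with checkpoint_timeout lowered to 60s: one per minute.
print(simulate_switches(60, 60, 300))             # [60, 120, 180, 240, 300]
# A single write at t=10 is archived by t=60 with the current settings.
print(simulate_switches(60, 600, 700, [10]))      # [60, 600]
```

The last call is the important one: under this model, data written at any moment is switched out within archive_timeout, regardless of checkpoint_timeout.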
Derrick