Thread: BUG #4137: Postgres drives pg_standby %r param to delete WAL file before processing

BUG #4137: Postgres drives pg_standby %r param to delete WAL file before processing

From
"Wojciech Strzalka"
Date:
The following bug has been logged online:

Bug reference:      4137
Logged by:          Wojciech Strzalka
Email address:      wstrzalka@gmail.com
PostgreSQL version: 8.3.1
Operating system:   Linux version 2.6.18-53.1.14.el5 (Red Hat 4.1.2-14)
Description:        Postgres drives pg_standby %r param to delete WAL file
before processing
Details:

I've discovered problem with WAL recovery on standby server.

I have archive_command set properly and logs are shipping OK. Archive
timeout is also set (5 min).

When performing pg_start_backup the WAL is lets say on position
0000000100000001000000D9, then I start copy database to the second
machine which takes me 30 minutes. In that time archive timeout is
called a few times and those file are shipped properly to the second
host. After DB is succesfully copied i'm calling pg_stop_backup. The
WAL is at the moment on position 0000000100000001000000DE.

In that moment I see on the second machine WAL files from
0000000100000001000000D9 to 0000000100000001000000DE as well as
0000000100000001000000D9.00000020.backup

The problem occurs now when I'm trying to start my standby server in
recovery mode (with pg_standby).

The output from pg_standby:
------------------------------------
Trigger file             : /tmp/pgsql.promote_trigger.5432
Waiting for WAL file     : 00000001.history
WAL file path            : /var/lib/pgsql/incoming_wal/
00000001.history
Restoring to...          : pg_xlog/RECOVERYHISTORY
Sleep interval           : 5 seconds
Max wait interval        : 0 forever
Command for restore      : ln -s -f "/var/lib/pgsql/incoming_wal/
00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history     : 0000000100000001000000DB and later
running restore          : OK

Trigger file             : /tmp/pgsql.promote_trigger.5432
Waiting for WAL file     : 0000000100000001000000D9.00000020.backup
WAL file path            : /var/lib/pgsql/incoming_wal/
0000000100000001000000D9.00000020.backup
Restoring to...          : pg_xlog/RECOVERYHISTORY
Sleep interval           : 5 seconds
Max wait interval        : 0 forever
Command for restore      : ln -s -f "/var/lib/pgsql/incoming_wal/
0000000100000001000000D9.00000020.backup" "pg_xlog/RECOVERYHISTORY"
Keep archive history     : 0000000100000001000000DB and later
running restore          : OK

Trigger file             : /tmp/pgsql.promote_trigger.5432
Waiting for WAL file     : 0000000100000001000000D9
WAL file path            : /var/lib/pgsql/incoming_wal/
0000000100000001000000D9
Restoring to...          : pg_xlog/RECOVERYXLOG
Sleep interval           : 5 seconds
Max wait interval        : 0 forever
Command for restore      : ln -s -f "/var/lib/pgsql/incoming_wal/
0000000100000001000000D9" "pg_xlog/RECOVERYXLOG"
Keep archive history     : 0000000100000001000000DB and later
running restore          : OK

removing "/var/lib/pgsql/incoming_wal/0000000100000001000000D9"
removing "/var/lib/pgsql/incoming_wal/0000000100000001000000DA"

----------------------------------------------------------------------------
----------------------------

For the first time I start standby Postgres log says and the postgres
process goes down:
----------------------------------------------------------------------------
----------------------------
restored log file "0000000100000001000000D9.00000020.backup" from
archive
could not open file "pg_xlog/0000000100000001000000D9" (log file 1,
segment 217): No such file or directory invalid checkpoint record could not
locate required checkpoint record
If you are not restoring from a backup, try removing the file
"/var/lib/pgsql/data/backup_label".
startup process (PID 19201) was terminated by signal 6: Aborted
aborting startup due to startup process failure
----------------------------------------------------------------------------
----------------------------

When I try to start PG for the second time it just stucks waiting for
0000000100000001000000D9

In my opinion the problem is that when starting standby PostgresSQL
wants to recovery WAL 0000000100000001000000D9, but first deletes it, as
keep  archive history (%r) param is set to 0000000100000001000000DB from the
very first call to the pg_standby (when reading 00000001.history).

This occurs only when the copy process is longer then archive_timeout.

My restore command is:
------------------------------------------
restore_command = 'pg_standby -l -d -s 5 -w 0 -t
/tmp/pgsql.promote_trigger.5432 ~postgres/incoming_wal %f %p %r 2>&1 |
logger -p local1.info -t pitr-standby'
------------------------------------------

I don't know if it matters but there are no transactions on active at the
time I make a copy - so all WAL archives shipped at the time are probably
empty.

I don't delete any files from pg_xlog prior to start standby.

I hope this description is enough to reproduce the bug.
On Fri, 2008-05-02 at 18:11 +0000, Wojciech Strzalka wrote:

> I've discovered problem with WAL recovery on standby server.

Thanks for reporting it.

> I start copy database to the second
> machine which takes me 30 minutes.

> The output from pg_standby:
> ------------------------------------
> Trigger file             : /tmp/pgsql.promote_trigger.5432
> Waiting for WAL file     : 00000001.history
> WAL file path            : /var/lib/pgsql/incoming_wal/
> 00000001.history
> Restoring to...          : pg_xlog/RECOVERYHISTORY
> Sleep interval           : 5 seconds
> Max wait interval        : 0 forever
> Command for restore      : ln -s -f "/var/lib/pgsql/incoming_wal/
> 00000001.history" "pg_xlog/RECOVERYHISTORY"

> Keep archive history     : 0000000100000001000000DB and later
...................................................^^
> running restore          : OK

Looks like the first message gives and incorrect archive cut-off point.

My guess is that its picking up a checkpoint that happens during backup
as the starting point for the cutoff, which is incorrect. My guess at a
fix is that we need to make sure that the %r value is always less than
or equal to the %f value. Sounds like a good safeguard even if that
doesn't fix this specific bug.

I'll check and see if I can recreate it, then patch. Won't be tonight
though.

If I'm right, then removing the %r value from the restore_command should
make this work correctly as a workaround. Please let me know either way.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
On Fri, 2008-05-02 at 22:01 +0100, Simon Riggs wrote:
> On Fri, 2008-05-02 at 18:11 +0000, Wojciech Strzalka wrote:

> > Keep archive history     : 0000000100000001000000DB and later
> ...................................................^^
> > running restore          : OK
>
> Looks like the first message gives and incorrect archive cut-off point.

I confess that I don't fully understand your description of the sequence
of activities you performed. But I do see that a file has been removed
that is *after* the one requested, which should never happen. So this is
a valid bug.

Not managed to recreate this yet, but I'm posting my suggested fix now.
This is a general fix, so will fix any problem with those symptoms.

Please test this and report back. You can do this in two ways:
1. recompile pg_standby only and retry. I enclose a fully patched copy
of pg_standby.c to assist you, in case that helps
2. apply complete patch to postgres source tree and retry

I will attempt to recreate the problem and test this patch also using
method (2)

[Not yet ready to be applied to core, until we get test results.]

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Attachment
On Tue, 2008-05-06 at 09:37 +0200, Wojciech Strzałka wrote:

>     Thanks for your fix. It solves the problem I described.
>     If you're interested I'm attaching my log files before & after applying the fix
>     (I've applied only new pg_standby.c, although in the patch some other files are also changed)
>     and graphical diff with the problematic activities marked.
>     Recovery was made on exactly the same files.
>
>
>    Looks great for me. Thanks for quick response & fix.

No problem.

This looks good to apply now, so posting to patches.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com