Re: Failing start-up archive recovery at Standby mode in PG9.2.4 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Failing start-up archive recovery at Standby mode in PG9.2.4
Msg-id 51798552.2010102@vmware.com
In response to Re: Failing start-up archive recovery at Standby mode in PG9.2.4  (Kyotaro HORIGUCHI <kyota.horiguchi@gmail.com>)
List pgsql-hackers
On 25.04.2013 18:56, Kyotaro HORIGUCHI wrote:
>> Can you share the modified script, please?
>
> Please find the attached files:
>    test.sh : test script. most significant change is the load.
>                 I used simple insert instead of pgbench.
>                 It might need some more adjustment for other environment
>                 as my usual.
>    xlog.c.diff : Additional log output I thought to be useful to diagnose.

Ok, thanks, I see what's going on now. The problem is that as soon as
XLogFileRead() finds a file with tli X, it immediately sets curFileTLI =
X, and XLogFileReadAnyTLI() never tries to read files with tli < curFileTLI.
So if recovery finds a file with the right filename, e.g.
000000030000000000000008, it never tries to open
000000020000000000000008 again, even if the contents of
000000030000000000000008 later turn out to be bogus.
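
To make the two file names above concrete, here is a small standalone
sketch of how such names are put together. It assumes the 9.2 layout of
8 hex digits each for timeline, log id and segment; wal_file_name() and
same_segment() are names made up for this illustration, not functions
from xlog.c.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t TimeLineID;

/* Format a segment file name as 8 hex digits each of timeline,
 * log id and segment number (24 characters plus the NUL).
 * Hypothetical helper, modelled on the 9.2 naming scheme. */
void wal_file_name(char *buf, size_t len,
                   TimeLineID tli, uint32_t log, uint32_t seg)
{
    snprintf(buf, len, "%08X%08X%08X",
             (unsigned) tli, (unsigned) log, (unsigned) seg);
}

/* Two names cover the same WAL position if everything after the
 * 8-digit timeline prefix matches. Hypothetical helper. */
int same_segment(const char *a, const char *b)
{
    return strlen(a) == 24 && strlen(b) == 24 &&
           strcmp(a + 8, b + 8) == 0;
}
```

With tli 3 and tli 2 and the same log id and segment, this produces the
two names mentioned above: they differ only in the timeline prefix, so
both are candidates for the same WAL position.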

One idea to fix this is to not set curFileTLI until the page header of
the just-opened file has been verified. Another idea is to change the
check in XLogFileReadAnyTLI() that currently forbids curFileTLI from
moving backwards. We could allow curFileTLI to move backwards, as long
as the tli is >= ThisTimeLineID (ThisTimeLineID is the current timeline
we're recovering records from).
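
As a rough model of the second idea (this is not the patch itself, and
not the real xlog.c code), the sketch below compares the current check
with the relaxed one. The function names, the usable[] array standing
in for "file exists and its contents check out", and the newest-first
expected[] list are all assumptions made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

typedef bool (*TliRule)(TimeLineID tli, TimeLineID curFileTLI,
                        TimeLineID thisTLI);

/* Current behaviour as described above: never look at a timeline
 * older than the one the last opened file belonged to. */
bool may_try_strict(TimeLineID tli, TimeLineID curFileTLI,
                    TimeLineID thisTLI)
{
    (void) thisTLI;
    return tli >= curFileTLI;
}

/* Relaxed rule: curFileTLI may move backwards, but never below the
 * timeline we are currently recovering records from. */
bool may_try_relaxed(TimeLineID tli, TimeLineID curFileTLI,
                     TimeLineID thisTLI)
{
    (void) curFileTLI;
    return tli >= thisTLI;
}

/* Walk the expected timelines, newest first, and return the first
 * one whose segment file is usable, or 0 if the rule stops the
 * search before one is found. */
TimeLineID find_segment(const TimeLineID *expected, const bool *usable,
                        size_t n, TimeLineID curFileTLI,
                        TimeLineID thisTLI, TliRule rule)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!rule(expected[i], curFileTLI, thisTLI))
            break;              /* everything after this is older still */
        if (usable[i])
            return expected[i];
    }
    return 0;
}
```

With expected timelines 3, 2, 1, a bogus file on tli 3, curFileTLI = 3
and ThisTimeLineID = 2, the strict rule finds nothing, while the relaxed
rule falls back to the file on tli 2 and still refuses to go back to
tli 1.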

Attached is a patch for the second approach. With the patch, the test
script works for me. Thoughts?

PS. This wasn't caused by the 9.2.4 change to do crash recovery before
entering archive recovery. The test script fails in the same way with
9.2.3 as well.

- Heikki
