Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes
Msg-id 6027.1281591117@sss.pgh.pa.us
In response to Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-bugs

Fujii Masao <masao.fujii@gmail.com> writes:
> On Fri, Aug 6, 2010 at 7:50 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The procedure used does differ from that documented. However, IMHO the
>> procedure *documented* is *not* safe and could lead to corrupt indexes
>> in the way described, since the last recovered point might be mid-way
>> between two halves of an index split record, which will never be
>> corrected during HS.

> An index split record is replayed by two calls of rm_redo()? If not,
> we don't need to worry about the above since the last recovered point
> which pg_last_xlog_replay_location() returns is updated after every
> rm_redo().

Yeah, I thought that was bogus too.  If we're following a live master,
the second xlog record should be along shortly, and in any case queries
will give the correct result in between.  The problem is only interesting
if the WAL series ends and we have to cons up the split completion by
ourselves; but the logic to do that does exist.

What was bothering me about the procedure is that it's not clear when
the new slave has reached consistency, in the sense of having used WAL
to clean up any out-of-sync conditions in the base backup it was started
from.  So you can't be sure when it's okay to begin treating it as a
trustworthy backup or potential master.  We track the minimum safe
recovery point for normal PITR recovery cases, but that mechanism isn't
available for slaves cloned according to this procedure.  So the DBA is
just flying blind as to whether the slave is trustworthy yet.  I can't
prove that that's what burnt the original complainant, but it fits the
symptoms.
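
To make the missing check concrete, here is a minimal sketch of the comparison a DBA would want to perform: has the standby's replay location passed the minimum safe recovery point? It assumes WAL locations in the usual "X/Y" hexadecimal text form, as returned by pg_last_xlog_replay_location(). The function names are hypothetical, and so is the premise that a minimum safe location is known at all, since for a slave cloned by this procedure no such value is recorded.

```python
def parse_xlog_location(text):
    """Parse an 'X/Y' hexadecimal WAL location into a comparable tuple.

    Hypothetical helper: the result is (log id, byte offset), so ordinary
    tuple comparison orders locations the same way WAL does.
    """
    logid, sep, offset = text.strip().partition("/")
    if not sep or not logid or not offset:
        raise ValueError("not a WAL location: %r" % (text,))
    return (int(logid, 16), int(offset, 16))


def has_reached_consistency(replay_location, min_safe_location):
    """Return True once replay has reached or passed the minimum safe point.

    min_safe_location is an assumption: it stands for a location the DBA
    would have had to record somewhere, which this procedure does not do.
    """
    return (parse_xlog_location(replay_location)
            >= parse_xlog_location(min_safe_location))


if __name__ == "__main__":
    # Illustrative values only, not taken from the bug report.
    print(has_reached_consistency("0/3000060", "0/2FFFFFF"))
    print(has_reached_consistency("0/A000000", "1/0"))
```

Without a recorded minimum safe location there is nothing to pass as the second argument, which is exactly the gap described above.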

            regards, tom lane
