Re: [BUG] pg_basebackup from disconnected standby fails - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [BUG] pg_basebackup from disconnected standby fails
Date
Msg-id CA+Tgmoa7fFtg-H3j1J-fHgMdOEN2UTJCuCE1AOJzPz6R70vPPQ@mail.gmail.com
Whole thread Raw
In response to Re: [BUG] pg_basebackup from disconnected standby fails  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [BUG] pg_basebackup from disconnected standby fails  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Oct 24, 2016 at 12:26 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> The consensus solution on this thread seems to be that we should have
>> pg_do_stop_backup() return the last-replayed XLOG location as the
>> backup end point.  If the control file has been updated with a newer
>> redo location, then the associated checkpoint record has surely been
>> replayed, so we'll definitely have enough WAL to replay that
>> checkpoint record (and we don't need to replay anything else, because
>> we're supposing that this is the case where the minimum recovery point
>> precedes the redo location).  Although this will work, it might
>> include more WAL in the backup than is strictly necessary.  If the
>> additional WAL replayed beyond the minimum recovery point does NOT
>> include a checkpoint, we don't need any of it; if it does, we need
>> only the portion up to and including the last checkpoint record, and
>> not anything beyond that.
>
> You are right that it will include additional WAL than strictly
> necessary, but that can happen today as well because minrecoverypoint
> could be updated after you have established restart point for
> do_pg_start_backup().  Do you want to say that even without patch in
> some cases we are including additional WAL in backup than what is
> strictly necessary, so it is better to improve the situation?

In that case, including additional WAL is *necessary*.  If the minimum
recovery point is updated after do_pg_start_backup(), pages bearing
LSNs newer than the old minimum recovery point could conceivably have
been copied as part of the backup, and therefore we need to replay up
through the new minimum recovery point to guarantee consistency.

>> I can think of two solutions that would be "tighter":
>>
>> 1. When performing a restartpoint, update the minimum recovery point
>> to just beyond the checkpoint record.  I think this can't hurt anyone
>> who is actually restarting recovery on the same machine, because
>> that's exactly the point where they are going to start recovery.  A
>> minimum recovery point which precedes the point at which they will
>> start recovery is no better than one which is equal to the point at
>> which they will start recovery.
>
> I think this will work but will cause an unnecessary control file
> update for the cases when buffer flush will anyway do it.  It can work
> if we try to do only when required (minrecoverypoint is lesser than
> lastcheckpoint) after flush of buffers.

Sure, that doesn't sound too hard to implement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [BUG] pg_basebackup from disconnected standby fails
Next
From: Robert Haas
Date:
Subject: Re: [BUG] pg_basebackup from disconnected standby fails