Re: [BUG] pg_basebackup from disconnected standby fails - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [BUG] pg_basebackup from disconnected standby fails
Date
Msg-id CAB7nPqSTguwh4hkKfWU18MCe6Fbvdm0DK-cH-7tx6xZ2BhCc9Q@mail.gmail.com
Whole thread Raw
In response to Re: [BUG] pg_basebackup from disconnected standby fails  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [BUG] pg_basebackup from disconnected standby fails
Re: [BUG] pg_basebackup from disconnected standby fails
List pgsql-hackers
On Wed, Jun 15, 2016 at 4:52 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello,

Sorry for the late reply, Horiguchi-san. I have finally been able to
put some mind power into that.

> This is somewhat artificial but the same situation could be made
> also in the nature. The exact condition for this is replaying a
> checkpoint record having no buffer modification since the
> preceding checkpoint in the previous WAL segments.

After thinking about that more, I am seeing your point.
CreateRestartPoint is clearly missing the shot for the update of
minRecoveryPoint even when a restart point can be created. Since
cdd46c7, we assume that a node in recovery will not update it, and
rely on the fact that it will be updated if some buffers are flushed,
but that assumption got broken since the introduction of the
possibility to take base backups from standbys, as in your case there
is no activity on the standby that would allow minRecoveryPoint to be
updated and make the backup finish in a clean state. By the way,
relying on CheckpointStats to decide if it should be updated or not
looks like a broken concept to me as this information is just stored
for logging and has no role in any decision-making, and I think that
it should remain this way. So, instead I think that we had better
update minRecoveryPoint unconditionally as in the attached. I reworked
the surrounding comments in consequence. That's a bit more aggressive,
and that addresses the backup problems.

> Returning to the discussion on which way to choose, we agreed
> that using replayEndRecPtr instead of minRecoveryPoint in
> do_pg_stop_backup(), as the patch attached to the following
> message.

Hm.. If there are no buffers flushed to disk that may increase the
time it takes to reach a consistent state at recovery for no real
reasons. So actually it is not that much a good idea...
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: Re: Question and suggestion about application binary compatibility policy
Next
From: Michael Paquier
Date:
Subject: Re: [BUG] pg_basebackup from disconnected standby fails