Re: [HACKERS] Infrastructure changes for recovery - Mailing list pgsql-patches

From Tom Lane
Subject Re: [HACKERS] Infrastructure changes for recovery
Date
Msg-id 9794.1222624945@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Infrastructure changes for recovery  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: [HACKERS] Infrastructure changes for recovery
List pgsql-patches
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Thu, 2008-09-25 at 18:28 -0400, Tom Lane wrote:
>> After reading this for awhile, I realized that there is a rather
>> fundamental problem with it: it switches into "consistent recovery"
>> mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint.
>> In a crash recovery situation that typically is before the last
>> checkpoint (if indeed it's not still zero), and what that means is
>> that this patch will activate the bgwriter and start letting in
>> backends instantaneously after a crash, long before we can have any
>> certainty that the DB state really is consistent.
>>
>> In a normal crash recovery situation this would be easily fixed by
>> simply not letting it go to "consistent recovery" state at all, but
>> what about recovery from a restartpoint?  We don't want a slave that's
>> crashed once to never let backends in again.  But I don't see how to
>> determine that we're far enough past the restartpoint to be consistent
>> again.  In crash recovery we assume (without proof ;-)) that we're
>> consistent once we reach the end of valid-looking WAL, but that rule
>> doesn't help for a slave that's following a continuing WAL sequence.
>>
>> Perhaps something could be done based on noting when we have to pull in
>> a WAL segment from the recovery_command, but it sounds like a pretty
>> fragile assumption.

> Seems like we just say we only signal the postmaster if
> InArchiveRecovery. Archive recovery from a restartpoint is still archive
> recovery, so this shouldn't be a problem in the way you mention. The
> presence of recovery.conf overrides all other cases.

What that implements is my comment that we don't have to let anyone in
at all during a plain crash recovery.  It does nothing AFAICS for the
problem that when restarting archive recovery from a restartpoint,
it's not clear when it is safe to start letting in backends.  You need
to get past the highest LSN that has made it out to disk, and there is
no good way to know what that is.

Unless we can get past this problem the whole thing seems a bit dead in
the water :-(

>> * I'm a bit uncomfortable with the fact that the
>> IsRecoveryProcessingMode flag is read and written with no lock.

> It's not a dynamic state, so I can fix that inside
> IsRecoveryProcessingMode() with a local state to make check faster.

Erm, this code doesn't look like it can allow IsRecoveryProcessingMode
to become locally true in the first place?  I guess you could fix it
by initializing IsRecoveryProcessingMode to true, but that seems likely
to break other places.  Maybe better is to have an additional local
state variable showing whether the flag has ever been fetched from
shared memory.

The other issues don't seem worth arguing about ...

            regards, tom lane

pgsql-patches by date:

Previous
From: Simon Riggs
Date:
Subject: Re: [HACKERS] get_relation_stats_hook()
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] get_relation_stats_hook()