Re: [HACKERS] Infrastructure changes for recovery - Mailing list pgsql-patches
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Infrastructure changes for recovery |
Date | |
Msg-id | 9794.1222624945@sss.pgh.pa.us |
In response to | Re: [HACKERS] Infrastructure changes for recovery (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: [HACKERS] Infrastructure changes for recovery |
List | pgsql-patches |
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Thu, 2008-09-25 at 18:28 -0400, Tom Lane wrote:
>> After reading this for awhile, I realized that there is a rather
>> fundamental problem with it: it switches into "consistent recovery"
>> mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint.
>> In a crash recovery situation that typically is before the last
>> checkpoint (if indeed it's not still zero), and what that means is
>> that this patch will activate the bgwriter and start letting in
>> backends instantaneously after a crash, long before we can have any
>> certainty that the DB state really is consistent.
>>
>> In a normal crash recovery situation this would be easily fixed by
>> simply not letting it go to "consistent recovery" state at all, but
>> what about recovery from a restartpoint? We don't want a slave that's
>> crashed once to never let backends in again. But I don't see how to
>> determine that we're far enough past the restartpoint to be consistent
>> again. In crash recovery we assume (without proof ;-)) that we're
>> consistent once we reach the end of valid-looking WAL, but that rule
>> doesn't help for a slave that's following a continuing WAL sequence.
>>
>> Perhaps something could be done based on noting when we have to pull in
>> a WAL segment from the recovery_command, but it sounds like a pretty
>> fragile assumption.

> Seems like we just say we only signal the postmaster if
> InArchiveRecovery. Archive recovery from a restartpoint is still archive
> recovery, so this shouldn't be a problem in the way you mention. The
> presence of recovery.conf overrides all other cases.

What that implements is my comment that we don't have to let anyone in
at all during a plain crash recovery. It does nothing AFAICS for the
problem that when restarting archive recovery from a restartpoint, it's
not clear when it is safe to start letting in backends. You need to get
past the highest LSN that has made it out to disk, and there is no good
way to know what that is.

Unless we can get past this problem the whole thing seems a bit dead
in the water :-(

>> * I'm a bit uncomfortable with the fact that the
>> IsRecoveryProcessingMode flag is read and written with no lock.

> It's not a dynamic state, so I can fix that inside
> IsRecoveryProcessingMode() with a local state to make check faster.

Erm, this code doesn't look like it can allow IsRecoveryProcessingMode
to become locally true in the first place? I guess you could fix it by
initializing IsRecoveryProcessingMode to true, but that seems likely to
break other places. Maybe better is to have an additional local state
variable showing whether the flag has ever been fetched from shared
memory.

The other issues don't seem worth arguing about ...

			regards, tom lane
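
The last suggestion in the message (an additional local state variable showing whether the flag has ever been fetched from shared memory) might look roughly like the C sketch below. This is not code from the patch: `shared_recovery_mode`, `shared_reads`, `local_recovery_mode`, `local_flag_fetched` and `is_recovery_processing_mode()` are hypothetical names, an ordinary global stands in for the shared-memory flag, and the sketch assumes the flag only ever changes from true to false.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the flag kept in shared memory.  In a real
 * server this would live in a shared-memory struct; here it is a plain
 * global so the sketch compiles on its own.
 */
static volatile bool shared_recovery_mode = true;

/* Counts reads of the "shared" flag, purely for illustration. */
static int shared_reads = 0;

/* Per-process cached copy of the flag. */
static bool local_recovery_mode = false;

/* Whether this process has ever fetched the flag from shared state. */
static bool local_flag_fetched = false;

/*
 * Returns true while recovery is in progress.
 *
 * Assumption: the shared flag only moves from true to false.  Under that
 * assumption, once this process has fetched the flag and seen false, the
 * cached value can be trusted and the shared flag is never read again.
 */
static bool
is_recovery_processing_mode(void)
{
    if (local_flag_fetched && !local_recovery_mode)
        return false;

    /* Never fetched, or still true at the last fetch: read it again. */
    local_recovery_mode = shared_recovery_mode;
    shared_reads++;
    local_flag_fetched = true;

    return local_recovery_mode;
}
```

With this shape a process keeps re-reading the shared flag while recovery is still in progress and stops touching shared state once it has observed false. The separate `local_flag_fetched` variable avoids having to initialize the cached flag to true. The sketch does nothing about the unlocked read and write raised earlier in the message.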