Re: [PATCH] A crash and subsequent recovery of the master can cause the slave to get out-of-sync - Mailing list pgsql-hackers

From: Florian G. Pflug
Subject: Re: [PATCH] A crash and subsequent recovery of the master can cause the slave to get out-of-sync
Msg-id: 462C964E.90405@phlo.org
In response to: Re: [PATCH] A crash and subsequent recovery of the master can cause the slave to get out-of-sync ("Simon Riggs" <simon@2ndquadrant.com>)
List: pgsql-hackers
Simon Riggs wrote:
> On Thu, 2007-04-19 at 22:37 +0200, Florian G. Pflug wrote:
>> The problem is that after a crash, the master might complete incomplete
>> actions via rm_cleanup() - but since it won't WAL-log those changes,
>> the slave won't know about them. This will at least prevent the creation
>> of any further restart points on the slave (because safe_restartpoint()
>> will never return true again) - and it might even cause data corruption,
>> if subsequent WAL records are interpreted wrongly by the slave because
>> it sees different data from what the master saw when it generated them.
> 
> I agree the problem exists. It is somewhat rare because the idea is that
> if the Primary crashes we would failover to the Standby, which would
> mean that both Primary and Standby have executed rm_cleanup(), if
> needed.
> 
> So in the case where the Primary fails and we choose *not* to failover,
> there is a potential difficulty on the Standby.

It occurred to me today that this might plague 8.1 too. As you explain below,
the problem is not really connected to restartpoints - failing to create
them is merely a symptom of it. On 8.1, this might still lead to rm_cleanup()
being called much "later" (if you take the WAL position to be the "time")
on the slave than on the master. I don't know whether this really causes
trouble - I don't yet understand the btree code well enough to judge.
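To make the failure mode concrete, here is a toy, self-contained sketch of
the end-of-recovery pass. The names mimic RmgrData/RmgrTable from the real
tree, but everything is simplified - this is not the actual xlog.c code:

#include <stdio.h>
#include <stdbool.h>

#define RM_MAX_ID 1             /* the real table has many more entries */

typedef struct RmgrData
{
    const char *rm_name;
    void        (*rm_cleanup) (void);           /* finish incomplete actions */
    bool        (*rm_safe_restartpoint) (void); /* nothing left pending? */
} RmgrData;

static void
btree_cleanup(void)
{
    /* Completes an interrupted page split by modifying data pages
     * directly - crucially, it emits NO WAL records for this. */
    printf("btree: completing interrupted split, no WAL emitted\n");
}

static bool
btree_safe_restartpoint(void)
{
    return false;               /* an incomplete split is still pending */
}

static RmgrData RmgrTable[RM_MAX_ID + 1] = {
    {"XLOG", NULL, NULL},
    {"Btree", btree_cleanup, btree_safe_restartpoint}
};

int
main(void)
{
    int         rmid;

    /* End of crash recovery on the master: fix up the data pages.
     * A slave replaying the same WAL stream never runs this loop, so
     * its pages silently diverge from the master's. */
    for (rmid = 0; rmid <= RM_MAX_ID; rmid++)
        if (RmgrTable[rmid].rm_cleanup != NULL)
            RmgrTable[rmid].rm_cleanup();

    return 0;
}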

> The rationale for this fix could be described somewhat differently:
> 
> When we shutdown, we know for certain that safe_restartpoint() is true.
> However, we don't know whether it is true because we successfully did a
> clean shutdown, or because we crashed, recovered and then issued a
> shutdown checkpoint following recovery. In the latter case we *must*
> execute rm_cleanup() on the standby because it has been executed on the
> primary. Not doing so at this point *might* be safe, but is not certain
> to be safe. We don't *need* to log a restartpoint at this time, but it
> seems sensible to do so.

While creating the patch, I wondered whether it might be worthwhile
to note in the ShutdownCheckpoint that we just performed recovery
(or to create a new checkpoint type, RecoveryCheckpoint). That would allow
for more error checking, because the slave could then verify that
safe_restartpoint() is true for every ShutdownCheckpoint that does not
follow a recovery.
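Continuing the toy sketch from above (reusing its RmgrTable stand-ins), the
slave's replay logic could then look roughly like this. The after_recovery
flag is purely hypothetical - the real CheckPoint struct has no such field:

static bool
safe_restartpoint(void)
{
    int         rmid;

    for (rmid = 0; rmid <= RM_MAX_ID; rmid++)
        if (RmgrTable[rmid].rm_safe_restartpoint != NULL &&
            !RmgrTable[rmid].rm_safe_restartpoint())
            return false;
    return true;
}

typedef struct CheckPointSketch
{
    bool        after_recovery; /* hypothetical: set iff the master crashed,
                                 * recovered, and then wrote this shutdown
                                 * checkpoint */
} CheckPointSketch;

static void
replay_shutdown_checkpoint(const CheckPointSketch *ckpt)
{
    if (ckpt->after_recovery)
    {
        int         rmid;

        /* The master has run rm_cleanup(), so run it here too; later
         * WAL records must see the same pages the master saw. */
        for (rmid = 0; rmid <= RM_MAX_ID; rmid++)
            if (RmgrTable[rmid].rm_cleanup != NULL)
                RmgrTable[rmid].rm_cleanup();
    }
    else if (!safe_restartpoint())
    {
        /* A clean shutdown must leave nothing pending; if it did,
         * master and slave have already diverged. */
        fprintf(stderr, "incomplete actions at clean shutdown checkpoint\n");
    }
}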

> We need to check that rm_cleanup() routines don't assume that they will
> only ever be called once or this will clearly fail. There is also no
> need to call rm_cleanup() unless rm_safe_restartpoint() is false.

But a non-idempotent rm_cleanup() routine will cause trouble anyway
if postgres crashes after having called rm_cleanup() but before creating
the ShutdownCheckpoint.
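For illustration, an rm_cleanup() that works strictly off a list of pending
actions is naturally idempotent. Again a self-contained sketch, not the real
btree code:

#include <stddef.h>

typedef struct IncompleteAction
{
    struct IncompleteAction *next;
    /* ... description of the half-finished multi-record action ... */
} IncompleteAction;

/* Collected while replaying WAL; emptied by cleanup. */
static IncompleteAction *incomplete_actions = NULL;

/* Hypothetical helper: applies the missing second half of the action
 * to the data pages. */
static void
finish_action(IncompleteAction *action)
{
    (void) action;
}

static void
idempotent_cleanup(void)
{
    /* Work strictly from the pending list: once it is empty this is a
     * no-op, so a second call - say, after a crash between rm_cleanup()
     * and the ShutdownCheckpoint forces recovery to run again - is
     * harmless. */
    while (incomplete_actions != NULL)
    {
        IncompleteAction *action = incomplete_actions;

        incomplete_actions = action->next;
        finish_action(action);
    }
}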

greetings, Florian Pflug


