Re: Recovery inconsistencies, standby much larger than primary - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Recovery inconsistencies, standby much larger than primary
Date
Msg-id CAM-w4HNy6isA41tnY5sH3=rVswEUD8XfjYc6kGGg5Pwzqxnjdw@mail.gmail.com
In response to Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Going over this I think this is still a potential issue:

On 31 Jan 2014 15:56, "Andres Freund" <andres@2ndquadrant.com> wrote:

I am not sure that explains the issue, but I think the redo action for
truncation is not safe across crashes.  A XLOG_SMGR_TRUNCATE will just
do a smgrtruncate() (and then mdtruncate) which will iterate over the
segments starting at 0 till mdnblocks()/segment_size and *truncate* but
not delete individual segment files that are not needed anymore, right?
If we crash in the midst of that a new mdtruncate() will be issued, but
it will get a shorter value back from mdnblocks().

Am I missing something?

I'm not too familiar with md.c, but my reading of the code is that we truncate the files in reverse order? In which case I think the code is safe *iff* the filesystem guarantees ordered metadata writes, which I think ext3 does (I think in all the journal modes). On most filesystems metadata writes are synchronous, so the truncates are safe for them too.
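
To make the ordering argument concrete, here is a toy model in Python, not PostgreSQL code. It assumes a relation is a list of per-segment block counts, that a stand-in for mdnblocks() stops at the first partial segment, and that truncation only shortens files without removing them. RELSEG_SIZE, visible_segments() and truncate() are hypothetical names for illustration, and the "crash" assumes earlier truncations reached disk in the order they were issued.

```python
RELSEG_SIZE = 4  # blocks per segment; deliberately tiny, illustrative only


def visible_segments(segs):
    """Segment indexes a toy mdnblocks() would reach: everything up to
    and including the first segment shorter than RELSEG_SIZE."""
    idxs = []
    for i, size in enumerate(segs):
        idxs.append(i)
        if size < RELSEG_SIZE:
            break
    return idxs


def truncate(segs, target, order="forward", crash_after=None):
    """Shorten the visible segments so the relation has `target` blocks.

    Segments are visited in `order` ("forward" or "reverse"). If
    `crash_after` is set, stop after that many segments to simulate a
    crash. Returns the new list of segment sizes.
    """
    segs = list(segs)
    idxs = visible_segments(segs)
    if order == "reverse":
        idxs.reverse()
    for step, i in enumerate(idxs):
        if crash_after is not None and step == crash_after:
            break
        # Number of blocks of this segment that lie below the new length.
        keep = min(max(target - i * RELSEG_SIZE, 0), RELSEG_SIZE)
        segs[i] = min(segs[i], keep)
    return segs


before = [4, 4, 4, 2]  # 14 blocks in four segments

# Forward order: crash after segment 0 was shortened, then redo.
crashed = truncate(before, 1, "forward", crash_after=1)
print(truncate(crashed, 1, "forward"))   # [1, 4, 4, 2]: stale segments survive

# Reverse order: crash after the last segment was emptied, then redo.
crashed = truncate(before, 1, "reverse", crash_after=1)
print(truncate(crashed, 1, "reverse"))   # [1, 0, 0, 0]
```

In the forward case the redo sees a one-block relation and never reaches segments 1 to 3, so their old contents would reappear if the relation were later extended past the first segment. In the reverse case every intermediate state still exposes all segments that remain to be truncated, but only as long as the truncations become durable in the order issued.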

But we don't generally rely on metadata writes being ordered. I think the "correct" thing to do is to record the nblocks prior to the truncate and then have md.c expose a new function that takes that parameter and pokes around looking for any segments it might need to clean up. But that would involve lots of abstraction violations in md.c. I think using nblocks would keep the violations within md.c, but that still seems like a pain.
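
A rough sketch of that idea, again as a toy Python model rather than anything resembling md.c: if the pre-truncate block count is available at redo time, the cleanup can visit every segment that could have existed, instead of trusting what the files on disk currently report. truncate_with_prior() and its parameters are hypothetical names, and where old_nblocks would actually be recorded is left open here.

```python
RELSEG_SIZE = 4  # blocks per segment; deliberately tiny, illustrative only


def truncate_with_prior(segs, old_nblocks, target):
    """Shorten a relation to `target` blocks, using the block count
    recorded before the truncate to decide how many segments to visit.

    `segs` is a list of per-segment block counts. Segments are only
    shortened, never removed, mirroring the behaviour discussed above.
    """
    segs = list(segs)
    nsegs = -(-old_nblocks // RELSEG_SIZE)  # ceiling division
    for i in range(min(nsegs, len(segs))):
        # Number of blocks of this segment that lie below the new length.
        keep = min(max(target - i * RELSEG_SIZE, 0), RELSEG_SIZE)
        segs[i] = min(segs[i], keep)
    return segs


# State left behind by a crash after only segment 0 was shortened.
after_crash = [1, 4, 4, 2]
print(truncate_with_prior(after_crash, 14, 1))  # [1, 0, 0, 0]
```

Because the loop bound comes from the recorded value rather than from the on-disk state, replaying the operation any number of times, from any partially truncated state, ends in the same result.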
