On Friday, January 06, 2012 08:53:38 PM Robert Haas wrote:
> On Fri, Jan 6, 2012 at 2:48 PM, Andres Freund <andres@anarazel.de> wrote:
> > On Friday, January 06, 2012 08:45:45 PM Heikki Linnakangas wrote:
> >> On 06.01.2012 20:26, Simon Riggs wrote:
> >> > The following patch (v4) introduces a new WAL record type that writes
> >> > backup blocks for the first hint on a block in any checkpoint that has
> >> > not previously been changed. IMHO this fixes the torn page problem
> >> > correctly, though at some additional loss of performance but not the
> >> > total catastrophe some people had imagined. Specifically we don't need
> >> > to log anywhere near 100% of hint bit settings, much more like 20-30%
> >> > (estimated not measured).
> >>
> >> How's that going to work during recovery? Like in hot standby.
> >
> > How's recovery a problem? Unless I miss something that doesn't actually
> > introduce a new possibility to transport hint bits to the standby (think
> > fpw's). A new transport will obviously increase traffic but ...
>
> The standby can set hint bits locally that weren't set on the data it
> received from the master. This will require rechecksumming and
> rewriting the page, but obviously we can't write the WAL records
> needed to protect those writes during recovery. So a crash could
> create a torn page, invalidating the checksum.

Err. Stupid me, thanks.

> Ignoring checksum errors during Hot Standby operation doesn't fix it,
> either, because eventually you might want to promote the standby, and
> the checksum will still be invalid.

It's funny. I have the feeling we are all missing a very obvious, brilliant
solution to this...

Andres
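
For illustration, the torn-page scenario Robert describes can be sketched
roughly as follows. This is a toy model, not PostgreSQL code: the page
layout (a 4-byte CRC32 at the start of the page), the 4096-byte atomic
write unit, and all function names are assumptions made up for the example.
It only shows why a hint-bit update that is not protected by a WAL backup
block can leave a page whose checksum no longer matches after a crash.

```python
import zlib

PAGE_SIZE = 8192   # PostgreSQL's default block size
SECTOR = 4096      # assumed atomic write unit, for illustration only
CSUM_LEN = 4       # hypothetical layout: checksum in the first 4 bytes


def with_checksum(body: bytes) -> bytes:
    """Build a full page: CRC32 of the body, followed by the body."""
    assert len(body) == PAGE_SIZE - CSUM_LEN
    return zlib.crc32(body).to_bytes(CSUM_LEN, "little") + body


def verify(page: bytes) -> bool:
    """Check the stored checksum against the page body."""
    stored = int.from_bytes(page[:CSUM_LEN], "little")
    return stored == zlib.crc32(page[CSUM_LEN:])


def set_hint_bit(page: bytes, offset: int) -> bytes:
    """Set one bit at the given page offset and recompute the checksum."""
    body = bytearray(page[CSUM_LEN:])
    body[offset - CSUM_LEN] |= 0x01
    return with_checksum(bytes(body))


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first N sectors reached disk."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]


old_page = with_checksum(bytes(PAGE_SIZE - CSUM_LEN))
# The hint bit lives in the second sector, the checksum in the first.
new_page = set_hint_bit(old_page, offset=6000)
# Crash: the first sector (new checksum) is written, the second is not.
on_disk = torn_write(old_page, new_page, sectors_written=1)

print(verify(old_page), verify(new_page), verify(on_disk))
# prints: True True False
```

Both the old and the new page are valid on their own; only the mix of the
two fails verification. With a full-page image in WAL, recovery could
overwrite the torn page, which is exactly what the standby cannot arrange
for hint bits it sets locally.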