Re: 16-bit page checksums for 9.2 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: 16-bit page checksums for 9.2
Msg-id CA+TgmoY+r-EVs3zskY5_wE_EXxs9yvG-0im531==UM_PHuCbmw@mail.gmail.com
In response to Re: 16-bit page checksums for 9.2  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: 16-bit page checksums for 9.2  (Ants Aasma <ants.aasma@eesti.ee>)
List pgsql-hackers
On Fri, Dec 30, 2011 at 11:58 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On 12/29/11, Ants Aasma <ants.aasma@eesti.ee> wrote:
>> Unless I'm missing something, double-writes are needed for all writes,
>> not only the first page after a checkpoint. Consider this sequence of
>> events:
>>
>> 1. Checkpoint
>> 2. Double-write of page A (DW buffer write, sync, heap write)
>> 3. Sync of heap, releasing DW buffer for new writes.
>>  ... some time goes by
>> 4. Regular write of page A
>> 5. OS writes one part of page A
>> 6. Crash!
>>
>> Now recovery comes along, page A is broken in the heap with no
>> double-write buffer backup nor anything to recover it by in the WAL.
>
> Isn't 3 the very definition of a checkpoint, meaning that 4 is not
> really a regular write as it is the first one after a checkpoint?

I think you nailed it.

> But it doesn't seem safe to me to replace a page from the DW buffer and
> then apply WAL to that replaced page which preceded the age of the
> page in the buffer.

That's what LSNs are for.
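
To make that concrete, here is a minimal sketch of LSN-guarded replay. It is an illustration only, not PostgreSQL code: the names Page, WalRecord and redo are hypothetical, and each WAL record is assumed to carry the full new page contents rather than a delta.

```python
from dataclasses import dataclass


@dataclass
class Page:
    lsn: int = 0      # LSN of the last WAL record reflected in this page
    data: str = ""


@dataclass
class WalRecord:
    lsn: int
    page_id: str
    new_data: str     # simplification: a full new image, not a delta


def redo(pages, wal):
    """Replay WAL in LSN order, skipping records the page already reflects.

    Returns the list of LSNs that were actually applied.
    """
    applied = []
    for rec in sorted(wal, key=lambda r: r.lsn):
        page = pages.setdefault(rec.page_id, Page())
        if rec.lsn <= page.lsn:
            # The page is already at or past this record, so applying it
            # again would move the page backwards. Skip it.
            continue
        page.data = rec.new_data
        page.lsn = rec.lsn
        applied.append(rec.lsn)
    return applied
```

If the page restored from the DW buffer is newer than some of the WAL being replayed, those older records are simply skipped, and only records with a higher LSN are applied.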

If we write the page to the double-write buffer just once per
checkpoint, recovery can restore the double-written versions of the
pages and then begin WAL replay, which will restore all the subsequent
changes made to the page.  Recovery may also need to do additional
double-writes if it encounters pages for which we wrote WAL but
never flushed the buffer, because a crash during recovery can also
create torn pages.  When we reach a restartpoint, we fsync everything
down to disk and then nuke the double-write buffer.  Similarly, in
normal running, we can nuke the double-write buffer at checkpoint
time, once the fsyncs are complete.
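
The lifecycle described above can be sketched roughly as follows. This is a toy model under stated assumptions, not a proposed implementation: Store, write_page, checkpoint and recover are hypothetical names, WAL records are (lsn, page_id, data) tuples holding full page contents, and a torn heap page is represented by None. The extra double-writes that recovery itself might need are not modeled.

```python
TORN = None  # marker for a heap page that was only partially written


class Store:
    def __init__(self):
        self.heap = {}       # page_id -> (lsn, data), or TORN
        self.dw_buffer = {}  # page_id -> (lsn, data), intact copies

    def write_page(self, page_id, lsn, data, torn=False):
        # Only the first write of a page since the last checkpoint goes
        # through the double-write buffer.
        if page_id not in self.dw_buffer:
            self.dw_buffer[page_id] = (lsn, data)
        self.heap[page_id] = TORN if torn else (lsn, data)

    def checkpoint(self):
        # Assumes all heap fsyncs have completed, so the saved copies
        # are no longer needed.
        self.dw_buffer.clear()

    def recover(self, wal):
        # 1. Restore the double-written versions over the heap pages.
        for page_id, copy in self.dw_buffer.items():
            self.heap[page_id] = copy
        # 2. Replay WAL in LSN order, applying only newer records.
        for lsn, page_id, data in sorted(wal):
            current = self.heap.get(page_id, (0, ""))
            if current is TORN:
                raise RuntimeError("torn page with no double-write copy")
            if lsn > current[0]:
                self.heap[page_id] = (lsn, data)
        # 3. Restartpoint: everything is on disk, nuke the buffer.
        self.checkpoint()
```

In this model, a page that is double-written at LSN 10 and later torn by a plain write at LSN 20 is restored from the buffer at LSN 10, and replay then brings it forward through every later record.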

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

