Re: Checkpoint cost, looks like it is WAL/CRC - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Checkpoint cost, looks like it is WAL/CRC
Date
Msg-id 200507071515.j67FF6B07875@candle.pha.pa.us
In response to Re: Checkpoint cost, looks like it is WAL/CRC  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Checkpoint cost, looks like it is WAL/CRC
List pgsql-hackers
Simon Riggs wrote:
> > SCSI tagged queueing certainly allows 512-byte blocks to be reordered
> > during writes.
> 
> Then a torn-page tell-tale is required that will tell us of any change
> to any of the 512-byte sectors that make up a block/page.
> 
> Here's an idea:
> 
> We read the page that we would have backed up, calc the CRC and write a
> short WAL record with just the CRC, not the block. When we recover we
> re-read the database page, calc its CRC and compare it with the CRC from
> the transaction log. If they differ, we know that the page was torn and
> we know the database needs recovery. (So we calc the CRC when we log AND
> when we recover).
> 
> This avoids the need to write full pages, though slightly slows down
> recovery.

Yes, that is a good idea!  That torn page thing sounded like a mess, and
I love that we can check them on recovery rather than whenever you
happen to access the page.
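
To make the idea concrete, here is a minimal sketch in C. It is not
PostgreSQL code: CrcWalRecord, log_page_crc(), page_is_torn() and
write_sectors() are hypothetical names made up for illustration, and the
8192-byte page and 512-byte sector sizes are assumptions. It also assumes
the logged CRC is that of the page image we expect to find on disk; which
image to log, and how to tell a torn page from a later complete write, is
not settled by this thread. The sketch only shows that one CRC over the
whole page flags a change in any of its 512-byte sectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE        8192   /* assumed database block size */
#define SECTOR_SIZE      512    /* assumed unit the disk writes atomically */
#define SECTORS_PER_PAGE (PAGE_SIZE / SECTOR_SIZE)

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but dependency-free. */
static uint32_t page_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical short WAL record: names the block, carries only its CRC. */
typedef struct {
    uint32_t block_no;
    uint32_t crc;
} CrcWalRecord;

/* At logging time: record the CRC of the page image instead of the image. */
static CrcWalRecord log_page_crc(uint32_t block_no, const unsigned char *page)
{
    CrcWalRecord rec = { block_no, page_crc32(page, PAGE_SIZE) };
    return rec;
}

/* At recovery time: re-read the page, recompute the CRC, compare. */
static int page_is_torn(const CrcWalRecord *rec, const unsigned char *disk_page)
{
    return page_crc32(disk_page, PAGE_SIZE) != rec->crc;
}

/* Simulate a write where only the sectors set in sector_mask reach the disk. */
static void write_sectors(unsigned char *disk_page,
                          const unsigned char *new_page,
                          unsigned sector_mask)
{
    assert((sector_mask >> SECTORS_PER_PAGE) == 0); /* no sectors past the page */
    for (unsigned s = 0; s < SECTORS_PER_PAGE; s++) {
        if (sector_mask & (1u << s))
            memcpy(disk_page + s * SECTOR_SIZE,
                   new_page + s * SECTOR_SIZE, SECTOR_SIZE);
    }
}

/* Returns 1 if a complete write passes the check and a torn write fails it. */
static int demo_torn_page_detected(void)
{
    static unsigned char old_page[PAGE_SIZE], new_page[PAGE_SIZE], disk[PAGE_SIZE];

    memset(old_page, 0xAA, PAGE_SIZE);
    memcpy(new_page, old_page, PAGE_SIZE);
    new_page[0] = 0x01;              /* change in the first sector */
    new_page[PAGE_SIZE - 1] = 0x02;  /* change in the last sector */

    CrcWalRecord rec = log_page_crc(7, new_page);

    /* All 16 sectors reach the disk: the CRC matches. */
    memcpy(disk, old_page, PAGE_SIZE);
    write_sectors(disk, new_page, 0xFFFFu);
    int complete_ok = !page_is_torn(&rec, disk);

    /* Only the first sector reaches the disk: the CRC differs. */
    memcpy(disk, old_page, PAGE_SIZE);
    write_sectors(disk, new_page, 0x0001u);
    int torn_seen = page_is_torn(&rec, disk);

    return complete_ok && torn_seen;
}
```

In the torn case the disk page differs from the logged image by a single
byte in the last sector, which a 32-bit CRC is guaranteed to catch. The
cost is one extra page read and CRC calculation per logged page at
recovery time, which is the slowdown Simon mentions.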

It would be great to implement this for when full_page_writes is off,
_and_ to have the full page writes happen when the page is written to
disk rather than when it is modified in the shared buffers.

I will add those to the TODO list now.  Updated item:

* Eliminate need to write full pages to WAL before page modification [wal]

  Currently, to protect against partial disk page writes, we write
  full page images to WAL before they are modified so we can correct any
  partial page writes during recovery.  These pages can also be
  eliminated from point-in-time archive files.

        o  -Add ability to turn off full page writes
        o  When off, write CRC to WAL and check file system blocks
           on recovery
        o  Write full pages during file system write and not when
           the page is modified in the buffer cache

           This allows most full page writes to happen in the background
           writer.
 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

