Re: Block-level CRC checks - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Block-level CRC checks
Msg-id 20081113192540.GE4062@alvh.no-ip.org
In response to Re: Block-level CRC checks  (Aidan Van Dyk <aidan@highrise.ca>)
Responses Re: Block-level CRC checks  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Aidan Van Dyk wrote:
> 
> I think I'm missing something...
> 
> In this patch, I see you writing WAL records for hint-bits (bufmgr.c
> FlushBuffer).  But doesn't XLogInsert then make a "backup block" record (unless
> it's already got one since last checkpoint)?

I'm not causing a backup block to be written with that WAL record.  The
rationale is that it's not needed -- if there was a critical write to
the page, then there's already a backup block.  If the only write was a
hint bit being set, then the page cannot possibly be torn.
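
To make that rule concrete, here is a minimal sketch in Python of the
decision as described above. It is a toy model, not PostgreSQL code:
the names needs_backup_block, page_lsn, redo_ptr and hint_bit_only are
hypothetical, and it assumes full-page writes are enabled and that a
page needs a backup block only if it has not been WAL-logged since the
last checkpoint.

```python
def needs_backup_block(page_lsn: int, redo_ptr: int, hint_bit_only: bool) -> bool:
    """Toy model: should this WAL record carry a full-page image?

    page_lsn      -- LSN of the last WAL record that touched the page (assumed)
    redo_ptr      -- LSN where the last checkpoint's redo starts (assumed)
    hint_bit_only -- True if the record only notes a hint-bit change
    """
    if hint_bit_only:
        # The hint-bit record described in this mail never attaches a
        # backup block, regardless of the page's history.
        return False
    # An ordinary ("critical") write gets a backup block only when it is
    # the first logged change to the page since the last checkpoint.
    return page_lsn <= redo_ptr


if __name__ == "__main__":
    print(needs_backup_block(page_lsn=100, redo_ptr=200, hint_bit_only=False))
    print(needs_backup_block(page_lsn=300, redo_ptr=200, hint_bit_only=False))
    print(needs_backup_block(page_lsn=100, redo_ptr=200, hint_bit_only=True))
```

The third call is the interesting one: the page is untouched since the
checkpoint, yet no backup block is written because only a hint bit
changed.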

Now that I think about this, I wonder if this can cause problems in some
filesystems.  XFS, for example, zeroes out during recovery any block
that was written to but not fsync'ed before a crash.  This means that if
we change a hint bit after a checkpoint and mark the page dirty, the
system can write the page.  Suppose we crash at this point.  On
recovery, XFS will zero out the block, but there will be nothing with
which to recover it, because there's no backup block ...
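
The scenario can be walked through with a small Python simulation. This
is only a sketch under stated assumptions: it takes the filesystem
behaviour described above at face value (an un-fsync'ed block comes
back zeroed after a crash), it ignores ordinary WAL replay on top of a
restored image, and recover, simulate and PAGE_SIZE are hypothetical
names rather than anything in PostgreSQL or XFS.

```python
from typing import Optional

PAGE_SIZE = 8192  # assumed block size for the toy model


def recover(on_disk_page: bytes, backup_block: Optional[bytes]) -> Optional[bytes]:
    """Return the page contents after recovery, or None if they are lost."""
    if on_disk_page != bytes(PAGE_SIZE):
        # The block survived the crash intact; nothing to restore.
        return on_disk_page
    # The block came back zeroed: only a logged full-page image can help.
    return backup_block


def simulate(hint_bit_only: bool) -> Optional[bytes]:
    """Write a page after a checkpoint, crash before fsync, then recover."""
    original = bytes([1]) * PAGE_SIZE
    # A critical write logs a full-page image; the hint-bit record does not.
    backup_block = None if hint_bit_only else original
    # The page is written but not fsync'ed, the system crashes, and the
    # filesystem zeroes the block during its own recovery (assumption).
    on_disk_after_crash = bytes(PAGE_SIZE)
    return recover(on_disk_after_crash, backup_block)


if __name__ == "__main__":
    print("critical write recovered:", simulate(hint_bit_only=False) is not None)
    print("hint-bit write recovered:", simulate(hint_bit_only=True) is not None)
```

In this model the hint-bit-only case returns None, which is the
data-loss outcome being worried about here.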

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

