Re: 16-bit page checksums for 9.2 - Mailing list pgsql-hackers

From Greg Smith
Subject Re: 16-bit page checksums for 9.2
Date
Msg-id 4F0BC6E8.3070905@2ndQuadrant.com
In response to Re: 16-bit page checksums for 9.2  (Aidan Van Dyk <aidan@highrise.ca>)
List pgsql-hackers
On 12/30/11 9:44 AM, Aidan Van Dyk wrote:

> So moving to this new double-write-area bandwagon, we move from a "WAL
> FPW synced at the commit, collect as many other writes, then final
> sync" type system to a system where *EVERY* write requires syncs of 2
> separate 8K writes at buffer write-out time.

It's not quite that bad.  The double-write area is going to be a small 
chunk of re-used sequential I/O, like the current WAL.  And if this 
approach shifts some of the full-page writes out of the WAL and toward 
the new area instead, that's not a real doubling either.  You could
probably put both on the same disk, and in situations where you don't
have a battery-backed write cache it's possible to get a write to both
per rotation.
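
To make the ordering concrete, here is a minimal sketch of the double-write idea, not code from any patch.  It assumes a single-slot double-write file, an 8K page, and a CRC32 header; the function names and file layout are hypothetical.  The page goes to the small re-used area first and is synced, and only then is written to its home location, so a torn home page can be repaired from the intact copy.

```python
import os
import zlib

PAGE_SIZE = 8192   # assumed page size
HEADER_SIZE = 12   # 8-byte page number + 4-byte CRC32 (hypothetical layout)


def write_page_double(dw_fd, data_fd, page_no, page):
    """Write one page through the double-write area, then in place."""
    assert len(page) == PAGE_SIZE
    header = page_no.to_bytes(8, "little") + zlib.crc32(page).to_bytes(4, "little")
    # 1. Write to the small, re-used double-write slot and sync it.
    os.pwrite(dw_fd, header + page, 0)
    os.fsync(dw_fd)
    # 2. Only now overwrite the page at its home location and sync that.
    os.pwrite(data_fd, page, page_no * PAGE_SIZE)
    os.fsync(data_fd)


def recover(dw_fd, data_fd):
    """After a crash, restore the home page from the double-write copy.

    Returns the restored page number, or None if the copy itself is
    missing or torn (in which case the home page was never touched).
    """
    buf = os.pread(dw_fd, HEADER_SIZE + PAGE_SIZE, 0)
    if len(buf) < HEADER_SIZE + PAGE_SIZE:
        return None
    page_no = int.from_bytes(buf[:8], "little")
    crc = int.from_bytes(buf[8:12], "little")
    page = buf[HEADER_SIZE:]
    if zlib.crc32(page) != crc:
        return None
    os.pwrite(data_fd, page, page_no * PAGE_SIZE)
    os.fsync(data_fd)
    return page_no
```

Syncing once per page, as this sketch does, is the worst case described in the quoted text; batching several pages into the double-write area before each sync would amortize that cost.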

This idea has been tested pretty extensively as part of MySQL's InnoDB 
engine.  Results there suggest the overhead is in the 5% to 30% range; 
some examples mentioning both extremes of that:

http://www.mysqlperformanceblog.com/2006/08/04/innodb-double-write/
http://www.bigdbahead.com/?p=392

Makes me wish I knew off the top of my head how expensive WAL logging 
hint bits would be, for comparison's sake.


-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

