Re: Page Checksums - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Page Checksums
Msg-id 4EEF8681.8020903@2ndQuadrant.com
In response to Re: Page Checksums  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 12/19/2011 07:50 AM, Robert Haas wrote:
> On Mon, Dec 19, 2011 at 6:10 AM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> The only sensible way to handle this is to change the page format as
>> discussed. IMHO the only sensible way that can happen is if we also
>> support an online upgrade feature. I will take on the online upgrade
>> feature if others work on the page format issues, but none of this is
>> possible for 9.2, ISTM.
> I'm not sure that I understand the dividing line you are drawing here.

There are three likely steps to reaching checksums:

1) Build a checksum mechanism into the database.  This is the 
straightforward part that multiple people have now done.

2) Rework hint bits to make the torn page problem go away.  Checksums go 
elsewhere? More WAL logging to eliminate the bad situations?  Eliminate 
some types of hint bit writes?  It seems every alternative has 
trade-offs that will require serious performance testing to really validate.
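
To make the torn page problem in (2) concrete, here is a minimal sketch in Python. It is an illustration only: the 4-byte CRC at the start of the page, the 512-byte atomic write unit, and the helper names are assumptions made for the example, not PostgreSQL's page layout or code. It shows a hint bit being set without WAL, the checksum being updated to match, and a partial write leaving the new checksum next to the old page contents.

```python
import zlib

PAGE_SIZE = 8192   # typical page size; used here only for illustration
SECTOR = 512       # assumed atomic write unit of the storage device
CSUM_LEN = 4       # hypothetical: checksum stored in the first 4 bytes


def with_checksum(body: bytes) -> bytes:
    """Return a page image: CRC32 of the body followed by the body."""
    return zlib.crc32(body).to_bytes(CSUM_LEN, "little") + body


def verify(page: bytes) -> bool:
    """Check the stored checksum against the rest of the page."""
    stored = int.from_bytes(page[:CSUM_LEN], "little")
    return stored == zlib.crc32(page[CSUM_LEN:])


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first sectors of `new` hit disk."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]


# Page as it currently sits on disk, with a valid checksum.
old_body = bytearray(PAGE_SIZE - CSUM_LEN)
on_disk = with_checksum(bytes(old_body))

# A hint bit is set far from the header. No WAL record is written,
# so there is no full-page image to repair the page from later.
new_body = bytearray(old_body)
new_body[6000] |= 0x01
in_memory = with_checksum(bytes(new_body))

# Crash mid-write: the sector holding the new checksum reaches disk,
# the sector holding the hint bit does not.
torn = torn_write(on_disk, in_memory, sectors_written=1)

print(verify(on_disk), verify(in_memory), verify(torn))
```

The torn page fails verification even though no user data was lost, which is why the alternatives listed in (2) all involve changing where checksums live, what gets WAL-logged, or which hint bit writes happen at all.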

3) Finally tackle in-place upgrades that include a page format change.  
One basic mechanism was already outlined:  a page converter that knows 
how to handle two page formats, some metadata to track which pages have 
been converted, a daemon to do background conversions.  Simon has some 
new ideas here too ("online upgrade" involves two clusters kept in sync 
on different versions, slightly different concept than the current 
"in-place upgrade").  My recollection is that the in-place page upgrade 
work was pushed out of the critical path before due to lack of immediate 
need.  It wasn't necessary until a) a working catalog upgrade tool was 
validated and b) a bite-size feature change to test it on appeared.  We 
have (a) now in pg_upgrade, and CRCs could be (b)--if the hint bit 
issues are sorted first.
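
The basic mechanism described in (3) can be sketched roughly as follows. This is a toy model under stated assumptions, not a design: the `Page` and `Relation` classes, the version numbers, and the `converted` set standing in for the conversion metadata are all hypothetical names invented for the example.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Optional, Set

OLD_VERSION, NEW_VERSION = 1, 2   # hypothetical page format versions


@dataclass
class Page:
    version: int
    payload: bytes
    checksum: Optional[int] = None   # only present in the new format


def convert(page: Page) -> Page:
    """Page converter: accepts either format, always returns the new one."""
    if page.version == NEW_VERSION:
        return page
    return Page(NEW_VERSION, page.payload, zlib.crc32(page.payload))


@dataclass
class Relation:
    pages: List[Page]
    converted: Set[int] = field(default_factory=set)   # conversion metadata

    def read(self, n: int) -> Page:
        """Readers see the new format regardless of what is stored."""
        return convert(self.pages[n])

    def background_pass(self, batch: int) -> int:
        """One step of the background daemon: rewrite up to `batch`
        not-yet-converted pages and record them in the metadata."""
        done = 0
        for n, page in enumerate(self.pages):
            if done == batch:
                break
            if n in self.converted:
                continue
            self.pages[n] = convert(page)
            self.converted.add(n)
            done += 1
        return done


rel = Relation([Page(OLD_VERSION, b"a"),
                Page(OLD_VERSION, b"b"),
                Page(OLD_VERSION, b"c")])
rel.background_pass(batch=2)   # converts pages 0 and 1; page 2 stays old
```

Reads work throughout because the converter understands both formats; the metadata only exists so the background pass knows what is left and when the old format can be retired.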

What Simon was saying is that he's got some interest in (3), but wants 
no part of (2).

I don't know how much time each of these will take.  I would expect that 
(2) and (3) have similar scopes though--many days, possibly a few 
months, of work--which means they both dwarf (1).  The part that's been 
done is the visible tip of a mostly underwater iceberg.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
