Re: [DESIGN] Incremental checksums - Mailing list pgsql-hackers

From David Christensen
Subject Re: [DESIGN] Incremental checksums
Date
Msg-id A69C6D70-50D9-403A-9259-D5F460138082@endpoint.com
In response to Re: [DESIGN] Incremental checksums  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: [DESIGN] Incremental checksums
List pgsql-hackers
> On Jul 13, 2015, at 3:50 PM, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>
> On 7/13/15 3:26 PM, David Christensen wrote:
>> * Incremental Checksums
>>
>> PostgreSQL users should have a way of upgrading their cluster to use data checksums without having to do a costly
>> pg_dump/pg_restore; in particular, checksums should be able to be enabled/disabled at will, with the database
>> enforcing the logic of whether the pages considered for a given database are valid.
>>
>> Considered approaches for this are having additional flags to pg_upgrade to set up the new cluster to use checksums
>> where they did not before (or optionally turning these off).  This approach is a nice tool to have, but in order to
>> be able to support this process in a manner which has the database online while the database is going through the
>> initial checksum process.
>
> It would be really nice if this could be extended to handle different page formats as well, something that keeps
> rearing its head. Perhaps that could be done with the cycle idea you've described.

I had this thought too, but the main issue I saw was that new page formats are not guaranteed to take up the same
space/storage, so there is an inherent limit on the ability to restructure things *arbitrarily*; that being said,
there may be a use-case for the types of modifications that this approach *would* be able to handle.

> Another possibility is some kind of a page-level indicator of what binary format is in use on a given page. For
> checksums maybe a single bit would suffice (indicating that you should verify the page checksum). Another use case
> is using this to finally ditch all the old VACUUM FULL code in HeapTupleSatisfies*().

There’s already a page version field, no?  I assume that would be sufficient for the page format indicator.  I don’t
know about using a flag for verifying the checksum, as setting it already modifies the page that is to be checksummed,
and we want to avoid rewriting a bunch of pages unnecessarily, no?  And you’d presumably need to clear that state
again, which would be an additional write.  This was the issue that the checksum cycle was meant to handle, since we
store this information in the system catalogs and the types of modifications here would be idempotent.
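
To make the trade-off concrete, here is a rough Python model (not PostgreSQL code). It assumes the usual page header
layout, with a 16-bit pd_flags field at byte offset 10 and a 16-bit pd_pagesize_version field at byte offset 18 whose
low byte is the layout version, and it uses little-endian byte order purely for illustration. The flag bit 0x0008 and
all of the function names are hypothetical.

```python
# Illustrative model only; this is not PostgreSQL source code.
import struct

PAGE_SIZE = 8192
FLAGS_OFFSET = 10      # assumed offset of the 16-bit pd_flags field
VERSION_OFFSET = 18    # assumed offset of the 16-bit pd_pagesize_version field
HYPOTHETICAL_VERIFY_CHECKSUM = 0x0008  # hypothetical flag bit, not a real PD_* flag


def make_page(version: int = 4) -> bytearray:
    """Build an otherwise empty page with the size and layout version filled in."""
    page = bytearray(PAGE_SIZE)
    struct.pack_into("<H", page, VERSION_OFFSET, PAGE_SIZE | version)
    return page


def page_layout_version(page: bytes) -> int:
    """Read the layout version from the low byte of pd_pagesize_version."""
    (size_version,) = struct.unpack_from("<H", page, VERSION_OFFSET)
    return size_version & 0x00FF


def set_verify_flag(page: bytearray) -> bool:
    """Set the hypothetical flag bit; return True if the page bytes changed."""
    (flags,) = struct.unpack_from("<H", page, FLAGS_OFFSET)
    new_flags = flags | HYPOTHETICAL_VERIFY_CHECKSUM
    struct.pack_into("<H", page, FLAGS_OFFSET, new_flags)
    return new_flags != flags


page = make_page()
before = bytes(page)

# Reading the version field leaves the page untouched.
version = page_layout_version(page)
unchanged_after_read = bytes(page) == before

# Setting a per-page flag dirties the page, so it has to be written out again.
dirtied = set_verify_flag(page)
changed_after_flag = bytes(page) != before
```

In this model, reading the existing version field costs nothing, while marking each page with a flag changes the page
image and so implies a write per page (and another one to clear it later). Keeping the state outside the page avoids
both.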

David
--
David Christensen
PostgreSQL Team Manager
End Point Corporation
david@endpoint.com
785-727-1171







