From: Greg Smith
Subject: Re: Enabling Checksums
Date:
Msg-id: 50A192AB.3070602@2ndQuadrant.com
In response to: Re: Enabling Checksums  (Markus Wanner <markus@bluegap.ch>)
Responses: Re: Enabling Checksums  (Markus Wanner <markus@bluegap.ch>)
List: pgsql-hackers
On 11/12/12 3:44 AM, Markus Wanner wrote:
> Sorry if that has been discussed before, but can't we do without that
> bit at all? It adds a checksum switch to each page, where we just agreed
> we don't even want a per-database switch.

Once you accept that eventually there need to be online conversion
tools, several of the potential implementations need an easy way to
distinguish which pages have already been processed.  The options seem
to be adding some bits just for that or bumping the page format.  I
would like to just bump the format, but that has a pile of its own
issues to work through.  I'd rather not make it a requirement for this
month's work.
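
To make the bit-based option concrete, here is a rough sketch of what
the page header check could look like.  The PD_CHECKSUMMED flag and the
trimmed-down header struct are made up for illustration; nothing like
them exists in the current page format:

#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the relevant PageHeaderData fields. */
typedef struct MiniPageHeader
{
    uint16_t    pd_flags;            /* flag bits */
    uint16_t    pd_pagesize_version; /* page size | layout version */
} MiniPageHeader;

/* Hypothetical new flag bit: page has been converted to carry a checksum. */
#define PD_CHECKSUMMED  0x0008

static bool
page_is_checksummed(const MiniPageHeader *page)
{
    return (page->pd_flags & PD_CHECKSUMMED) != 0;
}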

> Can we simply write a progress indicator to pg_control or someplace
> saying that all pages up to X of relation Y are supposed to have valid
> checksums?

All of the table-based checksum enabling ideas seem destined to add 
metadata to pg_class or something related to it for this purpose.  While 
I think everyone agrees that this is a secondary priority to getting 
basic cluster-level checksums going right now, I'd like to have at least 
a prototype for that before 9.3 development ends.
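
To give that a concrete shape, the per-relation metadata could look
roughly like the sketch below.  The struct and field names are invented
for illustration, not proposed catalog columns:

#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the usual PostgreSQL typedefs. */
typedef uint32_t Oid;
typedef uint32_t BlockNumber;

/* Hypothetical per-relation conversion state, pg_class-style. */
typedef struct RelChecksumProgress
{
    Oid         relid;              /* relation being converted */
    bool        relchecksums;       /* true once every block is converted */
    BlockNumber relchecksum_upto;   /* blocks below this are known converted */
} RelChecksumProgress;

/* Should a read of this block expect to verify a checksum? */
static bool
block_expects_checksum(const RelChecksumProgress *p, BlockNumber blkno)
{
    return p->relchecksums || blkno < p->relchecksum_upto;
}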

> I realize this doesn't support Jesper's use case of wanting to have the
> checksums only for newly dirtied pages. However, I'd argue that
> prolonging the migration to spread the load would allow even big shops
> to go through this without much of an impact on performance.

I'm thinking of this in some ways like the way creation of a new (but 
not yet valid) foreign key works.  Once that's active, new activity is 
immediately protected moving forward.  And eventually there's this 
cleanup step needed, one that you can inch forward over a few days.

The main upper limit on load spreading here is that the conversion
program may need to grab a snapshot.  In that case a conversion that
takes too long becomes a problem, since it blocks other vacuum activity
past that point.  This is why I think any good solution to this problem
needs to incorporate restartable conversion.  We were just getting
complaints recently about how losing a CREATE INDEX CONCURRENTLY
session causes the whole process to end and be started over.  The way
autovacuum runs right now, it can be stopped and restarted later, with
only a small amount of work repeated in many common cases.  If it's
possible to maintain that property for the checksum conversion, that
would be very helpful to larger sites.  It doesn't matter if adding
checksums to the old data takes a week when you throttle the load down,
so long as you're not forced to hold an open snapshot the whole time.
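
For what it's worth, the restartable shape I have in mind is roughly
the loop below.  Every helper name here is invented; the point is only
to show batching, persisted progress, and throttling, with no snapshot
held across batches:

#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

#define BATCH_SIZE      128     /* blocks converted per batch */
#define THROTTLE_MS     50      /* pause between batches to spread the load */

/* Hypothetical helpers a real implementation would have to supply. */
extern BlockNumber rel_nblocks(Oid relid);
extern BlockNumber progress_read(Oid relid);     /* first unconverted block */
extern void progress_write(Oid relid, BlockNumber next_block);
extern void checksum_block(Oid relid, BlockNumber blkno);
extern void sleep_ms(int ms);

void
convert_relation(Oid relid)
{
    BlockNumber nblocks = rel_nblocks(relid);
    BlockNumber next = progress_read(relid);    /* resume where we left off */

    while (next < nblocks)
    {
        BlockNumber stop = next + BATCH_SIZE;

        if (stop > nblocks)
            stop = nblocks;

        for (; next < stop; next++)
            checksum_block(relid, next);

        /* Persist progress so a crash or cancel loses at most one batch. */
        progress_write(relid, next);

        /* Throttle; nothing is pinned or held open while we sleep. */
        sleep_ms(THROTTLE_MS);
    }
}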

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


