[HACKERS] Online enabling of page level checksums - Mailing list pgsql-hackers

From Magnus Hagander
Subject [HACKERS] Online enabling of page level checksums
Date
Msg-id CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp=-7OJWBbcg@mail.gmail.com
Responses Re: [HACKERS] Online enabling of page level checksums  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Re: [HACKERS] Online enabling of page level checksums  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
So, that post I made about checksums certainly spurred a lot of discussion :) One summary is that life would be a lot easier if we could turn checksums on (and off) without re-initdbing.  I'm breaking out this question into this thread to talk about it separately.


I've been toying with a branch to work on this, but haven't had time to get it even to a compiling state. But instead of waiting until I have some code to show, let me outline the idea I had.

My general idea is this:

Take one bit in the checksum version field and make it mean "in progress". That means that checksums can now be "on", "off", or "in progress".

When checksums are "in progress", PostgreSQL will compute and write checksums whenever it writes out a buffer, but it will *not* verify checksums on read.
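The asymmetry between writes and reads in the "in progress" state can be sketched as below. This is a toy model: `toy_checksum`, `write_page`, and `read_page` are hypothetical names, and the CRC is only a stand-in for the real page checksum algorithm.

```python
import zlib


def toy_checksum(payload: bytes) -> int:
    # Stand-in for the real page checksum; assumption for illustration only.
    return zlib.crc32(payload) & 0xFFFF


def write_page(state: str, payload: bytes):
    """Return the (payload, checksum) pair that would go to disk."""
    if state in ("on", "in progress"):
        return payload, toy_checksum(payload)
    return payload, None  # "off": no checksum is stored


def read_page(state: str, payload: bytes, stored_checksum):
    """Verify only when checksums are fully "on"."""
    if state == "on" and stored_checksum != toy_checksum(payload):
        raise ValueError("checksum mismatch")
    return payload


# A page written before checksums existed has no valid checksum yet.
# Reading it while "in progress" must succeed; reading it while "on" must not.
old_page = (b"legacy data", None)
read_page("in progress", *old_page)
```

Skipping verification during "in progress" is what lets not-yet-rewritten pages stay readable while the conversion runs.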

This state would be set by calling a function (or an external command with the system shut down if need be - I can accept a restart for this, but I'd rather avoid it if possible).

This function would also launch a background worker. This worker would enumerate the entire database block by block. For each block, it reads the block and checks whether the checksum is set and correct. If it is, the block is left alone (because any further updates will keep it in a correct state while we're "in progress"). If not, the block is marked dirty and written out through the regular means, which will include computing and writing the checksum since we're "in progress". Something similar to vacuum cost delay would control how quickly it writes.
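The worker loop described above could look roughly like this. It is a simulation over in-memory pages, not backend code: the page representation, `batch_size`, and `delay_s` are assumptions, and the throttle is only loosely modeled on the vacuum cost delay idea.

```python
import time
import zlib


def toy_checksum(payload: bytes) -> int:
    # Stand-in for the real page checksum; assumption for illustration only.
    return zlib.crc32(payload) & 0xFFFF


def checksum_worker(pages, batch_size=2, delay_s=0.0, sleep=time.sleep):
    """Scan every page once; rewrite only pages whose checksum is missing or wrong.

    Returns the number of pages rewritten.
    """
    rewritten = 0
    for page in pages:
        if page["checksum"] == toy_checksum(page["data"]):
            continue  # already correct, nothing to write
        # "Mark dirty and write out": the write path computes the checksum.
        page["checksum"] = toy_checksum(page["data"])
        rewritten += 1
        if rewritten % batch_size == 0:
            sleep(delay_s)  # throttle after each batch of writes
    return rewritten


pages = [
    {"data": b"a", "checksum": None},               # never checksummed
    {"data": b"b", "checksum": toy_checksum(b"b")}, # already correct
    {"data": b"c", "checksum": 0x1FFFF},            # cannot match a 16-bit value
]
count = checksum_worker(pages)
```

Only the pages that fail the check are written, so the throttle applies to the write rate rather than the scan rate.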

Yes, this means the entire db will end up in the transaction log since everything is rewritten. That's not great, but for a lot of people that will be a trade they're willing to make since it's a one-time thing. Yes, this background process might take days or weeks - that's OK as long as it happens online.

Once the background worker is done, it flips the checksum state to "on", and the system starts verifying checksums as well.

If the system is interrupted before the background worker is done, it starts over from the beginning. Previously touched blocks will be read and verified, but not written (because their checksum is already correct). This will take time, but not re-generate the WAL.
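To illustrate why restarting from the beginning does not repeat the write volume, here is a small simulation of an interrupted pass followed by a full pass. The page model, `run_pass`, and `stop_after` are hypothetical, and the CRC stands in for the real checksum.

```python
import zlib


def toy_checksum(payload: bytes) -> int:
    # Stand-in for the real page checksum; assumption for illustration only.
    return zlib.crc32(payload) & 0xFFFF


def run_pass(pages, stop_after=None):
    """One worker pass. Returns (completed, pages_written)."""
    written = 0
    for n, page in enumerate(pages):
        if stop_after is not None and n >= stop_after:
            return False, written  # simulated interruption
        if page["checksum"] != toy_checksum(page["data"]):
            page["checksum"] = toy_checksum(page["data"])
            written += 1
    return True, written


pages = [{"data": bytes([i]), "checksum": None} for i in range(10)]
state = "in progress"

done, first = run_pass(pages, stop_after=4)  # interrupted after 4 pages
done, second = run_pass(pages)               # restart from the beginning
if done:
    state = "on"                             # only now start verifying on read

print(first, second, state)
```

The second pass reads all ten pages but writes only the six that the first pass never reached, so total writes equal the number of pages regardless of how many restarts occur.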

I think the actual functions and background worker could go in an extension that's installed and loaded only by those who need it. But the core functionality of being able to have "checksum in progress" would have to be in the core codebase.

So, is there something obviously missing in this plan? Or just the code to do it :)
