Re: 16-bit page checksums for 9.2 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: 16-bit page checksums for 9.2
Date
Msg-id 4F4E4484.9010203@enterprisedb.com
In response to Re: 16-bit page checksums for 9.2  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: 16-bit page checksums for 9.2  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 29.02.2012 17:01, Simon Riggs wrote:
> On Wed, Feb 29, 2012 at 2:40 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> On 22.02.2012 14:30, Simon Riggs wrote:
>>>
>>> Agreed. No reason to change a checksum unless we rewrite the block, no
>>> matter whether page_checksums is on or off.
>>
>> This can happen:
>>
>> 1. checksums are initially enabled. A page is written, with a correct
>> checksum.
>> 2. checksums turned off.
>> 3. A hint bit is set on the page.
>> 4. While the page is being written out, someone pulls the power cord, and
>> you get a torn write. The hint bit change made it to disk, but the clearing
>> of the checksum in the page header did not.
>> 5. Sometime after restart, checksums are turned back on.
>>
>> The page now has an incorrect checksum on it. The next time it's read, you
>> get a checksum error.
>
> Yes, you will. And you'll get a checksum error because the block no
> longer passes. So an error should be reported.
>
> We can and should document that turning this on/off/on can cause
> problems. Hopefully crashing isn't that common a situation.

Hopefully not, but then again, you get interested in fiddling with this
setting precisely when you do experience crashes. This feature needs to
be trustworthy in the face of crashes.
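
To make the five-step scenario quoted above concrete, here is a minimal sketch that replays it. Everything in it is an assumption for illustration: the Page layout, the has_checksum flag, the torn_write helper, and the stand-in checksum (low 16 bits of CRC-32) are hypothetical and are not taken from the patch.

```python
import zlib
from dataclasses import dataclass


def checksum16(body: bytes) -> int:
    # Stand-in 16-bit checksum (low 16 bits of CRC-32); NOT the patch's algorithm.
    return zlib.crc32(body) & 0xFFFF


@dataclass
class Page:
    has_checksum: bool  # hypothetical per-page flag kept in the header
    checksum: int       # header field, meaningful only if has_checksum is set
    body: bytes         # tuple data, including hint bits


def make_page(body: bytes, checksums_enabled: bool) -> Page:
    # With checksums off, the writer clears the flag and the checksum field.
    if checksums_enabled:
        return Page(True, checksum16(body), body)
    return Page(False, 0, body)


def torn_write(on_disk: Page, new: Page) -> Page:
    # Assumed failure mode: the body sector reaches the disk,
    # the header sector does not and keeps its old contents.
    return Page(on_disk.has_checksum, on_disk.checksum, new.body)


def verify(page: Page, checksums_enabled: bool) -> bool:
    # Verification trusts the per-page flag to decide whether to check.
    if not checksums_enabled or not page.has_checksum:
        return True
    return checksum16(page.body) == page.checksum


def replay_scenario() -> Page:
    disk = make_page(b"tuple data, hint=0", checksums_enabled=True)      # step 1
    updated = make_page(b"tuple data, hint=1", checksums_enabled=False)  # steps 2-3
    return torn_write(disk, updated)                                     # step 4


page = replay_scenario()
page_ok_after_reenable = verify(page, checksums_enabled=True)            # step 5
```

The page on disk still carries the old flag and the old checksum, but the new body, so verification fails after checksums are re-enabled even though the hardware did nothing wrong.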

>> I'm pretty uncomfortable with this idea of having a flag on the page itself
>> to indicate whether it has a checksum or not. No matter how many bits we use
>> for that flag. You can never be quite sure that all your data is covered by
>> the checksum, and there's a lot of room for subtle bugs like the above,
>> where a page is reported as corrupt when it isn't, or vice versa.
>
> That is necessary to allow upgrade. It's not there for any other reason.

Understood. I'm uncomfortable with it regardless of why it's there.

>> This thing needs to be reliable and robust. The purpose of a checksum is to
>> have an extra sanity check, to detect faulty hardware. If it's complicated,
>> whenever you get a checksum mismatch, you'll be wondering if you have broken
>> hardware or if you just bumped into a PostgreSQL bug. I think you need a flag
>> in pg_control or somewhere to indicate whether checksums are currently
>> enabled or disabled, and a mechanism to scan and rewrite all the pages with
>> checksums, before they are verified.
>
> That would require massive downtime, so again, it has been ruled out
> for practicality.

Surely it can be done online. You'll just need a third state between off 
and on, where checksums are written but not verified, while the cluster 
is scanned.
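
A rough sketch of that three-state idea, under stated assumptions: the ChecksumMode names, the Cluster class, and enable_online are hypothetical, the mode stands in for a cluster-wide setting such as a flag in pg_control, and the checksum is again a stand-in (low 16 bits of CRC-32).

```python
import zlib
from enum import Enum


class ChecksumMode(Enum):
    OFF = "off"
    WRITE_ONLY = "write-only"  # hypothetical name for the intermediate state
    ON = "on"


def checksum16(body: bytes) -> int:
    # Stand-in 16-bit checksum; NOT the algorithm from the patch.
    return zlib.crc32(body) & 0xFFFF


class Cluster:
    def __init__(self, bodies):
        # Cluster-wide mode; there is no per-page "has checksum" flag.
        self.mode = ChecksumMode.OFF
        # page id -> (stored checksum or None, body)
        self.pages = {i: (None, body) for i, body in enumerate(bodies)}

    def write(self, page_id: int, body: bytes) -> None:
        # Checksums are written in both WRITE_ONLY and ON.
        if self.mode is ChecksumMode.OFF:
            self.pages[page_id] = (None, body)
        else:
            self.pages[page_id] = (checksum16(body), body)

    def read(self, page_id: int) -> bytes:
        stored, body = self.pages[page_id]
        # Checksums are verified only once the whole cluster is covered.
        if self.mode is ChecksumMode.ON and stored != checksum16(body):
            raise IOError(f"checksum mismatch on page {page_id}")
        return body

    def enable_online(self) -> None:
        # Enter the intermediate state, rewrite every page, then verify.
        self.mode = ChecksumMode.WRITE_ONLY
        for page_id in list(self.pages):
            _, body = self.pages[page_id]
            self.write(page_id, body)
        self.mode = ChecksumMode.ON
```

In this model a crash during the scan leaves the cluster in the write-only state, where nothing is verified yet, so the scan can simply be restarted.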

>> I've said this before, but I still don't like the hacks with the version
>> number in the page header. Even if it works, I would much prefer the
>> straightforward option of extending the page header for the new field. Yes,
>> it means you have to deal with pg_upgrade, but it's a hurdle we'll have to
>> jump at some point anyway.
>
> What you suggest might happen in the next release, or maybe longer.
> There may be things that block it completely, so it might never
> happen. My personal opinion is that it is not possible to make further
> block format changes until we have a fully online upgrade process,
> otherwise we block people from upgrading - not everybody can take
> their site down to run pg_upgrade. I plan to work on that, but it may
> not happen for 9.3;

Yep, we should bite the bullet and work on that.

> perhaps you will object to that also when it comes.

Heh, quite possible :-). But only if there's a reason. I do want to see
a solution to this; I have a few page-format-changing ideas myself that
I'd like to implement at some point.

> This patch is very specifically something that makes the best of the
> situation, now, for those that want and need it. If you don't want it,
> you don't have to use it.

Whether I use it or not, I'll have to live with it in the source tree.

> But that shouldn't stop us giving it to the people that do want it.
>
> I'm hearing general interest and support for this feature from people
> that run their business on PostgreSQL.

If you ask someone "would you like to have checksums in PostgreSQL?", 
he'll say "sure". If you ask him "would you like to keep the PostgreSQL 
source tree as simple as possible, to make it easy for new developers to 
join the effort?", he'll say "yes". It's all about how you frame the 
question. Even if you want to have checksums on data pages, it doesn't 
necessarily mean you want them so badly you can't wait another release 
or two for a cleaner solution, or that you'd be satisfied with the 
implementation proposed here.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

