Re: Enabling Checksums - Mailing list pgsql-hackers

From: Greg Smith
Subject: Re: Enabling Checksums
Msg-id: 5148AF3F.5040801@2ndQuadrant.com
In response to: Re: Enabling Checksums (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On 3/8/13 4:40 PM, Greg Stark wrote:
> On Fri, Mar 8, 2013 at 5:46 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> After some examination of the systems involved, we concluded that the
>> issue was the FreeBSD drivers for the new storage, which were unstable
>> and had custom source patches.  However, without PostgreSQL checksums,
>> we couldn't *prove* it wasn't PostgreSQL at fault.  It ended up taking
>> weeks of testing, most of which was useless, to prove to them they had a
>> driver problem so it could be fixed.  If Postgres had had checksums, we
>> could have avoided wasting a couple weeks looking for non-existent
>> PostgreSQL bugs.
>
> How would Postgres checksums have proven that?

It's hard to prove this sort of thing definitively.  I see this more as 
a source of evidence that can increase confidence that the database is 
doing the right thing, most usefully in a replication environment. 
Systems that care about data integrity nowadays are running with a WAL 
shipping replica of some sort.  Right now there's no way to grade the 
master vs. standby copies of data, to figure out which is likely to be 
the better copy.  In a checksum environment, here's a new 
troubleshooting workflow that becomes possible:

1) Checksum error happens on the master.
2) The same block is checked on the standby.  It has the same 16-bit 
checksum but different data, and its checksum matches its data.
3) The copy of that block on the standby, which was shipped over the 
network instead of being stored locally, is probably good.
4) The database must have been consistent when the data was in RAM on 
the master.
5) Conclusion:  there's probably something wrong at a storage layer 
below the database on the master.
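
To make step 2 concrete, here's a minimal sketch of what that comparison 
could look like using the pageinspect extension (assuming a 9.3-era build 
where page_header() reports the checksum).  The host names, database, 
relation, and block number are made-up placeholders, not commands from 
the case above:

    # Hypothetical example: block 1234 of "accounts" failed verification
    # on the master.  Requires the pageinspect extension and superuser.
    # ignore_checksum_failure lets the damaged page be read on the master
    # so its header can still be examined.
    psql -h master -d mydb -c "
        SET ignore_checksum_failure = on;
        SELECT lsn, checksum
          FROM page_header(get_raw_page('accounts', 1234));"

    # The standby's copy arrived over the network via WAL rather than the
    # suspect local storage stack; if its checksum matches its data, it
    # is the better candidate for the true contents of the block.
    psql -h standby -d mydb -c "
        SELECT lsn, checksum
          FROM page_header(get_raw_page('accounts', 1234));"

Comparing the page LSNs also shows whether the standby has replayed up to 
the last change to that block, which helps rule out simple replication 
lag as the explanation for a difference.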

Now, of course this doesn't automatically point the finger correctly in 
every possible corruption scenario.  But this example is a situation 
I've seen in the real world, when a bad driver flips a random bit in a 
block.  If Josh had been able to show his client that the standby server 
built from streaming replication was just fine, and that the corruption 
was limited to the master, that wouldn't *prove* the database isn't the 
problem.  But it would usefully shift the perception of which faults are 
likely and which are unlikely away from it.  Right now, when I see 
master/standby differences in data blocks, it's the old problem of 
telling the true time when you have two clocks.  Having a checksum helps 
pick the right copy when there is more than one and one of them has been 
corrupted by storage-layer issues.

> If I understand the performance issues right, the main problem is the
> extra round trip to the WAL log, which can require a sync.  Is that
> right?

I don't think this changes things such that there is a second fsync per 
transaction.  That is a worthwhile test workload to add, though.  Right 
now, the tests Jeff and I have run have specifically avoided systems with 
slow fsync, because you can't really measure the CPU/memory overhead very 
well if you're hitting the rotational latency bottleneck.
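
For what it's worth, here's a sketch of the kind of setup that keeps 
fsync latency out of the picture; the scale factor, client count, and run 
length are illustrative guesses rather than the settings from those runs.  
A select-only pgbench workload never waits on a WAL flush, so any 
slowdown relative to a cluster built without checksums is mostly the 
verification cost itself:

    # Build a test database bigger than shared_buffers so reads keep
    # pulling pages in and verifying their checksums.
    pgbench -i -s 100 bench

    # Read-only workload: no commits waiting on fsync, so the CPU/memory
    # overhead of checksumming isn't hidden behind rotational latency.
    pgbench -S -c 8 -j 8 -T 300 bench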

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


