Re: Enabling Checksums - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Enabling Checksums
Date
Msg-id 513503C0.3030809@vmware.com
In response to Re: Enabling Checksums  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Enabling Checksums  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 04.03.2013 18:00, Jeff Davis wrote:
> On Mon, 2013-03-04 at 10:36 +0200, Heikki Linnakangas wrote:
>> On 04.03.2013 09:11, Simon Riggs wrote:
>>> Are there objectors?
>>
>> FWIW, I still think that checksumming belongs in the filesystem, not
>> PostgreSQL.
>
> Doing checksums in the filesystem has some downsides. One is that you
> need to use a copy-on-write filesystem like btrfs or zfs, which (by
> design) will fragment the heap on random writes.

Yeah, fragmentation will certainly hurt some workloads. But how badly, 
and which workloads, and how does that compare with the work that 
PostgreSQL has to do to maintain the checksums? I'd like to see some 
data on those things.

> There are also other issues, like what fraction of our users can freely
> move to btrfs, and when. If it doesn't happen to be already there, you
> need root to get it there, which has never been a requirement before.

If you're serious enough about your data that you want checksums, you 
should be able to choose your filesystem.

>> If you go ahead with this anyway, at the very least I'd like
>> to see some sort of a comparison with e.g. btrfs. How do performance,
>> error-detection rate, and behavior on error compare? Any other metrics
>> that are relevant here?
>
> I suspect it will be hard to get an apples-to-apples comparison here
> because of the heap fragmentation, which means that a sequential scan is
> not so sequential. That may be acceptable for some workloads but not for
> others, so it would get tricky to compare.

An apples-to-apples comparison is to run the benchmark and see what 
happens. If it gets fragmented as hell on btrfs, and performance tanks 
because of that, then that's your result. If avoiding fragmentation is 
critical to the workload, then with btrfs you'll want to run the 
defragmenter in the background to keep it in order, and factor that into 
the test case.

I realize that performance testing is laborious. But we can't skip it 
and simply assume that the patch performs fine just because it's hard 
to benchmark.

- Heikki


