Re: 9.3: summary of corruption detection / checksums / CRCs discussion - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: 9.3: summary of corruption detection / checksums / CRCs discussion
Msg-id 1335052711.25680.112.camel@jdavis
In response to Re: 9.3: summary of corruption detection / checksums / CRCs discussion  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Sun, 2012-04-22 at 00:08 +0100, Greg Stark wrote:
> On Sat, Apr 21, 2012 at 10:40 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> > * In addition to detecting random garbage, we also need to be able to
> > detect zeroing of pages. Right now, a zero page is not considered
> > corrupt, so that's a problem. We'll need to WAL table extension
> > operations, and we'll need to mitigate the performance impact of doing
> > so. I think we can do that by extending larger tables by many pages
> > (say, 16 at a time) so we can amortize the cost of WAL and avoid
> > contention.
> 
> I haven't seen this come up in discussion.

I don't have any links, and it might just be based on in-person
discussions. I think it's just being left as a loose end for later, but
it will eventually need to be solved.
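
To make the zero-page problem concrete, here is a minimal Python sketch.
It is only an illustration, not PostgreSQL code: the 4-byte checksum slot,
the use of zlib.crc32, and the names make_page and page_looks_valid are all
assumptions made for the example. The one behavior taken from the
discussion is that an all-zero page is accepted as valid.

```python
import zlib

PAGE_SIZE = 8192      # PostgreSQL's default block size
CHECKSUM_BYTES = 4    # hypothetical: checksum kept in the first 4 bytes


def make_page(payload: bytes) -> bytes:
    """Build a toy page: checksum slot followed by zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - CHECKSUM_BYTES, b"\x00")
    return zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "big") + body


def page_looks_valid(page: bytes) -> bool:
    """Toy verifier that mirrors the 'zero page is not corrupt' rule."""
    if page == bytes(PAGE_SIZE):
        # An all-zero page is treated as a freshly extended page,
        # so it is accepted without any checksum comparison.
        return True
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return stored == zlib.crc32(page[CHECKSUM_BYTES:])


good = make_page(b"tuple data")
garbage = good[:100] + b"\xff" + good[101:]   # one byte of random damage
zeroed = bytes(PAGE_SIZE)                     # the page was wiped to zeros
```

In this sketch the damaged page is rejected, but the zeroed page passes,
because the verifier has no way to tell "never written" from "wiped". That
is why the extension itself would need to be recorded somewhere, such as in
WAL.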

> WAL logging table
> extensions wouldn't by itself work because currently we treat the file
> size on disk as the size of the table. So you would have to do the
> extension in the critical section or else different backends might see
> the wrong file size and write out conflicting wal entries.

By "critical section", I assume you mean "while holding the relation
extension lock" not "between START_CRIT_SECTION() and
END_CRIT_SECTION()", right?

There would be some synchronization overhead, to be sure, but I think it
can be done. Ideally, we'd be able to do large enough extensions that,
if there is a parallel bulk load on a single table or something, the
overhead could be made insignificant.
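
As a rough sketch of what I have in mind (Python, purely illustrative):
the ToyRelation class, its fields, and the run_demo helper are all
hypothetical, and it assumes one WAL record per extension. The WAL record
and the size change happen under the same extension lock, so two backends
cannot log conflicting extensions. The sketch still takes the lock for
every block handed out, so it only shows the WAL amortization, not the
reduced contention.

```python
import threading


class ToyRelation:
    """Toy model of a relation that is extended several pages at a time."""

    def __init__(self, pages_per_extension: int):
        self.pages_per_extension = pages_per_extension
        self.extension_lock = threading.Lock()  # stands in for the extension lock
        self.nblocks = 0        # stands in for the file size on disk
        self.free_blocks = []   # blocks already extended but not handed out
        self.wal = []           # one entry per logged extension

    def get_new_block(self) -> int:
        with self.extension_lock:
            if not self.free_blocks:
                first = self.nblocks
                # Log the extension and grow the "file" under the same lock.
                self.wal.append(("extend", first, self.pages_per_extension))
                self.nblocks += self.pages_per_extension
                self.free_blocks = list(range(first, self.nblocks))
            return self.free_blocks.pop(0)


def run_demo(pages_per_extension: int, workers: int = 4, blocks_each: int = 100):
    """Simulate a parallel bulk load: each worker asks for new blocks."""
    rel = ToyRelation(pages_per_extension)
    got = []
    got_lock = threading.Lock()

    def worker():
        mine = [rel.get_new_block() for _ in range(blocks_each)]
        with got_lock:
            got.extend(mine)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rel, got
```

With the defaults, run_demo(16) hands out 400 distinct blocks while logging
25 extension records, where run_demo(1) would log 400.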

I didn't intend to get too far into the details in this thread, but if
it's a totally ridiculous or impossible idea, I'll remove it.

> The earlier consensus was to move all the hint bits to a dedicated
> area and exclude them from the checksum. I think double-write buffers
> seem to have become more fashionable but a summary that doesn't
> describe the former is definitely incomplete.

Thank you, that's the kind of omission I was looking to catch.
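
For the summary, the idea could be illustrated roughly like this (a Python
sketch under assumptions, not a proposed page layout): the 64-byte hint
area at the end of the page, the use of zlib.crc32, and the function names
are all hypothetical. The point is only that the checksum covers everything
except the dedicated hint-bit area, so setting a hint bit does not
invalidate it.

```python
import zlib

PAGE_SIZE = 8192   # PostgreSQL's default block size
HINT_AREA = 64     # hypothetical: hint bits live in the last 64 bytes


def page_checksum(page: bytes) -> int:
    """Checksum over the page, excluding the dedicated hint-bit area."""
    return zlib.crc32(page[:PAGE_SIZE - HINT_AREA])


def demo():
    page = bytearray(PAGE_SIZE)
    page[0:10] = b"tuple data"
    before = page_checksum(bytes(page))

    page[PAGE_SIZE - 1] |= 0x01          # set a hint bit in the excluded area
    after_hint = page_checksum(bytes(page))

    page[5] ^= 0xFF                      # damage a byte of real data
    after_data = page_checksum(bytes(page))
    return before, after_hint, after_data
```

Here the checksum is unchanged after the hint-bit update and changes after
the data byte is damaged.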

> That link points to the MVCC-safe truncate patch. I don't follow how
> optimizations in bulk loads are relevant to wal logging hint bit
> updates.

I should have linked to these messages:
http://archives.postgresql.org/message-id/CA+TgmoYLOzDezzJKyJ8_x2bPeEerAo5dJ-OMvS1fLQOQSQP5jg@mail.gmail.com
http://archives.postgresql.org/message-id/CA+Tgmoa4Xs1jbZhm=pb9Xi4AGMJXRB2a4GSE9EJtLo=70Zne=g@mail.gmail.com

Though perhaps I'm reading too much into Robert's comments.

Regards,
Jeff Davis



