
Re: corrupt pages detected by enabling checksums - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: corrupt pages detected by enabling checksums
Date	April 4, 2013 12:30:57
Msg-id	CA+U5nMJupKK_hRwXy1JOxxQnS6TeY2gmg5eMbV--uSa5CACE1g@mail.gmail.com
In response to	Re: corrupt pages detected by enabling checksums (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: corrupt pages detected by enabling checksums Re: corrupt pages detected by enabling checksums
List	pgsql-hackers


On 4 April 2013 02:39, Andres Freund <andres@2ndquadrant.com> wrote:

> Ok, I think I see the bug. And I think its been introduced in the
> checkpoints patch.

Well spotted. (I think you mean checksums patch).

> If by now the first backend has proceeded to PageSetLSN() we are writing
> different data to disk than the one we computed the checksum of
> before. Boom.

Right, so nothing else we were doing was wrong; that's why we couldn't
spot a bug. The problem is that we aren't replaying enough WAL because
the checksum on the WAL record is broken.

> I think the whole locking interactions in MarkBufferDirtyHint() need to
> be thought over pretty carefully.

When we write out a buffer with checksums enabled, we take a copy of
the buffer so that the checksum is consistent, even while other
backends may be writing hints to the same buffer.

I missed out on doing that with XLOG_HINT records, so the WAL CRC can
be incorrect because the data is scanned twice; normally that would be
OK because we have an exclusive lock on the block, but with hints we
only have share lock. So what we need to do is take a copy of the
buffer before we do XLogInsert().
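To make the double-scan problem concrete, here is a minimal standalone
sketch. It is not the attached patch and uses no PostgreSQL code: the
page size, the toy checksum (FNV-1a rather than the WAL CRC), and the
names log_page_racy(), log_page_with_copy() and set_hint_bit() are all
hypothetical. The callback stands in for another backend setting a hint
bit while we hold only a share lock.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192          /* illustrative page size (assumption) */

/* Hypothetical callback standing in for a concurrent backend that sets a
 * hint bit while we hold only a share lock on the page. */
typedef void (*concurrent_writer)(uint8_t *shared_page);

/* Toy checksum (FNV-1a); NOT the WAL CRC or the real page checksum. */
static uint32_t toy_checksum(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Racy: the shared page is scanned twice (checksum, then copy-out), so a
 * hint written in between makes the logged bytes disagree with the sum. */
static uint32_t log_page_racy(uint8_t *shared_page, uint8_t *out,
                              concurrent_writer interfere)
{
    assert(shared_page != NULL && out != NULL);
    uint32_t sum = toy_checksum(shared_page, PAGE_SIZE);  /* first scan */
    if (interfere)
        interfere(shared_page);           /* hint bit set between scans */
    memcpy(out, shared_page, PAGE_SIZE);  /* second scan sees new data */
    return sum;
}

/* Safe: read the shared page once into a private copy, then checksum and
 * emit that copy. Later hint writes cannot change what we logged. */
static uint32_t log_page_with_copy(uint8_t *shared_page, uint8_t *out,
                                   concurrent_writer interfere)
{
    assert(shared_page != NULL && out != NULL);
    uint8_t copy[PAGE_SIZE];
    memcpy(copy, shared_page, PAGE_SIZE); /* single read of shared page */
    uint32_t sum = toy_checksum(copy, PAGE_SIZE);
    if (interfere)
        interfere(shared_page);           /* no longer affects our data */
    memcpy(out, copy, PAGE_SIZE);
    return sum;
}

/* Simulated hint-bit update: set one bit in the shared page. */
static void set_hint_bit(uint8_t *shared_page)
{
    shared_page[100] |= 0x01;
}
```

Starting from a zeroed page and passing set_hint_bit as the callback,
the racy variant returns a checksum that no longer matches the bytes it
emitted, while the copying variant's checksum still matches. The cost is
one extra page-sized memcpy per logged page.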

Simple patch to do this attached for discussion. (Not tested).

We might also do this by modifying the WAL record to take the whole
block and bypass the BkpBlock mechanism entirely. But that's more work
and doesn't seem like it would be any cleaner. I figure let's solve the
problem first, then discuss which approach is best.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

copy_before_XLOG_HINT.v1.patch
