Re: corrupt pages detected by enabling checksums - Mailing list pgsql-hackers

From Andres Freund
Subject Re: corrupt pages detected by enabling checksums
Date
Msg-id 20130404203948.GA16011@awork2.anarazel.de
In response to Re: corrupt pages detected by enabling checksums  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: corrupt pages detected by enabling checksums
List pgsql-hackers
On 2013-04-04 12:59:36 -0700, Jeff Davis wrote:
> Andres,
> 
> Thank you for diagnosing this problem!
> 
> On Thu, 2013-04-04 at 16:53 +0200, Andres Freund wrote:
> > I think the route you quickly sketched is more realistic. That would
> > remove all knowledge about XLOG_HINT from generic code which is a very
> > good thing, I spent like 15 minutes yesterday wondering whether the early
> > return in there might be the cause of the bug...
> 
> I like this approach. It may have some performance impact though,
> because there are a couple extra spinlocks taken, and an extra memcpy.

I don't think it's really slower. Previously the code took WalInsertLock
every time, even if we ended up not logging anything. That's far more
expensive than a single spinlock. And the copy should only be made
when we actually need to log. So I think we end up ahead of the current
state.

> The code looks good to me except that we should be consistent about the
> page hole -- XLogCheckBuffer is calculating it, but then we copy the
> entire page. I don't think anything can change the size of the page hole
> while we have a shared lock on the buffer, so it seems OK to skip the
> page hole during the copy.

I don't think it can change either, but I doubt that there's a
performance advantage in not copying the hole. I'd guess the simpler
code ends up faster.
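
For readers following the thread, here is a minimal sketch of the two copy strategies being compared. It is not the PostgreSQL code: the function names, the SKETCH_PAGE_SIZE constant and the (hole_offset, hole_length) parameters are hypothetical, and an 8 kB page is assumed. It only illustrates that skipping the hole takes two memcpy calls plus offset arithmetic, whereas the full copy is a single fixed-size memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: pages are 8 kB. */
#define SKETCH_PAGE_SIZE 8192

/*
 * Hypothetical helper: copy a page while skipping the unused "hole"
 * that starts at hole_offset and is hole_length bytes long.
 * Returns the number of bytes written to dst.
 */
static size_t
copy_page_skip_hole(char *dst, const char *src,
                    size_t hole_offset, size_t hole_length)
{
    size_t tail;

    /* The hole must lie entirely inside the page. */
    assert(hole_offset <= SKETCH_PAGE_SIZE);
    assert(hole_length <= SKETCH_PAGE_SIZE - hole_offset);

    tail = SKETCH_PAGE_SIZE - hole_offset - hole_length;

    /* Part of the page before the hole. */
    memcpy(dst, src, hole_offset);
    /* Part of the page after the hole, packed directly behind it. */
    memcpy(dst + hole_offset, src + hole_offset + hole_length, tail);

    return hole_offset + tail;
}

/*
 * Hypothetical helper: the simpler alternative, one fixed-size copy
 * of the whole page, hole included.
 */
static size_t
copy_page_full(char *dst, const char *src)
{
    memcpy(dst, src, SKETCH_PAGE_SIZE);
    return SKETCH_PAGE_SIZE;
}
```

Whether the smaller copy or the single fixed-size memcpy is faster in practice would have to be measured; the sketch only shows the difference in code complexity.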

> Another possible approach is to drop the lock on the buffer and
> re-acquire it in exclusive mode after we find out we'll need to do
> XLogInsert. It means that MarkBufferDirtyHint may end up blocking for
> longer, but would avoid the memcpy. I haven't really thought through the
> details.

That sounds like it would be prone to deadlocks, so I'd rather not
go there.

Will write up a patch tomorrow.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


