Re: WAL format changes break the suppression of do-nothing checkpoints. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WAL format changes break the suppression of do-nothing checkpoints.
Date
Msg-id 552E7611.1000907@iki.fi
Whole thread Raw
In response to WAL format changes break the suppression of do-nothing checkpoints.  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 03/31/2015 09:09 PM, Jeff Janes wrote:
> On Tue, Mar 31, 2015 at 6:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> On 03/30/2015 09:01 PM, Jeff Janes wrote:
>>
>>> commit 2c03216d831160bedd72d45f7 has invalidated the part of the docs
>>> saying "If no WAL has been written since the previous checkpoint, new
>>> checkpoints will be skipped even if checkpoint_timeout has passed",
>>> presumably by accident.
>>>
>>> It seems that this part is no longer true when it should be true:
>>>
>>>           if (curInsert == ControlFile->checkPoint +
>>>               MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint))
>>>
>>> MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint) is now 96, but the amount
>>> by
>>> which curInsert gets advanced is still 104, like it was before the commit.
>>
>> Hmm. Wasn't this a bit broken before too, when the checkpoint record
>> crosses a page boundary?
>
> But once the next one it was written, it wouldn't across a boundary
> anymore.  So you would get one extra checkpoint, rather than an infinite
> stream of them.  So a bit broken, but just a bit.

Yeah. It's not serious.

It occurs to me that that's actually nice from a robustness point of 
view: if something goes badly wrong and corrupt, you're more likely to 
be able to recover the last checkpoint record if it doesn't cross a page 
boundary. And more importantly, a segment boundary. OTOH, when you 
create a spurious checkpoint, you lose the "previous" checkpoint 
location. A good compromise is probably to not skip the checkpoint if 
the last one crossed a segment boundary. (Committed that way)

>> Instead of trying to calculate where the checkpoint record ends, I think
>> we could check that the prev-pointer points to the last checkpoint record.
>
> Wouldn't doing that involve reading WAL while not in recovery?

No. We keep track of the beginning position of the last inserted record 
at all times, because when the next record is inserted, we have to put 
that in the xl_prev field.

Committed a patch to do that.

- Heikki




pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Turning off HOT/Cleanup sometimes
Next
From: Alvaro Herrera
Date:
Subject: Re: Turning off HOT/Cleanup sometimes