Re: Avoiding adjacent checkpoint records - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Avoiding adjacent checkpoint records
Date
Msg-id 9485.1339084079@sss.pgh.pa.us
In response to Re: Avoiding adjacent checkpoint records  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 6, 2012 at 6:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we don't like that, I can think of a couple of other ways to get there,
>> but they have their own downsides:
>> 
>> * Instead of trying to detect after-the-fact whether any concurrent
>> WAL activity happened during the last checkpoint, we could detect it
>> during the checkpoint and then keep the info in a static variable in
>> the checkpointer process until next time.  However, I don't see any
>> bulletproof way to do this without adding at least one or two lines
>> of code within XLogInsert, which I'm sure Robert will complain about.

> My main concern here is to avoid doing anything that will make things
> harder for Heikki's WAL insert scaling patch, which I'm hoping will
> get done for 9.3.

Yeah, I'm not very happy either with adding new requirements to
XLogInsert, even if it is just tracking one more bit of information.

> What do you have in mind, exactly?  I feel like ProcLastRecPtr might
> be enough information.  After logging running xacts, we can check
> whether ProcLastRecPtr is equal to the redo pointer.  If so, then
> nothing got written to WAL between the time we began the checkpoint
> and the time we wrote that record.  If, through further, similar
> gyrations, we can then work out whether the checkpoint record
> immediately follows the running-xacts record, we're there.  That's
> pretty ugly, I guess, but it seems possible.

It's fairly messy because of the issues around "holes" being left at
page ends etc --- so the fact that ProcLastRecPtr is different from the
previous insert pointer doesn't immediately prove whether something else
got inserted first, or we just had to leave some dead space.  Heikki
mentioned this morning that he'd like to remove some of those rules in
9.3, but that doesn't help us today.  Note also that a closer look at
LogStandbySnapshot shows it emits more than one WAL record, meaning
we'd have to do this dance more than once, and the changes to do that
would be pretty deadly to the modularity of the functions
LogStandbySnapshot calls.
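
To illustrate the "holes" problem, here is a minimal toy model in C, not PostgreSQL code. The layout rule it uses (a record header is never split across a page boundary, so a too-small tail of a page is skipped) and all the constants and names (`TOY_PAGE_SIZE`, `toy_next_record_start`, and so on) are assumptions made up for illustration. It shows only why a plain pointer comparison can report a concurrent insert when the gap is really dead space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyRecPtr;     /* hypothetical byte position in the WAL stream */

/* Illustrative sizes only; not taken from the PostgreSQL sources. */
enum {
    TOY_PAGE_SIZE   = 8192,     /* bytes per WAL page */
    TOY_PAGE_HEADER = 24,       /* bytes reserved at the start of each page */
    TOY_REC_HEADER  = 32        /* a record header must fit on one page */
};

/* Given where the previous record ended, where does the next one start? */
static ToyRecPtr toy_next_record_start(ToyRecPtr prev_end)
{
    uint64_t used = prev_end % TOY_PAGE_SIZE;

    /* Exactly on a page boundary: only the new page's header is skipped. */
    if (used == 0)
        return prev_end + TOY_PAGE_HEADER;

    uint64_t free_space = TOY_PAGE_SIZE - used;

    /* Too little room for a record header: leave a hole, move to next page. */
    if (free_space < TOY_REC_HEADER)
        return prev_end + free_space + TOY_PAGE_HEADER;

    return prev_end;
}

/* The naive test: "my record did not start where the last one ended". */
static bool toy_naive_intruder_check(ToyRecPtr prev_end, ToyRecPtr my_start)
{
    return my_start != prev_end;
}

/* Nobody else inserts anything here, yet the naive check still fires. */
static void toy_demo_false_positive(void)
{
    ToyRecPtr prev_end = TOY_PAGE_SIZE - 8;          /* 8 bytes left on page */
    ToyRecPtr my_start = toy_next_record_start(prev_end);

    assert(toy_naive_intruder_check(prev_end, my_start));
}
```

In this model a pointer mismatch is consistent both with an intervening record and with skipped space, so the comparison alone settles nothing.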

The conclusion I'd come to yesterday was that we'd want XLogInsert to
do something like

    if (Insert->PrevRecord is different from ProcLastRecPtr)
        SomebodyElseWroteWAL = true;

where SomebodyElseWroteWAL is a process-local boolean that we reset
at the start of a checkpoint sequence, and then check after we've
written out the LogStandbySnapshot records and the checkpoint record.
(We'd also have to hack ProcLastRecPtr by initializing it to
Insert->PrevRecord at the time we reset SomebodyElseWroteWAL, which is
sort of ugly in that it messes up the relatively clean definition of
that variable.)  So that's not exactly a lot of new code in the critical
section, but it's still new code.
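
The flag idea above can be sketched as a standalone, single-threaded toy model in C. This is not a patch and not the real XLogInsert: the `ToyBackend` struct, the `toy_*` functions, the shared variables and the record sizes are all hypothetical stand-ins, and there is no locking, whereas the real comparison would presumably have to happen while the insert position is held. It only mirrors the three steps described: reset the flag and seed ProcLastRecPtr at the start of the checkpoint sequence, compare on every insert, and check the flag once the checkpoint record is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyRecPtr;     /* hypothetical byte position in the WAL stream */

/* Toy stand-ins for shared insert state (compare Insert->PrevRecord). */
static ToyRecPtr toy_shared_prev_record = 0;    /* start of last record inserted */
static ToyRecPtr toy_shared_insert_pos  = 8;    /* where the next record goes */

/* Toy stand-in for one process's local state. */
typedef struct ToyBackend {
    ToyRecPtr proc_last_rec_ptr;        /* start of last record we inserted */
    bool      somebody_else_wrote_wal;  /* the proposed process-local flag */
} ToyBackend;

/* Insert a record of len bytes; the first two lines are the "new code". */
static ToyRecPtr toy_xlog_insert(ToyBackend *self, uint32_t len)
{
    assert(len > 0);

    if (toy_shared_prev_record != self->proc_last_rec_ptr)
        self->somebody_else_wrote_wal = true;

    ToyRecPtr start = toy_shared_insert_pos;
    toy_shared_insert_pos += len;
    toy_shared_prev_record = start;
    self->proc_last_rec_ptr = start;
    return start;
}

/* Start of a checkpoint sequence: reset the flag and seed our pointer. */
static void toy_begin_checkpoint(ToyBackend *self)
{
    self->somebody_else_wrote_wal = false;
    self->proc_last_rec_ptr = toy_shared_prev_record;   /* the "ugly" hack */
}

/*
 * Run one checkpoint sequence; if intruder is non-NULL it inserts a record
 * in the middle.  Returns true if nobody else wrote WAL during the sequence.
 */
static bool toy_checkpoint_sequence(ToyBackend *ckpt, ToyBackend *intruder)
{
    toy_begin_checkpoint(ckpt);
    toy_xlog_insert(ckpt, 48);          /* stand-in for standby snapshot record */
    if (intruder != NULL)
        toy_xlog_insert(intruder, 32);  /* concurrent WAL activity */
    toy_xlog_insert(ckpt, 64);          /* stand-in for the checkpoint record */
    return !ckpt->somebody_else_wrote_wal;
}
```

Because the comparison is between record start pointers rather than between an end pointer and the next start, dead space at a page end does not trip the flag in this model.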

In the end I think I like the last idea I mentioned (keeping two
different "REDO" values during a checkpoint) the best.  It's a bit
complicated but the grottiness is localized in CreateCheckpoint.
Or at least I think it will be, without having written a patch yet.

			regards, tom lane

