Thread: Logical decoding timeline following fails to handle records split across segments
Logical decoding timeline following fails to handle records split across segments
From
Craig Ringer
Date:
Hi all
--
There's a bug (mine) in logical decoding timeline following where reading the first page from the segment containing a timeline switch fails to read from the most recent timeline in that segment. This is harmless if the old timeline's copy of the segment is present - but if it's been renamed to .partial, deleted or never copied over to a replica then decoding will complain that the required segment has already been removed. Just like without timeline following.
The underlying problem is that timeline calculations used the record's start pointer and didn't properly consider continuations; they were record-based, not page-based like they should be.
A corrected and handily much, much simpler patch is attached. The logic for finding the last timeline on a segment was massively more complex than it needed to be, and that wasn't the only thing.
Attachment
Re: Logical decoding timeline following fails to handle records split across segments
From
Craig Ringer
Date:
On 3 May 2016 at 22:03, Craig Ringer <craig@2ndquadrant.com> wrote:
* The function first tests cases where it can say "nothing to do, carry on" and only then handles a timeline switch if one might be required. Simpler, clearer.
* The logic for determining the last timeline on a segment was way too complicated in the old patch. Instead, just look up the timeline of the LSN of the last byte on the segment. No more loops.
A corrected and handily much, much simpler patch is attached. The logic for finding the last timeline on a segment was massively more complex than it needed to be, and that wasn't the only thing.
I'm aware that this is late in the piece, btw, and that revert will be considered. I think that's probably unnecessary - in part because this code only gets called when doing logical decoding, and only does anything remotely interesting when replay from a slot hits a timeline boundary. So it should be assessed mainly based on its risk of breaking what works. That said, I also think that now that I've fixed the fundamental misunderstanding embodied by the original patch it should be pretty clear and reasonable.
The diff looks big, but it just rewrites the new XLogReadDetermineTimeline function and otherwise just removes the no-longer-needed XlogReaderState->timelineHistory member the prior one added. It's best to just read the new XLogReadDetermineTimeline function, not the diff of that function, since diff has done confusing things with it.
The changes here are that:
* XLogReadDetermineTimeline(...) now looks up the page being fetched by ReadPageInternal. It previously used the record start pointer, but that didn't work if a continuation spilled over to the first page on a new segment where there's a timeline switch.
* The function first tests cases where it can say "nothing to do, carry on" and only then handles a timeline switch if one might be required. Simpler, clearer.
* The logic for determining the last timeline on a segment was way too complicated in the old patch. Instead, just look up the timeline of the LSN of the last byte on the segment. No more loops.
* XlogReaderState->timelineHistory is removed; the timeline file is now read only when needed when making a timeline decision, then discarded. It's read only very infrequently. Any time we read it we might need to re-read it if there's a newer one anyway, so it's simpler this way.
* XLogReaderState->currTLIValidUntil is now the tliSwitchPoint for currTLI and is no longer truncated to the start of the segment boundary like before. So it's actually useful now. That's a result of fixing the stupid logic I used in the old version for calculating the last timeline on a segment and whether there's a timeline change on the segment.
Álvaro is aware of this and I've added it to open items.
Re: Logical decoding timeline following fails to handle records split across segments
From
Craig Ringer
Date:
On 3 May 2016 at 22:03, Craig Ringer <craig@2ndquadrant.com> wrote:
Hi allThere's a bug (mine) in logical decoding timeline following where reading the first page from the segment containing a timeline switch fails to read from the most recent timeline in that segment. This is harmless if the old timeline's copy of the segment is present - but if it's been renamed to .partial, deleted or never copied over to a replica then decoding will complain that the required segment has already been removed. Just like without timeline following.The underlying problem is that timeline calculations used the record's start pointer and didn't properly consider continuations; they were record-based, not page-based like they should be.A corrected and handily much, much simpler patch is attached. The logic for finding the last timeline on a segment was massively more complex than it needed to be, and that wasn't the only thing.
For the record the patch this fixes got reverted as agreed in http://www.postgresql.org/message-id/20160503165812.GA29604@alvherre.pgsql .
I will submit this patch to 9.7 along with the improvements to pg_recvlogical and expanded test suite.
I then expect to follow on with work to clean up the use of globals to pass timeline info through xlogreader to read page callbacks, and hopefully the hs protocol changes etc required to allow the improved slot failover support mechanism Petr, Andres and I discussed to work.
This patch as attached won't apply anymore, but it's trivial to apply it on top of a cherry-picked copy of the reverted feature patch for testing or further development.