Re: [HACKERS] Replication status in logical replication - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [HACKERS] Replication status in logical replication
Date
Msg-id CANP8+jLwgsexwdPkBtkN5kdHN5TwV-d=i311Tq_FdOmzJ8QyRQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Replication status in logical replication  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [HACKERS] Replication status in logical replication
List pgsql-hackers
On 26 December 2017 at 00:26, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Tue, Dec 26, 2017 at 1:10 AM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
>> On 21/11/17 22:06, Masahiko Sawada wrote:
>>>
>>> After investigation, I found out that my previous patch was wrong
>>> direction. I should have changed XLogSendLogical() so that we can
>>> check the read LSN and set WalSndCaughtUp = true even after read a
>>> record without wait. Attached updated patch passed 'make check-world'.
>>> Please review it.
>>>
>>
>> Hi,
>>
>> This version looks good to me and seems to be in line with what we do in
>> physical replication.
>>
>> Marking as ready for committer.
>>
>
> Thank you for reviewing this patch!

The patch calls this AFTER processing the record
   if (sentPtr >= GetFlushRecPtr())
but it seems better to call GetFlushRecPtr() before we process the
record, otherwise the flush pointer might have moved forwards while we
process the record and it wouldn't catch up. (Physical replication
works like that).

New patch version attached for discussion before commit. (v4)

I'd rather not call it at all at that point though, so if we made
RecentFlushPtr static at the module level rather than within
WalSndWaitForWal we could use it here also. That's a bit more invasive
for backpatching, so not implemented that here.

Overall, I find setting WalSndCaughtUp = false at the top of
XLogSendLogical() to be incredibly ugly and I would like to remove it.
It can't be correct to have a static status variable that oscillates
between false and true with every record. This seems to be done
because of the lack of a logical initialization call. Petr? Peter?
Version with this removed (v4alt2)

I've removed the edit that fusses over English grammar: both ways are correct.

> I think this patch can be
> back-patched to 9.4 as Simon mentioned.

This patch appears to cause this DEBUG1 message

"standby \"%s\" has now caught up with primary"

which probably isn't the right message, but might be OK to backpatch.

Thoughts on better wording?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: [HACKERS] [PROPOSAL] Temporal query processing with range types
Next
From: John Naylor
Date:
Subject: Re: MCV lists for highly skewed distributions