Re: Stopping logical replication protocol - Mailing list pgsql-hackers
From | Craig Ringer |
---|---|
Subject | Re: Stopping logical replication protocol |
Date | |
Msg-id | CAMsr+YHrGFUV_JfeO+QU2acDp9yvYC-yWOC73AnF4TbM9vZAtg@mail.gmail.com Whole thread Raw |
In response to | Stopping logical replication protocol (Vladimir Gordiychuk <folyga@gmail.com>) |
Responses |
Re: Stopping logical replication protocol
|
List | pgsql-hackers |
On 6 May 2016 at 23:23, Vladimir Gordiychuk <folyga@gmail.com> wrote:
I prepare small patch that fix problems describe below. Now WalSndWriteData first check message from consumer and during decode transaction check that replication still active.
OK, upon looking closer I'm not sure I agree with this approach.
In WalSndLoop(...) we currently call ProcessRepliesIfAny() which handles a client-initiated CopyDone by replying with its own CopyDone. WalSndLoop then just waits until the send buffer is flushed or the client disconnects and returns. That looks to be pretty much what's needed... but only when we actually execute that code path.
XLogSendLogical calls WalSndWaitForWal(...) (via xlogreader and logical_read_xlog_page) when waiting for something to process, when the client will be blocked on a socket read. It also processes client CopyDone, but it doesn't seem to test streamingDoneSending or streamingDoneReceiving like the outer WalSndLoop does. So it looks like it continues waiting for WAL when we know the client sent CopyDone and doesn't want anything more.
I think we should be testing that in WalSndWaitForWal, like we do in WalSndLoop, and bailing out. You've done that part, and I agree that's good.
It seems like what you're also trying to allow interruption deeper than that, when we're in the middle of processing a reorder buffer commit record and streaming that to the client. You're introducing an is_active member (actually a callback, though name suggests it's a flag) in struct ReorderBuffer to check whether a CopyDone is received, and you're skipping ReorderBuffer commit processing when the callback returns false. The callback returns "!streamingDoneReceiving && !streamingDoneSending" i.e. it's false if either end has sent CopyDone. streamingDoneSending and streamingDoneSending are only set in ProcessRepliesIfAny, called by WalSndLoop and WalSndWaitForWal. So the idea is, presumably, that if we're waiting for WAL from XLogSendLogical we skip processing of any commit records and exit.
That seems overcomplicated.
When WalSndWaitForWAL is called by logical_read_xlog_page, logical_read_xlog_page can just test streamingDoneReceiving and streamingDoneSending. If they're set it can skip the page read and return -1, which will cause the xlogreader to return a null record to XLogSendLogical. That'll skip the decoding calls and return to WalSndLoop, where we'll notice it's time to exit.
That way there's no need to modify the reorder buffer structs, add callbacks, etc.
Make sense?
pgsql-hackers by date: