Re: Reviving the "Stopping logical replication protocol" patch fromVladimir Gordichuk - Mailing list pgsql-hackers

From Dave Cramer
Subject Re: Reviving the "Stopping logical replication protocol" patch fromVladimir Gordichuk
Date
Msg-id CADK3HHLVvu_BMWNUEeFYWZ1G_4ia3pOGjG4qw5kZZa3wqOJmhA@mail.gmail.com
Whole thread Raw
In response to Re: Reviving the "Stopping logical replication protocol" patch fromVladimir Gordichuk  (Dave Cramer <davecramer@gmail.com>)
List pgsql-hackers

Dave Cramer


On Tue, 15 Jan 2019 at 07:53, Dave Cramer <davecramer@gmail.com> wrote:

Dave Cramer


On Sun, 13 Jan 2019 at 23:19, Craig Ringer <craig@2ndquadrant.com> wrote:
On Mon, 3 Dec 2018 at 19:38, Dave Cramer <davecramer@gmail.com> wrote:
Dmitry,

Please see attached rebased patches

I'm fine with patch 0001, though I find this comment a bit hard to follow:

+ * The send_data callback must enqueue complete CopyData messages to libpq
+ * using pq_putmessage_noblock or similar, since the walsender loop may send
+ * CopyDone then exit and return to command mode in response to a client
+ * CopyDone between calls to send_data.
 
I think it needs splitting up into a couple of sentences.

Fair point, remember it was originally written by a non-english speaker 

Thoughts on below ?

+/*
+ * Main loop of walsender process that streams the WAL over Copy messages.
+ *
+ * The send_data callback must enqueue complete CopyData messages to the client
+ * using pq_putmessage_noblock or similar 
+ * In order to preserve the protocol it is necessary to send all of the existing
+ * messages still in the buffer as the WalSender loop may send 
+ * CopyDone then exit and return to command mode in response to a client
+ * CopyDone between calls to send_data.
+ */
 

In patch 0002, stopping during a txn. It's important that once we skip any action, we continue skipping. In patch 0002 I'd like it to be clearer that we will *always* skip the rb->commit callback if rb->continue_decoding_cb() returned false and aborted the while loop. I suggest storing the result of (rb->continue_decoding_cb == NULL || rb->continue_decoding_cb())  in an assignment within the while loop, and testing that result later.

e.g.

    (continue_decoding = (rb->continue_decoding_cb == NULL || rb->continue_decoding_cb()))

and later

    if (continue_decoding) {
        rb->commit(rb, txn, commit_lsn);
    }

Will do 

Hmmm... I don't actually see how this is any different than what we have in the patch now where below would the test occur?
 
I don't actually think it's necessary to re-test the continue_decoding callback and skip commit here. If we've sent all the of the txn
except the commit, we might as well send the commit, it's tiny and might save some hassle later.


I think a comment on the skipped commit would be good too:

/*
 * If we skipped any changes due to a client CopyDone we must not send a commit
 * otherwise the client would incorrectly think it received a complete transaction.
 */

I notice that the fast-path logic in WalSndWriteData is removed by this patch. It isn't moved; there's no comparison of last_reply_timestamp and wal_sender_timeout now. What's the rationale there? You want to ensure that we reach ProcessRepliesIfAny() ? Can we cheaply test for a readable client socket then still take the fast-path if it's not readable?

This may have been a mistake as I am taking this over from a very old patch that I did not originally write. I'll have a look at this 

OK, I'm trying to decipher the original intent of this patch as well as I didn't write it.


As to why the fast path logic was removed. Does it make sense to you?

Dave

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: could not access status of transaction
Next
From: Pavel Stehule
Date:
Subject: Re: PostgreSQL vs SQL/XML Standards