Re: Logical replication keepalive flood - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Logical replication keepalive flood
Date
Msg-id CAA4eK1LYqmPyOzL4fbbXDTwd4fAy9bEP4VdYEDuSNgaHKgbCQg@mail.gmail.com
Whole thread Raw
In response to Re: Logical replication keepalive flood  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Logical replication keepalive flood  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Thu, Sep 30, 2021 at 8:49 AM Hou, Zhijie/侯 志杰 <houzj.fnst@fujitsu.com> wrote:
>
> I noticed another patch that Horiguchi-San posted earlier[1].
>
> The approach in that patch is to splits the sleep into two
> pieces. If the first sleep reaches the timeout then send a keepalive
> then sleep for the remaining time.
>
> I tested that patch and can see the keepalive message is reduced and
> the patch won't cause the current regression test fail.
>
> Since I didn't find some comments about that patch,
> I wonder did we find some problems about that patch ?
>

I am not able to understand some parts of that patch.

+ If the sleep is shorter
+ * than KEEPALIVE_TIMEOUT milliseconds, we skip sending a keepalive to
+ * prevent it from getting too-frequent.
+ */
+ if (MyWalSnd->flush < sentPtr &&
+ MyWalSnd->write < sentPtr &&
+ !waiting_for_ping_response)
+ {
+ if (sleeptime > KEEPALIVE_TIMEOUT)
+ {
+ int r;
+
+ r = WalSndWait(wakeEvents, KEEPALIVE_TIMEOUT,
+    WAIT_EVENT_WAL_SENDER_WAIT_WAL);
+
+ if (r != 0)
+ continue;
+
+ sleeptime -= KEEPALIVE_TIMEOUT;
+ }
+
+ WalSndKeepalive(false);

It claims to skip sending keepalive if the sleep time is shorter than
KEEPALIVE_TIMEOUT (a new threshold) but the above code seems to
suggest it sends in both cases. What am I missing?

Also, more to the point this special keep_alive seems to be sent for
synchronous replication and walsender shutdown as they can expect
updated locations. You haven't given any reason (theory) why those two
won't be impacted due to this change? I am aware that for synchronous
replication, we wake waiters while ProcessStandbyReplyMessage but I am
not sure how it helps with wal sender shutdown? I think we need to
know the reasons for this message and then need to see if the change
has any impact on the same.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side
Next
From: Greg Nancarrow
Date:
Subject: Re: Logical replication keepalive flood