Re: Logical replication keepalive flood - Mailing list pgsql-hackers

From	Greg Nancarrow
Subject	Re: Logical replication keepalive flood
Date	September 17, 2021 09:32:54
Msg-id	CAJcOf-ct+7K53kPsnYery=8W6sZx7Q14H8UjqAgpwkxCfvR5mQ@mail.gmail.com
In response to	Re: Logical replication keepalive flood (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Logical replication keepalive flood
List	pgsql-hackers

On Thu, Sep 16, 2021 at 10:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I think here the reason is that the first_lsn of a transaction is
> always equal to end_lsn of the previous transaction (See comments
> above first_lsn and end_lsn fields of ReorderBufferTXN).

That may be the case, but those comments certainly don't make this clear.

> I have not
> debugged but I think in StreamLogicalLog() the cur_record_lsn after
> receiving 'w' message, in this case, will be equal to endpos whereas
> we expect it to be greater than endpos to exit. Before the patch, it will
> always get the 'k' message where we expect the received lsn to be
> equal to endpos to conclude that we can exit. Do let me know if your
> analysis differs?
>

Yes, pg_recvlogical seems to be relying on receiving a keepalive for
its "--endpos" logic to work (and the 006 test is relying on '' record
output from pg_recvlogical in this case).
But is it correct to be relying on a keepalive for this?
As I already pointed out, there's also code which seems to rely on the
replies to the keepalives it sends, in order to update the flush and
write LSN locations.
The original problem reporter measured 500 keepalives per second being
sent by walsender (which I also reproduced, for pg_recvlogical and
pub/sub cases).
None of these cases appear to be traditional uses of "keepalive" type
messages to me.
Am I missing something? Documentation?
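
To make the "--endpos" dependency concrete, here is a minimal sketch of the
exit check as described in the quoted text. It is only a model, not the
pg_recvlogical source: the function name should_exit, the INVALID_LSN
constant, and the use of plain integers for LSNs are all hypothetical, and I
am assuming that a keepalive at or past endpos is enough to exit.

```python
# Hypothetical model of the --endpos exit check discussed in this thread.
# Not the pg_recvlogical source; all names here are illustrative only.

INVALID_LSN = 0  # assumption: 0 stands for "no --endpos was given"


def should_exit(msg_type: str, lsn: int, endpos: int) -> bool:
    """Return True if the receiver would stop after seeing this message."""
    if endpos == INVALID_LSN:
        return False
    if msg_type == "w":
        # Data message: as described, the record LSN has to be strictly
        # greater than endpos before the receiver exits.
        return lsn > endpos
    if msg_type == "k":
        # Keepalive: a reported LSN that has reached endpos is enough
        # (assumption: "at or past" endpos).
        return lsn >= endpos
    return False


# With lsn == endpos, only the keepalive path lets the receiver exit.
print(should_exit("w", 100, 100))
print(should_exit("k", 100, 100))
```

Under these assumptions, a 'w' message whose LSN equals endpos does not end
the stream, so without a following keepalive the receiver keeps waiting.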

Regards,
Greg Nancarrow
Fujitsu Australia
