Re: logical replication empty transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: logical replication empty transactions
Date
Msg-id CAFiTN-upX4pkYLcYza9YPO6u4F7z0EmPB00M0agAcanX-zkzHQ@mail.gmail.com
In response to Re: logical replication empty transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: logical replication empty transactions  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Mar 4, 2020 at 9:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 7:17 AM Euler Taveira
> <euler.taveira@2ndquadrant.com> wrote:
> >
> > On Tue, 3 Mar 2020 at 05:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >>
> >> Another idea could be that we stream the transaction after some
> >> threshold number (say 100 or anything we think is reasonable) of empty
> >> xacts.  This will reduce the traffic without tinkering with the core
> >> design too much.
> >>
> >>
> > Amit, I suggest an interval to control this setting. Time is something we
> > have control; transactions aren't (depending on workload).
> > pg_stat_replication query interval usually is not milliseconds, however,
> > you can execute thousands of transactions in a second. If we agree on
> > that idea I can add it to the patch.
> >
>
> Do you mean to say that if for some threshold interval we didn't
> stream any transaction, then we can send the next empty transaction to
> the subscriber?  If so, then isn't it possible that the empty xacts
> happen irregularly after the specified interval and then we still end
> up sending them all.  I might be missing something here, so can you
> please explain your idea in detail?  Basically, how will it work and
> how will it solve the problem.

IMHO, the threshold should be based on the commit LSN.  The main
reason we want to send empty transactions after a certain
transaction count or duration is that we want the restart_lsn to keep
moving forward, so that if we need to restart the replication slot we
don't need to process a lot of extra WAL.  If we set the threshold
based on transaction count, there is still a possibility that we
process a few very big transactions and then have to process them
again after the restart.  OTOH, if we set it based on an interval,
then even if there is not much work going on, we still end up sending
the empty transactions, as pointed out by Amit.
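
To make the idea concrete, here is a rough sketch of an LSN-based check.
It is not taken from any patch: EmptyXactFilter, should_send_commit and
lsn_threshold are hypothetical names, and XLogRecPtr is declared locally
as a stand-in for the server's 64-bit WAL position type.  The point is
only that the amount of WAL skipped is bounded by lsn_threshold,
regardless of how many transactions or how much time it covers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL byte position). */
typedef uint64_t XLogRecPtr;

/* Hypothetical per-connection state; an assumption for illustration. */
typedef struct EmptyXactFilter
{
    XLogRecPtr  last_sent_commit_lsn;   /* commit LSN of last xact we sent */
    uint64_t    lsn_threshold;          /* max WAL bytes to skip silently */
} EmptyXactFilter;

/*
 * Decide whether the commit at commit_lsn should be sent downstream.
 *
 * Non-empty transactions are always sent.  An empty transaction is sent
 * only when at least lsn_threshold bytes of WAL have passed since the
 * last commit we sent, so the subscriber's confirmed position never
 * lags by more than roughly that much WAL.
 */
static bool
should_send_commit(EmptyXactFilter *f, XLogRecPtr commit_lsn, bool is_empty)
{
    bool        past_threshold;

    /* Guard against unsigned underflow if LSNs ever arrive out of order. */
    past_threshold = commit_lsn >= f->last_sent_commit_lsn &&
        commit_lsn - f->last_sent_commit_lsn >= f->lsn_threshold;

    if (!is_empty || past_threshold)
    {
        f->last_sent_commit_lsn = commit_lsn;
        return true;
    }

    return false;
}
```

With lsn_threshold = 1000 and the last sent commit at LSN 100, empty
commits at 500 and 1099 would be skipped and an empty commit at 1100
would be sent.  A few very big transactions move the LSN past the
threshold quickly, while an idle system generates little WAL and so
triggers few sends.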

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


