Re: logical replication empty transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: logical replication empty transactions
Date
Msg-id CAA4eK1+Naj0+3wsroFnAAu+HLTQUo_oPCZFrsCKwoxs7PAWtPQ@mail.gmail.com
In response to Re: logical replication empty transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: logical replication empty transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Wed, Mar 4, 2020 at 9:52 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 9:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Mar 4, 2020 at 7:17 AM Euler Taveira
> > <euler.taveira@2ndquadrant.com> wrote:
> > >
> > > On Tue, 3 Mar 2020 at 05:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >>
> > >>
> > >> Another idea could be that we stream the transaction after some
> > >> threshold number (say 100 or anything we think is reasonable) of empty
> > >> xacts.  This will reduce the traffic without tinkering with the core
> > >> design too much.
> > >>
> > >>
> > > Amit, I suggest an interval to control this setting. Time is something
> > > we have control over; transactions aren't (depending on workload). The
> > > pg_stat_replication query interval usually is not milliseconds; however,
> > > you can execute thousands of transactions in a second. If we agree on
> > > that idea I can add it to the patch.
> > >
> >
> > Do you mean to say that if for some threshold interval we didn't
> > stream any transaction, then we can send the next empty transaction to
> > the subscriber?  If so, then isn't it possible that the empty xacts
> > happen irregularly after the specified interval and then we still end
> > up sending them all?  I might be missing something here, so can you
> > please explain your idea in detail?  Basically, how will it work and
> > how will it solve the problem?
>
> IMHO, the threshold should be based on the commit LSN.  The main
> reason we want to send empty transactions after a certain
> transaction count/duration is that we want the restart_lsn to keep
> moving forward, so that if we need to restart the replication slot we
> don't need to process a lot of extra WAL.  If we set the threshold
> based on transaction count, there is still a possibility that we
> process a few very big transactions and then have to process
> them again after the restart.
>

Won't the subscriber eventually send the flush location for the large
transactions which will move the restart_lsn?
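
For illustration only, the three threshold ideas discussed in this thread
(number of empty xacts, time interval, commit LSN distance) could be
compared with a rough sketch like the one below.  This is not PostgreSQL
code: SkipPolicy, EmptyXactFilter, should_send and all parameter names are
hypothetical, and LSNs are modeled as plain integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkipPolicy:
    """Hypothetical thresholds; None disables that trigger."""
    max_empty_xacts: Optional[int] = None    # count-based (e.g. 100)
    max_interval_s: Optional[float] = None   # time-based
    max_lsn_gap: Optional[int] = None        # commit-LSN-based, in bytes


class EmptyXactFilter:
    """Decides whether a decoded transaction is sent to the subscriber."""

    def __init__(self, policy: SkipPolicy, start_lsn: int = 0,
                 start_time: float = 0.0):
        self.policy = policy
        self.skipped = 0                 # empty xacts skipped since last send
        self.last_sent_lsn = start_lsn
        self.last_sent_time = start_time

    def should_send(self, is_empty: bool, commit_lsn: int, now: float) -> bool:
        p = self.policy
        # Non-empty xacts are always sent; an empty one is sent only
        # when at least one enabled threshold has been reached.
        send = (
            not is_empty
            or (p.max_empty_xacts is not None
                and self.skipped + 1 >= p.max_empty_xacts)
            or (p.max_interval_s is not None
                and now - self.last_sent_time >= p.max_interval_s)
            or (p.max_lsn_gap is not None
                and commit_lsn - self.last_sent_lsn >= p.max_lsn_gap)
        )
        if send:
            self.skipped = 0
            self.last_sent_lsn = commit_lsn
            self.last_sent_time = now
        else:
            self.skipped += 1
        return send


# Two empty transactions that together span a large WAL range.
by_count = EmptyXactFilter(SkipPolicy(max_empty_xacts=100))
by_lsn = EmptyXactFilter(SkipPolicy(max_lsn_gap=1_000_000))
for lsn in (600_000, 1_200_000):
    print(lsn,
          by_count.should_send(True, lsn, 0.0),
          by_lsn.should_send(True, lsn, 0.0))
```

Under these assumptions the count-based filter skips both transactions,
while the LSN-based one sends the second.  Whether that difference matters
in practice depends on the question above, i.e. whether the subscriber's
flush feedback already moves the restart_lsn.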

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


