Re: logical replication empty transactions - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: logical replication empty transactions |
Date | |
Msg-id | CAFiTN-u7FieGKZcOvQM+yYyb2j-t08db3kC0PBVs+jj8cMKvRg@mail.gmail.com |
In response to | Re: logical replication empty transactions (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: logical replication empty transactions |
List | pgsql-hackers |

On Mon, Mar 2, 2020 at 4:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 9:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, Nov 9, 2019 at 7:29 AM Euler Taveira <euler@timbira.com.br> wrote:
> > >
> > > On Mon, Oct 21, 2019 at 21:20, Jeff Janes
> > > <jeff.janes@gmail.com> wrote:
> > > >
> > > > After setting up logical replication of a slowly changing table using the built in pub/sub facility, I noticed way more network traffic than made sense. Looking into I see that every transaction in that database on the master gets sent to the replica. 99.999+% of them are empty transactions ('B' message and 'C' message with nothing in between) because the transactions don't touch any tables in the publication, only non-replicated tables. Is doing it this way necessary for some reason? Couldn't we hold the transmission of 'B' until something else comes along, and then if that next thing is 'C' drop both of them?
> > > >
> > > That is not optimal. Those empty transactions is a waste of bandwidth.
> > > We can suppress them if no changes will be sent. test_decoding
> > > implements "skip empty transaction" as you described above and I did
> > > something similar to it. Patch is attached.
> >
> > I think this significantly reduces the network bandwidth for empty
> > transactions. I have briefly reviewed the patch and it looks good to
> > me.
> >
>
> One thing that is not clear to me is how will we advance restart_lsn
> if we don't send any empty xact in a system where there are many such
> xacts? IIRC, the restart_lsn is advanced based on confirmed_flush lsn
> sent by subscriber. After this change, the subscriber won't be able
> to send the confirmed_flush and for a long time, we won't be able to
> advance restart_lsn. Is that correct, if so, why do we think that is
> acceptable? One might argue that restart_lsn will be advanced as soon
> as we send the first non-empty xact, but not sure if that is good
> enough.
> What do you think?

It seems like a valid point. One idea is to track the commit LSN of the last transaction we actually streamed. If the confirmed flush location is already past that LSN, then even when we skip sending a commit message we can advance the confirmed flush location locally. Logically this should not cause any problem: the subscriber has already confirmed everything we have streamed so far, so for the commits we are skipping we can advance it locally, because we are sure there is no streamed commit that the subscriber has not yet confirmed. This is just my thought, though; from a code and design perspective it might complicate things, and it sounds hackish.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
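
To make the idea above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code and not part of any patch: `Lsn`, `SenderState`, and the three `on_*` functions are hypothetical names invented for illustration, and treating "confirmed flush equal to the last streamed commit" as already confirmed is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position. */
typedef uint64_t Lsn;

/* Hypothetical sender-side bookkeeping, not actual walsender state. */
typedef struct SenderState
{
    Lsn last_streamed_commit; /* end LSN of the last commit actually sent; 0 if none */
    Lsn confirmed_flush;      /* latest position treated as confirmed */
} SenderState;

/* The commit of a non-empty transaction was sent to the subscriber. */
static void
on_commit_streamed(SenderState *s, Lsn commit_end)
{
    assert(commit_end >= s->last_streamed_commit); /* commits are sent in LSN order */
    s->last_streamed_commit = commit_end;
}

/* Feedback from the subscriber reported a flushed position. */
static void
on_subscriber_feedback(SenderState *s, Lsn flushed)
{
    if (flushed > s->confirmed_flush)
        s->confirmed_flush = flushed;
}

/*
 * An empty transaction was skipped instead of being sent.
 * Advance confirmed_flush locally only when nothing that was streamed
 * is still unconfirmed. Returns true if the position moved.
 */
static bool
on_empty_commit_skipped(SenderState *s, Lsn commit_end)
{
    if (s->confirmed_flush >= s->last_streamed_commit &&
        commit_end > s->confirmed_flush)
    {
        s->confirmed_flush = commit_end;
        return true;
    }
    return false;
}
```

While a streamed commit is still unconfirmed, skipped empty commits leave `confirmed_flush` untouched; once the subscriber catches up, each further skipped commit can move it forward without any network traffic.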