Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date | |
Msg-id | CAA4eK1L-KYycdTYanqo3nDzw=XWvADOuerHtbBSnBiRejmE3Qg@mail.gmail.com |
In response to | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
Responses | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
List | pgsql-hackers |
On Tue, Dec 24, 2019 at 11:17 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>
> On Fri, 20 Dec 2019 at 22:30, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > The main aim of this feature is to reduce apply lag. Because if we
> > send all the changes together it can delay their apply because of
> > network delay, whereas if most of the changes are already sent, then
> > we will save the effort on sending the entire data at commit time.
> > This in itself gives us decent benefits. Sure, we can further improve
> > it by having separate workers (dedicated to apply the changes) as you
> > are suggesting and in fact, there is a patch for that as well (see the
> > performance results and bgworker patch at [1]), but if we try to shove
> > in all the things in one go, then it will be difficult to get this
> > patch committed (there are already enough things and the patch is
> > quite big that to get it right takes a lot of energy). So, the plan is
> > something like that first we get the basic feature and then try to
> > improve by having dedicated workers or things like that. Does this
> > make sense to you?
> >
>
> Thank you for explanation. The plan makes sense. But I think in the
> current design it's a problem that logical replication worker doesn't
> receive changes (and doesn't check interrupts) during applying
> committed changes even if we don't have a worker dedicated for
> applying. I think the worker should continue to receive changes and
> save them to temporary files even during applying changes.
>

Won't it defeat the purpose of this feature, which is to reduce the
apply lag? Basically, it can happen that while applying a commit, the
worker constantly receives changes of other transactions, which will
delay the apply of the current transaction. Also, won't it create some
further work to identify the order of commits? Say while applying
commit-1, it receives 5 other commits that are written to separate
temporary files. How will we later identify which transaction's WAL we
need to apply first? We might deduce it from the LSNs, but I think that
could be tricky. Another thing is that I think it could lead to some
design complications as well, because while applying a commit you need
some sort of callback or something like that to receive and flush
totally unrelated changes. It could lead to another kind of failure
mode, wherein while applying a commit the worker tries to receive
another transaction's data and some failure happens while writing the
data of that transaction. I am not sure if it is a good idea to try
something like that.

> Otherwise
> the buffer would be easily full and replication gets stuck.
>

Are you talking about the network buffer? I think the best way, as
discussed, is to launch new workers for streamed transactions, but we
can do that as an additional feature. Anyway, as proposed, users can
choose the streaming mode for subscriptions, so there is an option to
turn this on selectively.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
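To make the ordering question in the reply above concrete, here is a minimal sketch of the "deduce by LSN" idea: if each spooled transaction recorded the LSN of its commit record, the apply order could be recovered by sorting on that value. This is not code from the patch. The `SpooledTxn` record, its fields, the file names, and `apply_order` are hypothetical names used only for illustration. The only PostgreSQL-specific fact assumed is the textual LSN format `X/Y`, two hexadecimal numbers holding the high and low 32 bits of a 64-bit WAL position.

```python
from dataclasses import dataclass


def parse_lsn(text: str) -> int:
    """Convert an LSN in PostgreSQL's 'X/Y' hex notation to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass(frozen=True)
class SpooledTxn:
    # Hypothetical record for a transaction whose changes were written
    # to a temporary file while another commit was being applied.
    xid: int
    commit_lsn: int   # assumed to be known once the commit message arrives
    spool_file: str   # hypothetical file name


def apply_order(spooled):
    """Return the spooled transactions sorted by commit LSN (oldest first)."""
    return sorted(spooled, key=lambda txn: txn.commit_lsn)


# Three transactions received out of commit order while busy applying.
spooled = [
    SpooledTxn(xid=705, commit_lsn=parse_lsn("0/16B3F80"), spool_file="705.changes"),
    SpooledTxn(xid=702, commit_lsn=parse_lsn("0/16B2D50"), spool_file="702.changes"),
    SpooledTxn(xid=703, commit_lsn=parse_lsn("1/10"), spool_file="703.changes"),
]

order = [txn.xid for txn in apply_order(spooled)]
print(order)
```

The sort itself is the easy part. The sketch says nothing about the harder cases the reply alludes to, such as transactions whose commit has not arrived yet or a failure while writing one of the spool files.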