Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
From: Dilip Kumar
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date:
Msg-id: CAFiTN-taQYnSn_p040o1k=GWDytMrX+dGU_D5uU1ZmAf3VdZsQ@mail.gmail.com
In response to: Re: Perform streaming logical transactions by background workers and parallel apply (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: RE: Perform streaming logical transactions by background workers and parallel apply
List: pgsql-hackers
On Tue, Jul 26, 2022 at 2:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jul 22, 2022 at 8:27 AM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Tue, Jul 19, 2022 at 10:29 AM I wrote:
> > > Attach the new patches.
> >
> > Not able to apply the patches cleanly because of a change in HEAD
> > (366283961a). Therefore, I rebased the patches on top of HEAD.
> >
> > Attach the new patches.
>
> +     /* Check the foreign keys. */
> +     fkeys = RelationGetFKeyList(entry->localrel);
> +     if (fkeys)
> +         entry->parallel_apply = PARALLEL_APPLY_UNSAFE;
>
> So if there is a foreign key on any of the tables that are part of a
> subscription, then we do not allow changes for that subscription to be
> applied in parallel? I think this is a big limitation, because having a
> foreign key on a table is very normal, right? I agree that if we allow
> them there could be failures due to out-of-order apply, but IMHO we
> should not impose the restriction; instead, let it fail if such a
> conflict ever occurs, because on a conflict the transaction will be
> sent again. Do we see that there could be wrong or inconsistent results
> if we allow such things to be executed in parallel? If not, then IMHO
> we are restricting very normal cases just to avoid some corner-case
> failures.

Some more comments:

1.
+    /*
+     * If we have found a free worker or if we are already applying this
+     * transaction in an apply background worker, then we pass the data to
+     * that worker.
+     */
+    if (first_segment)
+        apply_bgworker_send_data(stream_apply_worker, s->len, s->data);

The comment says that if we have found a free worker, or if we are
already applying in a worker, then we pass the changes to that worker;
but as per the code here we only pass them in the first_segment case. I
think what you are trying to say is that if it is the first segment then
send the data.

2.
+    /*
+     * This is the main apply worker. Check if there is any free apply
+     * background worker we can use to process this transaction.
+     */
+    if (first_segment)
+        stream_apply_worker = apply_bgworker_start(stream_xid);
+    else
+        stream_apply_worker = apply_bgworker_find(stream_xid);

So currently, whenever we get a new streamed transaction we try to start
a new background worker for it. Why do we need to start/stop the
background apply worker every time we get a new streamed transaction? We
could instead keep workers in a pool for the time being, and when a new
transaction is looking for a worker we can find one there. Starting a
worker is a costly operation, and since we are using parallelism here we
expect streamed transactions needing a parallel apply worker to be
frequent, so why not let a worker wait for a certain amount of time: if
the load is low it will stop anyway, and if the load is high it will be
reused for the next streamed transaction. (A rough sketch of this
pooling idea appears after the message below.)

3. Why are we restricting parallel apply workers to streamed
transactions only? Streaming depends on the size of
logical_decoding_work_mem, so making streaming and parallel apply
tightly coupled seems too restrictive to me. Do we see some obvious
problems in applying other transactions in parallel?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
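As an illustration of the pooling idea raised in comment 2, here is a
minimal standalone sketch: reuse a finished apply background worker for
the next streamed transaction instead of starting a fresh one, and stop
only workers that have been idle past a timeout. This is not code from
the patch; all names (ApplyWorkerSlot, pool_get_worker,
pool_release_worker, pool_reap_idle_workers, and the constants) are
hypothetical, and the launch/terminate steps are stubbed where a real
implementation would call PostgreSQL's bgworker machinery.

/*
 * Hypothetical sketch of an apply-worker pool with an idle timeout.
 * Not PostgreSQL code; a plain C simulation of the lookup logic.
 */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define MAX_POOL_WORKERS     8
#define WORKER_IDLE_TIMEOUT  10      /* seconds */

typedef struct ApplyWorkerSlot
{
    bool   in_use;      /* slot holds a launched worker */
    bool   busy;        /* worker is applying a transaction */
    time_t idle_since;  /* when the worker last became free */
} ApplyWorkerSlot;

static ApplyWorkerSlot pool[MAX_POOL_WORKERS];

/* Reuse a free pooled worker if possible; otherwise "launch" a new one. */
static ApplyWorkerSlot *
pool_get_worker(void)
{
    for (int i = 0; i < MAX_POOL_WORKERS; i++)
    {
        if (pool[i].in_use && !pool[i].busy)
        {
            pool[i].busy = true;    /* reuse: no startup cost */
            return &pool[i];
        }
    }
    for (int i = 0; i < MAX_POOL_WORKERS; i++)
    {
        if (!pool[i].in_use)
        {
            /* A real implementation would start a bgworker here. */
            pool[i].in_use = true;
            pool[i].busy = true;
            return &pool[i];
        }
    }
    return NULL;                    /* pool exhausted: apply serially */
}

/* Return a worker to the pool instead of stopping it. */
static void
pool_release_worker(ApplyWorkerSlot *slot)
{
    slot->busy = false;
    slot->idle_since = time(NULL);
}

/* Stop only the workers that have been idle longer than the timeout. */
static void
pool_reap_idle_workers(void)
{
    time_t now = time(NULL);

    for (int i = 0; i < MAX_POOL_WORKERS; i++)
    {
        if (pool[i].in_use && !pool[i].busy &&
            now - pool[i].idle_since > WORKER_IDLE_TIMEOUT)
            pool[i].in_use = false; /* would terminate the bgworker here */
    }
}

int
main(void)
{
    ApplyWorkerSlot *w = pool_get_worker(); /* first txn: "starts" a worker */

    pool_release_worker(w);                 /* txn done: worker stays pooled */

    w = pool_get_worker();                  /* second txn: reuses the worker */
    pool_release_worker(w);

    pool_reap_idle_workers();               /* would stop long-idle workers */
    printf("pooled worker reused without a restart\n");
    return 0;
}

Under this scheme only the reaper pays the shutdown cost, so a steady
stream of streamed transactions never pays the worker startup cost,
while a quiet system still winds down to zero extra workers.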