Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From: Masahiko Sawada
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date:
Msg-id: CAD21AoDfuaCQhJKBQjcrAKT9bqwuyvDzPxdJrupRDs7_7c22Nw@mail.gmail.com
In response to: RE: Perform streaming logical transactions by background workers and parallel apply ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses: Re: Perform streaming logical transactions by background workers and parallel apply (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Fri, Oct 21, 2022 at 6:32 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, October 20, 2022 5:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Oct 20, 2022 at 2:08 PM Peter Smith <smithpb2250@gmail.com>
> > wrote:
> > >
> > > 7. get_transaction_apply_action
> > >
> > > > 12. get_transaction_apply_action
> > > >
> > > > I still felt like there should be some tablesync checks/comments in
> > > > this function, just for sanity, even if it works as-is now.
> > > >
> > > > For example, are you saying ([3] #22b) that there might be rare
> > > > cases where a Tablesync would call to parallel_apply_find_worker?
> > > > That seems strange, given that "for streaming transactions that are
> > > > being applied in the parallel ... we disallow applying changes on a
> > > > table that is not in the READY state".
> > > >
> > > > ------
> > >
> > > Houz wrote [2] -
> > >
> > > I think because we don't try to start a parallel apply worker in the
> > > table sync worker (see the check in parallel_apply_can_start()), we
> > > won't find any worker in parallel_apply_find_worker(), which means
> > > get_transaction_apply_action will return TRANS_LEADER_SERIALIZE. And
> > > get_transaction_apply_action is a function which can be invoked by
> > > all kinds of workers (the same is true for all apply_handle_xxx
> > > functions), so I'm not sure a table sync check/comment is necessary.
> > >
> > > ~
> > >
> > > Sure, and I believe you when you say it all works OK - but IMO there
> > > is something still not quite right with the current code. For
> > > example:
> > >
> > > e.g.1 The function will return TRANS_LEADER_SERIALIZE for a Tablesync
> > > worker, and yet the comment for TRANS_LEADER_SERIALIZE says it "means
> > > that we are in the leader apply worker" (except we are not).
> > >
> > > e.g.2 We know for a fact that Tablesync workers cannot start their
> > > own parallel apply workers, so why do we even let the Tablesync
> > > worker make a call to parallel_apply_find_worker(), looking for
> > > something we know will not be found?
> > >
> >
> > I don't see much benefit in adding an additional check for tablesync workers
> > here. It will unnecessarily make this part of the code look a bit ugly.
>
> Thanks for the review. Here is the new version of the patch set, which
> addresses Peter's[1] and Kuroda-san's[2] comments.

I've started to review this patch. I tested the v40-0001 patch and have
one question:

IIUC, even when most of the changes in a transaction are filtered out
in pgoutput (e.g., by a relation filter or row filter), the walsender
still sends STREAM_START. This means that the subscriber could end up
launching parallel apply workers even for almost-empty (but streamed)
transactions. For example, I created three subscriptions, each of which
subscribes to a different table. When I loaded a large amount of data
into one table, all three (leader) apply workers received STREAM_START
and launched their parallel apply workers, but two of them finished
without applying any data. I think this behaviour is problematic since
it wastes workers and can even decrease the apply performance when the
changes are not large. Is it worth considering a way to delay launching
a parallel apply worker until we find that the amount of changes is
actually large? For example, the leader worker writes the streamed
changes to files as usual and launches a parallel worker once the
amount of changes exceeds a threshold or the leader receives the second
segment. After that, the leader worker switches to sending the streamed
changes to the parallel worker via shm_mq instead of files.
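
A rough sketch of the idea for the leader apply worker is below. This
is purely illustrative; bytes_written_for_xact(),
stream_bytes_threshold, and switch_streaming_to_parallel_worker() are
hypothetical placeholders, not existing functions or settings:

static void
maybe_switch_to_parallel_apply(TransactionId xid, bool first_segment)
{
    /*
     * Keep serializing small streamed transactions to files; only pay the
     * cost of a parallel apply worker once the transaction turns out to be
     * large (size threshold exceeded, or a second stream segment arrives).
     */
    if (bytes_written_for_xact(xid) > stream_bytes_threshold || !first_segment)
    {
        /*
         * Launch (or take from the pool) a parallel apply worker, have it
         * replay the changes already serialized to the file, and route the
         * remaining changes to it via shm_mq.
         */
        switch_streaming_to_parallel_worker(xid);
    }
}

The exact threshold and where to hook this in would need discussion;
the point is only to defer the launch decision until the transaction
has proven to be large.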

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


