Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAA4eK1+z=xGfC11u_H6tWiaeNCo3g0hW3+5k8RBzYTR6avQ4UA@mail.gmail.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
On Mon, Nov 7, 2022 at 6:49 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, November 4, 2022 7:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > 3.
> > apply_handle_stream_start(StringInfo s)
> > {
> > ...
> > + if (!first_segment)
> > + {
> > + /*
> > + * Unlock the shared object lock so that parallel apply worker
> > + * can continue to receive and apply changes.
> > + */
> > + parallel_apply_unlock(winfo->shared->stream_lock_id);
> > ...
> > }
> >
> > Can we have an assert before this unlock call that the lock must be
> > held? Similarly, if there are other places then we can have assert
> > there as well.
>
> It seems we don't have a standard API can be used without a transaction.
> Maybe we can use the list ParallelApplyLockids to check that ?
>

Yeah, that occurred to me as well but I am not sure if it is a good
idea to maintain this list just for assertion but if it turns out that
we need to maintain it for a different purpose then we can probably
use it for assert as well.

Few other comments/questions:
=========================
1.
apply_handle_stream_start(StringInfo s)
{
...

+ case TRANS_PARALLEL_APPLY:
...
...
+ /*
+ * Unlock the shared object lock so that the leader apply worker
+ * can continue to send changes.
+ */
+ parallel_apply_unlock(MyParallelShared->stream_lock_id, AccessShareLock);

As per the design in the email [1], this lock needs to be released by
the leader worker during stream start which means it should be
released under the state TRANS_LEADER_SEND_TO_PARALLEL. From the
comments as well, it is not clear to me why at this time leader is
supposed to be blocked. Is there a reason for doing differently than
what is proposed in the original design?

2. Similar to above, it is not clear why the parallel worker needs to
release the stream_lock_id lock at stream_commit and stream_prepare?

3. Am, I understanding correctly that you need to lock/unlock in
apply_handle_stream_abort() for the parallel worker because after
rollback to savepoint, there could be another set of stream or
transaction end commands for which you want to wait? If so, maybe an
additional comment would serve the purpose.

4.
The leader may have sent multiple streaming blocks in the queue
+ * When the child is processing a streaming block. So only try to
+ * lock if there is no message left in the queue.

Let's slightly reword this to: "By the time child is processing the
changes in the current streaming block, the leader may have sent
multiple streaming blocks. So, try to lock only if there is no message
left in the queue."

5.
+parallel_apply_unlock(uint16 lockid, LOCKMODE lockmode)
+{
+ if (!list_member_int(ParallelApplyLockids, lockid))
+ return;
+
+ UnlockSharedObjectForSession(SubscriptionRelationId, MySubscription->oid,
+ lockid, am_leader_apply_worker() ?
+ AccessExclusiveLock:
+ AccessShareLock);

This function should use lockmode argument passed rather than deciding
based on am_leader_apply_worker. I think this is anyway going to
change if we start using a different locktag as discussed in one of
the above emails.

6.
+
 /*
  * Common spoolfile processing.
  */
-static void
-apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
+void
+apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,

Seems like a spurious line addition.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Andrey Lepikhov
Date:
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Next
From: "Daniel Verite"
Date:
Subject: Re: psql: Add command to use extended query protocol