Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAA4eK1+1m5wDroXiWZXH+t14caN6fSXocjaOanhWueS7S-zM4g@mail.gmail.com
Whole thread Raw
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Dec 2, 2022 at 2:29 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> 3. pa_setup_dsm
>
> +/*
> + * Set up a dynamic shared memory segment.
> + *
> + * We set up a control region that contains a fixed-size worker info
> + * (ParallelApplyWorkerShared), a message queue, and an error queue.
> + *
> + * Returns true on success, false on failure.
> + */
> +static bool
> +pa_setup_dsm(ParallelApplyWorkerInfo *winfo)
>
> IMO that's confusing to say "fixed-sized worker info" when it's
> referring to the ParallelApplyWorkerShared structure and not the other
> ParallelApplyWorkerInfo.
>
> Might be better to say:
>
> "a fixed-size worker info (ParallelApplyWorkerShared)" -> "a
> fixed-size struct (ParallelApplyWorkerShared)"
>
> ~~~
>

I find the existing wording better than what you are proposing. We can
remove the structure name if you think that is better but IMO, current
wording is good.

>
> 6. pa_free_worker_info
>
> + /*
> + * Ensure this worker information won't be reused during worker
> + * allocation.
> + */
> + ParallelApplyWorkersList = list_delete_ptr(ParallelApplyWorkersList,
> +    winfo);
>
> SUGGESTION 1
> Removing from the worker pool ensures this information won't be reused
> during worker allocation.
>
> SUGGESTION 2 (more simply)
> Remove from the worker pool.
>

+1 for the second suggestion.

> ~~~
>
> 7. HandleParallelApplyMessage
>
> + /*
> + * The actual error must have been reported by the parallel
> + * apply worker.
> + */
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("logical replication parallel apply worker exited abnormally"),
> + errcontext("%s", edata.context)));
>
> Maybe it's better to remove the comment, but replace it with an
> errhint that tells the user "For the cause of this error see the error
> logged by the logical replication parallel apply worker."
>

I am not sure if such an errhint is a good idea, anyway, I think both
the errors will be adjacent in the LOGs unless there is some other
error in the short span.
> ~~~
>
> 17. apply_handle_stream_stop
>
> + case TRANS_PARALLEL_APPLY:
> + elog(DEBUG1, "applied %u changes in the streaming chunk",
> + parallel_stream_nchanges);
> +
> + /*
> + * By the time parallel apply worker is processing the changes in
> + * the current streaming block, the leader apply worker may have
> + * sent multiple streaming blocks. This can lead to parallel apply
> + * worker start waiting even when there are more chunk of streams
> + * in the queue. So, try to lock only if there is no message left
> + * in the queue. See Locking Considerations atop
> + * applyparallelworker.c.
> + */
>
> SUGGESTION (minor rewording)
>
> By the time the parallel apply worker is processing the changes in the
> current streaming block, the leader apply worker may have sent
> multiple streaming blocks. To the parallel apply from waiting
> unnecessarily, try to lock only if there is no message left in the
> queue. See Locking Considerations atop applyparallelworker.c.
>

I have proposed the additional line (This can lead to parallel apply
worker start waiting even when there are more chunk of streams in the
queue.) because it took me some time to understand this particular
scenario.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Niyas Sait
Date:
Subject: Re: [PATCH] Add native windows on arm64 support
Next
From: Alvaro Herrera
Date:
Subject: Re: ExecRTCheckPerms() and many prunable partitions