Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAA4eK1JA-=pSNmeCsjegnQBb00Q3U4ZovGbfHqK_V7R527QMQQ@mail.gmail.com
Whole thread Raw
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Mon, Jan 16, 2023 at 10:24 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> 2.
>
>  /*
> + * Return the pid of the leader apply worker if the given pid is the pid of a
> + * parallel apply worker, otherwise return InvalidPid.
> + */
> +pid_t
> +GetLeaderApplyWorkerPid(pid_t pid)
> +{
> + int leader_pid = InvalidPid;
> + int i;
> +
> + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> +
> + for (i = 0; i < max_logical_replication_workers; i++)
> + {
> + LogicalRepWorker *w = &LogicalRepCtx->workers[i];
> +
> + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid)
> + {
> + leader_pid = w->leader_pid;
> + break;
> + }
> + }
> +
> + LWLockRelease(LogicalRepWorkerLock);
> +
> + return leader_pid;
> +}
>
> 2a.
> IIUC the IsParallelApplyWorker macro does nothing except check that
> the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does
> not benefit from using this macro because we will want to return
> InvalidPid anyway if the given pid matches.
>
> So the inner condition can just say:
>
> if (w->proc && w->proc->pid == pid)
> {
> leader_pid = w->leader_pid;
> break;
> }
>

Yeah, this should also work but I feel the current one is explicit and
more clear.

> ~
>
> 2b.
> A possible alternative comment.
>
> BEFORE
> Return the pid of the leader apply worker if the given pid is the pid
> of a parallel apply worker, otherwise return InvalidPid.
>
>
> AFTER
> If the given pid has a leader apply worker then return the leader pid,
> otherwise, return InvalidPid.
>

I don't think that is an improvement.

> ======
>
> src/backend/utils/adt/pgstatfuncs.c
>
> 3.
>
> @@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
>   values[28] = Int32GetDatum(leader->pid);
>   nulls[28] = false;
>   }
> + else
> + {
> + int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid);
> +
> + if (leader_pid != InvalidPid)
> + {
> + values[28] = Int32GetDatum(leader_pid);
> + nulls[28] = false;
> + }
> +
>
> 3a.
> There is an existing comment preceding this if/else but it refers only
> to leaders of parallel groups. Should that comment be updated to
> mention the leader apply worker too?
>

Yeah, we can slightly adjust the comments. How about something like the below:
index 415e711729..7eb668634a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -410,9 +410,9 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)

                        /*
                         * If a PGPROC entry was retrieved, display
wait events and lock
-                        * group leader information if any.  To avoid
extra overhead, no
-                        * extra lock is being held, so there is no guarantee of
-                        * consistency across multiple rows.
+                        * group leader or apply leader information if
any.  To avoid extra
+                        * overhead, no extra lock is being held, so
there is no guarantee
+                        * of consistency across multiple rows.
                         */
                        if (proc != NULL)
                        {
@@ -428,7 +428,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
                                /*
                                 * Show the leader only for active
parallel workers.  This
                                 * leaves the field as NULL for the
leader of a parallel
-                                * group.
+                                * group or the leader of a parallel apply.
                                 */
                                if (leader && leader->pid !=
beentry->st_procpid)


> ~
>
> 3b.
> It may be unrelated to this patch, but it seems strange to me that the
> nulls[28]/values[28] assignments are done where they are. Every other
> nulls/values assignment of this function here is pretty much in the
> correct numerical order except this one, so IMO this code ought to be
> relocated to later in this same function.
>

This is not related to the current patch but I see there is merit in
the current coding as it is better to retrieve all the fields of proc
together.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: If there are more than two functions in different schemas, the functions have the same name and same arguments, \df[+] only display the function that schema first appeared in the search_path.
Next
From: Zhang Mingli
Date:
Subject: Re: Code review in dsa.c