RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From wangw.fnst@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id OS3PR01MB6275595120E69924BAE8BE219E9F9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
On Thurs, Jul 28, 2022 at 13:20 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
>
> Dear Wang-san,
> 
> Hi, I'm also interested in the patch and I started to review this.
> Followings are comments about 0001.

Thanks for your kindly review and comments.
To avoid making this thread too long, I will reply to all of your comments
(#1~#13) in this email.

> 1. terminology
> 
> In your patch a new worker "apply background worker" has been introduced,
> but I thought it might be confused because PostgreSQL has already the worker
> "background worker".
> Both of apply worker and apply bworker are categolized as bgworker.
> Do you have any reasons not to use "apply parallel worker" or "apply streaming
> worker"?
> (Note that I'm not native English speaker)

Since we will later consider applying non-streamed transactions in parallel, I
think "apply streaming worker" might not be very suitable. I think PostgreSQL
also has the worker "parallel worker", so for "apply parallel worker" and
"apply background worker", I feel that "apply background worker" will make the
relationship between workers more clear. ("[main] apply worker" and "apply
background worker")

> 2. logicalrep_worker_stop()
> 
> ```
> -       /* No worker, nothing to do. */
> -       if (!worker)
> -       {
> -               LWLockRelease(LogicalRepWorkerLock);
> -               return;
> -       }
> +       if (worker)
> +               logicalrep_worker_stop_internal(worker);
> +
> +       LWLockRelease(LogicalRepWorkerLock);
> +}
> ```
> 
> I thought you could add a comment the meaning of if-statement, like "No main
> apply worker, nothing to do"

Since the processing in the if statement is reversed from before, I added the
following comment based on your suggestion:
```
Found the main worker, then try to stop it.
```

> 3. logicalrep_workers_find()
> 
> I thought you could add a description about difference between this and
> logicalrep_worker_find() at the top of the function.
> IIUC logicalrep_workers_find() counts subworker, but logicalrep_worker_find()
> does not focus such type of workers.

I think it is fine to keep the comment because the comment says "returns list
of *all workers* for the subscription".
Also, we have added the comment "We are only interested in the main apply
worker or table sync worker here" in the function logicalrep_worker_find.

> 5. applybgworker.c
> 
> ```
> +/* Apply background workers hash table (initialized on first use) */
> +static HTAB *ApplyWorkersHash = NULL;
> +static List *ApplyWorkersFreeList = NIL;
> +static List *ApplyWorkersList = NIL;
> ```
> 
> I thought they should be ApplyBgWorkersXXX, because they stores information
> only related with apply bgworkers.

I improved them to ApplyBgworkersXXX just for the consistency with other names.

> 6. ApplyBgworkerShared
> 
> ```
> +       TransactionId   stream_xid;
> +       uint32  n;      /* id of apply background worker */
> +} ApplyBgworkerShared;
> ```
> 
> I thought the field "n" is too general, how about "proc_id" or "worker_id"?

I think "worker_id" seems better, so I improved "n" to "worker_id".

> 10. wait_event.h
> 
> ```
>         WAIT_EVENT_HASH_GROW_BUCKETS_REINSERT,
> +       WAIT_EVENT_LOGICAL_APPLY_WORKER_STATE_CHANGE,
>         WAIT_EVENT_LOGICAL_SYNC_DATA,
> ```
> 
> I thought the event should be
> WAIT_EVENT_LOGICAL_APPLY_BG_WORKER_STATE_CHANGE,
> because this is used when apply worker waits until the status of bgworker
> changes.

I improved them to "WAIT_EVENT_LOGICAL_APPLY_BGWORKER_STATE_CHANGE" just for
the consistency with other names.

> 13. 015_stream.pl
> 
> I could not find test about TRUNCATE. IIUC apply bgworker works well
> even if it gets LOGICAL_REP_MSG_TRUNCATE message from main worker.
> Can you add the case?

I modified the test cases in "032_streaming_apply.pl" this time, the use case
you mentioned is covered now.

The rest of the comments are improved as suggested.
The new patches were attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275D64BE7726B0221B15F389E9F9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply