RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date
Msg-id TYAPR01MB5866D84F19A96917CDA4B509F527A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Melih Mutlu <m.melihmutlu@gmail.com>)
List pgsql-hackers
Dear Melih,

Thanks for updating the patch. Followings are my comments.
Note that some lines exceeds 80 characters and some other lines seem too short.
And comments about coding conventions were skipped.

0001

01. logicalrep_worker_launch()

```
        if (is_parallel_apply_worker)
+       {
                snprintf(bgw.bgw_function_name, BGW_MAXLEN, "ParallelApplyWorkerMain");
-       else
-               snprintf(bgw.bgw_function_name, BGW_MAXLEN, "ApplyWorkerMain");
-
-       if (OidIsValid(relid))
                snprintf(bgw.bgw_name, BGW_MAXLEN,
-                                "logical replication worker for subscription %u sync %u", subid, relid);
-       else if (is_parallel_apply_worker)
+                                "logical replication parallel apply worker for subscription %u", subid);
                snprintf(bgw.bgw_name, BGW_MAXLEN,
                                 "logical replication parallel apply worker for subscription %u", subid);
```

Latter snprintf(bgw.bgw_name...) should be snprintf(bgw.bgw_type, BGW_MAXLEN, "logical replication worker").

02. ApplyWorkerMain

```
        /*
         * Setup callback for syscache so that we know when something changes in
-        * the subscription relation state.
+        * the subscription relation state. Do this outside the loop to avoid
+        * exceeding MAX_SYSCACHE_CALLBACKS
         */
```

I'm not sure this change is really needed. CacheRegisterSyscacheCallback() must
be outside the loop to avoid duplicated register, and it seems trivial.

0002

03. TablesyncWorkerMain()

Regarding the inner loop, the exclusive lock is acquired even if the rstate is
SUBREL_STATE_SYNCDONE. Moreover, palloc() and memcpy() for rstate seemsed not
needed. How about following?

```
        for (;;)
        {
                List       *rstates;
-               SubscriptionRelState *rstate;
                ListCell   *lc;
...
-                       rstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
 
                        foreach(lc, rstates)
                        {
-                               memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
+                               SubscriptionRelState *rstate =
+                                                                               (SubscriptionRelState *) lfirst(lc);
+
+                               if (rstate->state == SUBREL_STATE_SYNCDONE)
+                                       continue;

                                /*
-                               * Pick the table for the next run if it is not already picked up
-                               * by another worker.
-                               *
-                               * Take exclusive lock to prevent any other sync worker from picking
-                               * the same table.
-                               */
+                                * Take exclusive lock to prevent any other sync worker from
+                                * picking the same table.
+                                */
                                LWLockAcquire(LogicalRepWorkerLock, LW_EXCLUSIVE);
-                               if (rstate->state != SUBREL_STATE_SYNCDONE &&
-                                       !logicalrep_worker_find(MySubscription->oid, rstate->relid, false))
+
+                               /*
+                                * Pick the table for the next run if it is not already picked up
+                                * by another worker.
+                                */
+                               if (!logicalrep_worker_find(MySubscription->oid,
+                                                                                       rstate->relid, false))
```

04. TablesyncWorkerMain

I think rstates should be pfree'd at the end of the outer loop, but it's OK
if other parts do not.

05. repsponse for for post

>
I tried to move the logicalrep_worker_wakeup call from
clean_sync_worker (end of an iteration) to finish_sync_worker (end of
sync worker). I made table sync much slower for some reason, then I
reverted that change. Maybe I should look a bit more into the reason
why that happened some time.
>

I want to see the testing method to reproduce the same issue, could you please
share it to -hackers?

0003, 0004

I did not checked yet but I could say same as above:
I want to see the testing method to reproduce the same issue.
Could you please share it to -hackers?
My previous post (an approach for reuse connection) may help the performance.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


pgsql-hackers by date:

Previous
From: "Tristan Partin"
Date:
Subject: Re: Make pgbench exit on SIGINT more reliably
Next
From: "Tristan Partin"
Date:
Subject: Make uselocale protection more consistent