Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAHut+Psf3QaOvso2-4T3kBpvfKCoksXi3tUf4XHhwD6-9awKtA@mail.gmail.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
List pgsql-hackers
Here are my review comments for the patch v23-0003:

======

3.1. src/backend/replication/logical/applybgworker.c -
apply_bgworker_relation_check

+ * Although the commit order is maintained by only allowing one process to
+ * commit at a time, the access order to the relation has changed. This could
+ * cause unexpected problems if the unique column on the replicated table is
+ * inconsistent with the publisher-side or contains non-immutable functions
+ * when applying transactions using an apply background worker.
+ */
+void
+apply_bgworker_relation_check(LogicalRepRelMapEntry *rel)

I’m not sure, but should that second sentence be rearranged as follows?

SUGGESTION
This could cause unexpected problems when applying transactions using
an apply background worker if the unique column on the replicated
table is inconsistent with the publisher-side, or if the relation
contains non-immutable functions.

~~~

3.2.

+ if (!am_apply_bgworker() &&
+ (list_length(ApplyBgworkersFreeList) == list_length(ApplyBgworkersList)))
+ return;

Previously I posted I was struggling to understand the above
condition, and then it was explained (see [1] comment #4) that:
> We need to check this for apply bgworker. (Both lists are "NIL" in apply bgworker.)

I think that information should be included in the code comment.

======

3.3. src/include/replication/logicalrelation.h

+/*
+ * States to determine if changes on one relation can be applied using an
+ * apply background worker.
+ */
+typedef enum ParallelApplySafety
+{
+ PARALLEL_APPLY_UNKNOWN = 0,
+ PARALLEL_APPLY_SAFE,
+ PARALLEL_APPLY_UNSAFE
+} ParallelApplySafety;
+

3.3a.
The enum value PARALLEL_APPLY_UNKNOWN doesn't really mean anything.
Maybe naming it PARALLEL_APPLY_SAFETY_UNKNOWN gives it the intended
meaning.

3.3b.
+ PARALLEL_APPLY_UNKNOWN = 0,
I didn't see any reason to explicitly assign this to 0.

~~~

3.4. src/include/replication/logicalrelation.h

@@ -31,6 +42,8 @@ typedef struct LogicalRepRelMapEntry
  Relation localrel; /* relcache entry (NULL when closed) */
  AttrMap    *attrmap; /* map of local attributes to remote ones */
  bool updatable; /* Can apply updates/deletes? */
+ ParallelApplySafety parallel_apply; /* Can apply changes in an apply
+

(Similar to above comment #3.3a)

The member name 'parallel_apply' doesn't really mean anything. Perhaps
renaming this to 'parallel_apply_safe' or 'parallel_safe' etc will
give it the intended meaning.

------
[1]
https://www.postgresql.org/message-id/OS3PR01MB6275739E73E8BEC5D13FB6739E6B9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: shadow variables - pg15 edition
Next
From: Thomas Munro
Date:
Subject: Re: Cleaning up historical portability baggage