Re: row filtering for logical replication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: row filtering for logical replication
Date
Msg-id CAHut+Pv3ow1pmQZ5MQme8Zgie_QsD0_B-BCrRi7pJt-32rSbSQ@mail.gmail.com
Whole thread Raw
In response to RE: row filtering for logical replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: row filtering for logical replication
RE: row filtering for logical replication
Re: row filtering for logical replication
List pgsql-hackers
Thanks for posting the merged v63.

Here are my review comments for the v63-0001 changes.

~~~

1. src/backend/replication/logical/proto.c - logicalrep_write_tuple

  TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum    *values;
+ bool    *isnull;
  int i;
  uint16 nliveatts = 0;

Those separate declarations for values / isnull are not strictly
needed anymore, so those vars could be deleted. IIRC those were only
added before when there were both slots and tuples. OTOH, maybe you
prefer to keep it this way just for code readability?

~~~

2. src/backend/replication/pgoutput/pgoutput.c - typedef

+typedef enum RowFilterPubAction
+{
+ PUBACTION_INSERT,
+ PUBACTION_UPDATE,
+ PUBACTION_DELETE,
+ NUM_ROWFILTER_PUBACTIONS  /* must be last */
+} RowFilterPubAction;

This typedef is not currently used by any of the code.

So I think choices are:

- Option 1: remove the typedef, because nobody is using it.

- Option 2: keep the typedef, but use it! e.g. everywhere there is an
exprstate array index variable probably it should be declared as a
'RowFilterPubAction idx' instead of just 'int idx'.

I prefer option 2, but YMMV.

~~~

3. src/backend/replication/pgoutput/pgoutput.c - map_changetype_pubaction

After this recent v63 refactoring and merging of some APIs it seems
that the map_changetype_pubaction is now ONLY used by
pgoutput_row_filter function. So this can now be a static member of
pgoutput_row_filter function instead of being declared at file scope.

~~~

4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter comments

+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ *
+ * FOR INSERT: evaluates the row filter for new tuple.
+ * FOR DELETE: evaluates the row filter for old tuple.
+ * For UPDATE: evaluates the row filter for old and new tuple. If both
+ * evaluations are true, it sends the UPDATE. If both evaluations are false, it
+ * doesn't send the UPDATE. If only one of the tuples matches the row filter
+ * expression, there is a data consistency issue. Fixing this issue requires a
+ * transformation.
+ *
+ * Transformations:
+ * Updates are transformed to inserts and deletes based on the
+ * old tuple and new tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ *
+ * Case 1: old-row (no match)    new-row (no match)  -> (drop change)
+ * Case 2: old-row (no match)    new row (match)     -> INSERT
+ * Case 3: old-row (match)       new-row (no match)  -> DELETE
+ * Case 4: old-row (match)       new row (match)     -> UPDATE
+ *
+ * If the change is to be replicated this function returns true, else false.
+ *
+ * Examples:

The function header comment says the same thing 2x about the return values.

The 1st text "If it returns true, the change is replicated, otherwise,
it is not." should be replaced by the better wording of the 2nd text
("If the change is to be replicated this function returns true, else
false."). Then, that 2nd text can be removed (from where it is later
in this same comment).

~~~

5. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ ExprContext    *ecxt;
+ int filter_index = map_changetype_pubaction[*action];
+
+ /* *action is already assigned default by caller */
+ Assert(*action == REORDER_BUFFER_CHANGE_INSERT ||
+    *action == REORDER_BUFFER_CHANGE_UPDATE ||
+    *action == REORDER_BUFFER_CHANGE_DELETE);
+

Accessing the map_changetype_pubaction array should be done *after* the Assert.

~~~

6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

Actually, instead of assigning the filter_insert and then referring to
entry->exprstate[filter_index] in multiple places, now the code might
be neater if we simply assign a local variable “filter_exprstate” like
below and use that instead.

ExprState *filter_exprstate;
...
filter_exprstate = entry->exprstate[map_changetype_pubaction[*action]];

Please consider what way you think is best.

~~~

7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ /*
+ * For the following occasions where there is only one tuple, we can
+ * evaluates the row filter for the not null tuple and return.
+ *
+ * For INSERT: we only have new tuple.
+ *
+ * For UPDATE: if no old tuple, it means none of the replica identity
+ * columns changed and this would reduce to a simple update. we only need
+ * to evaluate the row filter for new tuple.
+ *
+ * FOR DELETE: we only have old tuple.
+ */

There are several things not quite right with that comment:
a. I thought now it should refer to "slots" instead of "tuples"
b. Some of the upper/lowercase is wonky
c. Maybe it reads better without the ":"

Suggested replacement comment:

/*
* For the following occasions where there is only one slot, we can
* evaluates the row filter for the not-null slot and return.
*
* For INSERT we only have the  new slot.
*
* For UPDATE if no old slot, it means none of the replica identity
* columns changed and this would reduce to a simple update. We only need
* to evaluate the row filter for the new slot.
*
* For DELETE we only have the old slot.
*/

~~~

8. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ if (!new_slot || !old_slot)
+ {
+ ecxt->ecxt_scantuple = new_slot ? new_slot : old_slot;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[filter_index],
+    ecxt);
+
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+ }
+
+ tmp_new_slot = new_slot;
+ slot_getallattrs(new_slot);
+ slot_getallattrs(old_slot);

I think after this "if" condition then the INSERT, DELETE and simple
UPDATE are already handled. So, the remainder of the code is for
deciding what update transformation is needed etc.

I think there should be some block comment somewhere here to make that
more obvious.

~~~

9. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

Many of the comments in this function are still referring to old/new
"tuple". Now that all the params are slots instead of tuples maybe now
all the comments should also refer to "slots" instead of "tuples".
Please search all the comments - e.g. including all the "Case 1:" ...
"Case 4:" comments.

~~~

10. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init

+ int idx_ins = PUBACTION_INSERT;
+ int idx_upd = PUBACTION_UPDATE;
+ int idx_del = PUBACTION_DELETE;

These variables are unnecessary now... They previously were added only
as short synonyms because the other enum names were too verbose (e.g.
REORDER_BUFFER_CHANGE_INSERT) but now that we have shorter enum names
like PUBACTION_INSERT we can just use those names directly

~~~

11. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init

I felt that the code would seem more natural if the
pgoutput_row_filter_init function came *before* the
pgoutput_row_filter function in the source code.

~~~

12. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change

@@ -634,6 +1176,9 @@ pgoutput_change(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
  RelationSyncEntry *relentry;
  TransactionId xid = InvalidTransactionId;
  Relation ancestor = NULL;
+ ReorderBufferChangeType modified_action = change->action;
+ TupleTableSlot *old_slot = NULL;
+ TupleTableSlot *new_slot = NULL;

It seemed a bit misleading to me to call this variable
'modified_action' since mostly it is not modified at all.

IMO it is better just to call this as 'action' but then add a comment
(above the "switch (modified_action)") to say the previous call to
pgoutput_row_filter may have transformed it to a different action.

~~~

13. src/tools/pgindent/typedefs.list - RowFilterPubAction

If you choose to keep the typedef for RowFilterPubAction (ref to
comment #1) then it should also be added to the typedefs.list.


------
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: ExecRTCheckPerms() and many prunable partitions
Next
From: "tanghy.fnst@fujitsu.com"
Date:
Subject: RE: [PATCH]Add tab completion for foreigh table