Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | CAHut+Pv3ow1pmQZ5MQme8Zgie_QsD0_B-BCrRi7pJt-32rSbSQ@mail.gmail.com Whole thread Raw |
In response to | RE: row filtering for logical replication ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>) |
Responses |
Re: row filtering for logical replication
RE: row filtering for logical replication Re: row filtering for logical replication |
List | pgsql-hackers |
Thanks for posting the merged v63. Here are my review comments for the v63-0001 changes. ~~~ 1. src/backend/replication/logical/proto.c - logicalrep_write_tuple TupleDesc desc; - Datum values[MaxTupleAttributeNumber]; - bool isnull[MaxTupleAttributeNumber]; + Datum *values; + bool *isnull; int i; uint16 nliveatts = 0; Those separate declarations for values / isnull are not strictly needed anymore, so those vars could be deleted. IIRC those were only added before when there were both slots and tuples. OTOH, maybe you prefer to keep it this way just for code readability? ~~~ 2. src/backend/replication/pgoutput/pgoutput.c - typedef +typedef enum RowFilterPubAction +{ + PUBACTION_INSERT, + PUBACTION_UPDATE, + PUBACTION_DELETE, + NUM_ROWFILTER_PUBACTIONS /* must be last */ +} RowFilterPubAction; This typedef is not currently used by any of the code. So I think choices are: - Option 1: remove the typedef, because nobody is using it. - Option 2: keep the typedef, but use it! e.g. everywhere there is an exprstate array index variable probably it should be declared as a 'RowFilterPubAction idx' instead of just 'int idx'. I prefer option 2, but YMMV. ~~~ 3. src/backend/replication/pgoutput/pgoutput.c - map_changetype_pubaction After this recent v63 refactoring and merging of some APIs it seems that the map_changetype_pubaction is now ONLY used by pgoutput_row_filter function. So this can now be a static member of pgoutput_row_filter function instead of being declared at file scope. ~~~ 4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter comments + +/* + * Change is checked against the row filter, if any. + * + * If it returns true, the change is replicated, otherwise, it is not. + * + * FOR INSERT: evaluates the row filter for new tuple. + * FOR DELETE: evaluates the row filter for old tuple. + * For UPDATE: evaluates the row filter for old and new tuple. If both + * evaluations are true, it sends the UPDATE. If both evaluations are false, it + * doesn't send the UPDATE. If only one of the tuples matches the row filter + * expression, there is a data consistency issue. Fixing this issue requires a + * transformation. + * + * Transformations: + * Updates are transformed to inserts and deletes based on the + * old tuple and new tuple. The new action is updated in the + * action parameter. If not updated, action remains as update. + * + * Case 1: old-row (no match) new-row (no match) -> (drop change) + * Case 2: old-row (no match) new row (match) -> INSERT + * Case 3: old-row (match) new-row (no match) -> DELETE + * Case 4: old-row (match) new row (match) -> UPDATE + * + * If the change is to be replicated this function returns true, else false. + * + * Examples: The function header comment says the same thing 2x about the return values. The 1st text "If it returns true, the change is replicated, otherwise, it is not." should be replaced by the better wording of the 2nd text ("If the change is to be replicated this function returns true, else false."). Then, that 2nd text can be removed (from where it is later in this same comment). ~~~ 5. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter + ExprContext *ecxt; + int filter_index = map_changetype_pubaction[*action]; + + /* *action is already assigned default by caller */ + Assert(*action == REORDER_BUFFER_CHANGE_INSERT || + *action == REORDER_BUFFER_CHANGE_UPDATE || + *action == REORDER_BUFFER_CHANGE_DELETE); + Accessing the map_changetype_pubaction array should be done *after* the Assert. ~~~ 6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter Actually, instead of assigning the filter_insert and then referring to entry->exprstate[filter_index] in multiple places, now the code might be neater if we simply assign a local variable “filter_exprstate” like below and use that instead. ExprState *filter_exprstate; ... filter_exprstate = entry->exprstate[map_changetype_pubaction[*action]]; Please consider what way you think is best. ~~~ 7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter + /* + * For the following occasions where there is only one tuple, we can + * evaluates the row filter for the not null tuple and return. + * + * For INSERT: we only have new tuple. + * + * For UPDATE: if no old tuple, it means none of the replica identity + * columns changed and this would reduce to a simple update. we only need + * to evaluate the row filter for new tuple. + * + * FOR DELETE: we only have old tuple. + */ There are several things not quite right with that comment: a. I thought now it should refer to "slots" instead of "tuples" b. Some of the upper/lowercase is wonky c. Maybe it reads better without the ":" Suggested replacement comment: /* * For the following occasions where there is only one slot, we can * evaluates the row filter for the not-null slot and return. * * For INSERT we only have the new slot. * * For UPDATE if no old slot, it means none of the replica identity * columns changed and this would reduce to a simple update. We only need * to evaluate the row filter for the new slot. * * For DELETE we only have the old slot. */ ~~~ 8. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter + if (!new_slot || !old_slot) + { + ecxt->ecxt_scantuple = new_slot ? new_slot : old_slot; + result = pgoutput_row_filter_exec_expr(entry->exprstate[filter_index], + ecxt); + + FreeExecutorState(estate); + PopActiveSnapshot(); + + return result; + } + + tmp_new_slot = new_slot; + slot_getallattrs(new_slot); + slot_getallattrs(old_slot); I think after this "if" condition then the INSERT, DELETE and simple UPDATE are already handled. So, the remainder of the code is for deciding what update transformation is needed etc. I think there should be some block comment somewhere here to make that more obvious. ~~~ 9. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter Many of the comments in this function are still referring to old/new "tuple". Now that all the params are slots instead of tuples maybe now all the comments should also refer to "slots" instead of "tuples". Please search all the comments - e.g. including all the "Case 1:" ... "Case 4:" comments. ~~~ 10. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init + int idx_ins = PUBACTION_INSERT; + int idx_upd = PUBACTION_UPDATE; + int idx_del = PUBACTION_DELETE; These variables are unnecessary now... They previously were added only as short synonyms because the other enum names were too verbose (e.g. REORDER_BUFFER_CHANGE_INSERT) but now that we have shorter enum names like PUBACTION_INSERT we can just use those names directly ~~~ 11. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init I felt that the code would seem more natural if the pgoutput_row_filter_init function came *before* the pgoutput_row_filter function in the source code. ~~~ 12. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change @@ -634,6 +1176,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn, RelationSyncEntry *relentry; TransactionId xid = InvalidTransactionId; Relation ancestor = NULL; + ReorderBufferChangeType modified_action = change->action; + TupleTableSlot *old_slot = NULL; + TupleTableSlot *new_slot = NULL; It seemed a bit misleading to me to call this variable 'modified_action' since mostly it is not modified at all. IMO it is better just to call this as 'action' but then add a comment (above the "switch (modified_action)") to say the previous call to pgoutput_row_filter may have transformed it to a different action. ~~~ 13. src/tools/pgindent/typedefs.list - RowFilterPubAction If you choose to keep the typedef for RowFilterPubAction (ref to comment #1) then it should also be added to the typedefs.list. ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: