Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding
Date
Msg-id CAHut+Pt+wci+UrOmqjq7W4gv_WPF5y9HN-sGTjBQLTz_6zqpbw@mail.gmail.com
Whole thread Raw
In response to Proposal: Filter irrelevant change before reassemble transactions during logical decoding  (li jie <ggysxcq@gmail.com>)
List pgsql-hackers
Hi Ajin,

Some review comments for v14-0002.

======
src/backend/replication/logical/decode.c

1.
There is lots of nearly duplicate code checking to see if a change is filterable

DecodeInsert:
+ /*
+ * When filtering changes, determine if the relation associated with the change
+ * can be skipped. This could be because the relation is unlogged or because
+ * the plugin has opted to exclude this relation from decoding.
+ */
+ if (FilterChangeIsEnabled(ctx) &&
  ReorderBufferFilterByRelFileLocator(ctx->reorder, XLogRecGetXid(r),
- buf->origptr, &target_locator, true))
+ buf->origptr, &target_locator,
+ REORDER_BUFFER_CHANGE_INSERT,
+ true))


DecodeUpdate:
+ /*
+ * When filtering changes, determine if the relation associated with the change
+ * can be skipped. This could be because the relation is unlogged or because
+ * the plugin has opted to exclude this relation from decoding.
+ */
+ if (FilterChangeIsEnabled(ctx) &&
+ ReorderBufferFilterByRelFileLocator(ctx->reorder, XLogRecGetXid(r),
+ buf->origptr, &target_locator,
+ REORDER_BUFFER_CHANGE_UPDATE,
+ true))
+ return;
+

DecodeDelete:
+ /*
+ * When filtering changes, determine if the relation associated with the change
+ * can be skipped. This could be because the relation is unlogged or because
+ * the plugin has opted to exclude this relation from decoding.
+ */
+ if (FilterChangeIsEnabled(ctx) &&
+ ReorderBufferFilterByRelFileLocator(ctx->reorder, XLogRecGetXid(r),
+ buf->origptr, &target_locator,
+ REORDER_BUFFER_CHANGE_DELETE,
+ true))
+ return;
+

DecodeMultiInsert:
  /*
+ * When filtering changes, determine if the relation associated with the change
+ * can be skipped. This could be because the relation is unlogged or because
+ * the plugin has opted to exclude this relation from decoding.
+ */
+ if (FilterChangeIsEnabled(ctx) &&
+ ReorderBufferFilterByRelFileLocator(ctx->reorder, XLogRecGetXid(r),
+ buf->origptr, &rlocator,
+ REORDER_BUFFER_CHANGE_INSERT,
+ true))
+ return;
+

DecodeSpecConfirm:
+ /*
+ * When filtering changes, determine if the relation associated with the change
+ * can be skipped. This could be because the relation is unlogged or because
+ * the plugin has opted to exclude this relation from decoding.
+ */
+ if (FilterChangeIsEnabled(ctx) &&
+ ReorderBufferFilterByRelFileLocator(ctx->reorder, XLogRecGetXid(r),
+ buf->origptr, &target_locator,
+ REORDER_BUFFER_CHANGE_INSERT,
+ true))
+ return;
+

Can't all those code fragments (DecodeInsert, DecodeUpdate,
DecodeDelete, DecodeMultiInsert, DecodeSpecConfirm) delegate to a
new/common 'SkipThisChange(...)' function?

======
src/backend/replication/logical/reorderbuffer.c

2.
 /*
  * After encountering a change that cannot be filtered out, filtering is
- * temporarily suspended. Filtering resumes after processing every 100 changes.
+ * temporarily suspended. Filtering resumes after processing
CHANGES_THRESHOLD_FOR_FILTER
+ * changes.
  * This strategy helps to minimize the overhead of performing a hash table
  * search for each record, especially when most changes are not filterable.
  */
-#define CHANGES_THRESHOLD_FOR_FILTER 100
+#define CHANGES_THRESHOLD_FOR_FILTER 0

Why is this defined as 0? Some accidental residue from performance
testing different values?

======
src/test/subscription/t/001_rep_changes.pl

3.
+# Check that an unpublished change is filtered out.
+$logfile = slurp_file($node_publisher->logfile, $log_location);
+ok($logfile =~ qr/Filtering INSERT/,
+ 'unpublished INSERT is filtered');
+
+ok($logfile =~ qr/Filtering UPDATE/,
+ 'unpublished UPDATE is filtered');
+
+ok($logfile =~ qr/Filtering DELETE/,
+ 'unpublished DELETE is filtered');
+

AFAICT these are probably getting filtered out because the entire
table is not published at all.

Should you also add different tests where you do operations on a table
that IS partially replicated but only some of the *operations* are not
published. e.g. test the different 'pubactions' of the PUBLICATION
'publish' parameter. Maybe you need different log checks to
distinguish the different causes for the filtering.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Restrict copying of invalidated replication slots
Next
From: Kirill Reshke
Date:
Subject: Re: [PATCH] Fix Potential Memory Leak in pg_amcheck Code