Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Date
Msg-id CAA4eK1+mxOUU=hZSv5xDd4WzTN5aV+u_3QVOh0iHXNxB6aLTMw@mail.gmail.com
Whole thread Raw
In response to RE: row filtering for logical replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: row filtering for logical replication
RE: row filtering for logical replication
List pgsql-hackers
On Mon, Jan 10, 2022 at 8:41 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Attach the v61 patch set.
>

Few comments:
==============
1.
pgoutput_row_filter()
{
..
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) :
linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
..
}

rel_sync_cache_relation_cb()
{
..
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
..
}

I think this can leak memory as just freeing 'exprstate' is not
sufficient. It contains other allocated memory as well like for
'steps'. Apart from that we might allocate other memory as well for
generating expression state. I think it would be better if we can have
another memory context (say cache_expr_cxt) in RelationSyncEntry and
allocate it the first time we need it and then reset it instead of
doing pfree of 'exprstate'. Also, we can free this new context in
pgoutput_shutdown before destroying RelationSyncCache.

2. If we do the above, we can use this new context at all other places
in the patch where it is using CacheMemoryContext.

3.
@@ -1365,6 +1785,7 @@ rel_sync_cache_publication_cb(Datum arg, int
cacheid, uint32 hashvalue)
 {
  HASH_SEQ_STATUS status;
  RelationSyncEntry *entry;
+ MemoryContext oldctx;

  /*
  * We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1795,8 @@ rel_sync_cache_publication_cb(Datum arg, int
cacheid, uint32 hashvalue)
  if (RelationSyncCache == NULL)
  return;

+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
  /*
  * There is no way to find which entry in our cache the hash belongs to so
  * mark the whole cache as invalid.
@@ -1392,6 +1815,8 @@ rel_sync_cache_publication_cb(Datum arg, int
cacheid, uint32 hashvalue)
  entry->pubactions.pubdelete = false;
  entry->pubactions.pubtruncate = false;
  }
+
+ MemoryContextSwitchTo(oldctx);
 }

Is there a reason for the above change?

4.
+#define SET_NO_FILTER_FOR_CURRENT_PUBACTIONS \
+ if (pub->pubactions.pubinsert) \
+ no_filter[idx_ins] = true; \
+ if (pub->pubactions.pubupdate) \
+ no_filter[idx_upd] = true; \
+ if (pub->pubactions.pubdelete) \
+ no_filter[idx_del] = true

I don't see the need for this macro and it makes code less readable. I
think we can instead move this code to a function to avoid duplicate
code.

5.
Multiple publications might have multiple row filters for
+ * this relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) which row filters
+ * will be appended.

There seems to be a typo in the above sentence.
/which row filters/to which row filters

6.
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in
+ * entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple publications might have multiple row filters for
+ * this relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) which row filters
+ * will be appended.
+ *
+ * NOTE: FOR ALL TABLES implies "don't use row filter expression" so
+ * it takes precedence.
+ *
+ * NOTE: ALL TABLES IN SCHEMA implies "don't use row filter
+ * expression" if the schema is the same as the table schema.
+ */
+ foreach(lc, data->publications)

Let's not add NOTE for each of these points but instead expand the
first sentence as "Find if there are any row filters for this
relation. If there are, then prepare the necessary ExprState and cache
it in entry->exprstate. To build an expression state, we need to
ensure the following:"

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_upgrade verbosity when redirecting output to log file
Next
From: Andres Freund
Date:
Subject: Re: pg_upgrade verbosity when redirecting output to log file