Re: row filtering for logical replication - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: row filtering for logical replication
Date
Msg-id CAFPTHDa9sB6MXZ0erDnjLX-rMfAf1UR9Kzr8yqDx38VJ5j_sMQ@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
List pgsql-hackers
On Thu, Nov 25, 2021 at 2:22 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Thanks for all the review comments so far! We are endeavouring to keep
> pace with them.
>
> All feedback is being tracked and we will fix and/or reply to everything ASAP.
>
> Meanwhile, PSA the latest set of v42* patches.
>
> This version was mostly a patch restructuring exercise but it also
> addresses some minor review comments in passing.
>

Addressed more review comments, in the attached patch-set v43. 5
patches carried forward from v42.
This patch-set contains the following fixes:

On Tue, Nov 23, 2021 at 1:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> in pgoutput_row_filter, we are dropping the slots if there are some
> old slots in the RelationSyncEntry.  But then I noticed that in
> rel_sync_cache_relation_cb(), also we are doing that but only for the
> scantuple slot.  So IMHO, rel_sync_cache_relation_cb(), is only place
> setting entry->rowfilter_valid to false; so why not drop all the slot
> that time only and in pgoutput_row_filter(), you can just put an
> assert?
>

Moved all the dropping of slots to rel_sync_cache_relation_cb()

> +static bool
> +pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
> RelationSyncEntry *entry)
> +{
> +    EState       *estate;
> +    ExprContext *ecxt;
>
>
> pgoutput_row_filter_virtual and pgoutput_row_filter are exactly same
> except, ExecStoreHeapTuple(), so why not just put one check based on
> whether a slot is passed or not, instead of making complete duplicate
> copy of the function.

Removed pgoutput_row_filter_virtual

>          oldctx = MemoryContextSwitchTo(CacheMemoryContext);
>          tupdesc = CreateTupleDescCopy(tupdesc);
>          entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
>
> Why do we need to copy the tupledesc? do we think that we need to have
> this slot even if we close the relation, if so can you add the
> comments explaining why we are making a copy here.

This code has been modified, and comments added.

On Tue, Nov 23, 2021 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> One more thing related to this code:
> pgoutput_row_filter()
> {
> ..
> + if (!entry->rowfilter_valid)
> {
> ..
> + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> + tupdesc = CreateTupleDescCopy(tupdesc);
> + entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
> + MemoryContextSwitchTo(oldctx);
> ..
> }
>
> Why do we need to initialize scantuple here unless we are sure that
> the row filter is going to get associated with this relentry? I think
> when there is no row filter then this allocation is not required.
>

Modified as suggested.

On Tue, Nov 23, 2021 at 10:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> In 0003 patch, why is below change required?
> --- a/src/backend/replication/pgoutput/pgoutput.c
> +++ b/src/backend/replication/pgoutput/pgoutput.c
> @@ -1,4 +1,4 @@
> -/*-------------------------------------------------------------------------
> +/*------------------------------------------------------------------------
>   *
>   * pgoutput.c
>

Removed.

>
> After above, rearrange the code in pgoutput_row_filter(), so that two
> different checks related to 'rfisnull'  (introduced by different
> patches) can be combined as if .. else check.
>
Fixed.

On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> + * If the new relation or the old relation has a where clause,
> + * we need to remove it so that it can be added afresh later.
> + */
> + if (RelationGetRelid(newpubrel->relation) == oldrelid &&
> + newpubrel->whereClause == NULL && rfisnull)
>
> Can't we use _equalPublicationTable() here? It compares the whereClause as well.
>

Tried this, can't do this because one is an alter statement while the
other is a publication, the whereclause is not
the same Nodetype. In the statement, the whereclause is T_A_Expr,
while in the publication
catalog, it is T_OpExpr.

>   /* Must be owner of the table or superuser. */
> - if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
> + if (!pg_class_ownercheck(relid, GetUserId()))
>
> Here, you can directly use RelationGetRelid as was used in the
> previous code without using an additional variable.
>

Fixed.

> 2.
> +typedef struct {
> + Relation rel;
> + bool check_replident;
> + Bitmapset  *bms_replident;
> +}
> +rf_context;
>
> Add rf_context in the same line where } ends.

Code has been modified, this comment no longer applies.

> 4.
> + * Rules: Node-type validation
> + * ---------------------------
> + * Allow only simple or compound expressions like:
> + * - "(Var Op Const)" or
>
> It seems Var Op Var is allowed. I tried below and it works:
> create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');
>
> I think it should be okay to allow it provided we ensure that we never
> access some other table/view etc. as part of the expression. Also, we
> should document the behavior correctly.

Fixed.

On Wed, Nov 24, 2021 at 8:52 PM vignesh C <vignesh21@gmail.com> wrote:
>
> 4) This should be included in typedefs.list, also we could add some
> comments for this structure
> +typedef struct {
> +       Relation        rel;
> +       Bitmapset  *bms_replident;
> +}
> +rf_context;

this has been removed in last patch, so comment no longer applies

> 5) Few includes are not required. #include "miscadmin.h" not required
> in pg_publication.c, #include "executor/executor.h" not required in
> proto.c, #include "access/xact.h", #include "executor/executor.h" and
> #include "replication/logicalrelation.h" not required in pgoutput.c
>

Optimized this. removed "executor/executor.h" from patch 0003, removed
"access/xact.h" from patch 0001
removed "replication/logicalrelation.h” from 0001. Others required.

> 6) typo "filte" should be "filter":
> +/*
> + * The row filte walker checks that the row filter expression is legal.
> + *
> + * Rules: Node-type validation
> + * ---------------------------
> + * Allow only simple or compound expressions like:
> + * - "(Var Op Const)" or
> + * - "(Var Op Const) Bool (Var Op Const)"

Fixed.


regards,
Ajin Cherian
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Update stale code comment in CheckpointerMain()
Next
From: Fabien COELHO
Date:
Subject: Re: rand48 replacement