Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | CAHut+Pt+bOqOXnPMOP7fRWTb0qmN52w9Whbo9oqpko3ubDZd1g@mail.gmail.com Whole thread Raw |
In response to | Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: row filtering for logical replication
Re: row filtering for logical replication Re: row filtering for logical replication |
List | pgsql-hackers |
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Nov 10, 2021 at 12:36 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > 3) v37-0005 > > > > > > - no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr > > > > > > I think there could be other node type which can also be considered as simple > > > expression, for exmaple T_NullIfExpr. > > > > The current walker restrictions are from a previously agreed decision > > by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to > > keep everything very simple for a first version. > > > > Yes, you are right, there might be some additional node types that > > might be fine, but at this time I don't want to add anything different > > without getting their approval to do so. Anyway, additions like this > > are all candidates for a future version of this row-filter feature. > > > > I think we can consider T_NullIfExpr unless you see any problem with the same. Added in v40 [1] > Few comments on the latest set of patches (v39*) > ======================================= ... > 0003* > 3. In pgoutput_row_filter(), the patch is finding pub_relid when it > should already be there in RelationSyncEntry->publish_as_relid found > during get_rel_sync_entry call. Is there a reason to do this work > again? Fixed in v40 [1] > > 4. I think we should add some comments in pgoutput_row_filter() as to > why we are caching the row_filter here instead of > get_rel_sync_entry()? That has been discussed multiple times so it is > better to capture that in comments. Added comment in v40 [1] > > 5. Why do you need a separate variable rowfilter_valid to indicate > whether a valid row filter exists? Why exprstate is not sufficient? > Can you update comments to indicate why we need this variable > separately? I have improved the (existing) comment in v40 [1]. > > 0004* > 6. In rowfilter_expr_checker(), the expression tree is traversed > twice, can't we traverse it once to detect all non-allowed stuff? It > can be sometimes costly to traverse the tree multiple times especially > when the expression is complex and it doesn't seem acceptable to do so > unless there is some genuine reason for the same. I kind of doubt there would be any perceptible difference for 2 traverses instead of 1 because: a) filters are limited to simple expressions. Yes, a large boolean expression is possible but I don't think it is likely. b) the validation part is mostly a one-time execution only when the filter is created or changed. Anyway, I am happy to try to refactor the logic to a single traversal as suggested, but I'd like to combine those "validation" patches (v40-0005, v40-0006) first, so I can combine their walker logic. Is it OK? > > 7. > +static void > +rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel) > > Keep the rel argument before whereclause as that makes the function > signature better. Fixed in v40 [1] ----- [1] https://www.postgresql.org/message-id/CAHut%2BPv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: