Re: row filtering for logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: row filtering for logical replication
Msg-id CAFiTN-uGr4jG2LTOz_nU2tKoPvgQSHoScMTsFaKVxbFeNAkvYA@mail.gmail.com
In response to Re: row filtering for logical replication  (Ajin Cherian <itsajin@gmail.com>)
Responses Re: row filtering for logical replication
List pgsql-hackers
On Tue, Sep 21, 2021 at 8:58 AM Ajin Cherian <itsajin@gmail.com> wrote:
> > I understand why this is done, but I have 2 concerns here 1) We are
> > having extra deform and copying the field from new to old in case it
> > is unchanged replica identity.  2) The same unchanged attribute values
> > get qualified in the old tuple as well as in the new tuple.  What
> > exactly needs to be done is that the only updated field should be
> > validated as part of the old as well as the new tuple, the unchanged
> > field does not make sense to have redundant validation.   For that we
> > will have to change the filter for the old tuple to just validate the
> > attributes which are actually modified and remaining unchanged and new
> > values will anyway get validated in the new tuple.
> >
> But what if the filter expression depends on multiple columns, say (a+b) > 100
> where a is unchanged while b is changed. Then we will still need both
> columns for applying

In such a case, yes, we do need both columns.

> the filter even though one is unchanged. Also, I am not aware of any
> mechanism by which
> we can apply a filter expression on individual attributes. The current
> mechanism does it
> on a tuple. Do let me know if you have any ideas there?

What I suggested is to modify the filter for the old tuple.  For
example, if the filter is (a > 10 and b < 20 and c+d = 20) and only a
and c are modified, then for the old tuple we can transform the filter
to (a > 10 and c+d = 20) and evaluate only that.
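
To make the idea concrete, here is a minimal sketch in Python (not
PostgreSQL code).  It assumes the filter is a plain AND of conjuncts and
that we know which columns each conjunct references.  The names
Conjunct, prune_for_old_tuple and evaluate are hypothetical and exist
only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

Row = Dict[str, int]


@dataclass(frozen=True)
class Conjunct:
    """One AND-ed term of a row filter (hypothetical helper type)."""
    text: str                        # human-readable form of the term
    columns: FrozenSet[str]          # columns the term references
    predicate: Callable[[Row], bool]


# The example filter: (a > 10 and b < 20 and c+d = 20)
FILTER: List[Conjunct] = [
    Conjunct("a > 10", frozenset({"a"}), lambda r: r["a"] > 10),
    Conjunct("b < 20", frozenset({"b"}), lambda r: r["b"] < 20),
    Conjunct("c+d = 20", frozenset({"c", "d"}),
             lambda r: r["c"] + r["d"] == 20),
]


def prune_for_old_tuple(conjuncts: List[Conjunct],
                        modified: Set[str]) -> List[Conjunct]:
    """Keep only the terms that reference at least one modified column.

    Terms over unchanged columns are dropped here on the assumption that
    the full filter is still evaluated against the new tuple.
    """
    return [c for c in conjuncts if c.columns & modified]


def evaluate(conjuncts: List[Conjunct], row: Row) -> bool:
    """Evaluate an AND of terms against a row given as a dict."""
    return all(c.predicate(row) for c in conjuncts)


# Only a and c are modified, so "b < 20" is dropped for the old tuple.
old_filter = prune_for_old_tuple(FILTER, {"a", "c"})
old_filter_text = " and ".join(c.text for c in old_filter)
```

A term such as (a+b) > 100 references both a and b, so it is kept
whenever either column is modified, and both columns are still needed
to evaluate it.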

>
> Even if it were done, there would still be the overhead of deforming the tuple.

Suppose the filter is just (a > 10 and b < 20) and only a is updated.
If we are able to reduce the filter for the old tuple to just (a > 10),
do we still need to deform?  Even if we do, we can save a lot by
avoiding duplicate expression evaluation.

> I will run some performance tests like Amit suggested and see what the
> overhead is and
> try to minimise it.

That is good to know.  I think you should try some worst-case
scenarios, e.g. 10 text columns and 1 int column in the REPLICA
IDENTITY, where only the int column gets updated, none of the text
columns are updated, and you have a filter on all the columns.

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


