
Re: row filtering for logical replication - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: row filtering for logical replication
Date	September 21, 2021 09:04:40
Msg-id	CAA4eK1KRZGrCnhRCusH-k0K8RZ2qL=dp2eXLSDJCMtWHBcRygg@mail.gmail.com
In response to	Re: row filtering for logical replication (Dilip Kumar <dilipbalaut@gmail.com>)
Responses	Re: row filtering for logical replication
List	pgsql-hackers


On Tue, Sep 21, 2021 at 11:16 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Sep 21, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I think the point is that if, for some expression, some
> > values are in the old tuple and others are in the new, then the idea
> > proposed in the patch seems sane. Moreover, I think in your idea for
> > each tuple we might need to build a new expression, sometimes twice,
> > which will defeat the purpose of the cache we have kept in the patch,
> > and I am not sure if it is less costly.
>
> Basically, expression initialization should happen only once in most
> cases so with my suggestion you might have to do it twice.
>

No, the situation will be that we might have to do it twice per update,
whereas now it is done just once, at the very first operation on a
relation.

> But the
> overhead of extra expression evaluation is far less than doing
> duplicate evaluation because that will happen for sending each update
> operation, right?
>

Expression evaluation has to be done twice because every update can
have a different set of values in the old and new tuple.
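
To make the cost difference concrete, here is a minimal sketch in
Python (not PostgreSQL code). The names stats, build_filter, get_filter
and update_matches are hypothetical and only stand in for expression
initialization, the per-relation cache, and evaluation. It assumes a
trivial filter (a > 10): the filter is built once per relation but
evaluated twice per update, once for each tuple.

```python
from typing import Callable, Dict, Optional, Tuple

Row = Dict[str, Optional[int]]
Filter = Callable[[Row], bool]

# Hypothetical per-relation cache plus a counter of filter builds.
_filter_cache: Dict[str, Filter] = {}
stats = {"builds": 0}


def build_filter(relation: str) -> Filter:
    """Stand-in for expression initialization (the expensive step)."""
    stats["builds"] += 1
    # Assumed filter for illustration: a > 10.
    return lambda row: row.get("a") is not None and row["a"] > 10


def get_filter(relation: str) -> Filter:
    """Build the filter on first use for a relation, then reuse it."""
    if relation not in _filter_cache:
        _filter_cache[relation] = build_filter(relation)
    return _filter_cache[relation]


def update_matches(relation: str, old: Row, new: Row) -> Tuple[bool, bool]:
    """Evaluate the cached filter on both tuples of one update."""
    matches = get_filter(relation)
    return matches(old), matches(new)
```

Rebuilding the expression per tuple would move the work in
build_filter into every call of update_matches, which is the cost the
cache is meant to avoid.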

> > See another example where splitting filter might not give desired results:
> >
> > Say filter expression: (a = 10 and b = 20 and c = 30)
> >
> > Now, old_tuple has values for columns a and c and say values are 10
> > and 30. So, the old_tuple will match the filter if we split it as per
> > your suggestion. Now say new_tuple has values (a = 5, b = 15, c = 25).
> > In such a situation dividing the filter will give us the result that
> > the old_tuple is matching but new tuple is not matching which seems
> > incorrect. I think dividing filter conditions among old and new tuples
> > might not retain its sanctity.
>
> Yeah, that is a good example for applying a duplicate filter.
> Basically, some filters might not even get evaluated on new tuples, as
> in the above example, and if we have removed such an expression on the
> other tuple we might break something.
>

Right.
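
The example quoted above can be sketched in Python as follows. This is
only an illustration, not the patch's logic: full_match and split_match
are hypothetical helpers, and a missing column is simply treated as not
satisfying its clause (a simplification of SQL NULL handling).

```python
from typing import Dict, Optional

Row = Dict[str, Optional[int]]

# Filter from the example: a = 10 AND b = 20 AND c = 30.
FILTER = {"a": 10, "b": 20, "c": 30}


def full_match(row: Row) -> bool:
    """Evaluate the whole filter; a missing column fails its clause."""
    return all(row.get(col) == want for col, want in FILTER.items())


def split_match(row: Row) -> bool:
    """Evaluate only the clauses whose columns are present in the row."""
    return all(row[col] == want for col, want in FILTER.items() if col in row)


old_tuple: Row = {"a": 10, "c": 30}           # b is not in the old tuple
new_tuple: Row = {"a": 5, "b": 15, "c": 25}

# Split evaluation: the old tuple "matches" on a and c alone,
# while the new tuple does not match.
split_result = (split_match(old_tuple), split_match(new_tuple))

# Whole-filter evaluation: neither tuple can be said to match,
# because the clause on b is never satisfied.
full_result = (full_match(old_tuple), full_match(new_tuple))
```

Here split_result is (True, False) and full_result is (False, False),
so the split filter reports a match that the complete expression does
not support.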

> Maybe for now this suggests that we might not
> be able to avoid the duplicate execution of the expression
>

So, IIUC, you agree that we should proceed with the proposed approach,
and we can do optimizations later if possible or if we get better ideas.

> > > >
> > > > Even if it were done, there would still be the overhead of deforming the tuple.
> > >
> > > Suppose the filter is just (a > 10 and b < 20) and only a is
> > > updated. If we are able to modify the filter for the old tuple to
> > > be just (a > 10), do we still need to deform?
> > >
> >
> > Without deforming, how will you determine which columns are part of
> > the old tuple?
>
> Okay, then we might have to deform, but are we at least ensuring that
> once we have deformed the tuple for the expression evaluation, we
> are not doing that again while sending the tuple?
>

I think this is possible but we might want to be careful not to send
extra unchanged values as we are doing now.
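
As a rough sketch of the "deform once, reuse when sending" idea, in
Python rather than PostgreSQL code: LazyTuple, row_filter and project
are hypothetical names, and "deforming" is modelled as decoding one
byte per column. The filter is the (a > 10 and b < 20) case from
earlier in the thread. Which columns are actually sent is kept as a
separate, explicit choice.

```python
from typing import Dict, List, Optional


class LazyTuple:
    """Hypothetical wrapper that deforms a raw tuple at most once."""

    def __init__(self, raw: bytes, columns: List[str]) -> None:
        self._raw = raw
        self._columns = columns
        self._values: Optional[Dict[str, int]] = None
        self.deform_calls = 0  # counts how often the raw data is decoded

    def values(self) -> Dict[str, int]:
        if self._values is None:
            self.deform_calls += 1
            # Stand-in for deforming: one byte per column.
            self._values = dict(zip(self._columns, self._raw))
        return self._values


def row_filter(tup: LazyTuple) -> bool:
    """Assumed filter for illustration: a > 10 AND b < 20."""
    vals = tup.values()
    return vals["a"] > 10 and vals["b"] < 20


def project(tup: LazyTuple, send_columns: List[str]) -> Dict[str, int]:
    """Pick only the columns chosen for sending, reusing the cached values."""
    vals = tup.values()
    return {col: vals[col] for col in send_columns}
```

Calling row_filter and then project on the same LazyTuple decodes the
raw data only once, and project sends nothing beyond the columns it is
given.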

-- 
With Regards,
Amit Kapila.
