Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Date
Msg-id CAA4eK1LOwT2Mog=f0e6y2Nd3FKjmFMRRdtR0GEr9jDBHYKS29w@mail.gmail.com
In response to Re: row filtering for logical replication  (Rahila Syed <rahilasyed90@gmail.com>)
Responses RE: row filtering for logical replication
List pgsql-hackers
On Fri, Jul 23, 2021 at 2:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
> On Fri, Jul 23, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Jul 23, 2021 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > On Thu, Jul 22, 2021 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> > >
>> > > On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > > >
>> > > > On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> > > > >
>> > > > > On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
>> > > > > <tomas.vondra@enterprisedb.com> wrote:
>> > > > > >
>> > > > > > Do we log the TOAST-ed values that were not updated?
>> > > > >
>> > > > > No, we don't, I have submitted a patch sometime back to fix that [1]
>> > > > >
>> > > >
>> > > > That patch seems to log WAL for unchanged key columns. What about
>> > > > unchanged non-key columns? Do they get logged as part of the new tuple
>> > > > or is there some other way we can get those? If not, then we probably
>> > > > need to think of restricting the filter clause in some way.
>> > >
>> > > But what sort of restrictions? I mean, we cannot restrict based on data
>> > > type, right? That would be too restrictive,
>> > >
>> >
>> > Yeah, data type restriction sounds too restrictive and unless the data
>> > is toasted, the data will be anyway available. I think such kind of
>> > restriction should be the last resort but let's try to see if we can
>> > do something better.
>> >
>> > > other option is only to allow
>> > > replica identity keys columns in the filter condition?
>> > >
>> >
>> > Yes, that is what I had in mind because if key column(s) is changed
>> > then we will have data for both old and new tuples. But if it is not
>> > changed then we will have it probably for the old tuple unless we
>> > decide to fix the bug you mentioned in a different way in which case
>> > we might either need to log it for the purpose of this feature (but
>> > that will be any way for HEAD) or need to come up with some other
>> > solution here. I think we can't even fetch such columns data during
>> > decoding because we have catalog-only historic snapshots here. Do you
>> > have any better ideas?
>> >
>>
>> BTW, I wonder how pglogical can handle this because if these unchanged
>> toasted values are not logged in WAL for the new tuple then how the
>> comparison for such columns will work? Either they are forcing WAL in
>> some way or don't allow WHERE clause on such columns or maybe they
>> have dealt with it in some other way unless they are unaware of this
>> problem.
>>
>
> The column comparison for row filtering happens before the unchanged toast
> columns are filtered. Unchanged toast columns are filtered just before writing the tuple
> to output stream.
>

To perform filtering, you need to use the tuple from WAL, and that
tuple doesn't seem to contain the unchanged TOAST values, so how can we
do the filtering? I think it would be a good idea to test this once.
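
To make the concern concrete, here is a small toy model in Python. It is
not PostgreSQL code and does not reflect how the decoder is implemented;
the names UNCHANGED_TOAST, decode_update and eval_row_filter are
hypothetical and exist only for this illustration. It assumes that the
decoded new tuple carries a placeholder for any toasted column whose
value was not modified by the UPDATE, and shows that a filter on such a
column cannot be answered from the decoded tuple alone.

```python
from typing import Any, Callable, Dict, Optional, Set


class _UnchangedToast:
    """Placeholder for a toasted value that was not written to WAL (hypothetical)."""

    def __repr__(self) -> str:
        return "<unchanged-toast>"


UNCHANGED_TOAST = _UnchangedToast()


def decode_update(old_row: Dict[str, Any],
                  new_row: Dict[str, Any],
                  toasted_cols: Set[str]) -> Dict[str, Any]:
    """Model the new tuple as seen during decoding.

    Assumption: a toasted column whose value did not change is not
    available, so it is replaced by UNCHANGED_TOAST.
    """
    decoded = {}
    for col, val in new_row.items():
        if col in toasted_cols and old_row.get(col) == val:
            decoded[col] = UNCHANGED_TOAST
        else:
            decoded[col] = val
    return decoded


def eval_row_filter(decoded: Dict[str, Any],
                    column: str,
                    predicate: Callable[[Any], bool]) -> Optional[bool]:
    """Evaluate a single-column filter.

    Returns True or False when the value is available, and None when
    the column is an unchanged toasted value, so the filter cannot be
    decided from this tuple.
    """
    value = decoded[column]
    if value is UNCHANGED_TOAST:
        return None
    return predicate(value)


# An UPDATE that changes "note" but leaves the large "body" column alone.
old = {"id": 1, "note": "short", "body": "x" * 10000}
new = {"id": 1, "note": "edited", "body": "x" * 10000}
decoded = decode_update(old, new, toasted_cols={"body"})

result_note = eval_row_filter(decoded, "note", lambda v: v == "edited")
result_body = eval_row_filter(decoded, "body", lambda v: v.startswith("x"))
```

In this model the filter on "note" can be evaluated, while the filter on
"body" cannot, which is the case I would like to see tested against the
actual patch.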

-- 
With Regards,
Amit Kapila.


