Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Date
Msg-id CAA4eK1JLQqNZypOpN7h3=Vt0JJW4Yb_FsLJS=T8J9J-WXgFMYg@mail.gmail.com
In response to Re: row filtering for logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: row filtering for logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Jul 27, 2021 at 9:56 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jul 27, 2021 at 6:21 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
>
> > 1) UPDATE a nonkey column in publisher.
> > 2) Use debugger to block the walsender process in function
> >    pgoutput_row_filter_exec_expr().
> > 3) Open another psql to connect the publisher, and drop the table which updated
> >    in 1).
> > 4) Unblock the debugger in 2), and then I can see the following error:
> > ---
> > ERROR:  could not read block 0 in file "base/13675/16391"
>
> Yeah, that's a big problem, seems like the expression evaluation
> machinery directly going and detoasting the externally stored data
> using some random snapshot.  Ideally, in walsender we can never
> attempt to detoast the data because there is no guarantee that those
> data are preserved.  Somehow before going to the expression evaluation
> machinery, I think we will have to deform that tuple and need to do
> something for the externally stored data otherwise it will be very
> difficult to control that inside the expression evaluation.
>

True. I think this becomes possible once we fix the issue reported in
another thread [1], after which the key values will be logged as part
of old_tuple_key for toasted tuples even if they are not changed. We
could then add a restriction that, for Updates, the user can specify
only key columns in the WHERE clause, similar to Deletes. With that
restriction we would always have the data required for the filter
columns: if the toasted key values are changed, they will be part of
both the old and the new tuple, and if they are not changed, they will
be part of the old tuple. I have not checked the implementation side
of it, but theoretically it seems possible. If my understanding is
correct, it becomes necessary to solve the other bug [1] in order to
solve this part of the problem for this patch. The other possibility
is to disallow columns (datatypes) that can lead to toasted data (at
least for Updates), which doesn't sound like a good idea to me. Do you
have any other ideas for this problem?
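
To make the reasoning above concrete, here is a minimal sketch of the
lookup rule. It is a toy model in Python, not PostgreSQL code. The
names UNCHANGED_TOAST, resolve_filter_columns and row_passes_filter
are hypothetical, tuples are modelled as plain dicts, and it assumes
the fix from [1] is in place so that the old key tuple carries
unchanged toasted key values.

```python
# Sentinel standing in for an unchanged, externally stored (TOASTed) value
# that the new tuple does not carry inline. Hypothetical, for illustration.
UNCHANGED_TOAST = object()


def resolve_filter_columns(key_columns, old_key_tuple, new_tuple):
    """Return the key-column values a row filter could see for an UPDATE.

    Assumption: old_key_tuple holds the key columns whenever the new
    tuple only has an unchanged-toast placeholder for them, as
    proposed in [1].
    """
    resolved = {}
    for col in key_columns:
        value = new_tuple.get(col, UNCHANGED_TOAST)
        if value is UNCHANGED_TOAST:
            # Key value was not changed: fall back to the logged old key.
            value = old_key_tuple[col]
        resolved[col] = value
    return resolved


def row_passes_filter(predicate, key_columns, old_key_tuple, new_tuple):
    """Evaluate a filter that references key columns only.

    No detoasting is needed because every value comes from the
    logged old or new tuple.
    """
    resolved = resolve_filter_columns(key_columns, old_key_tuple, new_tuple)
    return predicate(resolved)
```

In this model, an update that leaves a toasted key untouched is
filtered on the value from the old key tuple, while an update that
changes the key is filtered on the value from the new tuple. Neither
case has to read the toast relation.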

[1] -
https://www.postgresql.org/message-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229%40OS0PR01MB6113.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.


