Re: row filtering for logical replication - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: row filtering for logical replication
Msg-id acf0ba6e-1add-75da-a989-8e8fd82253e0@enterprisedb.com
In response to Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: row filtering for logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On 7/20/21 11:42 AM, Amit Kapila wrote:
> On Tue, Jul 20, 2021 at 2:39 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 7/20/21 7:23 AM, Amit Kapila wrote:
>>> On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>
>>>> So maybe the best thing is to stick to the simple approach already used
>>>> e.g. by pglogical, which simply uses the new row when available (insert,
>>>> update) and old one for deletes.
>>>>
>>>> I think that behaves more or less sensibly and it's easy to explain.
>>>>
>>>
>>> Okay, if nothing better comes up, then we can fall back to this option.
>>>
>>>> All the other things (e.g. turning UPDATE to INSERT, advanced conflict
>>>> resolution etc.) will require a lot of other stuff,
>>>>
>>>
>>> I have not evaluated this yet, but I think spending some time thinking
>>> about turning Update into Insert/Delete (yesterday's suggestion by
>>> Alvaro) might be worthwhile, especially as some other replication
>>> solutions seem to follow that approach as well.
>>>
>>
>> I think that requires quite a bit of infrastructure, and I'd bet we'll
>> need to handle other types of conflicts too.
>>
> 
> Hmm, I don't see why we need any additional infrastructure here if we
> do this at the publisher. I think this could be done without many
> changes to the patch as explained in one of my previous emails [1].
> 

Oh, I see. I've been thinking about doing the "usual" conflict
resolution on the subscriber side. I'm not sure about doing this on the
publisher ...
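
For reference, a minimal sketch of what the publisher-side variant
discussed in this thread might look like, next to the simple
"new row only" approach. This is not code from the patch: the names
transform_update, simple_update and row_filter are hypothetical, rows are
modeled as plain dicts, and it assumes the filter only references columns
that are available in the old tuple (i.e. replica identity columns).

```python
from typing import Callable, Optional, Tuple

Row = dict
RowFilter = Callable[[Row], bool]


def transform_update(old_row: Row, new_row: Row,
                     row_filter: RowFilter) -> Optional[Tuple[str, Row]]:
    """Hypothetical publisher-side decision for an UPDATE under a row filter."""
    old_ok = row_filter(old_row)
    new_ok = row_filter(new_row)
    if old_ok and new_ok:
        # Row stays inside the filtered set: send a regular UPDATE.
        return ("UPDATE", new_row)
    if new_ok:
        # Row enters the filtered set: the subscriber has never seen it.
        return ("INSERT", new_row)
    if old_ok:
        # Row leaves the filtered set: remove it on the subscriber.
        return ("DELETE", old_row)
    # Row was never visible to the subscriber and still is not.
    return None


def simple_update(new_row: Row, row_filter: RowFilter) -> Optional[Tuple[str, Row]]:
    """Simple approach: evaluate the filter on the new row only."""
    return ("UPDATE", new_row) if row_filter(new_row) else None
```

With a filter such as region = 'eu', an update moving a row from 'us' to
'eu' becomes an INSERT in the first variant, while the simple variant
sends an UPDATE for a row the subscriber does not have.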

>> I don't have a clear
>> opinion if that's required to get this patch working - I'd try getting
>> the simplest implementation with reasonable behavior, with those more
>> advanced things as future enhancements.
>>
>>>> and I see them as
>>>> improvements of this simple approach.
>>>>
>>>>>>> Maybe a second option is to have replication change any UPDATE into
>>>>>>> either an INSERT or a DELETE, if the old or the new row does not pass
>>>>>>> the filter, respectively.  That way, the databases would remain consistent.
>>>>>
>>>>> Yeah, I think this is the best way to keep the data consistent.
>>>>>
>>>>
>>>> It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
>>>> rather significant overhead.
>>>>
>>>
>>> Why? I think it would just need restrictions similar to those we are
>>> planning for the Delete operation, such that filter columns must be
>>> present in either the primary key or the replica identity columns.
>>>
>>
>> How else would you turn UPDATE to INSERT? For UPDATE we only send the
>> identity columns and modified columns, and the decision happens on the
>> subscriber.
>>
> 
> Hmm, we log the entire new tuple and replica identity columns for the
> old tuple in WAL for Update. And, we are going to use a new tuple for
> Insert, so we have everything we need.
> 

Do we log the TOAST-ed values that were not updated?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


