Re: Handle infinite recursion in logical replication setup - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Handle infinite recursion in logical replication setup
Date
Msg-id CAA4eK1JEfUqQbSE31FGwx_BBEdA1_FEWis2o1z4VP2cpp8obZQ@mail.gmail.com
Whole thread Raw
In response to Re: Handle infinite recursion in logical replication setup  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, Apr 5, 2022 at 8:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are my comments for the latest patch v6-0002.
>
> PATCH v6-0002 comments
> ======================
>
> 2.1 General - should this be an independent patch?
>

I think both the patches are dependent and might get committed
together if the concept proved to be useful and doesn't have flaws.

> In many ways, I think most of this patch is unrelated to the other
> "local_only" patch (v6-0001).
>
> For example, IIUC even in the current HEAD, we could consider it to be
> a user error if multiple SUBSCRIPTIONS or multiple PUBLICATIONS of the
> same SUBSCRIPTION are replicating to the same TABLE on the same node
> and using "copy_data = on".
>
> So I think it would be ok to throw an ERROR if such a copy_data clash
> is detected, and then the user will have to change to use "copy_data =
> off" for some/all of them to avoid data duplication.
>
> The "local_only" option only adds some small logic to this new ERROR,
> but it's not really a prerequisite at all.
>
> e.g. this whole ERROR part of the patch can be a separate thread.
>
> ~~~
>
> 2.2 General - can we remove the "force" enum?
>
> Now, because I consider the clashing "copy_data = on" ERROR to be a
> user error, I think that is something that the user can already take
> care of themselves just using the copy_data = off.
>
> I did not really like the modifying of the "copy_data" option from
> just boolean to some kind of hybrid boolean + "force".
>
> a) It smells a bit off to me. IMO replication is supposed to end up
> with the same (replicated) data on the standby machine but this
> "force" mode seems to be just helping the user to break that concept
> and say - "I know what I'm doing, and I don't care if I get lots of
> duplicated data in the replica table - just let me do it"...
>
> b) It also somehow feels like the new "force" was introduced mostly to
> make the code ERROR handling implementation simpler, rather than to
> make the life of the end-user better. Yes, if force is removed maybe
> the copy-clash-detection-code will need to be internally quite more
> complex than it is now, but that is how it should be, instead of
> putting extra burden on the user (e.g. by complicating the PG docs and
> giving them yet more alternatives to configure). I think any clashing
> copy_data options really is a user error, but also I think the current
> boolean copy_data true/false already gives the user a way to fix it.
>
> c) Actually, your new error hint messages are similar to my
> perspective: They already say "Use CREATE/ALTER SUBSCRIPTION with
> copy_data = off or force". All I am saying is remove the "force", and
> the user can still fix the problem just by using "copy_data = off"
> appropriately.
>

How will it work if there is more than one table? copy_data = "off"
means it won't copy any of the other tables in the corresponding
publication(s). I think it is primarily to give the user some option
where she understands the logical replication setup better and
understands that this won't be a problem.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [PATCH] Expose port->authn_id to extensions and triggers
Next
From: Noah Misch
Date:
Subject: Re: Skipping logical replication transactions on subscriber side