Re: Handle infinite recursion in logical replication setup - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Handle infinite recursion in logical replication setup
Msg-id CAA4eK1+Mtz+StvNNtTg9=9BTq8=pMu-V5i4yWqs=KJUh0Z_L4g@mail.gmail.com
In response to Re: Handle infinite recursion in logical replication setup  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On Mon, Mar 7, 2022 at 5:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh,
> I agree with Peter's comment that the changes to
> FilterRemoteOriginData() should be part of FilterByOrigin()
>
> Further, I wonder why "onlylocal_data" is a replication slot's
> property. A replication slot tracks the progress of replication and it
> may be used by different receivers with different options. I could
> start one receiver which wants only local data, say using
> "pg_logical_slot_get_changes" and later start another receiver which
> fetches all the data starting from where the first receiver left. This
> option prevents such flexibility.
>
> As discussed earlier in the thread, local_only can be property of
> publication or subscription, depending upon the use case, but I can't
> see any reason that it should be tied to a replication slot.
>

I thought it should be similar to the 'streaming' option of a
subscription, but maybe Vignesh has some other reason that makes it
different.

> I have a similar question for "two_phase" but the ship has sailed and
> probably it makes some sense there which I don't know.
>

two_phase differs from other subscription options such as 'streaming'
in that it can be enabled only when the slot and subscription are
created; we can't change or specify it via
pg_logical_slot_get_changes. This avoids the case where, at the time
of COMMIT PREPARED, we don't know whether the prepare for the
transaction has already been sent. For the same reason, we also need
to know the 'two_phase_at' information.
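
To illustrate the two_phase_at point, here is a toy Python model. It is
only a sketch, not PostgreSQL code: Slot, prepare_was_sent and
on_commit_prepared are hypothetical names, LSNs are plain integers, and
the returned strings stand in for what the decoder would send. It
assumes that prepares at or after two_phase_at are sent as prepares and
earlier ones are not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Toy stand-in for a replication slot (hypothetical, not the real struct)."""
    two_phase: bool    # fixed when the slot is created
    two_phase_at: int  # assumed: LSN from which prepares are sent as prepares


def prepare_was_sent(slot: Slot, prepare_lsn: int) -> bool:
    """True if the PREPARE at prepare_lsn would have been sent as a prepare."""
    return slot.two_phase and prepare_lsn >= slot.two_phase_at


def on_commit_prepared(slot: Slot, prepare_lsn: int) -> str:
    """Decide what to send when COMMIT PREPARED is decoded."""
    if prepare_was_sent(slot, prepare_lsn):
        # The subscriber already holds the prepared transaction.
        return "send COMMIT PREPARED only"
    # The prepare was never sent, so the whole transaction goes out now.
    return "send whole transaction at commit"


slot = Slot(two_phase=True, two_phase_at=100)
print(on_commit_prepared(slot, prepare_lsn=90))
print(on_commit_prepared(slot, prepare_lsn=150))
```

Because the slot is frozen here, the answer for a given prepare LSN
cannot change between the prepare and the commit prepared. If each call
could pass its own two_phase setting, the way other options can be
passed to pg_logical_slot_get_changes, that answer would no longer be
reliable.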

> As for publication vs subscription, I think both are useful cases.
> 1. It will be a publication's property, if we want the node to not
> publish any data that it receives from other nodes for a given set of
> tables.
> 2. It will be the subscription's property, if we want the subscription
> to decide whether it wants to fetch the data changed on only upstream
> or other nodes as well.
>

I think it could be useful to allow it via both publication and
subscription, but I guess it is better to provide it one way
initially, just to keep things simple and give users some way to deal
with such cases. I would prefer to allow it via subscription initially,
for the reasons specified by Vignesh in his previous email [1]. Now,
if we think those reasons can all be ignored and it is more important
to allow this option via publication first, or that we must allow it
via both publication and subscription, then it makes sense to change
it.


[1] - https://www.postgresql.org/message-id/CALDaNm3jkotRhKfCqu5CXOf36_yiiW_cYE5%3DbG%3Dj6N3gOWJkqw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


