Re: replication_origin and replication_origin_lsn usage on subscriber - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: replication_origin and replication_origin_lsn usage on subscriber
Msg-id CAA4eK1KLaV=nVDF1W4WzAsKwOQpZDo_JsT+C4to3N48Q2pH_EQ@mail.gmail.com
In response to Re: replication_origin and replication_origin_lsn usage on subscriber  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: replication_origin and replication_origin_lsn usage on subscriber
List pgsql-hackers
On Thu, Jul 9, 2020 at 5:16 PM Petr Jelinek <petr@2ndquadrant.com> wrote:
>
> On 09/07/2020 13:10, Amit Kapila wrote:
> > On Thu, Feb 6, 2020 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> During logical decoding, we send replication_origin and
> >> replication_origin_lsn when we decode commit.  In pgoutput_begin_txn,
> >> we send values for these two but never used on the subscriber side.
> >> Though we have provided a function (logicalrep_read_origin) to read
> >> these two values but that is not used in code anywhere.
> >>
>
> We don't use the origin message anywhere really because we don't support
> origin forwarding in the built-in replication yet. That part I left out
> intentionally in the original PG10 patchset as it's mostly useful for
> circular replication detection when you want to replicate both ways.
> However that's relatively useless without also having some kind of
> conflict detection which would be another huge pile of code and I
> expected we would end up not getting logical replication in PG10 at all
> if I tried to push conflict detection as well :)
>

Fair enough.  However, without tests and more documentation about this
concept, future development could easily break it.  It is good that you
and others who know this part well are around to respond, but more
documentation and tests would still be preferable.

> >
> > For the purpose of decoding in-progress transactions, I think we can
> > send replication_origin in the first 'start' message as it is present
> > with each WAL record, however replication_origin_lsn is only logged at
> > commit time, so can't send it before commit.  The
> > replication_origin_lsn is set by pg_replication_origin_xact_setup()
> > but it is not clear how and when that function can be used.  Do we
> > really need replication_origin_lsn before we decode the commit record?
> >
>
> That's the SQL interface, C interface does not require that and I don't
> think we need to do that.
>

I think by SQL interface you mean pg_replication_origin_xact_setup(),
but I am not sure which C interface you are referring to in the above
sentence.

> The existing apply code sets the
> replorigin_session_origin_lsn only when processing commit message IIRC.
>

That's correct.  However, we do send it via the 'begin' callback, which
won't be possible when streaming in-progress transactions.  Do we need
to send this origin-related information (origin, origin_lsn) while
streaming in-progress transactions?  If so, when?  As far as I can see,
the origin_id can be sent with the first 'start' message.  The
origin_lsn and origin_commit can be sent with the last 'start' of
streaming commit if we want, but I am not sure whether that is of use.
If we need to send origin_lsn earlier than that, then we need to record
it in WAL records other than the commit WAL record.
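
To make the ordering constraint above concrete, here is a minimal C
sketch.  It is only an illustration under stated assumptions, not
PostgreSQL code: StreamedTxn, OutMsg, stream_start(), decode_commit()
and stream_commit() are hypothetical names, and attaching origin_lsn to
the commit-time message is just one of the options discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not PostgreSQL's actual types. */
typedef uint64_t Lsn;
typedef uint16_t OriginId;

#define INVALID_LSN    ((Lsn) 0)
#define INVALID_ORIGIN ((OriginId) 0)

typedef struct StreamedTxn
{
    OriginId origin_id;    /* available with every WAL record of the txn */
    Lsn      origin_lsn;   /* logged only in the commit record */
    bool     origin_sent;  /* has origin_id already gone downstream? */
} StreamedTxn;

typedef struct OutMsg
{
    char     kind;            /* 'S' = stream start, 'C' = stream commit */
    bool     has_origin;
    OriginId origin_id;
    bool     has_origin_lsn;
    Lsn      origin_lsn;
} OutMsg;

/* Begin one streamed chunk; only the first chunk carries origin_id. */
OutMsg
stream_start(StreamedTxn *txn)
{
    OutMsg msg = {0};

    msg.kind = 'S';
    if (!txn->origin_sent && txn->origin_id != INVALID_ORIGIN)
    {
        msg.has_origin = true;
        msg.origin_id = txn->origin_id;
        txn->origin_sent = true;
    }
    /* origin_lsn is never attached here: it is unknown before commit. */
    return msg;
}

/* Decoding the commit record is the first point origin_lsn is known. */
void
decode_commit(StreamedTxn *txn, Lsn origin_lsn)
{
    txn->origin_lsn = origin_lsn;
}

/* The commit-time message is the earliest one that can carry origin_lsn. */
OutMsg
stream_commit(const StreamedTxn *txn)
{
    OutMsg msg = {0};

    msg.kind = 'C';
    if (txn->origin_id != INVALID_ORIGIN)
    {
        assert(txn->origin_lsn != INVALID_LSN);
        msg.has_origin_lsn = true;
        msg.origin_lsn = txn->origin_lsn;
    }
    return msg;
}
```

In this model, sending origin_lsn any earlier would require the value to
be present in some WAL record decoded before the commit record, which is
the open question raised above.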

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


