Thread: Logical subscription / publication lifetimes

Logical subscription / publication lifetimes

From
andrew cooke
Date:
If I define a publication at time Tp, then load some data on the
publisher, then start a subscription at time Ts, then load some more
data on the publisher, does the subscriber get data from Tp or Ts
onwards?

Also, if a subscription is disabled and then re-enabled, does it lose
the data in between, or is it back-filled?

I am not finding the answers to these questions in the docs at
https://www.postgresql.org/docs/current/logical-replication.html but
maybe I am overlooking something.  The link above does mention copying
an existing table which may imply Ts?

Thanks,
Andrew



Re: Logical subscription / publication lifetimes

From
"David G. Johnston"
Date:
On Fri, Apr 22, 2022 at 5:00 AM andrew cooke <andrew@acooke.org> wrote:

> If I define a publication at time Tp, then load some data on the
> publisher, then start a subscription at time Ts, then load some more
> data on the publisher, does the subscriber get data from Tp or Ts
> onwards?


It depends.  By default, neither: the publisher publishes the entire contents of the table, and the subscriber will do everything necessary to replicate those contents in their entirety.

If you specify copy_data = false, I'm not sure what you end up with initially or after a disable.  My guess is that the subscription defines the first transaction it cares about when it connects to the publisher, defaulting to the most recent publisher transaction (all older transactions would be handled via copy_data = true).  After that, so long as the slot remains active, the publisher will place the data into the slot even while the subscriber is not active, and the subscriber will receive all of it the next time it comes online or is re-enabled.

David J.
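
The slot behaviour guessed at above can be pictured with a small toy model. This is only a sketch, assuming a slot behaves like a queue that retains changes until the subscriber consumes them; ToySlot and ToySubscriber are hypothetical names invented for illustration, and nothing here is PostgreSQL code or talks to a server.

```python
# Toy model only: not PostgreSQL code, and it does not connect to a server.
from dataclasses import dataclass, field


@dataclass
class ToySlot:
    """Hypothetical stand-in for a replication slot on the publisher."""
    pending: list = field(default_factory=list)

    def record(self, change):
        # Assumption: the slot retains every change until it is consumed.
        self.pending.append(change)

    def drain(self):
        # Hand over everything retained so far, oldest first.
        changes, self.pending = self.pending, []
        return changes


@dataclass
class ToySubscriber:
    """Hypothetical stand-in for a subscription that can be disabled."""
    enabled: bool = True
    rows: list = field(default_factory=list)

    def poll(self, slot):
        # A disabled subscriber consumes nothing; the slot keeps accumulating.
        if self.enabled:
            self.rows.extend(slot.drain())


slot = ToySlot()
sub = ToySubscriber()

slot.record("row 1")
sub.poll(slot)            # enabled: "row 1" is delivered

sub.enabled = False       # analogous to disabling the subscription
slot.record("row 2")
slot.record("row 3")
sub.poll(slot)            # nothing is delivered while disabled

sub.enabled = True        # analogous to re-enabling the subscription
sub.poll(slot)            # the retained changes arrive now

print(sub.rows)           # ['row 1', 'row 2', 'row 3']
```

In this model nothing is lost across the disable/re-enable cycle, because the queue lives on the publisher side rather than with the subscriber.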

Re: Logical subscription / publication lifetimes

From
andrew cooke
Date:
Ah, thanks!  I should have read the documentation of all the
parameters!

So the portion of data that is covered by "copy_data" is going to
reflect updates and deletes prior to the creation of the slot even if
"publish=insert" (only)?

This makes sense because I can't see how else it could be practically
implemented, but just want to be sure I am understanding.  The idea
that there are two phases (copy existing data then replicate
operations) is a big help.

Thanks again,
Andrew
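
The two-phase idea above (copy the existing data, then replicate operations) can be sketched as a toy model. It assumes the initial copy is a snapshot of the table's current contents and that the "publish" setting filters only the operations replicated afterwards; plain dictionaries stand in for tables, and apply_change and published_ops are hypothetical names invented for illustration.

```python
# Toy model only: dictionaries stand in for tables; this is not PostgreSQL code.

def apply_change(table, op, key, value=None):
    """Apply one change to a table held as a dict of key -> value."""
    if op == "delete":
        table.pop(key, None)
    else:  # "insert" or "update"
        table[key] = value


publisher = {}

# Before the subscription exists, changes touch only the publisher.
apply_change(publisher, "insert", 1, "a")
apply_change(publisher, "insert", 2, "b")
apply_change(publisher, "update", 1, "a2")
apply_change(publisher, "delete", 2)

# Phase 1: the initial copy takes a snapshot of the current contents,
# so it already reflects the earlier update and delete.
subscriber = dict(publisher)

# Phase 2: later changes are replicated only if their operation is published.
published_ops = {"insert"}  # assumption: mirrors publish = 'insert'
later_changes = [("insert", 3, "c"), ("update", 1, "a3"), ("delete", 3, None)]
for op, key, value in later_changes:
    apply_change(publisher, op, key, value)
    if op in published_ops:
        apply_change(subscriber, op, key, value)

print(publisher)   # {1: 'a3'}
print(subscriber)  # {1: 'a2', 3: 'c'}
```

The subscriber ends up with the pre-subscription update and delete baked into its snapshot, but it misses the later update and delete because only inserts are replicated in the second phase.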

On Fri, Apr 22, 2022 at 09:13:15AM -0700, David G. Johnston wrote:
> On Fri, Apr 22, 2022 at 5:00 AM andrew cooke <andrew@acooke.org> wrote:
> 
> >
> > If I define a publication at time Tp, then load some data on the
> > publisher, then start a subscription at time Ts, then load some more
> > data on the publisher, does the subscriber get data from Tp or Ts
> > onwards?
> >
> >
> It depends.  By default, neither: the publisher publishes the entire
> contents of the table, and the subscriber will do everything necessary to
> replicate those contents in their entirety.
> 
> If you specify copy_data = false, I'm not sure what you end up with
> initially or after a disable.  My guess is that the subscription defines
> the first transaction it cares about when it connects to the publisher,
> defaulting to the most recent publisher transaction (all older
> transactions would be handled via copy_data = true).  After that, so long
> as the slot remains active, the publisher will place the data into the
> slot even while the subscriber is not active, and the subscriber will
> receive all of it the next time it comes online or is re-enabled.
> 
> David J.