Thread: Logical subscription / publication lifetimes
If I define a publication at time Tp, then load some data on the publisher, then start a subscription at time Ts, then load some more data on the publisher, does the subscriber get data from Tp or Ts onwards?

Also, if a subscription is disabled and then re-enabled, does it lose the data in between, or is it back-filled?

I am not finding the answers to these questions in the docs at https://www.postgresql.org/docs/current/logical-replication.html but maybe I am overlooking something. The link above does mention copying an existing table, which may imply Ts?

Thanks,
Andrew

On Fri, Apr 22, 2022 at 5:00 AM andrew cooke <andrew@acooke.org> wrote:
> If I define a publication at time Tp, then load some data on the
> publisher, then start a subscription at time Ts, then load some more
> data on the publisher, does the subscriber get data from Tp or Ts
> onwards?

It depends. By default, neither: the publisher publishes the entire contents of the table, and the subscriber will do everything necessary to replicate those contents in their entirety.
If you specify copy_data = false, I'm not sure what you end up with initially or after a disable. My guess is that the subscription defines the first transaction it cares about when it connects to the publisher, defaulting to the most recent publisher transaction (all older transactions would be handled via copy_data = true). After that, so long as the slot remains in place, the publisher keeps the changes available through the slot even while the subscriber is not active, and the subscriber will receive all of them the next time it comes online or is re-enabled.
David J.
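
A toy model may make the two cases easier to compare. The Python below is not PostgreSQL code: `Publisher`, `Subscriber`, `poll` and `run` are hypothetical names, and the `slot` list only stands in for the changes a replication slot retains. It assumes the behaviour described above: the optional initial copy takes the table as it stands at Ts, and changes made after the slot is created are kept until the subscriber applies them, whether or not the subscription is enabled at the time.

```python
class Publisher:
    """Toy publisher: a table plus one slot that records changes made after Ts."""

    def __init__(self):
        self.table = []    # rows currently in the published table
        self.slot = None   # None until a subscription creates the slot

    def insert(self, row):
        self.table.append(row)
        if self.slot is not None:
            # Changes are retained even while the subscriber is disabled.
            self.slot.append(row)


class Subscriber:
    """Toy subscriber: optional initial copy, then replay of retained changes."""

    def __init__(self, publisher, copy_data=True):
        self.publisher = publisher
        self.enabled = True
        publisher.slot = []  # the slot is created at Ts
        # copy_data=True copies the table as it stands at Ts.
        self.table = list(publisher.table) if copy_data else []

    def poll(self):
        """Apply retained changes if enabled; a disabled subscriber applies nothing."""
        if self.enabled:
            self.table.extend(self.publisher.slot)
            self.publisher.slot.clear()


def run(copy_data):
    """Replay the scenario from the question and return two subscriber states."""
    pub = Publisher()
    pub.insert("before Ts")            # loaded between Tp and Ts
    sub = Subscriber(pub, copy_data)   # Ts
    pub.insert("after Ts")
    sub.poll()
    sub.enabled = False                # stands in for disabling the subscription
    pub.insert("while disabled")
    sub.poll()                         # nothing is applied yet
    seen_while_disabled = list(sub.table)
    sub.enabled = True                 # stands in for re-enabling it
    sub.poll()                         # back-filled from the slot
    return seen_while_disabled, sub.table
```

In this model `run(True)` leaves the subscriber with all three rows, while `run(False)` never delivers the row loaded before Ts. In both cases the row inserted while the subscription was disabled arrives once it is re-enabled.
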
Ah, thanks! I should have read the documentation of all the parameters!

So the portion of data that is covered by "copy_data" is going to reflect updates and deletes prior to the creation of the slot even if "publish=insert" (only)? This makes sense because I can't see how else it could be practically implemented, but just want to be sure I am understanding.

The idea that there are two phases (copy existing data, then replicate operations) is a big help.

Thanks again,
Andrew

On Fri, Apr 22, 2022 at 09:13:15AM -0700, David G. Johnston wrote:
> On Fri, Apr 22, 2022 at 5:00 AM andrew cooke <andrew@acooke.org> wrote:
>
> > If I define a publication at time Tp, then load some data on the
> > publisher, then start a subscription at time Ts, then load some more
> > data on the publisher, does the subscriber get data from Tp or Ts
> > onwards?
>
> It depends. By default, neither, the publisher is publishing the entire
> contents of the table and the subscriber will do everything necessary to
> replicate those contents in their entirety.
>
> If you specify copy_data = false I'm not sure what you end up with
> initially or after disable. My guess is the subscription defines the first
> transaction it cares about when it connects to the publisher, defaulting to
> the most recent publisher transaction (all older transactions would be
> handled via copy_data = true) but then so long as the slot remains active
> the publisher will place the data into the slot even while the subscriber
> is not active and the subscriber will receive all of it next time it comes
> online/re-enables.
>
> David J.
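
The "publish=insert" question can be sketched with the same two-phase idea. The snippet below is a toy model, not PostgreSQL code: `apply` and the sample operations are hypothetical. It assumes, as the PostgreSQL documentation on initial data synchronization appears to describe, that the initial copy does not take the publish parameter into account, so the copied rows already reflect earlier updates and deletes, and that only the operations listed in publish are replicated afterwards.

```python
def apply(rows, ops, publish=("insert", "update", "delete")):
    """Apply (op, key, value) tuples to a copy of rows, skipping unpublished ops."""
    rows = dict(rows)
    for op, key, value in ops:
        if op not in publish:
            continue  # operation type is not published, so it is never replicated
        if op == "delete":
            rows.pop(key, None)
        else:  # insert or update
            rows[key] = value
    return rows


# Publisher history before the subscription starts (before Ts).
before_ts = [("insert", 1, "a"), ("update", 1, "b"),
             ("insert", 2, "c"), ("delete", 2, None)]
# Publisher history after Ts.
after_ts = [("insert", 3, "d"), ("update", 1, "z"), ("delete", 3, None)]

# Phase 1: the initial copy sees the table as it stands at Ts.
snapshot = apply({}, before_ts)

# Phase 2: only published operations are replayed on top of the copy.
insert_only = apply(snapshot, after_ts, publish=("insert",))
everything = apply(snapshot, after_ts)
```

Here `snapshot` is `{1: "b"}`, so the update and delete made before Ts are already reflected in the copied data. With publish limited to inserts the subscriber ends at `{1: "b", 3: "d"}`, and with all operations published it ends at `{1: "z"}`.
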