Re: Logical replication - initial data synchronization - Mailing list pgsql-docs
From | Koen De Groote |
---|---|
Subject | Re: Logical replication - initial data synchronization |
Date | |
Msg-id | CAGbX52HcDV7S5tEbsQEDWAJkfMBrm7OYaCmn_bt5shtm_Td-YQ@mail.gmail.com |
In response to | Logical replication - initial data synchronization (PG Doc comments form <noreply@postgresql.org>) |
List | pgsql-docs |
Hello Bruce, thanks for picking this up.
Having used logical replication for months now, my original comment seems odd as I write this, but I remember it being part of my initial confusion.
Instead of:
"Internally logical replication of a table starts by taking a snapshot
of the data on the publisher database and copying that to the subscriber."
I would say:
"When logical replication is started for a table, Postgres internally
takes a snapshot of the table data on the publisher database,
and then copies that data to the subscriber."
Also, I would change:
"Once complete, the changes on the publisher are sent to the subscriber"
To:
"Once complete, any changes on the publisher since the initial copy are sent to the subscriber"
This is more explicit and clear, I feel.
And then to be consistent I'd also use this wording in the last change, changing:
"publisher database. Once complete, changes on the publisher are sent"
to
"publisher database. Once complete, any changes on the publisher since the initial copy are sent"
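To illustrate the ordering that wording describes, here is a toy Python model of "snapshot, copy, then send changes made since the initial copy". It is only a sketch of the concept: the Publisher and Subscriber classes and all of their methods are hypothetical names invented for this example, and it says nothing about how Postgres implements this internally (replication slots, WAL, and so on).

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical stand-in for a publisher: one table plus an ordered change log."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # (key, value) changes, in order

    def write(self, key, value):
        # Apply the change to the table and record it in the log.
        self.rows[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # Return a copy of the table and the log position it corresponds to.
        return dict(self.rows), len(self.log)


@dataclass
class Subscriber:
    """Hypothetical stand-in for a subscriber: just a copy of the table."""
    rows: dict = field(default_factory=dict)

    def initial_copy(self, snapshot_rows):
        # Load the snapshot taken on the publisher.
        self.rows = dict(snapshot_rows)

    def apply(self, changes):
        # Replay changes in the order the publisher logged them.
        for key, value in changes:
            self.rows[key] = value


pub = Publisher()
pub.write("a", 1)                      # pre-existing table data
snap_rows, snap_pos = pub.snapshot()   # 1. snapshot on the publisher
pub.write("b", 2)                      # a change made while the copy runs

sub = Subscriber()
sub.initial_copy(snap_rows)            # 2. copy the snapshot to the subscriber
sub.apply(pub.log[snap_pos:])          # 3. send changes since the initial copy
```

In this model the subscriber ends up with the same rows as the publisher without any manual dump or restore step, which is the point I was originally missing.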
Hope that's ok.
Thanks for looking into this.
Regards,
Koen De Groote
On Thu, Oct 17, 2024 at 3:20 AM Bruce Momjian <bruce@momjian.us> wrote:
On Sat, May 18, 2024 at 09:02:11PM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/logical-replication-subscription.html
> Description:
>
> I'm reading up on Logical Replication and have been reading the pages in
> order.
>
> The first 2 pages:
> https://www.postgresql.org/docs/current/logical-replication.html and
> https://www.postgresql.org/docs/current/logical-replication-publication.html
> both speak of the requirement to set up a snapshot and explain that
> publication will then send further updates as they happen to subscribers.
>
> But the 3rd page,
> https://www.postgresql.org/docs/current/logical-replication-subscription.html
> now mentions this: "Additional replication slots may be required for the
> initial data synchronization of pre-existing table data and those will be
> dropped at the end of data synchronization."
>
> For me, reading the first 2 pages implied that I would have to perform some
> manual command that starts the creation of a snapshot of pre-existing table
> data, and unpack this on the subscriber node somehow.
>
> The text on the "Subscription" page sounds to me like this is actually
> something the publisher <-> subscriber model of the postgres software can
> manage on its own. As opposed to a snapshot, which feels more like the
> concept of a basebackup.
>
> Regardless of that being correct or not, my current impression is that the
> description isn't consistent across pages. Maybe the text is obvious for
> people who've performed setup of logical replication before, but I have
> never done this. To me, the description on the first 2 pages seems
> inconsistent with the description I just encountered on the 3rd page. I was
> under the impression there was no such thing as "initial data
> synchronization of pre-existing table data" in terms of postgres doing this
> by itself.
>
> Am I missing something extremely simple, or can the description of the
> involved operations be made more consistent across documentation pages?
Is the attached patch an improvement?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
When a patient asks the doctor, "Am I going to die?", he means
"Am I going to die soon?"