Re: Questions on logical replication - Mailing list pgsql-general
From | Koen De Groote |
---|---|
Subject | Re: Questions on logical replication |
Date | |
Msg-id | CAGbX52F7goP9u+FojqVYFWuB=HxBbOPC-6F_n0WKgkVep2x1nw@mail.gmail.com Whole thread Raw |
In response to | Re: Questions on logical replication (Justin <zzzzz.graf@gmail.com>) |
Responses |
Re: Questions on logical replication
|
List | pgsql-general |
> If there are any errors during the replay of WAL such as missing indexes for Replica Identities during an Update or Delete this will cause the main subscriber worker slot on the publisher to start backing up WAL files
And also if the connection breaks, from what I understand, is that correct? Anything that stops the subscription, including disabling the subscription, is that right?
> I suggest confirming all tables have replica identities or primary keys before going any further.
Yes, I am aware of this. I made me a small script that prints which tables I have added to the publication and are done syncing, and which are currently not being replicated.
> With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan on the subscriber for PG 15 and earlier.
I'm also aware of this. My plan is to create a publication with no tables, and add them 1 by 1, refreshing the subscriber each time.
I'm not planning on using "REPLICA IDENTITY FULL" anywhere.
On Sat, Jun 8, 2024 at 10:33 PM Justin <zzzzz.graf@gmail.com> wrote:
On Sat, Jun 8, 2024 at 1:41 PM Koen De Groote <kdg.dev@gmail.com> wrote:What I'm trying to do is upgrade a PG11 database to PG16, using logical replication.The PG11 has an active and a standby, there are a handful of databases. On particular one has a few tables just over 100GB, then a few 100 tables near 1GB.What I'd do is start a publication with no tables and add them 1 at a time, refreshing subscription each time.This might take a long time, so my main questions relate to potential network issues or various situations where the instance receiving the logical replication, suddenly stop being able to receive.Resyncing, and the effects of WAL buildup, are my main concern.Accidentally sent a mail to only your email, sorry for that.Regards,Koen De GrooteThis approach does not prevent WAL build up.The WAL build up is during the initial sync of the data by table NOT during the entire synce of all the tables that have been published.
The WAL build up occurs during the initial sync worker once that table is synced the WAL is replayed and released. The parent worker then become responsible for replaying the WAL for that table
For 1 gb table the initial sync will be very fast so I doubt any individual table will cause any significant WAL build up to put the publisher at risk of of crashing
Once a table becomes synced the main subscriber worker keeps the WAL replayed. If there are any errors during the replay of WAL such as missing indexes for Replica Identities during an Update or Delete this will cause the main subscriber worker slot on the publisher to start backing up WAL files. If there are missing replica identities the affected tables will have to be dropped from the publication and subscription refreshed. The WAL file is already written with incorrect information so the table on the subscriber table is most likely not in recoverable state.
I suggest confirming all tables have replica identities or primary keys before going any further. With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan on the subscriber for PG 15 and eariler. PG 16 on the subsciber can use a different unique index that has NOT NULL for all participating columns if the publisher is using Replicate Identity FULL on the published table
One must understand the above before deploying logical replication.
Hope this helps
pgsql-general by date: