Re: Questions on logical replication - Mailing list pgsql-general

From Koen De Groote
Subject Re: Questions on logical replication
Msg-id CAGbX52EGNrR_UODu6sjUThLnLdadGaQ5qKSCnzJB9uJtRgTN1A@mail.gmail.com
In response to Re: Questions on logical replication  (Justin <zzzzz.graf@gmail.com>)
Responses Re: Questions on logical replication
List pgsql-general
> Why? What benefit does this provide you? Add all the tables when creating the publication and be done with it. I get this when trying to understand how this all works on test boxes, but for production I have no idea what you're trying to accomplish.

Adding all tables at once means adding the gigantic tables as well. Disk I/O and network traffic are serious concerns, as are increased CPU usage affecting queries on the live system and transaction wraparound.

Initial sync can be a serious concern, depending on the size of the table.

Here's a nice guide where people did a logical replication upgrade, explaining why they did it this way: https://knock.app/blog/zero-downtime-postgres-upgrades

On Wed, Jun 12, 2024 at 7:01 PM Justin <zzzzz.graf@gmail.com> wrote:


On Tue, Jun 11, 2024 at 5:43 PM Koen De Groote <kdg.dev@gmail.com> wrote:
> If there are any errors during the replay of WAL such as missing indexes for Replica Identities during an Update or Delete  this will cause the main subscriber worker slot on the publisher to start backing up WAL files

From what I understand, the same happens if the connection breaks. Is that correct? Does it apply to anything that stops the subscription, including disabling it?
 
Yes to all.... 
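Since a stopped or failing subscription makes the publisher hold on to WAL, it can help to watch how much each logical slot is retaining. Here is a minimal sketch of that check, assuming PostgreSQL 10 or later function names. The query is meant to be run on the publisher with whatever client you already use; the helper name `slots_over_limit` and the threshold are hypothetical, not something from this thread.

```python
# Query for the publisher. pg_wal_lsn_diff and pg_current_wal_lsn are the
# PostgreSQL 10+ names; restart_lsn is the oldest WAL position the slot
# still needs, so the difference approximates the WAL being held back.
RETAINED_WAL_SQL = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY retained_bytes DESC NULLS LAST;
"""


def slots_over_limit(rows, limit_bytes):
    """Return the names of slots retaining more than limit_bytes of WAL.

    rows: iterable of (slot_name, active, retained_bytes) tuples, shaped
    like the result of RETAINED_WAL_SQL. retained_bytes may be None when
    a slot has no restart_lsn yet; such slots are skipped.
    """
    return [
        name
        for name, _active, retained in rows
        if retained is not None and retained > limit_bytes
    ]


if __name__ == "__main__":
    # Hypothetical rows, standing in for a real query result.
    sample = [("sub_main", False, 5_000_000_000), ("sub_test", True, 1024)]
    print(slots_over_limit(sample, 1_000_000_000))
```

An inactive slot with a growing `retained_bytes` is the situation described above: nothing is consuming the slot, so the publisher cannot recycle that WAL.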


> I suggest confirming all tables have replica identities or primary keys before going any further.

Yes, I am aware of this. I wrote a small script that prints which tables have been added to the publication and have finished syncing, and which are not currently being replicated.
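The script itself is not shown here, but a rough sketch of the idea could look like the following. It assumes the per-table state is read from `pg_subscription_rel` on the subscriber, where `srsubstate = 'r'` means the table is in normal replication. The function name `classify_tables` and the choice to count only `'r'` as done are my assumptions.

```python
# Query for the subscriber. srsubstate codes per the catalog docs:
# i = initialize, d = data is being copied, s = synchronized, r = ready
# (newer releases also have f = finished table copy).
# Note: regclass output omits the schema for tables on the search_path.
SUBSCRIPTION_STATE_SQL = """
SELECT srrelid::regclass::text AS table_name, srsubstate
FROM pg_subscription_rel;
"""


def classify_tables(all_tables, sub_rows):
    """Split tables into ready, still syncing, and not replicated.

    all_tables: iterable of table names that should end up replicated.
    sub_rows:   iterable of (table_name, srsubstate) tuples, shaped like
                the result of SUBSCRIPTION_STATE_SQL.
    """
    state_by_table = dict(sub_rows)
    report = {"ready": [], "syncing": [], "not_replicated": []}
    for table in sorted(all_tables):
        state = state_by_table.get(table)
        if state is None:
            report["not_replicated"].append(table)
        elif state == "r":
            report["ready"].append(table)
        else:
            # Any other state means the initial sync has not completed.
            report["syncing"].append(table)
    return report


if __name__ == "__main__":
    # Hypothetical table names, standing in for real query results.
    wanted = {"orders", "customers", "events"}
    rows = [("orders", "r"), ("customers", "d")]
    for status, tables in classify_tables(wanted, rows).items():
        print(status, tables)
```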
 

> With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan on the subscriber for PG 15 and earlier.

I'm also aware of this. My plan is to create a publication with no tables, then add them one by one, refreshing the subscription each time.
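To make that plan concrete, here is a minimal sketch that only generates the statements, in order, and says where each one runs. `CREATE PUBLICATION` without a `FOR` clause, `ALTER PUBLICATION ... ADD TABLE` and `ALTER SUBSCRIPTION ... REFRESH PUBLICATION` are standard commands; the helper names and the example publication, subscription and table names are hypothetical. The `CREATE SUBSCRIPTION` step is left out because it needs a connection string.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def incremental_steps(publication, subscription, tables):
    """Yield (where, sql) pairs for adding tables one at a time.

    tables: iterable of (schema, table) tuples, in the order to add them.
    The subscription is assumed to exist already on the subscriber.
    """
    pub = quote_ident(publication)
    sub = quote_ident(subscription)
    # A publication created without a FOR clause starts out empty.
    yield ("publisher", f"CREATE PUBLICATION {pub};")
    for schema, table in tables:
        target = f"{quote_ident(schema)}.{quote_ident(table)}"
        yield ("publisher", f"ALTER PUBLICATION {pub} ADD TABLE {target};")
        # The subscriber only notices newly published tables after a
        # refresh, which also starts the initial copy of that table.
        yield ("subscriber", f"ALTER SUBSCRIPTION {sub} REFRESH PUBLICATION;")


if __name__ == "__main__":
    # Hypothetical names, smallest table first.
    plan = incremental_steps("upgrade_pub", "upgrade_sub",
                             [("public", "customers"), ("public", "events")])
    for where, sql in plan:
        print(f"[{where}] {sql}")
```

In practice you would wait for each table's initial sync to finish before running the next pair of statements, so that only one large copy is in flight at a time.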
 
Why? What benefit does this provide you? Add all the tables when creating the publication and be done with it. I get this when trying to understand how this all works on test boxes, but for production I have no idea what you're trying to accomplish.


I'm not planning on using "REPLICA IDENTITY FULL" anywhere.
Good 
