Re: Initial Schema Sync for Logical Replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Initial Schema Sync for Logical Replication
Date
Msg-id CAA4eK1KWzAamnfB=YymAi3d_QuL=02dKXoG4QMdmYP-gpabR0g@mail.gmail.com
Whole thread Raw
In response to RE: Initial Schema Sync for Logical Replication  ("Kumar, Sachin" <ssetiya@amazon.com>)
Responses RE: Initial Schema Sync for Logical Replication  ("Kumar, Sachin" <ssetiya@amazon.com>)
List pgsql-hackers
On Thu, Mar 23, 2023 at 9:24 PM Kumar, Sachin <ssetiya@amazon.com> wrote:
>
> > From: Amit Kapila <amit.kapila16@gmail.com>
> > IIUC, this is possible only if tablesync process uses a snapshot different than the
> > snapshot we have used to perform the initial schema sync, otherwise, this
> > shouldn't be a problem. Let me try to explain my understanding with an example
> > (the LSNs used are just explain the
> > problem):
> >
> > 1. Create Table t1(c1, c2); --LSN: 90
> > 2. Insert t1 (1, 1); --LSN 100
> > 3. Insert t1 (2, 2); --LSN 110
> > 4. Alter t1 Add Column c3; --LSN 120
> > 5. Insert t1 (3, 3, 3); --LSN 130
> >
> > Now, say before starting tablesync worker, apply process performs initial
> > schema sync and uses a snapshot corresponding to LSN 100. Then it starts
> > tablesync process to allow the initial copy of data in t1.
> > Here, if the table sync process tries to establish a new snapshot, it may get data
> > till LSN 130 and when it will try to copy the same in subscriber it will fail. Is my
> > understanding correct about the problem you described?
> Right
> > If so, can't we allow
> > tablesync process to use the same exported snapshot as we used for the initial
> > schema sync and won't that solve the problem you described?
> I think we won't be able to use same snapshot because the transaction will be committed.
> In CreateSubscription() we can use the transaction snapshot from walrcv_create_slot()
> till walrcv_disconnect() is called.(I am not sure about this part maybe walrcv_disconnect() calls
> the commits internally ?).
> So somehow we need to keep this snapshot alive, even after transaction is committed(or delay committing
> the transaction , but we can have CREATE SUBSCRIPTION with ENABLED=FALSE, so we can have a restart before
> tableSync is able to use the same snapshot.)
>

Can we think of getting the table data as well along with schema via
pg_dump? Won't then both schema and initial data will correspond to
the same snapshot?

> > > Refresh publication :-
> > >
> > > In refresh publication, subscriber does create a new replication slot
> Typo-> subscriber does not
> > > hence , we can’t run
> > >
> > > pg_dump with a snapshot which starts from origin(maybe this is not an
> > > issue at all). In this case
> > >
> > > it makes more sense for tableSync worker to do schema sync.
> > >
> >
> > Can you please explain this problem with some examples?
> I think we can have same issues as you mentioned
> New table t1 is added to the publication , User does a refresh publication.
> pg_dump / pg_restore restores the table definition. But before tableSync
> can start,  steps from 2 to 5 happen on the publisher.
> > 1. Create Table t1(c1, c2); --LSN: 90
> > 2. Insert t1 (1, 1); --LSN 100
> > 3. Insert t1 (2, 2); --LSN 110
> > 4. Alter t1 Add Column c3; --LSN 120
> > 5. Insert t1 (3, 3, 3); --LSN 130
> And table sync errors out
> There can be one more issue , since we took the pg_dump without snapshot (wrt to replication slot).
>

To avoid both the problems mentioned for Refresh Publication, we can
do one of the following: (a) create a new slot along with a snapshot
for this operation and drop it afterward; or (b) using the existing
slot, establish a new snapshot using a technique proposed in email
[1].

Note - Please keep one empty line before and after your inline
responses, otherwise, it is slightly difficult to understand your
response.

[1] - https://www.postgresql.org/message-id/CAGPVpCRWEVhXa7ovrhuSQofx4to7o22oU9iKtrOgAOtz_%3DY6vg%40mail.gmail.com

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Next
From: Etsuro Fujita
Date:
Subject: Re: Doc: Improve note about copying into postgres_fdw foreign tables in batch