Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From Nitin Motiani
Subject Re: long-standing data loss bug in initial sync of logical replication
Date
Msg-id CAH5HC94xErGnFnnrs+=nJtCOSsXZiLBa_1YenCgiOLf=f=nh8g@mail.gmail.com
Whole thread Raw
In response to Re: long-standing data loss bug in initial sync of logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Sep 5, 2024 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 2, 2024 at 9:19 PM Nitin Motiani <nitinmotiani@google.com> wrote:
> >
> > I think that the partial data replication for one table is a bigger
> > issue than the case of data being sent for a subset of the tables in
> > the transaction. This can lead to inconsistent data if the same row is
> > updated multiple times or deleted in the same transaction. In such a
> > case if only the partial updates from the transaction are sent to the
> > subscriber, it might end up with the data which was never visible on
> > the publisher side.
> >
> > Here is an example I tried with the patch v8-001 :
> >
> > I created following 2 tables on the publisher and the subscriber :
> >
> > CREATE TABLE delete_test(id int primary key, name varchar(100));
> > CREATE TABLE update_test(id int primary key, name varchar(100));
> >
> > I added both the tables to the publication p on the publisher and
> > created a subscription s on the subscriber.
> >
> > I run 2 sessions on the publisher and do the following :
> >
> > Session 1 :
> > BEGIN;
> > INSERT INTO delete_test VALUES(0, 'Nitin');
> >
> > Session 2 :
> > ALTER PUBLICATION p DROP TABLE delete_test;
> >
> > Session 1 :
> > DELETE FROM delete_test WHERE id=0;
> > COMMIT;
> >
> > After the commit there should be no new row created on the publisher.
> > But because the partial data was replicated, this is what the select
> > on the subscriber shows :
> >
> > SELECT * FROM delete_test;
> >  id |   name
> > ----+-----------
> >   0 | Nitin
> > (1 row)
> >
> > I don't think the above is a common use case. But this is still an
> > issue because the subscriber has the data which never existed on the
> > publisher.
> >
>
> I don't think that is the correct conclusion because the user has
> intentionally avoided sending part of the transaction changes. This
> can happen in various ways without the patch as well. For example, if
> the user has performed the ALTER in the same transaction.
>
> Publisher:
> =========
> BEGIN
> postgres=*# Insert into delete_test values(0, 'Nitin');
> INSERT 0 1
> postgres=*# Alter Publication pub1 drop table delete_test;
> ALTER PUBLICATION
> postgres=*# Delete from delete_test where id=0;
> DELETE 1
> postgres=*# commit;
> COMMIT
> postgres=# select * from delete_test;
>  id | name
> ----+------
> (0 rows)
>
> Subscriber:
> =========
> postgres=# select * from delete_test;
>  id | name
> ----+-------
>   0 | Nitin
> (1 row)
>
> This can also happen when the user has published only 'inserts' but
> not 'updates' or 'deletes'.
>

Thanks for the clarification. I didn't think of this case. The change
seems fine if this can already happen.

Thanks & Regards
Nitin Motiani
Google



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Invalid Assert while validating REPLICA IDENTITY?
Next
From: Alvaro Herrera
Date:
Subject: Re: First draft of PG 17 release notes