Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: long-standing data loss bug in initial sync of logical replication
Date
Msg-id CAA4eK1+cUsnAL6jF4Lny5KiBz5iZv1S+38=zHZHskgo_ZRYMMw@mail.gmail.com
Whole thread Raw
In response to Re: long-standing data loss bug in initial sync of logical replication  (Nitin Motiani <nitinmotiani@google.com>)
Responses Re: long-standing data loss bug in initial sync of logical replication
List pgsql-hackers
On Mon, Sep 2, 2024 at 9:19 PM Nitin Motiani <nitinmotiani@google.com> wrote:
>
> I think that the partial data replication for one table is a bigger
> issue than the case of data being sent for a subset of the tables in
> the transaction. This can lead to inconsistent data if the same row is
> updated multiple times or deleted in the same transaction. In such a
> case if only the partial updates from the transaction are sent to the
> subscriber, it might end up with the data which was never visible on
> the publisher side.
>
> Here is an example I tried with the patch v8-001 :
>
> I created following 2 tables on the publisher and the subscriber :
>
> CREATE TABLE delete_test(id int primary key, name varchar(100));
> CREATE TABLE update_test(id int primary key, name varchar(100));
>
> I added both the tables to the publication p on the publisher and
> created a subscription s on the subscriber.
>
> I run 2 sessions on the publisher and do the following :
>
> Session 1 :
> BEGIN;
> INSERT INTO delete_test VALUES(0, 'Nitin');
>
> Session 2 :
> ALTER PUBLICATION p DROP TABLE delete_test;
>
> Session 1 :
> DELETE FROM delete_test WHERE id=0;
> COMMIT;
>
> After the commit there should be no new row created on the publisher.
> But because the partial data was replicated, this is what the select
> on the subscriber shows :
>
> SELECT * FROM delete_test;
>  id |   name
> ----+-----------
>   0 | Nitin
> (1 row)
>
> I don't think the above is a common use case. But this is still an
> issue because the subscriber has the data which never existed on the
> publisher.
>

I don't think that is the correct conclusion because the user has
intentionally avoided sending part of the transaction changes. This
can happen in various ways without the patch as well. For example, if
the user has performed the ALTER in the same transaction.

Publisher:
=========
BEGIN
postgres=*# Insert into delete_test values(0, 'Nitin');
INSERT 0 1
postgres=*# Alter Publication pub1 drop table delete_test;
ALTER PUBLICATION
postgres=*# Delete from delete_test where id=0;
DELETE 1
postgres=*# commit;
COMMIT
postgres=# select * from delete_test;
 id | name
----+------
(0 rows)

Subscriber:
=========
postgres=# select * from delete_test;
 id | name
----+-------
  0 | Nitin
(1 row)

This can also happen when the user has published only 'inserts' but
not 'updates' or 'deletes'.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: [PATCH] Add roman support for to_number function
Next
From: Daniel Gustafsson
Date:
Subject: Re: optimizing pg_upgrade's once-in-each-database steps