RE: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: long-standing data loss bug in initial sync of logical replication
Date
Msg-id OSCPR01MB149664A485A89B0AC6FB7BA71F5DF2@OSCPR01MB14966.jpnprd01.prod.outlook.com
In response to RE: long-standing data loss bug in initial sync of logical replication  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: long-standing data loss bug in initial sync of logical replication
List pgsql-hackers
Dear hackers,

> Regarding the PG13, it cannot be
> applied
> as-is thus some adjustments are needed. I will share it in upcoming posts.

Here is a patch set for PG13. Unlike for PG14-17, the patch could not be created
as-is, because...

1. The WAL record for invalidation messages (XLOG_XACT_INVALIDATIONS) does not exist.
2. Thus the ReorderBufferChange for invalidations does not exist either.
   Our patch tries to distribute such changes, which cannot be done as-is.
3. The code assumes that invalidation messages can be added only once.
4. The timing at which invalidation messages are consumed is limited to:
  a. when a COMMAND_ID change is popped,
  b. the start of decoding a transaction, or
  c. the end of decoding a transaction.

The above means that invalidations cannot be executed while a transaction is
being decoded. I created two patches to resolve the data loss issue. 0001 has
fewer code changes but resolves only part of the issue; 0002 has much larger
changes but provides a complete solution.

0001 - mostly the same as the patches for other versions. ReorderBufferAddInvalidations()
       was adjusted so that it can be called several times. As I said above,
       0001 cannot execute inval messages while decoding the transactions.
0002 - introduces a new ReorderBufferChange type to indicate inval messages.
       They would be handled like in PG14+.
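
To illustrate item 3 and the 0001 adjustment, here is a toy model in Python. It is
not PostgreSQL code; the class and method names are hypothetical and only show the
difference between "can be added only once" and "can be added several times".

```python
# Toy model only: ToyTxn and its methods are hypothetical, not PostgreSQL APIs.
class ToyTxn:
    def __init__(self):
        # Invalidation messages collected for this transaction.
        self.invalidations = []

    def add_invalidations_once(self, msgs):
        # Models the assumption in item 3: a second call is an error.
        if self.invalidations:
            raise RuntimeError("invalidations were already added")
        self.invalidations = list(msgs)

    def add_invalidations(self, msgs):
        # Models the adjusted behavior: later calls append to earlier ones.
        self.invalidations.extend(msgs)


txn = ToyTxn()
txn.add_invalidations(["inval from S2"])
txn.add_invalidations(["inval from another transaction"])
```

With the adjusted method, messages distributed from several committed transactions
accumulate in one list instead of raising an error on the second call.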

Here is an example. Assume that the table foo exists on both nodes, that there
is a publication "pub" which publishes all tables, and a subscription "sub" which
subscribes to "pub". What happens if the following workload is executed?

```
S1                              S2
BEGIN;
INSERT INTO foo VALUES (1);
                                ALTER PUBLICATION pub RENAME TO pub_renamed;
INSERT INTO foo VALUES (2);
COMMIT;
LR -> ?
```

With 0001, tuples (1) and (2) would be replicated to the subscriber.
An error "publication "pub" does not exist" would be raised later, when new
changes are made.

0001+0002 works more aggressively; the error would be raised when the S1 transaction
is decoded. The behavior is the same as for patched PG14-PG17.
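
The difference can be sketched with a small simulation. This is a toy model under
my own assumptions, not the decoding code: the Decoder class, the change tuples
and the two-version catalog are all hypothetical. The only thing it models is
when the cached publication lookup is invalidated: at the start and end of a
transaction only (0001), or also in the middle of the change stream (0001+0002).

```python
# Toy model only: none of these names are PostgreSQL APIs.
class PublicationMissing(Exception):
    pass

# Catalog contents before and after ALTER PUBLICATION ... RENAME.
CATALOG = [{"pub"}, {"pub_renamed"}]

class Decoder:
    def __init__(self):
        self.version = 0          # catalog version seen by the current snapshot
        self.cache_valid = False  # stands in for the cached publication lookup
        self.output = []          # tuples passed to the output side

    def invalidate(self):
        self.cache_valid = False

    def insert(self, value):
        # The publication is looked up again only when the cache is invalid.
        if not self.cache_valid:
            if "pub" not in CATALOG[self.version]:
                raise PublicationMissing('publication "pub" does not exist')
            self.cache_valid = True
        self.output.append(value)

def decode(decoder, changes, inline_inval):
    decoder.invalidate()                  # start of decoding a transaction
    for kind, arg in changes:
        if kind == "insert":
            decoder.insert(arg)
        elif kind == "snapshot":
            decoder.version = arg         # S2's catalog change becomes visible
        elif kind == "inval" and inline_inval:
            decoder.invalidate()          # 0002: inval handled as a change
    decoder.invalidate()                  # end of decoding a transaction

# Changes of S1 in the order they are replayed.
S1 = [("insert", 1), ("snapshot", 1), ("inval", None), ("insert", 2)]

only_0001 = Decoder()
decode(only_0001, S1, inline_inval=False)
print(only_0001.output)                   # both tuples pass through

try:
    decode(Decoder(), S1, inline_inval=True)
except PublicationMissing as e:
    print("0001+0002:", e)                # fails while decoding S1
```

In the 0001 case the stale cache survives until the end of S1, so the error only
appears when a later transaction is decoded with the same Decoder.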

Best regards,
Hayato Kuroda
FUJITSU LIMITED

