Dear hackers,
> Regarding the PG13, it cannot be
> applied
> as-is thus some adjustments are needed. I will share it in upcoming posts.
Here is a patch set for PG13. Apart from PG14-17, the patch could be created as-is,
because...
1. WAL record for invalidation messages (XLOG_XACT_INVALIDATIONS) does not exist.
2. Thus the ReorderBufferChange for the invalidation does not exist.
Our patch tries to distribute it but cannot be done as-is.
3. Codes assumed that invalidation messages can be added only once.
4. The timing when invalidation messages are consumed is limited:
a. COMMAND_ID change is poped,
b. start of decoding a transaction, or
c. end of decoding a transaction.
Above means that invalidations cannot be executed while being decoded.
I created two patch sets to resolve the data loss issue. 0001 has less code
changes but could resolve a part of issue, 0002 has huge changes but provides a
complete solution.
0001 - mostly same as patches for other versions. ReorderBufferAddInvalidations()
was adjusted to allow being called several times. As I said above,
0001 cannot execute inval messages while decoding the transacitons.
0002 - introduces new ReorderBufferChange type to indicate inval messages.
It would be handled like PG14+.
Here is an example. Assuming that the table foo exists on both nodes, a
publication "pub" which publishes all tables, and a subscription "sub" which
subscribes "pub". What if the workload is executed?
```
S1 S2
BEGIN;
INSERT INTO foo VALUES (1)
ALTER PUBLICATION pub RENAME TO pub_renamed;
INSERT INTO foo VALUES (2)
COMMIT;
LR -> ?
```
With 0001, tuples (1) and (2) would be replicated to the subscriber.
An error "publication "pub" does not exist" would raise when new changes are done
later.
0001+0002 works more aggressively; the error would raise when S1 transaction is decoded.
The behavior is same as for patched PG14-PG17.
Best regards,
Hayato Kuroda
FUJITSU LIMITED