Thread: Bug report and fix about building historic snapshot

Bug report and fix about building historic snapshot

From
"cca5507"
Date:
Hello, I found a bug in building the historic snapshot. The steps to reproduce are as follows:

Prepare:
  1. (pub)create table t1 (id int primary key);

  2. (pub)insert into t1 values (1);

  3. (pub)create publication pub for table t1;

  4. (sub)create table t1 (id int primary key);

Reproduce:
  1. (pub)begin; insert into t1 values (2); (txn1 in session1)

  2. (sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state soon)

  3. (pub)begin; insert into t1 values (3); (txn2 in session2)

  4. (pub)create table t2 (id int primary key); (session3)

  5. (pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)

  6. (pub)begin; insert into t2 values (1); (txn3 in session3)

  7. (pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)

  8. (pub)commit; (commit txn3; replaying txn3 will fail because its snapshot cannot see t2)
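
The state switches noted in the steps above can be sketched with a toy model. This is only an illustration under stated assumptions, not PostgreSQL code: the `Builder` class, the `on_running_xacts` method, and the xid values (txn1 = 100, txn2 = 101, the `create table t2` transaction = 102, txn3 = 103) are all made up, and the real snapshot builder handles more cases (for example restoring a serialized snapshot). The assumed rule is that each phase ends once every transaction that was running when the phase began has finished.

```python
from dataclasses import dataclass

START = "START"
BUILDING_SNAPSHOT = "BUILDING_SNAPSHOT"
FULL_SNAPSHOT = "FULL_SNAPSHOT"
CONSISTENT = "CONSISTENT"


@dataclass
class Builder:
    """Hypothetical, simplified stand-in for the snapshot builder."""
    state: str = START
    next_phase_at: int = 0  # oldest running xid must reach this to advance

    def on_running_xacts(self, oldest_running: int, next_xid: int,
                         n_running: int) -> str:
        """Process one simplified running-transactions record."""
        if n_running == 0:
            # Nothing is running, so a consistent point is reached at once.
            self.state = CONSISTENT
        elif self.state == START:
            self.state = BUILDING_SNAPSHOT
            self.next_phase_at = next_xid
        elif (self.state == BUILDING_SNAPSHOT
              and oldest_running >= self.next_phase_at):
            self.state = FULL_SNAPSHOT
            self.next_phase_at = next_xid
        elif (self.state == FULL_SNAPSHOT
              and oldest_running >= self.next_phase_at):
            self.state = CONSISTENT
        return self.state


b = Builder()
states = [
    # Step 2: subscription created while txn1 (xid 100) is still running.
    b.on_running_xacts(oldest_running=100, next_xid=101, n_running=1),
    # After step 5: txn1 committed, txn2 (xid 101) is the oldest running.
    b.on_running_xacts(oldest_running=101, next_xid=103, n_running=1),
    # After step 7: txn2 committed, txn3 (xid 103) is the oldest running.
    b.on_running_xacts(oldest_running=103, next_xid=104, n_running=1),
]
print(states)
```

In this model the `create table t2` transaction (xid 102) both starts and commits while the builder is still in BUILDING_SNAPSHOT, which is exactly the window the report is about.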

Reasons:
We currently don't track transactions that begin after BUILDING_SNAPSHOT
and commit before FULL_SNAPSHOT when building a historic snapshot in logical
decoding. As a result, a transaction that begins after FULL_SNAPSHOT can take
an incorrect historic snapshot, because transactions committed in the
BUILDING_SNAPSHOT state are not processed by SnapBuildCommitTxn().

To fix it, we can forcibly track transactions that begin after BUILDING_SNAPSHOT
and commit before FULL_SNAPSHOT by processing them with SnapBuildCommitTxn().
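
The difference between the current behavior and the proposed one can be illustrated with a small toy model. It is a sketch under loud assumptions and not the patch itself: `ToyBuilder`, `sees`, `txn3_sees_t2`, and the `track_early_commits` flag are hypothetical names, the xids are invented (txn1 = 100, the `create table t2` transaction = 102), and visibility is reduced to "xid is below xmin or is in the set of known committed catalog-modifying xids".

```python
class ToyBuilder:
    """Hypothetical, heavily simplified model of a historic snapshot builder."""

    def __init__(self, xmin: int, track_early_commits: bool) -> None:
        self.state = "BUILDING_SNAPSHOT"
        self.xmin = xmin              # xids below this are treated as finished
        self.committed = set()        # known committed catalog-modifying xids
        self.track_early_commits = track_early_commits

    def on_commit(self, xid: int, modifies_catalog: bool) -> None:
        # Assumed current behavior: a commit seen before FULL_SNAPSHOT is
        # dropped. With the flag set, it is recorded anyway.
        if self.state == "BUILDING_SNAPSHOT" and not self.track_early_commits:
            return
        if modifies_catalog:
            self.committed.add(xid)

    def snapshot(self):
        """Return an immutable (xmin, committed xids) pair."""
        return (self.xmin, frozenset(self.committed))


def sees(snapshot, xid: int) -> bool:
    """Toy visibility rule: old enough, or known to have committed."""
    xmin, committed = snapshot
    return xid < xmin or xid in committed


def txn3_sees_t2(track_early_commits: bool) -> bool:
    b = ToyBuilder(xmin=101, track_early_commits=track_early_commits)
    b.on_commit(102, modifies_catalog=True)    # step 4: create table t2
    b.on_commit(100, modifies_catalog=False)   # step 5: txn1 commits
    b.state = "FULL_SNAPSHOT"                  # builder advances after step 5
    snap = b.snapshot()                        # step 6: txn3 takes its snapshot
    return sees(snap, 102)                     # can txn3 see t2's creator?


print(txn3_sees_t2(track_early_commits=False))  # False: t2 is invisible
print(txn3_sees_t2(track_early_commits=True))   # True: t2 is visible
```

The model records only catalog-modifying commits, since those are what decide whether a later transaction's snapshot can see t2.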

--
Regards,
ChangAo Chen
Attachment

Re: Bug report and fix about building historic snapshot

From
"cca5507"
Date:
This patch may be better: it only tracks catalog-modifying transactions.

--
Regards,
ChangAo Chen
------------------ Original ------------------
From: "cca5507" <cca5507@qq.com>;
Date: Sun, Jan 21, 2024 05:25 PM
To: "pgsql-hackers"<pgsql-hackers@lists.postgresql.org>;
Subject: Bug report and fix about building historic snapshot

Attachment

Re: Bug report and fix about building historic snapshot

From
"cca5507"
Date:
> This patch may be better: it only tracks catalog-modifying transactions.
Can anyone help review this patch?
Thanks.
--
Regards,
ChangAo Chen