Thread: Why chain of snapshots is used in ReorderBufferCommit?

Why chain of snapshots is used in ReorderBufferCommit?

From
Arseny Sher
Date:
Hello,

While prowling through snapbuild & reorderbuffer code, I wondered: why a queue
of snapshots is used for replaying each transaction instead of just picking up
snapshot from snapbuilder once when COMMIT record is read? I am not aware of any
DDL/DML mix which would make this invalid: e.g. we can't insert something into
table in xact T1, then alter a column in xact T2, and then insert something more
in T1. All ALTER TABLE forms which are currently interesting from the decoding
POV obtain ACCESS EXCLUSIVE lock, conflicting with any other manipulations on the
table:
https://www.postgresql.org/docs/devel/static/sql-altertable.html

To confirm this, I tried to use single snapshot (see attached patch) and make
check of test_decoding and src/test/recovery passed all the tests.

The only reason I can think of is the desire to allow replay of in-progress
transactions some day. However, presently we are far from there. Moreover, if we
pretend that the rule is to replay xact with exactly the same catalog snapshot
which was used during its execution (not including any later changes), then we
violate it anyway due to "choose the latest base_snapshot of all subxacts's as a
base_snapshot during replay" logic, which is somewhat weird. If xact doesn't
have base snapshot itself, but has two subxacts both having one, currently we
will use latest snapshot to start decoding of subxact which actually had earlier
snapshot. On the whole, what is the value of 'base_snapshot' concept, why first
snapshots aren't logged to the change queue as any subsequent ones?

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 2a02c117e538f5085e82f211b41c3b14c37ce3ff Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-ars@yandex.ru>
Date: Tue, 27 Feb 2018 19:49:28 +0300
Subject: [PATCH] Use single snapshot for replaying transaction.

---
 src/backend/replication/logical/reorderbuffer.c | 39 +++++---------------
 src/backend/replication/logical/snapbuild.c     | 49 ++++++++++---------------
 2 files changed, 28 insertions(+), 60 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c72a611a39..04b8068ddd 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1479,36 +1479,7 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid,
                     break;
 
                 case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
-                    /* get rid of the old */
-                    TeardownHistoricSnapshot(false);
 
-                    if (snapshot_now->copied)
-                    {
-                        ReorderBufferFreeSnap(rb, snapshot_now);
-                        snapshot_now =
-                            ReorderBufferCopySnap(rb, change->data.snapshot,
-                                                  txn, command_id);
-                    }
-
-                    /*
-                     * Restored from disk, need to be careful not to double
-                     * free. We could introduce refcounting for that, but for
-                     * now this seems infrequent enough not to care.
-                     */
-                    else if (change->data.snapshot->copied)
-                    {
-                        snapshot_now =
-                            ReorderBufferCopySnap(rb, change->data.snapshot,
-                                                  txn, command_id);
-                    }
-                    else
-                    {
-                        snapshot_now = change->data.snapshot;
-                    }
-
-
-                    /* and continue with the new one */
-                    SetupHistoricSnapshot(snapshot_now, txn->tuplecid_hash);
                     break;
 
                 case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
@@ -1831,7 +1802,15 @@ ReorderBufferSetBaseSnapshot(ReorderBuffer *rb, TransactionId xid,
     ReorderBufferTXN *txn;
     bool        is_new;
 
-    txn = ReorderBufferTXNByXid(rb, xid, true, &is_new, lsn, true);
+    txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr, false);
+    /*
+     * If xact still doesn't exist during the commit, it is empty and hadn't
+     * had any subxacts (if it had ones, XLOG_XACT_ASSIGNMENT would be logged
+     * and it would present in ReorderBuffer anyway).
+     */
+    if (txn == NULL)
+        return;
+
     Assert(txn->base_snapshot == NULL);
     Assert(snap != NULL);
 
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 4123cdebcf..2b258ff7b9 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -734,29 +734,6 @@ SnapBuildProcessChange(SnapBuild *builder, TransactionId xid, XLogRecPtr lsn)
         TransactionIdPrecedes(xid, SnapBuildNextPhaseAt(builder)))
         return false;
 
-    /*
-     * If the reorderbuffer doesn't yet have a snapshot, add one now, it will
-     * be needed to decode the change we're currently processing.
-     */
-    if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, xid))
-    {
-        /* only build a new snapshot if we don't have a prebuilt one */
-        if (builder->snapshot == NULL)
-        {
-            builder->snapshot = SnapBuildBuildSnapshot(builder);
-            /* increase refcount for the snapshot builder */
-            SnapBuildSnapIncRefcount(builder->snapshot);
-        }
-
-        /*
-         * Increase refcount for the transaction we're handing the snapshot
-         * out to.
-         */
-        SnapBuildSnapIncRefcount(builder->snapshot);
-        ReorderBufferSetBaseSnapshot(builder->reorder, xid, lsn,
-                                     builder->snapshot);
-    }
-
     return true;
 }
 
@@ -965,6 +942,25 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
         }
     }
 
+    /*
+     * Snapbuilder's snapshot at the moment of COMMIT is the snapshot we will
+     * use to decode the xact.
+     */
+    if (builder->state >= SNAPBUILD_FULL_SNAPSHOT)
+    {
+        /* only build a new snapshot if we don't have a prebuilt one */
+        if (builder->snapshot == NULL)
+        {
+            builder->snapshot = SnapBuildBuildSnapshot(builder);
+            /* increase refcount for the snapshot builder */
+            SnapBuildSnapIncRefcount(builder->snapshot);
+        }
+
+        SnapBuildSnapIncRefcount(builder->snapshot);
+        ReorderBufferSetBaseSnapshot(builder->reorder, xid, lsn,
+                                     builder->snapshot);
+    }
+
     for (nxact = 0; nxact < nsubxacts; nxact++)
     {
         TransactionId subxid = subxacts[nxact];
@@ -1063,13 +1059,6 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
 
         builder->snapshot = SnapBuildBuildSnapshot(builder);
 
-        /* we might need to execute invalidations, add snapshot */
-        if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, xid))
-        {
-            SnapBuildSnapIncRefcount(builder->snapshot);
-            ReorderBufferSetBaseSnapshot(builder->reorder, xid, lsn,
-                                         builder->snapshot);
-        }
 
         /* refcount of the snapshot builder for the new snapshot */
         SnapBuildSnapIncRefcount(builder->snapshot);
-- 
2.11.0


Re: Why chain of snapshots is used in ReorderBufferCommit?

From
Andres Freund
Date:
Hi,

On 2018-03-01 08:17:33 +0300, Arseny Sher wrote:
> While prowling through snapbuild & reorderbuffer code, I wondered: why a queue
> of snapshots is used for replaying each transaction instead of just picking up
> snapshot from snapbuilder once when COMMIT record is read? I am not aware of any
> DDL/DML mix which would make this invalid: e.g. we can't insert something into
> table in xact T1, then alter a column in xact T2, and then insert something more
> in T1. All ALTER TABLE forms which are currently interesting from the decoding
> POV obtain ACCESS EXCLUSIVE lock, conflicting with any other manipulations on the
> table:
> https://www.postgresql.org/docs/devel/static/sql-altertable.html

I don't think that's right. For one there's plenty types of DDL where no
such locking exists, consider e.g. composite datums where additional
columns can be added even after such a type is already used by another
table. For another, T1 might need to see a DDL change to a table that
has been made by T2 after T1 started, which is not prohibited by locking
if T1 uses the table first after T2 commits.

- Andres


Re: Why chain of snapshots is used in ReorderBufferCommit?

From
Arseny Sher
Date:
Andres Freund <andres@anarazel.de> writes:

> I don't think that's right. For one there's plenty types of DDL where no
> such locking exists, consider e.g. composite datums where additional
> columns can be added even after such a type is already used by another
> table.

Oh, indeed. Never mind the last paragraph.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company