Re: Hot Standby 0.2.1 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Hot Standby 0.2.1
Date
Msg-id 4ABF539F.8050305@enterprisedb.com
Whole thread Raw
In response to Hot Standby 0.2.1  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Hot Standby 0.2.1
List pgsql-hackers
The locking in smgr_redo_commit and smgr_redo_abort doesn't look right.
First of all, smgr_redo_abort is not holding XidGenLock and
ProcArrayLock while modifying ShmemVariableCache->nextXid and
ShmemVariableCache->latestCompletedXid, respectively, like
smgr_redo_commit is. Attached patch fixes that.

But I think there's a little race condition in there anyway, as they
remove the finished xids from known-assigned-xids list by calling
ExpireTreeKnownAssignedTransactionIds, and
ShmemVariableCache->latestCompletedXid is updated only after that. We're
not holding ProcArrayLock between those two steps, like we do during
normal transaction commit. I'm not sure what kind of issues that can
lead to, but it seems like it can lead to broken snapshots if someone
calls GetSnapshotData() between those steps.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
>From 9d6ef0a3eb901b1d2182b3f2efd5c75d52e88cba Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki@enterprisedb.com>
Date: Sun, 27 Sep 2009 14:51:43 +0300
Subject: [PATCH] Make locking in smgr_redo_abort reflect that in smgr_redo_commit.

---
 src/backend/access/transam/xact.c |   22 ++++++++++++++++------
 1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index a696857..92a7c43 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -4502,10 +4502,7 @@ xact_redo_commit(xl_xact_commit *xlrec, TransactionId xid,
         /*
          * Release locks, if any. We do this for both two phase and normal
          * one phase transactions. In effect we are ignoring the prepare
-         * phase and just going straight to lock release. This explains
-         * why the twophase_postcommit_standby_callbacks[] do not invoke
-         * a special routine to handle locks - that is performed here
-         * instead.
+         * phase and just going straight to lock release.
          */
         RelationReleaseRecoveryLockTree(xid, xlrec->nsubxacts, sub_xids);
     }
@@ -4593,12 +4590,25 @@ xact_redo_abort(xl_xact_abort *xlrec, TransactionId xid)
     }

     /* Make sure nextXid is beyond any XID mentioned in the record */
+    /* We don't expect anyone else to modify nextXid, hence we
+     * don't need to hold a lock while checking this. We still acquire
+     * the lock to modify it, though.
+     */
     if (TransactionIdFollowsOrEquals(max_xid,
                                      ShmemVariableCache->nextXid))
     {
+        LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
         ShmemVariableCache->nextXid = max_xid;
-        ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
-        TransactionIdAdvance(ShmemVariableCache->nextXid);
+        LWLockRelease(XidGenLock);
+    }
+
+    /* Same here, don't use lock to test, but need one to modify */
+    if (TransactionIdFollowsOrEquals(max_xid,
+                                     ShmemVariableCache->latestCompletedXid))
+    {
+        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+        ShmemVariableCache->latestCompletedXid = max_xid;
+        LWLockRelease(ProcArrayLock);
     }

     /* Make sure files supposed to be dropped are dropped */
--
1.6.3.3


pgsql-hackers by date:

Previous
From: Jim Cox
Date:
Subject: Re: [PATCH] 8.5 TODO: Add comments to output indicating version of pg_dump and of the database server
Next
From: Heikki Linnakangas
Date:
Subject: Re: Hot Standby 0.2.1