One independent change included in the Hot Standby patch is the change
to the way subtransaction cache works. With the patch, only
subtransactions that don't fit in the subxid cache in PGPROC are marked
in pg_subtrans. To make that work, XidInMVCCSnapshot() always scans the
subxid array in the snapshot, while currently it's only used if none of
the subxid caches have overflowed. Attached is a patch for that,
extracted from the latest hot standby patch.
So far so good, but what about all the other callers of
SubTransGetParent()? For example, XactLockTableWait will fail an
assertion if asked to wait on a subtransaction which is then released.
It occurs to me that we don't need this patch for hot standby if we
abuse the main xid array (SnapshotData.xip) to store the unobserved xids
instead of the subxid array. That one is always scanned in
XidInMVCCSnapshot. I think we should do that rather than try to salvage
this patch.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
*** a/src/backend/access/transam/README
--- b/src/backend/access/transam/README
***************
*** 198,204 **** parent. This maintains the invariant that child transactions have XIDs later
than their parents, which is assumed in a number of places.
The subsidiary actions of obtaining a lock on the XID and and entering it into
! pg_subtrans and PG_PROC are done at the time it is assigned.
A transaction that has no XID still needs to be identified for various
purposes, notably holding locks. For this purpose we assign a "virtual
--- 198,204 ----
than their parents, which is assumed in a number of places.
The subsidiary actions of obtaining a lock on the XID and and entering it into
! PG_PROC and, in some cases, pg_subtrans are done at the time it is assigned.
A transaction that has no XID still needs to be identified for various
purposes, notably holding locks. For this purpose we assign a "virtual
***************
*** 376,382 **** but since we allow arbitrary nesting of subtransactions, we can't fit all Xids
in shared memory, so we have to store them on disk. Note, however, that for
each transaction we keep a "cache" of Xids that are known to be part of the
transaction tree, so we can skip looking at pg_subtrans unless we know the
! cache has been overflowed. See storage/ipc/procarray.c for the gory details.
slru.c is the supporting mechanism for both pg_clog and pg_subtrans. It
implements the LRU policy for in-memory buffer pages. The high-level routines
--- 376,384 ----
in shared memory, so we have to store them on disk. Note, however, that for
each transaction we keep a "cache" of Xids that are known to be part of the
transaction tree, so we can skip looking at pg_subtrans unless we know the
! cache has been overflowed. In 8.4 we skip updating pg_subtrans unless the
! cache has overflowed for that transaction, considerably reducing pg_subtrans
! activity. See storage/ipc/procarray.c for the gory details.
slru.c is the supporting mechanism for both pg_clog and pg_subtrans. It
implements the LRU policy for in-memory buffer pages. The high-level routines
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
***************
*** 415,421 **** AssignTransactionId(TransactionState s)
*/
s->transactionId = GetNewTransactionId(isSubXact);
! if (isSubXact)
SubTransSetParent(s->transactionId, s->parent->transactionId);
/*
--- 415,428 ----
*/
s->transactionId = GetNewTransactionId(isSubXact);
! /*
! * If we have overflowed the subxid cache then we must mark subtrans
! * with the parent xid. Prior to 8.4 we marked subtrans for each
! * subtransaction, though that is no longer necessary because the
! * way snapshots are searched in XidInMVCCSnapshot() has changed to
! * allow searching of both subxid cache and subtrans, not either/or.
! */
! if (isSubXact && MyProc->subxids.overflowed)
SubTransSetParent(s->transactionId, s->parent->transactionId);
/*
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***************
*** 680,685 **** GetSnapshotData(Snapshot snapshot)
--- 680,686 ----
int index;
int count = 0;
int subcount = 0;
+ bool suboverflowed = false;
Assert(snapshot != NULL);
***************
*** 771,781 **** GetSnapshotData(Snapshot snapshot)
}
/*
! * Save subtransaction XIDs if possible (if we've already overflowed,
! * there's no point). Note that the subxact XIDs must be later than
! * their parent, so no need to check them against xmin. We could
! * filter against xmax, but it seems better not to do that much work
! * while holding the ProcArrayLock.
*
* The other backend can add more subxids concurrently, but cannot
* remove any. Hence it's important to fetch nxids just once. Should
--- 772,782 ----
}
/*
! * Save subtransaction XIDs, whether or not we have overflowed.
! * Note that the subxact XIDs must be later than their parent, so no
! * need to check them against xmin. We could filter against xmax,
! * but it seems better not to do that much work while holding the
! * ProcArrayLock.
*
* The other backend can add more subxids concurrently, but cannot
* remove any. Hence it's important to fetch nxids just once. Should
***************
*** 784,807 **** GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
! if (subcount >= 0 && proc != MyProc)
! {
! if (proc->subxids.overflowed)
! subcount = -1; /* overflowed */
! else
{
int nxids = proc->subxids.nxids;
if (nxids > 0)
{
memcpy(snapshot->subxip + subcount,
(void *) proc->subxids.xids,
nxids * sizeof(TransactionId));
subcount += nxids;
}
}
}
- }
if (!TransactionIdIsValid(MyProc->xmin))
MyProc->xmin = TransactionXmin = xmin;
--- 785,807 ----
*
* Again, our own XIDs are not included in the snapshot.
*/
! if (proc != MyProc)
{
int nxids = proc->subxids.nxids;
if (nxids > 0)
{
+ if (proc->subxids.overflowed)
+ suboverflowed = true;
+
memcpy(snapshot->subxip + subcount,
(void *) proc->subxids.xids,
nxids * sizeof(TransactionId));
subcount += nxids;
}
+
}
}
if (!TransactionIdIsValid(MyProc->xmin))
MyProc->xmin = TransactionXmin = xmin;
***************
*** 824,829 **** GetSnapshotData(Snapshot snapshot)
--- 824,830 ----
snapshot->xmax = xmax;
snapshot->xcnt = count;
snapshot->subxcnt = subcount;
+ snapshot->suboverflowed = suboverflowed;
snapshot->curcid = GetCurrentCommandId(false);
*** a/src/backend/utils/time/tqual.c
--- b/src/backend/utils/time/tqual.c
***************
*** 1238,1263 **** XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
return true;
/*
! * If the snapshot contains full subxact data, the fastest way to check
! * things is just to compare the given XID against both subxact XIDs and
! * top-level XIDs. If the snapshot overflowed, we have to use pg_subtrans
! * to convert a subxact XID to its parent XID, but then we need only look
! * at top-level XIDs not subxacts.
*/
- if (snapshot->subxcnt >= 0)
- {
- /* full data, so search subxip */
- int32 j;
-
- for (j = 0; j < snapshot->subxcnt; j++)
- {
- if (TransactionIdEquals(xid, snapshot->subxip[j]))
- return true;
- }
! /* not there, fall through to search xip[] */
}
! else
{
/* overflowed, so convert xid to top-level */
xid = SubTransGetTopmostTransaction(xid);
--- 1238,1283 ----
return true;
/*
! * Our strategy for checking xids changed in 8.4. Prior to 8.4
! * we either checked the subxid cache on the snapshot or we
! * checked subtrans. That was much more efficient than just using
! * subtrans but it has some problems. First, as soon as *any*
! * transaction had more than 64 transactions we forced *all*
! * snapshots to check against subtrans, giving a sharp modal
! * change in behaviour. Second because we either checked subtrans
! * or the snapshot, we were forced to place entries in subtrans
! * in case the snapshot later overflowed, even if we never
! * actually checked subtrans.
! *
! * In 8.4 we improve on that scheme in a number of ways. As before
! * we check subtrans if the snapshot has overflowed. We *also*
! * check the subxid cache. This has two benefits: first the
! * behaviour degrades gracefully when the cache overflows, so we
! * retain much of its benefit if it has only just overflowed.
! * Second, a transaction doesn't need to insert entries into
! * subtrans until its own personal subxid cache overflows. This
! * means entries into subtrans become significantly rarer,
! * perhaps less than 1% of the previous insert rate, giving
! * considerable benefit for transactions using only a few
! * subtransactions.
*/
! /*
! * First, compare the given XID against cached subxact XIDs.
! */
! for (i = 0; i < snapshot->subxcnt; i++)
! {
! if (TransactionIdEquals(xid, snapshot->subxip[i]))
! return true;
}
!
! /*
! * If the snapshot overflowed and we haven't already located the xid
! * we also have to consult pg_subtrans. We use subtrans to convert a
! * subxact XID to its parent XID, so that we can then check the status
! * of the top-level TransactionId.
! */
! if (snapshot->suboverflowed)
{
/* overflowed, so convert xid to top-level */
xid = SubTransGetTopmostTransaction(xid);
***************
*** 1270,1275 **** XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
--- 1290,1299 ----
return false;
}
+ /*
+ * By now xid is either not present, or a top-level xid. So now
+ * we just need to check the main transaction ids.
+ */
for (i = 0; i < snapshot->xcnt; i++)
{
if (TransactionIdEquals(xid, snapshot->xip[i]))
*** a/src/include/utils/snapshot.h
--- b/src/include/utils/snapshot.h
***************
*** 49,55 **** typedef struct SnapshotData
uint32 xcnt; /* # of xact ids in xip[] */
TransactionId *xip; /* array of xact IDs in progress */
/* note: all ids in xip[] satisfy xmin <= xip[i] < xmax */
! int32 subxcnt; /* # of xact ids in subxip[], -1 if overflow */
TransactionId *subxip; /* array of subxact IDs in progress */
/*
--- 49,65 ----
uint32 xcnt; /* # of xact ids in xip[] */
TransactionId *xip; /* array of xact IDs in progress */
/* note: all ids in xip[] satisfy xmin <= xip[i] < xmax */
!
! /*
! * Prior to 8.4 we represented an overflowed subxid cache with subxcnt -1.
! * In 8.4+ we separate the two concepts because when checking the xids
! * in the snapshot we check *both* subxid cache and subtrans, if subxid
! * cache has overflowed. So we still need the count, even if overflowed.
! * We do this to allow unobserved xids to be placed into the snapshot
! * even when snapshot overflows. It is also a performance gain.
! */
! uint32 subxcnt; /* # of xact ids in subxip[] */
! bool suboverflowed; /* true means at least one subxid cache overflowed */
TransactionId *subxip; /* array of subxact IDs in progress */
/*