Re: Improvement of procArray.xmin for VACUUM - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: Improvement of procArray.xmin for VACUUM |
Date | |
Msg-id | 200703241942.l2OJgZC18482@momjian.us Whole thread Raw |
In response to | Re: Improvement of procArray.xmin for VACUUM (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-patches |
Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > > It seems to me a lot cleaner to do the reference counting like Tom > > suggested. Increase the refcount on CopySnapshot, and decrease it on > > FreeSnapshot. Assuming that all callers of CopySnapshot free the > > snapshot with FreeSnapshot when they're done with it. > > I don't believe we bother at the moment; which is one of the reasons > it'd be a nontrivial patch. I do think it might be worth doing though. > In the simple case where you're just issuing successive non-cursor > commands within a READ COMMITTED transaction, a refcounted > implementation would be able to recognize that there are *no* live > snapshots between commands and therefore reset MyProc->xmin to 0 > whenever the backend is idle. Attached is my current version of the patch. It doesn't work now that I tried to do reference count for Snapshots, but will stop now that Tom is considering redesigning the snapshot mechanism. > OTOH, do we have any evidence that this is worth bothering with at all? > I fear that the cases of long-running transactions that are problems > in the real world wouldn't be helped much --- for instance, pg_dump > wouldn't change behavior because it uses a serializable transaction. > Also, at some point a long-running transaction becomes a bottleneck > simply because its XID is itself the oldest thing visible in the > ProcArray and is determining everyone's xmin. How much daylight is > there really between "your xmin is old" and "your xid is old"? Well, interesting you mention that, because I have a second idea on how to improve things. We start with MyProc->xmin equal to our own xid, and then look for earlier transactions. It should be possible to skip considering our own xid for MyProc->xmin. This would obviously help VACUUM during long-running transactions. While our transaction is running, our xid isn't committed, so VACUUM isn't going to touch any of our rows, and if other transactions complete before our multi-transaction _statement_ starts, we can't see deleted rows from them transaction, so why keep the deleted rows around? Consider this case: Session #: 1 2 3 BEGIN; SELECT 1; CREATE TABLE test(x int); INSERT INTO test VALUES (1); DELETE FROM test; SELECT 1; VACUUM VERBOSE test; (row can be reused) COMMIT; VACUUM VERBOSE test; (normal row reuse) As I understand it, in READ COMMITTED mode, we have to skip transactions in progress when our _statement_ starts, but anything committed before that we see and we don't see dead rows created by them. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/catalog/index.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v retrieving revision 1.280 diff -c -c -r1.280 index.c *** src/backend/catalog/index.c 3 Mar 2007 20:08:41 -0000 1.280 --- src/backend/catalog/index.c 24 Mar 2007 19:17:56 -0000 *************** *** 1358,1364 **** ExprContext *econtext; Snapshot snapshot; TransactionId OldestXmin; ! /* * sanity checks */ --- 1358,1364 ---- ExprContext *econtext; Snapshot snapshot; TransactionId OldestXmin; ! /* * sanity checks */ *************** *** 1555,1561 **** ExecDropSingleTupleTableSlot(slot); FreeExecutorState(estate); ! /* These may have been pointing to the now-gone estate */ indexInfo->ii_ExpressionsState = NIL; indexInfo->ii_PredicateState = NIL; --- 1555,1563 ---- ExecDropSingleTupleTableSlot(slot); FreeExecutorState(estate); ! if (snapshot != SnapshotNow) ! FreeSnapshot(snapshot); ! /* These may have been pointing to the now-gone estate */ indexInfo->ii_ExpressionsState = NIL; indexInfo->ii_PredicateState = NIL; Index: src/backend/commands/cluster.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/commands/cluster.c,v retrieving revision 1.157 diff -c -c -r1.157 cluster.c *** src/backend/commands/cluster.c 13 Mar 2007 00:33:39 -0000 1.157 --- src/backend/commands/cluster.c 24 Mar 2007 19:17:56 -0000 *************** *** 204,209 **** --- 204,210 ---- /* Start a new transaction for each relation. */ StartTransactionCommand(); /* functions in indexes may want a snapshot set */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); cluster_rel(rvtc, true); CommitTransactionCommand(); Index: src/backend/commands/indexcmds.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/commands/indexcmds.c,v retrieving revision 1.157 diff -c -c -r1.157 indexcmds.c *** src/backend/commands/indexcmds.c 13 Mar 2007 00:33:39 -0000 1.157 --- src/backend/commands/indexcmds.c 24 Mar 2007 19:17:56 -0000 *************** *** 493,500 **** * We also set ActiveSnapshot to this snap, since functions in indexes may * need a snapshot. */ ! snapshot = CopySnapshot(GetTransactionSnapshot()); ! ActiveSnapshot = snapshot; /* * Scan the index and the heap, insert any missing index entries. --- 493,500 ---- * We also set ActiveSnapshot to this snap, since functions in indexes may * need a snapshot. */ ! FreeSnapshot(ActiveSnapshot); ! ActiveSnapshot = snapshot = CopySnapshot(GetTransactionSnapshot()); /* * Scan the index and the heap, insert any missing index entries. *************** *** 1304,1309 **** --- 1304,1310 ---- StartTransactionCommand(); /* functions in indexes may want a snapshot set */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); if (reindex_relation(relid, true)) ereport(NOTICE, Index: src/backend/commands/trigger.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/commands/trigger.c,v retrieving revision 1.214 diff -c -c -r1.214 trigger.c *** src/backend/commands/trigger.c 19 Mar 2007 23:38:29 -0000 1.214 --- src/backend/commands/trigger.c 24 Mar 2007 19:17:57 -0000 *************** *** 2645,2651 **** --- 2645,2654 ---- */ events = &afterTriggers->events; if (events->head != NULL) + { + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); + } /* * Run all the remaining triggers. Loop until they are all gone, just in Index: src/backend/commands/vacuum.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/commands/vacuum.c,v retrieving revision 1.349 diff -c -c -r1.349 vacuum.c *** src/backend/commands/vacuum.c 14 Mar 2007 18:48:55 -0000 1.349 --- src/backend/commands/vacuum.c 24 Mar 2007 19:17:59 -0000 *************** *** 412,417 **** --- 412,418 ---- { StartTransactionCommand(); /* functions in indexes may want a snapshot set */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); } else *************** *** 470,475 **** --- 471,477 ---- * necessary if we are called from autovacuum because autovacuum might * continue on to do an ANALYZE-only call. */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); } *************** *** 964,969 **** --- 966,972 ---- if (vacstmt->full) { /* functions in indexes may want a snapshot set */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); } else Index: src/backend/executor/functions.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/executor/functions.c,v retrieving revision 1.112 diff -c -c -r1.112 functions.c *** src/backend/executor/functions.c 13 Mar 2007 00:33:40 -0000 1.112 --- src/backend/executor/functions.c 24 Mar 2007 19:17:59 -0000 *************** *** 315,321 **** fcache->paramLI); /* We assume we don't need to set up ActiveSnapshot for ExecutorStart */ ! /* Utility commands don't need Executor. */ if (es->qd->operation != CMD_UTILITY) { --- 315,321 ---- fcache->paramLI); /* We assume we don't need to set up ActiveSnapshot for ExecutorStart */ ! /* Utility commands don't need Executor. */ if (es->qd->operation != CMD_UTILITY) { Index: src/backend/postmaster/autovacuum.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.38 diff -c -c -r1.38 autovacuum.c *** src/backend/postmaster/autovacuum.c 23 Mar 2007 21:57:10 -0000 1.38 --- src/backend/postmaster/autovacuum.c 24 Mar 2007 19:18:00 -0000 *************** *** 884,889 **** --- 884,890 ---- StartTransactionCommand(); /* functions in indexes may want a snapshot set */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); /* Index: src/backend/storage/ipc/procarray.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/storage/ipc/procarray.c,v retrieving revision 1.22 diff -c -c -r1.22 procarray.c *** src/backend/storage/ipc/procarray.c 23 Mar 2007 03:16:39 -0000 1.22 --- src/backend/storage/ipc/procarray.c 24 Mar 2007 19:18:01 -0000 *************** *** 498,504 **** *---------- */ Snapshot ! GetSnapshotData(Snapshot snapshot, bool serializable) { ProcArrayStruct *arrayP = procArray; TransactionId xmin; --- 498,504 ---- *---------- */ Snapshot ! GetSnapshotData(Snapshot snapshot) { ProcArrayStruct *arrayP = procArray; TransactionId xmin; *************** *** 510,519 **** Assert(snapshot != NULL); ! /* Serializable snapshot must be computed before any other... */ ! Assert(serializable ? ! !TransactionIdIsValid(MyProc->xmin) : ! TransactionIdIsValid(MyProc->xmin)); /* * Allocating space for maxProcs xids is usually overkill; numProcs would --- 510,517 ---- Assert(snapshot != NULL); ! /* Either we set an xmin, or we already have one. */ ! Assert(TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0); /* * Allocating space for maxProcs xids is usually overkill; numProcs would *************** *** 656,664 **** } } ! if (serializable) ! MyProc->xmin = TransactionXmin = xmin; LWLockRelease(ProcArrayLock); /* --- 654,681 ---- } } ! /* DEBUG */ ! if (!TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0) ! fprintf(stderr, "%x:%x: set new xmin old=%x new=%x\n", MyProcPid, MyProc->xid, MyProc->xmin, xmin); ! else ! fprintf(stderr, "%x:%x: count=%d, not set\n", MyProcPid, MyProc->xid, GetOpenSnapshotCount()); + /* + * --- + * We need to set MyProc->xmin in a few cases: + * o set at transaction start for SERIALIZABLE visibility + * o set for new command in READ COMMITTED mode so + * VACUUM knows we don't care about completed transactions + * + * The problem is that we might have old cursors/portals + * that need to have older transaction visibility, so we + * check to see we have no open Snapshots before making + * the change. + * --- + */ + if (!TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0) + MyProc->xmin = TransactionXmin = xmin; + LWLockRelease(ProcArrayLock); /* Index: src/backend/tcop/fastpath.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/tcop/fastpath.c,v retrieving revision 1.96 diff -c -c -r1.96 fastpath.c *** src/backend/tcop/fastpath.c 3 Mar 2007 19:32:54 -0000 1.96 --- src/backend/tcop/fastpath.c 24 Mar 2007 19:18:01 -0000 *************** *** 308,313 **** --- 308,314 ---- * Now that we know we are in a valid transaction, set snapshot in case * needed by function itself or one of the datatype I/O routines. */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); /* Index: src/backend/tcop/postgres.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v retrieving revision 1.529 diff -c -c -r1.529 postgres.c *** src/backend/tcop/postgres.c 22 Mar 2007 19:55:04 -0000 1.529 --- src/backend/tcop/postgres.c 24 Mar 2007 19:18:02 -0000 *************** *** 738,743 **** --- 738,744 ---- { if (needSnapshot) { + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); needSnapshot = false; } Index: src/backend/tcop/pquery.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/tcop/pquery.c,v retrieving revision 1.115 diff -c -c -r1.115 pquery.c *** src/backend/tcop/pquery.c 13 Mar 2007 00:33:42 -0000 1.115 --- src/backend/tcop/pquery.c 24 Mar 2007 19:18:02 -0000 *************** *** 154,159 **** --- 154,160 ---- * Must always set snapshot for plannable queries. Note we assume that * caller will take care of restoring ActiveSnapshot on exit/error. */ + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); /* *************** *** 486,491 **** --- 487,493 ---- * Must set snapshot before starting executor. Be sure to * copy it into the portal's context. */ + FreeSnapshot(ActiveSnapshot); if (snapshot) ActiveSnapshot = CopySnapshot(snapshot); else *************** *** 1163,1169 **** --- 1165,1174 ---- IsA(utilityStmt, NotifyStmt) || IsA(utilityStmt, UnlistenStmt) || IsA(utilityStmt, CheckPointStmt))) + { + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); + } else ActiveSnapshot = NULL; *************** *** 1177,1184 **** /* Some utility statements may change context on us */ MemoryContextSwitchTo(PortalGetHeapMemory(portal)); ! if (ActiveSnapshot) ! FreeSnapshot(ActiveSnapshot); ActiveSnapshot = NULL; } --- 1182,1188 ---- /* Some utility statements may change context on us */ MemoryContextSwitchTo(PortalGetHeapMemory(portal)); ! FreeSnapshot(ActiveSnapshot); ActiveSnapshot = NULL; } Index: src/backend/utils/adt/ri_triggers.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/ri_triggers.c,v retrieving revision 1.92 diff -c -c -r1.92 ri_triggers.c *** src/backend/utils/adt/ri_triggers.c 15 Mar 2007 23:12:06 -0000 1.92 --- src/backend/utils/adt/ri_triggers.c 24 Mar 2007 19:18:04 -0000 *************** *** 2636,2642 **** char workmembuf[32]; int spi_result; SPIPlanPtr qplan; ! /* * Check to make sure current user has enough permissions to do the test * query. (If not, caller can fall back to the trigger method, which --- 2636,2643 ---- char workmembuf[32]; int spi_result; SPIPlanPtr qplan; ! Snapshot snapshot; ! /* * Check to make sure current user has enough permissions to do the test * query. (If not, caller can fall back to the trigger method, which *************** *** 2770,2780 **** * really haven't got much choice.) We need at most one tuple returned, * so pass limit = 1. */ spi_result = SPI_execute_snapshot(qplan, NULL, NULL, ! CopySnapshot(GetLatestSnapshot()), InvalidSnapshot, true, 1); /* Check result */ if (spi_result != SPI_OK_SELECT) --- 2771,2783 ---- * really haven't got much choice.) We need at most one tuple returned, * so pass limit = 1. */ + snapshot = CopySnapshot(GetLatestSnapshot()); spi_result = SPI_execute_snapshot(qplan, NULL, NULL, ! snapshot, InvalidSnapshot, true, 1); + FreeSnapshot(snapshot); /* Check result */ if (spi_result != SPI_OK_SELECT) *************** *** 3310,3315 **** --- 3313,3322 ---- test_snapshot, crosscheck_snapshot, false, limit); + if (test_snapshot != InvalidSnapshot) + FreeSnapshot(test_snapshot); + if (crosscheck_snapshot != InvalidSnapshot) + FreeSnapshot(crosscheck_snapshot); /* Restore UID */ SetUserId(save_uid); Index: src/backend/utils/time/tqual.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/time/tqual.c,v retrieving revision 1.101 diff -c -c -r1.101 tqual.c *** src/backend/utils/time/tqual.c 5 Jan 2007 22:19:47 -0000 1.101 --- src/backend/utils/time/tqual.c 24 Mar 2007 19:18:04 -0000 *************** *** 54,59 **** --- 54,67 ---- static SnapshotData SnapshotDirtyData; static SnapshotData SerializableSnapshotData; static SnapshotData LatestSnapshotData; + /* + * We use OpenSnapshotCount to record if there are any open snapshots. If + * there are not, we can update MyProc->xmin. A more sophisticated solution + * would be to keep track of the oldest open snapshot and prevent + * MyProc->xmin from going below that value, but the complexity doesn't + * seem worth it. + */ + static int OpenSnapshotCount = 0; /* Externally visible pointers to valid snapshots: */ Snapshot SnapshotDirty = &SnapshotDirtyData; *************** *** 1207,1220 **** /* First call in transaction? */ if (SerializableSnapshot == NULL) { ! SerializableSnapshot = GetSnapshotData(&SerializableSnapshotData, true); return SerializableSnapshot; } if (IsXactIsoLevelSerializable) return SerializableSnapshot; ! LatestSnapshot = GetSnapshotData(&LatestSnapshotData, false); return LatestSnapshot; } --- 1215,1228 ---- /* First call in transaction? */ if (SerializableSnapshot == NULL) { ! SerializableSnapshot = GetSnapshotData(&SerializableSnapshotData); return SerializableSnapshot; } if (IsXactIsoLevelSerializable) return SerializableSnapshot; ! LatestSnapshot = GetSnapshotData(&LatestSnapshotData); return LatestSnapshot; } *************** *** 1231,1237 **** if (SerializableSnapshot == NULL) elog(ERROR, "no snapshot has been set"); ! LatestSnapshot = GetSnapshotData(&LatestSnapshotData, false); return LatestSnapshot; } --- 1239,1245 ---- if (SerializableSnapshot == NULL) elog(ERROR, "no snapshot has been set"); ! LatestSnapshot = GetSnapshotData(&LatestSnapshotData); return LatestSnapshot; } *************** *** 1251,1256 **** --- 1259,1266 ---- Size subxipoff; Size size; + OpenSnapshotCount++; + /* We allocate any XID arrays needed in the same palloc block. */ size = subxipoff = sizeof(SnapshotData) + snapshot->xcnt * sizeof(TransactionId); *************** *** 1295,1301 **** void FreeSnapshot(Snapshot snapshot) { ! pfree(snapshot); } /* --- 1305,1320 ---- void FreeSnapshot(Snapshot snapshot) { ! if (snapshot != NULL) ! { ! OpenSnapshotCount--; ! pfree(snapshot); ! } ! } ! ! int GetOpenSnapshotCount(void) ! { ! return OpenSnapshotCount; } /* *************** *** 1312,1318 **** --- 1331,1339 ---- */ SerializableSnapshot = NULL; LatestSnapshot = NULL; + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = NULL; /* just for cleanliness */ + Assert(GetOpenSnapshotCount() == 0); } /* Index: src/include/utils/tqual.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/tqual.h,v retrieving revision 1.65 diff -c -c -r1.65 tqual.h *** src/include/utils/tqual.h 5 Jan 2007 22:19:59 -0000 1.65 --- src/include/utils/tqual.h 24 Mar 2007 19:18:05 -0000 *************** *** 147,154 **** extern Snapshot CopySnapshot(Snapshot snapshot); extern void FreeSnapshot(Snapshot snapshot); extern void FreeXactSnapshot(void); /* in procarray.c; declared here to avoid including tqual.h in procarray.h: */ ! extern Snapshot GetSnapshotData(Snapshot snapshot, bool serializable); #endif /* TQUAL_H */ --- 147,155 ---- extern Snapshot CopySnapshot(Snapshot snapshot); extern void FreeSnapshot(Snapshot snapshot); extern void FreeXactSnapshot(void); + extern int GetOpenSnapshotCount(void); /* in procarray.c; declared here to avoid including tqual.h in procarray.h: */ ! extern Snapshot GetSnapshotData(Snapshot snapshot); #endif /* TQUAL_H */ Index: src/pl/plpgsql/src/pl_exec.c =================================================================== RCS file: /cvsroot/pgsql/src/pl/plpgsql/src/pl_exec.c,v retrieving revision 1.190 diff -c -c -r1.190 pl_exec.c *** src/pl/plpgsql/src/pl_exec.c 15 Mar 2007 23:12:07 -0000 1.190 --- src/pl/plpgsql/src/pl_exec.c 24 Mar 2007 19:18:06 -0000 *************** *** 4139,4149 **** --- 4139,4153 ---- PG_CATCH(); { /* Restore global vars and propagate error */ + if (!estate->readonly_func) + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = saveActiveSnapshot; PG_RE_THROW(); } PG_END_TRY(); + if (!estate->readonly_func) + FreeSnapshot(ActiveSnapshot); ActiveSnapshot = saveActiveSnapshot; SPI_pop();
pgsql-patches by date: