Re: Improvement of procArray.xmin for VACUUM - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Improvement of procArray.xmin for VACUUM
Date
Msg-id 200703241942.l2OJgZC18482@momjian.us
Whole thread Raw
In response to Re: Improvement of procArray.xmin for VACUUM  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > It seems to me a lot cleaner to do the reference counting like Tom
> > suggested. Increase the refcount on CopySnapshot, and decrease it on
> > FreeSnapshot. Assuming that all callers of CopySnapshot free the
> > snapshot with FreeSnapshot when they're done with it.
>
> I don't believe we bother at the moment; which is one of the reasons
> it'd be a nontrivial patch.  I do think it might be worth doing though.
> In the simple case where you're just issuing successive non-cursor
> commands within a READ COMMITTED transaction, a refcounted
> implementation would be able to recognize that there are *no* live
> snapshots between commands and therefore reset MyProc->xmin to 0
> whenever the backend is idle.

Attached is my current version of the patch.  It doesn't work now that I
tried to do reference count for Snapshots, but will stop now that Tom is
considering redesigning the snapshot mechanism.

> OTOH, do we have any evidence that this is worth bothering with at all?
> I fear that the cases of long-running transactions that are problems
> in the real world wouldn't be helped much --- for instance, pg_dump
> wouldn't change behavior because it uses a serializable transaction.
> Also, at some point a long-running transaction becomes a bottleneck
> simply because its XID is itself the oldest thing visible in the
> ProcArray and is determining everyone's xmin.  How much daylight is
> there really between "your xmin is old" and "your xid is old"?

Well, interesting you mention that, because I have a second idea on how
to improve things.  We start with MyProc->xmin equal to our own xid, and
then look for earlier transactions.  It should be possible to skip
considering our own xid for MyProc->xmin.  This would obviously help
VACUUM during long-running transactions.  While our transaction is
running, our xid isn't committed, so VACUUM isn't going to touch any of
our rows, and if other transactions complete before our
multi-transaction _statement_ starts, we can't see deleted rows from
them transaction, so why keep the deleted rows around?  Consider this
case:

Session #:
        1                               2               3
                                        BEGIN;
                                        SELECT 1;
        CREATE TABLE test(x int);
        INSERT INTO test VALUES (1);
        DELETE FROM test;
                                        SELECT 1;
                                                        VACUUM VERBOSE test;
                                                        (row can be reused)
                                        COMMIT;
                                                        VACUUM VERBOSE test;
                                                        (normal row reuse)

As I understand it, in READ COMMITTED mode, we have to skip
transactions in progress when our _statement_ starts, but anything
committed before that we see and we don't see dead rows created by them.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/catalog/index.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.280
diff -c -c -r1.280 index.c
*** src/backend/catalog/index.c    3 Mar 2007 20:08:41 -0000    1.280
--- src/backend/catalog/index.c    24 Mar 2007 19:17:56 -0000
***************
*** 1358,1364 ****
      ExprContext *econtext;
      Snapshot    snapshot;
      TransactionId OldestXmin;
!
      /*
       * sanity checks
       */
--- 1358,1364 ----
      ExprContext *econtext;
      Snapshot    snapshot;
      TransactionId OldestXmin;
!
      /*
       * sanity checks
       */
***************
*** 1555,1561 ****
      ExecDropSingleTupleTableSlot(slot);

      FreeExecutorState(estate);
!
      /* These may have been pointing to the now-gone estate */
      indexInfo->ii_ExpressionsState = NIL;
      indexInfo->ii_PredicateState = NIL;
--- 1555,1563 ----
      ExecDropSingleTupleTableSlot(slot);

      FreeExecutorState(estate);
!     if (snapshot != SnapshotNow)
!         FreeSnapshot(snapshot);
!
      /* These may have been pointing to the now-gone estate */
      indexInfo->ii_ExpressionsState = NIL;
      indexInfo->ii_PredicateState = NIL;
Index: src/backend/commands/cluster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.157
diff -c -c -r1.157 cluster.c
*** src/backend/commands/cluster.c    13 Mar 2007 00:33:39 -0000    1.157
--- src/backend/commands/cluster.c    24 Mar 2007 19:17:56 -0000
***************
*** 204,209 ****
--- 204,210 ----
              /* Start a new transaction for each relation. */
              StartTransactionCommand();
              /* functions in indexes may want a snapshot set */
+             FreeSnapshot(ActiveSnapshot);
              ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
              cluster_rel(rvtc, true);
              CommitTransactionCommand();
Index: src/backend/commands/indexcmds.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/indexcmds.c,v
retrieving revision 1.157
diff -c -c -r1.157 indexcmds.c
*** src/backend/commands/indexcmds.c    13 Mar 2007 00:33:39 -0000    1.157
--- src/backend/commands/indexcmds.c    24 Mar 2007 19:17:56 -0000
***************
*** 493,500 ****
       * We also set ActiveSnapshot to this snap, since functions in indexes may
       * need a snapshot.
       */
!     snapshot = CopySnapshot(GetTransactionSnapshot());
!     ActiveSnapshot = snapshot;

      /*
       * Scan the index and the heap, insert any missing index entries.
--- 493,500 ----
       * We also set ActiveSnapshot to this snap, since functions in indexes may
       * need a snapshot.
       */
!     FreeSnapshot(ActiveSnapshot);
!     ActiveSnapshot = snapshot = CopySnapshot(GetTransactionSnapshot());

      /*
       * Scan the index and the heap, insert any missing index entries.
***************
*** 1304,1309 ****
--- 1304,1310 ----

          StartTransactionCommand();
          /* functions in indexes may want a snapshot set */
+         FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
          if (reindex_relation(relid, true))
              ereport(NOTICE,
Index: src/backend/commands/trigger.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/trigger.c,v
retrieving revision 1.214
diff -c -c -r1.214 trigger.c
*** src/backend/commands/trigger.c    19 Mar 2007 23:38:29 -0000    1.214
--- src/backend/commands/trigger.c    24 Mar 2007 19:17:57 -0000
***************
*** 2645,2651 ****
--- 2645,2654 ----
       */
      events = &afterTriggers->events;
      if (events->head != NULL)
+     {
+         FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
+     }

      /*
       * Run all the remaining triggers.    Loop until they are all gone, just in
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.349
diff -c -c -r1.349 vacuum.c
*** src/backend/commands/vacuum.c    14 Mar 2007 18:48:55 -0000    1.349
--- src/backend/commands/vacuum.c    24 Mar 2007 19:17:59 -0000
***************
*** 412,417 ****
--- 412,418 ----
                  {
                      StartTransactionCommand();
                      /* functions in indexes may want a snapshot set */
+                     FreeSnapshot(ActiveSnapshot);
                      ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
                  }
                  else
***************
*** 470,475 ****
--- 471,477 ----
           * necessary if we are called from autovacuum because autovacuum might
           * continue on to do an ANALYZE-only call.
           */
+         FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
      }

***************
*** 964,969 ****
--- 966,972 ----
      if (vacstmt->full)
      {
          /* functions in indexes may want a snapshot set */
+         FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
      }
      else
Index: src/backend/executor/functions.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/functions.c,v
retrieving revision 1.112
diff -c -c -r1.112 functions.c
*** src/backend/executor/functions.c    13 Mar 2007 00:33:40 -0000    1.112
--- src/backend/executor/functions.c    24 Mar 2007 19:17:59 -0000
***************
*** 315,321 ****
                                          fcache->paramLI);

      /* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
!
      /* Utility commands don't need Executor. */
      if (es->qd->operation != CMD_UTILITY)
      {
--- 315,321 ----
                                          fcache->paramLI);

      /* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
!
      /* Utility commands don't need Executor. */
      if (es->qd->operation != CMD_UTILITY)
      {
Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.38
diff -c -c -r1.38 autovacuum.c
*** src/backend/postmaster/autovacuum.c    23 Mar 2007 21:57:10 -0000    1.38
--- src/backend/postmaster/autovacuum.c    24 Mar 2007 19:18:00 -0000
***************
*** 884,889 ****
--- 884,890 ----
      StartTransactionCommand();

      /* functions in indexes may want a snapshot set */
+     FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

      /*
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.22
diff -c -c -r1.22 procarray.c
*** src/backend/storage/ipc/procarray.c    23 Mar 2007 03:16:39 -0000    1.22
--- src/backend/storage/ipc/procarray.c    24 Mar 2007 19:18:01 -0000
***************
*** 498,504 ****
   *----------
   */
  Snapshot
! GetSnapshotData(Snapshot snapshot, bool serializable)
  {
      ProcArrayStruct *arrayP = procArray;
      TransactionId xmin;
--- 498,504 ----
   *----------
   */
  Snapshot
! GetSnapshotData(Snapshot snapshot)
  {
      ProcArrayStruct *arrayP = procArray;
      TransactionId xmin;
***************
*** 510,519 ****

      Assert(snapshot != NULL);

!     /* Serializable snapshot must be computed before any other... */
!     Assert(serializable ?
!            !TransactionIdIsValid(MyProc->xmin) :
!            TransactionIdIsValid(MyProc->xmin));

      /*
       * Allocating space for maxProcs xids is usually overkill; numProcs would
--- 510,517 ----

      Assert(snapshot != NULL);

!     /* Either we set an xmin, or we already have one. */
!     Assert(TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0);

      /*
       * Allocating space for maxProcs xids is usually overkill; numProcs would
***************
*** 656,664 ****
          }
      }

!     if (serializable)
!         MyProc->xmin = TransactionXmin = xmin;

      LWLockRelease(ProcArrayLock);

      /*
--- 654,681 ----
          }
      }

!     /* DEBUG */
!     if (!TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0)
!         fprintf(stderr, "%x:%x: set new xmin old=%x new=%x\n", MyProcPid, MyProc->xid, MyProc->xmin, xmin);
!     else
!         fprintf(stderr, "%x:%x: count=%d, not set\n", MyProcPid, MyProc->xid, GetOpenSnapshotCount());

+     /*
+      * ---
+      * We need to set MyProc->xmin in a few cases:
+      *    o  set at transaction start for SERIALIZABLE visibility
+      *    o  set for new command in READ COMMITTED mode so
+      *       VACUUM knows we don't care about completed transactions
+      *
+      *    The problem is that we might have old cursors/portals
+      *    that need to have older transaction visibility, so we
+      *    check to see we have no open Snapshots before making
+      *    the change.
+      * ---
+      */
+     if (!TransactionIdIsValid(MyProc->xmin) || GetOpenSnapshotCount() == 0)
+         MyProc->xmin = TransactionXmin = xmin;
+
      LWLockRelease(ProcArrayLock);

      /*
Index: src/backend/tcop/fastpath.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tcop/fastpath.c,v
retrieving revision 1.96
diff -c -c -r1.96 fastpath.c
*** src/backend/tcop/fastpath.c    3 Mar 2007 19:32:54 -0000    1.96
--- src/backend/tcop/fastpath.c    24 Mar 2007 19:18:01 -0000
***************
*** 308,313 ****
--- 308,314 ----
       * Now that we know we are in a valid transaction, set snapshot in case
       * needed by function itself or one of the datatype I/O routines.
       */
+     FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

      /*
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.529
diff -c -c -r1.529 postgres.c
*** src/backend/tcop/postgres.c    22 Mar 2007 19:55:04 -0000    1.529
--- src/backend/tcop/postgres.c    24 Mar 2007 19:18:02 -0000
***************
*** 738,743 ****
--- 738,744 ----
          {
              if (needSnapshot)
              {
+                 FreeSnapshot(ActiveSnapshot);
                  ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
                  needSnapshot = false;
              }
Index: src/backend/tcop/pquery.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tcop/pquery.c,v
retrieving revision 1.115
diff -c -c -r1.115 pquery.c
*** src/backend/tcop/pquery.c    13 Mar 2007 00:33:42 -0000    1.115
--- src/backend/tcop/pquery.c    24 Mar 2007 19:18:02 -0000
***************
*** 154,159 ****
--- 154,160 ----
       * Must always set snapshot for plannable queries.    Note we assume that
       * caller will take care of restoring ActiveSnapshot on exit/error.
       */
+     FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

      /*
***************
*** 486,491 ****
--- 487,493 ----
                   * Must set snapshot before starting executor.    Be sure to
                   * copy it into the portal's context.
                   */
+                 FreeSnapshot(ActiveSnapshot);
                  if (snapshot)
                      ActiveSnapshot = CopySnapshot(snapshot);
                  else
***************
*** 1163,1169 ****
--- 1165,1174 ----
            IsA(utilityStmt, NotifyStmt) ||
            IsA(utilityStmt, UnlistenStmt) ||
            IsA(utilityStmt, CheckPointStmt)))
+     {
+         FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
+     }
      else
          ActiveSnapshot = NULL;

***************
*** 1177,1184 ****
      /* Some utility statements may change context on us */
      MemoryContextSwitchTo(PortalGetHeapMemory(portal));

!     if (ActiveSnapshot)
!         FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = NULL;
  }

--- 1182,1188 ----
      /* Some utility statements may change context on us */
      MemoryContextSwitchTo(PortalGetHeapMemory(portal));

!     FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = NULL;
  }

Index: src/backend/utils/adt/ri_triggers.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/ri_triggers.c,v
retrieving revision 1.92
diff -c -c -r1.92 ri_triggers.c
*** src/backend/utils/adt/ri_triggers.c    15 Mar 2007 23:12:06 -0000    1.92
--- src/backend/utils/adt/ri_triggers.c    24 Mar 2007 19:18:04 -0000
***************
*** 2636,2642 ****
      char        workmembuf[32];
      int            spi_result;
      SPIPlanPtr    qplan;
!
      /*
       * Check to make sure current user has enough permissions to do the test
       * query.  (If not, caller can fall back to the trigger method, which
--- 2636,2643 ----
      char        workmembuf[32];
      int            spi_result;
      SPIPlanPtr    qplan;
!     Snapshot    snapshot;
!
      /*
       * Check to make sure current user has enough permissions to do the test
       * query.  (If not, caller can fall back to the trigger method, which
***************
*** 2770,2780 ****
       * really haven't got much choice.)  We need at most one tuple returned,
       * so pass limit = 1.
       */
      spi_result = SPI_execute_snapshot(qplan,
                                        NULL, NULL,
!                                       CopySnapshot(GetLatestSnapshot()),
                                        InvalidSnapshot,
                                        true, 1);

      /* Check result */
      if (spi_result != SPI_OK_SELECT)
--- 2771,2783 ----
       * really haven't got much choice.)  We need at most one tuple returned,
       * so pass limit = 1.
       */
+     snapshot = CopySnapshot(GetLatestSnapshot());
      spi_result = SPI_execute_snapshot(qplan,
                                        NULL, NULL,
!                                       snapshot,
                                        InvalidSnapshot,
                                        true, 1);
+     FreeSnapshot(snapshot);

      /* Check result */
      if (spi_result != SPI_OK_SELECT)
***************
*** 3310,3315 ****
--- 3313,3322 ----
                                        test_snapshot, crosscheck_snapshot,
                                        false, limit);

+     if (test_snapshot != InvalidSnapshot)
+         FreeSnapshot(test_snapshot);
+     if (crosscheck_snapshot != InvalidSnapshot)
+         FreeSnapshot(crosscheck_snapshot);
      /* Restore UID */
      SetUserId(save_uid);

Index: src/backend/utils/time/tqual.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/time/tqual.c,v
retrieving revision 1.101
diff -c -c -r1.101 tqual.c
*** src/backend/utils/time/tqual.c    5 Jan 2007 22:19:47 -0000    1.101
--- src/backend/utils/time/tqual.c    24 Mar 2007 19:18:04 -0000
***************
*** 54,59 ****
--- 54,67 ----
  static SnapshotData SnapshotDirtyData;
  static SnapshotData SerializableSnapshotData;
  static SnapshotData LatestSnapshotData;
+ /*
+  * We use OpenSnapshotCount to record if there are any open snapshots.  If
+  * there are not, we can update MyProc->xmin. A more sophisticated solution
+  * would be to keep track of the oldest open snapshot and prevent
+  * MyProc->xmin from going below that value, but the complexity doesn't
+  * seem worth it.
+  */
+ static int OpenSnapshotCount = 0;

  /* Externally visible pointers to valid snapshots: */
  Snapshot    SnapshotDirty = &SnapshotDirtyData;
***************
*** 1207,1220 ****
      /* First call in transaction? */
      if (SerializableSnapshot == NULL)
      {
!         SerializableSnapshot = GetSnapshotData(&SerializableSnapshotData, true);
          return SerializableSnapshot;
      }

      if (IsXactIsoLevelSerializable)
          return SerializableSnapshot;

!     LatestSnapshot = GetSnapshotData(&LatestSnapshotData, false);

      return LatestSnapshot;
  }
--- 1215,1228 ----
      /* First call in transaction? */
      if (SerializableSnapshot == NULL)
      {
!         SerializableSnapshot = GetSnapshotData(&SerializableSnapshotData);
          return SerializableSnapshot;
      }

      if (IsXactIsoLevelSerializable)
          return SerializableSnapshot;

!     LatestSnapshot = GetSnapshotData(&LatestSnapshotData);

      return LatestSnapshot;
  }
***************
*** 1231,1237 ****
      if (SerializableSnapshot == NULL)
          elog(ERROR, "no snapshot has been set");

!     LatestSnapshot = GetSnapshotData(&LatestSnapshotData, false);

      return LatestSnapshot;
  }
--- 1239,1245 ----
      if (SerializableSnapshot == NULL)
          elog(ERROR, "no snapshot has been set");

!     LatestSnapshot = GetSnapshotData(&LatestSnapshotData);

      return LatestSnapshot;
  }
***************
*** 1251,1256 ****
--- 1259,1266 ----
      Size        subxipoff;
      Size        size;

+     OpenSnapshotCount++;
+
      /* We allocate any XID arrays needed in the same palloc block. */
      size = subxipoff = sizeof(SnapshotData) +
          snapshot->xcnt * sizeof(TransactionId);
***************
*** 1295,1301 ****
  void
  FreeSnapshot(Snapshot snapshot)
  {
!     pfree(snapshot);
  }

  /*
--- 1305,1320 ----
  void
  FreeSnapshot(Snapshot snapshot)
  {
!     if (snapshot != NULL)
!     {
!         OpenSnapshotCount--;
!         pfree(snapshot);
!     }
! }
!
! int GetOpenSnapshotCount(void)
! {
!     return OpenSnapshotCount;
  }

  /*
***************
*** 1312,1318 ****
--- 1331,1339 ----
       */
      SerializableSnapshot = NULL;
      LatestSnapshot = NULL;
+     FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = NULL;        /* just for cleanliness */
+     Assert(GetOpenSnapshotCount() == 0);
  }

  /*
Index: src/include/utils/tqual.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/tqual.h,v
retrieving revision 1.65
diff -c -c -r1.65 tqual.h
*** src/include/utils/tqual.h    5 Jan 2007 22:19:59 -0000    1.65
--- src/include/utils/tqual.h    24 Mar 2007 19:18:05 -0000
***************
*** 147,154 ****
  extern Snapshot CopySnapshot(Snapshot snapshot);
  extern void FreeSnapshot(Snapshot snapshot);
  extern void FreeXactSnapshot(void);

  /* in procarray.c; declared here to avoid including tqual.h in procarray.h: */
! extern Snapshot GetSnapshotData(Snapshot snapshot, bool serializable);

  #endif   /* TQUAL_H */
--- 147,155 ----
  extern Snapshot CopySnapshot(Snapshot snapshot);
  extern void FreeSnapshot(Snapshot snapshot);
  extern void FreeXactSnapshot(void);
+ extern int GetOpenSnapshotCount(void);

  /* in procarray.c; declared here to avoid including tqual.h in procarray.h: */
! extern Snapshot GetSnapshotData(Snapshot snapshot);

  #endif   /* TQUAL_H */
Index: src/pl/plpgsql/src/pl_exec.c
===================================================================
RCS file: /cvsroot/pgsql/src/pl/plpgsql/src/pl_exec.c,v
retrieving revision 1.190
diff -c -c -r1.190 pl_exec.c
*** src/pl/plpgsql/src/pl_exec.c    15 Mar 2007 23:12:07 -0000    1.190
--- src/pl/plpgsql/src/pl_exec.c    24 Mar 2007 19:18:06 -0000
***************
*** 4139,4149 ****
--- 4139,4153 ----
      PG_CATCH();
      {
          /* Restore global vars and propagate error */
+         if (!estate->readonly_func)
+             FreeSnapshot(ActiveSnapshot);
          ActiveSnapshot = saveActiveSnapshot;
          PG_RE_THROW();
      }
      PG_END_TRY();

+     if (!estate->readonly_func)
+         FreeSnapshot(ActiveSnapshot);
      ActiveSnapshot = saveActiveSnapshot;
      SPI_pop();


pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Improvement of procArray.xmin for VACUUM
Next
From: Bruce Momjian
Date:
Subject: Re: [BUGS] BUG #3095: LDAP authentication parsing incorrectly