Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum
Date
Msg-id 45C2291F.1020005@enterprisedb.com
In response to Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum  (Bruce Momjian <bruce@momjian.us>)
Responses Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Bruce Momjian wrote:
> Have you gotten performance numbers on this yet?

I have two runs of DBT-2, one with the patch and one without.

Patched:

autovac "public.stock" scans:1 pages:1285990(-0)
tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec

Unpatched:

autovac "public.stock" scans:1 pages:1284504(-0)
tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec

Both autovacuums started roughly at the same time after test start. The
numbers mean that without the patch, the vacuum found 1973760 dead
tuples and with the patch 2671265 dead tuples. The runs were done with
autovacuum_vacuum_scale_factor = 0.05, to trigger the autovacuum earlier
than with the default.
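
For a quick comparison of the two runs, the dead-tuple counts and elapsed
times above can be turned into relative differences. This is only a
minimal sketch; the helper name percent_change is made up for
illustration and is not part of the patch.

```c
#include <assert.h>

/*
 * Relative change of the patched figure over the unpatched one, in percent.
 * Hypothetical helper for comparing the two autovacuum log lines.
 */
double percent_change(double unpatched, double patched)
{
    assert(unpatched > 0.0);    /* avoid dividing by zero */
    return (patched - unpatched) / unpatched * 100.0;
}
```

Fed with the figures from the logs (1973760 vs. 2671265 dead tuples,
9628.13 vs. 10351.17 sec elapsed), this gives roughly 35% more dead
tuples found with the patch, for roughly 7.5% more elapsed time.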

Before these test runs, I realized that the patch had a small oddity.
Because we're taking a new snapshot during the vacuum, some rows that
are updated while the vacuum is running might not get counted as live.
That can happen when the new, updated version goes to a page that has
already been vacuumed, and the old version is on a page that hasn't
been vacuumed yet. Also, because we're taking new snapshots, it makes
sense to recalculate the relation size as well, so that any new blocks
at the end get vacuumed too. Attached is an updated patch that does that.
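
The effect of re-reading the relation size inside the scan loop can be
sketched as a small simulation. This is not the patch itself: SimRelation,
sim_relation_size and sim_scan are hypothetical stand-ins, and the
growth model (a fixed number of blocks added between size checks) is an
assumption made only to keep the example self-contained.

```c
#include <assert.h>

#define OLDEST_XMIN_REFRESH_INTERVAL 100    /* same value as in the patch */

/* Hypothetical stand-in for a relation that grows while it is scanned. */
typedef struct SimRelation
{
    unsigned nblocks;           /* current size in blocks */
    unsigned grow_per_check;    /* blocks added between two size checks */
    unsigned max_blocks;        /* growth stops here so the scan terminates */
} SimRelation;

/* Hypothetical stand-in for RelationGetNumberOfBlocks(). */
unsigned sim_relation_size(SimRelation *rel)
{
    rel->nblocks += rel->grow_per_check;
    if (rel->nblocks > rel->max_blocks)
        rel->nblocks = rel->max_blocks;
    return rel->nblocks;
}

/*
 * Scan the relation block by block. On the last block of every interval
 * the size is re-read, so blocks added during the scan are visited too.
 * Returns the number of blocks visited; *refreshes counts the re-reads.
 */
unsigned sim_scan(SimRelation *rel, unsigned *refreshes)
{
    unsigned nblocks = rel->nblocks;    /* size when the scan starts */
    unsigned visited = 0;
    unsigned blkno;

    assert(refreshes != 0);
    *refreshes = 0;
    for (blkno = 0; blkno < nblocks; blkno++)
    {
        if (blkno % OLDEST_XMIN_REFRESH_INTERVAL ==
            (OLDEST_XMIN_REFRESH_INTERVAL - 1))
        {
            /* the real patch also recalculates OldestXmin at this point */
            nblocks = sim_relation_size(rel);
            (*refreshes)++;
        }
        visited++;
    }
    return visited;
}
```

In this model, a relation that starts at 250 blocks and gains 10 blocks
between size checks is re-read at blocks 99 and 199, so the scan ends up
covering 270 blocks instead of stopping at the original 250.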

The reason I haven't posted the results earlier is that we're having
some periodic drops in performance on that server that we can't explain.
(I don't think it's checkpoints or autovacuum.) I wanted to figure
that out first, but I don't think it makes a difference for this patch.

Is this enough, or does someone want more evidence?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.82
diff -c -r1.82 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c    5 Jan 2007 22:19:27 -0000    1.82
--- src/backend/commands/vacuumlazy.c    22 Jan 2007 11:35:34 -0000
***************
*** 66,71 ****
--- 66,73 ----
  #define REL_TRUNCATE_MINIMUM    1000
  #define REL_TRUNCATE_FRACTION    16

+ /* OldestXmin is recalculated every OLDEST_XMIN_REFRESH_INTERVAL pages */
+ #define OLDEST_XMIN_REFRESH_INTERVAL 100

  typedef struct LVRelStats
  {
***************
*** 274,279 ****
--- 276,296 ----
              vacrelstats->num_dead_tuples = 0;
          }

+         /* Get a new OldestXmin every OLDEST_XMIN_REFRESH_INTERVAL pages
+          * so that we get to reclaim a little bit more dead tuples in a
+          * long-running vacuum.
+          */
+         if (blkno % OLDEST_XMIN_REFRESH_INTERVAL == (OLDEST_XMIN_REFRESH_INTERVAL - 1))
+         {
+             OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true);
+             /* The table could've grown since vacuum started, and there
+              * might already be dead tuples on the new pages. Catch them
+              * as well. Also, we want to include any live tuples in the
+              * new pages in the statistics.
+              */
+             nblocks = RelationGetNumberOfBlocks(onerel);
+         }
+
          buf = ReadBuffer(onerel, blkno);

          /* Initially, we only need shared access to the buffer */
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.20
diff -c -r1.20 procarray.c
*** src/backend/storage/ipc/procarray.c    5 Jan 2007 22:19:38 -0000    1.20
--- src/backend/storage/ipc/procarray.c    16 Jan 2007 16:57:59 -0000
***************
*** 416,426 ****
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  (Note that if there
!      * are no xacts running at all, that will be the subtrans truncation
!      * point!)
       */
!     if (IsTransactionState())
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();
--- 416,429 ----
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  We also don't
!      * include our own transaction if ignoreVacuum is true and we're a
!      * vacuum process ourselves.
!      *
!      * (Note that if there are no xacts running at all, that will be the
!      * subtrans truncation point!)
       */
!     if (IsTransactionState() && !(ignoreVacuum && MyProc->inVacuum))
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();
