Thread: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Hi,

It seems to me that we could easily reclaim a bit more dead tuples in a
vacuum by recalculating the OldestXmin every now and then. In a large
table with a constant stream of updates/deletes and concurrent vacuums,
this could make a big difference.

With the attached patch, OldestXmin is recalculated in a vacuum every
100 pages. That's quite an arbitrary number, but it feels like a good one to me.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.82
diff -c -r1.82 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c    5 Jan 2007 22:19:27 -0000    1.82
--- src/backend/commands/vacuumlazy.c    16 Jan 2007 11:02:35 -0000
***************
*** 66,71 ****
--- 66,73 ----
  #define REL_TRUNCATE_MINIMUM    1000
  #define REL_TRUNCATE_FRACTION    16

+ /* OldestXmin is recalculated every OLDEST_XMIN_REFRESH_INTERVAL pages */
+ #define OLDEST_XMIN_REFRESH_INTERVAL 100

  typedef struct LVRelStats
  {
***************
*** 256,261 ****
--- 258,270 ----

          vacuum_delay_point();

+         /* Get a new OldestXmin every OLDEST_XMIN_REFRESH_INTERVAL pages
+          * so that we get to reclaim a little bit more dead tuples in a
+          * long-running vacuum.
+          */
+         if (blkno % OLDEST_XMIN_REFRESH_INTERVAL == (OLDEST_XMIN_REFRESH_INTERVAL - 1))
+             OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true);
+
          /*
           * If we are close to overrunning the available space for dead-tuple
           * TIDs, pause and do a cycle of vacuuming before we tackle this page.
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.20
diff -c -r1.20 procarray.c
*** src/backend/storage/ipc/procarray.c    5 Jan 2007 22:19:38 -0000    1.20
--- src/backend/storage/ipc/procarray.c    16 Jan 2007 10:52:10 -0000
***************
*** 416,426 ****
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  (Note that if there
!      * are no xacts running at all, that will be the subtrans truncation
!      * point!)
       */
!     if (IsTransactionState())
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();
--- 416,429 ----
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  We also don't
!      * include our own transaction if ignoreVacuum is true and we're a
!      * vacuum process ourselves.
!      *
!      * (Note that if there are no xacts running at all, that will be the
!      * subtrans truncation point!)
       */
!     if (IsTransactionState() && !(ignoreVacuum && MyProc->inVacuum))
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();

Re: Recalculating OldestXmin in a long-running vacuum

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> It seems to me that we could easily reclaim a bit more dead tuples in a
> vacuum by recalculating the OldestXmin every now and then.

Doesn't this break relfrozenxid maintenance?  In any case I'd like to
see some evidence of significant real-world benefit before adding such
a conceptual wart ...

The procarray.c change scares me as well; I'm pretty sure the original
coding was intentional.

            regards, tom lane

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> It seems to me that we could easily reclaim a bit more dead tuples in a
>> vacuum by recalculating the OldestXmin every now and then.
>
> Doesn't this break relfrozenxid maintenance?

Not AFAICS. relfrozenxid is nowadays updated with FreezeLimit.

> In any case I'd like to
> see some evidence of significant real-world benefit before adding such
> a conceptual wart ...

I've asked our testers to do a TPC-C run with and without the
patch. I'm not expecting it to make a huge difference there, but if
you're using a big cost delay, it could make quite a difference for such
a simple thing.

> The procarray.c change scares me as well; I'm pretty sure the original
> coding was intentional.

Well, it didn't make much difference before, since OldestXmin was always
calculated early in the transaction.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Re: Recalculating OldestXmin in a long-running vacuum

From
Gregory Stark
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:

> Tom Lane wrote:
>
> > In any case I'd like to see some evidence of significant real-world
> > benefit before adding such a conceptual wart ...
>
> I've asked our testers to do a TPC-C run with and without the
> patch. I'm not expecting it to make a huge difference there, but if you're
> using a big cost delay, it could make quite a difference for such a simple
> thing.

I think the key here is that it removes the size of the table from the list of
factors that govern the steady state. Currently, as the table grows, the
maximum age of the snapshot vacuum uses grows too, which means a greater
percentage of the table is dedicated to dead-tuple overhead (which in turn
means a larger table, which means an even greater percentage of dead tuples...).

With the patch, the size of the table is no longer a factor. As long as the
work vacuum does on a given page can keep up with the rate at which dead
tuples are created, it won't matter how large the table is. The only factors
that determine the steady state are the rate of creation of dead tuples and
the rate at which vacuum can process them.

Unfortunately indexes, again, mean that's not entirely true: as the table
grows, the indexes will grow as well, and that will slow vacuum down. Though
indexes are usually smaller than tables.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
Have you gotten performance numbers on this yet?

---------------------------------------------------------------------------

Gregory Stark wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>
> > Tom Lane wrote:
> >
> > > In any case I'd like to see some evidence of significant real-world
> > > benefit before adding such a conceptual wart ...
> >
> > I've asked our testers to do a TPC-C run with and without the
> > patch. I'm not expecting it to make a huge difference there, but if you're
> > using a big cost delay, it could make quite a difference for such a simple
> > thing.
>
> I think the key here is that it removes the size of the table from the list of
> factors that govern the steady-state. Currently as the table grows the maximum
> age of the snapshot vacuum uses gets older too which means a greater
> percentage of the table is dedicated to dead tuple overhead. (which in turn
> means a larger table which means an even greater percentage of dead tuples...)
>
> With the patch the size of the table is no longer a factor. As long as the
> work vacuum has on a given page can keep up with the rate of creating dead
> tuples then it won't matter how large the table is. The only factors that
> determine the steady state are the rate of creation of dead tuples and the
> rate at which vacuum can process them.
>
> Unfortunately indexes, again, mean that's not entirely true. As the table
> grows the indexes will grow as well and that will slow vacuum down. Though
> indexes are usually smaller than tables.
>
> --
>   Gregory Stark
>   EnterpriseDB          http://www.enterprisedb.com

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Bruce Momjian wrote:
> Have you gotten performance numbers on this yet?

I have two runs of DBT-2, one with the patch and one without.

Patched:

autovac "public.stock" scans:1 pages:1285990(-0)
tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec

Unpatched:

autovac "public.stock" scans:1 pages:1284504(-0)
tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec

Both autovacuums started at roughly the same time after test start. The
numbers mean that without the patch the vacuum found 1973760 dead tuples,
and with the patch 2671265 dead tuples. The runs were done with
autovacuum_vacuum_scale_factor = 0.05, to trigger autovacuum earlier than
the default would.

Before these test runs, I realized that the patch had a little
strangeness. Because we're taking a new snapshot during the vacuum, some
rows that are updated while the vacuum is running might not get counted
as live. That can happen when the new updated version goes to a page that
has already been vacuumed, while the old version is on a page that hasn't
been vacuumed yet. Also, because we're taking new snapshots, it makes
sense to recalculate the relation size as well, to vacuum any new blocks
at the end. Attached is an updated patch that does that.

The reason I haven't posted the results earlier is that we're having
some periodic drops in performance on that server that we can't explain.
(I don't think it's checkpoints or autovacuum.) I wanted to figure that
out first, but I don't think it makes a difference for this patch.

Is this enough, or does someone want more evidence?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.82
diff -c -r1.82 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c    5 Jan 2007 22:19:27 -0000    1.82
--- src/backend/commands/vacuumlazy.c    22 Jan 2007 11:35:34 -0000
***************
*** 66,71 ****
--- 66,73 ----
  #define REL_TRUNCATE_MINIMUM    1000
  #define REL_TRUNCATE_FRACTION    16

+ /* OldestXmin is recalculated every OLDEST_XMIN_REFRESH_INTERVAL pages */
+ #define OLDEST_XMIN_REFRESH_INTERVAL 100

  typedef struct LVRelStats
  {
***************
*** 274,279 ****
--- 276,296 ----
              vacrelstats->num_dead_tuples = 0;
          }

+         /* Get a new OldestXmin every OLDEST_XMIN_REFRESH_INTERVAL pages
+          * so that we get to reclaim a little bit more dead tuples in a
+          * long-running vacuum.
+          */
+         if (blkno % OLDEST_XMIN_REFRESH_INTERVAL == (OLDEST_XMIN_REFRESH_INTERVAL - 1))
+         {
+             OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true);
+             /* The table could've grown since vacuum started, and there
+              * might already be dead tuples on the new pages. Catch them
+              * as well. Also, we want to include any live tuples in the
+              * new pages in the statistics.
+              */
+             nblocks = RelationGetNumberOfBlocks(onerel);
+         }
+
          buf = ReadBuffer(onerel, blkno);

          /* Initially, we only need shared access to the buffer */
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.20
diff -c -r1.20 procarray.c
*** src/backend/storage/ipc/procarray.c    5 Jan 2007 22:19:38 -0000    1.20
--- src/backend/storage/ipc/procarray.c    16 Jan 2007 16:57:59 -0000
***************
*** 416,426 ****
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  (Note that if there
!      * are no xacts running at all, that will be the subtrans truncation
!      * point!)
       */
!     if (IsTransactionState())
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();
--- 416,429 ----
      /*
       * Normally we start the min() calculation with our own XID.  But if
       * called by checkpointer, we will not be inside a transaction, so use
!      * next XID as starting point for min() calculation.  We also don't
!      * include our own transaction if ignoreVacuum is true and we're a
!      * vacuum process ourselves.
!      *
!      * (Note that if there are no xacts running at all, that will be the
!      * subtrans truncation point!)
       */
!     if (IsTransactionState() && !(ignoreVacuum && MyProc->inVacuum))
          result = GetTopTransactionId();
      else
          result = ReadNewTransactionId();

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I have two runs of DBT-2, one with the patch and one without.

> Patched:

> autovac "public.stock" scans:1 pages:1285990(-0)
> tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec

> Unpatched:

> autovac "public.stock" scans:1 pages:1284504(-0)
> tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec

So that makes this patch a good idea why?  (Maybe what we need to see
is the impact on the total elapsed time for the DBT-2 test, rather
than just the VACUUM runtime.)

BTW I've got serious reservations about whether this bit is safe:

> +             /* The table could've grown since vacuum started, and there
> +              * might already be dead tuples on the new pages. Catch them
> +              * as well. Also, we want to include any live tuples in the
> +              * new pages in the statistics.
> +              */
> +             nblocks = RelationGetNumberOfBlocks(onerel);

I seem to recall some assumptions somewhere in the system that a vacuum
won't visit newly-added pages.

            regards, tom lane

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > I have two runs of DBT-2, one with the patch and one without.
>
> > Patched:
>
> > autovac "public.stock" scans:1 pages:1285990(-0)
> > tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec
>
> > Unpatched:
>
> > autovac "public.stock" scans:1 pages:1284504(-0)
> > tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec
>
> So that makes this patch a good idea why?  (Maybe what we need to see
> is the impact on the total elapsed time for the DBT-2 test, rather
> than just the VACUUM runtime.)

Yeah, in summary, it looks like the patch made performance worse.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> I have two runs of DBT-2, one with the patch and one without.
>
>> Patched:
>
>> autovac "public.stock" scans:1 pages:1285990(-0)
>> tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec
>
>> Unpatched:
>
>> autovac "public.stock" scans:1 pages:1284504(-0)
>> tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec
>
> So that makes this patch a good idea why?  (Maybe what we need to see
> is the impact on the total elapsed time for the DBT-2 test, rather
> than just the VACUUM runtime.)

The patch is a good idea because the vacuum collected more dead tuples,
not because it makes vacuum run faster (it doesn't).

I believe the increase in runtime is caused by some random variation in
the test. As I said, there's something going on on that server that I
don't fully understand. I'll set up another set of tests.

The impact on the whole DBT-2 test is hard to measure; it would require
a long run so that you reach a steady state where tables are no longer
growing because of dead tuples. We'll need to do that anyway, so
hopefully I'll get a chance to measure the impact of this patch as well.
The effect I'm expecting the patch to have is to make the steady-state
size of the stock table smaller, giving more cache hits and less I/O.

> BTW I've got serious reservations about whether this bit is safe:
>
>> +             /* The table could've grown since vacuum started, and there
>> +              * might already be dead tuples on the new pages. Catch them
>> +              * as well. Also, we want to include any live tuples in the
>> +              * new pages in the statistics.
>> +              */
>> +             nblocks = RelationGetNumberOfBlocks(onerel);
>
> I seem to recall some assumptions somewhere in the system that a vacuum
> won't visit newly-added pages.

Hmm, I can't think of anything.

Without the patch, there can't be any dead tuples on the newly-added
pages, according to the snapshot taken at the beginning of the vacuum,
so it would be a waste of time to visit them.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Tom Lane wrote:
>> BTW I've got serious reservations about whether this bit is safe:
>>
>>> +             /* The table could've grown since vacuum started, and there
>>> +              * might already be dead tuples on the new pages. Catch them
>>> +              * as well. Also, we want to include any live tuples in the
>>> +              * new pages in the statistics.
>>> +              */
>>> +             nblocks = RelationGetNumberOfBlocks(onerel);
>>
>> I seem to recall some assumptions somewhere in the system that a vacuum
>> won't visit newly-added pages.

> Hmm, I can't think of anything.

I think I was thinking of the second risk described here:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg00613.php
which is now fixed so maybe there's no longer any problem.  (If there
is, a change like this will convert it from a very-low-probability
problem into a significant-probability problem, so I guess we'll
find out...)

I still don't like the patch though; rechecking the relation length
every N blocks is uselessly inefficient and still doesn't create any
guarantees about having examined everything.  If we think this is
worth doing at all, we should arrange to recheck the length after
processing what we think is the last block, not at any other time.

            regards, tom lane

Re: Recalculating OldestXmin in a long-running vacuum

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > Tom Lane wrote:
> >> BTW I've got serious reservations about whether this bit is safe:
> >>
> >>> +             /* The table could've grown since vacuum started, and there
> >>> +              * might already be dead tuples on the new pages. Catch them
> >>> +              * as well. Also, we want to include any live tuples in the
> >>> +              * new pages in the statistics.
> >>> +              */
> >>> +             nblocks = RelationGetNumberOfBlocks(onerel);
> >>
> >> I seem to recall some assumptions somewhere in the system that a vacuum
> >> won't visit newly-added pages.
>
> > Hmm, I can't think of anything.
>
> I think I was thinking of the second risk described here:
> http://archives.postgresql.org/pgsql-hackers/2005-05/msg00613.php
> which is now fixed so maybe there's no longer any problem.  (If there
> is, a change like this will convert it from a very-low-probability
> problem into a significant-probability problem, so I guess we'll
> find out...)
>
> I still don't like the patch though; rechecking the relation length
> every N blocks is uselessly inefficient and still doesn't create any
> guarantees about having examined everything.  If we think this is
> worth doing at all, we should arrange to recheck the length after
> processing what we think is the last block, not at any other time.

Was this revisited?

I'm wondering if there has been any effort to make this work in
conjunction with ITAGAKI Takahiro's patch to correct the dead tuple
count estimate.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Alvaro Herrera wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>> Tom Lane wrote:
>>>> BTW I've got serious reservations about whether this bit is safe:
>>>>
>>>>> +             /* The table could've grown since vacuum started, and there
>>>>> +              * might already be dead tuples on the new pages. Catch them
>>>>> +              * as well. Also, we want to include any live tuples in the
>>>>> +              * new pages in the statistics.
>>>>> +              */
>>>>> +             nblocks = RelationGetNumberOfBlocks(onerel);
>>>> I seem to recall some assumptions somewhere in the system that a vacuum
>>>> won't visit newly-added pages.
>>> Hmm, I can't think of anything.
>> I think I was thinking of the second risk described here:
>> http://archives.postgresql.org/pgsql-hackers/2005-05/msg00613.php
>> which is now fixed so maybe there's no longer any problem.  (If there
>> is, a change like this will convert it from a very-low-probability
>> problem into a significant-probability problem, so I guess we'll
>> find out...)
>>
>> I still don't like the patch though; rechecking the relation length
>> every N blocks is uselessly inefficient and still doesn't create any
>> guarantees about having examined everything.  If we think this is
>> worth doing at all, we should arrange to recheck the length after
>> processing what we think is the last block, not at any other time.
>
> Was this revisited?

Arranging the tests has taken me longer than I thought, but I now
finally have the hardware and DBT-2 set up. I just finished a pair of 2h
tests with autovacuum off and a continuous vacuum of the stock table. I'm
trying to get the results uploaded to a public website so we can all
see and discuss them.

> I'm wondering if there has been any effort to make this work in
> conjunction with ITAGAKI Takahiro's patch to correct the dead tuple
> count estimate.

No. I'll have to take a look at that patch...

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Alvaro Herrera wrote:
> I'm wondering if there has been any effort to make this work in
> conjunction with ITAGAKI Takahiro's patch to correct the dead tuple
> count estimate.

I just looked at that patch. If we applied both patches, the dead_tuples
estimate would be off by the number of dead tuples removed thanks to my
patch.

In vacuum, we could count separately the tuples that were vacuumable
according to the first snapshot and the tuples that only became
vacuumable according to a newer snapshot. We could then get an estimate
that's as good as with just Takahiro's patch, using this formula:

new_n_dead_tuples = n_dead_tuples_at_end - (n_dead_tuples_at_start +
tuples_removed_thanks_to_new_snapshot)

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
NikhilS
Date:
Hi,

I was wondering whether we could apply the same logic of recalculating
OldestXmin within IndexBuildHeapScan, which runs as part of a create
index operation.

Having fewer tuples to index should be a good saving in the "CREATE
INDEX CONCURRENTLY" case, if the above is possible.

Regards,
Nikhils
--
EnterpriseDB               http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Was this revisited?
>
> Arranging the tests has taken me longer than I thought, but I now
> finally have the hardware and DBT-2 set up. I just finished a pair of 2h
> tests with autovacuum off, and continuous vacuum of the stock table. I'm
> trying to get the results uploaded on some public website so we can all
> see and discuss them.

I finally got around looking at this again.

I ran two 24h test runs with DBT-2, one with the patch and one without.
To get comparable, predictable results, I turned autovacuum off and ran
a manual vacuum in a loop on the stock table alone.

As expected, the steady-state size of the stock table is smaller with
the patch, but only by ~2%, which is slightly less than I expected.

But what surprises me is that response times went up a bit with the
patch. I don't know why.

The full test results are here:

http://community.enterprisedb.com/oldestxmin/

92 was run with the patch, 93 without it.

BTW: The iostat chart clearly shows the vacuum WAL flush problem. The
WAL utilization jumps from ~20% to ~40% during the 2nd vacuum pass.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:

> I ran two 24h test runs with DBT-2, one with the patch and one without.
> To get comparable, predictable results, I turned autovacuum off and run
> a manual vacuum in a loop on the stock-table alone.
>
> As expected, the steady-state of the stock table is smaller with the
> patch. But only by ~2%, that's slightly less than I expected.
>
> But what surprises me is that response times went up a with the patch. I
> don't know why.

Maybe because of increased contention on ProcArrayLock?  (I assume you
are using that, although I haven't seen the patch.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
>
>> I ran two 24h test runs with DBT-2, one with the patch and one without.
>> To get comparable, predictable results, I turned autovacuum off and run
>> a manual vacuum in a loop on the stock-table alone.
>>
>> As expected, the steady-state of the stock table is smaller with the
>> patch. But only by ~2%, that's slightly less than I expected.
>>
>> But what surprises me is that response times went up a with the patch. I
>> don't know why.
>
> Maybe because of increased contention of ProcArrayLock?  (I assume you
> are using that, althought I haven't seen the patch)

I am, but I doubt that's it. The response times are dominated by I/O, so
any increase in lock contention would hardly show up. And the patch is
only adding one GetOldestXmin call every 1000 scanned pages, which is
nothing compared to the thousands of GetSnapshot calls the normal
transactions are issuing per minute.

The patch must have changed the I/O pattern in some subtle way.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Heikki Linnakangas wrote:
> I am, but I doubt that's it. The response times are dominated by I/O, so
> any increase in lock contention would hardly show up. And the patch is
> only adding one GetOldestXmin call every 1000 scanned pages, which is

Sorry, should be "every 100 scanned pages". The argument still holds,
though.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > Heikki Linnakangas wrote:
> >
> >> I ran two 24h test runs with DBT-2, one with the patch and one without.
> >> To get comparable, predictable results, I turned autovacuum off and run
> >> a manual vacuum in a loop on the stock-table alone.
> >>
> >> As expected, the steady-state of the stock table is smaller with the
> >> patch. But only by ~2%, that's slightly less than I expected.
> >>
> >> But what surprises me is that response times went up a with the patch. I
> >> don't know why.
> >
> > Maybe because of increased contention of ProcArrayLock?  (I assume you
> > are using that, althought I haven't seen the patch)
>
> I am, but I doubt that's it. The response times are dominated by I/O, so
> any increase in lock contention would hardly show up. And the patch is
> only adding one GetOldestXmin call every 1000 scanned pages, which is
> nothing compared to the thousands of GetSnapshot calls the normal
> transactions are issuing per minute.
>
> The patch must have changed the I/O pattern in some subtle way.

So are you stopping work on the patch?  I assume so.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Bruce Momjian wrote:
> Heikki Linnakangas wrote:
>> Alvaro Herrera wrote:
>>> Heikki Linnakangas wrote:
>>>
>>>> I ran two 24h test runs with DBT-2, one with the patch and one without.
>>>> To get comparable, predictable results, I turned autovacuum off and run
>>>> a manual vacuum in a loop on the stock-table alone.
>>>>
>>>> As expected, the steady-state of the stock table is smaller with the
>>>> patch. But only by ~2%, that's slightly less than I expected.
>>>>
>>>> But what surprises me is that response times went up a with the patch. I
>>>> don't know why.
>>> Maybe because of increased contention of ProcArrayLock?  (I assume you
>>> are using that, althought I haven't seen the patch)
>> I am, but I doubt that's it. The response times are dominated by I/O, so
>> any increase in lock contention would hardly show up. And the patch is
>> only adding one GetOldestXmin call every 1000 scanned pages, which is
>> nothing compared to the thousands of GetSnapshot calls the normal
>> transactions are issuing per minute.
>>
>> The patch must have changed the I/O pattern in some subtle way.
>
> So are you stopping work on the patch?  I assume so.

Yes, at least for now. I can't believe the patch actually hurts
performance, but I'm not going to spend time investigating it.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Gregory Stark
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:

>>>>> But what surprises me is that response times went up a with the patch. I
>>>>> don't know why.
>>>> Maybe because of increased contention of ProcArrayLock?  (I assume you
>>>> are using that, althought I haven't seen the patch)
>>> I am, but I doubt that's it. The response times are dominated by I/O, so any
>>> increase in lock contention would hardly show up. And the patch is only
>>> adding one GetOldestXmin call every 1000 scanned pages, which is nothing
>>> compared to the thousands of GetSnapshot calls the normal transactions are
>>> issuing per minute.
>>>
>>> The patch must have changed the I/O pattern in some subtle way.
>>
>> So are you stopping work on the patch?  I assume so.
>
> Yes, at least for now. I can't believe the patch actually hurts performance,
> but I'm not going to spend time investigating it.

Isn't this exactly what you would expect? It will clean up more tuples so
it'll dirty more pages. Especially given the pessimal way vacuum's dirty
buffers are handled until Simon's patch to fix that goes in.

The benefit of the patch that we would expect to see is that you won't need to
run VACUUM as often. In the long term we would expect the stock table to grow
less too but I doubt these tests were long enough to demonstrate that effect.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Gregory Stark wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> Yes, at least for now. I can't believe the patch actually hurts performance,
>> but I'm not going to spend time investigating it.
>
> Isn't this exactly what you would expect? It will clean up more tuples so
> it'll dirty more pages. Especially given the pessimal way vacuum's dirty
> buffers are handled until Simon's patch to fix that goes in.

Hmm. Yeah, maybe it'll get better when we get that fixed..

> The benefit of the patch that we would expect to see is that you won't need to
> run VACUUM as often. In the long term we would expect the stock table to grow
> less too but I doubt these tests were long enough to demonstrate that effect.

The size did reach a steady state about half-way through the test, see
the logs here:

patched
http://community.enterprisedb.com/oldestxmin/92/server/relsizes.log

unpatched
http://community.enterprisedb.com/oldestxmin/93/server/relsizes.log

The test was a success in that sense: the patch did reduce the steady-state
size of the stock table.

Maybe we would see a gain in transactions per minute or response times
if we traded off the smaller table size to run vacuum slightly less
frequently. But as I said, I don't want to spend time running more tests
for what seems like a small benefit.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Bruce Momjian wrote:
>> So are you stopping work on the patch?  I assume so.

> Yes, at least for now. I can't believe the patch actually hurts
> performance, but I'm not going to spend time investigating it.

Are we withdrawing the patch from consideration for 8.3 then?
I had assumed it was still a live candidate, but if it seems to
lose in pgbench maybe we had better set it aside.

            regards, tom lane

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Bruce Momjian wrote:
>>> So are you stopping work on the patch?  I assume so.
>
>> Yes, at least for now. I can't believe the patch actually hurts
>> performance, but I'm not going to spend time investigating it.
>
> Are we withdrawing the patch from consideration for 8.3 then?
> I had assumed it was still a live candidate, but if it seems to
> lose in pgbench maybe we had better set it aside.

I haven't tried pgbench, the tests I ran were with DBT-2.

Just to summarize again: the patch did help keep the stock table
smaller, but response times were higher with it.

Maybe we should keep this issue open until we resolve the vacuum WAL
flush issue? I can then rerun the same tests to see if this patch is a
win after that.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >> Bruce Momjian wrote:
> >>> So are you stopping work on the patch?  I assume so.
> >
> >> Yes, at least for now. I can't believe the patch actually hurts
> >> performance, but I'm not going to spend time investigating it.
> >
> > Are we withdrawing the patch from consideration for 8.3 then?
> > I had assumed it was still a live candidate, but if it seems to
> > lose in pgbench maybe we had better set it aside.
>
> I haven't tried pgbench, the tests I ran were with DBT-2.
>
> Just to summarize again: the patch did help to keep the stock table
> smaller, but the response times were higher with the patch.
>
> Maybe we should keep this issue open until we resolve the vacuum WAL
> flush issue? I can then rerun the same tests to see if this patch is a
> win after that.

Would you like to add a TODO item?

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Recalculating OldestXmin in a long-running vacuum

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Maybe we should keep this issue open until we resolve the vacuum WAL
> flush issue? I can then rerun the same tests to see if this patch is a
> win after that.

Sounds like a plan, if you are willing to do that.

            regards, tom lane

Re: Recalculating OldestXmin in a long-running vacuum

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Maybe we should keep this issue open until we resolve the vacuum WAL
>> flush issue? I can then rerun the same tests to see if this patch is a
>> win after that.
>
> Sounds like a plan, if you are willing to do that.

Sure, just rerunning the same tests isn't much work.

Bruce Momjian wrote:
>  Would you like to add a TODO item?

I don't know how we track things like this. Maybe add it to the end of the
patch queue, with a link to this discussion, so that we remember that it
needs more testing?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >> Maybe we should keep this issue open until we resolve the vacuum WAL
> >> flush issue? I can then rerun the same tests to see if this patch is a
> >> win after that.
> >
> > Sounds like a plan, if you are willing to do that.
>
> Sure, just rerunning the same tests isn't much work.
>
> Bruce Momjian wrote:
> >  Would you like to add a TODO item?
>
> I don't know how we track things like this. Maybe add to the end of the
> patch queue, with link to this discussion so that we remember that it
> needs more testing?

OK, I added this email to the patch queue.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Recalculating OldestXmin in a long-running vacuum

From
Bruce Momjian
Date:
This has been saved for the 8.4 release:

    http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >> Maybe we should keep this issue open until we resolve the vacuum WAL
> >> flush issue? I can then rerun the same tests to see if this patch is a
> >> win after that.
> >
> > Sounds like a plan, if you are willing to do that.
>
> Sure, just rerunning the same tests isn't much work.
>
> Bruce Momjian wrote:
> >  Would you like to add a TODO item?
>
> I don't know how we track things like this. Maybe add to the end of the
> patch queue, with link to this discussion so that we remember that it
> needs more testing?
>
> --
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +