Thread: rapid degradation after postmaster restart

rapid degradation after postmaster restart

From
Joe Conway
Date:
I'm trying to troubleshoot a performance issue on an application ported
from Oracle to postgres. Now, I know the best way to get help is to post
the schema, explain analyze output, etc, etc -- unfortunately I can't do
that at the moment. However, maybe someone can point me in the right
direction to figure this out on my own. That said, here are a few details...

PostgreSQL 7.4.1
bash-2.03$ uname -a
SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R

The problem is this: the application runs an insert, that fires off a
trigger, that cascades into a fairly complex series of functions, that
do a bunch of calculations, inserts, updates, and deletes. Immediately
after a postmaster restart, the first insert or two take about 1.5
minutes (undoubtedly this could be improved, but it isn't the main
issue). However by the second or third insert, the time increases to 7 -
9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
the first one or two inserts are back to the 1.5 minute range.

Any ideas spring to mind? I don't have much experience with Postgres on
Solaris -- could it be related to that somehow?

Thanks for any insights.

Joe

Re: rapid degradation after postmaster restart

From
Marty Scholes
Date:
Six days ago I installed Pg 7.4.1 on Sparc Solaris 8 also.  I am hopeful
that we as well can migrate a bunch of our apps from Oracle.

After doing some informal benchmarks and performance testing for the
past week I am becoming more and more impressed with what I see.

I have seen similar results to what you are describing.

I found that running a full vacuum:

vacuumdb -fza

followed by a checkpoint makes it run fast again.

Try timing the update with and without a full vacuum.
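
For reference, the sequence I've been using looks roughly like this
(just a sketch -- substitute your own database name, and remember that
a full vacuum takes an exclusive lock on each table while it works):

    vacuumdb -f -z -a                    # VACUUM FULL ANALYZE, all databases
    psql -d yourdb -c "CHECKPOINT;"      # force a checkpoint afterwards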

I can't help but wonder if a clean shutdown includes some vacuuming.

Obviously, in a production database this would be an issue.

Please post back what you learn.

Sincerely,
Marty

Joe Conway wrote:
> I'm trying to troubleshoot a performance issue on an application ported
> from Oracle to postgres. Now, I know the best way to get help is to post
> the schema, explain analyze output, etc, etc -- unfortunately I can't do
> that at the moment. However, maybe someone can point me in the right
> direction to figure this out on my own. That said, here are a few
> details...
>
> PostgreSQL 7.4.1
> bash-2.03$ uname -a
> SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R
>
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.
>
> Any ideas spring to mind? I don't have much experience with Postgres on
> Solaris -- could it be related to that somehow?
>
> Thanks for any insights.
>
> Joe
>



Re: rapid degradation after postmaster restart

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.

I realize this question might take some patience to answer, but what
does the performance curve look like beyond three trials?  Does it level
off or continue to get worse?  If it doesn't level off, does the
degradation seem linear in the number of trials, or worse than linear?

I have no ideas in mind, just trying to gather data ...

            regards, tom lane

Re: rapid degradation after postmaster restart

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> ... Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.

Just to be clear on this: you have to restart the postmaster to bring
the time back down?  Simply starting a fresh backend session doesn't do
it?

Are you using particularly large values for shared_buffers or any of the
other resource parameters?

            regards, tom lane

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Tom Lane wrote:
> I realize this question might take some patience to answer, but what
> does the performance curve look like beyond three trials?  Does it level
> off or continue to get worse?  If it doesn't level off, does the
> degradation seem linear in the number of trials, or worse than linear?

I'll try to gather some data over the weekend and report back.

Thanks,

Joe

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Tom Lane wrote:
> Just to be clear on this: you have to restart the postmaster to bring
> the time back down?  Simply starting a fresh backend session doesn't do
> it?

Yes, a full postmaster restart is needed. It is a command line script
that does the insert, so each one is a new backend.

> Are you using particularly large values for shared_buffers or any of the
> other resource parameters?

I'll have to look at this again (I have to vpn in to the company lan
which kills all my current connections) -- the server and application
belong to another department at my employer.

IIRC, shared buffers was reasonable, maybe 128MB. One thing that is
worthy of note is that they are using pg_autovacuum and a very low
vacuum_mem setting (1024). But I also believe that max_fsm_relations and
max_fsm_pages have been bumped up from default (something like 10000 &
200000).
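
When I'm back on the VPN I'll double-check with something along these
lines (just a sketch; in 7.4 pg_settings only exposes name and setting):

    SELECT name, setting
      FROM pg_settings
     WHERE name IN ('shared_buffers', 'vacuum_mem',
                    'max_fsm_relations', 'max_fsm_pages');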

I'll post the non-default postgresql.conf settings shortly. The extended
tests discussed in the nearby post will take a bit more time to get.

Thanks,

Joe


Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Marty Scholes wrote:
> I have seen similar results to what you are describing.
>
> I found that running a full vacuum:
>
> vacuumdb -fza
>
> followed by a checkpoint makes it run fast again.
>
> Try timing the update with and without a full vacuum.

Will do. I'll let you know how it goes.

Thanks for the reply.

Joe


Re: rapid degradation after postmaster restart

From
Josh Berkus
Date:
Joe,

> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is
> worthy of note is that they are using pg_autovacuum and a very low
> vacuum_mem setting (1024). But I also believe that max_fsm_relations and
> max_fsm_pages have been bumped up from default (something like 10000 &
> 200000).

pg_autovacuum may be your problem.   Imagine this:

1) The chain of updates and inserts called by the procedures makes enough
changes, on its own, to trigger pg_autovacuum.
2) Because they have a big database, and a low vacuum_mem, a vacuum of the
largest table takes noticeable time, like several minutes.
3) This means that the vacuum is still running during the second and
succeeding events ....

Something to check by watching the process list.
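
On the server that could be as simple as (sketch; current_query in
pg_stat_activity is only populated when stats_command_string is on):

    ps -ef | grep postgres       # look for a backend running VACUUM
    psql -d yourdb -c "SELECT procpid, current_query FROM pg_stat_activity;"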

FWIW, I don't use pg_autovacuum for databases which have frequent large batch
updates; I find it results in uneven performance.

Feel free to phone me if you're still stuck!

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: rapid degradation after postmaster restart

From
"Matthew T. O'Connor"
Date:
Joe Conway wrote:

> Tom Lane wrote:
>
>> Just to be clear on this: you have to restart the postmaster to bring
>> the time back down?  Simply starting a fresh backend session doesn't do
>> it?
>
>
> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is
> worthy of note is that they are using pg_autovacuum and a very low
> vacuum_mem setting (1024). But I also believe that max_fsm_relations
> and max_fsm_pages have been bumped up from default (something like
> 10000 & 200000).
>

pg_autovacuum could be a problem if it's vacuuming too often.  Have you
looked to see if a vacuum or analyze is running while the server is
slow?  If so, have you played with the pg_autovacuum default vacuum and
analyze thresholds?  If it appears that it is related to pg_autovacuum,
please send me the command options used to run it and a logfile of its
output running at a debug level of -d2.
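
For example, something along these lines (just a sketch; adjust the
connection options to match how you normally run it, and use whatever
log destination you like -- plain shell redirection works):

    pg_autovacuum -d2 > /tmp/pg_autovacuum.log 2>&1 &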


Matthew


Re: rapid degradation after postmaster restart

From
"Matthew T. O'Connor"
Date:
Joe Conway wrote:

> A few pg_autovacuum questions came out of this:
>
>   First, the default vacuum scaling factor is 2, which I think implies
>   the big table should only get vacuumed every 56 million or so changes.
>   I didn't come anywhere near that volume in my tests, yet the table did
>   get vacuumed more than once (I was watching the pg_autovacuum log
>   output). Do I misunderstand this setting?


I think you understand correctly.  A table with 1,000,000 rows should
get vacuumed approx every 2,000,000 changes (assuming default values for
-V).  FYI, an insert and a delete each count as one change, but an update
counts as two.
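
To spell the arithmetic out (treat the base value as my recollection of
the contrib default rather than gospel):

    threshold ~= base value + (-V scaling factor) * reltuples
              ~= 1000 + 2 * 1,000,000  ~=  2 million changes for a 1M-row table
              ~= 1000 + 2 * 28,000,000 ~= 56 million changes for a 28M-row table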

Unfortunately, running with -d2 would show the numbers that
pg_autovacuum is using to decide if and when it should vacuum or
analyze.  Also, are you sure that it vacuumed more than once and
wasn't doing analyzes most of the time?

Also, I'm not sure if 2 is a good default value for the scaling factor
but I erred on the side of not vacuuming too often.

>   Second, Matthew requested pg_autovacuum run with -d2; I found that
>   with -d2 set, pg_autovacuum would immediately exit on start. -d0 and
>   -d1 work fine however.


That's unfortunate, as that is the detail we need to see what
pg_autovacuum thinks is really going on.  We had a similar-sounding
crash on FreeBSD due to some uninitialized variables that were being
printed out by the debug code; however, that was fixed a long time ago.
Any chance you can look into this?

> That's all I can think of at the moment. I'd like to try the 7.4 patch
> that makes vacuum sleep every few pages -- can anyone point me to the
> latest and greatest that will apply to 7.4?


Yes I would be very curious to see the results with the vacuum delay
patch installed (is that patch applied to HEAD?)



Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Matthew T. O'Connor wrote:
> I think you understand correctly.  A table with 1,000,000 rows should
> get vacuumed approx every 2,000,000 changes (assuming default values for
> -V).  FYI, an insert and a delete each count as one change, but an update
> counts as two.
>
> Unfortunately, running with -d2 would show the numbers that
> pg_autovacuum is using to decide if and when it should vacuum or
> analyze.  Also, are you sure that it vacuumed more than once and
> wasn't doing analyzes most of the time?

Yeah, I'm sure. Snippets from the log:

[...lots-o-tables...]
[2004-03-14 12:44:48 PM] added table: specdb."public"."parametric_states"
[2004-03-14 12:49:48 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 01:29:59 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 02:08:26 PM] Performing: ANALYZE "public"."out_of_spec"
[2004-03-14 02:08:26 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 02:22:44 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE "public"."out_of_spec"
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 03:19:51 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 03:21:09 PM] Performing: ANALYZE "public"."parametric_states"
[2004-03-14 03:54:57 PM] Performing: ANALYZE "public"."out_of_spec"
[2004-03-14 03:54:57 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 04:07:52 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 04:09:33 PM] Performing: ANALYZE "public"."equip_status_history"
[2004-03-14 04:09:33 PM] Performing: VACUUM ANALYZE
"public"."parametric_states"
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE "public"."out_of_spec"
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE
"public"."transaction_data"
[2004-03-14 04:56:35 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 04:58:32 PM] Performing: ANALYZE "public"."parametric_states"
[2004-03-14 05:28:58 PM] added database: specdb

This is the entire period of the first test, with default autovac
settings. The table "public"."transaction_data" is the one with 28
million active rows. The entire test run inserts about 600 x 600 =
360,000 rows, out of which roughly two-thirds are later deleted.
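
Doing the rough math on that (using your accounting of one change per
insert or delete):

    360,000 inserts + ~240,000 deletes ~= 600,000 changes

which is nowhere near the ~56 million change threshold I'd expect for
transaction_data with the default -V of 2, so something does look off.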

> That's unfortunate, as that is the detail we need to see what
> pg_autovacuum thinks is really going on.  We had a similar-sounding
> crash on FreeBSD due to some uninitialized variables that were being
> printed out by the debug code; however, that was fixed a long time ago.
> Any chance you can look into this?

I can try. The server belongs to another department, and they are under
the gun to get back on track with their testing. Also, they compiled
without debug symbols, so I need to get permission to recompile.

> Yes I would be very curious to see the results with the vacuum delay
> patch installed (is that patch applied to HEAD?)

Any idea where I can get my hands on the latest version? I found the
original post from Tom, but I thought there was a later version with
both number of pages and time to sleep as knobs.

Thanks,

Joe

Re: rapid degradation after postmaster restart

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Any idea where I can get my hands on the latest version? I found the
> original post from Tom, but I thought there was a later version with
> both number of pages and time to sleep as knobs.

That was as far as I got.  I think Jan posted a more complex version
that would still be reasonable to apply to 7.4.

            regards, tom lane

Re: rapid degradation after postmaster restart

From
"Matthew T. O'Connor"
Date:
Joe Conway wrote:

> Yeah, I'm sure. Snippets from the log:
>
> [...lots-o-tables...]
> [2004-03-14 12:44:48 PM] added table: specdb."public"."parametric_states"
> [2004-03-14 12:49:48 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 01:29:59 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 02:08:26 PM] Performing: ANALYZE "public"."out_of_spec"
> [2004-03-14 02:08:26 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 02:22:44 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
> [2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE
> "public"."out_of_spec"
> [2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 03:19:51 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
> [2004-03-14 03:21:09 PM] Performing: ANALYZE "public"."parametric_states"
> [2004-03-14 03:54:57 PM] Performing: ANALYZE "public"."out_of_spec"
> [2004-03-14 03:54:57 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 04:07:52 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
> [2004-03-14 04:09:33 PM] Performing: ANALYZE
> "public"."equip_status_history"
> [2004-03-14 04:09:33 PM] Performing: VACUUM ANALYZE
> "public"."parametric_states"
> [2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE
> "public"."out_of_spec"
> [2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE
> "public"."transaction_data"
> [2004-03-14 04:56:35 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
> [2004-03-14 04:58:32 PM] Performing: ANALYZE "public"."parametric_states"
> [2004-03-14 05:28:58 PM] added database: specdb


Yeah, you're right.....

> This is the entire period of the first test, with default autovac
> settings. The table "public"."transaction_data" is the one with 28
> million active rows. The entire test run inserts about 600 x 600 =
> 360,000 rows, out of which roughly two-thirds are later deleted.


Strange... I wonder if this is some integer overflow problem.  There was
one reported recently and fixed as of CVS HEAD yesterday; you might try
that.  However, without the -d2 output I'm only guessing at why
pg_autovacuum is vacuuming so much / so often.

> I can try. The server belongs to another department, and they are
> under the gun to get back on track with their testing. Also, they
> compiled without debug symbols, so I need to get permission to recompile.


Good luck, I hope you can get permission.  Would be nice to fix this
little crash.

>> Yes I would be very curious to see the results with the vacuum delay
>> patch installed (is that patch applied to HEAD?)
>
>
> Any idea where I can get my hands on the latest version? I found the
> original post from Tom, but I thought there was a later version with
> both number of pages and time to sleep as knobs.


I think Jan posted one a while back....  [searches archives...]  But I
must say I'm at a loss to find it in the archives.  Anyone know where a
good delay patch is for 7.4?  If we can't find one, any chance you can
do some testing with CVS HEAD just to see if that works any better?  I
know there has been a fair amount of work done to improve this situation
(not just vacuum delay, but ARC etc.).

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>>Any idea where I can get my hands on the latest version? I found the
>>original post from Tom, but I thought there was a later version with
>>both number of pages and time to sleep as knobs.
>
> That was as far as I got.  I think Jan posted a more complex version
> that would still be reasonable to apply to 7.4.

I thought that too, but was having trouble finding it. I'll look again.

Thanks,

Joe


Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Matthew T. O'Connor wrote:
> Strange... I wonder if this is some integer overflow problem.  There was
> one reported recently and fixed as of CVS head yesterday, you might try
> that, however without the -d2 output I'm only guessing at why
> pg_autovacuum is vacuuming so much / so often.

I'll see what I can do tomorrow to track it down.

I have already recommended to the program manager that they switch to
7.4.2 plus the autovacuum patch. Not sure they will be willing to make
any changes at this stage in their release process though.

> If we can't find one, any chance you can
> do some testing with CVS HEAD just to see if that works any better?  I
> know there has been a fair amount of work done to improve this situation
> (not just vacuum delay, but ARC etc.)

I might do that, but not likely on Solaris. I can probably get a copy of
the current database and testing scripts, and give it a try on one of my
own machines (all Linux, either RHAS3, RH9, or Fedora).

Joe


Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>Any idea where I can get my hands on the latest version? I found the
>>original post from Tom, but I thought there was a later version with
>>both number of pages and time to sleep as knobs.
>
> That was as far as I got.  I think Jan posted a more complex version
> that would still be reasonable to apply to 7.4.

I have tested Tom's original patch now. The good news -- it works great
in terms of reducing the load imposed by vacuum -- almost to the level
of being unnoticeable. The bad news -- in a simulation test which loads
an hour's worth of data, even with delay set to 1 ms, vacuum of the
large table exceeds two hours (vs 12-14 minutes with delay = 0). Since
that hourly load is expected 7 x 24, this obviously isn't going to work.

The problem with Jan's more complex version of the patch (at least the
one I found - perhaps not the right one) is it includes a bunch of other
experimental stuff that I'd not want to mess with at the moment. Would
changing the input units (for the original patch) from milli-secs to
micro-secs be a bad idea? If so, I guess I'll get to extracting what I
need from Jan's patch.

Thanks,

Joe


Re: rapid degradation after postmaster restart

From
"Matthew T. O'Connor"
Date:
On Tue, 2004-03-16 at 23:49, Joe Conway wrote:
> I have tested Tom's original patch now. The good news -- it works great
> in terms of reducing the load imposed by vacuum -- almost to the level
> of being unnoticeable. The bad news -- in a simulation test which loads
> an hour's worth of data, even with delay set to 1 ms, vacuum of the
> large table exceeds two hours (vs 12-14 minutes with delay = 0). Since
> that hourly load is expected 7 x 24, this obviously isn't going to work.

If memory serves, the problem is that you actually sleep 10ms even when
you set it to 1.  One of the things changed in Jan's later patch was the
ability to specify how many pages to work on before sleeping, rather
than how long to sleep in between every single page.  You might be able
to do a quick hack and have it do 10 pages or so before sleeping.
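
Rough arithmetic on why the 10ms floor hurts so much (back-of-the-envelope
only, assuming the kernel rounds every 1 ms request up to a 10 ms tick):

    ~2 hours observed - ~13 minutes of actual vacuum work ~= 107 minutes asleep
    107 minutes / 10 ms per sleep ~= 640,000 sleeps, i.e. one per page visited

Sleeping once per 10 pages instead should cut that overhead to roughly
ten minutes.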

Matthew


Re: rapid degradation after postmaster restart

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> I have tested Tom's original patch now. The good news -- it works great
> in terms of reducing the load imposed by vacuum -- almost to the level
> of being unnoticeable. The bad news -- in a simulation test which loads
> an hour's worth of data, even with delay set to 1 ms, vacuum of the
> large table exceeds two hours (vs 12-14 minutes with delay = 0). Since
> that hourly load is expected 7 x 24, this obviously isn't going to work.

Turns the dial down a bit too far then ...

> The problem with Jan's more complex version of the patch (at least the
> one I found - perhaps not the right one) is it includes a bunch of other
> experimental stuff that I'd not want to mess with at the moment. Would
> changing the input units (for the original patch) from milli-secs to
> micro-secs be a bad idea?

Unlikely to be helpful; on most kernels the minimum sleep delay is 1 or
10 msec, so asking for a few microsec is the same as asking for some
millisec.  I think what you need is a knob of the form "sleep N msec
after each M pages of I/O".  I'm almost certain that Jan posted such a
patch somewhere between my original and the version you refer to above.

            regards, tom lane

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Matthew T. O'Connor wrote:
> If memory serves, the problem is that you actually sleep 10ms even when
> you set it to 1.  One of the things changed in Jan's later patch was the
> ability to specify how many pages to work on before sleeping, rather
> than how long to sleep in between every single page.  You might be able
> to do a quick hack and have it do 10 pages or so before sleeping.

I thought I remembered something about that.

It turned out to be less difficult than I first thought to extract the
vacuum delay stuff from Jan's performance patch. I haven't yet tried it
out, but it's attached in case you are interested. I'll report back once
I have some results.

Joe
Index: src/backend/access/nbtree/nbtree.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.106
diff -c -b -r1.106 nbtree.c
*** src/backend/access/nbtree/nbtree.c    2003/09/29 23:40:26    1.106
--- src/backend/access/nbtree/nbtree.c    2003/11/03 17:56:54
***************
*** 18,23 ****
--- 18,25 ----
   */
  #include "postgres.h"

+ #include <unistd.h>
+
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/nbtree.h"
***************
*** 27,32 ****
--- 29,39 ----
  #include "storage/smgr.h"


+ extern int    vacuum_page_delay;
+ extern int    vacuum_page_groupsize;
+ extern int    vacuum_page_groupcount;
+
+
  /* Working state for btbuild and its callback */
  typedef struct
  {
***************
*** 610,615 ****
--- 617,631 ----

              CHECK_FOR_INTERRUPTS();

+             if (vacuum_page_delay > 0)
+             {
+                 if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+                 {
+                     vacuum_page_groupcount = 0;
+                     usleep(vacuum_page_delay * 1000);
+                 }
+             }
+
              ndeletable = 0;
              page = BufferGetPage(buf);
              opaque = (BTPageOpaque) PageGetSpecialPointer(page);
***************
*** 736,741 ****
--- 752,768 ----
          Buffer        buf;
          Page        page;
          BTPageOpaque opaque;
+
+         CHECK_FOR_INTERRUPTS();
+
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }

          buf = _bt_getbuf(rel, blkno, BT_READ);
          page = BufferGetPage(buf);
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.32
diff -c -b -r1.32 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c    2003/09/25 06:57:59    1.32
--- src/backend/commands/vacuumlazy.c    2003/11/03 17:57:27
***************
*** 37,42 ****
--- 37,44 ----
   */
  #include "postgres.h"

+ #include <unistd.h>
+
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/xlog.h"
***************
*** 88,93 ****
--- 90,99 ----
  static TransactionId OldestXmin;
  static TransactionId FreezeLimit;

+ int        vacuum_page_delay = 0;        /* milliseconds per page group */
+ int        vacuum_page_groupsize = 10;    /* group size */
+ int        vacuum_page_groupcount = 0;    /* current group size count */
+

  /* non-export function prototypes */
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
***************
*** 228,233 ****
--- 234,248 ----

          CHECK_FOR_INTERRUPTS();

+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
+
          /*
           * If we are close to overrunning the available space for
           * dead-tuple TIDs, pause and do a cycle of vacuuming before we
***************
*** 469,474 ****
--- 484,498 ----

          CHECK_FOR_INTERRUPTS();

+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
+
          tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
          buf = ReadBuffer(onerel, tblk);
          LockBufferForCleanup(buf);
***************
*** 799,804 ****
--- 823,837 ----
                      hastup;

          CHECK_FOR_INTERRUPTS();
+
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }

          blkno--;

Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/utils/misc/guc.c,v
retrieving revision 1.164
diff -c -b -r1.164 guc.c
*** src/backend/utils/misc/guc.c    2003/10/18 22:59:09    1.164
--- src/backend/utils/misc/guc.c    2003/11/03 21:10:52
***************
*** 70,78 ****
--- 70,80 ----
  extern int    PreAuthDelay;
  extern int    AuthenticationTimeout;
  extern int    CheckPointTimeout;
  extern int    CommitDelay;
  extern int    CommitSiblings;
  extern char *preload_libraries_string;
+ extern int    vacuum_page_delay;
+ extern int    vacuum_page_groupsize;

  #ifdef HAVE_SYSLOG
  extern char *Syslog_facility;
***************
*** 1188,1193 ****
--- 1199,1222 ----
          },
          &log_min_duration_statement,
          -1, -1, INT_MAX / 1000, NULL, NULL
+     },
+
+     {
+         {"vacuum_page_delay", PGC_USERSET, CLIENT_CONN_STATEMENT,
+             gettext_noop("Sets VACUUM's delay in milliseconds between processing successive pages."),
+             NULL
+         },
+         &vacuum_page_delay,
+         0, 0, 100, NULL, NULL
+     },
+
+     {
+         {"vacuum_page_groupsize", PGC_USERSET, CLIENT_CONN_STATEMENT,
+             gettext_noop("Sets VACUUM's delay group size."),
+             NULL
+         },
+         &vacuum_page_groupsize,
+         10, 1, 1000, NULL, NULL
      },

      /* End-of-list marker */

Re: rapid degradation after postmaster restart

From
"Arthur Ward"
Date:
> The problem with Jan's more complex version of the patch (at least the
> one I found - perhaps not the right one) is it includes a bunch of other
> experimental stuff that I'd not want to mess with at the moment. Would
> changing the input units (for the original patch) from milli-secs to
> micro-secs be a bad idea? If so, I guess I'll get to extracting what I
> need from Jan's patch.

Jan's vacuum-delay-only patch that nobody can find is here:

http://archives.postgresql.org/pgsql-hackers/2003-11/msg00518.php

I've been using it in testing & production without any problems.

Re: rapid degradation after postmaster restart

From
Andrew Sullivan
Date:
Sorry I haven't had a chance to reply to this sooner.

On Fri, Mar 12, 2004 at 05:38:37PM -0800, Joe Conway wrote:
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.

The vacuum delay stuff that you're working on may help, but I can't
really believe it's your salvation if this is happening after only a
few minutes.  No matter how much you're doing inside those functions,
you surely can't be causing so many dead tuples that a vacuum is
necessary that soon.  Did you try not vacuuming for a little while to
see if it helps?

I didn't see it anywhere in this thread, but are you quite sure that
you're not swapping?  Note that vmstat on multiprocessor Solaris
machines is notoriously not useful.  You may want to have a look at
what the example stuff in the SE Toolkit tells you, or what you get
from sar.  I believe you have to use a special kernel setting on
Solaris to mark shared memory as being ineligible for swap.
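
On Solaris 8 I'd look at something along these lines rather than plain
vmstat (sketch; flags from memory, so double-check the man pages):

    vmstat -S 5 5        # si/so columns show swap-ins/swap-outs
    sar -g 5 5           # page-out and page-scan rates
    sar -w 5 5           # swap-in/swap-out activity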

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.
        --Dennis Ritchie

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Andrew Sullivan wrote:
> Sorry I haven't had a chance to reply to this sooner.

> The vacuum delay stuff that you're working on may help, but I can't
> really believe it's your salvation if this is happening after only a
> few minutes.  No matter how much you're doing inside those functions,
> you surely can't be causing so many dead tuples that a vacuum is
> necessary that soon.  Did you try not vacuuming for a little while to
> see if it helps?

I discussed it later in the thread, but we're adding about 400K rows per
hour and deleting most of them after processing (note this is a
commercial app, written and maintained by another department -- I can
recommend changes, but this late into their release cycle they are very
reluctant to change the app). This is 7 x 24 data collection from
equipment, so there is no "slow" time to use as a maintenance window.

But since the server in question is a test machine, I was able to shut
everything off long enough to do a full vacuum -- it took about 12 hours.

> I didn't see it anywhere in this thread, but are you quite sure that
> you're not swapping?  Note that vmstat on multiprocessor Solaris
> machines is notoriously not useful.  You may want to have a look at
> what the example stuff in the SE Toolkit tells you, or what you get
> from sar.  I believe you have to use a special kernel setting on
> Solaris to mark shared memory as being ineligible for swap.

I'm (reasonably) sure there is no swapping. Minimum free memory (from
top) is about 800 MB, and "vmstat -S" shows no swap-in or swap-out.

I've been playing with a version of Jan's performance patch in the past
few hours. Based on my simulations, it appears that a 1 ms delay every
10 pages is just about right. The performance hit is negligible (based
on overall test time, and cpu % used by the vacuum process). I still
have a bit more analysis to do, but this is looking pretty good. More
later...
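
With the patch posted earlier in the thread, that corresponds to
settings along these lines in postgresql.conf (names and defaults as
defined in that patch):

    vacuum_page_delay = 1        # ms slept after each group of pages
    vacuum_page_groupsize = 10   # pages scanned between sleeps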

Joe

Re: rapid degradation after postmaster restart

From
Joe Conway
Date:
Arthur Ward wrote:
> Jan's vacuum-delay-only patch that nobody can find is here:
>
> http://archives.postgresql.org/pgsql-hackers/2003-11/msg00518.php
>
> I've been using it in testing & production without any problems.

Great to know -- many thanks.

I've hacked my own vacuum-delay-only patch from Jan's all_performance
patch. It looks like the only difference is that it uses usleep()
instead of select(). So far the tests look promising.

Thanks,

Joe


Re: rapid degradation after postmaster restart

From
"Matthew T. O'Connor"
Date:
Andrew Sullivan wrote:

>The vacuum delay stuff that you're working on may help, but I can't
>really believe it's your salvation if this is happening after only a
>few minutes.  No matter how much you're doing inside those functions,
>you surely can't be causing so many dead tuples that a vacuum is
>necessary that soon.  Did you try not vacuuming for a little while to
>see if it helps?
>
>

Some of this thread was taken off-list, so I'm not sure it was mentioned
here, but a big part of the problem was that Joe was running into the
same bug that Cott Lang ran into a while ago, which caused the vacuum
threshold to get set far too low, resulting in vacuums far too often.
This has been fixed and the patch has been committed; unfortunately it
didn't make it into 7.4.2, but it will be in 7.4.3 / 7.5.

>I didn't see it anywhere in this thread, but are you quite sure that
>you're not swapping?  Note that vmstat on multiprocessor Solaris
>machines is notoriously not useful.  You may want to have a look at
>what the example stuff in the SE Toolkit tells you, or what you get
>from sar.  I believe you have to use a special kernel setting on
>Solaris to mark shared memory as being ineligible for swap.
>
>

I haven't heard from Joe how things are going with the fixed
pg_autovacuum but that in combination with the vacuum delay stuff should
work well.

Matthew