Thread: rapid degradation after postmaster restart
I'm trying to troubleshoot a performance issue on an application ported
from Oracle to Postgres. Now, I know the best way to get help is to post
the schema, EXPLAIN ANALYZE output, etc., etc. -- unfortunately I can't
do that at the moment. However, maybe someone can point me in the right
direction to figure this out on my own. That said, here are a few
details...

PostgreSQL 7.4.1

bash-2.03$ uname -a
SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R

The problem is this: the application runs an insert that fires off a
trigger, which cascades into a fairly complex series of functions that
do a bunch of calculations, inserts, updates, and deletes. Immediately
after a postmaster restart, the first insert or two take about 1.5
minutes (undoubtedly this could be improved, but it isn't the main
issue). However, by the second or third insert, the time increases to
7 - 9 minutes. Restarting the postmaster causes the cycle to repeat,
i.e. the first one or two inserts are back to the 1.5 minute range.

Any ideas spring to mind? I don't have much experience with Postgres on
Solaris -- could it be related to that somehow?

Thanks for any insights.

Joe
Six days ago I installed Pg 7.4.1 on Sparc Solaris 8 also. I am hopeful
that we as well can migrate a bunch of our apps from Oracle. After
doing some informal benchmarks and performance testing for the past
week, I am becoming more and more impressed with what I see.

I have seen similar results to what you are describing.

I found that running a full vacuum:

    vacuumdb -fza

followed by a checkpoint makes it run fast again.

Try timing the update with and without a full vacuum. I can't help but
wonder if a clean shutdown includes some vacuuming. Obviously, in a
production database this would be an issue.

Please post back what you learn.

Sincerely,
Marty

Joe Conway wrote:
> I'm trying to troubleshoot a performance issue on an application ported
> from Oracle to postgres. Now, I know the best way to get help is to post
> the schema, explain analyze output, etc, etc -- unfortunately I can't do
> that at the moment. However, maybe someone can point me in the right
> direction to figure this out on my own. That said, here are a few
> details...
>
> PostgreSQL 7.4.1
> bash-2.03$ uname -a
> SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R
>
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.
>
> Any ideas spring to mind? I don't have much experience with Postgres on
> Solaris -- could it be related to that somehow?
>
> Thanks for any insights.
>
> Joe
Joe Conway <mail@joeconway.com> writes:
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. Immediately
> after a postmaster restart, the first insert or two take about 1.5
> minutes (undoubtedly this could be improved, but it isn't the main
> issue). However by the second or third insert, the time increases to 7 -
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e.
> the first one or two inserts are back to the 1.5 minute range.

I realize this question might take some patience to answer, but what
does the performance curve look like beyond three trials? Does it level
off or continue to get worse? If it doesn't level off, does the
degradation seem linear in the number of trials, or worse than linear?

I have no ideas in mind, just trying to gather data ...

			regards, tom lane
Joe Conway <mail@joeconway.com> writes:
> ... Immediately after a postmaster restart, the first insert or two
> take about 1.5 minutes (undoubtedly this could be improved, but it
> isn't the main issue). However by the second or third insert, the time
> increases to 7 - 9 minutes. Restarting the postmaster causes the cycle
> to repeat, i.e. the first one or two inserts are back to the 1.5
> minute range.

Just to be clear on this: you have to restart the postmaster to bring
the time back down? Simply starting a fresh backend session doesn't do
it?

Are you using particularly large values for shared_buffers or any of
the other resource parameters?

			regards, tom lane
Tom Lane wrote:
> I realize this question might take some patience to answer, but what
> does the performance curve look like beyond three trials? Does it level
> off or continue to get worse? If it doesn't level off, does the
> degradation seem linear in the number of trials, or worse than linear?

I'll try to gather some data over the weekend and report back.

Thanks,

Joe
Tom Lane wrote:
> Just to be clear on this: you have to restart the postmaster to bring
> the time back down? Simply starting a fresh backend session doesn't do
> it?

Yes, a full postmaster restart is needed. It is a command line script
that does the insert, so each one is a new backend.

> Are you using particularly large values for shared_buffers or any of
> the other resource parameters?

I'll have to look at this again (I have to VPN in to the company LAN,
which kills all my current connections) -- the server and application
belong to another department at my employer. IIRC, shared buffers was
reasonable, maybe 128MB. One thing that is worthy of note is that they
are using pg_autovacuum and a very low vacuum_mem setting (1024). But I
also believe that max_fsm_relations and max_fsm_pages have been bumped
up from default (something like 10000 & 200000).

I'll post the non-default postgresql.conf settings shortly. The
extended tests discussed in the nearby post will take a bit more time
to get.

Thanks,

Joe
Marty Scholes wrote:
> I have seen similar results to what you are describing.
>
> I found that running a full vacuum:
>
> vacuumdb -fza
>
> followed by a checkpoint makes it run fast again.
>
> Try timing the update with and without a full vacuum.

Will do. I'll let you know how it goes. Thanks for the reply.

Joe
Joe,

> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is
> worthy of note is that they are using pg_autovacuum and a very low
> vacuum_mem setting (1024). But I also believe that max_fsm_relations and
> max_fsm_pages have been bumped up from default (something like 10000 &
> 200000).

pg_autovacuum may be your problem. Imagine this:

1) The chain of updates and inserts called by the procedures makes
   enough changes, on its own, to trigger pg_autovacuum.

2) Because they have a big database and a low vacuum_mem, a vacuum of
   the largest table takes noticeable time, like several minutes.

3) This means that the vacuum is still running during the second and
   succeeding events ...

Something to check by watching the process list.

FWIW, I don't use pg_autovacuum for databases which have frequent large
batch updates; I find it results in uneven performance.

Feel free to phone me if you're still stuck!

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
Joe Conway wrote:
> Tom Lane wrote:
>
>> Just to be clear on this: you have to restart the postmaster to bring
>> the time back down? Simply starting a fresh backend session doesn't do
>> it?
>
> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is
> worthy of note is that they are using pg_autovacuum and a very low
> vacuum_mem setting (1024). But I also believe that max_fsm_relations
> and max_fsm_pages have been bumped up from default (something like
> 10000 & 200000).

pg_autovacuum could be a problem if it's vacuuming too often. Have you
looked to see if a vacuum or analyze is running while the server is
slow? If so, have you played with the pg_autovacuum default vacuum and
analyze thresholds?

If it appears that it is related to pg_autovacuum, please send me the
command options used to run it and a logfile of its output running at a
debug level of -d2.

Matthew
Joe Conway wrote:
> A few pg_autovacuum questions came out of this:
>
> First, the default vacuum scaling factor is 2, which I think implies
> the big table should only get vacuumed every 56 million or so changes.
> I didn't come anywhere near that volume in my tests, yet the table did
> get vacuumed more than once (I was watching the pg_autovacuum log
> output). Do I misunderstand this setting?

I think you understand correctly. A table with 1,000,000 rows should
get vacuumed approx every 2,000,000 changes (assuming default values
for -V). FYI, an insert and a delete each count as one change, but an
update counts as two.

Unfortunately, running with -d2 would show the numbers that
pg_autovacuum is using to decide when it should vacuum or analyze.
Also, are you sure that it vacuumed more than once and wasn't doing
analyzes most of the time?

Also, I'm not sure if 2 is a good default value for the scaling factor,
but I erred on the side of not vacuuming too often.

> Second, Matthew requested pg_autovacuum run with -d2; I found that
> with -d2 set, pg_autovacuum would immediately exit on start. -d0 and
> -d1 work fine however.

That's unfortunate, as that is the detail we need to see what
pg_autovacuum thinks is really going on. We had a similar sounding
crash on FreeBSD due to some uninitialized variables that were being
printed out by the debug code, however that was fixed a long time ago.
Any chance you can look into this?

> That's all I can think of at the moment. I'd like to try the 7.4 patch
> that makes vacuum sleep every few pages -- can anyone point me to the
> latest and greatest that will apply to 7.4?

Yes, I would be very curious to see the results with the vacuum delay
patch installed (is that patch applied to HEAD?)
Matthew T. O'Connor wrote:
> I think you understand correctly. A table with 1,000,000 rows should
> get vacuumed approx every 2,000,000 changes (assuming default values
> for -V). FYI, an insert and a delete each count as one change, but an
> update counts as two.
>
> Unfortunately, running with -d2 would show the numbers that
> pg_autovacuum is using to decide when it should vacuum or analyze.
> Also, are you sure that it vacuumed more than once and wasn't doing
> analyzes most of the time?

Yeah, I'm sure. Snippets from the log:

[...lots-o-tables...]
[2004-03-14 12:44:48 PM] added table: specdb."public"."parametric_states"
[2004-03-14 12:49:48 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 01:29:59 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 02:08:26 PM] Performing: ANALYZE "public"."out_of_spec"
[2004-03-14 02:08:26 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 02:22:44 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE "public"."out_of_spec"
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 03:19:51 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 03:21:09 PM] Performing: ANALYZE "public"."parametric_states"
[2004-03-14 03:54:57 PM] Performing: ANALYZE "public"."out_of_spec"
[2004-03-14 03:54:57 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 04:07:52 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 04:09:33 PM] Performing: ANALYZE "public"."equip_status_history"
[2004-03-14 04:09:33 PM] Performing: VACUUM ANALYZE "public"."parametric_states"
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE "public"."out_of_spec"
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
[2004-03-14 04:56:35 PM] Performing: VACUUM ANALYZE "public"."spc_graphs"
[2004-03-14 04:58:32 PM] Performing: ANALYZE "public"."parametric_states"
[2004-03-14 05:28:58 PM] added database: specdb

This is the entire period of the first test, with default autovac
settings. The table "public"."transaction_data" is the one with 28
million active rows. The entire test run inserts about 600 x 600 =
360,000 rows, out of which roughly two-thirds are later deleted.

> That's unfortunate as that is the detail we need to see what
> pg_autovacuum thinks is really going on. We had a similar sounding
> crash on FreeBSD due to some uninitialized variables that were being
> printed out by the debug code, however that was fixed a long time ago.
> Any chance you can look into this?

I can try. The server belongs to another department, and they are under
the gun to get back on track with their testing. Also, they compiled
without debug symbols, so I need to get permission to recompile.

> Yes I would be very curious to see the results with the vacuum delay
> patch installed (is that patch applied to HEAD?)

Any idea where I can get my hands on the latest version? I found the
original post from Tom, but I thought there was a later version with
both number of pages and time to sleep as knobs.

Thanks,

Joe
Joe Conway <mail@joeconway.com> writes:
> Any idea where I can get my hands on the latest version? I found the
> original post from Tom, but I thought there was a later version with
> both number of pages and time to sleep as knobs.

That was as far as I got. I think Jan posted a more complex version
that would still be reasonable to apply to 7.4.

			regards, tom lane
Joe Conway wrote:
> Yeah, I'm sure. Snippets from the log:
>
> [...lots-o-tables...]
> [2004-03-14 12:44:48 PM] added table: specdb."public"."parametric_states"
> [2004-03-14 12:49:48 PM] Performing: VACUUM ANALYZE "public"."transaction_data"
> [...]
> [2004-03-14 05:28:58 PM] added database: specdb

Yeah, you're right...

> This is the entire period of the first test, with default autovac
> settings. The table "public"."transaction_data" is the one with 28
> million active rows. The entire test run inserts about 600 x 600 =
> 360,000 rows, out of which roughly two-thirds are later deleted.

Strange... I wonder if this is some integer overflow problem. There was
one reported recently and fixed as of CVS head yesterday; you might try
that. However, without the -d2 output I'm only guessing at why
pg_autovacuum is vacuuming so much / so often.

> I can try. The server belongs to another department, and they are
> under the gun to get back on track with their testing. Also, they
> compiled without debug symbols, so I need to get permission to
> recompile.

Good luck, I hope you can get permission. Would be nice to fix this
little crash.

>> Yes I would be very curious to see the results with the vacuum delay
>> patch installed (is that patch applied to HEAD?)
>
> Any idea where I can get my hands on the latest version? I found the
> original post from Tom, but I thought there was a later version with
> both number of pages and time to sleep as knobs.

I think Jan posted one a while back.... [searches archives...] But I
must say I'm at a loss to find it in the archives. Anyone know where a
good delay patch is for 7.4? If we can't find one, any chance you can
do some testing with CVS HEAD just to see if that works any better? I
know there has been a fair amount of work done to improve this
situation (not just vacuum delay, but ARC etc...).
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> Any idea where I can get my hands on the latest version? I found the
>> original post from Tom, but I thought there was a later version with
>> both number of pages and time to sleep as knobs.
>
> That was as far as I got. I think Jan posted a more complex version
> that would still be reasonable to apply to 7.4.

I thought that too, but was having trouble finding it. I'll look again.

Thanks,

Joe
Matthew T. O'Connor wrote:
> Strange... I wonder if this is some integer overflow problem. There
> was one reported recently and fixed as of CVS head yesterday, you
> might try that, however without the -d2 output I'm only guessing at
> why pg_autovacuum is vacuuming so much / so often.

I'll see what I can do tomorrow to track it down. I have already
recommended to the program manager that they switch to 7.4.2 plus the
autovacuum patch. Not sure they will be willing to make any changes at
this stage in their release process though.

> If we can't find one, any chance you can do some testing with CVS HEAD
> just to see if that works any better. I know there has been a fair
> amount of work done to improve this situation (not just vacuum delay,
> but ARC etc...)

I might do that, but not likely on Solaris. I can probably get a copy
of the current database and testing scripts, and give it a try on one
of my own machines (all Linux, either RHAS3, RH9, or Fedora).

Joe
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> Any idea where I can get my hands on the latest version? I found the
>> original post from Tom, but I thought there was a later version with
>> both number of pages and time to sleep as knobs.
>
> That was as far as I got. I think Jan posted a more complex version
> that would still be reasonable to apply to 7.4.

I have tested Tom's original patch now. The good news -- it works great
in terms of reducing the load imposed by vacuum, almost to the level of
being unnoticeable. The bad news -- in a simulation test which loads an
hour's worth of data, even with delay set to 1 ms, vacuum of the large
table exceeds two hours (vs 12-14 minutes with delay = 0). Since that
hourly load is expected 7 x 24, this obviously isn't going to work.

The problem with Jan's more complex version of the patch (at least the
one I found -- perhaps not the right one) is that it includes a bunch
of other experimental stuff that I'd not want to mess with at the
moment. Would changing the input units (for the original patch) from
milliseconds to microseconds be a bad idea? If so, I guess I'll get to
extracting what I need from Jan's patch.

Thanks,

Joe
On Tue, 2004-03-16 at 23:49, Joe Conway wrote:
> I have tested Tom's original patch now. The good news -- it works
> great in terms of reducing the load imposed by vacuum -- almost to the
> level of being unnoticeable. The bad news -- in a simulation test
> which loads an hour's worth of data, even with delay set to 1 ms,
> vacuum of the large table exceeds two hours (vs 12-14 minutes with
> delay = 0). Since that hourly load is expected 7 x 24, this obviously
> isn't going to work.

If memory serves, the problem is that you actually sleep 10 ms even
when you set it to 1. One of the things changed in Jan's later patch
was the ability to specify how many pages to work on before sleeping,
rather than how long to sleep in between every page. You might be able
to do a quick hack and have it do 10 pages or so before sleeping.

Matthew
Joe Conway <mail@joeconway.com> writes:
> I have tested Tom's original patch now. The good news -- it works
> great in terms of reducing the load imposed by vacuum -- almost to the
> level of being unnoticeable. The bad news -- in a simulation test
> which loads an hour's worth of data, even with delay set to 1 ms,
> vacuum of the large table exceeds two hours (vs 12-14 minutes with
> delay = 0). Since that hourly load is expected 7 x 24, this obviously
> isn't going to work.

That turns the dial down a bit too far, then ...

> The problem with Jan's more complex version of the patch (at least the
> one I found - perhaps not the right one) is it includes a bunch of
> other experimental stuff that I'd not want to mess with at the moment.
> Would changing the input units (for the original patch) from
> milli-secs to micro-secs be a bad idea?

Unlikely to be helpful; on most kernels the minimum sleep delay is 1 or
10 msec, so asking for a few microseconds is the same as asking for
some milliseconds. I think what you need is a knob of the form "sleep N
msec after each M pages of I/O". I'm almost certain that Jan posted
such a patch somewhere between my original and the version you refer to
above.

			regards, tom lane
Matthew T. O'Connor wrote:
> If memory serves, the problem is that you actually sleep 10ms even
> when you set it to 1. One of the things changed in Jan's later patch
> was the ability to specify how many pages to work on before sleeping,
> rather than how long to sleep in between every page. You might be
> able to do a quick hack and have it do 10 pages or so before sleeping.

I thought I remembered something about that. It turned out to be less
difficult than I first thought to extract the vacuum delay stuff from
Jan's performance patch. I haven't yet tried it out, but it's attached
in case you are interested. I'll report back once I have some results.

Joe

Index: src/backend/access/nbtree/nbtree.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.106
diff -c -b -r1.106 nbtree.c
*** src/backend/access/nbtree/nbtree.c	2003/09/29 23:40:26	1.106
--- src/backend/access/nbtree/nbtree.c	2003/11/03 17:56:54
***************
*** 18,23 ****
--- 18,25 ----
   */
  #include "postgres.h"
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/nbtree.h"
***************
*** 27,32 ****
--- 29,39 ----
  #include "storage/smgr.h"
+ extern int vacuum_page_delay;
+ extern int vacuum_page_groupsize;
+ extern int vacuum_page_groupcount;
+ 
+ 
  /* Working state for btbuild and its callback */
  typedef struct
  {
***************
*** 610,615 ****
--- 617,631 ----
  		CHECK_FOR_INTERRUPTS();
+ 		if (vacuum_page_delay > 0)
+ 		{
+ 			if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 			{
+ 				vacuum_page_groupcount = 0;
+ 				usleep(vacuum_page_delay * 1000);
+ 			}
+ 		}
+ 
  		ndeletable = 0;
  		page = BufferGetPage(buf);
  		opaque = (BTPageOpaque) PageGetSpecialPointer(page);
***************
*** 736,741 ****
--- 752,768 ----
  		Buffer		buf;
  		Page		page;
  		BTPageOpaque opaque;
+ 
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		if (vacuum_page_delay > 0)
+ 		{
+ 			if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 			{
+ 				vacuum_page_groupcount = 0;
+ 				usleep(vacuum_page_delay * 1000);
+ 			}
+ 		}
  		buf = _bt_getbuf(rel, blkno, BT_READ);
  		page = BufferGetPage(buf);
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.32
diff -c -b -r1.32 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c	2003/09/25 06:57:59	1.32
--- src/backend/commands/vacuumlazy.c	2003/11/03 17:57:27
***************
*** 37,42 ****
--- 37,44 ----
   */
  #include "postgres.h"
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/xlog.h"
***************
*** 88,93 ****
--- 90,99 ----
  static TransactionId OldestXmin;
  static TransactionId FreezeLimit;
+ int	vacuum_page_delay = 0;		/* milliseconds per page group */
+ int	vacuum_page_groupsize = 10;	/* group size */
+ int	vacuum_page_groupcount = 0;	/* current group size count */
+ 
  /* non-export function prototypes */
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
***************
*** 228,233 ****
--- 234,248 ----
  		CHECK_FOR_INTERRUPTS();
+ 		if (vacuum_page_delay > 0)
+ 		{
+ 			if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 			{
+ 				vacuum_page_groupcount = 0;
+ 				usleep(vacuum_page_delay * 1000);
+ 			}
+ 		}
+ 
  		/*
  		 * If we are close to overrunning the available space for
  		 * dead-tuple TIDs, pause and do a cycle of vacuuming before we
***************
*** 469,474 ****
--- 484,498 ----
  		CHECK_FOR_INTERRUPTS();
+ 		if (vacuum_page_delay > 0)
+ 		{
+ 			if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 			{
+ 				vacuum_page_groupcount = 0;
+ 				usleep(vacuum_page_delay * 1000);
+ 			}
+ 		}
+ 
  		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
  		buf = ReadBuffer(onerel, tblk);
  		LockBufferForCleanup(buf);
***************
*** 799,804 ****
--- 823,837 ----
  				hastup;
  
  		CHECK_FOR_INTERRUPTS();
+ 
+ 		if (vacuum_page_delay > 0)
+ 		{
+ 			if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 			{
+ 				vacuum_page_groupcount = 0;
+ 				usleep(vacuum_page_delay * 1000);
+ 			}
+ 		}
  		blkno--;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/utils/misc/guc.c,v
retrieving revision 1.164
diff -c -b -r1.164 guc.c
*** src/backend/utils/misc/guc.c	2003/10/18 22:59:09	1.164
--- src/backend/utils/misc/guc.c	2003/11/03 21:10:52
***************
*** 70,78 ****
--- 70,80 ----
  extern int	PreAuthDelay;
  extern int	AuthenticationTimeout;
  extern int	CheckPointTimeout;
  extern int	CommitDelay;
  extern int	CommitSiblings;
  extern char *preload_libraries_string;
+ extern int	vacuum_page_delay;
+ extern int	vacuum_page_groupsize;
  
  #ifdef HAVE_SYSLOG
  extern char *Syslog_facility;
***************
*** 1188,1193 ****
--- 1199,1222 ----
  	},
  	&log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL
+ 	},
+ 
+ 	{
+ 		{"vacuum_page_delay", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets VACUUM's delay in milliseconds between processing successive pages."),
+ 			NULL
+ 		},
+ 		&vacuum_page_delay,
+ 		0, 0, 100, NULL, NULL
+ 	},
+ 
+ 	{
+ 		{"vacuum_page_groupsize", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets VACUUM's delay group size."),
+ 			NULL
+ 		},
+ 		&vacuum_page_groupsize,
+ 		10, 1, 1000, NULL, NULL
  	},
  
  	/* End-of-list marker */
> The problem with Jan's more complex version of the patch (at least the
> one I found - perhaps not the right one) is it includes a bunch of
> other experimental stuff that I'd not want to mess with at the moment.
> Would changing the input units (for the original patch) from
> milli-secs to micro-secs be a bad idea? If so, I guess I'll get to
> extracting what I need from Jan's patch.

Jan's vacuum-delay-only patch that nobody can find is here:

http://archives.postgresql.org/pgsql-hackers/2003-11/msg00518.php

I've been using it in testing & production without any problems.
Sorry I haven't had a chance to reply to this sooner.

On Fri, Mar 12, 2004 at 05:38:37PM -0800, Joe Conway wrote:
> The problem is this: the application runs an insert, that fires off a
> trigger, that cascades into a fairly complex series of functions, that
> do a bunch of calculations, inserts, updates, and deletes. [...]
> Restarting the postmaster causes the cycle to repeat, i.e. the first
> one or two inserts are back to the 1.5 minute range.

The vacuum delay stuff that you're working on may help, but I can't
really believe it's your salvation if this is happening after only a
few minutes. No matter how much you're doing inside those functions,
you surely can't be causing so many dead tuples that a vacuum is
necessary that soon. Did you try not vacuuming for a little while to
see if it helps?

I didn't see it anywhere in this thread, but are you quite sure that
you're not swapping? Note that vmstat on multiprocessor Solaris
machines is notoriously not useful. You may want to have a look at what
the example stuff in the SE Toolkit tells you, or what you get from
sar. I believe you have to use a special kernel setting on Solaris to
mark shared memory as being ineligible for swap.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that
visionary and imaginative work need not end up well.
		--Dennis Ritchie
Andrew Sullivan wrote:
> Sorry I haven't had a chance to reply to this sooner.
>
> The vacuum delay stuff that you're working on may help, but I can't
> really believe it's your salvation if this is happening after only a
> few minutes. No matter how much you're doing inside those functions,
> you surely can't be causing so many dead tuples that a vacuum is
> necessary that soon. Did you try not vacuuming for a little while to
> see if it helps?

I discussed it later in the thread, but we're adding about 400K rows
per hour and deleting most of them after processing (note this is a
commercial app, written and maintained by another department -- I can
recommend changes, but this late into their release cycle they are very
reluctant to change the app). This is 7 x 24 data collection from
equipment, so there is no "slow" time to use as a maintenance window.
But since the server in question is a test machine, I was able to shut
everything off long enough to do a full vacuum -- it took about 12
hours.

> I didn't see it anywhere in this thread, but are you quite sure that
> you're not swapping? Note that vmstat on multiprocessor Solaris
> machines is notoriously not useful. You may want to have a look at
> what the example stuff in the SE Toolkit tells you, or what you get
> from sar. I believe you have to use a special kernel setting on
> Solaris to mark shared memory as being ineligible for swap.

I'm (reasonably) sure there is no swapping. Minimum free memory (from
top) is about 800 MB, and "vmstat -S" shows no swap-in or swap-out.

I've been playing with a version of Jan's performance patch in the past
few hours. Based on my simulations, it appears that a 1 ms delay every
10 pages is just about right. The performance hit is negligible (based
on overall test time, and cpu % used by the vacuum process). I still
have a bit more analysis to do, but this is looking pretty good. More
later...

Joe
Arthur Ward wrote:
> Jan's vacuum-delay-only patch that nobody can find is here:
>
> http://archives.postgresql.org/pgsql-hackers/2003-11/msg00518.php
>
> I've been using it in testing & production without any problems.

Great to know -- many thanks. I've hacked my own vacuum-delay-only
patch from Jan's all_performance patch. It looks like the only
difference is that it uses usleep() instead of select(). So far the
tests look promising.

Thanks,

Joe
Andrew Sullivan wrote:
> The vacuum delay stuff that you're working on may help, but I can't
> really believe it's your salvation if this is happening after only a
> few minutes. No matter how much you're doing inside those functions,
> you surely can't be causing so many dead tuples that a vacuum is
> necessary that soon. Did you try not vacuuming for a little while to
> see if it helps?

Some of this thread was taken off-list, so I'm not sure it was
mentioned here, but a big part of the problem was that Joe was running
into the same bug that Cott Lang ran into a while ago, which caused the
vacuum threshold to get set far too low, resulting in vacuums far too
often. This has been fixed and the patch has been committed;
unfortunately it didn't make it into 7.4.2, but it will be in 7.4.3 /
7.5.

> I didn't see it anywhere in this thread, but are you quite sure that
> you're not swapping? Note that vmstat on multiprocessor Solaris
> machines is notoriously not useful. You may want to have a look at
> what the example stuff in the SE Toolkit tells you, or what you get
> from sar. I believe you have to use a special kernel setting on
> Solaris to mark shared memory as being ineligible for swap.

I haven't heard from Joe how things are going with the fixed
pg_autovacuum, but that in combination with the vacuum delay stuff
should work well.

Matthew