Thread: VACUUM hanging on PostgreSQL 8.3.1 for larger tables
I am running into problems vacuuming my larger tables. It seems for tables greater than 1 million rows, Vacuum just hangs. I could leave it running for hours and it never comes to completion. Things like copying the whole table to a temp table with bulk insert such as (SELECT * INTO temp FROM sometable) takes about 60-80 secs for a 1.5 miliion table creating an index about 30 secs, ANALYZE VERBOSE about 109 secs (haven't changed the default % scan) - below is a sample of that INFO: analyzing "ky.ky_edges" INFO: "ky_edges": scanned 3000 of 115299 pages, containing 39113 live rows and 5216 dead rows; 3000 rows in sample, 1503230 estimated total rows My fillfactors are set to about 90%. I should add I have POSTGIS geometry fields in these tables (2 of them). I did a test creating a temp table of the same records, but leaving out the geometry fields with same results. So I ruled out the geometry fields as the issue. Below are my general specs "PostgreSQL 8.3.1 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)" shared_buffers = 1536MB temp_buffers = 128MB maintenance_work_mem = 512MB work_mem = 256MB max_fsm_relations = 1000 max_fsm_pages = 204800 max_fsm_relations = 1000 (never quite understood how these fsm things work) wal_buffers = 10MB vmstat shows procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 280 123328 4320 2762896 0 0 31 74 62 33 1 0 98 1 0 cat /proc/cpuinfo key elements show 8 of these model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz stepping : 6 cpu MHz : 2333.644 cache size : 6144 KB uname -a shows: 2.6.18-53.1.4.el5 #1 SMP Wed Nov 14 10:37:33 EST 2007 i686 i686 i386 GNU/Linux Thanks, Regina
"Paragon" <lr@pcorp.us> writes: > I am running into problems vacuuming my larger tables. It seems for tables > greater than 1 million rows, Vacuum just hangs. I could leave it running > for hours and it never comes to completion. Is it actually *doing* anything, like consuming CPU or I/O -- and if so which? How much does VACUUM VERBOSE print before getting stuck? > vmstat shows vmstat without any arguments is very nearly useless, because what it shows you is averages since system boot. Watch "vmstat 1" for awhile and you will get an actual picture of what's happening. regards, tom lane
>"Paragon" <lr@pcorp.us> writes: >> I am running into problems vacuuming my larger tables. It seems for >> tables greater than 1 million rows, Vacuum just hangs. I could leave it running >> for hours and it never comes to completion. >Is it actually *doing* anything, like consuming CPU or I/O -- and if so which? How much does VACUUM VERBOSE print before getting stuck? It just shows - Vacuuming "ky.ky_edges" and sits there forever > vmstat without any arguments is very nearly useless, because what it shows you is averages since system boot. Watch "vmstat 1" for awhile and you will > > get an actual picture of what's happening. With no queries running vmstat 1 looks like this procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 400 178832 5336 2821712 0 0 130 193 22 10 18 0 79 3 0 0 0 400 178708 5336 2822100 0 0 0 0 1024 191 0 0 100 0 0 0 0 400 178460 5348 2822116 0 0 0 1328 1168 182 0 0 96 4 0 1 0 400 178212 5348 2822492 0 0 0 0 1022 201 0 0 100 0 0 0 0 400 178088 5348 2822568 0 0 0 0 1031 196 0 0 100 0 0 0 0 400 177840 5348 2822860 0 0 0 0 1023 194 0 0 100 0 0 0 0 400 177716 5348 2822936 0 0 0 48 1009 164 0 0 100 0 0 0 0 400 177468 5364 2823216 0 0 4 1272 1177 228 0 0 97 3 0 0 0 400 177468 5368 2823236 0 0 4 0 1006 158 0 0 100 0 0 0 0 400 177468 5368 2823440 0 0 0 0 1022 177 0 0 100 0 0 --If I start the vacuum verbose ky.ky_edges; and monitor vmstat 1 looks like this procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 400 169408 5932 2831468 0 0 0 24 1006 184 0 0 100 0 0 0 0 400 169160 5956 2831616 0 0 0 1132 1108 215 0 0 100 0 0 0 0 400 168792 5968 2831920 0 0 128 676 1055 216 0 0 100 0 0 0 0 400 168544 5968 2832108 0 0 0 24 1024 193 0 0 100 0 0 0 0 400 168300 5992 2832212 0 0 128 1120 1098 219 0 0 100 0 0 0 0 400 168176 6004 2832628 0 0 0 668 1075 198 0 0 100 0 0 0 0 400 168052 6004 2832640 0 0 0 24 1005 195 0 0 99 1 0 0 0 400 167684 6028 2832844 0 0 128 1116 1118 224 0 0 100 0 0 0 0 400 167560 6040 2833084 0 0 0 664 1056 216 0 0 99 1 0 Thanks, Regina
"Paragon" <lr@pcorp.us> writes: >> Is it actually *doing* anything, like consuming CPU or I/O -- and if so >> which? How much does VACUUM VERBOSE print before getting stuck? > --If I start the > vacuum verbose ky.ky_edges; > and monitor vmstat 1 looks like this Sure looks like a near-idle machine to me :-(. I think that the vacuum must be stuck on a lock. What other Postgres processes have you got, and what are they doing? regards, tom lane
> Sure looks like a near-idle machine to me :-(. > I think that the vacuum must be stuck on a lock. What other Postgres processes have you got, and what are they doing? > regards, tom lane Originally I thought it was the autovacuuming getting in the way since I noticed sometimes it was trying to vacuum the tables I was trying to vacuum and autovacuuming was taking a long time, but I had turned autovacuuming off during bulk load process period to prevent this. When I showed you the stats on only the vacuuming, I had already stopped all bulk load processes such that Running SELECT * FROM pg_stat_activity Was showing nothing but my single vacuuming process. If I vacuum say a 365,0000 record table of comparable structure to my 1 million someodd, it takes about 10 minutes to run and runs thru. 10 minutes for 365,000 records seemed a little long to me from memory on 8.2, but at least it ran thru okay. Next I was going to try to do was load the same dataset on my old 8.2 box to rule out 8.3.1 as the culprit. Thanks, Regina
"Paragon" <lr@pcorp.us> writes: >> Sure looks like a near-idle machine to me :-(. >> I think that the vacuum must be stuck on a lock. What other Postgres >> processes have you got, and what are they doing? > Originally I thought it was the autovacuuming getting in the way since I > noticed sometimes it was trying to vacuum the tables I was trying to vacuum > and autovacuuming was taking a long time, but I had turned autovacuuming off > during bulk load process period to prevent this. Hmm ... another reason for vacuum running really slowly would be poor choices of vacuum_cost parameters --- have you got those set to nondefault values? regards, tom lane
"Paragon" <lr@pcorp.us> writes: > Right now I have > vacuum_cost_delay = 600 Yikes. That's *way* too high. If you're trying to get the vacuum to complete quickly, it really should be zero anyway. Nonzero is for when you don't care how long vacuum takes as long as it's not sucking too much I/O from your real work. regards, tom lane
> Yikes. That's *way* too high. If you're trying to get the vacuum to complete quickly, it really should be zero anyway. Nonzero is for when you don't care how >> long vacuum takes as long as it's not sucking too much I/O from your real work. Thanks Tom, yap you were right now it runs instantaneously by resetting vacuum back to defaults. I feel kind of dumb at this point :) I guess misunderstood the docs. Thanks again, Regina
Hello If Your 'vacuum verbose analyze table' ends pretty fast,and Your 'vacuum full verbose analyze table' never ends,watch for 'select count (*) from pg_locks',might be that You have some heavy load transaction processing on that table,so 'vacuum full....' wait for transactions to end. If possible kill all processes that access that table and try 'vacuum full verbose analyze table'. Anyway,if processor is mostly idle,always check 'select count (*) from pg_locks' , at least in my experience. Sincerely Dragan Zubac Tom Lane wrote: > "Paragon" <lr@pcorp.us> writes: > >> Right now I have >> vacuum_cost_delay = 600 >> > > Yikes. That's *way* too high. If you're trying to get the vacuum to > complete quickly, it really should be zero anyway. Nonzero is for when > you don't care how long vacuum takes as long as it's not sucking too much > I/O from your real work. > > regards, tom lane > >