Thread: Avoiding vacuum full on an UPDATE-heavy table
All,

I have a particularly troublesome table in my 7.3.4 database. It typically
has less than 50k rows, and a usage pattern of about 1k INSERTs, 50-100k
UPDATEs, and no DELETEs per day. It is vacuumed and analyzed three times
per week. However, the performance of queries performed on this table
slowly degrades over a period of weeks, until even a "select count(*)"
takes several seconds. The only way I've found to restore performance is
to VACUUM FULL the table, which is highly undesirable in our application
due to the locks it imposes.

Here is the output of a psql session demonstrating the problem/solution.
Note the \timing output after each of the SELECTs:

qqqqqqqq=> vacuum analyze xxxx;
NOTICE: VACUUM will be committed automatically
VACUUM
Time: 715900.74 ms
qqqqqqqq=> select count(*) from xxxx;
 count
-------
 17978
(1 row)

Time: 171789.08 ms
qqqqqqqq=> vacuum full verbose xxxx;
NOTICE: VACUUM will be committed automatically
INFO: --Relation public.xxxx--
INFO: Pages 188903: Changed 60, reaped 188896, Empty 0, New 0; Tup 17987: Vac 1469, Keep/VTL 0/0, UnUsed 9120184, MinLen 92, MaxLen 468; Re-using: Free/Avail. Space 1504083956/1504083872; EndEmpty/Avail. Pages 0/188901.
        CPU 6.23s/1.07u sec elapsed 55.02 sec.
INFO: Index xxxx_yyyy_idx: Pages 29296; Tuples 17987: Deleted 1469.
        CPU 1.08s/0.20u sec elapsed 61.68 sec.
INFO: Index xxxx_zzzz_idx: Pages 18412; Tuples 17987: Deleted 1469.
        CPU 0.67s/0.05u sec elapsed 17.90 sec.
INFO: Rel xxxx: Pages: 188903 --> 393; Tuple(s) moved: 17985.
        CPU 15.97s/19.11u sec elapsed 384.49 sec.
INFO: Index xxxx_yyyy_idx: Pages 29326; Tuples 17987: Deleted 17985.
        CPU 1.14s/0.65u sec elapsed 32.34 sec.
INFO: Index xxxx_zzzz_idx: Pages 18412; Tuples 17987: Deleted 17985.
        CPU 0.43s/0.32u sec elapsed 13.37 sec.
VACUUM
Time: 566313.54 ms
qqqqqqqq=> select count(*) from xxxx;
 count
-------
 17987
(1 row)

Time: 22.82 ms

Is there any way to avoid doing a periodic VACUUM FULL on this table,
given the fairly radical usage pattern? Or is the (ugly) answer to
redesign our application to avoid this usage pattern?

Also, how do I read the output of VACUUM FULL?
http://www.postgresql.org/docs/7.3/interactive/sql-vacuum.html does not
explain how to interpret the output, nor has google helped. I have a
feeling that the full vacuum is compressing hundreds of thousands of pages
of sparse data into tens of thousands of pages of dense data, thus
reducing the number of block reads by an order of magnitude, but I'm not
quite sure how to read the output.

FWIW, this is last night's relevant output from the scheduled VACUUM
ANALYZE. 24 days have passed since the VACUUM FULL above:

INFO: --Relation public.xxx--
INFO: Index xxx_yyy_idx: Pages 30427; Tuples 34545: Deleted 77066.
        CPU 1.88s/0.51u sec elapsed 95.39 sec.
INFO: Index xxx_zzz_idx: Pages 19049; Tuples 34571: Deleted 77066.
        CPU 0.83s/0.40u sec elapsed 27.92 sec.
INFO: Removed 77066 tuples in 3474 pages.
        CPU 0.38s/0.32u sec elapsed 1.33 sec.
INFO: Pages 13295: Changed 276, Empty 0; Tup 34540: Vac 77066, Keep 0, UnUsed 474020.
        Total CPU 3.34s/1.29u sec elapsed 125.00 sec.
INFO: Analyzing public.xxx

Best Regards,

Bill Montgomery
> Is there any way to avoid doing a periodic VACUUM FULL on this table,
> given the fairly radical usage pattern? Or is the (ugly) answer to
> redesign our application to avoid this usage pattern?

Yes, you should be able to avoid doing a periodic VACUUM FULL. The problem
is that your table needs to be vacuumed MUCH more often. What should
happen is that, assuming you have enough FSM space allocated and you
vacuum the "right" amount, your table will reach a steady-state size. As
you can see from your vacuum verbose output, your table was almost
entirely dead space.

pg_autovacuum would probably help, as it monitors activity and vacuums
tables accordingly. It is not included with 7.3.x, but if you download it
and compile it yourself it will work against a 7.3.x server.

Good luck,

Matthew
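A minimal sketch of what "vacuum much more often" could look like in
practice, using the table name xxxx from the original post; a plain
(non-FULL) vacuum takes only ordinary locks, so it can run while the
application stays live:

    -- Run frequently (e.g. hourly from cron) instead of three times a week.
    -- This marks dead row versions as reusable without rewriting the table
    -- and without the exclusive lock that VACUUM FULL needs; the VERBOSE
    -- output shows how many dead tuples were reclaimed on each run.
    VACUUM VERBOSE ANALYZE xxxx;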
>>>>> "BM" == Bill Montgomery <billm@lulu.com> writes: BM> Is there any way to avoid doing a periodic VACUUM FULL on this table, BM> given the fairly radical usage pattern? Or is the (ugly) answer to BM> redesign our application to avoid this usage pattern? I'll bet upgrading to 7.4.2 clears up your problems. I'm not sure if it was in 7.3 or 7.4 where the index bloat problem was solved. Try to see if just reindexing will help your performance. Also, run a plain vacuum at least nightly so that your table size stays reasonable. It won't take much time on a table with only 50k rows in it. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Matthew T. O'Connor wrote:

>> Is there any way to avoid doing a periodic VACUUM FULL on this table,
>> given the fairly radical usage pattern? Or is the (ugly) answer to
>> redesign our application to avoid this usage pattern?
>
> pg_autovacuum would probably help as it monitors activity and vacuums
> tables accordingly. It is not included with 7.3.x but if you download
> it and compile yourself it will work against a 7.3.x server.

As a quick fix, since we're upgrading to 7.4.2 in a few weeks anyhow
(which includes pg_autovacuum), I've simply set up an hourly vacuum on
this table. It only takes ~4 seconds to execute when kept up on an hourly
basis. Is there any penalty to vacuuming too frequently, other than the
time wasted in an unnecessary vacuum operation?

My hourly VACUUM VERBOSE output now looks like this:

INFO: --Relation public.xxxx--
INFO: Index xxxx_yyyy_idx: Pages 30452; Tuples 34990: Deleted 1226.
        CPU 0.67s/0.18u sec elapsed 0.87 sec.
INFO: Index xxxx_yyyy_idx: Pages 19054; Tuples 34991: Deleted 1226.
        CPU 0.51s/0.13u sec elapsed 1.35 sec.
INFO: Removed 1226 tuples in 137 pages.
        CPU 0.01s/0.00u sec elapsed 1.30 sec.
INFO: Pages 13709: Changed 31, Empty 0; Tup 34990: Vac 1226, Keep 0, UnUsed 567233.
        Total CPU 1.58s/0.31u sec elapsed 3.91 sec.
INFO: Analyzing public.xxxx
VACUUM

With regards to Vivek's post about index bloat, I tried REINDEXing before
I did a VACUUM FULL a month ago when performance had gotten dismal. It
didn't help :-(

Best Regards,

Bill Montgomery
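One way to tell whether the hourly vacuum is really holding the table at a
steady-state size is to watch relpages and reltuples between runs; a rough
sketch using the object names from this thread (the figures are only as
fresh as the last VACUUM or ANALYZE):

    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname IN ('xxxx', 'xxxx_yyyy_idx', 'xxxx_zzzz_idx');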
Bill Montgomery <billm@lulu.com> writes:
> I have a particularly troublesome table in my 7.3.4 database. It
> typically has less than 50k rows, and a usage pattern of about 1k
> INSERTs, 50-100k UPDATEs, and no DELETEs per day. It is vacuumed and
> analyzed three times per week.

You probably want to vacuum (non-FULL) once a day, if not more often.
Also take a look at your FSM settings --- it seems like a good bet that
they're not large enough to remember all the free space in your database.

With adequate FSM the table should stabilize at a physical size
corresponding to number-of-live-rows + number-of-updates-between-VACUUMs,
which would be three times the minimum possible size if you vacuum once a
day (50K + 100K) or five times if you stick to every-other-day
(50K + 200K). Your VACUUM FULL output shows that the table had bloated to
hundreds of times the minimum size:

> INFO: Rel xxxx: Pages: 188903 --> 393; Tuple(s) moved: 17985.

and AFAIK the only way that will happen is if you fail to vacuum at all or
don't have enough FSM.

The indexes are looking darn large as well. In 7.3 about the only thing
you can do about this is REINDEX the table every so often. 7.4 should
behave better though.

			regards, tom lane
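A rough way to sanity-check the FSM settings against that arithmetic,
again assuming the object names from this thread: sum the pages of the
heavily updated relations (right after a vacuum, so relpages is current)
and make sure max_fsm_pages, set in postgresql.conf and effective after a
postmaster restart, comfortably exceeds that total across the whole
cluster.

    -- Pages the free space map would need to track for this table and
    -- its indexes.
    SELECT sum(relpages) AS pages_to_track
    FROM pg_class
    WHERE relname IN ('xxxx', 'xxxx_yyyy_idx', 'xxxx_zzzz_idx');

    -- Compare against the current setting.
    SHOW max_fsm_pages;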
Hi,

After a table is analyzed, the table's relpages in pg_class gets updated,
but not those of its associated indexes, which are only updated by "vacuum
analyze". Is this a feature or a bug?

I have some tables that see almost only inserts, so I do not care about
the "dead tuples", but I do care about the statistics. Does the above
feature/bug affect performance?

My PG version is 7.3.2.

Thanks,

Litao Wu
From the PG docs
(http://developer.postgresql.org/docs/postgres/diskusage.html):

"(Remember, relpages is only updated by VACUUM and ANALYZE.)"

--- Litao Wu <litaowu@yahoo.com> wrote:
> After a table is analyzed, the table's relpages in pg_class gets
> updated, but not those of its associated indexes, which are only
> updated by "vacuum analyze". Is this a feature or a bug?
Litao,

> I have some tables that see almost only inserts, so I do not care about
> the "dead tuples", but I do care about the statistics.

Then just run ANALYZE on those tables, and not VACUUM.

	ANALYZE <table-name>;

> My PG version is 7.3.2.

I would suggest upgrading to 7.3.6; the version you are using has several
known bugs.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco
Bill,

> As a quick fix, since we're upgrading to 7.4.2 in a few weeks anyhow
> (which includes pg_autovacuum), I've simply set up an hourly vacuum on
> this table. It only takes ~4 seconds to execute when kept up on an
> hourly basis. Is there any penalty to vacuuming too frequently, other
> than the time wasted in an unnecessary vacuum operation?

Nope, no penalty other than the I/O and CPU load while vacuuming. If you
have a lot of transactions involving serial writes to many tables,
sometimes you can get into a deadlock situation, which is annoying, but I
wouldn't assume this to be a problem until it crops up.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco
Hi Josh,

I know that, and that is what I am using now. The problem is I also need
to know the relpages each index takes, and "analyze" seems not to update
relpages, though vacuum and vacuum analyze do.

According to the PG doc:
"Remember, relpages is only updated by VACUUM and ANALYZE"

My question is why relpages of indexes does not get updated after
"analyze". Here is a quick test:

create table test as select * from pg_class where 1=2;
create index test_idx on test (relname);
insert into test select * from pg_class;

select relname, relpages from pg_class
where relname in ('test', 'test_idx');
 relname  | relpages
----------+----------
 test     |       10
 test_idx |        1
(2 rows)

analyze test;
select relname, relpages from pg_class
where relname in ('test', 'test_idx');
 relname  | relpages
----------+----------
 test     |       27
 test_idx |        1
(2 rows)
-- ANALYZE only updates the table's relpages, not the index's!

vacuum analyze test;
select relname, relpages from pg_class
where relname in ('test', 'test_idx');
 relname  | relpages
----------+----------
 test     |       27
 test_idx |       22
(2 rows)
-- "vacuum analyze" updates both
-- plain "vacuum" also updates both

Thank you for your help!

Litao Wu
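For what it's worth, the index rows can still be pulled out of pg_class
alongside the table in one query via pg_index; a sketch against the test
table above (on 7.3 the index figures are only refreshed by a plain VACUUM
or VACUUM ANALYZE, as the test shows):

    -- List a table together with all of its indexes and their page counts.
    SELECT c.relname, c.relpages, c.reltuples
    FROM pg_class c
    WHERE c.relname = 'test'
       OR c.oid IN (SELECT i.indexrelid
                    FROM pg_index i, pg_class t
                    WHERE i.indrelid = t.oid AND t.relname = 'test');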
Litao Wu <litaowu@yahoo.com> writes:
> My question is why relpages of indexes do not get updated after
> "analyze".

It's an oversight, which just got fixed in CVS tip a few weeks ago.

			regards, tom lane