Thread: Poor overall performance unless regular VACUUM FULL
Howdy.  Some months back, when advised on one of these lists that it
should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
this nightly "maintenance" practice.  We've been very happy to not
have to do that, since it locked the database all night.  Since then,
however, our database performance has decreased.  The decrease took a
few weeks to become noticeable; perhaps six weeks to become awful.

I have no objective measurement of the decrease in performance.  I have
just created a benchmark that exercises our system and used it to
measure the current degraded performance.  I hope it will show me,
objectively, how much any attempted fix improves system performance.

One thing I noticed when we stopped doing the VACUUM FULL/REINDEX is
that the size of the weekly backups (a compressed tarball of main + WAL
files) jumped.  They were a steady 53GB before we stopped doing the
vacuum; the next backup after stopping the VACUUM FULL was 97GB.  The
backup sizes have grown in the three months since then and are now
hovering at around 130GB.  We believe, but have no hard numbers to
prove, that this growth in physical backup size is out of proportion
with the growth of the logical database that we expect due to the slow
growth of the business.  We are pretty sure we would have noticed the
business growing at more than 50% per quarter.

I did a VACUUM VERBOSE and looked at the statistics at the end; they
seem to indicate that my max_fsm_pages is large enough to keep track of
all of the dead rows that are being created (we do a fair amount of
deleting as well as inserting).  Postgres prints no complaint saying we
need more slots, and we have more than the number of slots needed (if I
recall, about twice as many).

What options do I have for restoring performance other than VACUUM
FULL/REINDEX DATABASE?  Before trying any fix, what data do I want to
collect that might indicate where the performance problem is?

Best Regards,
Wayne Conrad
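One way to gather that data before touching anything is to snapshot
per-table on-disk sizes now and re-run the same query later, so the
growth can be pinned to specific tables rather than the database as a
whole.  A minimal sketch, assuming only the size functions built into
8.3 (the LIMIT is arbitrary):

  -- Per-table footprint (heap + indexes + TOAST), largest first.
  SELECT n.nspname AS schema_name,
         c.relname AS table_name,
         pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size,
         pg_size_pretty(pg_relation_size(c.oid))       AS heap_size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'r'
    AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_total_relation_size(c.oid) DESC
  LIMIT 20;

Comparing two snapshots taken a week or two apart should show whether
the 50%-per-quarter growth is concentrated in a handful of relations or
spread across everything.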
On Mon, Jul 13, 2009 at 1:31 PM, Wayne Conrad <wayne@databill.com> wrote:
> Howdy.  Some months back, when advised on one of these lists that it
> should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
> this nightly "maintenance" practice.  We've been very happy to not
> have to do that, since it locked the database all night.  Since then,
> however, our database performance has decreased.  The decrease took a
> few weeks to become noticeable; perhaps six weeks to become awful.

SNIP

> What options do I have for restoring performance other than VACUUM
> FULL/REINDEX DATABASE?

Just wondering, which pgsql version, and also, do you have autovacuum
turned on?
On Mon, Jul 13, 2009 at 3:31 PM, Wayne Conrad <wayne@databill.com> wrote:
> Howdy.  Some months back, when advised on one of these lists that it
> should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
> this nightly "maintenance" practice.  We've been very happy to not
> have to do that, since it locked the database all night.  Since then,
> however, our database performance has decreased.  The decrease took a
> few weeks to become noticeable; perhaps six weeks to become awful.

<snip>

> I did a VACUUM VERBOSE and looked at the statistics at the end; they
> seem to indicate that my max_fsm_pages is large enough to keep track
> of all of the dead rows that are being created (we do a fair amount of
> deleting as well as inserting).  Postgres prints no complaint saying
> we need more slots, and we have more than the number of slots needed
> (if I recall, about twice as many).
>
> What options do I have for restoring performance other than VACUUM
> FULL/REINDEX DATABASE?

Do you have autovacuum on, or have you otherwise replaced your VACUUM
FULL with regular VACUUM?  The symptoms are pretty classically those of
table bloat.  Since it's gotten so out of hand now, a VACUUM
FULL/REINDEX is probably what you'll need to fix it.

Going forward, you need *some* vacuuming strategy.  Autovacuum is
probably best, especially if you're on 8.3.  If not autovacuum for some
reason, you *must* at least do regular vacuums.

VACUUM FULL/REINDEX is for fixing the situation you're in now, but a
regular vacuum strategy should prevent you from getting back into it.

--
- David T. Wilson
david.t.wilson@gmail.com
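For reference, the commands being discussed look roughly like the
sketch below.  Treat it as a sketch, not a prescription: the table name
is a placeholder, and both VACUUM FULL and REINDEX take locks that
block normal use of the table, so they belong in a maintenance window.

  -- Check that autovacuum is actually enabled and see its settings.
  SHOW autovacuum;
  SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';

  -- One-time repair of a table that has already bloated badly
  -- ("some_bloated_table" is a placeholder name).
  VACUUM FULL VERBOSE some_bloated_table;
  REINDEX TABLE some_bloated_table;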
On Wed, Jul 15, 2009 at 6:51 AM, Wayne Conrad <wayne@databill.com> wrote:
> On Tue, 14 Jul 2009, Scott Marlowe wrote:
>> Just wondering, which pgsql version, and also, do you have
>> autovacuum turned on?
>
> Dang, I should have said in my initial message.  8.3.6, and autovacuum
> is turned on and has plenty of log activity.

Are you guys doing anything that could be deemed pathological, like
full table updates on big tables over and over?  Had an issue last year
where a dev left a where clause off an update to a field in one of our
biggest tables, and in a few weeks the database was so bloated we had
to take it offline to fix the problem, after fixing the query.
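A hedged way to look for that sort of churn is the statistics
collector.  A sketch against pg_stat_user_tables, using the 8.3 column
names (the LIMIT is arbitrary):

  -- Tables with the heaviest update/delete traffic, plus dead-row
  -- counts and the last time (auto)vacuum touched them.
  SELECT relname,
         n_tup_upd, n_tup_del, n_tup_hot_upd,
         n_live_tup, n_dead_tup,
         last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  ORDER BY n_tup_upd + n_tup_del DESC
  LIMIT 20;

Tables whose update/delete counts dwarf their live row counts, or whose
last_autovacuum is far in the past, are the usual suspects for runaway
bloat.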
On 7/14/09 9:53 PM, "David Wilson" <david.t.wilson@gmail.com> wrote:
> On Mon, Jul 13, 2009 at 3:31 PM, Wayne Conrad <wayne@databill.com> wrote:
>> <snip>
>>
>> What options do I have for restoring performance other than VACUUM
>> FULL/REINDEX DATABASE?
>
> Do you have autovacuum on, or have you otherwise replaced your VACUUM
> FULL with regular VACUUM?  The symptoms are pretty classically those
> of table bloat.  Since it's gotten so out of hand now, a VACUUM
> FULL/REINDEX is probably what you'll need to fix it.

If you go that route, do a REINDEX first.  You probably want to know
whether it is mostly index bloat or table bloat that makes up the
majority of the problem.  Adjusting the FILLFACTOR on individual tables
and indexes may also help.

However, if it has bloated this much, you may have some long-lived
transactions that make it hard for Postgres to recycle free space.  And
as others have said, certain things can cause a lot of bloat that only
CLUSTER or VACUUM FULL will reclaim well -- especially updating all or
most rows in a table, or otherwise doing very large bulk deletes or
updates.

> Going forward, you need *some* vacuuming strategy.  Autovacuum is
> probably best, especially if you're on 8.3.  If not autovacuum for
> some reason, you *must* at least do regular vacuums.
>
> VACUUM FULL/REINDEX is for fixing the situation you're in now, but a
> regular vacuum strategy should prevent you from getting back into it.
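A rough SQL sketch of those suggestions.  The table and index names
below are simply ones that appear elsewhere in this thread, and the
fillfactor values are guesses that would need tuning per workload:

  -- Rebuild the indexes only (write-locks the table while each index
  -- is rebuilt), to see how much space index bloat alone accounts for.
  REINDEX TABLE documents;

  -- Leave headroom in newly filled heap and index pages so future
  -- updates are less likely to cause bloat.
  ALTER TABLE work_in_progress SET (fillfactor = 70);
  ALTER INDEX documents_pkey SET (fillfactor = 80);

  -- CLUSTER rewrites the heap in index order and rebuilds all of the
  -- table's indexes, reclaiming both kinds of bloat, but it holds an
  -- exclusive lock for the duration.
  CLUSTER documents USING documents_pkey;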
On Wed, Jul 15, 2009 at 4:03 PM, Wayne Conrad <wayne@databill.com> wrote:
> On Tue, 14 Jul 2009, Scott Marlowe wrote:
>> Are you guys doing anything that could be deemed pathological, like
>> full table updates on big tables over and over?  Had an issue last
>> year where a dev left a where clause off an update to a field in one
>> of our biggest tables, and in a few weeks the database was so bloated
>> we had to take it offline to fix the problem, after fixing the query.
>
> I've just audited the source, looking for any updates without where
> clauses.  None jumped out to bite me.
>
> Almost everything we do happens in transactions which can occasionally
> take 10-20 minutes to complete and span thousands or tens of thousands
> of rows across multiple tables.  Are long-running transactions a
> culprit in table bloat?
>
> I've also used contrib/pgstattuple to try to identify which of our
> large tables and indices are experiencing bloat.  Here are the
> pgstattuple results for our largest tables:

Ouch, hurts my eyes :)  Can you send something like table_len,
dead_tuple_percent, free_percent order by dead_tuple_percent desc
limit 10, or something like that maybe?

> table_len: 56639488
> tuple_count: 655501
> tuple_len: 53573112
> tuple_percent: 94.59
> dead_tuple_count: 0
> dead_tuple_len: 0
> dead_tuple_percent: 0
> free_space: 251928
> free_percent: 0.44
> table_name: status

Lots more rows deleted.
> Ouch, hurts my eyes :)  Can you send something like table_len,
> dead_tuple_percent, free_percent order by dead_tuple_percent desc
> limit 10, or something like that maybe?

Sorry about the pain.  Didn't know what you needed to see.  Ordering by
dead_tuple_percent:

db.production=> select table_name, table_len, dead_tuple_percent, free_percent from temp_tuplestats order by dead_tuple_percent desc limit 10;
              table_name              | table_len | dead_tuple_percent | free_percent
--------------------------------------+-----------+--------------------+--------------
 scheduler_info                       |      8192 |              43.95 |           46
 inserter_maintenance_logs            |     16384 |              25.13 |            9
 merchants                            |      8192 |              24.19 |           64
 scheduler_in_progress                |     32768 |              16.47 |           75
 guilds_hosts                         |      8192 |              13.28 |           67
 work_types                           |      8192 |              12.18 |           78
 production_printer_maintenance_logs  |     16384 |              11.18 |           11
 guilds_work_types                    |      8192 |              10.94 |           71
 config                               |      8192 |              10.47 |           83
 work_in_progress                     |    131072 |               8.47 |           85
(10 rows)

These are our smallest, and in terms of performance, least significant
tables.  Except for work_in_progress, they play little part in overall
system performance.  work_in_progress gets dozens of insertions and
deletions per second, and as many queries.

Ordering by table size, because I had the question of where the bloat
is, in terms of disk space used (since I brought up before that the
physical size of the database is growing at about 50% per quarter):

db.production=> select table_name, table_len, dead_tuple_percent, free_percent from temp_tuplestats order by table_len desc limit 10;
                 table_name                  |  table_len  | dead_tuple_percent | free_percent
---------------------------------------------+-------------+--------------------+--------------
 documents                                   | 28510109696 |               1.05 |           21
 document_address                            | 23458062336 |               2.14 |           10
 latest_document_address_links               |  4953735168 |               3.71 |           21
 documents_ps_page                           |  4927676416 |               1.19 |            6
 injectd_log                                 |  4233355264 |               0.74 |           17
 ps_page                                     |  3544350720 |               0.81 |            4
 temp_bak_documents_invoice_amount_for_near  |  3358351360 |                  0 |            0
 statements                                  |  1832091648 |                4.4 |            2
 documents_old_addresses                     |  1612947456 |                  0 |            1
 cron_logs                                   |   791240704 |                  0 |            1
(10 rows)

Am I seeing in the above queries evidence that my bloat is mostly in
free space, and not in dead tuples?
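One way to answer that question from the data already collected is to
convert the percentages back into bytes.  A sketch, assuming
temp_tuplestats (the poster's own helper table) holds the numeric
columns shown above:

  -- Approximate totals of dead-tuple space vs. tracked free space.
  SELECT pg_size_pretty(sum(table_len)::bigint)
           AS total_table_bytes,
         pg_size_pretty(sum(table_len * dead_tuple_percent / 100)::bigint)
           AS approx_dead_bytes,
         pg_size_pretty(sum(table_len * free_percent / 100)::bigint)
           AS approx_free_bytes
  FROM temp_tuplestats;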
On Tue, 14 Jul 2009, Scott Marlowe wrote:
> Just wondering, which pgsql version, and also, do you have
> autovacuum turned on?

Dang, I should have said in my initial message.  8.3.6, and autovacuum
is turned on and has plenty of log activity.
On Tue, 14 Jul 2009, Scott Marlowe wrote:
> Are you guys doing anything that could be deemed pathological, like
> full table updates on big tables over and over?  Had an issue last
> year where a dev left a where clause off an update to a field in one
> of our biggest tables, and in a few weeks the database was so bloated
> we had to take it offline to fix the problem, after fixing the query.

I've just audited the source, looking for any updates without where
clauses.  None jumped out to bite me.

Almost everything we do happens in transactions which can occasionally
take 10-20 minutes to complete and span thousands or tens of thousands
of rows across multiple tables.  Are long-running transactions a
culprit in table bloat?

I've also used contrib/pgstattuple to try to identify which of our
large tables and indices are experiencing bloat.  Here are the
pgstattuple results for our largest tables:

table_len: 56639488
tuple_count: 655501
tuple_len: 53573112
tuple_percent: 94.59
dead_tuple_count: 0
dead_tuple_len: 0
dead_tuple_percent: 0
free_space: 251928
free_percent: 0.44
table_name: status

table_len: 94363648
tuple_count: 342363
tuple_len: 61084340
tuple_percent: 64.73
dead_tuple_count: 10514
dead_tuple_len: 1888364
dead_tuple_percent: 2
free_space: 28332256
free_percent: 30.02
table_name: uploads

table_len: 135675904
tuple_count: 1094803
tuple_len: 129821312
tuple_percent: 95.68
dead_tuple_count: 133
dead_tuple_len: 16048
dead_tuple_percent: 0.01
free_space: 991460
free_percent: 0.73
table_name: invoice_details

table_len: 148914176
tuple_count: 1858812
tuple_len: 139661736
tuple_percent: 93.79
dead_tuple_count: 1118
dead_tuple_len: 80704
dead_tuple_percent: 0.05
free_space: 1218040
free_percent: 0.82
table_name: job_status_log

table_len: 173416448
tuple_count: 132974
tuple_len: 117788200
tuple_percent: 67.92
dead_tuple_count: 10670
dead_tuple_len: 7792692
dead_tuple_percent: 4.49
free_space: 46081516
free_percent: 26.57
table_name: mail

table_len: 191299584
tuple_count: 433378
tuple_len: 145551144
tuple_percent: 76.09
dead_tuple_count: 1042
dead_tuple_len: 862952
dead_tuple_percent: 0.45
free_space: 42068276
free_percent: 21.99
table_name: sessions

table_len: 548552704
tuple_count: 5446169
tuple_len: 429602136
tuple_percent: 78.32
dead_tuple_count: 24992
dead_tuple_len: 1929560
dead_tuple_percent: 0.35
free_space: 93157980
free_percent: 16.98
table_name: job_state_log

table_len: 639262720
tuple_count: 556415
tuple_len: 221505548
tuple_percent: 34.65
dead_tuple_count: 66688
dead_tuple_len: 27239728
dead_tuple_percent: 4.26
free_space: 380168112
free_percent: 59.47
table_name: jobs

table_len: 791240704
tuple_count: 8311799
tuple_len: 700000052
tuple_percent: 88.47
dead_tuple_count: 39
dead_tuple_len: 3752
dead_tuple_percent: 0
free_space: 11397548
free_percent: 1.44
table_name: cron_logs

table_len: 1612947456
tuple_count: 10854417
tuple_len: 1513084075
tuple_percent: 93.81
dead_tuple_count: 0
dead_tuple_len: 0
dead_tuple_percent: 0
free_space: 13014040
free_percent: 0.81
table_name: documents_old_addresses

table_len: 1832091648
tuple_count: 13729360
tuple_len: 1600763725
tuple_percent: 87.37
dead_tuple_count: 598525
dead_tuple_len: 80535904
dead_tuple_percent: 4.4
free_space: 38817616
free_percent: 2.12
table_name: statements

table_len: 3544350720
tuple_count: 64289703
tuple_len: 2828746932
tuple_percent: 79.81
dead_tuple_count: 648849
dead_tuple_len: 28549356
dead_tuple_percent: 0.81
free_space: 143528236
free_percent: 4.05
table_name: ps_page

table_len: 4233355264
tuple_count: 22866609
tuple_len: 3285722981
tuple_percent: 77.62
dead_tuple_count: 231624
dead_tuple_len: 31142594
dead_tuple_percent: 0.74
free_space: 706351636
free_percent: 16.69
table_name: injectd_log

table_len: 4927676416
tuple_count: 55919895
tuple_len: 4176606972
tuple_percent: 84.76
dead_tuple_count: 795011
dead_tuple_len: 58409884
dead_tuple_percent: 1.19
free_space: 279870944
free_percent: 5.68
table_name: documents_ps_page

table_len: 4953735168
tuple_count: 44846317
tuple_len: 3346823052
tuple_percent: 67.56
dead_tuple_count: 2485971
dead_tuple_len: 183639396
dead_tuple_percent: 3.71
free_space: 1038200484
free_percent: 20.96
table_name: latest_document_address_links

table_len: 23458062336
tuple_count: 89533157
tuple_len: 19772992448
tuple_percent: 84.29
dead_tuple_count: 2311467
dead_tuple_len: 502940946
dead_tuple_percent: 2.14
free_space: 2332408612
free_percent: 9.94
table_name: document_address

table_len: 28510109696
tuple_count: 44844664
tuple_len: 21711695949
tuple_percent: 76.15
dead_tuple_count: 1134932
dead_tuple_len: 300674467
dead_tuple_percent: 1.05
free_space: 5988985892
free_percent: 21.01
table_name: documents

Here are the pgstatindex results for our largest indices.  I assumed
that negative index sizes are a result of integer overflow and ordered
the results accordingly.

index_size: 1317961728
version: 2
tree_level: 3
root_block_no: 12439
internal_pages: 13366
leaf_pages: 1182318
empty_pages: 0
deleted_pages: 13775
avg_leaf_density: -157.76
leaf_fragmentation: 37.87
index_name: documents_pkey

index_size: 1346609152
version: 2
tree_level: 3
root_block_no: 10447
internal_pages: 1937
leaf_pages: 162431
empty_pages: 0
deleted_pages: 12
avg_leaf_density: 66.56
leaf_fragmentation: 26.48
index_name: statements_pkey

index_size: 1592713216
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 723
leaf_pages: 177299
empty_pages: 0
deleted_pages: 16400
avg_leaf_density: 74.15
leaf_fragmentation: 5.58
index_name: latest_document_address2_precedence_key

index_size: 1617821696
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 720
leaf_pages: 185846
empty_pages: 0
deleted_pages: 10921
avg_leaf_density: 78.8
leaf_fragmentation: 10.96
index_name: documents_ps_page_ps_page_id_idx

index_size: 1629798400
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 728
leaf_pages: 188325
empty_pages: 0
deleted_pages: 9896
avg_leaf_density: 88.23
leaf_fragmentation: 0.66
index_name: ps_page_pkey

index_size: 1658560512
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 740
leaf_pages: 191672
empty_pages: 0
deleted_pages: 10048
avg_leaf_density: 86.7
leaf_fragmentation: 1.03
index_name: ps_page_ps_id_key

index_size: -31956992
version: 2
tree_level: 3
root_block_no: 12439
internal_pages: 5510
leaf_pages: 475474
empty_pages: 0
deleted_pages: 39402
avg_leaf_density: 72.19
leaf_fragmentation: 3.02
index_name: latest_document_address2_pkey

index_size: -321863680
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 1809
leaf_pages: 479805
empty_pages: 0
deleted_pages: 3383
avg_leaf_density: 25.63
leaf_fragmentation: 40.05
index_name: documents_id_idx

index_size: -461504512
version: 2
tree_level: 3
root_block_no: 49813
internal_pages: 3023
leaf_pages: 456246
empty_pages: 0
deleted_pages: 8682
avg_leaf_density: 34.37
leaf_fragmentation: 66.83
index_name: documents_city

index_size: -1181884416
version: 2
tree_level: 3
root_block_no: 11036
internal_pages: 10003
leaf_pages: 822178
empty_pages: 0
deleted_pages: 72121
avg_leaf_density: 54.52
leaf_fragmentation: 3.37
index_name: document_address_pkey

index_size: -1267834880
version: 2
tree_level: 3
root_block_no: 32210
internal_pages: 2410
leaf_pages: 359867
empty_pages: 0
deleted_pages: 7245
avg_leaf_density: 53.31
leaf_fragmentation: 52.7
index_name: documents_recipient

index_size: -1327628288
version: 2
tree_level: 3
root_block_no: 27346
internal_pages: 2183
leaf_pages: 360040
empty_pages: 0
deleted_pages: 0
avg_leaf_density: 58.39
leaf_fragmentation: 50
index_name: documents_magic_id_key

index_size: -1447632896
version: 2
tree_level: 3
root_block_no: 44129
internal_pages: 1998
leaf_pages: 339111
empty_pages: 0
deleted_pages: 6465
avg_leaf_density: 50.12
leaf_fragmentation: 52.85
index_name: documents_zip10

index_size: -1472380928
version: 2
tree_level: 3
root_block_no: 81515
internal_pages: 2470
leaf_pages: 326170
empty_pages: 0
deleted_pages: 15913
avg_leaf_density: 38.21
leaf_fragmentation: 77.19
index_name: documents_state

index_size: -1483169792
version: 2
tree_level: 3
root_block_no: 47536
internal_pages: 1607
leaf_pages: 341421
empty_pages: 0
deleted_pages: 208
avg_leaf_density: 45.28
leaf_fragmentation: 46.48
index_name: documents_account_number

index_size: -1711841280
version: 2
tree_level: 3
root_block_no: 81517
internal_pages: 1149
leaf_pages: 296146
empty_pages: 0
deleted_pages: 18027
avg_leaf_density: 80.86
leaf_fragmentation: 7.14
index_name: document_address_precedence_key
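On the earlier question about long-running transactions: VACUUM cannot
reclaim rows that are still visible to any open transaction, so
transactions that stay open for a long time do contribute to bloat.  A
sketch for spotting them on 8.3 (procpid and current_query are the 8.3
column names; the ten-minute cutoff is arbitrary):

  -- Sessions whose current transaction has been open for a while.
  SELECT procpid, usename, xact_start,
         now() - xact_start AS xact_age,
         current_query
  FROM pg_stat_activity
  WHERE xact_start IS NOT NULL
    AND xact_start < now() - interval '10 minutes'
  ORDER BY xact_start;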