Thread: Poor overall performance unless regular VACUUM FULL

Poor overall performance unless regular VACUUM FULL

From: Wayne Conrad
Howdy.  Some months back, when advised on one of these lists that it
should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
this nightly "maintenance" practice.  We've been very happy to not
have to do that, since it locked the database all night.  Since then,
however, our database performance has decreased.  The decrease took a
few weeks to become noticeable; perhaps six weeks to become awful.

I have no objective measurement of the decrease in performance.  I
have just created a benchmark that exercises our system and used it to
measure the current degraded performance.  I hope it will show me,
objectively, how much any attempted fix improves system performance.

One thing I noticed when we stopped doing the VACUUM FULL/REINDEX is
that the size of the weekly backups (a compressed tarball of main +
WAL files) jumped.  They had been a steady 53GB before we stopped
doing the vacuum; the first backup after stopping the VACUUM FULL was
97GB.  The backups have grown in the three months since then and are
now hovering at around 130GB.  We believe, but have no hard numbers to
prove, that this growth in physical backup size is out of proportion
with the growth of the logical database that we expect from the slow
growth of the business.  We are pretty sure we would have noticed the
business growing at more than 50% per quarter.

I did a VACUUM VERBOSE and looked at the statistics at the end; they
seem to indicate that my max_fsm_pages is large enough to keep track
of all of the dead rows that are being created (we do a fair amount of
deleting as well as inserting).  Postgres prints no complaint saying
we need more slots, and we have more than the number of slots needed
(if I recall, about twice as many).
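
The check amounted to running something like the following and comparing
the page-slot totals reported at the end of the vacuum against the
configured limit:

db.production=> vacuum verbose;
db.production=> show max_fsm_pages;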

What options do I have for restoring performance other than VACUUM
FULL/REINDEX DATABASE?

Before trying any fix, what data do I want to collect that might
indicate where the performance problem is?
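
As a starting point (assuming a version recent enough to have the
n_dead_tup counters in pg_stat_user_tables), I could grab the per-table
dead-tuple and autovacuum stats before touching anything:

db.production=> select relname, n_live_tup, n_dead_tup,
last_vacuum, last_autovacuum from pg_stat_user_tables
order by n_dead_tup desc limit 10;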

Best Regards,
         Wayne Conrad

Re: Poor overall performance unless regular VACUUM FULL

From: Scott Marlowe
On Mon, Jul 13, 2009 at 1:31 PM, Wayne Conrad<wayne@databill.com> wrote:
> Howdy.  Some months back, when advised on one of these lists that it
> should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
> this nightly "maintenance" practice.  We've been very happy to not
> have to do that, since it locked the database all night.  Since then,
> however, our database performance has decreased.  The decrease took a
> few weeks to become noticeable; perhaps six weeks to become awful.
SNIP
> What options do I have for restoring performance other than VACUUM
> FULL/REINDEX DATABASE?

Just wondering, which pgsql version, and also, do you have autovacuum turned on?

Re: Poor overall performance unless regular VACUUM FULL

From: David Wilson
On Mon, Jul 13, 2009 at 3:31 PM, Wayne Conrad<wayne@databill.com> wrote:
> Howdy.  Some months back, when advised on one of these lists that it
> should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
> this nightly "maintenance" practice.  We've been very happy to not
> have to do that, since it locked the database all night.  Since then,
> however, our database performance has decreased.  The decrease took a
> few weeks to become noticeable; perhaps six weeks to become awful.

 <snip>

> I did a VACUUM VERBOSE and looked at the statistics at the end; they
> seem to indicate that my max_fsm_pages is large enough to keep track
> of all of the dead rows that are being created (we do a fair amount of
> deleting as well as inserting).  Postgres prints no complaint saying
> we need more slots, and we have more than the number of slots needed
> (if I recall, about twice as many).
>
> What options do I have for restoring performance other than VACUUM
> FULL/REINDEX DATABASE?
>

Do you have autovacuum on, or otherwise replaced your VACUUM FULL with
regular VACUUM? The symptoms are pretty classically those of table
bloat. Since it's gotten so out of hand now, a VACUUM FULL/REINDEX is
probably what you'll need to fix it.
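
Per table, the fix looks roughly like this (table name is just a
placeholder):

    VACUUM FULL VERBOSE some_big_table;
    REINDEX TABLE some_big_table;

The REINDEX afterwards matters, since VACUUM FULL itself tends to bloat
the indexes further.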

Going forward, you need *some* vacuuming strategy. Autovacuum is
probably best, especially if you're on 8.3. If you can't use autovacuum
for some reason, you *must* at least do regular manual vacuums.
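
That can be as simple as making sure autovacuum is on and aggressive
enough in postgresql.conf (values below are only illustrative starting
points):

    autovacuum = on
    autovacuum_vacuum_scale_factor = 0.1
    autovacuum_vacuum_cost_delay = 10ms

or, failing that, at least cron a plain database-wide vacuum, e.g.
something like:

    30 2 * * *  vacuumdb --all --analyze --quiet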

Vacuum full/reindex is for fixing the situation you're in now, but a
regular vacuum strategy should prevent you from getting back into it.

--
- David T. Wilson
david.t.wilson@gmail.com

Re: Poor overall performance unless regular VACUUM FULL

From: Scott Marlowe
On Wed, Jul 15, 2009 at 6:51 AM, Wayne Conrad<wayne@databill.com> wrote:
> On Tue, 14 Jul 2009, Scott Marlowe wrote:
>>
>> Just wondering, which pgsql version, and also, do you have
>> autovacuum turned on?
>
> Dang, I should have said in my initial message.  8.3.6, and autovacuum
> is turned on and has plenty of log activity.

Are you guys doing anything that could be deemed pathological, like
full table updates on big tables over and over?  Had an issue last
year where a dev left a where clause off an update to a field in one
of our biggest tables and in a few weeks the database was so bloated
we had to take it offline to fix the problem, after first fixing the query.

Re: Poor overall performance unless regular VACUUM FULL

From: Scott Carey
On 7/14/09 9:53 PM, "David Wilson" <david.t.wilson@gmail.com> wrote:

> On Mon, Jul 13, 2009 at 3:31 PM, Wayne Conrad<wayne@databill.com> wrote:
>> Howdy.  Some months back, when advised on one of these lists that it
>> should not be necessary to issue VACUUM FULL/REINDEX DATABASE, we quit
>> this nightly "maintenance" practice.  We've been very happy to not
>> have to do that, since it locked the database all night.  Since then,
>> however, our database performance has decreased.  The decrease took a
>> few weeks to become noticeable; perhaps six weeks to become awful.
>
>  <snip>
>
>> I did a VACUUM VERBOSE and looked at the statistics at the end; they
>> seem to indicate that my max_fsm_pages is large enough to keep track
>> of all of the dead rows that are being created (we do a fair amount of
>> deleting as well as inserting).  Postgres prints no complaint saying
>> we need more slots, and we have more than the number of slots needed
>> (if I recall, about twice as many).
>>
>> What options do I have for restoring performance other than VACUUM
>> FULL/REINDEX DATABASE?
>>
>
> Do you have autovacuum on, or otherwise replaced your VACUUM FULL with
> regular VACUUM? The symptoms are pretty classically those of table
> bloat. Since it's gotten so out of hand now, a VACUUM FULL/REINDEX is
> probably what you'll need to fix it.

If you go that route, do a REINDEX first.  You probably want to know
whether index bloat or table bloat makes up the majority of the problem.

Adjusting each table and index FILLFACTOR may also help.
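
For example (numbers and names are illustrative; a changed fillfactor only
applies to newly written pages, so the table or index has to be rewritten
or reindexed before existing data honors it):

    ALTER TABLE some_big_table SET (fillfactor = 90);
    ALTER INDEX some_big_table_pkey SET (fillfactor = 80);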

However, if it has bloated this much, you may have some long-lived
transactions that make it hard for postgres to recycle free space.
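
On 8.3 you can spot them with something along these lines; anything with
an old xact_start sitting in "<IDLE> in transaction" is a prime suspect:

    SELECT procpid, usename, xact_start, current_query
      FROM pg_stat_activity
     WHERE xact_start IS NOT NULL
     ORDER BY xact_start
     LIMIT 10;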

And as others have said, certain things can cause a lot of bloat that only
CLUSTER or VACUUM FULL will reclaim well -- especially updating all or most
rows in a table, or otherwise doing very large bulk delete or update.
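
On 8.3 that can be done per table with something like (names illustrative):

    CLUSTER some_big_table USING some_big_table_pkey;

which rewrites the table and rebuilds its indexes in one pass, at the cost
of an exclusive lock while it runs.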

>
> Going forward, you need *some* vacuuming strategy. Autovacuum is
> probably best, especially if you're on 8.3. If you can't use autovacuum
> for some reason, you *must* at least do regular manual vacuums.
>
> Vacuum full/reindex is for fixing the situation you're in now, but a
> regular vacuum strategy should prevent you from getting back into it.
>
> --
> - David T. Wilson
> david.t.wilson@gmail.com


Re: Poor overall performance unless regular VACUUM FULL

From: Scott Marlowe
On Wed, Jul 15, 2009 at 4:03 PM, Wayne Conrad<wayne@databill.com> wrote:
>>> On Tue, 14 Jul 2009, Scott Marlowe wrote:
>>
>> Are you guys doing anything that could be deemed pathological, like
>> full table updates on big tables over and over?  Had an issue last
>> year where a dev left a where clause off an update to a field in one
>> of our biggest tables and in a few weeks the database was so bloated
>> we had to take it offline to fix the problem, after first fixing the
>> query.
>
> I've just audited the source, looking for any updates without where
> clauses.  None jumped out to bite me.
>
> Almost everything we do happens in transactions which can occasionally
> take 10-20 minutes to complete and span thousands or tens of thousands
> of rows across multiple tables.  Are long-running transactions a
> culprit in table bloat?
>
> I've also used contrib/pgstattuple to try to identify which of our
> large tables and indices are experiencing bloat.  Here are the
> pgstattuple results for our largest tables:

Ouch hurts my eyes :)  Can you see something like table_len,
dead_tuple_percent, free_percent order by dead_tuple_percent desc
limit 10 or something like that maybe?

>
> table_len:          56639488
> tuple_count:        655501
> tuple_len:          53573112
> tuple_percent:      94.59
> dead_tuple_count:   0
> dead_tuple_len:     0
> dead_tuple_percent: 0
> free_space:         251928
> free_percent:       0.44
> table_name:         status
Lots more rows deleted.

Re: Poor overall performance unless regular VACUUM FULL

From: Wayne Conrad
> Ouch hurts my eyes :)  Can you see something like table_len,
> dead_tuple_percent, free_percent order by dead_tuple_percent desc
> limit 10 or something like that maybe?

Sorry about the pain.  Didn't know what you needed to see.

Ordering by dead_tuple_percent:

db.production=> select table_name, table_len, dead_tuple_percent,
free_percent from temp_tuplestats order by dead_tuple_percent desc
limit 10;
              table_name              | table_len | dead_tuple_percent | free_percent
-------------------------------------+-----------+--------------------+--------------
  scheduler_info                      |      8192 |              43.95 |           46
  inserter_maintenance_logs           |     16384 |              25.13 |            9
  merchants                           |      8192 |              24.19 |           64
  scheduler_in_progress               |     32768 |              16.47 |           75
  guilds_hosts                        |      8192 |              13.28 |           67
  work_types                          |      8192 |              12.18 |           78
  production_printer_maintenance_logs |     16384 |              11.18 |           11
  guilds_work_types                   |      8192 |              10.94 |           71
  config                              |      8192 |              10.47 |           83
  work_in_progress                    |    131072 |               8.47 |           85
(10 rows)

These are our smallest, and in terms of performance, least significant
tables.  Except for work_in_progress, they play little part in overall
system performance.  work_in_progress gets dozens of insertions and
deletions per second, and as many queries.

Ordering by table size, because I had the question of where the bloat
is, in terms of disk space used (since I brought up before that the
physical size of the database is growing at about 50% per quarter):

db.production=> select table_name, table_len, dead_tuple_percent,
free_percent from temp_tuplestats order by table_len desc
limit 10;
                  table_name                 |  table_len  | dead_tuple_percent | free_percent
--------------------------------------------+-------------+--------------------+--------------
  documents                                  | 28510109696 |               1.05 |           21
  document_address                           | 23458062336 |               2.14 |           10
  latest_document_address_links              |  4953735168 |               3.71 |           21
  documents_ps_page                          |  4927676416 |               1.19 |            6
  injectd_log                                |  4233355264 |               0.74 |           17
  ps_page                                    |  3544350720 |               0.81 |            4
  temp_bak_documents_invoice_amount_for_near |  3358351360 |                  0 |            0
  statements                                 |  1832091648 |                4.4 |            2
  documents_old_addresses                    |  1612947456 |                  0 |            1
  cron_logs                                  |   791240704 |                  0 |            1
(10 rows)

Am I seeing in the above queries evidence that my bloat is mostly in
free space, and not in dead tuples?
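
A rough way I can think of to total it up, using just the columns already
in the temp table (approximate, since the percentages are rounded):

db.production=> select sum(table_len) as total_bytes,
sum(table_len * dead_tuple_percent / 100.0) as approx_dead_bytes,
sum(table_len * free_percent / 100.0) as approx_free_bytes
from temp_tuplestats;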

Re: Poor overall performance unless regular VACUUM FULL

From: Wayne Conrad
On Tue, 14 Jul 2009, Scott Marlowe wrote:
> Just wondering, which pgsql version, and also, do you have
> autovacuum turned on?

Dang, I should have said in my initial message.  8.3.6, and autovacuum
is turned on and has plenty of log activity.

Re: Poor overall performance unless regular VACUUM FULL

From: Wayne Conrad
>> On Tue, 14 Jul 2009, Scott Marlowe wrote:
> Are you guys doing anything that could be deemed pathological, like
> full table updates on big tables over and over?  Had an issue last
> year where a dev left a where clause off an update to a field in one
> of our biggest tables and in a few weeks the database was so bloated
> we had to take it offline to fix the problem, after first fixing the
> query.

I've just audited the source, looking for any updates without where
clauses.  None jumped out to bite me.

Almost everything we do happens in transactions which can occasionally
take 10-20 minutes to complete and span thousands or tens of thousands
of rows across multiple tables.  Are long-running transactions a
culprit in table bloat?

I've also used contrib/pgstattuple to try to identify which of our
large tables and indices are experiencing bloat.  Here are the
pgstattuple results for our largest tables:

table_len:          56639488
tuple_count:        655501
tuple_len:          53573112
tuple_percent:      94.59
dead_tuple_count:   0
dead_tuple_len:     0
dead_tuple_percent: 0
free_space:         251928
free_percent:       0.44
table_name:         status

table_len:          94363648
tuple_count:        342363
tuple_len:          61084340
tuple_percent:      64.73
dead_tuple_count:   10514
dead_tuple_len:     1888364
dead_tuple_percent: 2
free_space:         28332256
free_percent:       30.02
table_name:         uploads

table_len:          135675904
tuple_count:        1094803
tuple_len:          129821312
tuple_percent:      95.68
dead_tuple_count:   133
dead_tuple_len:     16048
dead_tuple_percent: 0.01
free_space:         991460
free_percent:       0.73
table_name:         invoice_details

table_len:          148914176
tuple_count:        1858812
tuple_len:          139661736
tuple_percent:      93.79
dead_tuple_count:   1118
dead_tuple_len:     80704
dead_tuple_percent: 0.05
free_space:         1218040
free_percent:       0.82
table_name:         job_status_log

table_len:          173416448
tuple_count:        132974
tuple_len:          117788200
tuple_percent:      67.92
dead_tuple_count:   10670
dead_tuple_len:     7792692
dead_tuple_percent: 4.49
free_space:         46081516
free_percent:       26.57
table_name:         mail

table_len:          191299584
tuple_count:        433378
tuple_len:          145551144
tuple_percent:      76.09
dead_tuple_count:   1042
dead_tuple_len:     862952
dead_tuple_percent: 0.45
free_space:         42068276
free_percent:       21.99
table_name:         sessions

table_len:          548552704
tuple_count:        5446169
tuple_len:          429602136
tuple_percent:      78.32
dead_tuple_count:   24992
dead_tuple_len:     1929560
dead_tuple_percent: 0.35
free_space:         93157980
free_percent:       16.98
table_name:         job_state_log

table_len:          639262720
tuple_count:        556415
tuple_len:          221505548
tuple_percent:      34.65
dead_tuple_count:   66688
dead_tuple_len:     27239728
dead_tuple_percent: 4.26
free_space:         380168112
free_percent:       59.47
table_name:         jobs

table_len:          791240704
tuple_count:        8311799
tuple_len:          700000052
tuple_percent:      88.47
dead_tuple_count:   39
dead_tuple_len:     3752
dead_tuple_percent: 0
free_space:         11397548
free_percent:       1.44
table_name:         cron_logs

table_len:          1612947456
tuple_count:        10854417
tuple_len:          1513084075
tuple_percent:      93.81
dead_tuple_count:   0
dead_tuple_len:     0
dead_tuple_percent: 0
free_space:         13014040
free_percent:       0.81
table_name:         documents_old_addresses

table_len:          1832091648
tuple_count:        13729360
tuple_len:          1600763725
tuple_percent:      87.37
dead_tuple_count:   598525
dead_tuple_len:     80535904
dead_tuple_percent: 4.4
free_space:         38817616
free_percent:       2.12
table_name:         statements

table_len:          3544350720
tuple_count:        64289703
tuple_len:          2828746932
tuple_percent:      79.81
dead_tuple_count:   648849
dead_tuple_len:     28549356
dead_tuple_percent: 0.81
free_space:         143528236
free_percent:       4.05
table_name:         ps_page

table_len:          4233355264
tuple_count:        22866609
tuple_len:          3285722981
tuple_percent:      77.62
dead_tuple_count:   231624
dead_tuple_len:     31142594
dead_tuple_percent: 0.74
free_space:         706351636
free_percent:       16.69
table_name:         injectd_log

table_len:          4927676416
tuple_count:        55919895
tuple_len:          4176606972
tuple_percent:      84.76
dead_tuple_count:   795011
dead_tuple_len:     58409884
dead_tuple_percent: 1.19
free_space:         279870944
free_percent:       5.68
table_name:         documents_ps_page

table_len:          4953735168
tuple_count:        44846317
tuple_len:          3346823052
tuple_percent:      67.56
dead_tuple_count:   2485971
dead_tuple_len:     183639396
dead_tuple_percent: 3.71
free_space:         1038200484
free_percent:       20.96
table_name:         latest_document_address_links

table_len:          23458062336
tuple_count:        89533157
tuple_len:          19772992448
tuple_percent:      84.29
dead_tuple_count:   2311467
dead_tuple_len:     502940946
dead_tuple_percent: 2.14
free_space:         2332408612
free_percent:       9.94
table_name:         document_address

table_len:          28510109696
tuple_count:        44844664
tuple_len:          21711695949
tuple_percent:      76.15
dead_tuple_count:   1134932
dead_tuple_len:     300674467
dead_tuple_percent: 1.05
free_space:         5988985892
free_percent:       21.01
table_name:         documents

Here are the pgstatindex results for our largest indices.  I assumed
that negative index sizes are a result of integer overflow and ordered
the results accordingly.
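
For reference, both the table and index numbers come from
contrib/pgstattuple; each block is the output of a call along these lines
(one table and one index shown):

db.production=> select * from pgstattuple('documents');
db.production=> select * from pgstatindex('documents_pkey');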

index_size:         1317961728
version:            2
tree_level:         3
root_block_no:      12439
internal_pages:     13366
leaf_pages:         1182318
empty_pages:        0
deleted_pages:      13775
avg_leaf_density:   -157.76
leaf_fragmentation: 37.87
index_name:         documents_pkey

index_size:         1346609152
version:            2
tree_level:         3
root_block_no:      10447
internal_pages:     1937
leaf_pages:         162431
empty_pages:        0
deleted_pages:      12
avg_leaf_density:   66.56
leaf_fragmentation: 26.48
index_name:         statements_pkey

index_size:         1592713216
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     723
leaf_pages:         177299
empty_pages:        0
deleted_pages:      16400
avg_leaf_density:   74.15
leaf_fragmentation: 5.58
index_name:         latest_document_address2_precedence_key

index_size:         1617821696
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     720
leaf_pages:         185846
empty_pages:        0
deleted_pages:      10921
avg_leaf_density:   78.8
leaf_fragmentation: 10.96
index_name:         documents_ps_page_ps_page_id_idx

index_size:         1629798400
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     728
leaf_pages:         188325
empty_pages:        0
deleted_pages:      9896
avg_leaf_density:   88.23
leaf_fragmentation: 0.66
index_name:         ps_page_pkey

index_size:         1658560512
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     740
leaf_pages:         191672
empty_pages:        0
deleted_pages:      10048
avg_leaf_density:   86.7
leaf_fragmentation: 1.03
index_name:         ps_page_ps_id_key

index_size:         -31956992
version:            2
tree_level:         3
root_block_no:      12439
internal_pages:     5510
leaf_pages:         475474
empty_pages:        0
deleted_pages:      39402
avg_leaf_density:   72.19
leaf_fragmentation: 3.02
index_name:         latest_document_address2_pkey

index_size:         -321863680
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     1809
leaf_pages:         479805
empty_pages:        0
deleted_pages:      3383
avg_leaf_density:   25.63
leaf_fragmentation: 40.05
index_name:         documents_id_idx

index_size:         -461504512
version:            2
tree_level:         3
root_block_no:      49813
internal_pages:     3023
leaf_pages:         456246
empty_pages:        0
deleted_pages:      8682
avg_leaf_density:   34.37
leaf_fragmentation: 66.83
index_name:         documents_city

index_size:         -1181884416
version:            2
tree_level:         3
root_block_no:      11036
internal_pages:     10003
leaf_pages:         822178
empty_pages:        0
deleted_pages:      72121
avg_leaf_density:   54.52
leaf_fragmentation: 3.37
index_name:         document_address_pkey

index_size:         -1267834880
version:            2
tree_level:         3
root_block_no:      32210
internal_pages:     2410
leaf_pages:         359867
empty_pages:        0
deleted_pages:      7245
avg_leaf_density:   53.31
leaf_fragmentation: 52.7
index_name:         documents_recipient

index_size:         -1327628288
version:            2
tree_level:         3
root_block_no:      27346
internal_pages:     2183
leaf_pages:         360040
empty_pages:        0
deleted_pages:      0
avg_leaf_density:   58.39
leaf_fragmentation: 50
index_name:         documents_magic_id_key

index_size:         -1447632896
version:            2
tree_level:         3
root_block_no:      44129
internal_pages:     1998
leaf_pages:         339111
empty_pages:        0
deleted_pages:      6465
avg_leaf_density:   50.12
leaf_fragmentation: 52.85
index_name:         documents_zip10

index_size:         -1472380928
version:            2
tree_level:         3
root_block_no:      81515
internal_pages:     2470
leaf_pages:         326170
empty_pages:        0
deleted_pages:      15913
avg_leaf_density:   38.21
leaf_fragmentation: 77.19
index_name:         documents_state

index_size:         -1483169792
version:            2
tree_level:         3
root_block_no:      47536
internal_pages:     1607
leaf_pages:         341421
empty_pages:        0
deleted_pages:      208
avg_leaf_density:   45.28
leaf_fragmentation: 46.48
index_name:         documents_account_number

index_size:         -1711841280
version:            2
tree_level:         3
root_block_no:      81517
internal_pages:     1149
leaf_pages:         296146
empty_pages:        0
deleted_pages:      18027
avg_leaf_density:   80.86
leaf_fragmentation: 7.14
index_name:         document_address_precedence_key