Avoiding vacuum full on an UPDATE-heavy table - Mailing list pgsql-performance

From Bill Montgomery
Subject Avoiding vacuum full on an UPDATE-heavy table
Date
Msg-id 40AE274F.7020800@lulu.com
Whole thread Raw
Responses Re: Avoiding vacuum full on an UPDATE-heavy table
Re: Avoiding vacuum full on an UPDATE-heavy table
List pgsql-performance
All,

I have a particularly troublesome table in my 7.3.4 database. It
typically has less than 50k rows, and a usage pattern of about 1k
INSERTs, 50-100k UPDATEs, and no DELETEs per day. It is vacuumed and
analyzed three times per week. However, the performance of queries
performed on this table slowly degrades over a period of weeks, until
even a "select count(*)" takes several seconds. The only way I've found
to restore performance is to VACUUM FULL the table, which is highly
undesireable in our application due to the locks it imposes.

Here is the output of a psql session demonstrating the problem/solution.
Note the \timing output after each of the SELECTs:

qqqqqqqq=> vacuum analyze xxxx;
NOTICE:  VACUUM will be committed automatically
VACUUM
Time: 715900.74 ms
qqqqqqqq=> select count(*) from xxxx;
 count
-------
 17978
(1 row)

Time: 171789.08 ms
qqqqqqqq=> vacuum full verbose xxxx;
NOTICE:  VACUUM will be committed automatically
INFO:  --Relation public.xxxx--
INFO:  Pages 188903: Changed 60, reaped 188896, Empty 0, New 0; Tup
17987: Vac 1469, Keep/VTL 0/0, UnUsed 9120184, MinLen 92, MaxLen 468;
Re-using: Free/Avail. Space 1504083956/1504083872; EndEmpty/Avail. Pages
0/188901.
        CPU 6.23s/1.07u sec elapsed 55.02 sec.
INFO:  Index xxxx_yyyy_idx: Pages 29296; Tuples 17987: Deleted 1469.
        CPU 1.08s/0.20u sec elapsed 61.68 sec.
INFO:  Index xxxx_zzzz_idx: Pages 18412; Tuples 17987: Deleted 1469.
        CPU 0.67s/0.05u sec elapsed 17.90 sec.
INFO:  Rel xxxx: Pages: 188903 --> 393; Tuple(s) moved: 17985.
        CPU 15.97s/19.11u sec elapsed 384.49 sec.
INFO:  Index xxxx_yyyy_idx: Pages 29326; Tuples 17987: Deleted 17985.
        CPU 1.14s/0.65u sec elapsed 32.34 sec.
INFO:  Index xxxx_zzzz_idx: Pages 18412; Tuples 17987: Deleted 17985.
        CPU 0.43s/0.32u sec elapsed 13.37 sec.
VACUUM
Time: 566313.54 ms
qqqqqqqq=> select count(*) from xxxx;
 count
-------
 17987
(1 row)

Time: 22.82 ms


Is there any way to avoid doing a periodic VACUUM FULL on this table,
given the fairly radical usage pattern? Or is the (ugly) answer to
redesign our application to avoid this usage pattern?

Also, how do I read the output of VACUUM FULL?
http://www.postgresql.org/docs/7.3/interactive/sql-vacuum.html does not
explain how to interpret the output, nor has google helped. I have a
feeling that the full vacuum is compressing hundreds of thousands of
pages of sparse data into tens of thousands of pages of dense data, thus
reducing the number of block reads by an order of magnitude, but I'm not
quite sure how to read the output.

FWIW, this is last night's relevant output from the scheduled VACUUM
ANALYZE. 24 days have passed since the VACUUM FULL above:

INFO:  --Relation public.xxx--
INFO:  Index xxx_yyy_idx: Pages 30427; Tuples 34545: Deleted 77066.
        CPU 1.88s/0.51u sec elapsed 95.39 sec.
INFO:  Index xxx_zzz_idx: Pages 19049; Tuples 34571: Deleted 77066.
        CPU 0.83s/0.40u sec elapsed 27.92 sec.
INFO:  Removed 77066 tuples in 3474 pages.
        CPU 0.38s/0.32u sec elapsed 1.33 sec.
INFO:  Pages 13295: Changed 276, Empty 0; Tup 34540: Vac 77066, Keep 0,
UnUsed 474020.
        Total CPU 3.34s/1.29u sec elapsed 125.00 sec.
INFO:  Analyzing public.xxx


Best Regards,

Bill Montgomery

pgsql-performance by date:

Previous
From: Rod Taylor
Date:
Subject: Re: PostgreSQL caching
Next
From: Neil Conway
Date:
Subject: Re: PostgreSQL caching