Re: poor VACUUM performance on large tables - Mailing list pgsql-performance

From Thomas F. O'Connell
Subject Re: poor VACUUM performance on large tables
Date
Msg-id DBFDD577-FB38-4E9A-AF66-3C802A010957@sitening.com
Whole thread Raw
In response to poor VACUUM performance on large tables  (Jan Peterson <jan.l.peterson@gmail.com>)
List pgsql-performance
On Sep 4, 2005, at 1:16 AM, Jan Peterson wrote:

> Hello,
>
> We have been experiencing poor performance of VACUUM in our production
> database.  Relevant details of our implementation are as follows:
>
> 1.  We have a database that grows to about 100GB.
> 2.  The database is a mixture of large and small tables.
> 3.  Bulk data (stored primarily in pg_largeobject, but also in various
> TOAST tables) comprises about 45% of our data.
> 4.  Some of our small tables are very active, with several hundred
> updates per hour.
> 5.  We have a "rolling delete" function that purges older data on a
> periodic basis to keep our maximum database size at or near 100GB.
>
> Everything works great until our rolling delete kicks in.  Of course,
> we are doing periodic VACUUMS on all tables, with frequent VACUUMs on
> the more active tables.  The problem arises when we start deleting the
> bulk data and have to VACUUM pg_largeobject and our other larger
> tables.  We have seen VACUUM run for several hours (even tens of
> hours).  During this VACUUM process, our smaller tables accumulate
> dead rows (we assume because of the transactional nature of the
> VACUUM) at a very rapid rate.  Statistics are also skewed during this
> process and we have observed the planner choosing sequential scans on
> tables where it is obvious that an index scan would be more efficient.
>
> We're looking for ways to improve the performance of VACUUM.  We are
> already experimenting with Hannu Krosing's patch for VACUUM, but it's
> not really helping (we are still faced with doing a database wide
> VACUUM about once every three weeks or so as we approach the
> transaction id rollover point... this VACUUM has been measured at 28
> hours in an active environment).
>
> Other things we're trying are partitioning tables (rotating the table
> that updates happen to and using a view to combine the sub-tables for
> querying).  Unfortunately, we are unable to partition the
> pg_largeobject table, and that table alone can take up 40+% of our
> database storage.  We're also looking at somehow storing our large
> objects externally (as files in the local file system) and
> implementing a mechanism similar to Oracle's bfile functionality.  Of
> course, we can't afford to give up the transactional security of being
> able to roll back if a particular update doesn't succeed.
>
> Does anyone have any suggestions to offer on good ways to proceed
> given our constraints?  Thanks in advance for any help you can
> provide.
>
>         -jan-

Do you have your Free Space Map settings configured appropriately?
See section 16.4.3.2 of the docs:

http://www.postgresql.org/docs/8.0/static/runtime-config.html#RUNTIME-
CONFIG-RESOURCE

You'll want to run a VACUUM VERBOSE and note the numbers at the end,
which describe how many pages are used and how many are needed.
max_fsm_pages should be set according to that, and you can set
max_fsm_relations based on it, too, although typically one knows
roughly how many relations are in a database.

http://www.postgresql.org/docs/8.0/static/sql-vacuum.html

Finally, have you experimented with pg_autovacuum, which is located
in contrib in the source tarballs (and is integrated into the backend
in 8.1 beta and beyond)? You don't really say how often you're
running VACUUM, and it might be that you're not vacuuming often enough.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)

pgsql-performance by date:

Previous
From: Ernst Einstein
Date:
Subject: Re: Poor performance on HP Package Cluster
Next
From: "Luke Lonergan"
Date:
Subject: Re: Poor performance on HP Package Cluster