Re: Vacuuming on heavily changed databases - Mailing list pgsql-general

From Sam Mason
Subject Re: Vacuuming on heavily changed databases
Date
Msg-id 20080519172118.GA27988@frubble.xen.chris-lamb.co.uk
Whole thread Raw
In response to Re: Vacuuming on heavily changed databases  ("Harald Armin Massa" <haraldarminmassa@gmail.com>)
Responses Re: Vacuuming on heavily changed databases  (Sam Mason <sam@samason.me.uk>)
List pgsql-general
On Mon, May 19, 2008 at 04:59:42PM +0200, Harald Armin Massa wrote:
> On Mon, May 19, 2008 at 04:50:18PM +0200, Bohdan Linda wrote:
> > I would like to ask an opinion on vacuuming general. Imagine situation
> > that you have single table with 5 fields (one varchar). This table has
> > during the day
> >
> > - cca 620 000 inserts
> > - 0 updates
> > - cca 620 000 deletes
> >
> > The table is vacuumed daily, but somehow after several months I got to
> > size of ~50GB
>
> do not vacuum DAILY. set up autovacuum to run AT LEAST every minute.

This will help if the changes to the database are evenly distributed
throughout the day, if they're very spiky then you may want to run
a vacuum after one of these bulk changes.  I believe that on larger
databases for those tables that are incurring heavy modification the
admin would configure an associated session whose sole responsibility
would be to issue a never ending stream of VACUUMs.  This tends to imply
larger RAID arrays that can tolerate multiple concurrent read/write
requests.  I.e. something like the following, but with some error
checking:

  for t in foo bar baz
    do ( while echo "VACUUM $t;" && false ; do true ; done | psql ) &
  done

As an ongoing thing, PG records partially filled pages in its so-called
"Free Space Map".  The FSM gets updated by VACUUM and is used by the
data modification statements to find places where new data can be
written.  If the FSM is too small then PG will grow the table even
though there is free space in the table (because it doesn't know it
exists).  VACUUM will normally give error messages about the FSM map
being too small, that and the fact that you didn't mentioned this makes
me think you have already found an optimum value here.


  Sam

pgsql-general by date:

Previous
From: Miguel Arroz
Date:
Subject: Using "ident sameuser" with Mac OS X Leopard
Next
From: johnduffy@f2s.com
Date:
Subject: Re: pgxs question - linking c-functions to external libraries