VACUUM FULL - Mailing list pgsql-admin

From Brian McCane
Subject VACUUM FULL
Date
Msg-id 20020510185436.F62674-100000@fw.mccons.net
Whole thread Raw
In response to Re: A couple of errors encountered in 7.1.3=>7.2.1-2 data migration  ("Nick Fankhauser" <nickf@ontko.com>)
Responses Re: VACUUM FULL
Re: VACUUM FULL
Re: VACUUM FULL
List pgsql-admin
Okay, I guess I misunderstood something about "VACUUM FULL".  My
understanding was that a VACUUM (without FULL), marked unused records for
reuse.  VACUUM FULL moved records from the end of a file to the holes
where unused records were stored and truncated the file to free disk
space.  So far so good, but....

I have had continued loss of performance on one of my largest tables
(600,000,000 records).  There are 4 fields in a record like such:

CREATE TABLE foo (
  a int not null references bar(a) on delete cascade on update no action,
  b int not null references baz(b) on delete cascade on update no action,
  c int,
  d smallint,
  primary key(a, b)) ;
CREATE INDEX foo_ac ON foo (a,c) ;
CREATE INDEX foo_ad on foo (a,d) ;
And there are 3 triggers which fire before insert/delete/update.

I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this
number is increasing on a daily basis as I make changes which improve the
performance of my data gathering tools (spiders ;).  Two days ago, it had
reached the point where a search for a 3-word term (ie. free news servers)
took about 60 seconds.  I have just spent 48 hours running a VACUUM FULL
on my table, and now the same search takes < 10 seconds.  I assume that
the increase in performance is due to the decrease in table/index size
which added up to approximate 1GB of freed space on the machine, which was
approximately 4% of the original size of the table and all its indices.
But, a 4% decrease in size should not add up to a 84% increase in
performance (is that right? I always get the ratio confused :).

If all that VACUUM FULL did was move records from file 12345678.6 to file
12345678, the database would still being doing a large number of random
accesses on the table.  However, if VACUUM FULL clusters the data
according to the primary key, it would still be doing a large number of
random access on the table, because the primary key has almost nothing to
do with how I actually access the data in real life.  So, is VACUUM FULL
looking somewhere in pg_statistics (or pg_stat_user_indexes), to determine
which index I actually use most (foo_ad), and then clustering the data
that way, or is there some other agent at work here.

- brian


Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"


pgsql-admin by date:

Previous
From: "Nick Fankhauser"
Date:
Subject: Re: A couple of errors encountered in 7.1.3=>7.2.1-2 data migration
Next
From: "Dan Langille"
Date:
Subject: Re: VACUUM FULL