Thread: VACUUM Question

VACUUM Question

From
"Oisin Glynn"
Date:
We have 2 tables we expect to grow by up to 50,000 rows per day each depending on the customer.  In normal operation we will most likely never update or delete from these tables as they are for historical reporting. (Eventually we may but a limit on the amount of data and delete older than X months or such)
We intend to create a number of indexes based upon the reporting search criteria.
 
What would the best setup be for VACUUM, ANALYSE, REINDEX.  Alot of the infor refers to data hanging around from deletes and updates which in normal course we will not do on these tables?
 
 
Oisin

Re: VACUUM Question

From
"Matthew T. O'Connor"
Date:
If you really are just inserting, and never updating or deleting, then you
will never need to vacuum the table, rather you will just need to ANALYSE
the table.  If you use autovacuum that is exactly what it will do.

As for Reindex, I'm not entirely sure, I don't think you would benefit
from reindex because you aren't updating or deleting.  Can anyone comment
on this?  Is is possibile that a table with lots of inserts resulting in
lots of page splits etc could ever benifit form REINDEX?

Matt


> We have 2 tables we expect to grow by up to 50,000 rows per day each
> depending on the customer.  In normal operation we will most likely never
> update or delete from these tables as they are for historical reporting.
> (Eventually we may but a limit on the amount of data and delete older than
> X months or such)
> We intend to create a number of indexes based upon the reporting search
> criteria.
>
> What would the best setup be for VACUUM, ANALYSE, REINDEX.  Alot of the
> infor refers to data hanging around from deletes and updates which in
> normal course we will not do on these tables?
>
>
> Oisin
>


Re: VACUUM Question

From
Michael Fuhr
Date:
On Thu, Jan 26, 2006 at 02:35:42PM -0500, Matthew T. O'Connor wrote:
> If you really are just inserting, and never updating or deleting, then you
> will never need to vacuum the table, rather you will just need to ANALYSE
> the table.

That's not quite true; the table must still be vacuumed occasionally
to prevent transaction ID wraparound failure, else you risk losing
data.

http://www.postgresql.org/docs/8.1/interactive/maintenance.html#VACUUM-FOR-WRAPAROUND

--
Michael Fuhr

Re: VACUUM Question

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> On Thu, Jan 26, 2006 at 02:35:42PM -0500, Matthew T. O'Connor wrote:
>> If you really are just inserting, and never updating or deleting, then you
>> will never need to vacuum the table, rather you will just need to ANALYSE
>> the table.

> That's not quite true; the table must still be vacuumed occasionally
> to prevent transaction ID wraparound failure,

Also, somebody made a real good point about rolled-back insertions.
Even if the only command you ever apply to the table is INSERT, you
could still have dead rows in the table if some of those transactions
occasionally roll back.

            regards, tom lane

Re: VACUUM Question

From
Chris Browne
Date:
matthew@zeut.net ("Matthew T. O'Connor") writes:
> If you really are just inserting, and never updating or deleting,
> then you will never need to vacuum the table, rather you will just
> need to ANALYSE the table.  If you use autovacuum that is exactly
> what it will do.

"Never" is a pretty long time...

You need a VACUUM every 2^31 transactions, but since there needs to be
such a vacuum for the whole database, that one will do...

> As for Reindex, I'm not entirely sure, I don't think you would benefit
> from reindex because you aren't updating or deleting.  Can anyone comment
> on this?  Is is possibile that a table with lots of inserts resulting in
> lots of page splits etc could ever benifit form REINDEX?

I could imagine a CLUSTER doing some good, and if that's the case,
REINDEX could have some favorable results.  But you'd better have a
real specific model as to why that would be...
--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/spreadsheets.html
Oh,  boy, virtual memory!  Now I'm  gonna make  myself a  really *big*
RAMdisk!

Re: VACUUM Question

From
"Matthew T. O'Connor"
Date:
> Also, somebody made a real good point about rolled-back insertions.
> Even if the only command you ever apply to the table is INSERT, you
> could still have dead rows in the table if some of those transactions
> occasionally roll back.

hmm... That's true.  I don't think autovacuum doesn't anything to account
for the concept of rolledback inserts.  I suppose in most real world
situations that number is going to be small enough to be ignored, but not
in all cases.  Is there anyway for the stats system to report the
information about rolledback inserts?  In fact autovacuum probably has a
similar deficiency for rolled back deletes but not a rolled back update.
Anyone think this is enough of an issue that it needs more attention?

Matt


Re: VACUUM Question

From
"Jim C. Nasby"
Date:
On Thu, Jan 26, 2006 at 04:14:45PM -0500, Chris Browne wrote:
> > As for Reindex, I'm not entirely sure, I don't think you would benefit
> > from reindex because you aren't updating or deleting.  Can anyone comment
> > on this?  Is is possibile that a table with lots of inserts resulting in
> > lots of page splits etc could ever benifit form REINDEX?
>
> I could imagine a CLUSTER doing some good, and if that's the case,
> REINDEX could have some favorable results.  But you'd better have a
> real specific model as to why that would be...

Aside from the cluster case, are there any issues with how page splits
in the b-tree are done that could lead to better performance after a
REINDEX?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461