Thread: RE: [GENERAL] Benchmarks

RE: [GENERAL] Benchmarks

From
Michael J Davis
Date:
Maybe vacuum should be changed to automatically drop all indexes, vacuum,
and re-create all indexes and stop trying to rebuild each index.

> -----Original Message-----
> From:    Culberson, Philip [SMTP:philip.culberson@dat.com]
> Sent:    Thursday, January 06, 2000 1:58 PM
> To:    'Bruce Momjian'; Dustin Sallings
> Cc:    The Hermit Hacker; pgsql-general@hub.org
> Subject:    RE: [GENERAL] Benchmarks
>
> In his very insightful post last week, Mike Mascari pointed out that, on
> tables with heavy insert/updates, it was much faster to drop the index,
> vacuum analyze, and then rebuild the index.  Maybe in vacuum there is a
> specific inefficiency in what Mike coined "defragment"ing indexes.
>
> [Snip]
>
> 8. Not running VACUUM - PostgreSQL won't use indexes, or won't optimize
> correctly unless the record count and dispersion estimates are up-to-date.
> People have reported problems with running vacuum while under heavy load.
> We
> haven't seen it, but we run vacuum each night at 4:05 a.m. However, if you
> perform a LARGE number of INSERTS/UPDATES, it is better for you to do the
> following:
>
> DROP INDEX index_on_heavilty_used_table;
> VACUUM ANALYZE;
> CREATE INDEX index_on_heavily_used_table;
>
> Because VACUUM will sit there, and, row by row, essentially "defragment"
> your indexes, which can take damn near forever for any number of updates
> or
> deletes greater than, say, 30,000 rows.
>
> [Snip]
>
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> Sent: Thursday, January 06, 2000 10:14 AM
> To: Dustin Sallings
> Cc: The Hermit Hacker; pgsql-general@hub.org
> Subject: Re: [GENERAL] Benchmarks
>
>
> >     Untrue, vacuum is *extremely* important for updating statistics.
> > If you have a lot of data in a table, and you have never vacuumed, you
> > might as well not have any indices.  It'd be nice if you could seperate
> > the stat update from the storage reclaim.  Actually, it'd be nice if you
> > could reuse storage, so that an actual vacuum wouldn't be necessary
> unless
> > you just wanted to free up disk space you might end up using again
> anyway.
> >
> >     The vacuum also doesn't seem to be very efficient.  In one of my
> > databases, a vacuum could take in excess of 24 hours, while I've written
> a
> > small SQL script that does a select rename and a insert into select from
> > that will do the same job in about ten minutes.  This is a database that
> > cannot lock for more than a few minutes.
>
> This is serious.  Why would an INSERT / RENAME be so much faster.  Are
> we that bad with VACUUM?
>
> --
>   Bruce Momjian                        |  http://www.op.net/~candle
>   maillist@candle.pha.pa.us            |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>
> ************
>
> ************

Re: [GENERAL] Benchmarks

From
Adriaan Joubert
Date:
Michael J Davis wrote:

> Maybe vacuum should be changed to automatically drop all indexes, vacuum,
> and re-create all indexes and stop trying to rebuild each index.

Those were my thoughts, but I think there may be a problem in that the indexes
will get lost if vacuum crashes before restoring the indexes. That is pretty
undesirable. Also, if there are relatively few updates to a large table
building a new index is going to take longer than the vacuum, which is actually
relatively quick if done regularly. Dunno what the best solution is.

I think it will be very hard to do away with vacuum completely: one may be able
to reuse space, but with variable length records there is always going to be
some fragmentation. Of course vacuumes may become less important if some space
can be reused?

Adriaan