Re: How to best use 32 15k.7 300GB drives? - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: How to best use 32 15k.7 300GB drives?
Msg-id AANLkTi=3j3PW=Lkrwu2XiWdS4jCy0TcLLO4L2h3=9U0G@mail.gmail.com
In response to Re: How to best use 32 15k.7 300GB drives?  (Robert Schnabel <schnabelr@missouri.edu>)
List pgsql-performance
On Fri, Jan 28, 2011 at 9:39 AM, Robert Schnabel <schnabelr@missouri.edu> wrote:
> I can't do outside the database.  So yes, once the upload is done I run
> queries that update every row for certain columns, not every column.  After
> I'm done with a table I run a VACUUM ANALYZE.  I'm really not worried about
> what my table looks like on disk.  I actually take other steps also to avoid
> what you're talking about.

It will still get bloated.  If you update one column in one row in pg,
you now have two copies of that row in the database.  If you update 1
column in 1M rows, you now have 2M row versions in the table (1M "dead"
rows, 1M "live" rows).  Vacuum analyze will not get rid of them, but
will mark their space free for reuse by future updates / inserts.
Vacuum full or cluster will actually give the space back, but will lock
the table while it does so.
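
For example, a minimal sketch of watching the bloat happen (the table
"t" and column "flag" are made-up names here; the stats view is
pg_stat_user_tables):

    -- start with ~1M live rows in t, then update one column in every row
    UPDATE t SET flag = true;

    -- roughly 1M dead tuples now sit alongside the 1M live ones
    SELECT n_live_tup, n_dead_tup
      FROM pg_stat_user_tables
     WHERE relname = 't';

    -- marks the dead tuples' space reusable, but the file stays big
    VACUUM ANALYZE t;
    SELECT pg_size_pretty(pg_relation_size('t'));  -- still ~2x the live size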

There's nothing wrong with whole-table updates as part of an import
process; you just have to know to "clean up" after you're done.
Regular vacuum can't shrink the table once it's bloated; only vacuum
full or cluster can, plus reindex for bloated indexes.
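
Something like this at the end of the import, assuming the same made-up
table "t" and an index t_pkey to order by:

    -- rewrites the table down to just the live rows; takes an exclusive lock
    VACUUM FULL t;
    REINDEX TABLE t;   -- pre-9.0 vacuum full tends to bloat the indexes too

    -- or rewrite and physically order the table by an index in one step
    CLUSTER t USING t_pkey;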
