Re: How to best use 32 15k.7 300GB drives? - Mailing list pgsql-performance

From Robert Schnabel
Subject Re: How to best use 32 15k.7 300GB drives?
Date
Msg-id 4D42F861.9000000@missouri.edu
Whole thread Raw
In response to Re: How to best use 32 15k.7 300GB drives?  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On 1/28/2011 11:00 AM, Scott Marlowe wrote:
> On Fri, Jan 28, 2011 at 9:39 AM, Robert Schnabel<schnabelr@missouri.edu>  wrote:
>> I can't do outside the database.  So yes, once the upload is done I run
>> queries that update every row for certain columns, not every column.  After
>> I'm done with a table I run a VACUUM ANALYZE.  I'm really not worried about
>> what my table looks like on disk.  I actually take other steps also to avoid
>> what you're talking about.
> It will still get bloated.  If you update one column in one row in pg,
> you now have two copies of that row in the database.  If you update 1
> column in 1M rows, you now have 2M rows in the database (1M "dead"
> rows, 1M "live" rows).  vacuum analyze will not get rid of them, but
> will free them up to be used in future updates / inserts.  Vacuum full
> or cluster will free up the space, but will lock the table while it
> does so.
>
> There's nothing wrong with whole table updates as part of an import
> process, you just have to know to "clean up" after you're done, and
> regular vacuum can't fix this issue, only vacuum full or reindex or
> cluster.

Those are exactly what I was referring to with my "other steps".  I just
don't always do them as soon as I'm done updating because sometimes I
want to query the table right away to find out something.  Yep, I found
out the hard way that regular VACUUM didn't help.
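For anyone landing on this thread later, the cleanup Scott describes can be sketched roughly as below. The table and index names (my_table, my_table_pkey) are hypothetical stand-ins, not from the original discussion:

```sql
-- Hypothetical table name for illustration (my_table).
-- After a whole-table UPDATE, check how many dead tuples were left behind:
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'my_table';

-- Plain VACUUM marks dead tuples reusable but does not shrink the file:
VACUUM ANALYZE my_table;

-- To actually reclaim the disk space, rewrite the table.
-- Note: both of these take an exclusive lock while they run.
VACUUM FULL my_table;
-- or, to rewrite and physically re-sort by an index at the same time:
CLUSTER my_table USING my_table_pkey;
```

The practical trade-off is the one discussed above: VACUUM FULL or CLUSTER reclaims the space from the dead row versions, but locks the table for the duration, so it is typically deferred until the import/update batch is fully done.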


