On 1/28/2011 11:00 AM, Scott Marlowe wrote:
> On Fri, Jan 28, 2011 at 9:39 AM, Robert Schnabel <schnabelr@missouri.edu> wrote:
>> I can't do outside the database. So yes, once the upload is done I run
>> queries that update every row for certain columns, not every column. After
>> I'm done with a table I run a VACUUM ANALYZE. I'm really not worried about
>> what my table looks like on disk. I actually take other steps as well to
>> avoid what you're talking about.
> It will still get bloated. If you update one column in one row in pg,
> you now have two copies of that row in the database. If you update 1
> column in 1M rows, you now have 2M row versions in the database (1M
> "dead" rows, 1M "live" rows). Vacuum analyze will not shrink the table;
> it only marks the dead space as reusable by future updates / inserts.
> Vacuum full or cluster will actually give the space back, but will lock
> the table while it does so.
>
> There's nothing wrong with whole-table updates as part of an import
> process; you just have to know to "clean up" after you're done, and
> regular vacuum can't fix this issue: only vacuum full, cluster, or
> reindex can.
Those are exactly the "other steps" I was referring to. I just don't
always run them as soon as I'm done updating, because sometimes I want
to query the table right away to check something. Yep, I found out the
hard way that regular VACUUM didn't help.
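
For anyone skimming the thread, a rough sketch of the sequence we're
describing, with made-up table, column, and index names:

    -- whole-table update as part of the import process
    UPDATE genotypes SET quality_score = quality_score * 10;

    -- every updated row leaves a dead copy behind; the bloat shows up
    -- in pg_stat_user_tables and in the on-disk size
    SELECT n_live_tup, n_dead_tup,
           pg_size_pretty(pg_total_relation_size('genotypes'))
    FROM pg_stat_user_tables
    WHERE relname = 'genotypes';

    -- plain VACUUM marks the dead space reusable for later writes,
    -- but does not shrink the table file
    VACUUM ANALYZE genotypes;

    -- to actually give the space back, rewrite the table
    -- (takes an exclusive lock for the duration)
    CLUSTER genotypes USING genotypes_pkey;  -- or: VACUUM FULL genotypes;
    ANALYZE genotypes;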