Re: [HACKERS] Bulk update of large database - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Bulk update of large database
Date
Msg-id 7582.943119136@sss.pgh.pa.us
In response to Bulk update of large database  (Martin Weinberg <weinberg@osprey.astro.umass.edu>)
List pgsql-hackers
Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:
> This does work but appends the updates (until the next vacuum).
> For a 100GB database, this is too large of a storage overhead.
> Is there another good way?

There is no alternative; any sort of update operation will write a
new tuple value without first deleting the old.  This must be so
to preserve transaction semantics: if an error occurs later on
during the update (e.g., violation of a unique-index constraint) the
old tuple value must still be there.

The only answer I can see is to update however many tuples you can
spare the space for, commit the transaction, vacuum, repeat.
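
As a rough sketch of that loop (not a tested recipe): the Python below
splits an integer key range into batches and issues update, commit,
vacuum for each one.  The table name, key column, SET clause, and the
execute callable are all hypothetical; the batch-size estimate assumes
each updated row costs about one extra tuple of space and ignores index
overhead.

```python
from typing import Callable, Iterator, Tuple


def rows_per_batch(spare_bytes: int, avg_tuple_bytes: int) -> int:
    """Rough estimate: each updated row leaves one extra dead tuple behind."""
    if avg_tuple_bytes <= 0:
        raise ValueError("avg_tuple_bytes must be positive")
    return max(1, spare_bytes // avg_tuple_bytes)


def key_ranges(lo: int, hi: int, batch_rows: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) key ranges that together cover [lo, hi)."""
    if batch_rows <= 0:
        raise ValueError("batch_rows must be positive")
    start = lo
    while start < hi:
        end = min(start + batch_rows, hi)
        yield start, end
        start = end


def batched_update(execute: Callable[[str], object], table: str, key: str,
                   set_clause: str, lo: int, hi: int, batch_rows: int) -> int:
    """Update one key range at a time: update, commit, vacuum, repeat.

    `execute` is a hypothetical callable that sends one SQL statement to
    the server.  Returns the number of batches processed.
    """
    batches = 0
    for start, end in key_ranges(lo, hi, batch_rows):
        execute("BEGIN")
        execute(f"UPDATE {table} SET {set_clause} "
                f"WHERE {key} >= {start} AND {key} < {end}")
        execute("COMMIT")
        # Vacuum after the commit so the dead tuples from this batch
        # are reclaimed before the next batch adds more.
        execute(f"VACUUM {table}")
        batches += 1
    return batches
```

Passing a list's append method as execute, for example
batched_update(log.append, "stars", "id", "flux = flux * 2", 0, 10, 4),
collects the statements without touching a database, which is a cheap
way to inspect the plan before running it for real.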

The need for repeated vacuums in this scenario is pretty annoying.
It'd be nice if we could recycle dead tuples without a full vacuum.
Offhand I don't see any way to do it without introducing performance
penalties elsewhere...
        regards, tom lane

