On Tue, 2012-11-27 at 22:16 -0300, Claudio Freire wrote:
> Updates are faster if batched, if your business logic allows it,
> because batching creates less bloat and more opportunities for
> HOT updates. I don't think it applies to inserts, though, and I
> haven't heard it either.
Huge updates (e.g. UPDATE with no WHERE clause) are less likely to
benefit from HOT. HOT has two main optimizations:
1. Remove dead tuples faster without waiting for VACUUM -- this only
works if the transaction that updated/deleted the tuple actually
finished (otherwise the tuple can't be removed yet), so it only benefits
the *next* update to come along. But if it's one big update, then VACUUM
is probably just as good at cleaning up the space.
2. Avoid creating new index entries for the new tuple version by reusing
the old index entries -- this only works if the new version fits on the
same page as the old one (and no indexed column changed), but large
updates tend to fill pages up (because of the buildup of dead tuples)
and force new versions to go to new pages.
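
To make the two points above concrete, here is a toy Python model. It is
only a sketch, not how PostgreSQL is implemented: pages are slot
counters, every tuple is the same size, no other snapshots hold back
pruning, and no indexed column changes. The names (Page, simulate,
commit_every) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    capacity: int           # total tuple slots on the page (toy assumption)
    live: int               # live tuple versions stored on this page
    dead_pending: int = 0   # old versions whose updater has not committed yet
    dead_prunable: int = 0  # old versions whose updater has committed

    def used(self) -> int:
        return self.live + self.dead_pending + self.dead_prunable


def simulate(num_pages, capacity, live_per_page, commit_every):
    """Update every original row once; return (hot, non_hot) update counts."""
    pages = [Page(capacity, live_per_page) for _ in range(num_pages)]
    hot = non_hot = 0
    since_commit = 0
    for page in pages:
        for _ in range(live_per_page):
            if page.used() >= page.capacity:
                # Pruning can only reclaim versions from committed updaters.
                page.dead_prunable = 0
            if page.used() < page.capacity:
                # New version fits on the same page: HOT-style update.
                hot += 1
                page.dead_pending += 1  # old version stays until pruned
            else:
                # Page is full: the new version goes to some other page.
                non_hot += 1
                page.live -= 1
                page.dead_pending += 1
            since_commit += 1
            if since_commit == commit_every:
                # Commit: pending dead versions become prunable everywhere.
                for p in pages:
                    p.dead_prunable += p.dead_pending
                    p.dead_pending = 0
                since_commit = 0
    return hot, non_hot
```

With 100 pages of 10 slots holding 8 rows each, simulate(100, 10, 8, 1)
(one commit per update) gives (800, 0) in this model, while
simulate(100, 10, 8, 800) (one big transaction) gives (200, 600): only
the first two updates per page find free space.
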
HOT is essentially designed for lots of small updates, which didn't
perform well before PG 8.3.
Batching of inserts/updates/deletes has a big benefit over separate
transactions, but only up to a point, after which it levels off. I'm not
sure exactly when that point is, but after that, the downsides of
keeping a transaction open (like inability to remove the previous
version of an updated tuple) take over.
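
The "levels off" part can be sketched with a back-of-the-envelope cost
model. I am assuming here that the benefit comes mostly from a fixed
per-commit overhead being spread over more rows; the numbers (50 us per
row, 5000 us per commit) are invented for illustration, not measured,
and the model deliberately ignores the long-transaction downsides
mentioned above.

```python
def total_cost_us(n_rows, batch_size, row_cost_us=50, commit_cost_us=5000):
    """Toy cost: per-row work plus one fixed commit cost per batch.

    The default costs are hypothetical, chosen only to show the shape.
    """
    n_commits = (n_rows + batch_size - 1) // batch_size  # ceiling division
    return n_rows * row_cost_us + n_commits * commit_cost_us


def sweep(n_rows=10_000, batch_sizes=(1, 10, 100, 1_000, 10_000)):
    """Return {batch_size: total cost in microseconds} for each batch size."""
    return {b: total_cost_us(n_rows, b) for b in batch_sizes}
```

In this model, going from batches of 1 to 100 rows cuts the total from
50,500,000 us to 1,000,000 us, but going from 1,000 to 10,000 only moves
it from 550,000 us to 505,000 us. Past that point there is little left
to gain, so the costs of a long-open transaction start to dominate.
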
Regards,
Jeff Davis