A Tale of 2 algorithms - Mailing list pgsql-performance

From Colin Taylor
Subject A Tale of 2 algorithms
Date
Msg-id CAN6Zwvv_h_C7rDwRAQL_Ufb3xrF261bV0XSeyFB3oZJDsUDLTw@mail.gmail.com
Whole thread Raw
Responses Re: A Tale of 2 algorithms  (Craig Ringer <ringerc@ringerc.id.au>)
List pgsql-performance
Hi, previously I selected categorized data for update then updated counts or inserted a new record if it was a new category of data.

select all categories
update batches of categories
or insert batches [intermingled as they hit batch size]

Problem was the select was saturating the network (pulling back far more data than needed too)
So I switched to doing optimistic updates where I checked for 0 row updates and made inserts out of them.

optimistic update batches 
followed by insert batches

New problem massive table bloat. I'm losing gigabytes of disk an hour which I can only recover by clustering.
   
Now's the bit where I lose some of my audience by saying I'm having this bloat problem on 8.3.7 and 8.4.4 but not 9.0. I'd love to upgrade obviously but that's out of my hands and I've been told not an option in the short term.

My thoughts are: surely 0-row updates dont cause this or have impact on the vacuum. I'm still doing the same updates after all why have things degenerated so badly?
While it made sense to me that the dead tuples are now more in the middle of the table than the end somehow and since autovacuum starts from the back that might be the cause, but I've turned on full autovacuum logging and there is seemingly very little vaccuming going on in either scenario (we have a nightly scheduled cluster). In desperation I've also doubled the freespace map settings in 8.3 to the seemingly very large max_fsm_pages = 25000000  and max_fsm_relations = 200000 without improvement. 
 
Any suggestions? These are roughly 0.5 to 1TB databases with 8GB shared buffers and work mem set appropriately and otherwise running fine.

cheers
Colin

pgsql-performance by date:

Previous
From: Igor Neyman
Date:
Subject: Re: Postgres becoming slow, only full vacuum fixes it
Next
From: Виктор Егоров
Date:
Subject: Re: NestedLoops over BitmapScan question