Re: REINDEX takes half a day (and still not complete!) - Mailing list pgsql-performance

From Jim Nasby
Subject Re: REINDEX takes half a day (and still not complete!)
Date
Msg-id 097E9E9A-BD13-4DDB-965E-EEFF80AB94C3@nasby.net
In response to Re: REINDEX takes half a day (and still not complete!)  (Kenneth Marshall <ktm@rice.edu>)
List pgsql-performance
On Apr 30, 2011, at 9:34 AM, Kenneth Marshall wrote:
>> I suppose that's what I am going to do on a periodic basis from now
>> on. There is a lot of DELETE/UPDATE activity. But I wonder if the
>> vacuum stuff really should do something that's similar in function?
>> What do the high-end enterprise folks do -- surely they can't be
>> dumping/restoring every quarter or so....or are they?
>>
>> Anyway, many many thanks to the lovely folks on this list. Much appreciated!
>>
>
> The autovacuum and space management in 9.0 is dramatically more effective
> and efficient then that of 8.2. Unless you have an odd corner-case there
> really should be no reason for a periodic dump/restore. This is not your
> grandmother's Oldsmobile... :)

In 10+ years of using Postgres, I've never come across a case where you actually *need* to dump and restore on a
regular basis. However, you can certainly run into scenarios where vacuum simply can't keep up. If your restored
database is 1/3 the size of the original then this is certainly what was happening on your 8.2 setup.

As Kenneth mentioned, 9.0 is far better in this regard than 8.2, though it's still possible that you're doing something
that will give it fits. I suggest that you run a weekly vacuumdb -av, capture that output and run it through pgFouine.
That will give you a ton of useful information about the amount of bloat you have in each table. I would definitely look
at anything with over 20% bloat.
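A minimal sketch of that weekly capture, suitable for cron; the log path, database user, and the pgFouine invocation details are assumptions (check your pgFouine version's docs for the exact report flags):

```shell
#!/bin/sh
# weekly-vacuum.sh -- capture verbose vacuum output for bloat analysis.
# Assumed: run as a user with access to all databases (e.g. postgres).

LOGFILE="/var/log/postgresql/vacuum-$(date +%Y%m%d).log"  # assumed path

# -a: vacuum all databases; -v: verbose per-table page/tuple details,
# which is what pgFouine's vacuum report parses for bloat estimates.
vacuumdb -av > "$LOGFILE" 2>&1

# Then feed the captured output to pgFouine's VACUUM report; the exact
# command line depends on your pgFouine install, e.g. something like:
#   pgfouine.php -file "$LOGFILE" -vacuum
# Review any table it flags at over 20% bloat.
```

A matching crontab entry (Sunday 3am, path assumed) might look like `0 3 * * 0 /usr/local/bin/weekly-vacuum.sh`.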

BTW, in case you're still questioning using Postgres in an enterprise setting: all of our production OLTP databases run
on Postgres. The largest one is ~1.5TB and does over 650TPS on average (with peaks that are much higher). Unplanned
downtime on that database would cost us well over $100k/hour, and we're storing financial information, so data quality
issues are not an option (data quality was one of the primary reasons we moved away from MySQL in 2006). So yes, you can
absolutely run very large Postgres databases in a high-workload environment. BTW, that's also on version 8.3.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


