Re: Vacuum, Freeze and Analyze: the big picture - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Vacuum, Freeze and Analyze: the big picture
Date
Msg-id 1370266498.2693.YahooMailNeo@web162905.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Vacuum, Freeze and Analyze: the big picture  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Vacuum, Freeze and Analyze: the big picture  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
Craig Ringer <craig@2ndquadrant.com> wrote:
> On 06/02/2013 05:56 AM, Robert Haas wrote:

>> I agree with all that.  I don't have any data either, but I agree that
>> AFAICT it seems to mostly be a problem for large (terabyte-scale)
>> databases, or ones that are dreadfully short of I/O bandwidth.  AWS,
>> I'm looking at you.

> I've seen cases on Stack Overflow and elsewhere in which disk merge
> sorts perform vastly better than in-memory quicksort, so the user
> benefited from greatly *lowering* work_mem.

I have seen this a few times, to.  It would be interesting to
characterize the conditions under which this is the case.

>> (b) users
>> making ridiculous settings changes to avoid the problems caused by
>> anti-wraparound vacuums kicking in at inconvenient times and eating up
>> too many resources.

Where I hit a nightmare scenario with an anti-wraparound
autovacuum, personally, was after an upgrade using pg_dump piped to
psql.  At a high OLTP transaction load time (obviously the most
likely time for it to kick in, because it is triggered by xid
consumption), it started to READ AND REWRITE every heap page of
every table.  This overwhelmed the battery-backed write cache,
causing a series of "freezes" for a few minutes at a time, raising
a very large number of end-user complaints.  This is when I started
insisting on a VACUUM FREEZE ANALYZE after any bulk load before it
was considered complete and the database brought online for
production use.

> Some recent experiences I've had have also bought home to me that vacuum
> problems are often of the user's own making.
>
> "My database is slow"
> ->
> "This autovacuum thing is using up lots of I/O and CPU, I'll increase
> this delay setting here"
> ->
> "My database is slower"
> ->
> "Maybe I didn't solve the autovacuum thing, I'll just turn it
> off"
> ->
> "My database is barely working"
> ->
> "I'll whack in some manual VACUUM cron jobs during low load maintenance
> hours and hope that keeps the worst of the problem away, that's what
> random forum posts on the Internet say to do".
> -> "oh my, why did my DB just do an emergency shutdown?"

Yeah, I've seen exactly that sequence, and some variations on it
quite often.  In fact, when I was first using PostgreSQL I got as
far as "Maybe I didn't solve the autovacuum thing" but instead of
"I'll just turn it off" my next step was "I wonder what would
happen if I tried making it *more* aggressive so that it didn't
have so much work to do each time it fired?"  Of course, that
vastly improved things.  I have found it surprisingly difficult to
convince other people to try that, though.

I have seen people so convinced that vacuum (and particularly
autovacuum) are *so* evil that they turn off autovacuum and monitor
the freeze status of their tables and databases so that they can
run VACUUM "just in time" to prevent the emergency shutdown.
Obviously, this isn't great for their performance.  :-(

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: GRANT role_name TO role_name ON database_name
Next
From: Andres Freund
Date:
Subject: Re: Vacuum, Freeze and Analyze: the big picture