Re: autovacuum_work_mem - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: autovacuum_work_mem
Date
Msg-id CAM3SWZQ+FPueRRf-n4mn=k_Mn9ycRou1ZZARDNzY9kpCqChTaQ@mail.gmail.com
Whole thread Raw
In response to Re: autovacuum_work_mem  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: autovacuum_work_mem  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
> dead tuples, limiting their numbers.
>
> In what circumstances will the memory usage from multiple concurrent
> VACUUMs become a problem? In those circumstances, reducing
> autovacuum_work_mem will cause more passes through indexes, dirtying
> more pages and elongating the problem workload.

Yes, of course, but if we presume that the memory for autovacuum
workers to do everything in one pass simply isn't there, it's still
better to do multiple passes. Similarly, it's also sometimes (roughly
speaking) locally suboptimal but globally optimal to do tapesorts in
preference to in-memory quicksorts, even though, as you know, very
frequently tapesort is very considerably slower than quicksort.

Look at the folk wisdom for sizing maintenance_work_mem that is
floating around (for example, take a look at Greg Smith's
recommendations in his book). Setting it within postgresql.conf is
assumed. You can end up with a conservative value because you're
worrying about the worst case. The average case suffers. Especially
since, as you say, autovacuum only uses 6 bytes per tuple, and so
probably isn't all that likely to run out of working memory, making
that worst case (that is, maintenance_work_mem over-allocation by
autovacuum workers) very unlikely.

So on larger Heroku Postgres plans, the generic maintenance_work_mem
is on the low side, and I sometimes have to manually increase it when
I'm doing something like creating a new index. I would like to not
have to do that, and I would like to not require users to be aware of
this issue, especially since external sorting is so much slower.

I am inclined to think that we need an altogether more sophisticated
model, but this is an incremental improvement.

> Can we re-state what problem actually is here and discuss how to solve
> it. (The reference [2] didn't provide a detailed explanation of the
> problem, only the reason why we want a separate parameter).

It's principally a DBA feature, in that it allows the DBA to
separately control the memory used by very routine vacuuming, while
also having a less conservative default value for maintenance
operations that typically are under direct human control. Yes, this is
no better than just having maintenance_work_mem be equal to your
would-be autovacuum_work_mem setting in the first place, and having
everyone remember to set maintenance_work_mem dynamically. However,
sometimes people are ill-informed (more ill-informed than the person
that puts the setting in postgresql.conf), and other times they're
forgetful, and other times still they're using a tool like pg_restore
with no convenient way to dynamically set maintenance_work_mem. So, to
answer your question, yes: it is entirely possible that you or someone
like you may have no use for this.

It's often reasonable to assume that autovacuum workers are the only
processes that can allocate memory bound in size by
maintenance_work_mem that are not under the direct control of a human
performing maintenance. Autovacuum workers are in a sense just
servicing regular application queries (consider how Oracle handles
ROLLBACK garbage collection), and things that service regular
application queries are already bound separately by work_mem.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Extension Templates S03E11
Next
From: Robert Haas
Date:
Subject: Re: Performance Improvement by reducing WAL for Update Operation