Thread: autovacuum maintenance_work_mem
Magnus was just talking to me about having a better way of controlling memory usage on autovacuum. Instead of each worker using up to maintenance_work_mem, which ends up as a disaster when DBA A sets it to a large value and DBA B raises autovacuum_max_workers, we could simply have an "autovacuum_maintenance_memory" setting (name TBD) that defines the maximum amount of memory that autovacuum is going to use regardless of the number of workers.

So for the initial implementation, we could just have each worker set its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers. That way there's never excessive memory usage.

This implementation is not ideal, because most of the time the workers wouldn't use that much memory, and so vacuums could be slower. But I think it's better than what we currently have.

Thoughts?

(A future implementation could improve things by using something like the balancing code we have for cost_delay. But I don't want to go there now.)

--
Álvaro Herrera <alvherre@alvh.no-ip.org>
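For concreteness, a minimal sketch (not actual PostgreSQL code) of what the initial implementation described above might look like inside a worker. "autovacuum_maintenance_memory" is the proposed GUC, and the 1 MB floor is an assumption added here purely for illustration:

/*
 * Hypothetical sketch of the proposal: each autovacuum worker derives its
 * local maintenance_work_mem from a global cap divided by the maximum
 * number of workers, so the workers together can never exceed the cap.
 */
extern int maintenance_work_mem;            /* existing GUC, in kB */
extern int autovacuum_max_workers;          /* existing GUC */
extern int autovacuum_maintenance_memory;   /* proposed GUC (name TBD), in kB */

static void
av_worker_set_memory(void)
{
    int per_worker_kb = autovacuum_maintenance_memory / autovacuum_max_workers;

    /* keep a sane floor so a worker always has some memory (assumed 1 MB) */
    if (per_worker_kb < 1024)
        per_worker_kb = 1024;

    maintenance_work_mem = per_worker_kb;
}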
On Wed, Nov 17, 2010 at 01:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> So for the initial implementation, we could just have each worker set
> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
> That way there's never excessive memory usage.

It sounds reasonable, but is there the same issue for normal connections? We can limit max connections per user, but there is no quota for the total memory consumed by a user. It might not be an autovacuum-specific issue.

--
Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Wed, Nov 17, 2010 at 01:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> So for the initial implementation, we could just have each worker set
>> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
>> That way there's never excessive memory usage.

> It sounds reasonable, but is there the same issue for normal connections?
> We can limit max connections per user, but there is no quota for the total
> memory consumed by a user. It might not be an autovacuum-specific issue.

I agree with Itagaki-san: this isn't really autovacuum's fault.

Another objection to the above approach is that any time you have fewer than max_workers AV workers running, you're not using the memory well. And not using the memory well has a *direct* cost in terms of increased effort, i.e., extra index scans. So this isn't something to mess with lightly.

I can see the possible value of decoupling autovacuum's setting from foreground operations, though. What about creating autovacuum_maintenance_mem but defining it as the maintenance_work_mem setting that each AV worker can use? If people can't figure out that the total possible hit is maintenance_work_mem times max_workers, their license to use a text editor should be revoked.

			regards, tom lane
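A rough sketch of the decoupled setting Tom describes, assuming a convention (not an existing API) where -1 means "fall back to maintenance_work_mem":

/*
 * Hedged sketch: a separate per-worker setting that simply overrides
 * maintenance_work_mem inside autovacuum workers.  The GUC name and the
 * -1 convention are assumptions for illustration.
 */
extern int maintenance_work_mem;         /* existing GUC, in kB */
extern int autovacuum_maintenance_mem;   /* proposed GUC, in kB; -1 = unset */

static int
vacuum_memory_limit(int is_autovacuum_worker)
{
    if (is_autovacuum_worker && autovacuum_maintenance_mem >= 0)
        return autovacuum_maintenance_mem;
    return maintenance_work_mem;
}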
On 16.11.2010 18:12, Alvaro Herrera wrote:
> Thoughts?

Sounds reasonable, but you know what would be even better? Use less memory in vacuum, so that it doesn't become an issue to begin with. There was some discussion on that back in 2007 (http://archives.postgresql.org/pgsql-hackers/2007-02/msg01814.php). That seems like low-hanging fruit; it should be simple to switch to a more compact representation. I believe you could easily more than halve the memory consumption in typical scenarios.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
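To put rough numbers on that: each dead tuple currently costs sizeof(ItemPointerData) = 6 bytes in vacuum's dead-tuple array. One hypothetical compact layout (an illustration only, not the representation proposed in the linked thread) keys on the heap block and stores a per-page bitmap of dead line pointers, so any 8 kB page with more than about 7 dead tuples already comes out ahead:

#include <stdint.h>

/*
 * Illustrative sketch of a more compact dead-tuple store: one entry per
 * heap page with a bitmap of dead line pointers.  With roughly 291
 * possible line pointers on an 8 kB page, the bitmap is ~37 bytes, versus
 * 6 bytes per dead TID today.  Names and layout are assumptions, not an
 * existing PostgreSQL structure.
 */
#define MAX_OFFSETS_PER_PAGE 291                       /* ~MaxHeapTuplesPerPage, 8 kB pages */
#define BITMAP_BYTES ((MAX_OFFSETS_PER_PAGE + 7) / 8)  /* 37 bytes */

typedef struct DeadTuplePage
{
    uint32_t blockno;              /* heap block number */
    uint8_t  dead[BITMAP_BYTES];   /* bit i set => line pointer i+1 is dead */
} DeadTuplePage;

/* mark line pointer 'off' (1-based, as offsets are) dead on this page */
static inline void
dead_page_set(DeadTuplePage *p, unsigned off)
{
    p->dead[(off - 1) / 8] |= (uint8_t) (1 << ((off - 1) % 8));
}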
On Tue, Nov 16, 2010 at 11:12 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Magnus was just talking to me about having a better way of controlling
> memory usage on autovacuum. Instead of each worker using up to
> maintenance_work_mem, which ends up as a disaster when DBA A sets it to a
> large value and DBA B raises autovacuum_max_workers, we could simply
> have an "autovacuum_maintenance_memory" setting (name TBD) that defines
> the maximum amount of memory that autovacuum is going to use regardless
> of the number of workers.
>
> So for the initial implementation, we could just have each worker set
> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
> That way there's never excessive memory usage.
>
> This implementation is not ideal, because most of the time the workers
> wouldn't use that much memory, and so vacuums could be slower. But I
> think it's better than what we currently have.
>
> Thoughts?

I'm a little skeptical about creating more memory tunables. DBAs who are used to previous versions of PG will find that their vacuum is now really slow, because they adjusted maintenance_work_mem but not this new parameter. If we could divide up the vacuum memory intelligently between the workers in some way, that would be a win. But just creating a different variable that controls the same thing in different units doesn't seem to add much.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 11/16/10 9:27 AM, Robert Haas wrote:
> I'm a little skeptical about creating more memory tunables. DBAs who
> are used to previous versions of PG will find that their vacuum is now
> really slow, because they adjusted maintenance_work_mem but not this
> new parameter.

Also, generally people who are using autovacuum don't do much manual vacuuming, and when they do, it's easy enough to do a SET before you issue the VACUUM statement.

So, -1 for yet another GUC.

> If we could divide up the vacuum memory intelligently
> between the workers in some way, that would be a win. But just
> creating a different variable that controls the same thing in
> different units doesn't seem to add much.

Actually, that's not unreasonable. The difficulty with allocating work_mem out of a pool involves concurrency, but use of maintenance_work_mem is very low-concurrency; it wouldn't be that challenging to have the autovac workers pull from a pool of preset size instead of each being allocated the full maintenance_work_mem. And that would help with over- and under-allocation of memory.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Tue, Nov 16, 2010 at 1:36 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 11/16/10 9:27 AM, Robert Haas wrote:
>> I'm a little skeptical about creating more memory tunables. DBAs who
>> are used to previous versions of PG will find that their vacuum is now
>> really slow, because they adjusted maintenance_work_mem but not this
>> new parameter.
>
> Also, generally people who are using autovacuum don't do much manual
> vacuuming, and when they do, it's easy enough to do a SET before you
> issue the VACUUM statement.
>
> So, -1 for yet another GUC.
>
>> If we could divide up the vacuum memory intelligently
>> between the workers in some way, that would be a win. But just
>> creating a different variable that controls the same thing in
>> different units doesn't seem to add much.
>
> Actually, that's not unreasonable. The difficulty with allocating
> work_mem out of a pool involves concurrency, but use of maintenance_work_mem
> is very low-concurrency; it wouldn't be that challenging to have the
> autovac workers pull from a pool of preset size instead of each being
> allocated the full maintenance_work_mem. And that would help with over-
> and under-allocation of memory.

I think the difficulty is figuring out how to get the existing workers to give us some memory when a new one comes along. You want the first worker to potentially use ALL the memory... until worker #2 arrives.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, 2010-11-16 at 10:36 -0800, Josh Berkus wrote:
> On 11/16/10 9:27 AM, Robert Haas wrote:
>> I'm a little skeptical about creating more memory tunables. DBAs who
>> are used to previous versions of PG will find that their vacuum is now
>> really slow, because they adjusted maintenance_work_mem but not this
>> new parameter.
>
> Also, generally people who are using autovacuum don't do much manual
> vacuuming, and when they do, it's easy enough to do a SET before you
> issue the VACUUM statement.
>
> So, -1 for yet another GUC.

Agreed. If we are going to do anything, it should be pushed to a per-object level (at least per table) with ALTER TABLE. We don't need yet another global variable.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
> I think the difficulty is figuring out how to get the existing
> workers to give us some memory when a new one comes along. You want
> the first worker to potentially use ALL the memory... until worker #2
> arrives.

Yeah, doing this would mean that you couldn't give worker #1 all the memory, because on most OSes it can't release the memory even if it wants to.

The unintelligent way to do this is to do it by halves, which is what I think Oracle 8 did for sort memory, and what the old time-sharing systems used to do. That is, the first worker can use up to 50%. The second worker gets up to 25%, or whatever is left over from the first worker plus 25%, whichever is greater, up to 50%. The third worker gets a similar share, and so on up to max_workers. This kind of a system actually works pretty well, *except* that it causes underallocation in the case (the most common one, I might add) where only one worker is needed.

The intelligent way would be for Postgres to track the maximum number of concurrent workers that have been needed in the past, and allocate (again, by halves) based on that number.

Either of these systems would require some new counters in shared memory to track such things.

Relevant to this is the question: *when* does vacuum do its memory allocation? Is memory allocation reasonably front-loaded, or does vacuum keep grabbing more RAM until it's done?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
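A sketch of the "by halves" idea, simplified to "each arriving worker reserves half of whatever is still unreserved" and leaving out the rebalancing toward later workers described above. The pool GUC name and the use of a plain variable in place of shared-memory bookkeeping are assumptions for illustration:

/*
 * Allocation by halves: worker #1 reserves 50% of the pool, worker #2
 * gets 25%, and so on.  A real implementation would keep the reserved
 * counter in shared memory under a lock.
 */
extern int autovacuum_memory_pool_kb;   /* assumed total pool size, in kB */

static int pool_reserved_kb = 0;         /* stand-in for shared-memory state */

static int
reserve_by_halves(void)
{
    int free_kb = autovacuum_memory_pool_kb - pool_reserved_kb;
    int grant_kb = free_kb / 2;

    pool_reserved_kb += grant_kb;
    return grant_kb;                     /* worker uses this as its memory cap */
}

static void
release_reservation(int grant_kb)
{
    pool_reserved_kb -= grant_kb;        /* give the share back when vacuum ends */
}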
Excerpts from Josh Berkus's message of mar nov 16 15:52:14 -0300 2010:
>> I think the difficulty is figuring out how to get the existing
>> workers to give us some memory when a new one comes along. You want
>> the first worker to potentially use ALL the memory... until worker #2
>> arrives.
>
> Yeah, doing this would mean that you couldn't give worker #1 all the
> memory, because on most OSes it can't release the memory even if it
> wants to.

Hmm, good point.

> Relevant to this is the question: *when* does vacuum do its memory
> allocation? Is memory allocation reasonably front-loaded, or does
> vacuum keep grabbing more RAM until it's done?

All at start.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
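For context, a simplified paraphrase (types and the usual caps omitted) of the kind of up-front sizing lazy vacuum does: the dead-tuple array is allocated once from maintenance_work_mem and never grown afterwards; when it fills, vacuum makes an extra pass over the indexes instead. This is a sketch modelled loosely on vacuumlazy.c, not the actual code:

#include <stdlib.h>

#define ITEM_POINTER_SIZE 6   /* sizeof(ItemPointerData): a 6-byte heap TID */

/* size and allocate the dead-tuple array once, at the start of the vacuum */
static void *
alloc_dead_tuple_array(long maintenance_work_mem_kb, long *maxtuples_out)
{
    long maxtuples = (maintenance_work_mem_kb * 1024L) / ITEM_POINTER_SIZE;

    *maxtuples_out = maxtuples;
    return malloc((size_t) maxtuples * ITEM_POINTER_SIZE);
}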
>> Relevant to this is the question: *when* does vacuum do its memory
>> allocation? Is memory allocation reasonably front-loaded, or does
>> vacuum keep grabbing more RAM until it's done?
>
> All at start.

That means that "allocation by halves" would work fine.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Josh Berkus wrote:
>> I think the difficulty is figuring out how to get the existing
>> workers to give us some memory when a new one comes along. You want
>> the first worker to potentially use ALL the memory... until worker #2
>> arrives.
>
> Yeah, doing this would mean that you couldn't give worker #1 all the
> memory, because on most OSes it can't release the memory even if it
> wants to.

FYI, what normally happens in this case is that the memory is pushed to swap by the kernel and never paged in from swap.

--
Bruce Momjian <bruce@momjian.us>        http://momjian.us
EnterpriseDB                            http://enterprisedb.com

+ It's impossible for everything to be true. +