Re: Vacuum rate limit in KBps - Mailing list pgsql-hackers

From: Alvaro Herrera
Subject: Re: Vacuum rate limit in KBps
Msg-id: 1328711414-sup-5164@alvh.no-ip.org
In response to: Re: Vacuum rate limit in KBps (Bruce Momjian <bruce@momjian.us>)
Responses: Re: Vacuum rate limit in KBps
List: pgsql-hackers
Excerpts from Bruce Momjian's message of Wed Feb 08 00:58:58 -0300 2012:

> As much as I hate to poo-poo a patch addition, I have to agree with
> Robert Haas on this one.  Renaming settings really isn't moving us
> forward.  It introduces a migration problem and really doesn't move us
> forward in solving the underlying problem.  Additional monitoring, while
> helpful, also is only a stop-gap.

I think that (part of) the underlying problem is that we have no clear
way to specify "how much I/O do you want autovacuum to use".  That's
what this patch is all about, AFAIU; it has nothing to do with
monitoring.  Right now, as has been said, the only way to tweak this is
to change vacuum_cost_delay; the problem with that setting is that
making the calculation is not straightforward.
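To illustrate the kind of arithmetic involved, here is a back-of-the-envelope
version of that calculation, assuming the stock settings (20ms autovacuum
cost delay, cost limit 200, page miss cost 10, page dirty cost 20, 8kB
blocks); this is a standalone sketch for illustration, not server code:

#include <stdio.h>

int
main(void)
{
    /* assumed defaults for the cost-based delay settings */
    const double cost_delay_ms   = 20.0;   /* autovacuum_vacuum_cost_delay */
    const double cost_limit      = 200.0;  /* vacuum_cost_limit */
    const double page_miss_cost  = 10.0;   /* vacuum_cost_page_miss */
    const double page_dirty_cost = 20.0;   /* vacuum_cost_page_dirty */
    const double block_kb        = 8.0;    /* 8kB blocks */

    /* cost credits available per second: 50 naps/sec * 200 credits */
    double credits_per_sec = (1000.0 / cost_delay_ms) * cost_limit;

    /* implied ceilings if every page is a miss, resp. gets dirtied */
    printf("read ceiling:  %.0f kB/s\n",
           credits_per_sec / page_miss_cost * block_kb);   /* ~8000 */
    printf("write ceiling: %.0f kB/s\n",
           credits_per_sec / page_dirty_cost * block_kb);  /* ~4000 */
    return 0;
}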

(Now, I disagree that it's so complex that it cannot ever be explained to
a class; or that it's so obscure that the only way to make it work is to
leave it alone and never touch it.  It's complex, okay, but it's not
exactly rocket science either.)

If the only real downside to this patch is that some people have already
changed vacuum_cost_delay and will want to migrate those settings
forward, maybe we shouldn't be looking at _replacing_ that setting with a
new one, but rather just adding the new setting; and in the code for
each, make sure that only one of them is set, throwing an error if the
other one is set as well.
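A minimal sketch of what I mean, written as ordinary C rather than the
actual GUC machinery, and with the new setting's name (vacuum_rate_limit_kbps
here) purely an assumption for illustration:

#include <stdio.h>
#include <stdlib.h>

/*
 * Reject configurations where both throttling knobs are set explicitly;
 * 0 stands for "not set" in this sketch.
 */
static void
check_vacuum_throttle_settings(int cost_delay_ms, int rate_limit_kbps)
{
    if (cost_delay_ms > 0 && rate_limit_kbps > 0)
    {
        fprintf(stderr,
                "ERROR: vacuum_cost_delay and vacuum_rate_limit_kbps "
                "are mutually exclusive; set only one of them\n");
        exit(1);
    }
}

int
main(void)
{
    check_vacuum_throttle_settings(20, 0);      /* fine: only the old knob */
    check_vacuum_throttle_settings(20, 8000);   /* error: both are set */
    return 0;
}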


> On Thu, Jan 19, 2012 at 09:42:52PM -0500, Robert Haas wrote:

> > Another problem is that the vacuum algorithm itself could, I think, be
> > made much smarter.  We could teach HOT to prune pages that contain no
> > HOT chains but do contain dead tuples.  That would leave dead line
> > pointers behind, but that's not nearly as bad as leaving the entire
> > tuple behind.  We could, as Simon and others have suggested, have one
> > threshold for vacuuming the heap (i.e. reclaiming dead tuples) and
> > another for vacuuming the indexes (i.e. reclaiming dead line
> > pointers).  That would open the door to partial vacuuming: just vacuum
> > half a gigabyte or so of the heap, and then move on; the next vacuum
> > can pick up where that one left off, at least up to the point where we
> > decide we need to make an index pass; it would possibly also allow us
> > to permit more than one vacuum on the same table at the same time,
> > which is probably needed for very large tables.  We could have
> > backends that see dead tuples on a page throw them over the fence
> > to the background writer for immediate pruning.  I blather, but I
> > guess my point is that I really hope we're going to do something
> > deeper here at some point in the near future, whatever becomes of the
> > proposals now on the table.

This is all fine, but what does it have to do with the current patch?  I
mean, if we change vacuum to do some stuff differently, it's still going
to have to read and dirty pages and thus account for I/O.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

