Re: autovacuum not prioritising for-wraparound tables - Mailing list pgsql-hackers

From: Christopher Browne
Subject: Re: autovacuum not prioritising for-wraparound tables
Date:
Msg-id: CAFNqd5Vwf2TWUHNFwUy25z6HnSOTJq59b60=zR+KsV+qWXAJ_g@mail.gmail.com
In response to: Re: autovacuum not prioritising for-wraparound tables (Robert Haas <robertmhaas@gmail.com>)
Responses:
  Re: autovacuum not prioritising for-wraparound tables (Tom Lane <tgl@sss.pgh.pa.us>)
  Re: autovacuum not prioritising for-wraparound tables (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On Fri, Feb 1, 2013 at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jan 31, 2013 at 3:18 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> My intention was to apply a Nasby correction to Browne Strength and call
>> the resulting function Browne' (Browne prime).  Does that sound better?
>
> /me rests head in hands.  I'm not halfway clever enough to hang with
> this crowd; I'm not even going to touch the puns in Chris' reply.

It's Friday...  Fun needs to be had :-).

>> Now seriously, I did experiment a bit with this and it seems to behave
>> reasonably.  Of course, there might be problems with it, and I don't
>> oppose changing the name.  "Vacuum strength" didn't sound so great,
>> so I picked the first term that came to mind.  It's not like picking
>> people's last names to name stuff is a completely new idea; that said,
>> it was sort of a joke.
>
> I don't think I really understand the origin of the formula, so
> perhaps if someone would try to characterize why it seems to behave
> reasonably that would be helpful (at least to me).
>
>> f(deadtuples, relpages, age) =
>>     deadtuples/relpages + e ^ (age*ln(relpages)/2^32)
>
> To maybe make that discussion go more quickly let me kvetch about a
> few things to kick things off:
>
> - Using deadtuples/relpages as part of the formula means that tables
>   with smaller tuples (thus more tuples per page) will tend to get
>   vacuumed before tables with larger tuples (thus fewer tuples per
>   page).  I can't immediately see why that's a good thing.

That wasn't intentional, and may be somewhat unfortunate.  I picked
values that I knew could be easily grabbed, and we don't have an
immediate tuples-per-page estimate on pg_class.  An estimate should be
available in pg_statistic; I'm not sure that the bias from this hurts
things badly.
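The proposed function is easy to play with outside the server.  Here is
a minimal Python sketch of it (function and argument names are mine,
not anything from the patch), which makes it easy to see how the two
terms trade off:

```python
import math

TWO_32 = 2.0 ** 32

def browne_strength(deadtuples, relpages, age):
    # f(deadtuples, relpages, age) =
    #     deadtuples/relpages + e^(age * ln(relpages) / 2^32)
    return deadtuples / relpages + math.exp(age * math.log(relpages) / TWO_32)

# A table far from wraparound is ranked almost entirely by its
# dead-tuple density (the second term sits near e^0 = 1)...
young = browne_strength(deadtuples=50000, relpages=1000, age=1_000_000)

# ...while a table nearing wraparound is dominated by the second term
# even with no dead tuples at all.
old = browne_strength(deadtuples=0, relpages=1000, age=3_800_000_000)
```

For the "young" table above the result is about 51 (50 from dead-tuple
density, ~1 from the exponential); for the "old" one the exponential
term alone is in the hundreds.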
> - It's probably important to have a formula where we can be sure that
>   the wrap-around term will eventually dominate the dead-tuple term,
>   with enough time to spare to make sure nothing really bad happens;
>   on the other hand, it's also desirable to avoid the case where a
>   table that has just crossed the threshold for wraparound vacuuming
>   immediately shoots to the top of the list even if it isn't truly
>   urgent.  It's unclear to me just from looking at this formula how
>   well the second term meets those goals.

I think the second term *does* provide a way for wraparound to
dominate; splitting it apart a bit...  Consider:

     age * ln(relpages)
  e^ ------------------
           2^32

The wraparound portion of this involves age/2^32.  In the beginning,
the numerator is near zero while the denominator is 2^32, so the
exponent is roughly 0 and the whole term is roughly 1.  As age trends
towards 2^32, the fraction (ignoring ln(relpages)) trends towards 1,
so the longer we go without vacuuming, the closer the fraction gets to
1.  Ignoring the relpages part, that *tends* to give you something
looking like e^1, or 2.71828+.

I threw in multiplying by ln(relpages) as a way to step Well Back from
rollover; that means that this term will start growing considerably
before rollover, and the larger the table, the sooner that growth
takes place.

There is a problem with the ln(relpages) term: if the table has just 1
page, then ln(relpages) = 0, so the value of the exponential term is
*always* 1.  We should probably have ln(relpages + CONSTANT) so that
we guarantee the numerator is never 0.

I'm a bit worried that the exponential term might dominate *too*
quickly.
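The single-page problem is easy to demonstrate numerically.  A small
sketch (the `fudge` parameter stands in for the suggested CONSTANT;
its value of 1 here is just an illustrative guess):

```python
import math

TWO_32 = 2.0 ** 32

def wraparound_term(relpages, age, fudge=0):
    # With relpages = 1 and fudge = 0, ln(1) = 0, so the exponent is 0
    # and the term is pinned at 1.0 no matter how large age grows.
    return math.exp(age * math.log(relpages + fudge) / TWO_32)

stuck = wraparound_term(relpages=1, age=4_000_000_000)           # exactly 1.0
fixed = wraparound_term(relpages=1, age=4_000_000_000, fudge=1)  # grows with age
```

With the fudge constant, even a one-page table's term climbs towards
e^ln(2) ≈ 2 as age approaches 2^32, so it can eventually outrank
dead-tuple-only candidates instead of being stuck forever at 1.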
For a table I have handy with 163K tuples spread across 3357 pages,
ln(relpages) = 8.1188, and the range of the "exponential bit" travels
like follows:

dotpro0620@localhost-> select generate_series(1,20)*100/20 as percent_wraparound,
       power(2.71828, (65536.0*32768.0*generate_series(1,20)/20.0 * ln(3357))/(65536.0*32768)) as wraparound_term;
 percent_wraparound |  wraparound_term
--------------------+------------------
                  5 | 1.50071232210687
                 10 | 2.2521374737234
                 15 | 3.37981045789535
                 20 | 5.07212320054923
                 25 | 7.61179778630838
                 30 | 11.4231187312988
                 35 | 17.1428150369499
                 40 | 25.7264337615498
                 45 | 38.607976149824
                 50 | 57.9394655396491
                 55 | 86.950469871638
                 60 | 130.48764154935
                 65 | 195.824411555774
                 70 | 293.876107391077
                 75 | 441.023495534592
                 80 | 661.849394087408
                 85 | 993.24554108594
                 90 | 1490.57582238538
                 95 | 2236.92550368832
                100 | 3356.98166702019
(20 rows)

At the beginning, the "wraparound" portion is just 1.5, so it is
easily dominated by a table with a lot of dead tuples.  As the time to
wraparound declines, the term becomes steadily more urgent.  There may
be constant factors to fiddle with at the edges, but this term
definitely heads towards dominance.

That's definitely doing what I intended, and after constructing that
table, I think I'm *more* confident.  Consider that if there are two
tables of different sizes, both head towards "maxing out" at a
"wraparound_term" value that is directly correlated with the size of
each table, which seems mighty right.  A bigger table needs to get
drawn into play (e.g. - needs to get vacuumed) earlier than a smaller
one.

> - More generally, it seems to me that we ought to be trying to think
>   about the units in which these various quantities are measured.
>   Each term ought to be unit-less.  So perhaps the first term ought
>   to divide dead tuples by total tuples, which has the nice property
>   that the result is a dimensionless quantity that never exceeds 1.0.
>   Then the second term can be scaled somehow based on that value.
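The "maxing out" value is not just correlated with table size; at 100%
of wraparound the exponent is exactly ln(relpages), so the term tops
out at e^ln(relpages) = relpages itself.  A quick check (the 3356.98
in the query output, rather than 3357, comes only from its truncated
2.71828 constant for e):

```python
import math

def term_at_full_wraparound(relpages):
    # At age = 2^32 the fraction age/2^32 is exactly 1, so the term is
    # e^ln(relpages), i.e. the table's own size in pages.
    return math.exp((2.0 ** 32) * math.log(relpages) / 2.0 ** 32)

ceiling = term_at_full_wraparound(3357)  # 3357, up to float rounding
```

So each table's wraparound term has a ceiling equal to its page count,
which is exactly the "bigger tables get drawn into play earlier"
behaviour described above.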
Absolutely a good idea.  I'm not sure I agree it ought to wind up
unitless; I'd instead expect a common unit, perhaps number of pages,
serving as a surrogate for the quantity of I/O.  That we're both
thinking about "what's the unit?" means we're on a compatible trail.

--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"