Re: [PERFORM] encouraging index-only scans - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [PERFORM] encouraging index-only scans
Date
Msg-id CA+TgmoYF7nX-gmODdSzPeGwKHBr3MbDcfW6yf8KO9q8Wk06N6g@mail.gmail.com
In response to Re: [PERFORM] encouraging index-only scans  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
First, thanks for this thoughtful email.

On Tue, Feb 4, 2014 at 7:14 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've also had some further thoughts about the right way to drive
>> vacuum scheduling.  I think what we need to do is tightly couple the
>> rate at which we're willing to do vacuuming to the rate at which we're
>> incurring "vacuum debt".  That is, if we're creating 100kB/s of pages
>> needing vacuum, we vacuum at 2-3MB/s (with default settings).
>
> If we can tolerate 2-3MB/s without adverse impact on other work, then we can
> tolerate it.  Do we gain anything substantial by sand-bagging it?

No.  The problem is the other direction.

>> If
>> we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
>> 2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
>> everything goes to heck.
>
> (Your reference to bloat made me think your comments here are about
> vacuuming in general, not specific to IOS.  If that isn't the case, then
> please ignore.)
>
> If we can only vacuum at 2-3MB/s without adversely impacting other activity,
> but we are creating 10MB/s of future vacuum need, then there are basically
> two possibilities I can think of.  Either the 10MB/s represents a spike, and
> vacuum should tolerate it and hope to catch up on the debt later.  Or it
> represents a new permanent condition, in which case I bought too few hard
> drives for the work load, and no scheduling decision that autovacuum can
> make will save me from my folly. Perhaps there is some middle ground between
> those possibilities, but I don't see room for much middle ground.
>
> I guess there might be entirely different possibilities outside those
> two; for example, perhaps I don't realize I'm doing something that is
> generating 10MB/s of vacuum debt, and would like that activity to be
> automatically throttled to the point where it doesn't interfere with
> other processes (either directly, or indirectly through bloat).

The underlying issue here is that, in order for there not to be a
problem, a user needs to configure their autovacuum processes to
vacuum at a rate which is greater than or equal to the average rate at
which vacuum debt is being created.  If they don't, they get runaway
bloat.  But to do that, they need to know at what rate they are
creating vacuum debt, which is almost impossible to figure out right
now; and even if they did know it, they'd then need to figure out what
vacuum cost delay settings would allow vacuuming at a rate sufficient
to keep up, which isn't quite as hard to estimate but certainly
involves nontrivial math.  So a lot of people have this set wrong, and
it's not easy to get it right except by frobbing the settings until
you find something that works well in practice.
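
To put a rough number on "nontrivial math", here's a back-of-the-envelope
sketch of where the 2-3MB/s figure comes from, assuming the worst case
where every page vacuumed is both read from disk and dirtied (the real
mix obviously varies):

    # Back-of-the-envelope for the default autovacuum I/O rate.
    # Assumes every page vacuumed is both a cache miss and gets dirtied;
    # real workloads fall somewhere between this and the all-cached case.
    BLOCK_SIZE = 8192              # bytes per heap page

    vacuum_cost_limit = 200        # default vacuum_cost_limit
    cost_delay_sec = 0.020         # default autovacuum_vacuum_cost_delay (20ms)
    page_miss_cost = 10            # default vacuum_cost_page_miss
    page_dirty_cost = 20           # default vacuum_cost_page_dirty

    units_per_sec = vacuum_cost_limit / cost_delay_sec        # 10,000
    cost_per_page = page_miss_cost + page_dirty_cost          # 30
    pages_per_sec = units_per_sec / cost_per_page              # ~333
    mb_per_sec = pages_per_sec * BLOCK_SIZE / (1024 * 1024)

    print("default autovacuum rate: ~%.1f MB/s" % mb_per_sec)  # ~2.6

    # To keep up with, say, 10MB/s of debt you'd need roughly a 4x larger
    # cost limit (or a 4x smaller delay) -- which is exactly the
    # arithmetic most people never do:
    needed_cost_limit = vacuum_cost_limit * 10 / mb_per_sec
    print("cost limit needed for 10MB/s: ~%d" % needed_cost_limit)  # ~770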

Also, a whole *lot* of problems in this area are caused by cases where
the rate at which vacuum debt is being created *changes*.  Autovacuum
is keeping up, but then you have either a load spike or just a gradual
increase in activity and it doesn't keep up any more.  You don't
necessarily notice right away, and by the time you do there's no easy
way to recover.  If you've got a table with lots of dead tuples in it,
but it's also got enough internal freespace to satisfy as many inserts
and updates as are happening, then it's possibly reasonable to put off
vacuuming in the hopes that system load will be lower at some time in
the future.  But if you've got a table with lots of dead tuples in it,
and you're extending it to create internal freespace instead of
vacuuming it, it is highly likely that you are not doing what will make
the user most happy.  Even if vacuuming that table slows down
foreground activity quite badly, it is probably better than
accumulating an arbitrary amount of bloat.
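
One crude but illustrative way to notice you're in that second situation
is to sample n_dead_tup from pg_stat_user_tables for the table at two
points in time and compare the growth rate against what the cost-delay
settings allow.  The numbers below are made up, and tuples-per-page
obviously depends entirely on row width:

    # Illustrative only: estimate the rate of vacuum debt creation from
    # two samples of pg_stat_user_tables.n_dead_tup.  All numbers below
    # are hypothetical.
    BLOCK_SIZE = 8192
    TUPLES_PER_PAGE = 60           # assumption; depends on row width

    sample_interval_sec = 600      # ten minutes between samples
    n_dead_tup_before = 1_200_000
    n_dead_tup_after = 24_200_000

    dead_tup_per_sec = (n_dead_tup_after - n_dead_tup_before) / sample_interval_sec
    debt_mb_per_sec = dead_tup_per_sec / TUPLES_PER_PAGE * BLOCK_SIZE / (1024 * 1024)

    print("vacuum debt created: ~%.1f MB/s" % debt_mb_per_sec)   # ~5.0
    # If this stays well above what the cost-delay settings allow,
    # bloat accumulates no matter when autovacuum kicks in.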

>> The rate of vacuuming needs to be tied
>> somehow to the rate at which we're creating stuff that needs to be
>> vacuumed.  Right now we don't even have a way to measure that, let
>> alone auto-regulate the aggressiveness of autovacuum on that basis.
>
> There is the formula used to decide when a table gets vacuumed.  Isn't the
> time delta in this formula a measure of how fast we are creating stuff that
> needs to be vacuumed for bloat reasons?  Is your objection that it doesn't
> include other reasons we might want to vacuum, or that it just doesn't work
> very well, or that is not explicitly exposed?

AFAICT, the problem isn't when the table gets vacuumed so much as *how
fast* it gets vacuumed.  The autovacuum algorithm does a fine job
selecting tables for vacuuming, for the most part.  There are problems
with insert-only tables and sometimes for large tables the default
threshold (0.20) is too high, but it's not terrible.  However, the
limit on the overall rate of vacuuming activity to 2-3MB/s regardless
of how fast we're creating vacuum debt is a big problem.
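
For reference, the selection formula being discussed boils down to this
(defaults shown), and the large-table problem falls straight out of it:

    # A table is queued for autovacuum once its dead-tuple count exceeds
    #   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    autovacuum_vacuum_threshold = 50
    autovacuum_vacuum_scale_factor = 0.20

    def needs_vacuum(n_dead_tup, reltuples):
        return n_dead_tup > (autovacuum_vacuum_threshold
                             + autovacuum_vacuum_scale_factor * reltuples)

    # On a billion-row table, ~200 million dead tuples accumulate before
    # the table is even selected:
    print(needs_vacuum(150_000_000, 1e9))   # False
    print(needs_vacuum(250_000_000, 1e9))   # True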

>> Similarly, for marking of pages as all-visible, we currently make the
>> same decision whether the relation is getting index-scanned (in which
>> case the failure to mark those pages all-visible may be suppressing
>> the use of index scans or making them less effective) or whether it's
>> not being accessed at all (in which case vacuuming it won't help
>> anything, and might hurt by pushing other pages out of cache).
>
> If it is not getting accessed at all because the database is not very active
> right now, that would be the perfect time to vacuum it.  Between "I can
> accurately project current patterns of (in)activity into the future" and
> "People don't build large tables just to ignore them forever", I think the
> latter is more likely to be true.  If the system is busy but this particular
> table is not, then that would be a better reason to de-prioritise vacuuming
> that table.  But can this degree of reasoning really be implemented in a
> practical way?  In core?

I don't know.  But the algorithm for determining the rate at which we
vacuum (2-3MB/s) could hardly be stupider than it is right now.  It's
almost a constant, and to the extent that it's not a constant, it
depends on the wrong things.  The fact that getting this perfectly
right is unlikely to be easy, and may be altogether impossible,
shouldn't discourage us from trying to come up with something better
than what we have now.
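
As a strawman only -- none of this exists, and measuring the debt rate
cheaply is precisely the unsolved part -- "tying the rate to the debt"
could be as simple as something like:

    # Strawman: pick a vacuum rate that at least matches the measured
    # debt-creation rate, with some headroom to pay down backlog, clamped
    # between the current default and a configured ceiling.  All names
    # and numbers here are invented for illustration.
    BASELINE_MB_PER_SEC = 2.6      # roughly what default settings allow
    CEILING_MB_PER_SEC = 30.0      # so a spike can't swamp the machine
    HEADROOM = 1.5

    def target_vacuum_rate(debt_mb_per_sec):
        return min(CEILING_MB_PER_SEC,
                   max(BASELINE_MB_PER_SEC, debt_mb_per_sec * HEADROOM))

    for debt in (0.1, 2.0, 10.0, 50.0):
        print(debt, "->", round(target_vacuum_rate(debt), 1), "MB/s")

Even something that simple would at least react when the load changes,
which the current fixed rate does not.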

> My experience has been that if too few pages are all visible, it generally
> switches to a seq scan, not an index scan of a different index.  But many
> things that are semantically possible to be index-only-scans would never be
> planned that way even if allvisible were 100%, so I think it would have to
> do two planning passes, one with the real allvisible, and a hypothetical one
> with allvisible set to 100%.  And then there is the possibility that, while
> a high allvisible would be useful, the table is so active that no amount of
> vacuuming could ever keep it high.

Yeah, those are all good points.
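
For anyone following along, the reason the visibility fraction feeds
directly into that planning choice is, roughly, that the index-only-scan
cost estimate scales the expected heap fetches by the fraction of pages
not marked all-visible.  This is just a sketch of the shape of it, not
the real costing code:

    import math

    # Sketch: as relallvisible/relpages drops, an index-only scan's
    # expected heap visits approach an ordinary index scan's, and the
    # planner may prefer a seq scan instead.
    def ios_heap_pages(pages_fetched, allvis_frac):
        return math.ceil(pages_fetched * (1.0 - allvis_frac))

    for frac in (1.0, 0.9, 0.5, 0.0):
        print(frac, "->", ios_heap_pages(10000, frac), "heap pages")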

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


