Re: [PERFORM] encouraging index-only scans - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [PERFORM] encouraging index-only scans
Date
Msg-id CAMkU=1zShBFuzryeX0eibVt4fAfxnX_-drfZGRRB8jxBSN0HKA@mail.gmail.com
Whole thread Raw
In response to Re: [PERFORM] encouraging index-only scans  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [PERFORM] encouraging index-only scans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
 
I've also had some further thoughts about the right way to drive
vacuum scheduling.  I think what we need to do is tightly couple the
rate at which we're willing to do vacuuming to the rate at which we're
incurring "vacuum debt".  That is, if we're creating 100kB/s of pages
needing vacuum, we vacuum at 2-3MB/s (with default settings).  

If we can tolerate 2-3MB/s without adverse impact on other work, then we can tolerate it.  Do we gain anything substantial by sand-bagging it?

 
If
we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
everything goes to heck.  

(Your reference to bloat made be me think your comments here are about vacuuming in general, not specific to IOS.  If that isn't the case, then please ignore.)

If we can only vacuum at 2-3MB/s without adversely impacting other activity, but we are creating 10MB/s of future vacuum need, then there are basically two possibilities I can think of.  Either the 10MB/s represents a spike, and vacuum should tolerate it and hope to catch up on the debt later.  Or it represents a new permanent condition, in which case I bought too few hard drives for the work load, and no scheduling decision that autovacuum can make will save me from my folly. Perhaps there is some middle ground between those possibilities, but I don't see room for much middle ground.  

I guess there might be entirely different possibilities not between those two; for example, I don't realize I'm doing something that is generating 10MB/s of vacuum debt, and would like to have this thing I'm doing be automatically throttled to the point it doesn't interfere with other processes (either directly, or indirectly by bloat)

 
The rate of vacuuming needs to be tied
somehow to the rate at which we're creating stuff that needs to be
vacuumed.  Right now we don't even have a way to measure that, let
alone auto-regulate the aggressiveness of autovacuum on that basis.

There is the formula used to decide when a table gets vacuumed.  Isn't the time delta in this formula a measure of how fast we are creating stuff that needs to be vacuumed for bloat reasons?  Is your objection that it doesn't include other reasons we might want to vacuum, or that it just doesn't work very well, or that is not explicitly exposed?


 
Similarly, for marking of pages as all-visible, we currently make the
same decision whether the relation is getting index-scanned (in which
case the failure to mark those pages all-visible may be suppressing
the use of index scans or making them less effective) or whether it's
not being accessed at all (in which case vacuuming it won't help
anything, and might hurt by pushing other pages out of cache).

If it is not getting accessed at all because the database is not very active right now, that would be the perfect time to vacuum it.  Between "I can accurately project current patterns of (in)activity into the future" and "People don't build large tables just to ignore them forever", I think the latter is more likely to be true.  If the system is busy but this particular table is not, then that would be a better reason to de-prioritise vacuuming that table.  But can this degree of reasoning really be implemented in a practical way?  In core?
 
 Again,
if we had better statistics, we could measure this - counting heap
fetches for actual index-only scans plus heap fetches for index scans
that might have been planned index-only scans but for the relation
having too few all-visible pages doesn't sound like an impossible
metric to gather.

My experience has been that if too few pages are all visible, it generally switches to a seq scan, not an index scan of a different index.  But many things that are semantically possible to be index-only-scans would never be planned that way even if allvisible were 100%, so I think it would have to do two planning passes, one with the real allvisible, and a hypothetical one with allvisible set to 100%.  And then there is the possibility that, while a high allvisible would be useful, the table is so active that no amount of vacuuming could ever keep it high.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: should we add a XLogRecPtr/LSN SQL type?
Next
From: Peter Geoghegan
Date:
Subject: Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD