Re: Vacuum rate limit in KBps - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Vacuum rate limit in KBps
Date
Msg-id 4F1CA005.1040001@2ndquadrant.com
In response to Re: Vacuum rate limit in KBps  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
Jim Nasby wrote:
> Your two comments together made me realize something... at the end of
> the day people don't care about MB/s. They care about impact to other
> read and write activity in the database.
>
> What would be interesting is if we could monitor how long all
> *foreground* IO requests took. If they start exceeding some number,
> that means the system is at or near full capacity, and we'd like
> background stuff to slow down.

My hope for 9.2 was to get VACUUM moved over into some human-readable 
units.  Having the whole thing work only via these abstract cost units 
is driving most of my customers with larger databases crazy.  The patch 
I suggested was the easiest refactoring I thought moved in the right 
direction.  While it may not be the perfect thing to care about, the 
very positive reaction I've gotten to the already landed patch to log in 
MB/s has suggested to me that people are a lot more comfortable with 
that than with the cost limit numbers.
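
Just to make the cost-unit math concrete, here is the rough arithmetic 
behind that sort of conversion, using the stock settings 
(vacuum_cost_limit=200, vacuum_cost_page_dirty=20, 
autovacuum_vacuum_cost_delay=20ms, 8K blocks) and assuming the worst 
case where every page processed gets dirtied.  This is just 
back-of-the-envelope math, not code from the patch:

#include <stdio.h>

/* Upper bound on the autovacuum write rate implied by the cost GUCs:
 * each sleep cycle allows (cost_limit / cost_page_dirty) page writes
 * if every page processed is dirtied. */
int
main(void)
{
    int     cost_limit = 200;       /* vacuum_cost_limit default */
    int     cost_page_dirty = 20;   /* vacuum_cost_page_dirty default */
    int     cost_delay_ms = 20;     /* autovacuum_vacuum_cost_delay default */
    int     block_size = 8192;      /* BLCKSZ */

    double  pages_per_cycle = (double) cost_limit / cost_page_dirty;
    double  cycles_per_sec = 1000.0 / cost_delay_ms;
    double  bytes_per_sec = pages_per_cycle * cycles_per_sec * block_size;

    /* 10 pages per 20ms cycle -> 500 pages/s -> about 3.9MB/s */
    printf("max dirty write rate: %.1f MB/s\n",
           bytes_per_sec / (1024 * 1024));
    return 0;
}

That ~4MB/s figure is where the number in the example below comes from.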

For 9.3, this whole mess needs to become integrated with a full-system 
monitoring approach to really solve this well.  pg_stat_bgwriter knows 
how many writes are coming from the various parts of the system, and 
therefore the total amount of write I/O.  Given that, I can turn VACUUM completely 
dynamic based on what else is happening in many common situations.  The 
sort of end goal I was thinking about was to be able to say something like 
"let VACUUM use up to 4MB/s on writes, but subtract off the average 
write level of everything else".  Now it's a background process running 
only when there's capacity to spare for it.  You could turn it up a lot 
higher, if you knew it was only going to run at that level when the 
system wasn't as busy.  That's one reason I started by suggesting a 
write-based limit; it fit into that longer-range plan better.  Maybe 
that idea is junk and focusing on actual read I/O is the real problem 
with VACUUM for most people.  I can tell you once I get more data out of 
systems that are logging in MB/s.
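
Something like this toy calculation is the shape of what I have in 
mind; none of it exists anywhere, the 4MB/s ceiling and the 256KB/s 
floor are made-up numbers, and lumping buffers_checkpoint + 
buffers_clean + buffers_backend together as "everything else" glosses 
over the fact that vacuum's own evictions get counted in 
buffers_backend too:

#include <stdio.h>

/* Two snapshots of the write-side pg_stat_bgwriter counters, taken some
 * number of seconds apart (all counts are in 8K buffers). */
typedef struct BgwriterSnap
{
    long    buffers_checkpoint;
    long    buffers_clean;
    long    buffers_backend;
} BgwriterSnap;

/* Start from a fixed write ceiling for vacuum and subtract the observed
 * write rate of everything else, never dropping below a minimum so that
 * vacuum always makes some progress.  Returns bytes/sec. */
static double
vacuum_write_budget(const BgwriterSnap *prev, const BgwriterSnap *cur,
                    double elapsed_sec, double max_rate, double min_rate)
{
    long    buffers_written =
        (cur->buffers_checkpoint - prev->buffers_checkpoint) +
        (cur->buffers_clean - prev->buffers_clean) +
        (cur->buffers_backend - prev->buffers_backend);
    double  other_rate = buffers_written * 8192.0 / elapsed_sec;
    double  budget = max_rate - other_rate;

    return (budget > min_rate) ? budget : min_rate;
}

int
main(void)
{
    /* Made-up sample: 1500 buffers written by everything else over 60s */
    BgwriterSnap    prev = {100000, 20000, 5000};
    BgwriterSnap    cur = {101000, 20400, 5100};
    double          budget = vacuum_write_budget(&prev, &cur, 60.0,
                                                 4.0 * 1024 * 1024,
                                                 256.0 * 1024);

    printf("vacuum write budget: %.2f MB/s\n", budget / (1024 * 1024));
    return 0;
}

How long the sampling interval should be, and what to do after a stats 
reset, are hand-waved away here.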

If instead or in addition we get some better field data on systems that 
can afford to time a lot more things, and then start building feedback 
limiters based on how long all sorts of operations take to occur, that's 
a whole different, parallel approach to auto-tuning this.  I haven't 
thought about that as much, simply because it's only recently become 
clear when that timing data is cheap to collect.  I need to get a lot 
more production server data about that overhead to work with here too.
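
The shape of that kind of limiter might be as simple as the toy below; 
the 5ms latency target and the linear backoff between "at target" and 
"twice the target" are completely invented, it's only meant to show the 
feedback idea:

#include <stdio.h>

/* Toy feedback limiter: scale the background (vacuum) write budget
 * between a full rate and a trickle based on the average foreground
 * I/O latency seen over the last sampling window.  At or below the
 * target latency we allow the full rate; at twice the target or worse
 * we fall back to the minimum. */
static double
throttle_background_rate(double avg_latency_ms, double target_ms,
                         double full_rate, double min_rate)
{
    double  overload;

    if (avg_latency_ms <= target_ms)
        return full_rate;

    /* Fraction of the way from "at target" to "twice the target" */
    overload = (avg_latency_ms - target_ms) / target_ms;
    if (overload >= 1.0)
        return min_rate;

    return full_rate - overload * (full_rate - min_rate);
}

int
main(void)
{
    double  full = 4.0 * 1024 * 1024;   /* 4MB/s when the system is idle */
    double  min = 256.0 * 1024;         /* never drop below 256KB/s */

    printf("3ms  -> %.2f MB/s\n",
           throttle_background_rate(3, 5, full, min) / (1024 * 1024));
    printf("7ms  -> %.2f MB/s\n",
           throttle_background_rate(7, 5, full, min) / (1024 * 1024));
    printf("12ms -> %.2f MB/s\n",
           throttle_background_rate(12, 5, full, min) / (1024 * 1024));
    return 0;
}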

> Dealing with SSDs vs real media would be a bit challenging... though,
> I think it would only be an issue if the two were randomly mixed
> together. Kept separately I would expect them to have distinct
> behavior patterns that could be measured and identified.
 

This might just turn into another one of those things where we will 
eventually need to have some more information on a per-tablespace 
basis.  I envision allowing the server to collect more timing data as 
something you turn on for a while, letting it populate statistics about 
just what fast or slow means for each tablespace.  Then you can keep 
those results around to guide future decisions even after timing is 
turned off.  Maybe toggle it back on for a day each month to make sure the 
numbers are still sane, if it's too expensive to time things every day.
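
As a very rough sketch of what keeping those per-tablespace baselines 
might look like (none of this exists, and the 1ms read-latency cutoff 
separating flash from spinning media is a number I just made up):

#include <stdio.h>

/* Latency baseline remembered for one tablespace, captured while the
 * extra timing collection was turned on. */
typedef struct TablespaceBaseline
{
    const char *name;
    double      avg_read_ms;
    double      avg_write_ms;
} TablespaceBaseline;

/* Crude classification against the stored baseline: sub-millisecond
 * average reads look like flash, anything slower looks like rotating
 * media.  The cutoff is invented, not measured. */
static const char *
classify(const TablespaceBaseline *ts)
{
    return (ts->avg_read_ms < 1.0) ? "flash-like" : "rotating-like";
}

int
main(void)
{
    TablespaceBaseline  spaces[] = {
        {"pg_default", 6.5, 3.2},       /* sample numbers only */
        {"ssd_space", 0.2, 0.3},
    };
    int     i;

    for (i = 0; i < 2; i++)
        printf("%s: %s (avg read %.1fms)\n",
               spaces[i].name, classify(&spaces[i]), spaces[i].avg_read_ms);
    return 0;
}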

