Re: Vacuum rate limit in KBps - Mailing list pgsql-hackers
From | Greg Smith
---|---
Subject | Re: Vacuum rate limit in KBps
Date |
Msg-id | 4F1CA005.1040001@2ndquadrant.com
In response to | Re: Vacuum rate limit in KBps (Jim Nasby <jim@nasby.net>)
List | pgsql-hackers
Jim Nasby wrote:
> Your two comments together made me realize something... at the end of the day people don't care about MB/s. They care about impact to other read and write activity in the database.
>
> What would be interesting is if we could monitor how long all *foreground* IO requests took. If they start exceeding some number, that means the system is at or near full capacity, and we'd like background stuff to slow down.

My hope for 9.2 was to get VACUUM moved over to some human-readable units. Having the whole thing work only via these abstract cost units is driving most of my customers with larger databases crazy. The patch I suggested was the easiest refactoring I thought moved in the right direction. While it may not be the perfect thing to care about, the very positive reaction I've gotten to the already landed patch that logs in MB/s suggests to me that people are a lot more comfortable with that than with the cost limit numbers.

For 9.3, this whole mess needs to become integrated with a full-system monitoring approach to really solve this well. pg_stat_bgwriter knows how many writes are coming from the various parts of the system: the total amount of write I/O. Given that, I can turn VACUUM completely dynamic based on what else is happening, in many common situations. The sort of end goal I was thinking about was being able to say something like "let VACUUM use up to 4MB/s on writes, but subtract off the average write level of everything else". Now it's a background process running only when there's capacity to spare for it. You could turn it up a lot higher if you knew it was only going to run at that level when the system wasn't as busy. That's one reason I started by suggesting a write-based limit; it fit into that longer-range plan better. (A rough sketch of that arithmetic is at the end of this message.)

Maybe that idea is junk and focusing on actual read I/O is the real problem with VACUUM for most people. I can tell you once I get more data out of systems that are logging in MB/s.

If instead, or in addition, we get some better field data from systems that can afford to time a lot more things, and then start building feedback limiters based on how long all sorts of operations take, that's a whole different parallel approach for auto-tuning this; the second sketch below shows the general shape. I haven't thought about that as much, simply because it only recently became clear when timing data is cheap to collect. I need to get a lot more production server data about that overhead to work with here, too.

> Dealing with SSDs vs real media would be a bit challenging... though, I think it would only be an issue if the two were randomly mixed together. Kept separately I would expect them to have distinct behavior patterns that could be measured and identified

This might just turn into another one of those things where we will eventually need some more information on a per-tablespace basis. I envision allowing the server to collect more timing data as something you can turn on for a bit, letting it populate statistics about just what "fast" or "slow" means for each tablespace (the last sketch below gives the flavor). Then you can keep those results around to guide future decisions even after timing is turned off. Maybe toggle it back on one day a month to make sure the numbers are still sane, if it's too expensive to time things every day.
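To make the "subtract off everything else" idea concrete, here is a minimal sketch of the budget arithmetic in Python. The pg_stat_bgwriter columns (buffers_checkpoint, buffers_clean, buffers_backend) and the 8192-byte default block size are real; everything else, including the function names and the 4MB/s default, is made up for illustration:

    # Minimal sketch of the dynamic budget idea: give VACUUM whatever is
    # left of a fixed write limit after the rest of the system's recent
    # write rate is subtracted off. All function names are hypothetical.

    BLOCK_SIZE = 8192  # default PostgreSQL block size in bytes

    def total_writes(stats):
        """Total buffers written, from one pg_stat_bgwriter snapshot."""
        return (stats["buffers_checkpoint"] +
                stats["buffers_clean"] +
                stats["buffers_backend"])

    def vacuum_write_budget(prev, curr, elapsed_s, limit_mbps=4.0):
        """MB/s left over for VACUUM after everyone else's writes; never < 0."""
        delta_bytes = (total_writes(curr) - total_writes(prev)) * BLOCK_SIZE
        other_mbps = delta_bytes / elapsed_s / (1024.0 * 1024.0)
        return max(0.0, limit_mbps - other_mbps)

    # Made-up snapshots taken 60 seconds apart:
    prev = {"buffers_checkpoint": 1000, "buffers_clean": 500, "buffers_backend": 250}
    curr = {"buffers_checkpoint": 1300, "buffers_clean": 650, "buffers_backend": 300}
    print(vacuum_write_budget(prev, curr, 60.0))  # ~3.93 MB/s left for VACUUM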
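And a similarly hedged sketch of the feedback limiter Jim is describing: watch foreground I/O latency and scale background work down as it climbs past a target. The 10ms target and the proportional back-off are arbitrary choices for illustration, not anything measured:

    # Sketch of a latency-based feedback limiter. Returns a 0..1
    # multiplier to apply to the background (VACUUM) I/O rate.

    def background_throttle(fg_latencies_ms, target_ms=10.0):
        """Scale background work by how far recent foreground I/O
        latency sits above a target; full speed when under it."""
        if not fg_latencies_ms:
            return 1.0  # no timing data yet; don't throttle
        avg = sum(fg_latencies_ms) / len(fg_latencies_ms)
        if avg <= target_ms:
            return 1.0  # foreground I/O is healthy; run at full rate
        # Back off proportionally as latency exceeds the target.
        return target_ms / avg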
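Finally, the per-tablespace classification could be as simple as summarizing the latency samples collected while timing was turned on. The 1ms cutoff separating SSDs from spinning media here is an assumed number, not a measured one:

    # Sketch: label each tablespace fast or slow from sampled latencies.
    import statistics

    def classify_tablespaces(samples_ms, fast_cutoff_ms=1.0):
        """samples_ms maps tablespace name -> list of read latencies (ms).
        Classify by the median so a few outliers don't flip the label."""
        return {ts: ("fast" if statistics.median(xs) < fast_cutoff_ms else "slow")
                for ts, xs in samples_ms.items() if xs}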