Re: Vacuum rate limit in KBps - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Vacuum rate limit in KBps
Date:
Msg-id: CA+Tgmob9qYfMva79EugTKNCKnVDQjgGQvyovaWHWnN7Zr2SGYA@mail.gmail.com
In response to: Re: Vacuum rate limit in KBps (Greg Smith <greg@2ndQuadrant.com>)
Responses: Re: Vacuum rate limit in KBps; Re: Vacuum rate limit in KBps
List: pgsql-hackers
On Thu, Jan 19, 2012 at 5:39 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 1/19/12 1:10 PM, Robert Haas wrote:
>> I have to say that I find that intensely counterintuitive. The
>> current settings are not entirely easy to tune correctly, but at least
>> they're easy to explain.
>
> I attempt to explain those settings to people in training classes about
> once a month. It's never been anything but a complete disaster. I am
> barely concerned about preserving the current UI because, as far as I've
> been able to tell, there are only a handful of PostgreSQL installations
> on the planet that have managed to use it happily. Even the ones that do
> have a non-default setup that works usually flailed about for some time
> until they got something that works, over a few frustrating months. And
> the result is settings few dare touch for fear of breaking them.
>
> It's also worth pointing out that VACUUM problems are very close to the
> top of the list of problems larger sites run into. So right now we have
> an inscrutable UI around an often essential part of the database to
> tune, one that any production site that gets over a few hundred GB of
> data in it will run into problems with. I wouldn't care about this area
> if it weren't for people screaming about how bad it is every time the
> topic comes up.
>
> If there's anyone out there who has run a larger PostgreSQL database and
> not at some point been extremely frustrated with how the current VACUUM
> settings are controlled, please speak up and say I'm wrong about this. I
> thought it was well understood the UI was near unusably bad, it just
> wasn't obvious what to do about it.
>
>> What does that 8MB mean and how does it
>> relate to vacuum_cost_page_miss? If I double vacuum_cost_page_miss,
>> does that effectively also double the cost limit, so that dirty pages
>> and hits become relatively cheaper?
>> If so, then I think what that
>> really means is that the limit is 8MB only if there are no hits and no
>> dirtied pages - otherwise it's less, and the amount by which it is
>> less is the result of some arcane calculation. Ugh!
>
> Saying what I suggested is an arcane calculation strikes me as pretty
> weird--we'd be hard pressed to design a more arcane calculation than the
> one that's already happening.

Perhaps so, but I'm willing to bet that if we have a variable that looks
like a pure read limit or a pure dirty limit and really is not, we'll
have succeeded only in trading one arcane calculation for another. :-)

> The feedback here so far seems to lead toward making independent read
> and write knobs. I'm going to chew on the scenarios Robert described and
> the ones Jim has been commenting on and see if I can refactor this into
> something friendlier that addresses them.
>
> As for the suggestion that I'm bringing this up a bit late in the
> release cycle, I've been trying. My first submission pushing in this
> direction--improving the logging first, which is needed before you can
> usefully measure a behavior change--happened back in September. I've
> been moving this area as fast as I can get it to budge. I'm concerned
> now that much will be made of improved performance in 9.2, leading to
> people converting even larger systems than they used to. And it's not
> hard at all to find a large system where inability to tune vacuum easily
> is the top limiting factor on overall performance.

I certainly didn't intend to come across as disparaging your work on
this topic. I understand that there are big problems with the way things
work now; I'm just cautious about trying to replace them too hastily
with something that may not turn out to be any better. Of course, if we
can replace it with something that we're sure is actually an
improvement, I'm all in favor of that. But, IMHO, the problems in this
area are too serious to be solved by renaming the knobs.
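The "arcane calculation" under discussion can be made concrete. Here is a minimal sketch in Python (illustrative only, not PostgreSQL source code) using that era's default settings: autovacuum_vacuum_cost_delay = 20ms, vacuum_cost_limit = 200, vacuum_cost_page_hit = 1, vacuum_cost_page_miss = 10, vacuum_cost_page_dirty = 20.

```python
BLOCK_SIZE = 8192  # bytes per heap page

def max_rate_mib_per_s(cost_limit=200, cost_delay_ms=20, page_cost=10):
    """Upper bound on vacuum throughput if every page costs `page_cost`.

    Vacuum accumulates cost until it reaches cost_limit, then sleeps
    for cost_delay_ms; the ceiling falls out of that cycle.
    """
    pages_per_round = cost_limit / page_cost   # pages processed before a sleep
    rounds_per_second = 1000 / cost_delay_ms   # sleep cycles per second
    return pages_per_round * rounds_per_second * BLOCK_SIZE / (1024 * 1024)

# Pure misses (every page read from disk): the oft-quoted ~8MB/s ceiling.
print(max_rate_mib_per_s(page_cost=10))   # 7.8125 MiB/s
# If every page is also dirtied (miss 10 + dirty 20 = 30), the same
# cost limit yields a much lower effective read rate.
print(max_rate_mib_per_s(page_cost=30))   # ~2.6 MiB/s
```

This is why the limit "is 8MB only if there are no hits and no dirtied pages": every hit or dirtied page consumes budget that would otherwise cover reads, silently shrinking the effective read rate.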
At most, we're going to buy ourselves a little time to come up with a
better solution.

IMHO, and at the risk of repeating myself, one of the big problems in
this area is that we're making the user guess something that we really
ought to be figuring out for them. Just as users want checkpoints to run
as slowly as possible while still not bumping into the next checkpoint,
they'd presumably like vacuum to run as slowly as possible without
bumping into the next vacuum. Instead, we make them tell us how fast
they'd like it to run, which requires them to guess a value high enough
to finish soon enough but low enough to minimize the impact on the rest
of the system.

Another problem is that the vacuum algorithm itself could, I think, be
made much smarter. We could teach HOT to prune pages that contain no HOT
chains but do contain dead tuples. That would leave dead line pointers
behind, but that's not nearly as bad as leaving the entire tuple behind.
We could, as Simon and others have suggested, have one threshold for
vacuuming the heap (i.e. reclaiming dead tuples) and another for
vacuuming the indexes (i.e. reclaiming dead line pointers). That would
open the door to partial vacuuming: just vacuum half a gigabyte or so of
the heap, and then move on; the next vacuum can pick up where that one
left off, at least up to the point where we decide we need to make an
index pass. It would possibly also allow us to permit more than one
vacuum on the same table at the same time, which is probably needed for
very large tables. We could have backends that see dead tuples on a page
throw them over the fence to the background writer for immediate
pruning.

I blather, but I guess my point is that I really hope we're going to do
something deeper here at some point in the near future, whatever becomes
of the proposals now on the table.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
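The checkpoint analogy in the post above can be sketched as a simple pacing rule: spread the remaining work over the time expected to remain before the next vacuum is needed, rather than asking the user for a raw rate. A minimal illustration in Python; all names here are hypothetical, not a PostgreSQL API.

```python
def paced_vacuum_rate(pages_remaining, seconds_until_next_vacuum,
                      completion_target=0.9):
    """Pages/second needed to finish slightly before the next vacuum
    is due, analogous to how checkpoint_completion_target spreads
    checkpoint writes across the checkpoint interval.
    Hypothetical helper for illustration only."""
    # Aim to finish at completion_target of the available window,
    # leaving slack for estimation error - the same trick checkpoints use.
    usable_seconds = seconds_until_next_vacuum * completion_target
    return pages_remaining / usable_seconds

# e.g. 1,000,000 pages left to scan, next vacuum expected in an hour:
print(round(paced_vacuum_rate(1_000_000, 3600)))  # -> 309 pages/s
```

The point of the sketch is the direction of control: the user states an objective (finish before the next cycle is due) and the system derives the rate, instead of the user guessing a rate that is simultaneously high enough and low enough.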