Re: our buffer replacement strategy is kind of lame - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: our buffer replacement strategy is kind of lame |
Date | |
Msg-id | CA+Tgmob793NeyRu0dHwBRWJFkobVwMpCSs1E7W9h1KsPe2vM1A@mail.gmail.com |
In response to | Re: our buffer replacement strategy is kind of lame (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Sun, Aug 14, 2011 at 10:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Sun, Aug 14, 2011 at 1:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> The big problem with this idea is that it pretty much requires that
>> the work you mentioned in one of your other emails - separating the
>> background writer and checkpoint machinery into two separate processes
>> - to happen first. So I'm probably going to have to code that up to
>> see whether this works. If you're planning to post that patch soon
>> I'll wait for it. Otherwise, I'm going to have to cobble together
>> something that is at least good enough for testing.
>
> No, the big problem with the idea is that regrettably it is just an
> idea on your part and has no basis in external theory or measurement.
> I would not object to you investigating such a path and I think you
> are someone that could invent something new and original there, but it
> seems much less likely to be fruitful, or at least would require
> significant experimental results to demonstrate an improvement in a
> wide range of use cases to the rest of the hackers.

All right, well, I'll mull over whether it's worth pursuing. Unless I
or someone else comes up with an idea I like better, I think it
probably is.

> As to you not being able to work on your idea until I've split
> bgwriter/checkpoint, that's completely unnecessary, and you know it. A
> single ifdef is sufficient there, if at all.

Hmm. Well, it might be unnecessary, but if I knew it were
unnecessary, I wouldn't have said that I thought it was necessary.

> The path I was working on (as shown in the earlier patch) was to apply
> some corrections to the existing algorithm to reduce its worst case
> behaviour. That's something I've seen mention of people doing for
> RedHat kernels.

Yeah. Your idea is appealing because it bounds the amount of time .
There is some chance that you might kick out a really hot buffer if
there are a long series of such buffers in a row. Without testing, I
don't know whether that's a serious problem or not.

> Overall, I think a minor modification is the appropriate path. If
> Linux or other OS already use ClockPro then we already benefit from
> it. It seems silly to track blocks that recently left shared buffers
> when they are very likely still actually in memory in the filesystem
> cache.

You may be right. Basically, my concern is that buffer eviction is
too slow. On a 32-core system, it's easy to construct a workload
where the whole system bottlenecks on the rate at which buffers can be
evicted and replaced - not because the system is fundamentally
incapable of copying data around that quickly, but because everything
piles up behind BufFreelistLock, and to a lesser extent the buffer
mapping locks. Your idea may help with that, but I doubt that it's a
complete solution.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
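
The trade-off discussed in this message (bounding the time spent searching for a victim buffer, at the risk of evicting a hot buffer when many hot buffers sit in a row) can be illustrated with a small sketch. This is not the patch from the thread and not PostgreSQL source code. It assumes a clock sweep over per-buffer usage counts with a cap on the number of buffers inspected. The names `SketchBuffer`, `bounded_clock_sweep`, and `max_steps` are hypothetical, and locking (the BufFreelistLock contention mentioned above) is deliberately left out.

```c
#include <assert.h>

/* Hypothetical, simplified buffer descriptor for illustration only. */
typedef struct {
    int usage_count;  /* raised on access, lowered each time the hand passes */
    int pinned;       /* nonzero while the buffer is in use and not evictable */
} SketchBuffer;

/*
 * Clock sweep that inspects at most max_steps buffers.
 *
 * Returns the index of the chosen victim, or -1 if every inspected
 * buffer was pinned. If no buffer with usage_count == 0 is found within
 * the bound, the unpinned buffer with the lowest remaining usage count
 * among those inspected is returned instead. That fallback is what can
 * evict a hot buffer when a long run of hot buffers lies ahead of the hand.
 */
int bounded_clock_sweep(SketchBuffer *pool, int nbuffers, int *hand,
                        int max_steps)
{
    int fallback = -1;

    assert(nbuffers > 0 && max_steps > 0);

    for (int step = 0; step < max_steps; step++) {
        int idx = *hand;
        SketchBuffer *buf = &pool[idx];

        /* Advance the clock hand, wrapping around the pool. */
        *hand = (*hand + 1) % nbuffers;

        if (buf->pinned)
            continue;               /* in use, cannot be evicted */

        if (buf->usage_count == 0)
            return idx;             /* cold buffer, the normal victim */

        buf->usage_count--;         /* age the buffer as the hand passes */

        /* Remember the least-used unpinned buffer seen so far. */
        if (fallback < 0 || buf->usage_count < pool[fallback].usage_count)
            fallback = idx;
    }

    return fallback;                /* bound reached: possibly a hot buffer */
}
```

With a bound of 2 and two hot buffers ahead of the hand, the sketch gives up before reaching a cold buffer at index 2 and returns a hot one. With a bound of 4 it reaches the cold buffer. Whether that kind of early eviction matters in practice is exactly the question the message says needs testing.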