Re: patch: improve SLRU replacement algorithm - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: patch: improve SLRU replacement algorithm
Msg-id: CA+TgmoZEzcdH4Pc22uvyCRA+FUYDtNrJqouJCu9ABRuF_weJ6Q@mail.gmail.com
In response to: Re: patch: improve SLRU replacement algorithm (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On Thu, Apr 5, 2012 at 12:44 PM, Greg Stark <stark@mit.edu> wrote:
> On Thu, Apr 5, 2012 at 3:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I'm not sure I find those numbers all that helpful, but there they
>> are. There are a couple of outliers beyond 12s on the patched run,
>> but I wouldn't read anything into that; the absolute worst values
>> bounce around a lot from test to test. However, note that every
>> bucket between 2s and 8s improves, sometimes dramatically.
>
> The numbers seem pretty compelling to me.

Thanks.

> They seem to indicate that you've killed one of the big sources of
> stalls, but that there are more lurking, including at least one that
> causes a small number of small stalls.

The data in my OP identifies the other things that can cause stalls >=
100 ms with considerable specificity.

> The only fear I have is that I'm still wondering what happens to your
> code when *all* the buffers become blocked on I/O. Can you catch
> whether this ever occurred in your test and explain what should
> happen in that case?

If all the buffers are I/O-busy, it just falls back to picking the
least-recently-used buffer. That's a reasonable heuristic, since the
I/O that started longest ago is likely to finish first. However, when
I ran this with all the debugging instrumentation enabled, it reported
no waits in slru.c consistent with that situation ever having
occurred. So if something like that did happen during the test run, it
produced a wait of less than 100 ms, but I think it's more likely that
it never happened at all.

I think part of the confusion here may relate to a previous discussion
about increasing the number of CLOG buffers. During that discussion, I
postulated that increasing the number of CLOG buffers improved
performance because we could encounter a situation where every buffer
is I/O-busy, causing new backends that wanted to perform an I/O to
have to wait until some backend that had been doing an I/O finished
it. It's now clear that I was totally wrong, because you don't need
every buffer to be busy before the next backend that needs a CLOG page
blocks on an I/O. As soon as ONE backend blocks on a CLOG buffer I/O,
every other backend that needs to evict a page will pile up on the
same I/O. I just assumed that we couldn't possibly be doing anything
that silly, but we are.

So here's my new theory: the real reason why increasing the number of
CLOG buffers improved performance is that it caused dirty pages to
reach the tail of the LRU list less frequently. It's particularly bad
when a page gets written and fsync'd but then someone still needs to
update that page, so it gets pulled back in, dirtied, and written and
fsync'd a second time. Such events are less likely with more buffers.
Of course, increasing the number of buffers also decreases cache
pressure in general.

What's also clear from these numbers is that there is a tremendous
amount of CLOG cache churn, and therefore we can infer that most of
those I/Os complete almost immediately - if they did not, it would be
impossible to replace 5000 CLOG buffers in a second no matter how many
backends you have. It's the occasional I/Os that don't complete almost
immediately that are at issue here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
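[Editorial note: to make the replacement heuristic discussed in the message
concrete, here is a minimal standalone sketch. It is illustrative only, not
the actual slru.c code: the SlruSlot struct, pick_victim(), the slot count,
and the driver in main() are all invented for this example, and the real
SLRU code tracks recency differently.]

/*
 * Standalone sketch of the victim-selection heuristic discussed above.
 * Illustrative only; names and recency bookkeeping are invented.
 */
#include <stdbool.h>
#include <stdio.h>

#define NUM_SLOTS 8

typedef struct SlruSlot
{
    int  lru_count;       /* higher = less recently used (toy convention) */
    bool io_in_progress;  /* a read or write is underway on this slot */
} SlruSlot;

/*
 * Pick a buffer slot to evict.
 *
 * Old behavior (pre-patch): take the least-recently-used slot
 * unconditionally.  If an I/O is in progress on it, every backend that
 * needs to evict piles up waiting on that same I/O.
 *
 * New behavior (sketched here): prefer the least-recently-used slot
 * that is NOT I/O-busy.  Only if every slot is I/O-busy do we fall back
 * to the overall LRU slot -- a reasonable heuristic, since the I/O that
 * started longest ago is likely to finish first.
 */
static int
pick_victim(const SlruSlot *slots, int n)
{
    int best_any = 0;   /* LRU over all slots (fallback) */
    int best_idle = -1; /* LRU over slots with no I/O in progress */
    int i;

    for (i = 0; i < n; i++)
    {
        if (slots[i].lru_count > slots[best_any].lru_count)
            best_any = i;
        if (!slots[i].io_in_progress &&
            (best_idle < 0 ||
             slots[i].lru_count > slots[best_idle].lru_count))
            best_idle = i;
    }

    return (best_idle >= 0) ? best_idle : best_any;
}

int
main(void)
{
    /* Slot 3 is the global LRU but has an I/O in progress. */
    SlruSlot slots[NUM_SLOTS] = {
        {5, false}, {2, false}, {7, true}, {9, true},
        {1, false}, {8, false}, {4, true}, {3, false}
    };

    /* Picks slot 5: the least-recently-used slot with no I/O in progress. */
    printf("victim = %d\n", pick_victim(slots, NUM_SLOTS));
    return 0;
}

The key design point is the fallback: skipping I/O-busy slots prevents
every evicting backend from piling up on a single in-progress I/O, which
is exactly the pathology described in the message.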
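[Editorial note: the message also mentions debugging instrumentation that
reported waits in slru.c. Here is a rough sketch of that kind of timing
check, again under assumptions: the real instrumentation lived inside
slru.c around its lock and I/O waits, while this standalone version just
times a sleep standing in for a buffer I/O.]

/*
 * Sketch of timing instrumentation for flagging long waits.
 * Illustrative only; the threshold and the simulated wait are invented.
 */
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

#define WAIT_REPORT_THRESHOLD_MS 100

static void
timed_wait_example(void)
{
    struct timeval start, stop;
    long elapsed_ms;

    gettimeofday(&start, NULL);
    usleep(150 * 1000);     /* stand-in for waiting on a buffer I/O */
    gettimeofday(&stop, NULL);

    elapsed_ms = (stop.tv_sec - start.tv_sec) * 1000L +
                 (stop.tv_usec - start.tv_usec) / 1000L;

    /* Only waits at or above the threshold get reported. */
    if (elapsed_ms >= WAIT_REPORT_THRESHOLD_MS)
        fprintf(stderr, "long wait: %ld ms\n", elapsed_ms);
}

int
main(void)
{
    timed_wait_example();
    return 0;
}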