Re: patch: improve SLRU replacement algorithm - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: patch: improve SLRU replacement algorithm |
Date | |
Msg-id | CA+TgmoZi_SU_824gprzHGpdxPmKUGAKPtz476iDmVFfiVtEzEA@mail.gmail.com |
In response to | Re: patch: improve SLRU replacement algorithm (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: patch: improve SLRU replacement algorithm |
List | pgsql-hackers |
On Wed, Apr 4, 2012 at 4:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Measurement?
>
> Sounds believable, I just want to make sure we have measured things.

Yes, I measured things. I didn't post the results because they're
almost identical to the previous set of results which I already
posted. That is, I wrote the patch; I ran it through the
instrumentation framework; the same long waits with the same set of
file/line combinations were still present. Then I wrote the patch
that is attached to the OP, and also tested that, and those long waits
went away completely.

> I believe that, but if all buffers are I/O busy we should avoid
> waiting on a write I/O if possible.

I thought about that, but I don't see that there's any point in
further complicating the algorithm. The current patch eliminates ALL
the long waits present in this code path, which means that the
situation where every CLOG buffer is I/O-busy at the same time either
never happens, or never causes any significant stalls. I think it's a
bad idea to make this any more complicated than is necessary to do the
right thing in real-world cases.

> That seems much smarter. I'm thinking this should be back patched
> because it appears to be fairly major, so I'm asking for some more
> certainty that every thing you say here is valid. No doubt much of it
> is valid, but that's not enough.

Yeah, I was thinking about that. What we're doing right now seems
pretty stupid, so maybe it's worth considering a back-patch. OTOH,
I'm usually loath to tinker with performance in stable releases.
I'll defer to the opinions of others on this point.

>> Applying this patch does in fact eliminate the stalls.
>
> I'd like to see that measured from a user perspective. It would be
> good to see the response time distribution for run with and without
> the patch.

My feeling is that you're not going to see very much difference in a
latency-by-second graph, because XLogInsert is responsible for lots
and lots of huge stalls also. That's going to mask the impact of
fixing this problem. However, it's not much work to run the test, so
I'll do that.

>> 2. I think we might want to revisit Simon's idea of background-writing
>> SLRU pages.
>
> Agreed. No longer anywhere near as important. I'll take a little
> credit for identifying the right bottleneck, since you weren't a
> believer before.

I don't think I ever said it was a bad idea; I just couldn't measure
any benefit. I think now we know why, or at least have a clue; and
maybe some ideas about how to measure it better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
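
For readers following the thread, the trade-off discussed above can be
sketched in a few lines of C. This is not the patch attached to the
original post and does not use PostgreSQL's SLRU data structures; the
types Buf and BufState and the function select_victim are hypothetical
names invented for illustration. The sketch assumes a victim is chosen
as the least recently used buffer that is not I/O-busy, and that only
when every buffer is busy does the refinement Simon suggests apply
(prefer a buffer with a read in progress over one with a write in
progress).

```c
#include <stddef.h>

/* Hypothetical buffer states; these names are illustrative only. */
typedef enum {
    BUF_IDLE = 0,
    BUF_READ_IN_PROGRESS = 1,
    BUF_WRITE_IN_PROGRESS = 2
} BufState;

typedef struct {
    BufState state;
    unsigned last_used;   /* smaller value means less recently used */
} Buf;

/*
 * Return the index of the buffer to evict, or -1 if n == 0.
 * Order of preference (an assumption of this sketch):
 *   1. the least recently used idle buffer;
 *   2. else the least recently used buffer with a read in progress;
 *   3. else the least recently used buffer with a write in progress.
 */
int select_victim(const Buf *bufs, size_t n)
{
    int best[3] = { -1, -1, -1 };   /* LRU candidate per state */

    for (size_t i = 0; i < n; i++) {
        int s = (int) bufs[i].state;

        if (best[s] < 0 || bufs[i].last_used < bufs[best[s]].last_used)
            best[s] = (int) i;
    }

    if (best[BUF_IDLE] >= 0)
        return best[BUF_IDLE];
    if (best[BUF_READ_IN_PROGRESS] >= 0)
        return best[BUF_READ_IN_PROGRESS];
    return best[BUF_WRITE_IN_PROGRESS];
}
```

Steps 2 and 3 matter only when every buffer is I/O-busy at once, which
is the case the reply above argues either never happens or never
causes significant stalls in practice.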