Re: [WIP PATCH] for Performance Improvement in Buffer Management - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: [WIP PATCH] for Performance Improvement in Buffer Management
Msg-id: 6C0B27F7206C9E4CA54AE035729E9C383BC7D196@szxeml509-mbs
In response to: Re: [WIP PATCH] for Performance Improvement in Buffer Management (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses: Re: [WIP PATCH] for Performance Improvement in Buffer Management
List: pgsql-hackers
On Friday, November 23, 2012 11:15 AM Pavan Deolasee wrote:
On Thu, Nov 22, 2012 at 2:05 PM, Amit Kapila <amit.kapila@huawei.com> wrote:

>>> Sorry, I haven't followed this thread at all, but the numbers (43171 and 57920) in the last two runs of @mv-free-list for 32 clients look like aberrations, no? I wonder if that's skewing the average.

>> Yes, that is one of the main reasons, but it is consistent across all runs that these kinds of numbers are observed for 32 clients or more.
>> Even Jeff pointed out something similar in one of his mails and suggested running the tests so that the first test runs "with patch" and then "without patch".
>> After doing what he suggested, the observations are still similar.

> Are we convinced that the jump we are seeing is a real one, then?

Still not convinced, as the data has been collected only on my setup.

> I'm a bit surprised because it happens only with the patch and only for 32 clients. How would you explain that?

The reason this patch can improve performance is that it reduces contention among backends on BufFreeListLock and on the PartitionLock (which BufferAlloc takes (a) to remove an old page from a buffer or (b) to check whether a block is already in the buffer pool). As the number of backends increases, the chances of improved performance are much better. In particular, for 32 clients the results are not that skewed when the tests run for a longer time.

For 32 clients, as mentioned in the previous mail, when the test ran for 1 hour the difference is not very skewed:

32 clients / 32 threads for 1 hour
             @mv-free-lst   @9.3devl
Single-run:  9842.019229    8050.357981

>>> I also looked at the Results.htm file down-thread. There seems to be a steep degradation when the shared buffers are increased from 5GB to 10GB, both with and without the patch. Is that expected? If so, isn't that worth investigating and possibly even fixing before we do anything else?
>> The reason for the decrease in performance is that when shared buffers are increased from 5GB to 10GB, the I/O starts, as after the increase the OS buffers can no longer hold all the data.

> Shouldn't that data be in the shared buffers if not the OS cache, and hence approximately the same I/O will be required?

I don't think so, as the data in the OS cache and PG shared buffers doesn't have any direct relation; the OS can flush its buffers based on its own scheduling algorithm. Let us try to see by example:

Total RAM - 22G
Database size - 16G

Case 1 (Shared Buffers - 5G)
a. Load all the files into OS buffers. Chances are good that all 16G of data will be in OS buffers, as the OS still has 17G of memory available.
b. Try to load all of it into shared buffers. The last 5G will be in shared buffers.
c. Chances are high that access to the remaining 11G of buffers will not lead to I/O, as they will be in OS buffers.

Case 2 (Shared Buffers - 10G)
a. Load all the files into OS buffers. In the best case the OS buffers can contain 10-12G of data, as the OS has 12G of memory available.
b. Try to load all of it into shared buffers. The last 10G will be in shared buffers.
c. Now, as there is no direct correlation between the data in shared buffers and OS buffers, whenever PG has to access any data which is not in shared buffers, there is a good chance that it leads to I/O.

> Again, the drop in the performance is so severe that it seems worth investigating that further, especially because you can reproduce it reliably.

Yes, I agree that it is worth investigating, but IMO this is a different problem which might not be addressed by the patch under discussion. The 2 reasons I can think of for the dip in performance when shared buffers increase beyond a certain threshold percentage of RAM are:
a. either the Buffer Management algorithm has some bottleneck, or
b. it is due to the way data is managed between shared buffers and the OS buffer cache.

Any Suggestions/Comments?

With Regards,
Amit Kapila.
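P.S. The double-buffering arithmetic in the two cases above can be sketched with a back-of-envelope model. This is only an illustration of the worst case (shared buffers fully duplicating pages already in the OS cache), using the hypothetical 22G RAM / 16G database figures from the example; it is not a claim about how the kernel actually behaves.

```python
RAM_GB = 22
DB_GB = 16

def uncached_worst_case(shared_buffers_gb):
    """Estimate how much of the database (in GB) cannot be served from memory.

    Worst-case assumption: every page held in shared buffers is also a copy
    of a page in the OS cache (double buffering), so only the OS cache
    effectively determines how much data stays memory-resident.
    """
    # The OS page cache gets whatever RAM shared buffers do not pin.
    os_cache_gb = RAM_GB - shared_buffers_gb
    return max(0, DB_GB - min(DB_GB, os_cache_gb))

print(uncached_worst_case(5))   # -> 0 : 17G of OS cache covers all 16G, no I/O
print(uncached_worst_case(10))  # -> 4 : only 12G of OS cache, up to 4G may hit disk
```

In the best case the 10G of shared buffers would hold exactly the pages the OS cache evicted and the miss would again be 0; the point of the example is that nothing correlates the two caches, so the actual I/O lands somewhere in between.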