Re: [WIP PATCH] for Performance Improvement in Buffer Management - Mailing list pgsql-hackers
From | Jeff Janes |
---|---|
Subject | Re: [WIP PATCH] for Performance Improvement in Buffer Management |
Date | |
Msg-id | CAMkU=1z7x=osjhxR6arnkiKc+C+GU7XWDmAwUefCJK8zZYMgKg@mail.gmail.com |
In response to | Re: [WIP PATCH] for Performance Improvement in Buffer Management (Amit kapila <amit.kapila@huawei.com>) |
Responses | Re: [WIP PATCH] for Performance Improvement in Buffer Management |
List | pgsql-hackers |
On Sun, Oct 21, 2012 at 12:59 AM, Amit kapila <amit.kapila@huawei.com> wrote:
> On Saturday, October 20, 2012 11:03 PM Jeff Janes wrote:
>
>>Run the modes in reciprocating order?
> Sorry, I didn't understood this, What do you mean by modes in reciprocating order?

Sorry for the long delay. In your scripts, it looks like you always run the unpatched first, and then the patched second. By reciprocating, I mean to run them in the reverse order, or in random order.

Also, for the select-only transactions, I think that 20 minutes is much longer than necessary. I'd rather see many more runs, each one being shorter.

Because you can't restart the server without wiping out the shared_buffers, what I would do is make a test patch which introduces a new setting in guc.c that allows the behavior to be turned on and off with a SIGHUP (pg_ctl reload).

>
>>I haven't been able to detect any reliable difference in performance
>>with this patch. I've been testing with 150 scale factor with 4GB of
>>ram and 4 cores, over a variety of shared_buffers and concurrencies.
>
> I think the main reason for this is that when shared buffers are less, then there is no performance gain,
> even the same is observed by me when I ran this test with shared buffers=2G, there is no performance gain.
> Please see the results of shared buffers=2G in below mail:
> http://archives.postgresql.org/pgsql-hackers/2012-09/msg00422.php

True, but I think that testing with shared_buffers=2G when RAM is 4GB (and pgbench scale is also lower) should behave differently than doing so when RAM is 24 GB.

>
> The reason I can think of is because when shared buffers are less then clock sweep runs very fast and there is no bottleneck.
> Only when shared buffers increase above some threshhold, it spends reasonable time in clock sweep.

I am rather skeptical of this. When the working set doesn't fit in memory under a select-only workload, then about half the buffers will be evictable at any given time, and half will have usagecount=1, and a handful will have usagecount>=4 (index meta, root and branch blocks). This will be the case over a wide range of shared_buffers, as long as it is big enough to hold all index branch blocks but not big enough to hold everything. Given this state of affairs, the average clock sweep should inspect about 2 buffers per allocation, regardless of the exact size of shared_buffers.

The one wrinkle I could think of is if all the usagecount=1 buffers are grouped into a contiguous chunk, and all the usagecount=0 are in another chunk. The average would still be 2, but the average would be made up of N/2 runs of length 1, followed by one run of length N/2. Now if 1 process is stuck in the N/2 stretch and all other processes are waiting on that, maybe that somehow escalates the waits so that they are larger when N is larger, but I still don't see how the math works on that.

Are you working on this just because it was on the ToDo List, or because you have actually run into a problem with it? I've never seen freelist lock contention be a problem on machines with fewer than 8 CPUs, but both of us are testing on smaller machines. I think we really need to test this on something bigger.

Cheers,

Jeff
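
The arithmetic behind the "average of about 2" claim above can be checked with a toy model. The sketch below is not PostgreSQL code; it assumes a static snapshot of the buffer pool in which exactly half the buffers have usagecount=0 and half have usagecount=1, ignores pins, the usagecount>=4 index blocks, and concurrent backends, and walks the clock hand around the pool once. The function names (`sweep_run_lengths`, `mixed_layout`, `grouped_layout`) are made up for illustration.

```python
import random
from statistics import mean


def sweep_run_lengths(usage_counts):
    """Buffers inspected per allocation during one revolution of the clock hand.

    Each allocation advances the hand until it reaches a buffer with
    usagecount == 0 (the victim). The pool is treated as circular, so
    buffers left over after the last victim are credited to the first run.
    """
    lengths = []
    inspected = 0
    for count in usage_counts:
        inspected += 1
        if count == 0:
            lengths.append(inspected)
            inspected = 0
    if inspected and lengths:
        lengths[0] += inspected  # wrap around to the first victim
    return lengths


def mixed_layout(n, rng):
    """Half usagecount=0, half usagecount=1, randomly interleaved."""
    counts = [0] * (n // 2) + [1] * (n - n // 2)
    rng.shuffle(counts)
    return counts


def grouped_layout(n):
    """All usagecount=1 buffers in one chunk, then all usagecount=0 buffers."""
    return [1] * (n // 2) + [0] * (n - n // 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for n in (1_000, 100_000):
        layouts = (("mixed", mixed_layout(n, rng)), ("grouped", grouped_layout(n)))
        for name, layout in layouts:
            runs = sweep_run_lengths(layout)
            print(f"N={n:>7} {name:8} mean={mean(runs):.2f} longest={max(runs)}")
```

Under these assumptions the mean is N divided by the number of evictable buffers, because one revolution inspects every buffer once and each usagecount=0 buffer ends exactly one run. It therefore stays at 2 for any N and any arrangement. Only the longest single run changes: in the grouped layout it grows with N, which is the wrinkle described above.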