Re: [PATCHES] [HACKERS] ARC Memory Usage analysis - Mailing list pgsql-performance
From:      Thomas F. O'Connell
Subject:   Re: [PATCHES] [HACKERS] ARC Memory Usage analysis
Date:
Msg-id:    74778BDA-27AC-11D9-B369-000D93AE0944@sitening.com
Responses: Re: [PATCHES] [HACKERS] ARC Memory Usage analysis
           Re: [PATCHES] [HACKERS] ARC Memory Usage analysis
List:      pgsql-performance
Simon,

As a postgres DBA, I find your comments about how not to use effective_cache_size instructive, but I'm still not sure how I should arrive at a target value for it.

On most of the machines on which I admin postgres, I generally set shared_buffers to 10,000 (using what seems to have been the recent conventional wisdom of the lesser of 10,000 or 10% of RAM). I haven't really settled on an optimal value for effective_cache_size, and now I'm again confused as to how I might even benchmark it.

Here are the documents on which I've based my knowledge:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#effcache
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
http://www.ca.postgresql.org/docs/momjian/hw_performance/node8.html

From Bruce's document, I gather that effective_cache_size assumes that both shared buffers and unused RAM are valid sources of cached pages for the purposes of assessing plans. As a result, I was intending to inflate the value of effective_cache_size to closer to the amount of unused RAM on some of the machines I admin (once I've verified that they all have a unified buffer cache). Is that correct?

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Oct 26, 2004, at 3:49 AM, Simon Riggs wrote:

> On Mon, 2004-10-25 at 16:34, Jan Wieck wrote:
>> The problem is, with a too small directory ARC cannot guesstimate what
>> might be in the kernel buffers. Nor can it guesstimate what recently was
>> in the kernel buffers and got pushed out from there. That results in a
>> way too small B1 list, and therefore we don't get B1 hits when in fact
>> the data was found in memory. B1 hits are what increase the T1 target,
>> and since we are missing them with a too small directory size, our
>> implementation of ARC is probably using a T2 size larger than the
>> working set. That is not optimal.
>
> I think I have seen that the T1 list shrinks "too much", but need more
> tests...with some good test results.
>
> The effectiveness of ARC relies upon the balance between the often
> conflicting requirements of "recency" and "frequency". It seems
> possible, even likely, that pgsql's version of ARC may need some subtle
> changes to rebalance it - if we are unlucky enough to find cases where
> it genuinely is out of balance. Many performance tests are required,
> together with a few ideas on extra parameters to include....hence my
> support of Jan's ideas.
>
> That's also why I called the B1+B2 hit ratio "turbulence", because it
> relates to how much oscillation is happening between T1 and T2. In
> physical systems, we expect the oscillations to be damped, but there is
> no guarantee that we have a nearly critically damped oscillator. (Note
> that the absence of turbulence doesn't imply that T1+T2 is optimally
> sized, just that it is balanced.)
>
> [...and although the discussion has wandered away from my original
> patch...would anybody like to commit, or decline the patch?]
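For readers following the T1/T2/B1/B2 terminology, the adaptation rule under discussion works roughly as sketched below. This is a simplification after the Megiddo & Modha ARC paper, not PostgreSQL's buffer-manager code, and the struct and function names are invented for illustration. The point is that B1 hits are the only signal that grows the T1 target, so an undersized B1 starves that signal and lets T2 grow past the working set.

    /*
     * Simplified sketch of ARC's target adaptation (after Megiddo & Modha);
     * the names are illustrative, not PostgreSQL buffer-manager identifiers.
     *
     *   c          total cache size in blocks (the directory covers ~2*c)
     *   t1_target  adaptive target for |T1|, the "recency" list
     *   b1_len     length of ghost list B1 (blocks recently evicted from T1)
     *   b2_len     length of ghost list B2 (blocks recently evicted from T2)
     *
     * A hit in B1 means "T1 was too small", so the target grows; a hit in
     * B2 means "T2 was too small", so the target shrinks.  With a too-small
     * directory, B1 stays short, the B1 hits never arrive, and t1_target
     * never grows.
     */
    #include <stddef.h>

    typedef struct ArcState
    {
        size_t  c;
        size_t  t1_target;
        size_t  b1_len;
        size_t  b2_len;
    } ArcState;

    /* called when a requested block is found in ghost list B1 */
    static void
    arc_hit_in_b1(ArcState *s)
    {
        /* delta = max(1, |B2| / |B1|); b1_len > 0 since we just hit in B1 */
        size_t  delta = (s->b2_len > s->b1_len) ? s->b2_len / s->b1_len : 1;

        s->t1_target = (s->t1_target + delta > s->c) ? s->c : s->t1_target + delta;
    }

    /* called when a requested block is found in ghost list B2 */
    static void
    arc_hit_in_b2(ArcState *s)
    {
        /* delta = max(1, |B1| / |B2|); b2_len > 0 since we just hit in B2 */
        size_t  delta = (s->b1_len > s->b2_len) ? s->b1_len / s->b2_len : 1;

        s->t1_target = (s->t1_target > delta) ? s->t1_target - delta : 0;
    }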
>> If we would replace the dynamic T1 buffers with a max_backends*2 area of
>> shared buffers, use a C value representing the effective cache size and
>> limit the T1 target on the lower bound to effective cache size - shared
>> buffers, then we basically moved the T1 cache into the OS buffers.
>
> Limiting the minimum size of T1len to be 2*max_backends sounds like an
> easy way to prevent overbalancing of T2, but I would like to follow up
> on ways to have T1 naturally stay larger. I'll do a patch with this idea
> in, for testing. I'll call this "T1 minimum size" so we can discuss it.
>
> Any other patches are welcome...
>
> It could be that B1 is too small and so we could use a larger value of C
> to keep track of more blocks. I think what is being suggested is two
> GUCs: shared_buffers (as is), plus another one, larger, which would
> allow us to track what is in shared_buffers and what is in the OS cache.
>
> I have comments on "effective cache size" below....
>
> On Mon, 2004-10-25 at 17:03, Tom Lane wrote:
>> Jan Wieck <JanWieck@Yahoo.com> writes:
>>> This all only holds water if the OS is allowed to swap out shared
>>> memory. And that was my initial question: how likely is it to find
>>> this to be true these days?
>>
>> I think it's more likely than not that the OS will consider shared
>> memory to be potentially swappable. On some platforms there is a shmctl
>> call you can make to lock your shmem in memory, but (a) we don't use it
>> and (b) it may well require privileges we haven't got anyway.
>
> Are you saying we shouldn't, or we don't yet? I simply assumed that we
> did use that function - surely it must be at least an option? RHEL
> supports this at least....
>
> It may well be that we don't have those privileges, in which case we
> turn off the option. Often, we (or I?) will want to install a dedicated
> server, so we should have all the permissions we need, in which case...
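For reference, the shmctl call Tom mentions is a one-line operation where the platform and privileges allow it. A minimal sketch follows; it is illustrative only - as noted above, the backend does not currently do this, and SHM_LOCK is not available on every platform.

    /*
     * Minimal sketch of pinning a SysV shared memory segment in RAM so the
     * kernel won't swap it out.  SHM_LOCK is a SysV/Linux extension and
     * typically needs root or CAP_IPC_LOCK - the privilege problem
     * mentioned above.  Not what the PostgreSQL backend does today.
     */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    static int
    lock_shmem_segment(int shmid)
    {
        if (shmctl(shmid, SHM_LOCK, NULL) < 0)
        {
            perror("shmctl(SHM_LOCK)");     /* e.g. EPERM without privilege */
            return -1;
        }
        return 0;
    }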
>> This has always been one of the arguments against making shared_buffers
>> really large, of course --- if the buffers aren't all heavily used, and
>> the OS decides to swap them to disk, you are worse off than you would
>> have been with a smaller shared_buffers setting.
>
> Not really, just an argument against making them *too* large. Large
> *and* utilised is OK, so we need ways of judging optimal sizing.
>
>> However, I'm still really nervous about the idea of using
>> effective_cache_size to control the ARC algorithm. That number is
>> usually entirely bogus. Right now it is only a second-order influence
>> on certain planner estimates, and I am afraid to rely on it any more
>> heavily than that.
>
> ...ah yes, effective_cache_size.
>
> The manual describes effective_cache_size as if it had something to do
> with the OS, and some of this discussion has picked up on that.
>
> effective_cache_size is used in only two places in the code (both in the
> planner), as an estimate for calculating the cost of a) nonsequential
> access and b) index access, mainly as a way of avoiding overestimates of
> access costs for small tables.
>
> There is absolutely no implication in the code that effective_cache_size
> measures anything in the OS; what it gives is an estimate of the number
> of blocks that will be available from *somewhere* in memory (i.e. in
> shared_buffers OR the OS cache) for one particular table (the one
> currently being considered by the planner).
>
> Crucially, the "size" referred to is the size of the *estimate*, not the
> size of the OS cache (nor the size of the OS cache + shared_buffers).
> So setting effective_cache_size = total memory available, setting
> effective_cache_size = total memory - shared_buffers, or any other
> assumption that directly links memory size to that parameter, are all
> wildly irrelevant things to do. Talking about "effective_cache_size" as
> if it were the OS cache isn't the right thing to do.
>
> ...It could be that we use a very high % of physical memory as
> shared_buffers - in which case effective_cache_size would represent the
> contents of shared_buffers.
>
> Note also that the planner assumes that all tables are equally likely to
> be in cache. Increasing effective_cache_size in postgresql.conf seems
> destined to give the wrong answer in planning unless you absolutely
> understand what it does.
>
> I will submit a patch to correct the description in the manual.
>
> Further comments:
> The two estimates appear to use effective_cache_size differently:
> a) assumes that a table of size effective_cache_size will be 50% in
> cache
> b) assumes that effective_cache_size blocks are available, so a table
> of size == effective_cache_size will be 100% available
>
> IMHO the GUC should be renamed "estimated_cached_blocks", with the old
> name deprecated to force people to re-read the manual description of
> what effective_cache_size means and then set it accordingly....all of
> that in 8.0....
>
> --
> Best Regards, Simon Riggs
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
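To make the asymmetry in Simon's "Further comments" concrete, the sketch below paraphrases the two assumptions as he describes them. It is not the planner's actual costsize.c code - the function names and the exact interpolation are invented for illustration, with rel_pages and effective_cache_size measured in pages.

    /*
     * Paraphrase (NOT the real costsize.c code) of the two uses of
     * effective_cache_size described above.
     *
     *   rel_pages            pages in the table being considered
     *   effective_cache_size the GUC, in pages
     *   random_page_cost     cost of an uncached nonsequential page fetch
     */

    /*
     * (a) nonsequential access: a table as large as effective_cache_size
     * is treated as roughly 50% cached, so its per-page cost falls halfway
     * between "already cached" (1.0) and random_page_cost.
     */
    static double
    nonsequential_page_cost(double rel_pages,
                            double effective_cache_size,
                            double random_page_cost)
    {
        double  cached_fraction;

        if (rel_pages <= 0 || effective_cache_size <= 0)
            return random_page_cost;

        cached_fraction = effective_cache_size / (2.0 * rel_pages);
        if (cached_fraction > 1.0)
            cached_fraction = 1.0;

        return cached_fraction * 1.0 +
               (1.0 - cached_fraction) * random_page_cost;
    }

    /*
     * (b) index access: all effective_cache_size blocks are assumed to be
     * available, so a table no bigger than that is treated as fully cached.
     */
    static double
    index_cached_fraction(double rel_pages, double effective_cache_size)
    {
        if (rel_pages <= 0 || effective_cache_size <= 0)
            return 0.0;

        return (rel_pages <= effective_cache_size)
            ? 1.0
            : effective_cache_size / rel_pages;
    }

Under (a) a table exactly the size of effective_cache_size pays roughly half the random-page penalty, while under (b) the same table is treated as entirely cached - the inconsistency the renaming suggestion is aimed at.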