Re: [PATCHES] ARC Memory Usage analysis - Mailing list pgsql-hackers
From: Kevin Brown
Subject: Re: [PATCHES] ARC Memory Usage analysis
Date:
Msg-id: 20041028032044.GA17583@filer
In response to: Re: [PATCHES] ARC Memory Usage analysis (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Greg Stark <gsstark@MIT.EDU> writes:
> > So I would suggest using something like 100us as the threshold for
> > determining whether a buffer fetch came from cache.
>
> I see no reason to hardwire such a number.  On any hardware, the
> distribution is going to be double-humped, and it will be pretty easy to
> determine a cutoff after minimal accumulation of data.  The real question
> is whether we can afford a pair of gettimeofday() calls per read().
> This isn't a big issue if the read actually results in I/O, but if it
> doesn't, the percentage overhead could be significant.
>
> If we assume that the effective_cache_size value isn't changing very
> fast, maybe it would be good enough to instrument only every N'th read
> (I'm imagining N on the order of 100) for this purpose.  Or maybe we
> need only instrument reads that are of blocks that are close to where
> the ARC algorithm thinks the cache edge is.

If it's decided to instrument reads, then perhaps an even better use of
it would be to tune random_page_cost.  If the storage manager knows the
difference between a sequential scan and a random scan, then it should
easily be able to measure the actual performance it gets for each and
calculate random_page_cost from the results.

While the ARC lists can't be tuned on the fly, random_page_cost can.

> One small problem is that the time measurement gives you only a lower
> bound on the time the read() actually took.  In a heavily loaded system
> you might not get the CPU back for long enough to fool you about whether
> the block came from cache or not.

True, but that's information that you'd want to factor into the
performance measurements anyway.  The database needs to know how much
wall clock time it takes to fetch a page from disk via the OS under
various circumstances.

For determining whether or not the read() hit the disk instead of just
the OS cache, what matters is the average difference between the two.
That's admittedly a problem if the difference is less than the noise,
but at the same time that would imply that, given the circumstances, it
really doesn't matter whether or not the page was fetched from disk:
the difference is small enough that you could consider them equivalent.

You don't need 100% accuracy for this stuff, just statistically
significant accuracy.

> Another issue is what we do with the effective_cache_size value once
> we have a number we trust.  We can't readily change the size of the
> ARC lists on the fly.

Compare it with the current value, and notify the DBA if the values are
significantly different?  Perhaps write the computed value to a file so
the DBA can look at it later?

The same goes for other values that are computed on the fly.  In fact,
it might make sense to store them in a table that gets periodically
updated, load their values from that table, and treat the values in
postgresql.conf or on the command line as the defaults used when
there's nothing in the table (and if you really want fine-grained
control of this process, you could add a boolean column to the table to
indicate whether or not to load the value from the table at startup
time).

-- 
Kevin Brown                                           kevin@sysexperts.com
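
To make the sampling idea above concrete, here is a minimal C sketch of a
read() wrapper that wraps only every N'th call in a pair of gettimeofday()
calls and classifies the sampled time against a cutoff.  Everything here is
illustrative rather than actual PostgreSQL code: the wrapper, the counters,
and the 100us placeholder cutoff (which stands in for a cutoff derived from
the observed double-humped timing distribution) are all assumptions.

    /*
     * Illustrative sketch only: sample every N'th read() with a pair of
     * gettimeofday() calls and classify it against a cutoff.  The cutoff
     * here is a placeholder; in practice it would be derived from the
     * observed timing distribution rather than hardwired.
     */
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define SAMPLE_EVERY      100     /* instrument every N'th read */
    #define CACHE_CUTOFF_USEC 100.0   /* placeholder; tune from data */

    static long reads_seen = 0;
    static long sampled_cached = 0;
    static long sampled_disk = 0;

    static ssize_t timed_read(int fd, void *buf, size_t len)
    {
        struct timeval t0, t1;
        int     sample = (++reads_seen % SAMPLE_EVERY) == 0;
        ssize_t n;

        if (sample)
            gettimeofday(&t0, NULL);

        n = read(fd, buf, len);

        if (sample && n > 0)
        {
            double  usec;

            gettimeofday(&t1, NULL);
            usec = (t1.tv_sec - t0.tv_sec) * 1e6 +
                   (t1.tv_usec - t0.tv_usec);
            if (usec < CACHE_CUTOFF_USEC)
                sampled_cached++;
            else
                sampled_disk++;
        }
        return n;
    }

    /* Tiny driver: time reads of standard input in 8kB chunks. */
    int main(void)
    {
        char buf[8192];

        while (timed_read(STDIN_FILENO, buf, sizeof buf) > 0)
            ;
        printf("sampled reads: %ld cached-looking, %ld disk-looking\n",
               sampled_cached, sampled_disk);
        return 0;
    }

The fraction of sampled reads that land on the fast side of the cutoff is
the kind of number that could feed an effective_cache_size estimate without
paying the timing overhead on every read.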
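
In the same spirit, the random_page_cost suggestion can be illustrated as
keeping separate running averages of per-page fetch time for sequential and
random access and taking their ratio, since random_page_cost is expressed
relative to the cost of a sequential page fetch.  A minimal sketch, with
hypothetical names and made-up numbers:

    /*
     * Illustrative sketch only: estimate random_page_cost as the ratio of
     * the average random-access fetch time to the average sequential fetch
     * time.  The struct and function names are hypothetical.
     */
    #include <stdio.h>

    struct fetch_stats
    {
        double total_usec;   /* accumulated wall clock time */
        long   pages;        /* number of timed page fetches */
    };

    static double avg_usec(const struct fetch_stats *s)
    {
        return s->pages > 0 ? s->total_usec / s->pages : 0.0;
    }

    static double estimate_random_page_cost(const struct fetch_stats *seq,
                                            const struct fetch_stats *rnd)
    {
        double seq_avg = avg_usec(seq);
        double rnd_avg = avg_usec(rnd);

        if (seq_avg <= 0.0 || rnd_avg <= 0.0)
            return 4.0;          /* fall back to the stock default */
        return rnd_avg / seq_avg;
    }

    int main(void)
    {
        /* Made-up measurements: 120us/page sequential, 900us/page random. */
        struct fetch_stats seq = { 120000.0, 1000 };
        struct fetch_stats rnd = { 900000.0, 1000 };

        printf("estimated random_page_cost = %.2f\n",
               estimate_random_page_cost(&seq, &rnd));
        return 0;
    }

Falling back to the stock default of 4.0 when either average is missing is
just one choice; the point is only that the estimate comes from measured
times rather than a guessed constant.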