Re: [PATCHES] ARC Memory Usage analysis

From: Kevin Brown
Subject: Re: [PATCHES] ARC Memory Usage analysis
Date:
Msg-id: 20041028032044.GA17583@filer
In response to: Re: [PATCHES] ARC Memory Usage analysis (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Greg Stark <gsstark@MIT.EDU> writes:
> > So I would suggest using something like 100us as the threshold for
> > determining whether a buffer fetch came from cache.
> 
> I see no reason to hardwire such a number.  On any hardware, the
> distribution is going to be double-humped, and it will be pretty easy to
> determine a cutoff after minimal accumulation of data.  The real question
> is whether we can afford a pair of gettimeofday() calls per read().
> This isn't a big issue if the read actually results in I/O, but if it
> doesn't, the percentage overhead could be significant.
> 
> If we assume that the effective_cache_size value isn't changing very
> fast, maybe it would be good enough to instrument only every N'th read
> (I'm imagining N on the order of 100) for this purpose.  Or maybe we
> need only instrument reads that are of blocks that are close to where
> the ARC algorithm thinks the cache edge is.
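
To make the sampling idea concrete, here's a rough sketch of timing
every N'th read().  This is not backend code: instrumented_read() and
record_read_time() are made-up names, and record_read_time() stands in
for whatever would accumulate the latency distribution.

    #include <sys/time.h>
    #include <unistd.h>

    #define SAMPLE_EVERY 100            /* "N on the order of 100" */

    static unsigned long read_counter = 0;

    /* Hypothetical hook that feeds the observed latency into whatever
     * accumulates the double-humped distribution. */
    extern void record_read_time(long usec);

    static ssize_t
    instrumented_read(int fd, void *buf, size_t len)
    {
        struct timeval before, after;
        ssize_t        n;

        if (++read_counter % SAMPLE_EVERY != 0)
            return read(fd, buf, len);  /* fast path: no gettimeofday() cost */

        gettimeofday(&before, NULL);
        n = read(fd, buf, len);
        gettimeofday(&after, NULL);

        record_read_time((after.tv_sec - before.tv_sec) * 1000000L +
                         (after.tv_usec - before.tv_usec));
        return n;
    }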

If we decide to instrument reads, then perhaps an even better use of
the data would be to tune random_page_cost.  If the storage manager
knows the difference between a sequential scan and a random scan, it
should easily be able to measure the actual performance it gets for
each and calculate random_page_cost from the results.

While the ARC lists can't be tuned on the fly, random_page_cost can.
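
Since random_page_cost is defined as the cost of a nonsequentially
fetched page relative to a sequentially fetched one, the ratio of the
two measured average latencies is a direct estimate of it.  A sketch,
assuming the storage manager kept running totals (FetchStats and its
field names are made up for illustration):

    typedef struct FetchStats
    {
        double  seq_total_usec;     /* wall-clock time in sequential fetches */
        long    seq_count;
        double  rand_total_usec;    /* wall-clock time in random fetches */
        long    rand_count;
    } FetchStats;

    static double
    estimate_random_page_cost(const FetchStats *fs)
    {
        double  seq_avg, rand_avg;

        if (fs->seq_count == 0 || fs->rand_count == 0)
            return -1.0;            /* not enough data yet */

        seq_avg = fs->seq_total_usec / fs->seq_count;
        rand_avg = fs->rand_total_usec / fs->rand_count;

        /* ratio of average random fetch time to average sequential one */
        return rand_avg / seq_avg;
    }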

> One small problem is that the time measurement gives you only a lower
> bound on the time the read() actually took.  In a heavily loaded system
> you might not get the CPU back for long enough to fool you about whether
> the block came from cache or not.

True, but that's information you'd want to factor into the
performance measurements anyway.  The database needs to know how much
wall clock time it takes to fetch a page from disk via the OS under
various circumstances.  For determining whether the read() hit the
disk rather than just the OS cache, what matters is the average
difference between the two.  Admittedly that breaks down if the
difference is smaller than the noise, but that would in turn imply
that, under those circumstances, it really doesn't matter whether the
page was fetched from disk: the difference is small enough that you
could consider them equivalent.


You don't need 100% accuracy for this stuff, just statistically
significant accuracy.
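
For picking the cutoff between the two humps without hardwiring a
value like 100us, even something as crude as one-dimensional
two-means over the sampled times ought to be enough.  A sketch
(find_cache_cutoff() is a made-up name; samples[] would be whatever
the instrumentation accumulated):

    /*
     * Crude 1-D two-means over sampled read times: split the samples
     * into a "cache" group and a "disk" group and put the cutoff
     * halfway between the two group means.  Assumes the distribution
     * really is double-humped.
     */
    static long
    find_cache_cutoff(const long *samples, int n)
    {
        double  lo, hi, cutoff;
        int     i, iter;

        lo = hi = (double) samples[0];
        for (i = 1; i < n; i++)
        {
            if (samples[i] < lo)
                lo = samples[i];
            if (samples[i] > hi)
                hi = samples[i];
        }

        cutoff = (lo + hi) / 2.0;
        for (iter = 0; iter < 20; iter++)
        {
            double  sum_lo = 0.0, sum_hi = 0.0, new_cutoff;
            int     n_lo = 0, n_hi = 0;

            for (i = 0; i < n; i++)
            {
                if (samples[i] < cutoff)
                {
                    sum_lo += samples[i];
                    n_lo++;
                }
                else
                {
                    sum_hi += samples[i];
                    n_hi++;
                }
            }
            if (n_lo == 0 || n_hi == 0)
                break;          /* only one hump visible; give up */
            new_cutoff = (sum_lo / n_lo + sum_hi / n_hi) / 2.0;
            if (new_cutoff == cutoff)
                break;          /* converged */
            cutoff = new_cutoff;
        }
        return (long) cutoff;
    }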


> Another issue is what we do with the effective_cache_size value once
> we have a number we trust.  We can't readily change the size of the
> ARC lists on the fly.

Compare it with the current value, and notify the DBA if the values
are significantly different?  Perhaps write the computed value to a
file so the DBA can look at it later?

The same goes for other values that are computed on the fly.  In
fact, it might make sense to store them in a table that gets
periodically updated, load their values from that table at startup,
and treat the values in postgresql.conf or on the command line as the
defaults used when the table has nothing (and if you really want
fine-grained control of this process, you could add a boolean column
to the table indicating whether to load each value at startup time).
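
For the notify-the-DBA part, something as simple as the following
would do.  This assumes backend context; the function name and the
factor-of-two threshold are made up for illustration.

    #include "postgres.h"       /* for elog() */

    static void
    report_cache_size_estimate(double computed, double configured)
    {
        /* Complain only when the measured estimate is off by more than 2x. */
        if (computed > configured * 2.0 || computed < configured / 2.0)
            elog(LOG, "effective_cache_size is set to %.0f pages, "
                 "but measured behavior suggests about %.0f pages",
                 configured, computed);
    }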


-- 
Kevin Brown                          kevin@sysexperts.com

