On Thu, May 26, 2011 at 2:05 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> I'm a bit confused by this - what the current design obfuscates is
>> the fact that reltuples and relpages are not really independent
>> columns; you can't update one without updating the other, unless
>> you want screwy behavior. Replacing reltuples by reltupledensity
>> would fix that problem - it would be logical and non-damaging to
>> update either column independently.
>
> They don't always move in tandem. Certainly there can be available
> space in those pages from which tuples can be allocated or which
> increases as tuples are vacuumed. Your proposed change would
> neither make more or less information available, because we've got
> two numbers which can be observed as raw counts, and a ratio between
> them.
So far I agree.
> By storing the ratio and one count you make changes to the
> other count implied and less visible. It seems more understandable
> and less prone to error (to me, anyway) to keep the two "raw"
> numbers and calculate the ratio -- and when you observe a change in
> one raw number which you believe should force an adjustment to the
> other raw number before its next actual value is observed, to
> comment on why that's a good idea, and do the trivial arithmetic at
> that time.
Except that's not how it works. At least in the case of ANALYZE, we
*aren't* counting all the tuples in the table. We're selecting a
random sample of pages and inferring a tuple density, which we then
extrapolate to the whole table and store. Then when we pull it back
out of the table, we convert it back to a tuple density. The real
value we are computing and using almost everywhere is tuple density;
storing a total number of tuples in the table appears to be just
confusing the issue.
Unless, of course, I am misunderstanding, which is possible.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company