Re: Group-count estimation statistics - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Group-count estimation statistics
Date
Msg-id 15045.1107274501@sss.pgh.pa.us
In response to Re: Group-count estimation statistics  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-hackers
Manfred Koizar <mkoi-pg@aon.at> writes:
> On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, I see, you want a "max" calculation in there too.  Seems reasonable.
>> Any objections?

> Yes.  :-(  What I said is only true in the absence of any WHERE clause
> (or join).  Otherwise the same cross-column correlation issues you tried
> to work around with the N/10 clamping might come back through the
> backdoor.  I'm not sure whether coding for such a narrow use case is
> worth the trouble.  Forget my idea.

No, I think it's still good.  The WHERE clauses are factored in
separately (essentially by assuming their selectivity on the grouped
rows is the same as it would be on the raw rows, which is pretty bogus
but it's hard to do better).  The important point is that the group
count before WHERE filtering certainly does behave as you suggest,
and so the clamp is going to be overoptimistic if it clamps to less than
the largest individual number-of-distinct-values.
        regards, tom lane
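
To make the clamping argument above concrete, here is a minimal sketch of the idea under discussion. It is not the planner's actual code: the function name, its parameters, and the exact shape of the rule (multiply the per-column distinct counts, clamp a multi-column product to N/10, but never clamp below the largest single-column distinct count) are assumptions reconstructed from this thread. WHERE-clause selectivity is deliberately left out, since the message says it is factored in separately.

```python
def estimate_group_count(ndistincts, total_rows):
    """Illustrative group-count estimate before WHERE filtering.

    ndistincts: per-column number-of-distinct-values estimates (hypothetical input)
    total_rows: N, the number of raw input rows
    """
    if not ndistincts:
        return 1.0  # no grouping columns: a single group

    # Naive estimate: assume the columns are independent.
    product = 1.0
    for nd in ndistincts:
        product *= nd

    estimate = product
    if len(ndistincts) > 1:
        # Clamp to N/10 to guard against cross-column correlation,
        # but never below the largest individual distinct count:
        # grouping by more columns cannot yield fewer groups than
        # grouping by any one of them alone.
        clamp = max(total_rows / 10.0, max(ndistincts))
        estimate = min(product, clamp)

    # There can never be more groups than input rows.
    return min(estimate, float(total_rows))
```

For example, with 2000 rows and columns having 1000 and 5 distinct values, a bare N/10 clamp would give 200 groups, which is fewer than the 1000 groups the first column produces by itself; with the "max" term the sketch returns 1000 instead.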

