On 08/03/2011 06:29 AM, Robert Haas wrote:
>> b. the Merge Join cost estimator did a poor job with the data it was given:
>>
>> In function eqjoinsel_inner there are two cases (1) ANALYZE data is
>> available for both sides of the join and (2) ANALYZE data is missing for one
>> or both sides. Due to the GROUP BY processing described above, ANALYZE data
>> was available for "t" but not for "SELECT * FROM t GROUP BY ...".
>>
>> The logic in that case is "use the column with the most distinct values" to
>> estimate selectivity. The default number of distinct values for a column
>> with no data (DEFAULT_NUM_DISTINCT) is 200. In my join the number of values
>> was:
>>
>> col in GROUP BY in table t
>> j 200 1
>> k 200 1
>> x 200 10
>> y 200 1000
>> z 200 30
>>
>> In 4 of the 5 columns the default value had more distinct values, and the
>> combined selectivity (chance that two arbitrary rows would have a join
>> match) was (1/200)^4 * 1/1000. Very small. The error is, IMO, that the code
>> does not distinguish known numbers from default numbers. A comment in the
>> code acknowledges this:
>
> I'm not sure I understand what you're getting at here, unless the idea
> is to make get_variable_numdistinct() somehow indicate to the caller
> whether it had to punt. That might be worth doing.
Yes, the first step is to make "punt" a separate indicator. The second
would be to make good use of that indicator. As it is now, with "punt"
being a possible data value, there two types of errors:
False negative (code treats DEFAULT_NUM_DISTINCT as ordinary case when
it is special):
I wanted eqjoinsel_inner() to treat "punt" specially: to use the value
from the known side of the JOIN when the other side is unknown. The
current behavior, although not ideal, is the expected use of a default
value.
False positive (code treats DEFAULT_NUM_DISTINCT as special case when it
is ordinary):
eqjoinsel_semi() and estimate_hash_bucketsize() treat
DEFAULT_NUM_DISTINCT specially. This behavior is less defensible than
false positive, since a valid numeric value is being re-used as a flag.
I suggest wrapping the value in a struct (to avoid accidental use) and
using macros for read access.
typedef struct {
double value; // negative means "unknown"
} num_distinct_t;
#define IS_NUM_DISTINCT_DEFINED(nd) ((nd).value >= 0)
#define NUM_DISTINCT_VALUE(nd) ((nd).value)
- Clem Dickey
P.S. Congratulations on displacing MySQL in Mac OS X Lion Server.