Josh Berkus <josh@agliodbs.com> writes:
> Tom,
>> Okay, here's our problem:
>>
>> live DB: tgroup_id n_distinct = -1
>>
>> test DN: tgroup_id n_distinct = 11
>>
>> The former estimate actually means that it thinks tgroup_id is a unique
>> column, whereas the latter says there are only 11 distinct values in the
>> column. I assume the former is much nearer to the truth (how many rows
>> in cases, and how many distinct tgroup_id values)?
> The real case is that there are 113 distinct tgroup_ids, which cover
> about 10% of the population of cases. The other 90% is NULL. The
> average tgroup_id is shared between 4.7 cases.
> So this seems like sampling error.
Partly. The numbers suggest that in ANALYZE's default sample of 3000
rows, it's only finding about a dozen non-null tgroup_ids (yielding the
0.996 null_frac value); and that in one case all dozen are different and
in the other case there are two duplicates. It would help if you
boosted the stats target for this column by a factor of 10. (You can
do that in 7.2, btw --- IIRC the only problem is that a pg_dump won't
show that you did so.)
But the other part of the problem is that in 7.2, the join selectivity
estimator is way off when you are joining a unique column (like the pkey
on the other side) to a column with a very large fraction of nulls.
We only discovered this recently; it's fixed as of 7.3.3:
2003-04-15 01:18 tgl
* src/backend/utils/adt/selfuncs.c (REL7_3_STABLE): eqjoinsel's
logic for case where MCV lists are not present should account for
NULLs; in hindsight this is obvious since the code for the
MCV-lists case would reduce to this when there are zero entries in
both lists. Per example from Alec Mitchell.
Possibly you could backpatch that into 7.2, although I'd think an update
to 7.3.4 would be a more profitable use of time.
regards, tom lane