Alec Mitchell <apm13@columbia.edu> writes:
> The tr/t join produces 52 rows with unique trailers (the primary key on tr)
> out of the 750 available (the planner estimates 62). These are then joined
> with the manifests table m, which has 13526 rows. The relationship between
> tr.trailer and m.trailer is a bit complex. Of the 750 possible trailer
> values in tr, 607 have a one-to-one mapping to rows in m. The remaining 143
> values are each referenced in 1-70 (avg 24) different rows in m.
> Additionally, there are 9510 rows in m (the vast majority) that have a null
> value for trailer (perhaps that is the cause of these bad statistics).
Hmm ... we fixed a bug last fall in which NULLs were twice-excluded from
the estimates for range queries, leading to silly results when the
proportion of nulls got nontrivial. This isn't a range query, but
I wonder if there's a similar bug lurking here ...
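
To show why the proportion of nulls matters here, the following is a rough back-of-the-envelope sketch using only the figures quoted above. It is not the planner's actual code: the even-spread assumption and the "apply the non-null fraction twice" step are illustrative assumptions, and all variable names are made up for this example.

```python
# Figures quoted from the message above.
total_rows_m = 13526       # rows in the manifests table m
null_rows_m = 9510         # rows in m whose trailer is NULL
distinct_trailers = 750    # possible trailer values in tr
selected_trailers = 52     # trailers produced by the tr/t join

# Fraction of m that can never match a join on trailer.
null_frac = null_rows_m / total_rows_m
non_null_rows = total_rows_m - null_rows_m

# Assumption: non-null rows of m are spread evenly over all trailers.
avg_rows_per_trailer = non_null_rows / distinct_trailers
expected_join_rows = selected_trailers * avg_rows_per_trailer

# Hypothetical illustration of double exclusion: the non-null fraction
# is applied once (intended) versus twice (the kind of mistake described).
rows_excluded_once = total_rows_m * (1 - null_frac)
rows_excluded_twice = total_rows_m * (1 - null_frac) ** 2

print(f"null fraction:        {null_frac:.3f}")
print(f"non-null rows in m:   {non_null_rows}")
print(f"expected join rows:   {expected_join_rows:.0f}")
print(f"nulls excluded once:  {rows_excluded_once:.0f}")
print(f"nulls excluded twice: {rows_excluded_twice:.0f}")
```

With roughly 70% of m being null, counting the nulls out a second time shrinks the row estimate by a further factor of about 0.3, which is why an error of this kind only becomes visible once the null fraction is large.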
Could you see your way to sending me a dump of these tables for testing
purposes? You could exclude the columns not used in the FROM/WHERE
clauses, if that is needed to satisfy privacy worries.
regards, tom lane