Re: Odd estimation issue with user-defined type - Mailing list pgsql-performance

From Tom Lane
Subject Re: Odd estimation issue with user-defined type
Msg-id 23120.1283526855@sss.pgh.pa.us
In response to Odd estimation issue with user-defined type  (Florian Weimer <fweimer@bfk.de>)
Responses Re: Odd estimation issue with user-defined type  (Florian Weimer <fweimer@bfk.de>)
List pgsql-performance
Florian Weimer <fweimer@bfk.de> writes:
> I've created a few user-defined types quite similar to uuid which we
> use to store various hashes in the database.  (The types use binary
> encoding internally, but only expose hexadecimal strings externally.)

> The hashes are roughly equidistributed, so when I do a range query
> which is essentially based on a hash prefix(*), I expect the result to
> contain N * 2**(-k) results, where N is the table size and k the
> number of bits in the range.  Actual query results show that this is
> the case.  The odd thing is that the planner thinks that the range
> query will return about one quarter of the table, independently of the
> range specified.  Of course, the row estimates are quite far off as a
> result, leading to suboptimal plans.

> Any idea what could cause this?  Do I need to provide some estimator
> function somewhere?

If you haven't, then how would you expect the planner to know that?

Less flippantly, you really need to tell us exactly what planner support
you did provide, before you can expect any intelligent comment.  Has the
type got a default btree opclass?  What selectivity estimators did you
attach to the comparison operators?  Do you get MCV and/or histogram
entries in pg_stats when you ANALYZE one of these columns, and if so
do they look sane?
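
To put the size of the misestimate in numbers: the quoted expectation is
N * 2**(-k) rows for a k-bit prefix, against a planner estimate of about
N / 4 regardless of k. A minimal sketch in Python, assuming a hypothetical
table of 1,000,000 rows and taking the "about one quarter" figure at face
value; the function names are invented for illustration and are not part
of PostgreSQL:

```python
def expected_rows(table_rows: int, prefix_bits: int) -> float:
    """Rows expected to match a k-bit hash prefix, assuming the hashes
    are uniformly distributed (N * 2**(-k))."""
    return table_rows * 2.0 ** (-prefix_bits)


def overestimate_factor(planner_rows: float, table_rows: int,
                        prefix_bits: int) -> float:
    """How many times larger the planner's row estimate is than the
    uniform-hash expectation."""
    return planner_rows / expected_rows(table_rows, prefix_bits)


if __name__ == "__main__":
    table_rows = 1_000_000          # hypothetical table size
    planner_rows = table_rows / 4   # the reported "one quarter" estimate
    for prefix_bits in (8, 16, 24):
        exp = expected_rows(table_rows, prefix_bits)
        factor = overestimate_factor(planner_rows, table_rows, prefix_bits)
        print(f"k={prefix_bits:2d}  expected={exp:12.4f}  "
              f"planner={planner_rows:.0f}  factor={factor:.0f}x")
```

The factor grows as 2**(k - 2), so the longer the prefix, the worse a
fixed one-quarter estimate becomes.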

            regards, tom lane
