Re: Selectivity estimation for inet operators - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Selectivity estimation for inet operators
Msg-id 9199.1409427071@sss.pgh.pa.us
In response to Re: Selectivity estimation for inet operators  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Selectivity estimation for inet operators
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> * inet_mcv_join_selec() is O(n^2) where n is the number of entries in 
> the MCV lists. With the max statistics target of 10000, a worst case 
> query on my laptop took about 15 seconds to plan. Maybe that's 
> acceptable, but you went through some trouble to make planning of MCV vs 
> histogram faster, by the log2 method to compare only some values, so I 
> wonder why you didn't do the same for the MCV vs MCV case?

Actually, what I think needs to be asked is the opposite question: why is
the other code ignoring some of the statistical data?  If the user asked
us to collect a lot of stats detail it seems reasonable that he's
expecting us to use it to get more accurate estimates.  It's for sure
not obvious why these estimators should take shortcuts that are not being
taken in the much-longer-established code for scalar comparison estimates.

I'm not exactly convinced that the math adds up in this logic, either.
The way in which it combines results from looking at the MCV lists and
at the histograms seems pretty arbitrary.
        regards, tom lane
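
For readers following the complexity point quoted above, here is a minimal sketch of why an exhaustive MCV-vs-MCV comparison is quadratic. It is not the patch's inet_mcv_join_selec(). The function names, the sample MCV lists, and the choice of network overlap as the operator are all assumptions made for illustration. Each list holds (value, frequency) pairs, and the estimate sums the product of frequencies over every matching pair, so the work grows with the product of the two list lengths.

```python
from ipaddress import ip_network


def mcv_join_selectivity(mcv1, mcv2, matches):
    """Sum freq1 * freq2 over every matching pair of MCV entries.

    Hypothetical helper: every entry of mcv1 is compared with every
    entry of mcv2, so the cost is O(len(mcv1) * len(mcv2)).
    """
    selec = 0.0
    for value1, freq1 in mcv1:
        for value2, freq2 in mcv2:
            if matches(value1, value2):
                selec += freq1 * freq2
    return selec


def overlaps(a, b):
    """Illustrative operator: true when the two networks share addresses."""
    return ip_network(a).overlaps(ip_network(b))


# Made-up MCV lists of (network, fraction of rows) pairs.
left = [("10.0.0.0/8", 0.30), ("192.168.1.0/24", 0.10)]
right = [("10.1.0.0/16", 0.20), ("172.16.0.0/12", 0.05)]

# Only 10.0.0.0/8 and 10.1.0.0/16 overlap, contributing 0.30 * 0.20.
sel = mcv_join_selectivity(left, right, overlaps)
```

With two lists of 10000 entries each, the inner comparison runs 100 million times, which is consistent with the multi-second planning time reported above. Using every entry, as this sketch does, is the trade-off under discussion: more planning work in exchange for using all of the collected statistics.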


