Re: Selectivity estimation for inet operators - Mailing list pgsql-hackers

From Dilip kumar
Subject Re: Selectivity estimation for inet operators
Date
Msg-id 4205E661176A124FAF891E0A6BA913526633C027@szxeml509-mbs.china.huawei.com
In response to Selectivity estimation for inet operators  (Emre Hasegeli <emre@hasegeli.com>)
Responses Re: Selectivity estimation for inet operators  (Emre Hasegeli <emre@hasegeli.com>)
List pgsql-hackers
On 15 May 2014 14:04, Emre Hasegeli wrote:

> 
> * matching first MCV to second MCV
> * searching first MCV in the second histogram
> * searching second MCV in the first histogram
> * searching boundaries of the first histogram in the second histogram
> 
> Comparing the lists with each other slows down the function when
> statistics set to higher values. To avoid this problem I only use
> log(n) values of the lists. It is the first log(n) value for MCV,
> evenly separated values for histograms. In my tests, this optimization
> does not affect the planning time when statistics = 100, but does
> affect accuracy of the estimation. I can send the version without this
> optimization, if slow down with larger statistics is not a problem
> which should be solved on the selectivity estimation function.
>
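To check that I read the sampling description above correctly, here is a minimal sketch of how I understand it. It is not taken from the patch: the function names are hypothetical, and I assume a base-2 integer logarithm, since the base is not stated above.

```c
#include <stddef.h>

/*
 * Hypothetical helper: how many of n statistics entries to look at.
 * Assumes floor(log2(n)), clamped to at least 1 when n > 0.
 */
static size_t
toy_sample_count(size_t n)
{
    size_t k = 0;

    if (n == 0)
        return 0;
    while (n > 1)
    {
        n >>= 1;
        k++;
    }
    return (k < 1) ? 1 : k;
}

/*
 * Histogram case: pick evenly separated indexes, including the first and
 * the last entry.  The caller must provide room for at least 64 indexes
 * in out[].  Returns the number of indexes written.
 *
 * For the MCV case the sample would simply be indexes
 * 0 .. toy_sample_count(n) - 1, i.e. the most common values first.
 */
static size_t
toy_hist_sample_indexes(size_t n, size_t *out)
{
    size_t k = toy_sample_count(n);
    size_t i;

    if (k == 0)
        return 0;
    if (k == 1)
    {
        out[0] = 0;
        return 1;
    }
    for (i = 0; i < k; i++)
        out[i] = i * (n - 1) / (k - 1);
    return k;
}
```

With 100 histogram entries this sketch would visit only 6 of them, which is where both the speed-up and the loss of accuracy mentioned above would come from.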

I have started reviewing this patch. So far I have done a basic review and some testing/debugging.

1. Patch applied to git head.
2. Basic testing works fine.

I have one question.

In the inet_his_inclusion_selec function, when the constant matches only the right boundary of a bucket and that bucket is the last one, the bucket is never considered as a partial-match candidate.

In my opinion, if it is not the last bucket, there is no problem: the same boundary becomes the left boundary of the next bucket and is treated as a partial match there. In the case of the last bucket, however, there is no next bucket, so the match is missed and the selectivity can be wrong.

Can't we consider it a partial bucket match if it is the last bucket?
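To make the case concrete, here is a small standalone sketch of the behaviour I am describing. It is not the code from the patch: it uses plain integers instead of inet values, the names are hypothetical, and the weight of 0.5 for a partial bucket is only an assumption for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the constant: matches boundaries in [lo, hi]. */
typedef struct
{
    unsigned lo;
    unsigned hi;
} ToyRange;

static bool
toy_match(unsigned boundary, ToyRange c)
{
    return boundary >= c.lo && boundary <= c.hi;
}

/*
 * bounds[] holds nbounds sorted histogram boundaries, i.e. nbounds - 1
 * buckets.  A bucket counts fully when both boundaries match, and as a
 * partial match (assumed weight 0.5) when only the left one matches.
 * A right-only match is normally skipped, because that boundary is the
 * left boundary of the next bucket.  With fix_last_bucket set, a
 * right-only match in the last bucket is counted as partial too.
 */
static double
toy_hist_selec(const unsigned *bounds, size_t nbounds, ToyRange c,
               bool fix_last_bucket)
{
    double match = 0.0;
    size_t nbuckets;
    size_t i;

    if (nbounds < 2)
        return 0.0;
    nbuckets = nbounds - 1;

    for (i = 0; i < nbuckets; i++)
    {
        bool left = toy_match(bounds[i], c);
        bool right = toy_match(bounds[i + 1], c);

        if (left && right)
            match += 1.0;
        else if (left)
            match += 0.5;
        else if (right && fix_last_bucket && i == nbuckets - 1)
            match += 0.5;
    }
    return match / (double) nbuckets;
}
```

With boundaries {10, 20, 30, 40} and a constant that matches only 40, this sketch returns 0 without the last-bucket check and a non-zero estimate with it. A constant that matches only 30 gives the same result either way, because 30 is also the left boundary of the next bucket.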

Apart from that, there is one spelling mistake you can correct in the inet_his_inclusion_selec comments:
histogram boundies -> histogram boundaries :)

Thanks & Regards,
Dilip Kumar




