Selectivity estimation for inet operators - Mailing list pgsql-hackers

From Emre Hasegeli
Subject Selectivity estimation for inet operators
Msg-id CAE2gYzwNBfGe9+1RP4UU6be9q2_m7Bfe6T18JgBSV4qJrkmxEQ@mail.gmail.com
List pgsql-hackers
A new version of the selectivity estimation patch is attached. I am
adding it to CommitFest 2014-06. The previous version was reviewed by
Andreas Karlson in the previous CommitFest, together with the GiST
support patch. The new version includes join selectivity estimation.

Join selectivity is calculated in 4 steps:

* matching first MCV to second MCV
* searching first MCV in the second histogram
* searching second MCV in the first histogram
* searching boundaries of the first histogram in the second histogram
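
The four steps above can be sketched roughly as follows. This is not
the code from the patch: it is a simplified illustration that treats
each histogram boundary as a representative of an equal share of the
non-MCV rows, ignores NULLs, and uses Python's ipaddress module as a
stand-in for the inet type. The names contained_by() and
join_selectivity() are hypothetical.

```python
from ipaddress import ip_network


def contained_by(a, b):
    """Stand-in for an inet operator such as <<: a is strictly inside b."""
    return a != b and a.subnet_of(b)


def join_selectivity(mcv1, hist1, mcv2, hist2, op=contained_by):
    """Estimate the selectivity of `col1 op col2`.

    mcv1, mcv2: lists of (value, frequency) pairs.
    hist1, hist2: lists of histogram boundary values.
    Assumption: rows not covered by the MCV list are spread evenly
    over the histogram boundaries.
    """
    # Share of rows described by each histogram (NULLs ignored here).
    w1 = (1.0 - sum(f for _, f in mcv1)) if hist1 else 0.0
    w2 = (1.0 - sum(f for _, f in mcv2)) if hist2 else 0.0

    # Step 1: match the first MCV list against the second MCV list.
    sel = sum(f1 * f2 for v1, f1 in mcv1 for v2, f2 in mcv2 if op(v1, v2))

    # Step 2: search each first-side MCV in the second histogram.
    if hist2:
        for v1, f1 in mcv1:
            hits = sum(1 for b in hist2 if op(v1, b))
            sel += f1 * w2 * hits / len(hist2)

    # Step 3: search each second-side MCV in the first histogram.
    if hist1:
        for v2, f2 in mcv2:
            hits = sum(1 for b in hist1 if op(b, v2))
            sel += f2 * w1 * hits / len(hist1)

    # Step 4: search the first histogram's boundaries in the second one.
    if hist1 and hist2:
        hits = sum(1 for b1 in hist1 for b2 in hist2 if op(b1, b2))
        sel += w1 * w2 * hits / (len(hist1) * len(hist2))

    return min(1.0, sel)


# Made-up statistics for two inet columns.
mcv1 = [(ip_network("10.0.0.1/32"), 0.5)]
hist1 = [ip_network("10.0.1.0/24"), ip_network("192.168.0.0/24")]
mcv2 = [(ip_network("10.0.0.0/8"), 0.4)]
hist2 = [ip_network("10.0.0.0/16"), ip_network("172.16.0.0/12")]
estimate = join_selectivity(mcv1, hist1, mcv2, hist2)
```

Each step contributes one term, so the cost is dominated by the
pairwise comparisons between the two lists in each step.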

Comparing the lists with each other slows down the function when the
statistics target is set to higher values. To avoid this problem, I use
only log(n) values from each list: the first log(n) values of the MCV
list, and log(n) evenly separated values of the histogram. In my tests,
this optimization does not affect the planning time when the statistics
target is 100, but it does affect the accuracy of the estimation. I can
send the version without this optimization if the slowdown with larger
statistics targets is not a problem that should be solved in the
selectivity estimation function.
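
As an illustration of the sampling idea, here is a minimal sketch. The
base of the logarithm and the rounding are assumptions (base 2, rounded
up), and the helper names sample_count(), first_mcvs() and
evenly_spaced() are hypothetical, not taken from the patch.

```python
import math


def sample_count(n):
    """How many of n entries to compare. Assumes ceil(log2(n))."""
    if n <= 0:
        return 0
    return max(1, min(n, math.ceil(math.log2(n))))


def first_mcvs(mcv):
    """Keep the first log(n) MCV entries.

    Assumes the list is ordered by descending frequency, so the
    entries kept are the most common ones.
    """
    return list(mcv[:sample_count(len(mcv))])


def evenly_spaced(hist):
    """Keep log(n) histogram boundaries spread evenly over the list,
    always including the first and the last boundary when k >= 2."""
    n = len(hist)
    k = sample_count(n)
    if k >= n:
        return list(hist)
    if k <= 1:
        return list(hist[:k])
    step = (n - 1) / (k - 1)
    return [hist[round(i * step)] for i in range(k)]


# A histogram with 101 boundaries, as with a statistics target of 100.
boundaries = list(range(101))
picked = evenly_spaced(boundaries)
```

Under these assumptions, two lists of 100 entries are each reduced to 7
entries, so one step needs 49 comparisons instead of 10000. The price
is that matches among the skipped entries are not seen.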

I am also attaching the script I used for testing. I left log statements
in the networkjoinsel() function to make testing easier; these statements
should be removed before commit.
