Thread: GiST support for inet datatypes

GiST support for inet datatypes

From
Emre Hasegeli
Date:
Hi,

Attached patch adds GiST support to the inet datatypes with two new
operators. Overlaps operator can be used with exclusion constraints.
Is adjacent to operator is just the negator of it. Index uses only
the network bits of the addresses. Except for the new operators and
is contained within, contains; basic comparison operators are also
supported.

Query planner never chooses to use the index for the operators which
the index is particularly useful because selectivity estimation functions
are missing. I am planning to work on them.

I also wanted to add strictly left of and strictly right of operators
but I did not want to introduce new symbols. I think we need style
guidelines for them. Range types use <@ and @> for is contained within
and contains operators; << and >> for strictly left of and strictly right of
operators. It would be nice if we could change the symbols for contains
and is contained within operators of the inet datatypes. Then we could
use the old ones for strictly left of and strictly right of operators.

I did not touch opr_sanity regression tests as I did not decide
how to solve these problems. I did not add documentation except
the new operators. It would be nice to mention the index and exclusion
constraints for inet datatypes somewhere. I did not know which page
would be more suitable.

Attachment

Re: GiST support for inet datatypes

From
David Fetter
Date:
On Tue, Dec 17, 2013 at 08:58:13PM +0200, Emre Hasegeli wrote:
> Hi,
> 
> Attached patch adds GiST support to the inet datatypes with two new
> operators. Overlaps operator can be used with exclusion constraints.
> Is adjacent to operator is just the negator of it. Index uses only
> the network bits of the addresses. Except for the new operators and
> is contained within, contains; basic comparison operators are also
> supported.

Please add this patch to the upcoming Commitfest at
https://commitfest.postgresql.org/action/commitfest_view?id=21

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2013-12-17 Emre Hasegeli <emre@hasegeli.com>:
> Query planner never chooses to use the index for the operators which
> the index is particularly useful because selectivity estimation functions
> are missing. I am planning to work on them.

Attached patch adds selectivity estimation functions for the overlap and
adjacent operators. Other operators need a bit more work. I want to send it
before to get some feedback.

Attachment

Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
Hi,

I will review your two patches (gist support + selectivity). This is 
part 1 of my review. I will look more into the actual GiST 
implementation in a couple of days, but thought I could provide you with 
my initial input right away.

inet-gist
---------

General:

I like the idea of the patch and think the && operator is useful for 
exclusion constraints, and that indexing the contains operator is useful 
for IP-address lookups. There is an extension, ip4r, which adds a GiST 
indexed type for IP ranges but since we have the inet type in core I 
think it should have GiST indexes.

I am not convinced an adjacent operator is useful for the inet type, but 
if it is included it should be indexed just like -|- of ranges. We 
should try to keep these lists of indexed operators the same.

Compilation:

Compiled without errors.

Regression tests:

One of the broken regression tests seems unrelated to the selectivity 
functions.

-- Make a list of all the distinct operator names being used in particular
-- strategy slots.

I think it would be fine just to add the newly indexed operators here, 
but the more indexed operators we get in the core the less useful this 
test becomes.

Source code:

The is adjacent to operator, -|-, does not seem to be doing the right 
thing. In the code it is implemented as the inverse of && which is not 
true. The below comparison should return false.

SELECT '10.0.0.1'::inet -|- '127.0.0.1'::inet; ?column?
---------- t
(1 row)

I am a bit suspicious about your memcmp based optimization in 
bitncommon, but it could be good. Have you benchmarked it compared to 
doing the same thing with a loop?

Documentation:

Please use examples in the operator table which will evaluate to true, 
and for the && case an example where not both sides are the same.

I have not found a place either in the documentation where it is 
documented which operators are documented. I would suggest just adding a 
short note after the operators table.

inet-selfuncs
-------------

Compilation:

There were some new warnings caused by this patch (with gcc 4.8.2). The 
warnings were all about the use of uninitialized variables (right, 
right_min_bits, right_order) in inet_hist_overlap_selectivity. Looking 
at the code I see that they are harmless but you should still rewrite 
the loop to silence the warnings.

Regression tests:

There are two broken

-- Insist that all built-in pg_proc entries have descriptions

Here you should just add descriptions for the new functions in pg_proc.h.

-- Check that all opclass search operators have selectivity estimators.

Fails due to missing selectivity functions for the operators. I think 
you should at least for now do like the range types and use areajoinsel, 
contjoinsel, etc for the join selectivity estimation.

Source code:

The selectivity estimation functions, network_overlap_selectivity and 
network_adjacent_selectivity, should follow the naming conventions of 
the other selectivity estimation functions. They are named things like 
tsmatchsel, tsmatchjoinsel, and rangesel.

-- 
Andreas Karlsson



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-01-19 Andreas Karlsson <andreas@proxel.se>:
> Hi,
>
> I will review your two patches (gist support + selectivity). This is part 1
> of my review. I will look more into the actual GiST implementation in a
> couple of days, but thought I could provide you with my initial input right
> away.

Thank you for looking at it.

>
> inet-gist
> ---------
>
> General:
>
> I like the idea of the patch and think the && operator is useful for
> exclusion constraints, and that indexing the contains operator is useful for
> IP-address lookups. There is an extension, ip4r, which adds a GiST indexed
> type for IP ranges but since we have the inet type in core I think it should
> have GiST indexes.
>
> I am not convinced an adjacent operator is useful for the inet type, but if
> it is included it should be indexed just like -|- of ranges. We should try
> to keep these lists of indexed operators the same.

I added it just not to leave negotor field empty. It can also be useful with
exclusion constraints but not with GiST support. I did not add GiST support
for it and the not equals operator because they do not fit the index
structure. I can just remove the operator for now.

>
> Compilation:
>
> Compiled without errors.
>
> Regression tests:
>
> One of the broken regression tests seems unrelated to the selectivity
> functions.
>
> -- Make a list of all the distinct operator names being used in particular
> -- strategy slots.
>
> I think it would be fine just to add the newly indexed operators here, but
> the more indexed operators we get in the core the less useful this test
> becomes.

I did not add the new operators to the list because I do not feel right
about supporting <<, <<=, >>, >>= symbols for these operators.
They should be <@, <@=, @>, @>= to be consistent with other types.

>
> I am a bit suspicious about your memcmp based optimization in bitncommon,
> but it could be good. Have you benchmarked it compared to doing the same
> thing with a loop?

I did, when I was writing that part. I will be happy to do it again. I will
post the results.

>
> Documentation:
>
> Please use examples in the operator table which will evaluate to true, and
> for the && case an example where not both sides are the same.

I will change in the next version.

>
> I have not found a place either in the documentation where it is documented
> which operators are documented. I would suggest just adding a short note
> after the operators table.

I will add in the next version.

>
> inet-selfuncs
> -------------
>
> Compilation:
>
> There were some new warnings caused by this patch (with gcc 4.8.2). The
> warnings were all about the use of uninitialized variables (right,
> right_min_bits, right_order) in inet_hist_overlap_selectivity. Looking at
> the code I see that they are harmless but you should still rewrite the loop
> to silence the warnings.

I will fix in the next version.

>
> Regression tests:
>
> There are two broken
>
> -- Insist that all built-in pg_proc entries have descriptions
>
> Here you should just add descriptions for the new functions in pg_proc.h.

I will add in the next version.

>
> -- Check that all opclass search operators have selectivity estimators.
>
> Fails due to missing selectivity functions for the operators. I think you
> should at least for now do like the range types and use areajoinsel,
> contjoinsel, etc for the join selectivity estimation.

I did not support them in the first version because I could not decide
the way. I have better options than using the the geo_selfuncs for these
operators. The options are from simple to complex:

0) just use network_overlap_selectivity
1) fix network_overlap_selectivity with a constant between 0 and 1
2) calculate this constant by using masklens of all buckets of the histogram
3) calculate this constant by using masklens of matching buckets of  the histogram
4) store STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for this  types, calculate this constant with it
5) create another kind of histogram for masklens, calculate this constant  with it

I do not know how to do 4 or 5, so I will try 3 for the next version. Do you
think it is a good idea?

>
> Source code:
>
> The selectivity estimation functions, network_overlap_selectivity and
> network_adjacent_selectivity, should follow the naming conventions of the
> other selectivity estimation functions. They are named things like
> tsmatchsel, tsmatchjoinsel, and rangesel.

I will rename it to inetoverlapsel, then.



Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
On 01/19/2014 11:10 AM, Emre Hasegeli wrote:
>> I am not convinced an adjacent operator is useful for the inet type, but if
>> it is included it should be indexed just like -|- of ranges. We should try
>> to keep these lists of indexed operators the same.
>
> I added it just not to leave negotor field empty. It can also be useful with
> exclusion constraints but not with GiST support. I did not add GiST support
> for it and the not equals operator because they do not fit the index
> structure. I can just remove the operator for now.

Sounds good. None of the other && implementations have a negator.

>> I think it would be fine just to add the newly indexed operators here, but
>> the more indexed operators we get in the core the less useful this test
>> becomes.
>
> I did not add the new operators to the list because I do not feel right
> about supporting <<, <<=, >>, >>= symbols for these operators.
> They should be <@, <@=, @>, @>= to be consistent with other types.

I agree, but I am not sure if it is ok to break backward compatibility here.

>> -- Check that all opclass search operators have selectivity estimators.
>>
>> Fails due to missing selectivity functions for the operators. I think you
>> should at least for now do like the range types and use areajoinsel,
>> contjoinsel, etc for the join selectivity estimation.
>
> I did not support them in the first version because I could not decide
> the way. I have better options than using the the geo_selfuncs for these
> operators. The options are from simple to complex:
>
> 0) just use network_overlap_selectivity
> 1) fix network_overlap_selectivity with a constant between 0 and 1
> 2) calculate this constant by using masklens of all buckets of the histogram
> 3) calculate this constant by using masklens of matching buckets of
>     the histogram
> 4) store STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for this
>     types, calculate this constant with it
> 5) create another kind of histogram for masklens, calculate this constant
>     with it
>
> I do not know how to do 4 or 5, so I will try 3 for the next version. Do you
> think it is a good idea?

Solutions 0 and 1 does not really sound better than using geo_selfuncs.

-- 
Andreas Karlsson



Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
Here comes part 2 of 2 of the review.

inet-gist
---------

In general the code looks good and I think your approach makes sense. 
Not an expert on GiST though so I would like a second opinion on this. 
Maybe there is some clever trick which would make the index more 
efficient, but the design I see is simple and logical. Like this much 
more than the incorrect GiST index for inet in btree_gist.

The only thing is that I think your index would do poorly on the case 
where all values share a prefix before the netmask but have different 
values after the netmask, since gist_union and gist_penalty does not 
care about the bits after the netmask. Am I correct? Have you thought 
anything about this?

To create such data:

CREATE TABLE a (ip inet);
INSERT INTO a SELECT '127.0.0.0/16'::inet + s FROM generate_series(1, 
pow(2, 16)::int) s;
CREATE INDEX aidx ON a USING gist (ip);

Saw also some minor things too.

Typo in comment, weird sentence:
 * The main question to calculate the union is that they have how many * bits in common. [...]

I do not like how you called the result key i inet_union_gist() "tmp". 
Something like "result" or "result_ip" would be better.

Typo in comment, "reverse" should be "inverse":
 * Penalty is reverse of the common IP bits of the two addresses.

Typo in comment, missing "be":
 * Values bigger than 1 are used when the common IP bits cannot * calculated.

Language can be improved in this comment. Both ways to split are by the 
union of the keys, it is more of a split by address prefix.
 * The second and the important way is to split by the union of the keys. * Union of the keys calculated same way with
theinet_gist_union function. * The first and the last biggest subnets created from the calculated * union. In this case
addressescontained by the first subnet will be put * to the left bucket, address contained by the last subnet will be
putto * the right bucket.
 

Could you not just use memcmp in inet_gist_same() instead of bitncmp() 
since it only cares about equality?

There is an extra empty line at the end of network_gist.c

inet-selfuncs
-------------

Here I have to honestly admit I am not sure if I can tell if your 
selectivity function is correct. Your explanation sounds convincing and 
the general idea of looking at the histogram is correct. I am just have 
no idea if the details are right.

DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamed 
DEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.

Typo in comment, "constrant" -> "constant".

There is an extra empty line at the end of network_selfuncs.c

-- 
Andreas Karlsson



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014/1/20 Andreas Karlsson <andreas@proxel.se>:
> Here comes part 2 of 2 of the review.

Second versions of the patches attached. They address both of your reviews.

>
> inet-gist
> ---------
>
> In general the code looks good and I think your approach makes sense. Not an
> expert on GiST though so I would like a second opinion on this. Maybe there
> is some clever trick which would make the index more efficient, but the
> design I see is simple and logical. Like this much more than the incorrect
> GiST index for inet in btree_gist.

I realized that create extension btree_gist fails with the patch.

ERROR:  could not make operator class "gist_inet_ops" be default for type inet
DETAIL:  Operator class "inet_ops" already is the default.

I think making the new operator class default makes more sense. Name conflict
can be a bigger problem. Should I rename the new operator class?

>
> The only thing is that I think your index would do poorly on the case where
> all values share a prefix before the netmask but have different values after
> the netmask, since gist_union and gist_penalty does not care about the bits
> after the netmask. Am I correct? Have you thought anything about this?

Yes, I have tried to construct the index with ip_maxbits() instead of ip_bits()
but I could not make it work with the cidr type. It can be done by adding
operator as a parameter for union, penalty and picksplit functions on the
GiST framework. I am not sure it worths the trouble. It would only help
the basic comparison operators and it would slow down other operators because
network length comparison before network address would not be possible for
the intermediate nodes. I mentioned this behavior of the operator class on
the paragraph added to the documentation.

> Saw also some minor things too.
>
> Typo in comment, weird sentence:
>
>  * The main question to calculate the union is that they have how many
>  * bits in common. [...]

I rewrite it. I am sorry about my English.

>
> I do not like how you called the result key i inet_union_gist() "tmp".
> Something like "result" or "result_ip" would be better.
>
> Typo in comment, "reverse" should be "inverse":
>
>  * Penalty is reverse of the common IP bits of the two addresses.
>
> Typo in comment, missing "be":
>
>  * Values bigger than 1 are used when the common IP bits cannot
>  * calculated.
>
> Language can be improved in this comment. Both ways to split are by the
> union of the keys, it is more of a split by address prefix.
>
>  * The second and the important way is to split by the union of the keys.
>  * Union of the keys calculated same way with the inet_gist_union function.
>  * The first and the last biggest subnets created from the calculated
>  * union. In this case addresses contained by the first subnet will be put
>  * to the left bucket, address contained by the last subnet will be put to
>  * the right bucket.
>
> Could you not just use memcmp in inet_gist_same() instead of bitncmp() since
> it only cares about equality?
>
> There is an extra empty line at the end of network_gist.c

I changed them.

>
> inet-selfuncs
> -------------
>
> Here I have to honestly admit I am not sure if I can tell if your
> selectivity function is correct. Your explanation sounds convincing and the
> general idea of looking at the histogram is correct. I am just have no idea
> if the details are right.

I tried to improve it in this version. I hope it is more readable now. I used
the word inclusion instead of overlap, made some changes on the algorithm
to make it more suitable to the other operators.

Using the histogram for inclusion which is originally for basic comparison,
is definitely not correct. It is an empirical approach. I have tried several
versions of it, tested them with different data sets and thought it is better
than nothing.

>
> DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamed
> DEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.
>
> Typo in comment, "constrant" -> "constant".
>
> There is an extra empty line at the end of network_selfuncs.c

I changed them.

Attachment

Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
On 01/23/2014 11:22 PM, Emre Hasegeli wrote:
 > inet-gist
 > ---------
 > I realized that create extension btree_gist fails with the patch.
>
> ERROR:  could not make operator class "gist_inet_ops" be default for type inet
> DETAIL:  Operator class "inet_ops" already is the default.
>
> I think making the new operator class default makes more sense. Name conflict
> can be a bigger problem. Should I rename the new operator class?

I agree that the new operator class should be the default one, it is
more useful and also not buggy. No idea about the naming.

>> The only thing is that I think your index would do poorly on the case where
>> all values share a prefix before the netmask but have different values after
>> the netmask, since gist_union and gist_penalty does not care about the bits
>> after the netmask. Am I correct? Have you thought anything about this?
>
> Yes, I have tried to construct the index with ip_maxbits() instead of ip_bits()
> but I could not make it work with the cidr type. It can be done by adding
> operator as a parameter for union, penalty and picksplit functions on the
> GiST framework. I am not sure it worths the trouble. It would only help
> the basic comparison operators and it would slow down other operators because
> network length comparison before network address would not be possible for
> the intermediate nodes. I mentioned this behavior of the operator class on
> the paragraph added to the documentation.

I think this is fine since it does not seem like a very useful case to
support to me. Would be worth doing only if it is easy to do.

A separate concern: we still have not come to a decision about operators
for inet here. I do not like the fact that the operators differ between
ranges and inet. But I feel this might be out of scope for this patch.

Does any third party want to chime in with an opinion?

Current inet/cidr
<<     is contained within
<<=     is contained within or equals
 >>     contains
 >>=     contains or equals

Range types
@>     contains range
<@     range is contained by
&&     overlap (have points in common)
<<     strictly left of
 >>     strictly right of

>> inet-selfuncs
>> -------------
>>
>> Here I have to honestly admit I am not sure if I can tell if your
>> selectivity function is correct. Your explanation sounds convincing and the
>> general idea of looking at the histogram is correct. I am just have no idea
>> if the details are right.
>
> I tried to improve it in this version. I hope it is more readable now. I used
> the word inclusion instead of overlap, made some changes on the algorithm
> to make it more suitable to the other operators.
>
> Using the histogram for inclusion which is originally for basic comparison,
> is definitely not correct. It is an empirical approach. I have tried several
> versions of it, tested them with different data sets and thought it is better
> than nothing.

Thanks for the updates. The code in this version is cleaner and easier
to follow.

I am not convinced of your approach to calculating the selectivity from
the histogram. The thing I have the problem with is the clever trickery
involved with how you handle different operator types. I prefer the
clearer code of the range types with how calc_hist_selectivity_scalar is
used. Is there any reason for why that approach would not work here or
result in worse code?

I have attached a patch where I improved the language of your comment
describing the workings of the selectivity estimation. Could you please
check it so I did not change the meaning of any of it?

I see from the tests that you still are missing selectivity functions
for operators, what is your plan for this?

--
Andreas Karlsson

Attachment

Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-01-19 12:10, Emre Hasegeli <emre@hasegeli.com>:
> 2014-01-19 Andreas Karlsson <andreas@proxel.se>:
>
>> I am a bit suspicious about your memcmp based optimization in bitncommon,
>> but it could be good. Have you benchmarked it compared to doing the same
>> thing with a loop?
>
> I did, when I was writing that part. I will be happy to do it again. I will
> post the results.

I was testing it by creating GiST indexes. I realized that these test are
inconsistent when BUFFERING = AUTO. I repeated them with BUFFERING = ON.
The function without memcmp was faster in this case. I will change
the function in the next version of the patch.

The test case:

Create table Network as   select (a || '.' || b || '.' || c || '/24')::cidr       from generate_series(0, 255) as a,
          generate_series(0, 255) as b,               generate_series(0, 255) as c;
 

Drop index if exists N;
Create index N on Network using gist(cidr) with (buffering = on);

Create table Network6 as   select ('::' || to_hex(a) || ':' || to_hex(b))::inet       from generate_series(0, 255) as
a,              generate_series(0, 65535) as b;
 

Drop index if exists N6;
Create index N6 on Network6 using gist(inet) with (buffering = on);

What I could not understand is the tests with IP version 6 was much faster.
The row count is same, the index size is bigger.



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
Third versions of the patches attached. They are rebased to the HEAD. In
this versions, the bitncommon function is changed. <sys/socket.h> included
to network_gist.c to be able to compile it on FreeBSD. Geometric mean
calculation for partial bucket match on the function
inet_hist_inclusion_selectivity
reverted back. It was something I changed without enough testing on
the second revision of the patch. This version uses the maximum divider
calculated from the boundaries of the bucket, like the first version. It is
simpler and more reliable.

2014-01-31 3:58, Andreas Karlsson <andreas@proxel.se>:

> I agree that the new operator class should be the default one, it is more
> useful and also not buggy. No idea about the naming.

I have misread the name, rename was not necessary. I removed the DEFAULT
keywords for the inet and the cidr operator classes from btree_gist--1.0.sql
on the inet-gist patch.

> A separate concern: we still have not come to a decision about operators for
> inet here. I do not like the fact that the operators differ between ranges
> and inet. But I feel this might be out of scope for this patch.

I attached a separate optional patch to rename the inclusion operators. It
leaves the old names, changes the GiST supported operators and the operator
names in the documentation. It would be better to apply it with the GiST
support, in my opinion.

> I am not convinced of your approach to calculating the selectivity from the
> histogram. The thing I have the problem with is the clever trickery involved
> with how you handle different operator types. I prefer the clearer code of
> the range types with how calc_hist_selectivity_scalar is used. Is there any
> reason for why that approach would not work here or result in worse code?

Currently we do not have histogram of the lower and upper bounds as
the range types. Current histogram can be used nicely as the lower bound,
but not the upper bound because the comparison is first on the common bits
of the network part, then on the length of the network part. For example,
10.0/16 is defined as greater than 10/8.

Using the histogram as the lower bounds of the networks is not enough to
calculate selectivity for any of these operators. Using it also as the upper
bounds is still not enough for the inclusion operators. The lengths of
the network parts should taken into consideration in a way and it is
what this patch does. Using separate histograms for the lower bounds,
the upper bounds and the lengths of the network parts can solve all of these
problems, but it is a lot of work.

> I have attached a patch where I improved the language of your comment
> describing the workings of the selectivity estimation. Could you please
> check it so I did not change the meaning of any of it?

Thank you. I only added the "for" to the sentence "This ratio can be big
enough to not disregard for addresses with small masklens." I merged
the changes to this version of the patch.

> I see from the tests that you still are missing selectivity functions for
> operators, what is your plan for this?

This was because the join selectivity estimation functions. I set
the geo_selfuncs for the missing ones. All tests pass with them. I want
to develop the join selectivity function too, but not for this commit fest.

Attachment

Re: GiST support for inet datatypes

From
Robert Haas
Date:
On Thu, Feb 6, 2014 at 12:14 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
> I have misread the name, rename was not necessary. I removed the DEFAULT
> keywords for the inet and the cidr operator classes from btree_gist--1.0.sql
> on the inet-gist patch.

Generally, modifying already-release .sql files for extensions is a no-no...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-07 22:41, Robert Haas <robertmhaas@gmail.com>:

> Generally, modifying already-release .sql files for extensions is a no-no...

I prepared separate patches for btree_gist extension with more options.
First one (btree-gist-drop-default-inet-v1.patch) removes DEFAULT keyword
only from the inet and the cidr operator classes. Second one
(btree-gist-drop-default-all-v1.patch) removes DEFAULT keyword for all
operator classes. I think it is more consistent to remove it from all.
Third one (btree-gist-drop-inet-v1.patch) removes the inet and the cidr
operator classes altogether. It is suggested by Tom Lane [1] on bug #5705.
The new GiST operator class includes basic comparison operators except !=
so it may be the right time to remove support from btree_gist. Fourth one
(btree-gist-drop-inet-and-default-v1.patch) is the second one and the third
one together.

[1] http://www.postgresql.org/message-id/10183.1287526949@sss.pgh.pa.us

Attachment

Re: GiST support for inet datatypes

From
Andres Freund
Date:
On 2014-02-17 14:40:07 +0200, Emre Hasegeli wrote:
> 2014-02-07 22:41, Robert Haas <robertmhaas@gmail.com>:
> 
> > Generally, modifying already-release .sql files for extensions is a no-no...
> 
> I prepared separate patches for btree_gist extension with more options.
> First one (btree-gist-drop-default-inet-v1.patch) removes DEFAULT keyword
> only from the inet and the cidr operator classes. Second one
> (btree-gist-drop-default-all-v1.patch) removes DEFAULT keyword for all
> operator classes. I think it is more consistent to remove it from all.
> Third one (btree-gist-drop-inet-v1.patch) removes the inet and the cidr
> operator classes altogether. It is suggested by Tom Lane [1] on bug #5705.
> The new GiST operator class includes basic comparison operators except !=
> so it may be the right time to remove support from btree_gist. Fourth one
> (btree-gist-drop-inet-and-default-v1.patch) is the second one and the third
> one together.
> 
> [1] http://www.postgresql.org/message-id/10183.1287526949@sss.pgh.pa.us

> diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
> index ba4af14..d5b1fd7 100644
> --- a/contrib/btree_gist/Makefile
> +++ b/contrib/btree_gist/Makefile
> @@ -9,7 +9,7 @@ OBJS =  btree_gist.o btree_utils_num.o btree_utils_var.o btree_int2.o \
>          btree_numeric.o
>  
>  EXTENSION = btree_gist
> -DATA = btree_gist--1.0.sql btree_gist--unpackaged--1.0.sql
> +DATA = btree_gist--1.1.sql btree_gist--unpackaged--1.0.sql

You need to add a file for going from 1.0 to 1.1.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-17 14:54, Andres Freund <andres@2ndquadrant.com>:
> You need to add a file for going from 1.0 to 1.1.

Thank you for the notice. I added them to the patches which touch only two
of the operator classes. It drops and re-creates operator classes as there
is not ALTER OPERATOR CLASS DROP DEFAULT command. I do not apply it to
the patch to remove the DEFAULT keyword from all of the operator classes
because it would nearly be same as dropping and re-creating the extension.

Attachment

Re: GiST support for inet datatypes

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> 2014-02-17 14:54, Andres Freund <andres@2ndquadrant.com>:
>> You need to add a file for going from 1.0 to 1.1.

> Thank you for the notice. I added them to the patches which touch only two
> of the operator classes. It drops and re-creates operator classes as there
> is not ALTER OPERATOR CLASS DROP DEFAULT command.

Dropping an operator class is quite unacceptable, as it will cause indexes
based on that class to go away (or more likely, cause the upgrade to fail,
if you didn't use CASCADE).  What we've done in the past for changes that
are nominally unsupported is to make the upgrade scripts tweak the system
catalogs directly.

More generally, it doesn't look to me like these upgrade scripts are
complete; shouldn't they be creating some new objects, not just replacing
old ones?

We need to have a discussion as to whether it's actually sane for an
upgrade to remove the DEFAULT marking on a pre-existing opclass.  It
strikes me that this would for instance break pg_dump dumps, in the sense
that the reloaded index would probably now have a different opclass
than before (since pg_dump would see no need to have put an explicit
opclass name into CREATE INDEX if it was the default in the old database).
Even if the new improved opclass is in all ways better, that would be
catastrophic for pg_upgrade I suspect.  Unless the new opclass is
on-disk-compatible with the old; in which case we shouldn't be creating
a new opclass at all, but just modifying the definition of the old one.

In short we probably need to think a bit harder about what this patch is
proposing to do.  It seems fairly likely to me that some other approach
would be a better idea.
        regards, tom lane



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-17 22:16, Tom Lane <tgl@sss.pgh.pa.us>:

> More generally, it doesn't look to me like these upgrade scripts are
> complete; shouldn't they be creating some new objects, not just replacing
> old ones?

The actual patches are on the previous mail [1]. I was just trying
to solve the problem that btree_gist cannot be loaded because of
the new operator class.

> In short we probably need to think a bit harder about what this patch is
> proposing to do.  It seems fairly likely to me that some other approach
> would be a better idea.

How about only removing the inet and the cidr operator classes
from btree_gist. btree-gist-drop-inet-v2.patch does that.

[1] http://www.postgresql.org/message-id/CAE2gYzxc0FXEwe59sFdUZNQ24c+fRBDMGXwwvbYvmeaNateidw@mail.gmail.com



Re: GiST support for inet datatypes

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> How about only removing the inet and the cidr operator classes
> from btree_gist. btree-gist-drop-inet-v2.patch does that.

I'm not sure which part of "no" you didn't understand, but to be
clear: you don't get to break existing installations.

Assuming that this opclass is sufficiently better than the existing one,
it would sure be nice if it could become the default; but I've not seen
any proposal in this thread that would allow that without serious upgrade
problems.  I think the realistic alternatives so far are (1) new opclass
is not the default, or (2) this patch gets rejected.

We should probably expend some thought on a general approach to
replacing the default opclass for a datatype, because I'm sure this
will come up again.  Right now I don't see a feasible way.
        regards, tom lane



Re: GiST support for inet datatypes

From
Robert Haas
Date:
On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Emre Hasegeli <emre@hasegeli.com> writes:
>> How about only removing the inet and the cidr operator classes
>> from btree_gist. btree-gist-drop-inet-v2.patch does that.
>
> I'm not sure which part of "no" you didn't understand, but to be
> clear: you don't get to break existing installations.
>
> Assuming that this opclass is sufficiently better than the existing one,
> it would sure be nice if it could become the default; but I've not seen
> any proposal in this thread that would allow that without serious upgrade
> problems.  I think the realistic alternatives so far are (1) new opclass
> is not the default, or (2) this patch gets rejected.
>
> We should probably expend some thought on a general approach to
> replacing the default opclass for a datatype, because I'm sure this
> will come up again.  Right now I don't see a feasible way.

It seems odd for "default" to be a property of the opclass rather than
a separate knob.  What comes to mind is something like:

ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;

Not sure how much that helps.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GiST support for inet datatypes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We should probably expend some thought on a general approach to
>> replacing the default opclass for a datatype, because I'm sure this
>> will come up again.  Right now I don't see a feasible way.

> It seems odd for "default" to be a property of the opclass rather than
> a separate knob.  What comes to mind is something like:
> ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;
> Not sure how much that helps.

Not at all, AFAICS.  If it were okay to decide that some formerly-default
opclass is no longer default, then having such a command would be better
than manually manipulating pg_opclass.opcdefault --- but extension upgrade
scripts could certainly do the latter if they had to.  The problem here
is whether it's upgrade-safe to make such a change at all; having
syntactic sugar doesn't make it safer.
        regards, tom lane



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-19 16:52, Tom Lane <tgl@sss.pgh.pa.us>:
> Not at all, AFAICS.  If it were okay to decide that some formerly-default
> opclass is no longer default, then having such a command would be better
> than manually manipulating pg_opclass.opcdefault --- but extension upgrade
> scripts could certainly do the latter if they had to.  The problem here
> is whether it's upgrade-safe to make such a change at all; having
> syntactic sugar doesn't make it safer.

It is not hard to support != operator on the new operator class. That would
make it functionally compatible with the ones on btree_gist. That way,
changing the default would not break pg_dump dumps, it would only upgrade
the index to the new one.

pg_upgrade dumps current objects in the extension. It fails to restore them,
if the new operator class exists as the default. I do not really understand
how pg_upgrade handle extension upgrades. I do not have a solution to this.

It would be sad if we cannot make the new operator class default because of
the btree_gist implementation which is not useful for inet data types. You
wrote on 2010-10-11 about it [1]:

> Well, actually the btree_gist implementation for inet is a completely
> broken piece of junk: it thinks that convert_network_to_scalar is 100%
> trustworthy and can be used as a substitute for the real comparison
> functions, which isn't even approximately true.  I'm not sure why
> Teodor implemented it like that instead of using the type's real
> comparison functions, but it's pretty much useless if you want the
> same sort order that the type itself defines.

[1] http://www.postgresql.org/message-id/8973.1286841006@sss.pgh.pa.us



Re: GiST support for inet datatypes

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> [ cites bug #5705 ]

Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
opclasses are.  There was discussion at the time of just ripping them
out despite the compatibility hit.  We didn't do it, but if we had
then life would be simpler now.

Perhaps it would be acceptable to drop the btree_gist implementation
and teach pg_upgrade to refuse to upgrade if the old database contains
any such indexes.  I'm not sure that solves the problem, though, because
I think pg_upgrade will still fail if the opclass is in the old database
even though unused (because you'll get the complaint about multiple
default opclasses anyway).  The only simple way to avoid that is to not
have btree_gist loaded at all in the old DB, and I doubt that's an
acceptable upgrade restriction.  It would require dropping indexes of
the other types supported by btree_gist, which would be a pretty hard
sell since those aren't broken.

Really what we'd need here is for btree_gist to be upgraded to a version
that doesn't define gist_inet_ops (or at least doesn't mark it as default)
*before* pg_upgrading to a server version containing the proposed new
implementation.  Not sure how workable that is.  Could we get away with
having pg_upgrade unset the default flag in the old database?
(Ick, but there are no very good alternatives here ...)

BTW, I'd not been paying any attention to this thread, but now that
I scan through it, I think the proposal to change the longstanding
names of the inet operators has zero chance of being accepted.
Consistency with the names chosen for range operators is a mighty
weak argument compared to backwards compatibility.
        regards, tom lane



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-20 01:37, Tom Lane <tgl@sss.pgh.pa.us>:

> Perhaps it would be acceptable to drop the btree_gist implementation
> and teach pg_upgrade to refuse to upgrade if the old database contains
> any such indexes.  I'm not sure that solves the problem, though, because
> I think pg_upgrade will still fail if the opclass is in the old database
> even though unused (because you'll get the complaint about multiple
> default opclasses anyway).  The only simple way to avoid that is to not
> have btree_gist loaded at all in the old DB, and I doubt that's an
> acceptable upgrade restriction.  It would require dropping indexes of
> the other types supported by btree_gist, which would be a pretty hard
> sell since those aren't broken.
>
> Really what we'd need here is for btree_gist to be upgraded to a version
> that doesn't define gist_inet_ops (or at least doesn't mark it as default)
> *before* pg_upgrading to a server version containing the proposed new
> implementation.  Not sure how workable that is.  Could we get away with
> having pg_upgrade unset the default flag in the old database?
> (Ick, but there are no very good alternatives here ...)

Upgrading btree_gist on the old installation would be almost impossible
for the majority of the users who use package managers, in my opinion.
I cannot think of a better solution than your suggestion. I can try to
prepare a patch to execute the following query on pg_upgrade before
dumping the old database, if that is the final decision.

UPDATE pg_opclass SET opcdefault = false
WHERE opcname IN ('gist_inet_ops', 'gist_cidr_ops');

> BTW, I'd not been paying any attention to this thread, but now that
> I scan through it, I think the proposal to change the longstanding
> names of the inet operators has zero chance of being accepted.
> Consistency with the names chosen for range operators is a mighty
> weak argument compared to backwards compatibility.

That is why I prepared it as a separate patch on top of the others. It is
not only consistency with the range types: <@ and @> symbols used for
containment everywhere except the inet data types, particularly on
the geometric types, arrays; cube, hstore, intaray, ltree extensions.
The patch does not just change the operator names, it leaves the old ones,
adds new operators with GiST support and changes the documentation, like
your commit ba920e1c9182eac55d5f1327ab0d29b721154277 back in 2006. I could
not find why did you leave the inet operators unchanged on that commit,
in the mailing list archives [1]. GiST support will be a promotion for
users to switch to the new operators, if we make this change with it.

This change will also indirectly deprecate the undocumented non-transparent
btree index support that works sometimes for some of the subnet inclusion
operators [2].

[1] http://www.postgresql.org/message-id/14277.1157304939@sss.pgh.pa.us

[2] http://www.postgresql.org/message-id/389.1042992616@sss.pgh.pa.us



Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
On 02/06/2014 06:14 PM, Emre Hasegeli wrote:
> Third versions of the patches attached. They are rebased to the HEAD. In
> this versions, the bitncommon function is changed. <sys/socket.h> included
> to network_gist.c to be able to compile it on FreeBSD. Geometric mean
> calculation for partial bucket match on the function
> inet_hist_inclusion_selectivity
> reverted back. It was something I changed without enough testing on
> the second revision of the patch. This version uses the maximum divider
> calculated from the boundaries of the bucket, like the first version. It is
> simpler and more reliable.

Thanks for the updated patch.

About the discussions about upgrading PostgreSQL, extensions and 
defaults I do not have any strong opinion. I think that this patch is 
useful even if it does not end up the default, but it would be a pity 
since the BTree GiST index is broken.

Note: The patches do not apply anymore due to changes to 
src/backend/utils/adt/Makefile.

>> I am not convinced of your approach to calculating the selectivity from the
>> histogram. The thing I have the problem with is the clever trickery involved
>> with how you handle different operator types. I prefer the clearer code of
>> the range types with how calc_hist_selectivity_scalar is used. Is there any
>> reason for why that approach would not work here or result in worse code?
>
> Currently we do not have histogram of the lower and upper bounds as
> the range types. Current histogram can be used nicely as the lower bound,
> but not the upper bound because the comparison is first on the common bits
> of the network part, then on the length of the network part. For example,
> 10.0/16 is defined as greater than 10/8.
>
> Using the histogram as the lower bounds of the networks is not enough to
> calculate selectivity for any of these operators. Using it also as the upper
> bounds is still not enough for the inclusion operators. The lengths of
> the network parts should taken into consideration in a way and it is
> what this patch does. Using separate histograms for the lower bounds,
> the upper bounds and the lengths of the network parts can solve all of these
> problems, but it is a lot of work.

I see, thanks for the explanation. But I am still not very fond of how 
that code is written since I find it hard to verify the correctness of 
it, but have no better suggestions.

>> I see from the tests that you still are missing selectivity functions for
>> operators, what is your plan for this?
>
> This was because the join selectivity estimation functions. I set
> the geo_selfuncs for the missing ones. All tests pass with them. I want
> to develop the join selectivity function too, but not for this commit fest.

All tests pass now. Excellent!

Do you think the new index is useful even if you use the basic 
geo_selfuncs? Or should we wait with committing the patches until all 
selfuncs are implemented?

-- 
Andreas Karlsson



Re: GiST support for inet datatypes

From
Bruce Momjian
Date:
On Wed, Feb 19, 2014 at 06:37:42PM -0500, Tom Lane wrote:
> Emre Hasegeli <emre@hasegeli.com> writes:
> > [ cites bug #5705 ]
> 
> Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
> opclasses are.  There was discussion at the time of just ripping them
> out despite the compatibility hit.  We didn't do it, but if we had
> then life would be simpler now.
> 
> Perhaps it would be acceptable to drop the btree_gist implementation
> and teach pg_upgrade to refuse to upgrade if the old database contains
> any such indexes.  I'm not sure that solves the problem, though, because
> I think pg_upgrade will still fail if the opclass is in the old database
> even though unused (because you'll get the complaint about multiple
> default opclasses anyway).  The only simple way to avoid that is to not
> have btree_gist loaded at all in the old DB, and I doubt that's an
> acceptable upgrade restriction.  It would require dropping indexes of
> the other types supported by btree_gist, which would be a pretty hard
> sell since those aren't broken.
> 
> Really what we'd need here is for btree_gist to be upgraded to a version
> that doesn't define gist_inet_ops (or at least doesn't mark it as default)
> *before* pg_upgrading to a server version containing the proposed new
> implementation.  Not sure how workable that is.  Could we get away with
> having pg_upgrade unset the default flag in the old database?
> (Ick, but there are no very good alternatives here ...)

pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
that now.  Can we make the changes in the new cluster, or in pg_dump
when in binary upgrade mode?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-24 02:44, Andreas Karlsson <andreas@proxel.se>:

> Note: The patches do not apply anymore due to changes to
> src/backend/utils/adt/Makefile.

I will fix it on the next version.

> I see, thanks for the explanation. But I am still not very fond of how that
> code is written since I find it hard to verify the correctness of it, but
> have no better suggestions.

We can split the selectivity estimation patch from the GiST support, add
it to the next commit fest for more review. In the mean time, I can
improve it with join selectivity estimation. Testing with different datasets
would help the most to reveal how true the algorithm is.

> Do you think the new index is useful even if you use the basic geo_selfuncs?
> Or should we wait with committing the patches until all selfuncs are
> implemented?

My motivation was to be able to use the cidr datatype with exclusion
constraints. I think the index would also be useful without proper
selectivity estimation functions. Range types also use geo_selfuncs for
join selectivity estimation.



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-24 17:55, Bruce Momjian <bruce@momjian.us>:

> pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
> that now.  Can we make the changes in the new cluster, or in pg_dump
> when in binary upgrade mode?

It can be possible to update the new operator class in the new cluster
as not default, before restore. After the restore, pg_upgrade can upgrade
the btree_gist extension and reset the operator class as the default.
pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
you think it is a better solution? I think it will be more complicated
to do in pg_dump when in binary upgrade mode.



Re: GiST support for inet datatypes

From
Florian Pflug
Date:
On Feb27, 2014, at 11:39 , Emre Hasegeli <emre@hasegeli.com> wrote:
> 2014-02-24 17:55, Bruce Momjian <bruce@momjian.us>:
> 
>> pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
>> that now.  Can we make the changes in the new cluster, or in pg_dump
>> when in binary upgrade mode?
> 
> It can be possible to update the new operator class in the new cluster
> as not default, before restore. After the restore, pg_upgrade can upgrade
> the btree_gist extension and reset the operator class as the default.
> pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
> you think it is a better solution? I think it will be more complicated
> to do in pg_dump when in binary upgrade mode.

Maybe I'm missing something, but isn't the gist of the problem here that
pg_dump won't explicitly state the operator class if it's the default?

If so, can't we just make pg_dump always spell out the operator class
explicitly? Then changing the default will never change the meaning of
database dumps, so upgraded clusters will simply continue to use the
old (now non-default) opclass, no?

best regards,
Florian Pflug




Re: GiST support for inet datatypes

From
Tom Lane
Date:
Florian Pflug <fgp@phlo.org> writes:
> Maybe I'm missing something, but isn't the gist of the problem here that
> pg_dump won't explicitly state the operator class if it's the default?

That's not a bug, it's a feature, for much the same reasons that pg_dump
tries to minimize explicit schema-qualification.

At least, it's a feature for ordinary dumps.  I'm not sure whether it
might be a good idea for binary_upgrade dumps.  That would depend on
our policy about whether a new opclass has to have a new (and necessarily
weird) name.  If we try to make the new opclass still have the nicest
name then it won't help at all for pg_dump to dump the old name.
        regards, tom lane



Re: GiST support for inet datatypes

From
Florian Pflug
Date:
On Feb27, 2014, at 17:56 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> Maybe I'm missing something, but isn't the gist of the problem here that
>> pg_dump won't explicitly state the operator class if it's the default?
> 
> That's not a bug, it's a feature, for much the same reasons that pg_dump
> tries to minimize explicit schema-qualification.

I fail to see the value in this for opclasses. It's certainly nice for
schema qualifications, because dumping one schema and restoring into a
different schema is probably quite common. But how often does anyone dump
a database and restore it into a database with a different default opclass
for some type?

Or is the idea just to keep the dump as readable as possible?

> At least, it's a feature for ordinary dumps.  I'm not sure whether it
> might be a good idea for binary_upgrade dumps.  That would depend on
> our policy about whether a new opclass has to have a new (and necessarily
> weird) name.  If we try to make the new opclass still have the nicest
> name then it won't help at all for pg_dump to dump the old name.

Well, given the choice between not ever being able to change the default
opclass of a type, and not being able to re-use an old opclass' name,
I'd pick the latter. Especially because for default opclasses, users don't
usually have to know the name anyway.

best regards,
Florian Pflug




Re: GiST support for inet datatypes

From
Tom Lane
Date:
Florian Pflug <fgp@phlo.org> writes:
> On Feb27, 2014, at 17:56 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That's not a bug, it's a feature, for much the same reasons that pg_dump
>> tries to minimize explicit schema-qualification.

> I fail to see the value in this for opclasses. It's certainly nice for
> schema qualifications, because dumping one schema and restoring into a
> different schema is probably quite common.

The value in it is roughly the same as the reason we don't include a
version number when dumping CREATE EXTENSION.  If you had a default
opclass in the source database, you probably want a default opclass
in the destination, even if that's not bitwise the same as what you
had before.  The implication is that you want best practice for the
current PG version.

Another reason for not doing it is that unique-constraint syntax doesn't
even support it.  Constraints *have to* use the default opclass.

> But how often does anyone dump
> a database and restore it into a database with a different default opclass
> for some type?

Indeed.  The root of the problem here is that we've never really thought
about changing a type's default opclass before.  I don't think that a
two-line change in pg_dump fixes all the issues this will bring up.

I remind you also that even if you had a 100% bulletproof argument for
changing the behavior now, it won't affect what's in existing dump files.
The only real wiggle room we have is to change the --binary-upgrade
behavior, since we can plausibly insist that people use a current pg_dump
while doing an upgrade.
        regards, tom lane



Re: GiST support for inet datatypes

From
Greg Stark
Date:
On Thu, Feb 27, 2014 at 5:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Indeed.  The root of the problem here is that we've never really thought
> about changing a type's default opclass before.  I don't think that a
> two-line change in pg_dump fixes all the issues this will bring up.

I think we did actually do this once. When you replaced rtree with
gist as the default opclass for the rtree method. IIRC you did it by
making "rtree" a synonym for "gist" and since the opclass wasn't
specified the default gist opclass kicked in automatically.

-- 
greg



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-27 18:15, Florian Pflug <fgp@phlo.org>:
>> It can be possible to update the new operator class in the new cluster
>> as not default, before restore. After the restore, pg_upgrade can upgrade
>> the btree_gist extension and reset the operator class as the default.
>> pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
>> you think it is a better solution? I think it will be more complicated
>> to do in pg_dump when in binary upgrade mode.
>
> Maybe I'm missing something, but isn't the gist of the problem here that
> pg_dump won't explicitly state the operator class if it's the default?

No, the problem is pg_upgrade. We can even benefit from this behavior of
pg_dump, if we would like to remove the operator classes on btree_gist.
Because of that behavior; users, who would upgrade by dump and restore,
will upgrade to the better default operator class without noticing. I am
not sure not notifying is the best think to do, though.

The problem is that pg_dump --binary-upgrade dumps objects in the extension
on the old cluster, not just the CREATE EXTENSION statement. pg_upgrade
fails to restore them, if the new operator class already exists on the new
cluster as the default. It effects all users who use the extension, even
if they are not using the inet and cidr operator classes in it.



Re: GiST support for inet datatypes

From
Florian Pflug
Date:
On Feb28, 2014, at 15:45 , Emre Hasegeli <emre@hasegeli.com> wrote:
> The problem is that pg_dump --binary-upgrade dumps objects in the extension
> on the old cluster, not just the CREATE EXTENSION statement. pg_upgrade
> fails to restore them, if the new operator class already exists on the new
> cluster as the default. It effects all users who use the extension, even
> if they are not using the inet and cidr operator classes in it.

Hm, what if we put the new opclass into an extension of its own, say inet_gist,
instead of into core?

Couldn't we then remove the inet support from the latest version of btree_gist
(the one we'd ship with 9.4)? People who don't use the old inet opclass could
then simply upgrade the extension after running pg_upgrade to get rid of the
old, broken version. People who *do* use the old inet opclass would need to
drop their indices before doing that, then install the extension inet_gist,
and finally re-create their indices.

People who do nothing would continue to use the old inet opclass.

inet_gist's SQL script could check whether btree_gist has been upgrade, and
if not fail with an error like "btree_gist must be upgraded to at least version
x.y before inet_gist can be installed". That would avoid failing with a rather
cryptic error later.

best regards,
Florian Pflug




Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-02-28 17:30, Florian Pflug <fgp@phlo.org>:
> Hm, what if we put the new opclass into an extension of its own, say inet_gist,
> instead of into core?

It will work but I do not think it is better than adding it in core as
not default.



Re: GiST support for inet datatypes

From
Robert Haas
Date:
On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> On Feb27, 2014, at 17:56 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> That's not a bug, it's a feature, for much the same reasons that pg_dump
>>> tries to minimize explicit schema-qualification.
>
>> I fail to see the value in this for opclasses. It's certainly nice for
>> schema qualifications, because dumping one schema and restoring into a
>> different schema is probably quite common.
>
> The value in it is roughly the same as the reason we don't include a
> version number when dumping CREATE EXTENSION.  If you had a default
> opclass in the source database, you probably want a default opclass
> in the destination, even if that's not bitwise the same as what you
> had before.  The implication is that you want best practice for the
> current PG version.

I don't think that argument holds a lot of water in this instance.
The whole reason for having multiple opclasses that each one can
implement different comparison behavior.  It's unlikely that you want
an upgrade to change the comparison behavior you had before; you'd be
sad if, for example, the dump/restore process failed to preserve your
existing collation settings.

But even if that were desirable in general, suppressing it for binary
upgrade dumps certainly seems more than sane.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GiST support for inet datatypes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The value in it is roughly the same as the reason we don't include a
>> version number when dumping CREATE EXTENSION.  If you had a default
>> opclass in the source database, you probably want a default opclass
>> in the destination, even if that's not bitwise the same as what you
>> had before.  The implication is that you want best practice for the
>> current PG version.

> I don't think that argument holds a lot of water in this instance.
> The whole reason for having multiple opclasses that each one can
> implement different comparison behavior.

Well, I doubt we'd accept a proposal to change the default opclass
of a datatype to something that had incompatible behavior --- but
we might well accept one to change it to something that had improved
behavior, such as more operators.

The first couple dozen lines in GetIndexOpClass() make for interesting
reading in this context.  That approach to cross-version compatibility
probably doesn't work in the pg_upgrade universe, of course; but what I'd
like to point out here is that those kluges wouldn't have been necessary
in the first place if pg_dump had had the suppress-default-opclasses
behavior at the time.  (No, it didn't always do that; cf commits
e5bbf1965 and 1929a90b6.)
        regards, tom lane



Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
Operator name formatting patch rebased on top of it. I will put
the selectivity estimation patch to the next commit fest.

This version of the patch does not touch to the btree_gist extension,
does not set the operator class as the default. It adds support to
the not equals operator to make the operator class compatible with
the btree_gist extension.

Attachment

Re: GiST support for inet datatypes

From
Andres Freund
Date:
Hi,

On 2014-03-08 23:40:31 +0200, Emre Hasegeli wrote:
> Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
> Operator name formatting patch rebased on top of it. I will put
> the selectivity estimation patch to the next commit fest.
> 
> This version of the patch does not touch to the btree_gist extension,
> does not set the operator class as the default. It adds support to
> the not equals operator to make the operator class compatible with
> the btree_gist extension.

This patch looks like it can be applied much more realistically, but it
looks too late for 9.4. I suggest moving it to the next CF?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
On 04/04/2014 04:01 PM, Andres Freund wrote:
> This patch looks like it can be applied much more realistically, but it
> looks too late for 9.4. I suggest moving it to the next CF?

If it does not change the default operator class I do not see anything 
preventing it from being applied to 9.4, as long as the committers have 
the time to look at this. My review is done and I think the first patch 
is ok and useful by itself.

-- 
Andreas



Re: GiST support for inet datatypes

From
Andres Freund
Date:
On 2014-04-04 16:50:36 +0200, Andreas Karlsson wrote:
> On 04/04/2014 04:01 PM, Andres Freund wrote:
> >This patch looks like it can be applied much more realistically, but it
> >looks too late for 9.4. I suggest moving it to the next CF?
> 
> If it does not change the default operator class I do not see anything
> preventing it from being applied to 9.4, as long as the committers have the
> time to look at this. My review is done and I think the first patch is ok
> and useful by itself.

Well, the patch was marked as needs review, not ready for committer. So,
if that's your position, re-check the latest version and mark it as
ready.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GiST support for inet datatypes

From
Andreas Karlsson
Date:
On 03/08/2014 10:40 PM, Emre Hasegeli wrote:
> Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
> Operator name formatting patch rebased on top of it. I will put
> the selectivity estimation patch to the next commit fest.
>
> This version of the patch does not touch to the btree_gist extension,
> does not set the operator class as the default. It adds support to
> the not equals operator to make the operator class compatible with
> the btree_gist extension.

The patch looks good but it does not apply anymore against master. If 
you could fix that and the duplicate OIDs I think this is ready for a 
committer.

Andreas




Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
2014-04-04 18:34, Andreas Karlsson <andreas@proxel.se>:

> The patch looks good but it does not apply anymore against master. If you
> could fix that and the duplicate OIDs I think this is ready for a committer.

Rebased version attached. I am marking it as Ready for Committer.

I will be happy to rebase the operator rename patch, too, if there is any
interest committing it.

Attachment

Re: GiST support for inet datatypes

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> 2014-04-04 18:34, Andreas Karlsson <andreas@proxel.se>:
>> The patch looks good but it does not apply anymore against master. If you
>> could fix that and the duplicate OIDs I think this is ready for a committer.

> Rebased version attached. I am marking it as Ready for Committer.

I've been hacking on this patch over the weekend.  I wasn't that thrilled
with the design of the index representation: it seemed to me that by
allowing both the minimum netmask width and the number of common address
bits to limit what you store in union values, you were giving up a lot.
I tried recoding things to separate netmask width from number of common
bits.  Initially I'd thought that this would allow for smarter
tree-descent logic in the consistent function, but after several failures
I realized that that was more easily said than done.  (I've not totally
lost hope about it, but it's not easy given the existing inet comparison
rules.)  Nonetheless, I found that doing it like this led to substantially
faster index searches --- better than two-to-one in many cases, on random
data as per the attached test script.  I believe the reason is that
decoupling netmask and common address bits makes the picksplit function
more effective at choosing good splits.

I also thought that we should at least put in dummy selectivity functions
for the now-indexable inet operators.  This will allow real selectivity
functions to be patched in without forcing initdb.  It's probably unlikely
that we'd back-patch your selectivity-function patch into 9.4, but why
foreclose the option?

So attached is an updated patch with these things taken care of.  It's
still not quite ready to commit (in particular I've not looked at the
documentation changes yet), but if anyone wants to try to break it or
do their own performance testing, now's the time.

I also attach the script I was using for testing.  It runs too long
to be a plausible candidate for the regression tests, but perhaps
someone else would like to use it anyway.  There's a test that makes
sure that index searches get the same answers as the base operators
over a bunch of random data, and then a bunch of straight queries that
can be timed for performance testing.  I thought that maybe I was
overemphasizing the problem of trimming common address bits to the
minimum netmask width, so the performance part of the script also tests
probes into a table that has a uniform netmask width of 32.  (It still
wins to do it like this.)

Comments?

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6e2fbda..4787519 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** CREATE TYPE rainbow AS ENUM ('red', 'ora
*** 8434,8441 ****
     <xref linkend="cidr-inet-operators-table"> shows the operators
     available for the <type>cidr</type> and <type>inet</type> types.
     The operators <literal><<</literal>,
!    <literal><<=</literal>, <literal>>></literal>, and
!    <literal>>>=</literal> test for subnet inclusion.  They
     consider only the network parts of the two addresses (ignoring any
     host part) and determine whether one network is identical to
     or a subnet of the other.
--- 8434,8442 ----
     <xref linkend="cidr-inet-operators-table"> shows the operators
     available for the <type>cidr</type> and <type>inet</type> types.
     The operators <literal><<</literal>,
!    <literal><<=</literal>, <literal>>></literal>,
!    <literal>>>=</literal> and <literal>&&</literal>
!    test for subnet inclusion.  They
     consider only the network parts of the two addresses (ignoring any
     host part) and determine whether one network is identical to
     or a subnet of the other.
*************** CREATE TYPE rainbow AS ENUM ('red', 'ora
*** 8503,8508 ****
--- 8504,8514 ----
          <entry><literal>inet '192.168.1/24' >>= inet '192.168.1/24'</literal></entry>
         </row>
         <row>
+         <entry> <literal>&&</literal> </entry>
+         <entry>overlaps</entry>
+         <entry><literal>inet '192.168.1/24' && inet '192.168.1.80/28'</literal></entry>
+        </row>
+        <row>
          <entry> <literal>~</literal> </entry>
          <entry>bitwise NOT</entry>
          <entry><literal>~ inet '192.168.1.6'</literal></entry>
*************** CREATE TYPE rainbow AS ENUM ('red', 'ora
*** 8537,8542 ****
--- 8543,8560 ----
      </table>

    <para>
+    GiST operator class is included for the <type>cidr</type> and
+    <type>inet</type> types, which support indexed queries using
+    <literal><</literal>, <literal><=</literal>,
+    <literal>=</literal>, <literal>>=</literal>, <literal>></literal>,
+    <literal><></literal>, <literal><<</literal>,
+    <literal><<=</literal>, <literal>>></literal>,
+    <literal>>>=</literal> and <literal>&&</literal>
+    operators. The operator class considers only the network parts
+    of the addresses while creating and using the tree.
+   </para>
+
+   <para>
     <xref linkend="cidr-inet-functions-table"> shows the functions
     available for use with the <type>cidr</type> and <type>inet</type>
     types.  The <function>abbrev</function>, <function>host</function>,
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 6b23069..7b4391b 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** OBJS = acl.o arrayfuncs.o array_selfuncs
*** 23,29 ****
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
!     network.o numeric.o numutils.o oid.o oracle_compat.o \
      orderedsetaggs.o pg_lzcompress.o pg_locale.o pg_lsn.o \
      pgstatfuncs.o pseudotypes.o quote.o rangetypes.o rangetypes_gist.o \
      rangetypes_selfuncs.o rangetypes_spgist.o rangetypes_typanalyze.o \
--- 23,30 ----
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
!     network.o network_gist.o network_selfuncs.o \
!     numeric.o numutils.o oid.o oracle_compat.o \
      orderedsetaggs.o pg_lzcompress.o pg_locale.o pg_lsn.o \
      pgstatfuncs.o pseudotypes.o quote.o rangetypes.o rangetypes_gist.o \
      rangetypes_selfuncs.o rangetypes_spgist.o rangetypes_typanalyze.o \
diff --git a/src/backend/utils/adt/network.c b/src/backend/utils/adt/network.c
index 5e837fa..8bdf577 100644
*** a/src/backend/utils/adt/network.c
--- b/src/backend/utils/adt/network.c
***************
*** 23,80 ****


  static int32 network_cmp_internal(inet *a1, inet *a2);
- static int    bitncmp(void *l, void *r, int n);
  static bool addressOK(unsigned char *a, int bits, int family);
- static int    ip_addrsize(inet *inetptr);
  static inet *internal_inetpl(inet *ip, int64 addend);

- /*
-  *    Access macros.    We use VARDATA_ANY so that we can process short-header
-  *    varlena values without detoasting them.  This requires a trick:
-  *    VARDATA_ANY assumes the varlena header is already filled in, which is
-  *    not the case when constructing a new value (until SET_INET_VARSIZE is
-  *    called, which we typically can't do till the end).  Therefore, we
-  *    always initialize the newly-allocated value to zeroes (using palloc0).
-  *    A zero length word will look like the not-1-byte case to VARDATA_ANY,
-  *    and so we correctly construct an uncompressed value.
-  *
-  *    Note that ip_maxbits() and SET_INET_VARSIZE() require
-  *    the family field to be set correctly.
-  */
-
- #define ip_family(inetptr) \
-     (((inet_struct *) VARDATA_ANY(inetptr))->family)
-
- #define ip_bits(inetptr) \
-     (((inet_struct *) VARDATA_ANY(inetptr))->bits)
-
- #define ip_addr(inetptr) \
-     (((inet_struct *) VARDATA_ANY(inetptr))->ipaddr)
-
- #define ip_maxbits(inetptr) \
-     (ip_family(inetptr) == PGSQL_AF_INET ? 32 : 128)
-
- #define SET_INET_VARSIZE(dst) \
-     SET_VARSIZE(dst, VARHDRSZ + offsetof(inet_struct, ipaddr) + \
-                 ip_addrsize(dst))
-
-
- /*
-  * Return the number of bytes of address storage needed for this data type.
-  */
- static int
- ip_addrsize(inet *inetptr)
- {
-     switch (ip_family(inetptr))
-     {
-         case PGSQL_AF_INET:
-             return 4;
-         case PGSQL_AF_INET6:
-             return 16;
-         default:
-             return 0;
-     }
- }

  /*
   * Common INET/CIDR input routine
--- 23,31 ----
*************** network_supeq(PG_FUNCTION_ARGS)
*** 596,601 ****
--- 547,567 ----
      PG_RETURN_BOOL(false);
  }

+ Datum
+ network_overlap(PG_FUNCTION_ARGS)
+ {
+     inet       *a1 = PG_GETARG_INET_PP(0);
+     inet       *a2 = PG_GETARG_INET_PP(1);
+
+     if (ip_family(a1) == ip_family(a2))
+     {
+         PG_RETURN_BOOL(bitncmp(ip_addr(a1), ip_addr(a2),
+                                Min(ip_bits(a1), ip_bits(a2))) == 0);
+     }
+
+     PG_RETURN_BOOL(false);
+ }
+
  /*
   * Extract data from a network datatype.
   */
*************** convert_network_to_scalar(Datum value, O
*** 962,971 ****
   * author:
   *        Paul Vixie (ISC), June 1996
   */
! static int
! bitncmp(void *l, void *r, int n)
  {
!     u_int        lb,
                  rb;
      int            x,
                  b;
--- 928,937 ----
   * author:
   *        Paul Vixie (ISC), June 1996
   */
! int
! bitncmp(const unsigned char *l, const unsigned char *r, int n)
  {
!     unsigned int lb,
                  rb;
      int            x,
                  b;
*************** bitncmp(void *l, void *r, int n)
*** 975,982 ****
      if (x || (n % 8) == 0)
          return x;

!     lb = ((const u_char *) l)[b];
!     rb = ((const u_char *) r)[b];
      for (b = n % 8; b > 0; b--)
      {
          if (IS_HIGHBIT_SET(lb) != IS_HIGHBIT_SET(rb))
--- 941,948 ----
      if (x || (n % 8) == 0)
          return x;

!     lb = l[b];
!     rb = r[b];
      for (b = n % 8; b > 0; b--)
      {
          if (IS_HIGHBIT_SET(lb) != IS_HIGHBIT_SET(rb))
*************** bitncmp(void *l, void *r, int n)
*** 991,996 ****
--- 957,1005 ----
      return 0;
  }

+ /*
+  * bitncommon: compare bit masks l and r, for up to n bits.
+  *
+  * Returns the number of leading bits that match (0 to n).
+  */
+ int
+ bitncommon(const unsigned char *l, const unsigned char *r, int n)
+ {
+     int            byte,
+                 nbits;
+
+     /* number of bits to examine in last byte */
+     nbits = n % 8;
+
+     /* check whole bytes */
+     for (byte = 0; byte < n / 8; byte++)
+     {
+         if (l[byte] != r[byte])
+         {
+             /* at least one bit in the last byte is not common */
+             nbits = 7;
+             break;
+         }
+     }
+
+     /* check bits in last partial byte */
+     if (nbits != 0)
+     {
+         /* calculate diff of first non-matching bytes */
+         unsigned int diff = l[byte] ^ r[byte];
+
+         /* compare the bits from the most to the least */
+         while ((diff >> (8 - nbits)) != 0)
+             nbits--;
+     }
+
+     return (8 * byte) + nbits;
+ }
+
+
+ /*
+  * Verify a CIDR address is OK (doesn't have bits set past the masklen)
+  */
  static bool
  addressOK(unsigned char *a, int bits, int family)
  {
diff --git a/src/backend/utils/adt/network_gist.c b/src/backend/utils/adt/network_gist.c
index ...a14e0e6 .
*** a/src/backend/utils/adt/network_gist.c
--- b/src/backend/utils/adt/network_gist.c
***************
*** 0 ****
--- 1,785 ----
+ /*-------------------------------------------------------------------------
+  *
+  * network_gist.c
+  *      GiST support for network types.
+  *
+  * The key thing to understand about this code is the definition of the
+  * "union" of a set of INET/CIDR values.  It works like this:
+  * 1. If the values are not all of the same IP address family, the "union"
+  * is a dummy value with family number zero, minbits zero, commonbits zero,
+  * address all zeroes.    Otherwise:
+  * 2. The union has the common IP address family number.
+  * 3. The union's minbits value is the smallest netmask length ("ip_bits")
+  * of all the input values.
+  * 4. Let C be the number of leading address bits that are in common among
+  * all the input values (C ranges from 0 to ip_maxbits for the family).
+  * 5. The union's commonbits value is C.
+  * 6. The union's address value is the same as the common prefix for its
+  * first C bits, and is zeroes to the right of that.  The physical width
+  * of the address value is ip_maxbits for the address family.
+  *
+  * In a leaf index entry (representing a single key), commonbits is equal to
+  * ip_maxbits for the address family, minbits is the same as the represented
+  * value's ip_bits, and the address is equal to the represented address.
+  * Although it may appear that we're wasting a byte by storing the union
+  * format and not just the represented INET/CIDR value in leaf keys, the
+  * extra byte is actually "free" because of alignment considerations.
+  *
+  * Note that this design tracks minbits and commonbits independently; in any
+  * given union value, either might be smaller than the other.  This does not
+  * help us much when descending the tree, because of the way inet comparison
+  * is defined: at non-leaf nodes we can't compare more than minbits bits
+  * even if we know them.  However, it greatly improves the quality of split
+  * decisions.  Preliminary testing suggests that searches are as much as
+  * twice as fast as for a simpler design in which a single field doubles as
+  * the common prefix length and the minimum ip_bits value.
+  *
+  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *      src/backend/utils/adt/network_gist.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include <sys/socket.h>
+
+ #include "access/gist.h"
+ #include "access/skey.h"
+ #include "utils/inet.h"
+
+ /*
+  * Operator strategy numbers used in the GiST inet_ops opclass
+  */
+ #define INETSTRAT_OVERLAPS        3
+ #define INETSTRAT_EQ            18
+ #define INETSTRAT_NE            19
+ #define INETSTRAT_LT            20
+ #define INETSTRAT_LE            21
+ #define INETSTRAT_GT            22
+ #define INETSTRAT_GE            23
+ #define INETSTRAT_SUB            24
+ #define INETSTRAT_SUBEQ            25
+ #define INETSTRAT_SUP            26
+ #define INETSTRAT_SUPEQ            27
+
+
+ /*
+  * Representation of a GiST INET/CIDR index key.  This is not identical to
+  * INET/CIDR because we need to keep track of the length of the common
+  * address prefix as well as the minimum netmask length.  For simplicity
+  * we always use 1-byte-header varlena format.
+  */
+ typedef struct GistInetKey
+ {
+     uint8        va_header;        /* varlena header --- don't touch directly */
+     unsigned char family;        /* PGSQL_AF_INET, PGSQL_AF_INET6, or zero */
+     unsigned char minbits;        /* minimum number of bits in netmask */
+     unsigned char commonbits;    /* number of common prefix bits in addresses */
+     unsigned char ipaddr[16];    /* up to 128 bits of common address */
+ } GistInetKey;
+
+ #define DatumGetInetKeyP(X) ((GistInetKey *) DatumGetPointer(X))
+ #define InetKeyPGetDatum(X) PointerGetDatum(X)
+
+ /*
+  * Access macros; not really exciting, but we use these for notational
+  * consistency with access to INET/CIDR values.  Note that family-zero values
+  * are stored with 4 bytes of address, not 16.
+  */
+ #define gk_ip_family(gkptr)        ((gkptr)->family)
+ #define gk_ip_minbits(gkptr)    ((gkptr)->minbits)
+ #define gk_ip_commonbits(gkptr) ((gkptr)->commonbits)
+ #define gk_ip_addr(gkptr)        ((gkptr)->ipaddr)
+ #define family_maxbits(fam)        ((fam) == PGSQL_AF_INET6 ? 128 : 32)
+
+ /* These require that the family field has been set: */
+ #define gk_ip_addrsize(gkptr) \
+     (gk_ip_family(gkptr) == PGSQL_AF_INET6 ? 16 : 4)
+ #define gk_ip_maxbits(gkptr) \
+     family_maxbits(gk_ip_family(gkptr))
+ #define SET_GK_VARSIZE(dst) \
+     SET_VARSIZE_SHORT(dst, offsetof(GistInetKey, ipaddr) + gk_ip_addrsize(dst))
+
+
+ /*
+  * The GiST query consistency check
+  */
+ Datum
+ inet_gist_consistent(PG_FUNCTION_ARGS)
+ {
+     GISTENTRY  *ent = (GISTENTRY *) PG_GETARG_POINTER(0);
+     inet       *query = PG_GETARG_INET_PP(1);
+     StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+
+     /* Oid        subtype = PG_GETARG_OID(3); */
+     bool       *recheck = (bool *) PG_GETARG_POINTER(4);
+     GistInetKey *key = DatumGetInetKeyP(ent->key);
+     int            minbits,
+                 order;
+
+     /* All operators served by this function are exact. */
+     *recheck = false;
+
+     /*
+      * Check 0: different families
+      *
+      * If key represents multiple address families, its children could match
+      * anything.  This can only happen on an inner index page.
+      */
+     if (gk_ip_family(key) == 0)
+         PG_RETURN_BOOL(true);
+
+     /*
+      * Check 1: different families
+      *
+      * Matching families do not help any of the strategies.
+      */
+     if (gk_ip_family(key) != ip_family(query))
+     {
+         switch (strategy)
+         {
+             case INETSTRAT_LT:
+             case INETSTRAT_LE:
+                 if (gk_ip_family(key) < ip_family(query))
+                     PG_RETURN_BOOL(true);
+                 break;
+
+             case INETSTRAT_GE:
+             case INETSTRAT_GT:
+                 if (gk_ip_family(key) > ip_family(query))
+                     PG_RETURN_BOOL(true);
+                 break;
+
+             case INETSTRAT_NE:
+                 PG_RETURN_BOOL(true);
+         }
+         /* For all other cases, we can be sure there is no match */
+         PG_RETURN_BOOL(false);
+     }
+
+     /*
+      * Check 2: network bit count
+      *
+      * Network bit count (ip_bits) helps to check leaves for sub network and
+      * sup network operators.  At non-leaf nodes, we know every child value
+      * has ip_bits >= gk_ip_minbits(key), so we can avoid descending in some
+      * cases too.
+      */
+     switch (strategy)
+     {
+         case INETSTRAT_SUB:
+             if (GIST_LEAF(ent) && gk_ip_minbits(key) <= ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_SUBEQ:
+             if (GIST_LEAF(ent) && gk_ip_minbits(key) < ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_SUPEQ:
+         case INETSTRAT_EQ:
+             if (gk_ip_minbits(key) > ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_SUP:
+             if (gk_ip_minbits(key) >= ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+     }
+
+     /*
+      * Check 3: common network bits
+      *
+      * Compare available common prefix bits to the query, but not beyond
+      * either the query's netmask or the minimum netmask among the represented
+      * values.    If these bits don't match the query, we have our answer (and
+      * may or may not need to descend, depending on the operator).    If they do
+      * match, and we are not at a leaf, we descend in all cases.
+      *
+      * Note this is the final check for operators that only consider the
+      * network part of the address.
+      */
+     minbits = Min(gk_ip_commonbits(key), gk_ip_minbits(key));
+     minbits = Min(minbits, ip_bits(query));
+
+     order = bitncmp(gk_ip_addr(key), ip_addr(query), minbits);
+
+     switch (strategy)
+     {
+         case INETSTRAT_SUB:
+         case INETSTRAT_SUBEQ:
+         case INETSTRAT_OVERLAPS:
+         case INETSTRAT_SUPEQ:
+         case INETSTRAT_SUP:
+             PG_RETURN_BOOL(order == 0);
+
+         case INETSTRAT_LT:
+         case INETSTRAT_LE:
+             if (order > 0)
+                 PG_RETURN_BOOL(false);
+             if (order < 0 || !GIST_LEAF(ent))
+                 PG_RETURN_BOOL(true);
+             break;
+
+         case INETSTRAT_EQ:
+             if (order != 0)
+                 PG_RETURN_BOOL(false);
+             if (!GIST_LEAF(ent))
+                 PG_RETURN_BOOL(true);
+             break;
+
+         case INETSTRAT_GE:
+         case INETSTRAT_GT:
+             if (order < 0)
+                 PG_RETURN_BOOL(false);
+             if (order > 0 || !GIST_LEAF(ent))
+                 PG_RETURN_BOOL(true);
+             break;
+
+         case INETSTRAT_NE:
+             if (order != 0 || !GIST_LEAF(ent))
+                 PG_RETURN_BOOL(true);
+             break;
+     }
+
+     /*
+      * Remaining checks are only for leaves and basic comparison strategies.
+      * See network_cmp_internal() in network.c for the implementation we need
+      * to match.  Note that in a leaf key, commonbits should equal the address
+      * length, so we compared the whole network parts above.
+      */
+     Assert(GIST_LEAF(ent));
+
+     /*
+      * Check 4: network bit count
+      *
+      * Next step is to compare netmask widths.
+      */
+     switch (strategy)
+     {
+         case INETSTRAT_LT:
+         case INETSTRAT_LE:
+             if (gk_ip_minbits(key) < ip_bits(query))
+                 PG_RETURN_BOOL(true);
+             if (gk_ip_minbits(key) > ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_EQ:
+             if (gk_ip_minbits(key) != ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_GE:
+         case INETSTRAT_GT:
+             if (gk_ip_minbits(key) > ip_bits(query))
+                 PG_RETURN_BOOL(true);
+             if (gk_ip_minbits(key) < ip_bits(query))
+                 PG_RETURN_BOOL(false);
+             break;
+
+         case INETSTRAT_NE:
+             if (gk_ip_minbits(key) != ip_bits(query))
+                 PG_RETURN_BOOL(true);
+             break;
+     }
+
+     /*
+      * Check 5: whole address
+      *
+      * Netmask bit counts are the same, so check all the address bits.
+      */
+     order = bitncmp(gk_ip_addr(key), ip_addr(query), gk_ip_maxbits(key));
+
+     switch (strategy)
+     {
+         case INETSTRAT_LT:
+             PG_RETURN_BOOL(order < 0);
+
+         case INETSTRAT_LE:
+             PG_RETURN_BOOL(order <= 0);
+
+         case INETSTRAT_EQ:
+             PG_RETURN_BOOL(order == 0);
+
+         case INETSTRAT_GE:
+             PG_RETURN_BOOL(order >= 0);
+
+         case INETSTRAT_GT:
+             PG_RETURN_BOOL(order > 0);
+
+         case INETSTRAT_NE:
+             PG_RETURN_BOOL(order != 0);
+     }
+
+     elog(ERROR, "unknown strategy for inet GiST");
+     PG_RETURN_BOOL(false);        /* keep compiler quiet */
+ }
+
+ /*
+  * Calculate parameters of the union of some GistInetKeys.
+  *
+  * Examine the keys in elements m..n inclusive of the GISTENTRY array,
+  * and compute these output parameters:
+  * *minfamily_p = minimum IP address family number
+  * *maxfamily_p = maximum IP address family number
+  * *minbits_p = minimum netmask width
+  * *commonbits_p = number of leading bits in common among the addresses
+  *
+  * minbits and commonbits are forced to zero if there's more than one
+  * address family.
+  */
+ static void
+ calc_inet_union_params(GISTENTRY *ent,
+                        int m, int n,
+                        int *minfamily_p,
+                        int *maxfamily_p,
+                        int *minbits_p,
+                        int *commonbits_p)
+ {
+     int            minfamily,
+                 maxfamily,
+                 minbits,
+                 commonbits;
+     unsigned char *addr;
+     GistInetKey *tmp;
+     int            i;
+
+     /* Must be at least one key. */
+     Assert(m <= n);
+
+     /* Initialize variables using the first key. */
+     tmp = DatumGetInetKeyP(ent[m].key);
+     minfamily = maxfamily = gk_ip_family(tmp);
+     minbits = gk_ip_minbits(tmp);
+     commonbits = gk_ip_commonbits(tmp);
+     addr = gk_ip_addr(tmp);
+
+     /* Scan remaining keys. */
+     for (i = m + 1; i <= n; i++)
+     {
+         tmp = DatumGetInetKeyP(ent[i].key);
+
+         /* Determine range of family numbers */
+         if (minfamily > gk_ip_family(tmp))
+             minfamily = gk_ip_family(tmp);
+         if (maxfamily < gk_ip_family(tmp))
+             maxfamily = gk_ip_family(tmp);
+
+         /* Find minimum minbits */
+         if (minbits > gk_ip_minbits(tmp))
+             minbits = gk_ip_minbits(tmp);
+
+         /* Find minimum number of bits in common */
+         if (commonbits > gk_ip_commonbits(tmp))
+             commonbits = gk_ip_commonbits(tmp);
+         if (commonbits > 0)
+             commonbits = bitncommon(addr, gk_ip_addr(tmp), commonbits);
+     }
+
+     /* Force minbits/commonbits to zero if more than one family. */
+     if (minfamily != maxfamily)
+         minbits = commonbits = 0;
+
+     *minfamily_p = minfamily;
+     *maxfamily_p = maxfamily;
+     *minbits_p = minbits;
+     *commonbits_p = commonbits;
+ }
+
+ /*
+  * Same as above, but the GISTENTRY elements to examine are those with
+  * indices listed in the offsets[] array.
+  */
+ static void
+ calc_inet_union_params_indexed(GISTENTRY *ent,
+                                OffsetNumber *offsets, int noffsets,
+                                int *minfamily_p,
+                                int *maxfamily_p,
+                                int *minbits_p,
+                                int *commonbits_p)
+ {
+     int            minfamily,
+                 maxfamily,
+                 minbits,
+                 commonbits;
+     unsigned char *addr;
+     GistInetKey *tmp;
+     int            i;
+
+     /* Must be at least one key. */
+     Assert(noffsets > 0);
+
+     /* Initialize variables using the first key. */
+     tmp = DatumGetInetKeyP(ent[offsets[0]].key);
+     minfamily = maxfamily = gk_ip_family(tmp);
+     minbits = gk_ip_minbits(tmp);
+     commonbits = gk_ip_commonbits(tmp);
+     addr = gk_ip_addr(tmp);
+
+     /* Scan remaining keys. */
+     for (i = 1; i < noffsets; i++)
+     {
+         tmp = DatumGetInetKeyP(ent[offsets[i]].key);
+
+         /* Determine range of family numbers */
+         if (minfamily > gk_ip_family(tmp))
+             minfamily = gk_ip_family(tmp);
+         if (maxfamily < gk_ip_family(tmp))
+             maxfamily = gk_ip_family(tmp);
+
+         /* Find minimum minbits */
+         if (minbits > gk_ip_minbits(tmp))
+             minbits = gk_ip_minbits(tmp);
+
+         /* Find minimum number of bits in common */
+         if (commonbits > gk_ip_commonbits(tmp))
+             commonbits = gk_ip_commonbits(tmp);
+         if (commonbits > 0)
+             commonbits = bitncommon(addr, gk_ip_addr(tmp), commonbits);
+     }
+
+     /* Force minbits/commonbits to zero if more than one family. */
+     if (minfamily != maxfamily)
+         minbits = commonbits = 0;
+
+     *minfamily_p = minfamily;
+     *maxfamily_p = maxfamily;
+     *minbits_p = minbits;
+     *commonbits_p = commonbits;
+ }
+
+ /*
+  * Construct a GistInetKey representing a union value.
+  *
+  * Inputs are the family/minbits/commonbits values to use, plus a pointer to
+  * the address field of one of the union inputs.  (Since we're going to copy
+  * just the bits-in-common, it doesn't matter which one.)
+  */
+ static GistInetKey *
+ build_inet_union_key(int family, int minbits, int commonbits,
+                      unsigned char *addr)
+ {
+     GistInetKey *result;
+
+     /* Make sure any unused bits are zeroed. */
+     result = (GistInetKey *) palloc0(sizeof(GistInetKey));
+
+     gk_ip_family(result) = family;
+     gk_ip_minbits(result) = minbits;
+     gk_ip_commonbits(result) = commonbits;
+
+     /* Clone appropriate bytes of the address. */
+     if (commonbits > 0)
+         memcpy(gk_ip_addr(result), addr, (commonbits + 7) / 8);
+
+     /* Clean any unwanted bits in the last partial byte. */
+     if (commonbits % 8 != 0)
+         gk_ip_addr(result)[commonbits / 8] &= ~(0xFF >> (commonbits % 8));
+
+     /* Set varlena header correctly. */
+     SET_GK_VARSIZE(result);
+
+     return result;
+ }
+
+
+ /*
+  * The GiST union function
+  *
+  * See comments at head of file for the definition of the union.
+  */
+ Datum
+ inet_gist_union(PG_FUNCTION_ARGS)
+ {
+     GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
+     GISTENTRY  *ent = entryvec->vector;
+     int            minfamily,
+                 maxfamily,
+                 minbits,
+                 commonbits;
+     unsigned char *addr;
+     GistInetKey *tmp,
+                *result;
+
+     /* Determine parameters of the union. */
+     calc_inet_union_params(ent, 0, entryvec->n - 1,
+                            &minfamily, &maxfamily,
+                            &minbits, &commonbits);
+
+     /* If more than one family, emit family number zero. */
+     if (minfamily != maxfamily)
+         minfamily = 0;
+
+     /* Initialize address using the first key. */
+     tmp = DatumGetInetKeyP(ent[0].key);
+     addr = gk_ip_addr(tmp);
+
+     /* Construct the union value. */
+     result = build_inet_union_key(minfamily, minbits, commonbits, addr);
+
+     PG_RETURN_POINTER(result);
+ }
+
+ /*
+  * The GiST compress function
+  *
+  * Convert an inet value to GistInetKey.
+  */
+ Datum
+ inet_gist_compress(PG_FUNCTION_ARGS)
+ {
+     GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+     GISTENTRY  *retval;
+
+     if (entry->leafkey)
+     {
+         retval = palloc(sizeof(GISTENTRY));
+         if (DatumGetPointer(entry->key) != NULL)
+         {
+             inet       *in = DatumGetInetPP(entry->key);
+             GistInetKey *r;
+
+             r = (GistInetKey *) palloc0(sizeof(GistInetKey));
+
+             gk_ip_family(r) = ip_family(in);
+             gk_ip_minbits(r) = ip_bits(in);
+             gk_ip_commonbits(r) = gk_ip_maxbits(r);
+             memcpy(gk_ip_addr(r), ip_addr(in), gk_ip_addrsize(r));
+             SET_GK_VARSIZE(r);
+
+             gistentryinit(*retval, PointerGetDatum(r),
+                           entry->rel, entry->page,
+                           entry->offset, FALSE);
+         }
+         else
+         {
+             gistentryinit(*retval, (Datum) 0,
+                           entry->rel, entry->page,
+                           entry->offset, FALSE);
+         }
+     }
+     else
+         retval = entry;
+     PG_RETURN_POINTER(retval);
+ }
+
+ /*
+  * The GiST decompress function
+  *
+  * do not do anything --- we just use the stored GistInetKey as-is.
+  */
+ Datum
+ inet_gist_decompress(PG_FUNCTION_ARGS)
+ {
+     GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+
+     PG_RETURN_POINTER(entry);
+ }
+
+ /*
+  * The GiST page split penalty function
+  *
+  * Charge a large penalty if address family doesn't match, or a somewhat
+  * smaller one if the new value would degrade the union's ip_bits
+  * (minimum netmask width).  Otherwise, penalty is inverse of the
+  * new number of common address bits.
+  */
+ Datum
+ inet_gist_penalty(PG_FUNCTION_ARGS)
+ {
+     GISTENTRY  *origent = (GISTENTRY *) PG_GETARG_POINTER(0);
+     GISTENTRY  *newent = (GISTENTRY *) PG_GETARG_POINTER(1);
+     float       *penalty = (float *) PG_GETARG_POINTER(2);
+     GistInetKey *orig = DatumGetInetKeyP(origent->key),
+                *new = DatumGetInetKeyP(newent->key);
+     int            commonbits;
+
+     if (gk_ip_family(orig) == gk_ip_family(new))
+     {
+         if (gk_ip_minbits(orig) <= gk_ip_minbits(new))
+         {
+             commonbits = bitncommon(gk_ip_addr(orig), gk_ip_addr(new),
+                                     Min(gk_ip_commonbits(orig),
+                                         gk_ip_commonbits(new)));
+             if (commonbits > 0)
+                 *penalty = 1.0f / commonbits;
+             else
+                 *penalty = 2;
+         }
+         else
+             *penalty = 3;
+     }
+     else
+         *penalty = 4;
+
+     PG_RETURN_POINTER(penalty);
+ }
+
+ /*
+  * The GiST PickSplit method
+  *
+  * There are two ways to split. First one is to split by address families,
+  * if there are multiple families appearing in the input.
+  *
+  * The second and more common way is to split by addresses. To achieve this,
+  * determine the number of leading bits shared by all the keys, then split on
+  * the next bit.  (We don't currently consider the netmask widths while doing
+  * this; should we?)  If we fail to get a nontrivial split that way, split
+  * 50-50.
+  */
+ Datum
+ inet_gist_picksplit(PG_FUNCTION_ARGS)
+ {
+     GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
+     GIST_SPLITVEC *splitvec = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
+     GISTENTRY  *ent = entryvec->vector;
+     int            minfamily,
+                 maxfamily,
+                 minbits,
+                 commonbits;
+     unsigned char *addr;
+     GistInetKey *tmp,
+                *left_union,
+                *right_union;
+     int            maxoff,
+                 nbytes;
+     OffsetNumber i,
+                *left,
+                *right;
+
+     maxoff = entryvec->n - 1;
+     nbytes = (maxoff + 1) * sizeof(OffsetNumber);
+
+     left = (OffsetNumber *) palloc(nbytes);
+     right = (OffsetNumber *) palloc(nbytes);
+
+     splitvec->spl_left = left;
+     splitvec->spl_right = right;
+
+     splitvec->spl_nleft = 0;
+     splitvec->spl_nright = 0;
+
+     /* Determine parameters of the union of all the inputs. */
+     calc_inet_union_params(ent, FirstOffsetNumber, maxoff,
+                            &minfamily, &maxfamily,
+                            &minbits, &commonbits);
+
+     if (minfamily != maxfamily)
+     {
+         /* Multiple families, so split by family. */
+         for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+         {
+             /*
+              * If there's more than 2 families, all but maxfamily go into the
+              * left union.    This could only happen if the inputs include some
+              * IPv4, some IPv6, and some already-multiple-family unions.
+              */
+             tmp = DatumGetInetKeyP(ent[i].key);
+             if (gk_ip_family(tmp) != maxfamily)
+                 left[splitvec->spl_nleft++] = i;
+             else
+                 right[splitvec->spl_nright++] = i;
+         }
+     }
+     else
+     {
+         /*
+          * Split on the next bit after the common bits.  If that yields a
+          * trivial split, try the next bit position to the right.  Repeat till
+          * success; or if we run out of bits, do an arbitrary 50-50 split.
+          */
+         int            maxbits = family_maxbits(minfamily);
+
+         while (commonbits < maxbits)
+         {
+             /* Split using the commonbits'th bit position. */
+             int            bitbyte = commonbits / 8;
+             int            bitmask = 0x80 >> (commonbits % 8);
+
+             splitvec->spl_nleft = splitvec->spl_nright = 0;
+
+             for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+             {
+                 tmp = DatumGetInetKeyP(ent[i].key);
+                 addr = gk_ip_addr(tmp);
+                 if ((addr[bitbyte] & bitmask) == 0)
+                     left[splitvec->spl_nleft++] = i;
+                 else
+                     right[splitvec->spl_nright++] = i;
+             }
+
+             if (splitvec->spl_nleft > 0 && splitvec->spl_nright > 0)
+                 break;            /* success */
+             commonbits++;
+         }
+
+         if (commonbits >= maxbits)
+         {
+             /* Failed ... do a 50-50 split. */
+             splitvec->spl_nleft = splitvec->spl_nright = 0;
+
+             for (i = FirstOffsetNumber; i <= maxoff / 2; i = OffsetNumberNext(i))
+             {
+                 left[splitvec->spl_nleft++] = i;
+             }
+             for (; i <= maxoff; i = OffsetNumberNext(i))
+             {
+                 right[splitvec->spl_nright++] = i;
+             }
+         }
+     }
+
+     /*
+      * Compute the union value for each side from scratch.    In most cases we
+      * could approximate the union values with what we already know, but this
+      * ensures that each side has minbits and commonbits set as high as
+      * possible.
+      */
+     calc_inet_union_params_indexed(ent, left, splitvec->spl_nleft,
+                                    &minfamily, &maxfamily,
+                                    &minbits, &commonbits);
+     if (minfamily != maxfamily)
+         minfamily = 0;
+     tmp = DatumGetInetKeyP(ent[left[0]].key);
+     addr = gk_ip_addr(tmp);
+     left_union = build_inet_union_key(minfamily, minbits, commonbits, addr);
+     splitvec->spl_ldatum = PointerGetDatum(left_union);
+
+     calc_inet_union_params_indexed(ent, right, splitvec->spl_nright,
+                                    &minfamily, &maxfamily,
+                                    &minbits, &commonbits);
+     if (minfamily != maxfamily)
+         minfamily = 0;
+     tmp = DatumGetInetKeyP(ent[right[0]].key);
+     addr = gk_ip_addr(tmp);
+     right_union = build_inet_union_key(minfamily, minbits, commonbits, addr);
+     splitvec->spl_rdatum = PointerGetDatum(right_union);
+
+     PG_RETURN_POINTER(splitvec);
+ }
+
+ /*
+  * The GiST equality function
+  */
+ Datum
+ inet_gist_same(PG_FUNCTION_ARGS)
+ {
+     GistInetKey *left = DatumGetInetKeyP(PG_GETARG_DATUM(0));
+     GistInetKey *right = DatumGetInetKeyP(PG_GETARG_DATUM(1));
+     bool       *result = (bool *) PG_GETARG_POINTER(2);
+
+     *result = (gk_ip_family(left) == gk_ip_family(right) &&
+                gk_ip_minbits(left) == gk_ip_minbits(right) &&
+                gk_ip_commonbits(left) == gk_ip_commonbits(right) &&
+                memcmp(gk_ip_addr(left), gk_ip_addr(right),
+                       gk_ip_addrsize(left)) == 0);
+
+     PG_RETURN_POINTER(result);
+ }
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index ...d0d806f .
*** a/src/backend/utils/adt/network_selfuncs.c
--- b/src/backend/utils/adt/network_selfuncs.c
***************
*** 0 ****
--- 1,32 ----
+ /*-------------------------------------------------------------------------
+  *
+  * network_selfuncs.c
+  *      Functions for selectivity estimation of inet/cidr operators
+  *
+  * Currently these are just stubs, but we hope to do better soon.
+  *
+  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *      src/backend/utils/adt/network_selfuncs.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "utils/inet.h"
+
+
+ Datum
+ networksel(PG_FUNCTION_ARGS)
+ {
+     PG_RETURN_FLOAT8(0.001);
+ }
+
+ Datum
+ networkjoinsel(PG_FUNCTION_ARGS)
+ {
+     PG_RETURN_FLOAT8(0.001);
+ }
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index e1a04c8..51b631a 100644
*** a/src/include/catalog/catversion.h
--- b/src/include/catalog/catversion.h
***************
*** 53,58 ****
   */

  /*                            yyyymmddN */
! #define CATALOG_VERSION_NO    201404031

  #endif
--- 53,58 ----
   */

  /*                            yyyymmddN */
! #define CATALOG_VERSION_NO    201404061

  #endif
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 2623113..8efd3be 100644
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert (    3474   3831 3831 8 s    3892
*** 818,821 ****
--- 818,836 ----
  DATA(insert (    3474   3831 2283 16 s    3889 4000 0 ));
  DATA(insert (    3474   3831 3831 18 s    3882 4000 0 ));

+ /*
+  * GiST inet_ops
+  */
+ DATA(insert (    3550    869 869 3 s        3552 783 0 ));
+ DATA(insert (    3550    869 869 18 s    1201 783 0 ));
+ DATA(insert (    3550    869 869 19 s    1202 783 0 ));
+ DATA(insert (    3550    869 869 20 s    1203 783 0 ));
+ DATA(insert (    3550    869 869 21 s    1204 783 0 ));
+ DATA(insert (    3550    869 869 22 s    1205 783 0 ));
+ DATA(insert (    3550    869 869 23 s    1206 783 0 ));
+ DATA(insert (    3550    869 869 24 s    931 783 0 ));
+ DATA(insert (    3550    869 869 25 s    932 783 0 ));
+ DATA(insert (    3550    869 869 26 s    933 783 0 ));
+ DATA(insert (    3550    869 869 27 s    934 783 0 ));
+
  #endif   /* PG_AMOP_H */
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index b28dd56..198b126 100644
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert (    4037   3802 3802 2 3485 ))
*** 399,404 ****
--- 399,411 ----
  DATA(insert (    4037   3802 3802 3 3486 ));
  DATA(insert (    4037   3802 3802 4 3487 ));
  DATA(insert (    4037   3802 3802 6 3489 ));
+ DATA(insert (    3550   869    869  1 3553 ));
+ DATA(insert (    3550   869    869  2 3554 ));
+ DATA(insert (    3550   869    869  3 3555 ));
+ DATA(insert (    3550   869    869  4 3556 ));
+ DATA(insert (    3550   869    869  5 3557 ));
+ DATA(insert (    3550   869    869  6 3558 ));
+ DATA(insert (    3550   869    869  7 3559 ));

  /* sp-gist */
  DATA(insert (    3474   3831 3831 1 3469 ));
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
index 63a40a8..49b2410 100644
*** a/src/include/catalog/pg_opclass.h
--- b/src/include/catalog/pg_opclass.h
*************** DATA(insert OID = 3123 ( 403    float8_ops
*** 112,117 ****
--- 112,118 ----
  DATA(insert (    405        float8_ops            PGNSP PGUID 1971  701 t 0 ));
  DATA(insert (    403        inet_ops            PGNSP PGUID 1974  869 t 0 ));
  DATA(insert (    405        inet_ops            PGNSP PGUID 1975  869 t 0 ));
+ DATA(insert (    783        inet_ops            PGNSP PGUID 3550  869 f 0 ));
  DATA(insert OID = 1979 ( 403    int2_ops    PGNSP PGUID 1976   21 t 0 ));
  #define INT2_BTREE_OPS_OID 1979
  DATA(insert (    405        int2_ops            PGNSP PGUID 1977   21 t 0 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index ac09034..f2e657b 100644
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1205 (  ">"       PGNSP P
*** 1140,1157 ****
  DESCR("greater than");
  DATA(insert OID = 1206 (  ">="       PGNSP PGUID b f f 869 869     16 1204 1203 network_ge scalargtsel
scalargtjoinsel)); 
  DESCR("greater than or equal");
! DATA(insert OID = 931  (  "<<"       PGNSP PGUID b f f 869 869     16 933        0 network_sub - - ));
  DESCR("is subnet");
  #define OID_INET_SUB_OP                  931
! DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869     16 934        0 network_subeq - - ));
  DESCR("is subnet or equal");
  #define OID_INET_SUBEQ_OP                932
! DATA(insert OID = 933  (  ">>"       PGNSP PGUID b f f 869 869     16 931        0 network_sup - - ));
  DESCR("is supernet");
  #define OID_INET_SUP_OP                  933
! DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869     16 932        0 network_supeq - - ));
  DESCR("is supernet or equal");
  #define OID_INET_SUPEQ_OP                934

  DATA(insert OID = 2634 (  "~"       PGNSP PGUID l f f      0 869 869 0 0 inetnot - - ));
  DESCR("bitwise not");
--- 1140,1159 ----
  DESCR("greater than");
  DATA(insert OID = 1206 (  ">="       PGNSP PGUID b f f 869 869     16 1204 1203 network_ge scalargtsel
scalargtjoinsel)); 
  DESCR("greater than or equal");
! DATA(insert OID = 931  (  "<<"       PGNSP PGUID b f f 869 869     16 933        0 network_sub networksel
networkjoinsel)); 
  DESCR("is subnet");
  #define OID_INET_SUB_OP                  931
! DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869     16 934        0 network_subeq networksel
networkjoinsel)); 
  DESCR("is subnet or equal");
  #define OID_INET_SUBEQ_OP                932
! DATA(insert OID = 933  (  ">>"       PGNSP PGUID b f f 869 869     16 931        0 network_sup networksel
networkjoinsel)); 
  DESCR("is supernet");
  #define OID_INET_SUP_OP                  933
! DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869     16 932        0 network_supeq networksel
networkjoinsel)); 
  DESCR("is supernet or equal");
  #define OID_INET_SUPEQ_OP                934
+ DATA(insert OID = 3552    (  "&&"    PGNSP PGUID b f f 869 869     16 3552    0 network_overlap networksel
networkjoinsel));
+ DESCR("overlaps (is subnet or supernet)");

  DATA(insert OID = 2634 (  "~"       PGNSP PGUID l f f      0 869 869 0 0 inetnot - - ));
  DESCR("bitwise not");
diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h
index 775be86..9e8f4ac 100644
*** a/src/include/catalog/pg_opfamily.h
--- b/src/include/catalog/pg_opfamily.h
*************** DATA(insert OID = 1971 (    405        float_ops
*** 78,83 ****
--- 78,84 ----
  DATA(insert OID = 1974 (    403        network_ops        PGNSP PGUID ));
  #define NETWORK_BTREE_FAM_OID 1974
  DATA(insert OID = 1975 (    405        network_ops        PGNSP PGUID ));
+ DATA(insert OID = 3550 (    783        network_ops        PGNSP PGUID ));
  DATA(insert OID = 1976 (    403        integer_ops        PGNSP PGUID ));
  #define INTEGER_BTREE_FAM_OID 1976
  DATA(insert OID = 1977 (    405        integer_ops        PGNSP PGUID ));
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 334e6b8..bed7fa1 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 927 (  network_sub        PG
*** 2118,2123 ****
--- 2118,2124 ----
  DATA(insert OID = 928 (  network_subeq        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "869 869" _null_ _null_
_null__null_    network_subeq _null_ _null_ _null_ )); 
  DATA(insert OID = 929 (  network_sup        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "869 869" _null_ _null_
_null__null_    network_sup _null_ _null_ _null_ )); 
  DATA(insert OID = 930 (  network_supeq        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "869 869" _null_ _null_
_null__null_    network_supeq _null_ _null_ _null_ )); 
+ DATA(insert OID = 3551 (  network_overlap    PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "869 869" _null_ _null_
_null__null_    network_overlap _null_ _null_ _null_ )); 

  /* inet/cidr functions */
  DATA(insert OID = 598 (  abbrev                PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "869" _null_ _null_ _null_
_null_   inet_abbrev _null_ _null_ _null_ )); 
*************** DATA(insert OID = 2631 (  int8pl_inet        P
*** 2164,2169 ****
--- 2165,2192 ----
  DATA(insert OID = 2632 (  inetmi_int8        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 869 "869 20" _null_ _null_
_null__null_    inetmi_int8 _null_ _null_ _null_ )); 
  DATA(insert OID = 2633 (  inetmi            PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 20 "869 869" _null_ _null_
_null__null_    inetmi _null_ _null_ _null_ )); 

+ /* GiST support for inet and cidr */
+ DATA(insert OID = 3553 (  inet_gist_consistent    PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 869 23 26 2281"
_null__null_ _null_ _null_ inet_gist_consistent _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3554 (  inet_gist_union        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 2281 "2281 2281" _null_
_null__null_ _null_ inet_gist_union _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3555 (  inet_gist_compress    PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_
_null__null_ inet_gist_compress _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3556 (  inet_gist_decompress    PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_
_null__null_ inet_gist_decompress _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3557 (  inet_gist_penalty        PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 2281 "2281 2281 2281"
_null__null_ _null_ _null_ inet_gist_penalty _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3558 (  inet_gist_picksplit    PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 2281 "2281 2281" _null_
_null__null_ _null_ inet_gist_picksplit _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+ DATA(insert OID = 3559 (  inet_gist_same        PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 2281 "869 869 2281" _null_
_null__null_ _null_ inet_gist_same _null_ _null_ _null_ )); 
+ DESCR("GiST support");
+
+ /* Selectivity estimation for inet and cidr */
+ DATA(insert OID = 3560 (  networksel           PGNSP PGUID 12 1 0 0 0 f f f f t f s 4 0 701 "2281 26 2281 23" _null_
_null__null_ _null_    networksel _null_ _null_ _null_ )); 
+ DESCR("restriction selectivity for network operators");
+ DATA(insert OID = 3561 (  networkjoinsel       PGNSP PGUID 12 1 0 0 0 f f f f t f s 5 0 701 "2281 26 2281 21 2281"
_null__null_ _null_ _null_    networkjoinsel _null_ _null_ _null_ )); 
+ DESCR("join selectivity for network operators");
+
  DATA(insert OID = 1690 ( time_mi_time        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1186 "1083 1083" _null_ _null_
_null__null_    time_mi_time _null_ _null_ _null_ )); 

  DATA(insert OID = 1691 (  boolle            PGNSP PGUID 12 1 0 0 0 f f f t t f i 2 0 16 "16 16" _null_ _null_ _null_
_null_boolle _null_ _null_ _null_ )); 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 031a43a..beec002 100644
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
*************** extern Datum network_sub(PG_FUNCTION_ARG
*** 902,907 ****
--- 902,908 ----
  extern Datum network_subeq(PG_FUNCTION_ARGS);
  extern Datum network_sup(PG_FUNCTION_ARGS);
  extern Datum network_supeq(PG_FUNCTION_ARGS);
+ extern Datum network_overlap(PG_FUNCTION_ARGS);
  extern Datum network_network(PG_FUNCTION_ARGS);
  extern Datum network_netmask(PG_FUNCTION_ARGS);
  extern Datum network_hostmask(PG_FUNCTION_ARGS);
diff --git a/src/include/utils/inet.h b/src/include/utils/inet.h
index 330f32d..bd31c71 100644
*** a/src/include/utils/inet.h
--- b/src/include/utils/inet.h
*************** typedef struct
*** 53,58 ****
--- 53,90 ----
      inet_struct inet_data;
  } inet;

+ /*
+  *    Access macros.    We use VARDATA_ANY so that we can process short-header
+  *    varlena values without detoasting them.  This requires a trick:
+  *    VARDATA_ANY assumes the varlena header is already filled in, which is
+  *    not the case when constructing a new value (until SET_INET_VARSIZE is
+  *    called, which we typically can't do till the end).  Therefore, we
+  *    always initialize the newly-allocated value to zeroes (using palloc0).
+  *    A zero length word will look like the not-1-byte case to VARDATA_ANY,
+  *    and so we correctly construct an uncompressed value.
+  *
+  *    Note that ip_addrsize(), ip_maxbits(), and SET_INET_VARSIZE() require
+  *    the family field to be set correctly.
+  */
+ #define ip_family(inetptr) \
+     (((inet_struct *) VARDATA_ANY(inetptr))->family)
+
+ #define ip_bits(inetptr) \
+     (((inet_struct *) VARDATA_ANY(inetptr))->bits)
+
+ #define ip_addr(inetptr) \
+     (((inet_struct *) VARDATA_ANY(inetptr))->ipaddr)
+
+ #define ip_addrsize(inetptr) \
+     (ip_family(inetptr) == PGSQL_AF_INET ? 4 : 16)
+
+ #define ip_maxbits(inetptr) \
+     (ip_family(inetptr) == PGSQL_AF_INET ? 32 : 128)
+
+ #define SET_INET_VARSIZE(dst) \
+     SET_VARSIZE(dst, VARHDRSZ + offsetof(inet_struct, ipaddr) + \
+                 ip_addrsize(dst))
+

  /*
   *    This is the internal storage format for MAC addresses:
*************** typedef struct macaddr
*** 82,85 ****
--- 114,140 ----
  #define PG_GETARG_MACADDR_P(n) DatumGetMacaddrP(PG_GETARG_DATUM(n))
  #define PG_RETURN_MACADDR_P(x) return MacaddrPGetDatum(x)

+ /*
+  * Support functions in network.c
+  */
+ extern int    bitncmp(const unsigned char *l, const unsigned char *r, int n);
+ extern int    bitncommon(const unsigned char *l, const unsigned char *r, int n);
+
+ /*
+  * GiST support functions in network_gist.c
+  */
+ extern Datum inet_gist_consistent(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_union(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_compress(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_decompress(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_penalty(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_picksplit(PG_FUNCTION_ARGS);
+ extern Datum inet_gist_same(PG_FUNCTION_ARGS);
+
+ /*
+  * Estimation functions in network_selfuncs.c
+  */
+ extern Datum networksel(PG_FUNCTION_ARGS);
+ extern Datum networkjoinsel(PG_FUNCTION_ARGS);
+
  #endif   /* INET_H */
diff --git a/src/test/regress/expected/inet.out b/src/test/regress/expected/inet.out
index 356a397..008cc0b 100644
*** a/src/test/regress/expected/inet.out
--- b/src/test/regress/expected/inet.out
*************** SELECT '' AS ten, i, c,
*** 180,206 ****
    i < c AS lt, i <= c AS le, i = c AS eq,
    i >= c AS ge, i > c AS gt, i <> c AS ne,
    i << c AS sb, i <<= c AS sbe,
!   i >> c AS sup, i >>= c AS spe
    FROM INET_TBL;
!  ten |        i         |         c          | lt | le | eq | ge | gt | ne | sb | sbe | sup | spe
! -----+------------------+--------------------+----+----+----+----+----+----+----+-----+-----+-----
!      | 192.168.1.226/24 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 192.168.1.226    | 192.168.1.0/26     | f  | f  | f  | t  | t  | t  | f  | f   | f   | f
!      | 192.168.1.0/24   | 192.168.1.0/24     | f  | t  | t  | t  | f  | f  | f  | t   | f   | t
!      | 192.168.1.0/25   | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | t  | t   | f   | f
!      | 192.168.1.255/24 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 192.168.1.255/25 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | t  | t   | f   | f
!      | 10.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 10.1.2.3/8       | 10.0.0.0/32        | t  | t  | f  | f  | f  | t  | f  | f   | t   | t
!      | 10.1.2.3         | 10.1.2.3/32        | f  | t  | t  | t  | f  | f  | f  | t   | f   | t
!      | 10.1.2.3/24      | 10.1.2.0/24        | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 10.1.2.3/16      | 10.1.0.0/16        | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 10.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | t   | f   | t
!      | 11.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | f   | f   | f
!      | 9.1.2.3/8        | 10.0.0.0/8         | t  | t  | f  | f  | f  | t  | f  | f   | f   | f
!      | 10:23::f1/64     | 10:23::f1/128      | t  | t  | f  | f  | f  | t  | f  | f   | t   | t
!      | 10:23::ffff      | 10:23::8000/113    | f  | f  | f  | t  | t  | t  | t  | t   | f   | f
!      | ::4.3.2.1/24     | ::ffff:1.2.3.4/128 | t  | t  | f  | f  | f  | t  | f  | f   | t   | t
  (17 rows)

  -- check the conversion to/from text and set_netmask
--- 180,207 ----
    i < c AS lt, i <= c AS le, i = c AS eq,
    i >= c AS ge, i > c AS gt, i <> c AS ne,
    i << c AS sb, i <<= c AS sbe,
!   i >> c AS sup, i >>= c AS spe,
!   i && c AS ovr
    FROM INET_TBL;
!  ten |        i         |         c          | lt | le | eq | ge | gt | ne | sb | sbe | sup | spe | ovr
! -----+------------------+--------------------+----+----+----+----+----+----+----+-----+-----+-----+-----
!      | 192.168.1.226/24 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 192.168.1.226    | 192.168.1.0/26     | f  | f  | f  | t  | t  | t  | f  | f   | f   | f   | f
!      | 192.168.1.0/24   | 192.168.1.0/24     | f  | t  | t  | t  | f  | f  | f  | t   | f   | t   | t
!      | 192.168.1.0/25   | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | t  | t   | f   | f   | t
!      | 192.168.1.255/24 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 192.168.1.255/25 | 192.168.1.0/24     | f  | f  | f  | t  | t  | t  | t  | t   | f   | f   | t
!      | 10.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 10.1.2.3/8       | 10.0.0.0/32        | t  | t  | f  | f  | f  | t  | f  | f   | t   | t   | t
!      | 10.1.2.3         | 10.1.2.3/32        | f  | t  | t  | t  | f  | f  | f  | t   | f   | t   | t
!      | 10.1.2.3/24      | 10.1.2.0/24        | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 10.1.2.3/16      | 10.1.0.0/16        | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 10.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | t   | f   | t   | t
!      | 11.1.2.3/8       | 10.0.0.0/8         | f  | f  | f  | t  | t  | t  | f  | f   | f   | f   | f
!      | 9.1.2.3/8        | 10.0.0.0/8         | t  | t  | f  | f  | f  | t  | f  | f   | f   | f   | f
!      | 10:23::f1/64     | 10:23::f1/128      | t  | t  | f  | f  | f  | t  | f  | f   | t   | t   | t
!      | 10:23::ffff      | 10:23::8000/113    | f  | f  | f  | t  | t  | t  | t  | t   | f   | f   | t
!      | ::4.3.2.1/24     | ::ffff:1.2.3.4/128 | t  | t  | f  | f  | f  | t  | f  | f   | t   | t   | t
  (17 rows)

  -- check the conversion to/from text and set_netmask
*************** SELECT '' AS ten, set_masklen(inet(text(
*** 226,232 ****
       | ::4.3.2.1/24
  (17 rows)

! -- check that index works correctly
  CREATE INDEX inet_idx1 ON inet_tbl(i);
  SET enable_seqscan TO off;
  SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr;
--- 227,233 ----
       | ::4.3.2.1/24
  (17 rows)

! -- check that btree index works correctly
  CREATE INDEX inet_idx1 ON inet_tbl(i);
  SET enable_seqscan TO off;
  SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr;
*************** SELECT * FROM inet_tbl WHERE i<<='192.16
*** 250,255 ****
--- 251,385 ----

  SET enable_seqscan TO on;
  DROP INDEX inet_idx1;
+ -- check that gist index works correctly
+ CREATE INDEX inet_idx2 ON inet_tbl using gist (i inet_ops);
+ SET enable_seqscan TO off;
+ SELECT * FROM inet_tbl WHERE i << '192.168.1.0/24'::cidr ORDER BY i;
+        c        |        i
+ ----------------+------------------
+  192.168.1.0/24 | 192.168.1.0/25
+  192.168.1.0/24 | 192.168.1.255/25
+  192.168.1.0/26 | 192.168.1.226
+ (3 rows)
+
+ SELECT * FROM inet_tbl WHERE i <<= '192.168.1.0/24'::cidr ORDER BY i;
+        c        |        i
+ ----------------+------------------
+  192.168.1.0/24 | 192.168.1.0/24
+  192.168.1.0/24 | 192.168.1.226/24
+  192.168.1.0/24 | 192.168.1.255/24
+  192.168.1.0/24 | 192.168.1.0/25
+  192.168.1.0/24 | 192.168.1.255/25
+  192.168.1.0/26 | 192.168.1.226
+ (6 rows)
+
+ SELECT * FROM inet_tbl WHERE i && '192.168.1.0/24'::cidr ORDER BY i;
+        c        |        i
+ ----------------+------------------
+  192.168.1.0/24 | 192.168.1.0/24
+  192.168.1.0/24 | 192.168.1.226/24
+  192.168.1.0/24 | 192.168.1.255/24
+  192.168.1.0/24 | 192.168.1.0/25
+  192.168.1.0/24 | 192.168.1.255/25
+  192.168.1.0/26 | 192.168.1.226
+ (6 rows)
+
+ SELECT * FROM inet_tbl WHERE i >>= '192.168.1.0/24'::cidr ORDER BY i;
+        c        |        i
+ ----------------+------------------
+  192.168.1.0/24 | 192.168.1.0/24
+  192.168.1.0/24 | 192.168.1.226/24
+  192.168.1.0/24 | 192.168.1.255/24
+ (3 rows)
+
+ SELECT * FROM inet_tbl WHERE i >> '192.168.1.0/24'::cidr ORDER BY i;
+  c | i
+ ---+---
+ (0 rows)
+
+ SELECT * FROM inet_tbl WHERE i < '192.168.1.0/24'::cidr ORDER BY i;
+       c      |      i
+ -------------+-------------
+  10.0.0.0/8  | 9.1.2.3/8
+  10.0.0.0/32 | 10.1.2.3/8
+  10.0.0.0/8  | 10.1.2.3/8
+  10.0.0.0/8  | 10.1.2.3/8
+  10.1.0.0/16 | 10.1.2.3/16
+  10.1.2.0/24 | 10.1.2.3/24
+  10.1.2.3/32 | 10.1.2.3
+  10.0.0.0/8  | 11.1.2.3/8
+ (8 rows)
+
+ SELECT * FROM inet_tbl WHERE i <= '192.168.1.0/24'::cidr ORDER BY i;
+        c        |       i
+ ----------------+----------------
+  10.0.0.0/8     | 9.1.2.3/8
+  10.0.0.0/8     | 10.1.2.3/8
+  10.0.0.0/32    | 10.1.2.3/8
+  10.0.0.0/8     | 10.1.2.3/8
+  10.1.0.0/16    | 10.1.2.3/16
+  10.1.2.0/24    | 10.1.2.3/24
+  10.1.2.3/32    | 10.1.2.3
+  10.0.0.0/8     | 11.1.2.3/8
+  192.168.1.0/24 | 192.168.1.0/24
+ (9 rows)
+
+ SELECT * FROM inet_tbl WHERE i = '192.168.1.0/24'::cidr ORDER BY i;
+        c        |       i
+ ----------------+----------------
+  192.168.1.0/24 | 192.168.1.0/24
+ (1 row)
+
+ SELECT * FROM inet_tbl WHERE i >= '192.168.1.0/24'::cidr ORDER BY i;
+          c          |        i
+ --------------------+------------------
+  192.168.1.0/24     | 192.168.1.0/24
+  192.168.1.0/24     | 192.168.1.226/24
+  192.168.1.0/24     | 192.168.1.255/24
+  192.168.1.0/24     | 192.168.1.0/25
+  192.168.1.0/24     | 192.168.1.255/25
+  192.168.1.0/26     | 192.168.1.226
+  ::ffff:1.2.3.4/128 | ::4.3.2.1/24
+  10:23::f1/128      | 10:23::f1/64
+  10:23::8000/113    | 10:23::ffff
+ (9 rows)
+
+ SELECT * FROM inet_tbl WHERE i > '192.168.1.0/24'::cidr ORDER BY i;
+          c          |        i
+ --------------------+------------------
+  192.168.1.0/24     | 192.168.1.226/24
+  192.168.1.0/24     | 192.168.1.255/24
+  192.168.1.0/24     | 192.168.1.0/25
+  192.168.1.0/24     | 192.168.1.255/25
+  192.168.1.0/26     | 192.168.1.226
+  ::ffff:1.2.3.4/128 | ::4.3.2.1/24
+  10:23::f1/128      | 10:23::f1/64
+  10:23::8000/113    | 10:23::ffff
+ (8 rows)
+
+ SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr ORDER BY i;
+          c          |        i
+ --------------------+------------------
+  10.0.0.0/8         | 9.1.2.3/8
+  10.0.0.0/8         | 10.1.2.3/8
+  10.0.0.0/32        | 10.1.2.3/8
+  10.0.0.0/8         | 10.1.2.3/8
+  10.1.0.0/16        | 10.1.2.3/16
+  10.1.2.0/24        | 10.1.2.3/24
+  10.1.2.3/32        | 10.1.2.3
+  10.0.0.0/8         | 11.1.2.3/8
+  192.168.1.0/24     | 192.168.1.226/24
+  192.168.1.0/24     | 192.168.1.255/24
+  192.168.1.0/24     | 192.168.1.0/25
+  192.168.1.0/24     | 192.168.1.255/25
+  192.168.1.0/26     | 192.168.1.226
+  ::ffff:1.2.3.4/128 | ::4.3.2.1/24
+  10:23::f1/128      | 10:23::f1/64
+  10:23::8000/113    | 10:23::ffff
+ (16 rows)
+
+ SET enable_seqscan TO on;
+ DROP INDEX inet_idx2;
  -- simple tests of inet boolean and arithmetic operators
  SELECT i, ~i AS "~i" FROM inet_tbl;
          i         |                     ~i
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index bf76501..118f7e4 100644
*** a/src/test/regress/expected/opr_sanity.out
--- b/src/test/regress/expected/opr_sanity.out
*************** ORDER BY 1, 2, 3;
*** 1118,1123 ****
--- 1118,1132 ----
          783 |           15 | <->
          783 |           16 | @>
          783 |           18 | =
+         783 |           19 | <>
+         783 |           20 | <
+         783 |           21 | <=
+         783 |           22 | >
+         783 |           23 | >=
+         783 |           24 | <<
+         783 |           25 | <<=
+         783 |           26 | >>
+         783 |           27 | >>=
          783 |           28 | <@
          783 |           48 | <@
          783 |           68 | <@
*************** ORDER BY 1, 2, 3;
*** 1153,1159 ****
         4000 |           15 | >
         4000 |           16 | @>
         4000 |           18 | =
! (71 rows)

  -- Check that all opclass search operators have selectivity estimators.
  -- This is not absolutely required, but it seems a reasonable thing
--- 1162,1168 ----
         4000 |           15 | >
         4000 |           16 | @>
         4000 |           18 | =
! (80 rows)

  -- Check that all opclass search operators have selectivity estimators.
  -- This is not absolutely required, but it seems a reasonable thing
diff --git a/src/test/regress/sql/inet.sql b/src/test/regress/sql/inet.sql
index 328f149..be078fb 100644
*** a/src/test/regress/sql/inet.sql
--- b/src/test/regress/sql/inet.sql
*************** SELECT '' AS ten, i, c,
*** 52,63 ****
    i < c AS lt, i <= c AS le, i = c AS eq,
    i >= c AS ge, i > c AS gt, i <> c AS ne,
    i << c AS sb, i <<= c AS sbe,
!   i >> c AS sup, i >>= c AS spe
    FROM INET_TBL;

  -- check the conversion to/from text and set_netmask
  SELECT '' AS ten, set_masklen(inet(text(i)), 24) FROM INET_TBL;
! -- check that index works correctly
  CREATE INDEX inet_idx1 ON inet_tbl(i);
  SET enable_seqscan TO off;
  SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr;
--- 52,65 ----
    i < c AS lt, i <= c AS le, i = c AS eq,
    i >= c AS ge, i > c AS gt, i <> c AS ne,
    i << c AS sb, i <<= c AS sbe,
!   i >> c AS sup, i >>= c AS spe,
!   i && c AS ovr
    FROM INET_TBL;

  -- check the conversion to/from text and set_netmask
  SELECT '' AS ten, set_masklen(inet(text(i)), 24) FROM INET_TBL;
!
! -- check that btree index works correctly
  CREATE INDEX inet_idx1 ON inet_tbl(i);
  SET enable_seqscan TO off;
  SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr;
*************** SELECT * FROM inet_tbl WHERE i<<='192.16
*** 65,70 ****
--- 67,89 ----
  SET enable_seqscan TO on;
  DROP INDEX inet_idx1;

+ -- check that gist index works correctly
+ CREATE INDEX inet_idx2 ON inet_tbl using gist (i inet_ops);
+ SET enable_seqscan TO off;
+ SELECT * FROM inet_tbl WHERE i << '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i <<= '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i && '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i >>= '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i >> '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i < '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i <= '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i = '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i >= '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i > '192.168.1.0/24'::cidr ORDER BY i;
+ SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr ORDER BY i;
+ SET enable_seqscan TO on;
+ DROP INDEX inet_idx2;
+
  -- simple tests of inet boolean and arithmetic operators
  SELECT i, ~i AS "~i" FROM inet_tbl;
  SELECT i, c, i & c AS "and" FROM inet_tbl;
\timing on

\set count 10000

drop table if exists random_ips;

create table random_ips as
select
  ctr as pk,
  ((random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '/' ||
   (random()*32.4)::int::text)::inet as inet
from
  generate_series(1, :count) ctr;

insert into random_ips values
  (-1, null),
  (-2, '::1'),
  (-3, '::2/64');

drop table if exists flat_ips;

create table flat_ips as
select
  ctr as pk,
  ((random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '.' ||
   (random()*255.4)::int::text || '/32')::inet as inet
from
  generate_series(1, :count) ctr;

insert into flat_ips values
  (-1, null),
  (-2, '::1'),
  (-3, '::2/64');

drop table if exists ref_results;

create table ref_results as
select
  a.pk,
  a.inet,
  (select count(*) from random_ips b where a.inet = b.inet) as n_eq,
  (select count(*) from random_ips b where a.inet <> b.inet) as n_ne,
  (select count(*) from random_ips b where a.inet < b.inet) as n_lt,
  (select count(*) from random_ips b where a.inet <= b.inet) as n_le,
  (select count(*) from random_ips b where a.inet > b.inet) as n_gt,
  (select count(*) from random_ips b where a.inet >= b.inet) as n_ge,
  (select count(*) from random_ips b where a.inet && b.inet) as n_ov,
  (select count(*) from random_ips b where a.inet << b.inet) as n_sub,
  (select count(*) from random_ips b where a.inet <<= b.inet) as n_subeq,
  (select count(*) from random_ips b where a.inet >> b.inet) as n_sup,
  (select count(*) from random_ips b where a.inet >>= b.inet) as n_supeq
from
  random_ips a;

set maintenance_work_mem = '1GB';

create index on random_ips using gist (inet inet_ops);

create index on flat_ips using gist (inet inet_ops);

set enable_seqscan TO 0;

drop table if exists test_results;

create table test_results as
select
  a.pk,
  a.inet,
  (select count(*) from random_ips b where a.inet = b.inet) as n_eq,
  (select count(*) from random_ips b where a.inet <> b.inet) as n_ne,
  (select count(*) from random_ips b where a.inet < b.inet) as n_lt,
  (select count(*) from random_ips b where a.inet <= b.inet) as n_le,
  (select count(*) from random_ips b where a.inet > b.inet) as n_gt,
  (select count(*) from random_ips b where a.inet >= b.inet) as n_ge,
  (select count(*) from random_ips b where a.inet && b.inet) as n_ov,
  (select count(*) from random_ips b where a.inet << b.inet) as n_sub,
  (select count(*) from random_ips b where a.inet <<= b.inet) as n_subeq,
  (select count(*) from random_ips b where a.inet >> b.inet) as n_sup,
  (select count(*) from random_ips b where a.inet >>= b.inet) as n_supeq
from
  random_ips a;

set enable_seqscan TO 1;

select * from ref_results r join test_results t using (pk)
where
  r.n_eq <> t.n_eq OR
  r.n_ne <> t.n_ne OR
  r.n_lt <> t.n_lt OR
  r.n_le <> t.n_le OR
  r.n_gt <> t.n_gt OR
  r.n_ge <> t.n_ge OR
  r.n_ov <> t.n_ov OR
  r.n_sub <> t.n_sub OR
  r.n_subeq <> t.n_subeq OR
  r.n_sup <> t.n_sup OR
  r.n_supeq <> t.n_supeq
;

-- these are for performance testing
set enable_hashjoin to 0;
set enable_mergejoin to 0;
set enable_material to 0;

explain analyze select count(*) from random_ips a, random_ips b
  where a.inet = b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet <> b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet < b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet <= b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet > b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet >= b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet && b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet << b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet <<= b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet >> b.inet;
explain analyze select count(*) from random_ips a, random_ips b
  where a.inet >>= b.inet;

explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet = b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet <> b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet < b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet <= b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet > b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet >= b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet && b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet << b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet <<= b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet >> b.inet;
explain analyze select count(*) from random_ips a, flat_ips b
  where a.inet >>= b.inet;

set enable_hashjoin to 1;
set enable_mergejoin to 1;
set enable_material to 1;

Re: GiST support for inet datatypes

From
Tom Lane
Date:
I wrote:
> [ inet-gist-v6.patch ]

Committed with some additional documentation work.  Thanks for
submitting this!
        regards, tom lane



Re: GiST support for inet datatypes

From
Robert Haas
Date:
On Tue, Apr 8, 2014 at 3:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> [ inet-gist-v6.patch ]
>
> Committed with some additional documentation work.  Thanks for
> submitting this!

NICE.  I'd like to tell you how excited I am about this part:

# It also handles a new network
# operator inet && inet (overlaps, a/k/a "is supernet or subnet of"),
# which is expected to be useful in exclusion constraints.

...but I can't, because my mouth is too full of drool.  Wouldn't you
really want that more for cidr than for inet, though?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GiST support for inet datatypes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> NICE.  I'd like to tell you how excited I am about this part:

> # It also handles a new network
> # operator inet && inet (overlaps, a/k/a "is supernet or subnet of"),
> # which is expected to be useful in exclusion constraints.

> ...but I can't, because my mouth is too full of drool.  Wouldn't you
> really want that more for cidr than for inet, though?

Probably, but it works with either since they're binary-compatible.
        regards, tom lane



Re: GiST support for inet datatypes

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Apr 8, 2014 at 3:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I wrote:
> >> [ inet-gist-v6.patch ]
> >
> > Committed with some additional documentation work.  Thanks for
> > submitting this!
>
> NICE.  I'd like to tell you how excited I am about this part:
>
> # It also handles a new network
> # operator inet && inet (overlaps, a/k/a "is supernet or subnet of"),
> # which is expected to be useful in exclusion constraints.
>
> ...but I can't, because my mouth is too full of drool.  Wouldn't you
> really want that more for cidr than for inet, though?

You realize ip4r has had all this and more for, like, forever, right?
It's also more efficient wrt storage and generally more 'sane' regarding
operators, etc..

Just thought I'd share..  If you have a use-case, ip4r is what
everyone's been using for quite some time.  Also, yes, it supports both
ipv4 and ipv6, despite the name.
Thanks,
    Stephen

Re: GiST support for inet datatypes

From
Emre Hasegeli
Date:
Tom Lane <tgl@sss.pgh.pa.us>:
> Committed with some additional documentation work.  Thanks for
> submitting this!

Thank you for committing. I had not thought of using different structure
for the index. It works faster with my test case, too.

I am sending rebased version of the consistent operator names patch
in case you would like to include it to the same release. This is what
I wrote about this change before:

> That is why I prepared it as a separate patch on top of the others. It is
> not only consistency with the range types: <@ and @> symbols used for
> containment everywhere except the inet data types, particularly on
> the geometric types, arrays; cube, hstore, intaray, ltree extensions.
> The patch does not just change the operator names, it leaves the old ones,
> adds new operators with GiST support and changes the documentation, like
> your commit ba920e1c9182eac55d5f1327ab0d29b721154277 back in 2006. I could
> not find why did you leave the inet operators unchanged on that commit,
> in the mailing list archives [1]. GiST support will be a promotion for
> users to switch to the new operators, if we make this change with it.
>
> This change will also indirectly deprecate the undocumented non-transparent
> btree index support that works sometimes for some of the subnet inclusion
> operators [2].
>
> [1] http://www.postgresql.org/message-id/14277.1157304939@sss.pgh.pa.us
>
> [2] http://www.postgresql.org/message-id/389.1042992616@sss.pgh.pa.us

Attachment