Thread: Repair plan for inet and cidr types

Repair plan for inet and cidr types

From
Peter Eisentraut
Date:
As we know, the inet and cidr types are still broken in several ways,
amongst others input and output functions, operators, ordering. I've
collected the bug reports from the last year or so from the archives.

There's apparently a lack of understanding of what exactly are these types
are supposed to do. Therefore, instead of addressing each bug
individually, let me first state what I reconstructed as the specification
of these types, and then add what is currently wrong with it.

* CIDR

The cidr type stores the identity of an IP _network_. A network
specification is of the form 'x.x.x.x/y'. The documentation states that if
y is omitted then it is constructed from the old A, B, C class scheme. So
be it. In a real world network, the bits (y+1)...32 have to be zero, but
the cidr type does not currently enforce this. This has been the source of
bugs in the past, and no doubt the source of some confusion as well. I
propose that cidr _reject_ input of the form '127.0.0.5/16'. If you think
about it, this is the same as int4 rejecting 3.5 as input.

* INET

The inet type stores the identity of an IP _host_. A host specification is
of the form 'x.x.x.x'. Optionally, the inet type also stores the identity
of the network the host is in. E.g., '127.0.0.5/16' means the host
127.0.0.5 in the network 127.0/16.

* Type equivalency

This has also been a source of problems. I propose that cidr and inet are
not made equivalent types at any level. No automatic casting either. A
network and a host are not the same thing. To construct a cidr value from
an inet value, you'd have to use some sort of (to be created) network()
function, e.g., network('127.0.0.5/16') => '127.0/16'. IMO, there is no
reasonable way to construct an inet value from a cidr value.

* Operators

Because the types are equivalent, the operators have also been bunched
together in confusing ways. I propose that ordering operators (>, +, <)
between inet and cidr be eliminated, they do not make sense. The only
useful operation between cidr and inet is the << ("contains") operator.
Ordering withing cidr and inet be defined in terms of their bit
representation, as is the case now. The << family of operators should also
be removed for the inet type -- a host cannot "contain" another host. What
you probably wanted is `inet1 << network(inet2)'.


Does anyone see this differently? If not, can we agree on this
specification?

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: Repair plan for inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Peter Eisentraut
> There's apparently a lack of understanding of what exactly are these types
> are supposed to do. Therefore, instead of addressing each bug
> individually, let me first state what I reconstructed as the specification
> of these types, and then add what is currently wrong with it.

I have been browsing through the old messages on the topic.  There was, in
fact some very good work defining the type before anyone actually started
to code.  There was a surprising amount of controversy over the actual
definitions but I think in the end we hammered it out at least to the
point that everyone could work with it.

> * CIDR
> 
> The cidr type stores the identity of an IP _network_. A network
> specification is of the form 'x.x.x.x/y'. The documentation states that if
> y is omitted then it is constructed from the old A, B, C class scheme. So
> be it. In a real world network, the bits (y+1)...32 have to be zero, but
> the cidr type does not currently enforce this. This has been the source of
> bugs in the past, and no doubt the source of some confusion as well. I
> propose that cidr _reject_ input of the form '127.0.0.5/16'. If you think
> about it, this is the same as int4 rejecting 3.5 as input.

There is also the option of accepting it but masking out the host bits
before storing it.  That gives us automatic conversion if we store an
inet into a cidr if our intent is to store the network part.

What sort of bugs do you think it caused btw?

> * INET
> 
> The inet type stores the identity of an IP _host_. A host specification is
> of the form 'x.x.x.x'. Optionally, the inet type also stores the identity
> of the network the host is in. E.g., '127.0.0.5/16' means the host
> 127.0.0.5 in the network 127.0/16.

That sounds right.  We also allowed for hosts to be stored implicitely by
simply making the netmask /32.

> * Type equivalency
> 
> This has also been a source of problems. I propose that cidr and inet are
> not made equivalent types at any level. No automatic casting either. A
> network and a host are not the same thing. To construct a cidr value from
> an inet value, you'd have to use some sort of (to be created) network()
> function, e.g., network('127.0.0.5/16') => '127.0/16'. IMO, there is no
> reasonable way to construct an inet value from a cidr value.

I'm not sure I understand why this is necessary.  I can see not allowing
cidr ==> inet conversions but inet ==> cidr can be done as it is a matter
of dropping information - the host part.


> * Operators
> 
> Because the types are equivalent, the operators have also been bunched
> together in confusing ways. I propose that ordering operators (>, +, <)
> between inet and cidr be eliminated, they do not make sense. The only
> useful operation between cidr and inet is the << ("contains") operator.
> Ordering withing cidr and inet be defined in terms of their bit
> representation, as is the case now. The << family of operators should also
> be removed for the inet type -- a host cannot "contain" another host. What
> you probably wanted is `inet1 << network(inet2)'.

Then let's define that as the meaning of "inet1 << inet2" i.e. define
the << operator between inet types as meaning "tell me if inet1 is in
the same network as inet2."  In fact, if we define << as only allowed
between inet and cidr (or cidr and cidr?) then the implied cast will
deal with it if that cast causes the host bits to drop as suggested
above.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Repair plan for inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Peter Eisentraut
> network and a host are not the same thing. To construct a cidr value from
> an inet value, you'd have to use some sort of (to be created) network()
> function, e.g., network('127.0.0.5/16') => '127.0/16'. IMO, there is no

Oh, I forgot to mention:

darcy=> select network('127.1.2.3/24'::inet);
network   
----------
127.1.2/24
(1 row)

There is also a host and netmask function and note:

darcy=> select host('127.1.2.3/24'::cidr);
ERROR:  CIDR type has no host part

But I still see no reason why that can't be implicit if we assign the
"'127.1.2.3/24'::inet" value to a cidr.  In other words let "select
('127.1.2.3/24'::inet)::cidr" give the same output.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Repair plan for inet and cidr types

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> There's apparently a lack of understanding of what exactly are these types
> are supposed to do. Therefore, instead of addressing each bug
> individually, let me first state what I reconstructed as the specification
> of these types, and then add what is currently wrong with it.

This sounds good offhand, but then I never paid a whole lot of attention
to the details originally.  Did you go through the original inet/cidr
design discussions (the threads where Paul Vixie was participating)?

I don't believe Paul is subscribed here anymore, but I'd feel a lot
happier if you can contact him and get him to sign off on the clarified
design.  Maybe this is what he had in mind all along, or maybe not.
        regards, tom lane

PS: You do know who Paul Vixie is, I assume ;-).  I can think of few
better-qualified experts in this domain...


Re: Repair plan for inet and cidr types

From
Sevo Stille
Date:
"D'Arcy J.M. Cain" wrote:

> > This has also been a source of problems. I propose that cidr and inet are
> > not made equivalent types at any level. No automatic casting either. A
> > network and a host are not the same thing. To construct a cidr value from
> > an inet value, you'd have to use some sort of (to be created) network()
> > function, e.g., network('127.0.0.5/16') => '127.0/16'. IMO, there is no
> > reasonable way to construct an inet value from a cidr value.
> 
> I'm not sure I understand why this is necessary.  I can see not allowing
> cidr ==> inet conversions but inet ==> cidr can be done as it is a matter
> of dropping information - the host part.

Given that only contains is a reasonable cidr vs. inet comparison, we
should not cast automatically - both the network >> network and network
>> host comparison are legitimate. 

> > * Operators
> >
> > Because the types are equivalent, the operators have also been bunched
> > together in confusing ways. I propose that ordering operators (>, +, <)
> > between inet and cidr be eliminated, they do not make sense. The only
> > useful operation between cidr and inet is the << ("contains") operator.
> > Ordering withing cidr and inet be defined in terms of their bit
> > representation, as is the case now. The << family of operators should also
> > be removed for the inet type -- a host cannot "contain" another host. What
> > you probably wanted is `inet1 << network(inet2)'.
> 
> Then let's define that as the meaning of "inet1 << inet2" i.e. define
> the << operator between inet types as meaning "tell me if inet1 is in
> the same network as inet2."  

If you do not mean "network(inet1) <<= network(inet2)" but rather
"host(inet1) << network(inet2)", yes.

> In fact, if we define << as only allowed
> between inet and cidr (or cidr and cidr?) then the implied cast will
> deal with it if that cast causes the host bits to drop as suggested
> above.

Right. The containing side must be a network, so we can implicitly cast
that. The contained side has to be taken as-is - if a network vs.
network comparison is intended, the user must explicitly cast it. 

Sevo

-- 
Sevo Stille
sevo@ip23.net


Re: Repair plan for inet and cidr types

From
eisentrp@csis.gvsu.edu
Date:
On Tue, 4 Jul 2000, D'Arcy J.M. Cain wrote:

> I'm not sure I understand why this is necessary.  I can see not allowing
> cidr ==> inet conversions but inet ==> cidr can be done as it is a matter
> of dropping information - the host part.

Automatic casts should not lose information. How would you feel if floats
were automatically rounded when you store them into int fields? I think
this is an important principle in any type system.

> Then let's define that as the meaning of "inet1 << inet2" i.e. define
> the << operator between inet types as meaning "tell me if inet1 is in
> the same network as inet2."

Again, let the user say what he wants: inet1 << network(inet2).

Also note that "is inet1 in the same network as inet2" is different from
"is inet1 contained in the network of inet2" (which is what it does now).
The operator you defined is symmetric (if inet1 is in the same network as
inet2, then inet2 is also in the same network as inet1), whereas the << is
antisymmetric. In fact, AFAICT, the operator you defined doesn't exist
yet, although it perhaps should.


-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: Repair plan for inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake eisentrp@csis.gvsu.edu
> On Tue, 4 Jul 2000, D'Arcy J.M. Cain wrote:
> > I'm not sure I understand why this is necessary.  I can see not allowing
> > cidr ==> inet conversions but inet ==> cidr can be done as it is a matter
> > of dropping information - the host part.
> 
> Automatic casts should not lose information. How would you feel if floats
> were automatically rounded when you store them into int fields? I think
> this is an important principle in any type system.

If it was defined well I would have no problem with it.

> > Then let's define that as the meaning of "inet1 << inet2" i.e. define
> > the << operator between inet types as meaning "tell me if inet1 is in
> > the same network as inet2."
> 
> Again, let the user say what he wants: inet1 << network(inet2).

I think that's what I meant.  I'm just saying that inet::cidr should be
the same as network(inet).  Allowing that cast makes a lot of operations
work intuitively.

> Also note that "is inet1 in the same network as inet2" is different from
> "is inet1 contained in the network of inet2" (which is what it does now).

Hmm.  It is a subtle difference and I did miss it.

> The operator you defined is symmetric (if inet1 is in the same network as
> inet2, then inet2 is also in the same network as inet1), whereas the << is
> antisymmetric. In fact, AFAICT, the operator you defined doesn't exist
> yet, although it perhaps should.

I guess what I was really getting at was this.
  host OP cidr

where inet would cast to host on one side and cidr on the other.  What
we have now is 
  cidr OP cidr

with both sides casting to cidr.  Of course there is no such thing as a host
type so I don't know how we would cast such a thing.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Repair plan for inet and cidr types

From
Peter Eisentraut
Date:
D'Arcy J.M. Cain writes:

> > Automatic casts should not lose information. How would you feel if floats
> > were automatically rounded when you store them into int fields? I think
> > this is an important principle in any type system.
> 
> If it was defined well I would have no problem with it.

That is certainly not how type systems operate anywhere.

> I guess what I was really getting at was this.
> 
>    host OP cidr
> 
> where inet would cast to host on one side and cidr on the other.  What
> we have now is 
> 
>    cidr OP cidr
> 
> with both sides casting to cidr.  Of course there is no such thing as a host
> type so I don't know how we would cast such a thing.

I think that while the implicit casting could sometimes be convenient,
it's also a source of confusion. Consider the statement

select '10.0.0.3'::cidr < '10.0.0.2'::inet;    => f

This cannot possibly make sense on closer inspection. Firstly, it's
semantic nonsense, you cannot order a network and a host. Secondly, it's
also wrong. According to the documentation, the '10.0.0.3'::cidr should be
converted to '10/8' internally. Then one of two things could have happened
here: 1) cidr was implicitly converted to inet and '10.0.0.3' is taken to
be a host, which is completely wrong. Or 2) inet was converted to cidr.
But then we're looking at '10/8' < '10.0.0.2/32', which should be true.

See also

select '10.0.0.2'::cidr = '10.0.0.2'::inet;    => t

which is wrong for similar reasons.


Then let's look at the << family of operators.

select '10.0.0.2'::cidr >> '10.0.0.2'::inet;    => f

Again, there are two ways this could currently be resolved:
'10/8'::cidr >> '10.0.0.2/32'::cidr    which does return true
or'10.0.0.2'::inet >> '10.0.0.2'::inet
which doesn't make any sense.

On closer inspection, the inet << cidr case is completely misbehaving:

select '10.0.0.5/8'::inet << '10.0.0.0/16'::cidr;    => f
select '10.0.0.5/24'::inet << '10.0.0.0/16'::cidr;    => t

This is not what I'd expect.

Concretely, the casesinet << cidrcidr << cidr
are not the same:
'10.0.0.5/8'::inet << '10.0.0.0/16'::cidr
should be true
'10.0.0.5/8'::cidr << '10.0.0.0/16'::cidr
should be false, if you allow the left-side value in at all, which I
wouldn't.

What this tells me is that the cast from inet to cidr is not well-defined
in the mathematical sense, and therefore no implicit casting should be
allowed.

So the bottom line here is that these two types are, while from a related
domain, different, and the user should be the one that controls when and
how they are mixed together.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden