Thread: Cleaning up the INET/CIDR mess

Cleaning up the INET/CIDR mess

From

Tom Lane

Date:

24 January 2006, 14:23:20

We've had previous discussions about how the distinction between INET
and CIDR isn't very well thought out, for instance
http://archives.postgresql.org/pgsql-hackers/2005-01/msg01021.php
http://archives.postgresql.org/pgsql-hackers/2006-01/msg00233.php

The basic problem is that the code is schizophrenic about whether these
types are "the same" or not.  The fact that we have implicit binary
(no function) coercions in both directions makes them effectively the
same, so why are there two different type names in the catalogs?
On the other hand, if they should be different (and they definitely
have different I/O behavior), this coercion behavior is wrong.  Also,
if they are different types, it's redundant to have a flag inside the
data structure saying which type a particular value is.

After some consideration I've come to the conclusion that we really do
want them to be separate types: the I/O behavior is settled (after quite
some long discussions) and we don't want to change it, so we can't merge
them into one type.  That leads to the following proposals:

Remove the internal is_cidr flag; it's a waste of space.  (It doesn't
actually cost anything today, because of alignment considerations, but
it would cost 2 bytes if we implement the proposed 2-byte-length-word
variant datum format.)  Even more to the point, the presence of the
flag has encouraged the sort of sloppy thinking and coding that got us
into this mess.  Whether it's an INET or a CIDR should be totally
determined by the SQL type system.

Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no
function) conversion.  However, inet-to-cidr has to either zero out bits
to the right of the netmask, or error out if any are set.  Joachim Wieland
posted a patch that makes the coercion function just silently zero out any
such bits.  That's OK with me, but does anyone want to argue for an error?
(If we do make the coercion function raise error, then we'd probably need
to provide a separate function that supports the bit-zeroing conversion.)

Currently, both directions of cast are implicit, but that is a bad idea.
I propose keeping cidr-to-inet as implicit but making inet-to-cidr an
assignment cast.  This fits with the fact that inet can represent all
values of cidr but not vice versa (compare int4 and int8).

Given the implicit binary-compatible coercion, it's OK to have just a
single function taking inet for any case where the function truly doesn't
care if it's looking at inet or cidr input.  For the cases where the code
currently pays attention to is_cidr, we'd have to make two separate
functions.  AFAICT that's only abbrev(inet) and text(inet) among the
current functions.  Also, set_masklen(inet,integer) would have to come
in two flavors since the output type should be the same as the input.

The relationship of cidr and inet would be a little bit like the relation
between varchar and text.  For instance, varchar doesn't have any
comparison operators of its own, but piggybacks on text's comparison
operators, relying on the implicit cast from varchar to text to make this
transparent to users.

One other point is what to do with the binary I/O functions (send/receive)
for inet and cidr.  I think that we should continue to send the is_cidr
flag byte for backwards-compatibility reasons.  On receive, we could
either ignore that byte entirely, or insist that it match the expected
datatype.  I'm inclined to ignore the byte but am willing to listen to
arguments to raise an error instead.

Comments?
        regards, tom lane

Re: Cleaning up the INET/CIDR mess

From

Andrew Sullivan

Date:

24 January 2006, 17:54:19

On Tue, Jan 24, 2006 at 01:23:17PM -0500, Tom Lane wrote:

> Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no
> function) conversion.  However, inet-to-cidr has to either zero out bits
> to the right of the netmask, or error out if any are set.  Joachim Wieland
> posted a patch that makes the coercion function just silently zero out any
> such bits.  That's OK with me, but does anyone want to argue for an error?
> (If we do make the coercion function raise error, then we'd probably need
> to provide a separate function that supports the bit-zeroing conversion.)

I'd argue for an error, on correctness grounds (someone's bound to
come back having misused these, and complain that it silently changed
data.  They'd have a point).

> One other point is what to do with the binary I/O functions (send/receive)
> for inet and cidr.  I think that we should continue to send the is_cidr
> flag byte for backwards-compatibility reasons.  On receive, we could
> either ignore that byte entirely, or insist that it match the expected
> datatype.  I'm inclined to ignore the byte but am willing to listen to
> arguments to raise an error instead.

If this is exposed to users in some way (I don't think it is, is it?)
then I'd argue for erroring, on the same grounds of what I say above. 
But otherwise, I think you could ignore it.

A


-- 
Andrew Sullivan  | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what 
you told them to.  That actually seems sort of quaint now.    --J.D. Baldwin

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

24 January 2006, 19:54:59

On 2006-01-24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no
> function) conversion.  However, inet-to-cidr has to either zero out bits
> to the right of the netmask, or error out if any are set.  Joachim Wieland
> posted a patch that makes the coercion function just silently zero out any
> such bits.  That's OK with me, but does anyone want to argue for an error?
> (If we do make the coercion function raise error, then we'd probably need
> to provide a separate function that supports the bit-zeroing conversion.)
>
> Currently, both directions of cast are implicit, but that is a bad idea.
> I propose keeping cidr-to-inet as implicit but making inet-to-cidr an
> assignment cast.  This fits with the fact that inet can represent all
> values of cidr but not vice versa (compare int4 and int8).

If inet-to-cidr can zero out bits silently, then wouldn't making it an
assignment cast be rather dangerous and error-prone?

> Given the implicit binary-compatible coercion, it's OK to have just a
> single function taking inet for any case where the function truly doesn't
> care if it's looking at inet or cidr input.  For the cases where the code
> currently pays attention to is_cidr, we'd have to make two separate
> functions.  AFAICT that's only abbrev(inet) and text(inet) among the
> current functions.  Also, set_masklen(inet,integer) would have to come
> in two flavors since the output type should be the same as the input.

You sometimes need set_masklen(cidr,integer) returning inet, and I'd bet
there's existing code that does that.

> The relationship of cidr and inet would be a little bit like the relation
> between varchar and text.  For instance, varchar doesn't have any
> comparison operators of its own, but piggybacks on text's comparison
> operators, relying on the implicit cast from varchar to text to make this
> transparent to users.

Well, inet/cidr have far more justification for being separate types than
text/varchar do - the text/varchar issue causes a great deal of confusion.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Bruce Momjian

Date:

25 January 2006, 00:22:23

This is exactly what I had in mind:
split the types    zero out the bits going to cidrno change going to inetmake functions take inet, which as not cast
change

---------------------------------------------------------------------------

Tom Lane wrote:
> We've had previous discussions about how the distinction between INET
> and CIDR isn't very well thought out, for instance
> http://archives.postgresql.org/pgsql-hackers/2005-01/msg01021.php
> http://archives.postgresql.org/pgsql-hackers/2006-01/msg00233.php
> 
> The basic problem is that the code is schizophrenic about whether these
> types are "the same" or not.  The fact that we have implicit binary
> (no function) coercions in both directions makes them effectively the
> same, so why are there two different type names in the catalogs?
> On the other hand, if they should be different (and they definitely
> have different I/O behavior), this coercion behavior is wrong.  Also,
> if they are different types, it's redundant to have a flag inside the
> data structure saying which type a particular value is.
> 
> After some consideration I've come to the conclusion that we really do
> want them to be separate types: the I/O behavior is settled (after quite
> some long discussions) and we don't want to change it, so we can't merge
> them into one type.  That leads to the following proposals:
> 
> Remove the internal is_cidr flag; it's a waste of space.  (It doesn't
> actually cost anything today, because of alignment considerations, but
> it would cost 2 bytes if we implement the proposed 2-byte-length-word
> variant datum format.)  Even more to the point, the presence of the
> flag has encouraged the sort of sloppy thinking and coding that got us
> into this mess.  Whether it's an INET or a CIDR should be totally
> determined by the SQL type system.
> 
> Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no
> function) conversion.  However, inet-to-cidr has to either zero out bits
> to the right of the netmask, or error out if any are set.  Joachim Wieland
> posted a patch that makes the coercion function just silently zero out any
> such bits.  That's OK with me, but does anyone want to argue for an error?
> (If we do make the coercion function raise error, then we'd probably need
> to provide a separate function that supports the bit-zeroing conversion.)
> 
> Currently, both directions of cast are implicit, but that is a bad idea.
> I propose keeping cidr-to-inet as implicit but making inet-to-cidr an
> assignment cast.  This fits with the fact that inet can represent all
> values of cidr but not vice versa (compare int4 and int8).
> 
> Given the implicit binary-compatible coercion, it's OK to have just a
> single function taking inet for any case where the function truly doesn't
> care if it's looking at inet or cidr input.  For the cases where the code
> currently pays attention to is_cidr, we'd have to make two separate
> functions.  AFAICT that's only abbrev(inet) and text(inet) among the
> current functions.  Also, set_masklen(inet,integer) would have to come
> in two flavors since the output type should be the same as the input.
> 
> The relationship of cidr and inet would be a little bit like the relation
> between varchar and text.  For instance, varchar doesn't have any
> comparison operators of its own, but piggybacks on text's comparison
> operators, relying on the implicit cast from varchar to text to make this
> transparent to users.
> 
> One other point is what to do with the binary I/O functions (send/receive)
> for inet and cidr.  I think that we should continue to send the is_cidr
> flag byte for backwards-compatibility reasons.  On receive, we could
> either ignore that byte entirely, or insist that it match the expected
> datatype.  I'm inclined to ignore the byte but am willing to listen to
> arguments to raise an error instead.
> 
> Comments?
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

25 January 2006, 12:22:09

Andrew - Supernews <andrew+nonews@supernews.com> writes:

> If inet-to-cidr can zero out bits silently, then wouldn't making it an
> assignment cast be rather dangerous and error-prone?

The proposal was to make cidr-to-inet an implicit cast (happens automatically
without being requested) but make cidr-to-inet not implicit (you need to type
myinet::cidr or the "CAST" syntax to get it to happen).

Making the cast zero bits or making it error and having a separate function to
handle the conversion both seem entirely reasonable approaches.

I wonder if this would be an opportunity to fix Postgres's handling of
addresses like '10.1'. The standard interpretation of this is the same as
'10.0.0.1'. Currently Postgres interprets it as '10.1.0.0' for cidr and gives
an error for inet.

Perhaps if they become truly separate types then we could leave the cidr
behaviour alone since it is common to write things like 10.1/16 to refer to
the network prefix. But let inet have the standard interpretation.

It might be confusing but actually it's entirely logical. cidr is a type for
representing address ranges. '10.1/16'::cidr represents a range of addresses
from 10.1.0.0-10.1.255.255. There would be no reasonable use case for 10.0.0.1
in a cidr unless it was /32 which would refer to a specific address, in which
case it would make more sense to use an inet.

On the other hand '10.1'::inet represents a specific address and it's clear
which address it means -- it's even required to be interpreted this way by the
POSIX functions. Erroring out as Postgres currently does means applications
built on Postgres don't accept standard inernet address syntax.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Tom Lane

Date:

25 January 2006, 12:43:55

Greg Stark <gsstark@mit.edu> writes:
> I wonder if this would be an opportunity to fix Postgres's handling of
> addresses like '10.1'.

You've mistaken this for a proposal to change the I/O behavior, which
it is specifically not.

> The standard interpretation of this is the same as '10.0.0.1'.

Standard according to whom?  Paul Vixie evidently doesn't think that
that's a standard abbreviation, else the code we borrowed from libbind
would do it already.
        regards, tom lane

Re: Cleaning up the INET/CIDR mess

From

"Larry Rosenman"

Date:

25 January 2006, 13:01:09

Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> I wonder if this would be an opportunity to fix Postgres's handling
>> of addresses like '10.1'.
> 
> You've mistaken this for a proposal to change the I/O behavior, which
> it is specifically not.
> 
>> The standard interpretation of this is the same as '10.0.0.1'.
> 
> Standard according to whom?  Paul Vixie evidently doesn't think that
> that's a standard abbreviation, else the code we borrowed from libbind
> would do it already.

We had a **LONG** discussion on the I/O formats back in the 7.2 timeframe.
the current
behavior is the result of that.

Please do **NOT** change the I/O formats.

LER
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 512-248-2683                 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893

Re: Cleaning up the INET/CIDR mess

From

Bruce Momjian

Date:

25 January 2006, 13:30:09

Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > I wonder if this would be an opportunity to fix Postgres's handling of
> > addresses like '10.1'.
> 
> You've mistaken this for a proposal to change the I/O behavior, which
> it is specifically not.
> 
> > The standard interpretation of this is the same as '10.0.0.1'.
> 
> Standard according to whom?  Paul Vixie evidently doesn't think that
> that's a standard abbreviation, else the code we borrowed from libbind
> would do it already.

Agreed.  10.1 as 10.0.0.1 is an old behavior which has been removed from
most modern versions of networking tools.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

25 January 2006, 13:41:13

On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> Agreed.  10.1 as 10.0.0.1 is an old behavior which has been removed from
> most modern versions of networking tools.

Indeed so. However the current behaviour has neither the merit of being
traditional nor the merit of being logical:

=> select '10.1'::cidr;   cidr     
-------------10.1.0.0/16
(1 row)

=> select '128.1'::cidr;    cidr     
--------------128.1.0.0/16
(1 row)

=> select '192.1'::cidr;    cidr     
--------------192.1.0.0/24
(1 row)

Having the behaviour be dependent on which part of the IP space is used
is a total nonsense on the modern, CIDR, internet! The C in CIDR even
stands for "Classless", so how can you ever justify introducing _new_,
non-traditional, dependencies on the traditional classes?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Bruce Momjian

Date:

25 January 2006, 14:08:14

Andrew - Supernews wrote:
> On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> > Agreed.  10.1 as 10.0.0.1 is an old behavior which has been removed from
> > most modern versions of networking tools.
> 
> Indeed so. However the current behaviour has neither the merit of being
> traditional nor the merit of being logical:
> 
> => select '10.1'::cidr;
>     cidr     
> -------------
>  10.1.0.0/16
> (1 row)
> 
> => select '128.1'::cidr;
>      cidr     
> --------------
>  128.1.0.0/16
> (1 row)
> 
> => select '192.1'::cidr;
>      cidr     
> --------------
>  192.1.0.0/24
> (1 row)
> 
> Having the behaviour be dependent on which part of the IP space is used
> is a total nonsense on the modern, CIDR, internet! The C in CIDR even
> stands for "Classless", so how can you ever justify introducing _new_,
> non-traditional, dependencies on the traditional classes?

This is coming from inet_net_pton.c, which we got from:
 * Copyright (c) 2004 by Internet Systems Consortium, Inc. ("ISC") * Copyright (c) 1996,1999 by Internet Software
Consortium.
...
if (bits == -1){    if (*odst >= 240)       /* Class E */        bits = 32;    else if (*odst >= 224)  /* Class D */
   bits = 8;    else if (*odst >= 192)  /* Class C */        bits = 24;    else if (*odst >= 128)  /* Class B */
bits= 16;    else        /* Class A */        bits = 8;       /* If imputed mask is narrower than specified octets,
widen.*/       if (bits < ((dst - odst) * 8))           bits = (dst - odst) * 8;
 

...
test=>  select '11'::cidr;    cidr------------ 11.0.0.0/8(1 row)

We are doing our best here to follow industry standards on how things
should behave.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

25 January 2006, 14:17:34

Andrew - Supernews <andrew+nonews@supernews.com> writes:

> On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> > Agreed.  10.1 as 10.0.0.1 is an old behavior which has been removed from
> > most modern versions of networking tools.

On the contrary not only is it still widely used but it is *required* by POSIX
for the relevant functions, inet_aton and getaddrinfo. Note that getaddrinfo
was created from whole cloth by POSIX so there was no backwards compatibility
need for it.

This isn't an obscure old-fashioned thing. People really do use this syntax.

> Indeed so. However the current behaviour has neither the merit of being
> traditional nor the merit of being logical:

Well for networks (cidr datatype) people do frequently refer to things like
10.1/16 and intend it to mean the network prefix. Sure you could argue having
the netmask default to the old class-based addressing is anachronistic but
what other default netmask would you suggest anyways? The only other
reasonable default would be the longest 0-bit suffix which would produce some
odd surprising results like '10.1/16' but '10.2/17'.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

25 January 2006, 14:24:39

On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> Andrew - Supernews wrote:
>> Having the behaviour be dependent on which part of the IP space is used
>> is a total nonsense on the modern, CIDR, internet! The C in CIDR even
>> stands for "Classless", so how can you ever justify introducing _new_,
>> non-traditional, dependencies on the traditional classes?
>
> This is coming from inet_net_pton.c, which we got from:

That the behaviour comes from ISC code does not make it correct.

> We are doing our best here to follow industry standards on how things
> should behave.

This is not an "industry standard".

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

25 January 2006, 14:26:38

I have a question in a different direction. What is the meaning of the network
mask in the inet data type anyways? Hosts don't have network masks, only
networks.

If we could store inet in four bytes it would be vastly more efficient both in
disk space usage and in cpu at runtime.

I think it would also clear up the perpetual user confusion between the two
datatypes. I posit that the main source of the confusion is that currently
Postgres lets you use inet for everything, even if what you're really storing
is a network address range which is what the cidr datatype is really for.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

25 January 2006, 14:31:00

On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote:
> This isn't an obscure old-fashioned thing. People really do use this syntax.

Given how little code now supports 10.1 meaning 10.0.0.1, that seems a
questionable point.

>> Indeed so. However the current behaviour has neither the merit of being
>> traditional nor the merit of being logical:
>
> Well for networks (cidr datatype) people do frequently refer to things like
> 10.1/16 and intend it to mean the network prefix.

Do you mean they refer to '10.1' and intend it to mean '10.1/16'? If so I
agree; but in that case, not only should '10.1' mean '10.1/16', but also
'192.1' should mean '192.1/16' and _NOT_ '192.1/24'.

> Sure you could argue having
> the netmask default to the old class-based addressing is anachronistic but
> what other default netmask would you suggest anyways?

The one implied by the number of octets specified, assuming you are going
to accept the abbreviated forms at all.

(FWIW, ip4r at this time does not even accept '10.1/16', it insists on
'10.1.0.0/16'.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Stephen Frost

Date:

25 January 2006, 14:35:38

* Greg Stark (gsstark@mit.edu) wrote:
> I have a question in a different direction. What is the meaning of the network
> mask in the inet data type anyways? Hosts don't have network masks, only
> networks.
>
> If we could store inet in four bytes it would be vastly more efficient both in
> disk space usage and in cpu at runtime.
>
> I think it would also clear up the perpetual user confusion between the two
> datatypes. I posit that the main source of the confusion is that currently
> Postgres lets you use inet for everything, even if what you're really storing
> is a network address range which is what the cidr datatype is really for.

I wholeheartedly agree with this.  It also makes the only cast option
from inet to cidr to be with a /32 and thus there's no zeroing of bits.
It would then be nice to have a function to which you pass in a cidr and
a netmask and say "give me the CIDR this CIDR is in with this mask".
With the inet-to-cidr implicit cast you could use this function to get
the CIDR you want without having to worry about the implicit cast
throwing data away.  The cidr-to-inet is then throwing away the mask and
so I'm not sure we should have an implicit cast in that direction but
honestly I wouldn't complain if we did.
Thanks,
    Stephen

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

25 January 2006, 14:39:14

On 2006-01-25, Greg Stark <gsstark@MIT.EDU> wrote:
> I have a question in a different direction. What is the meaning of the
> network mask in the inet data type anyways? Hosts don't have network masks,
> only networks.

As far as I can tell, the "inet" semantics are supposed to represent a
network interface, rather than just an address. So it designates a network
and a host within that network. This is a significant semantic overload,
which is not relevent to many applications and may be counterproductive
(for example, if you had a database of hosts and networks, the network info
would more correctly be accessed via a reference to a separate table than
embedded in the host address).

> If we could store inet in four bytes it would be vastly more efficient both
> in disk space usage and in cpu at runtime.

That's not reasonable for inet/cidr due to the need to support ipv6.

If you want that for ip4-only apps, that's why pgfoundry.org/projects/ip4r
exists. It is possible that ip4r will be extended to ipv6 addresses, but
most unlikely that it will ever implement the overloaded "inet" semantics.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Steve Atkins

Date:

25 January 2006, 14:40:42

On Jan 25, 2006, at 10:30 AM, Andrew - Supernews wrote:

> On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote:
>> This isn't an obscure old-fashioned thing. People really do use  
>> this syntax.
>
> Given how little code now supports 10.1 meaning 10.0.0.1, that seems a
> questionable point.
>

All code that uses gethostbyname() on, at least, Linux, Solaris,  
Windows XP,
OS X and (I think) the BSDs and anything else that's even vaguely  
posix uses it.

I don't think that's terribly relevant to the PG inet types, though.  
Rejecting any
input format that's not a dotted-quad seems the safest thing to do,  
and doesn't
lose any useful functionality. Given the number of people in this  
thread who
think that the (non-standard, archaic) behaviour of bind is correct  
it's clear that
accepting anything other than a real dotted-quad will lead to an  
inconsistency
between what the data represents and what the user thinks it  
represents, and
that's bound to cause problems.

Cheers,  Steve

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

25 January 2006, 15:09:23

"Larry Rosenman" <ler@lerctr.org> writes:

> We had a **LONG** discussion on the I/O formats back in the 7.2 timeframe.
> the current
> behavior is the result of that.

Well I wasn't around for 7.2 but I was for a discussion around 7.3, maybe it's
the same one. Regardless, back then there was an implied assumption that any
change would affect both types. I couldn't convince people because 10.1/16 for
a network address range really is a reasonable abbreviation for 10.1.0.0.

It turns out network address ranges and hosts really don't have the same needs
at all. 10.1 expanding to 10.1.0.0 is really nonsensical for hosts and 10.1
expanding to 10.0.0.1 is nonsensical for network address ranges.

Now that they're going to be two different types proper I think it's not a bad
idea at all to revisit any design decisions that were made as a consequence of
them being mashed into one pseudo type.

Note that this would be entirely backwards compatible since inet doesn't
actually accept any abbreviated syntaxes at all currently. Only cidr does.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Tom Lane

Date:

25 January 2006, 15:22:09

Greg Stark <gsstark@MIT.EDU> writes:
> If we could store inet in four bytes it would be vastly more efficient both in
> disk space usage and in cpu at runtime.

You forgot IPv6.
        regards, tom lane

Re: Cleaning up the INET/CIDR mess

From

"Matthew D. Fuller"

Date:

25 January 2006, 18:20:41

On Wed, Jan 25, 2006 at 06:30:47PM -0000 I heard the voice of
Andrew - Supernews, and lo! it spake thus:
> On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote:
> > This isn't an obscure old-fashioned thing. People really do use
> > this syntax.
> 
> Given how little code now supports 10.1 meaning 10.0.0.1, that seems
> a questionable point.

(ttyp7):{200}% ping 10.1
PING 10.1 (10.0.0.1): 56 data bytes

Given that it's how I learned v4 addresses shorten, and that it's
roughly similar to 0-minimization in v6 addresses, I would be
surprised as heck to find any other behavior.

OTOH, I never use it myself, because knowing the answer I still find
it requiring me to stop and think about what it means, because (unlike
the v6 version) there's no visual indication that there are 0's and
where they go.  I recently wrote up a C library to parse v4/v6 CIDR
forms, and explicitly chose not to support those shortened v4 forms.

-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/          On the Internet, nobody can hear you
scream.

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

25 January 2006, 19:02:48

Andrew - Supernews <andrew+nonews@supernews.com> writes:

> On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote:
> > This isn't an obscure old-fashioned thing. People really do use this syntax.
> 
> Given how little code now supports 10.1 meaning 10.0.0.1, that seems a
> questionable point.

I've reported the bug in the one instance I've found. 
What have you found with this omission?

It would be passing strange since most software just passes the text to
inet_aton or inet_pton.

Incidentally I mixed up getaddrinfo and inet_pton in a previous message. Sorry
for the confusion.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Andrew - Supernews

Date:

25 January 2006, 19:43:02

On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote:
> I've reported the bug in the one instance I've found. 
> What have you found with this omission?
>
> It would be passing strange since most software just passes the text to
> inet_aton or inet_pton.

STANDARDS    The inet_ntop() and inet_pton() functions conform to X/Open Networking    Services Issue 5.2 (``XNS5.2'').
Note that inet_pton() does not accept    1-, 2-, or 3-part dotted addresses; all four parts must be specified and
areinterpreted only as decimal values.  This is a narrower input set    than that accepted by inet_aton().

The spec is quite explicit that inet_pton is not expected to accept the
abbreviated forms or any non-decimal values.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Cleaning up the INET/CIDR mess

From

Greg Stark

Date:

26 January 2006, 11:35:07

Andrew - Supernews <andrew+nonews@supernews.com> writes:

> The spec is quite explicit that inet_pton is not expected to accept the
> abbreviated forms or any non-decimal values.

Hum. That distinctly doesn't match my memory but it seems you're right. The
spec mandates inet_ntoa and inet_addr support it but not inet_ntop. Odd.

-- 
greg

Re: Cleaning up the INET/CIDR mess

From

Steve Atkins

Date:

26 January 2006, 12:03:45

On Jan 25, 2006, at 9:29 AM, Bruce Momjian wrote:

> Tom Lane wrote:
>> Greg Stark <gsstark@mit.edu> writes:
>>> I wonder if this would be an opportunity to fix Postgres's  
>>> handling of
>>> addresses like '10.1'.
>>
>> You've mistaken this for a proposal to change the I/O behavior, which
>> it is specifically not.
>>
>>> The standard interpretation of this is the same as '10.0.0.1'.
>>
>> Standard according to whom?  Paul Vixie evidently doesn't think that
>> that's a standard abbreviation, else the code we borrowed from  
>> libbind
>> would do it already.
>
> Agreed.  10.1 as 10.0.0.1 is an old behavior which has been removed  
> from
> most modern versions of networking tools.

Whether PG should support it or not is another question (personally I  
think
that anything other than a dotted quad should fail with an error) but  
it certainly
hasn't been removed from most modern versions of networking tools.

gethostbyname() is used by most networking tools, and on most unix OSes
it believes "10.1" 'resolves to' "10.0.0.1". That includes current  
versions of
linux, OS X, Solaris, Windows XP and I believe the BSDs.

So the vast majority of applications on the vast majority of deployed  
platforms
believe that "10.1" is the address 10.0.0.1. (As is often the case binds
behaviour is inconsistent and can't really be used as "proof" of  
standard
behaviour).

Cheers,  Steve

Re: Cleaning up the INET/CIDR mess

From

Peter Eisentraut

Date:

26 January 2006, 17:12:12

Tom Lane wrote:
> Without the flag, it's okay for cidr-to-inet to be a
> binary-compatible (no function) conversion.  However, inet-to-cidr
> has to either zero out bits to the right of the netmask, or error out
> if any are set.  Joachim Wieland posted a patch that makes the
> coercion function just silently zero out any such bits.  That's OK
> with me, but does anyone want to argue for an error?

Zero the bits if it's an explicit cast, raise an error if not.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Cleaning up the INET/CIDR mess

From

Tom Lane

Date:

26 January 2006, 17:37:31

Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> Without the flag, it's okay for cidr-to-inet to be a
>> binary-compatible (no function) conversion.  However, inet-to-cidr
>> has to either zero out bits to the right of the netmask, or error out
>> if any are set.  Joachim Wieland posted a patch that makes the
>> coercion function just silently zero out any such bits.  That's OK
>> with me, but does anyone want to argue for an error?

> Zero the bits if it's an explicit cast, raise an error if not.

I know there's precedent for such behavior in the SQL spec, but it
always seemed pretty ugly to me :-(.  The patch-as-committed just
silently zeroes the bits during any inet->cidr cast.  I'll change it
if there's consensus that that's a bad idea, but I don't really see
a reason to.

BTW, there is another case I came across that wasn't discussed before:
if you do set_masklen() on a cidr value that reduces the netmask length,
there are the same options of either zeroing the excess bits or
complaining if any aren't zero.  I've got that doing the zeroing too.
        regards, tom lane

Re: Cleaning up the INET/CIDR mess

From

Bruce Momjian

Date:

26 January 2006, 18:04:59

Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Tom Lane wrote:
> >> Without the flag, it's okay for cidr-to-inet to be a
> >> binary-compatible (no function) conversion.  However, inet-to-cidr
> >> has to either zero out bits to the right of the netmask, or error out
> >> if any are set.  Joachim Wieland posted a patch that makes the
> >> coercion function just silently zero out any such bits.  That's OK
> >> with me, but does anyone want to argue for an error?
> 
> > Zero the bits if it's an explicit cast, raise an error if not.
> 
> I know there's precedent for such behavior in the SQL spec, but it
> always seemed pretty ugly to me :-(.  The patch-as-committed just
> silently zeroes the bits during any inet->cidr cast.  I'll change it
> if there's consensus that that's a bad idea, but I don't really see
> a reason to.

I agree.  Let's do the zeroing and see if people complain about it. 
Throwing an error seems extreme.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Cleaning up the INET/CIDR mess

From

Peter Eisentraut

Date:

26 January 2006, 19:36:18

Bruce Momjian wrote:
> I agree.  Let's do the zeroing and see if people complain about it.

I'm complaining.  Losing data on a cast is not common behavior.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Cleaning up the INET/CIDR mess

From

Bruce Momjian

Date:

01 February 2006, 12:07:53

Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > I agree.  Let's do the zeroing and see if people complain about it.
> 
> I'm complaining.  Losing data on a cast is not common behavior.

[ Sorry for the delay.]

OK, that's clear.  :-)

I looked around to see if I could find any places where we implicitly
cast and lose data.  I tried factorial(3.2), and that cast failed.

However, if I do this explicit cast:
test=> SELECT '3.4'::float8::integer; int4------    3(1 row)

it works.  Can't we consider this:
test=> SELECT '1.2.3.4/24'::inet::cidr;    cidr------------ 1.2.3.0/24(1 row)

a similar situation?  inet is like float8 (bits beyond the masklen),
while cidr is like int4.

Is it only implicit casts you are worried about?  Do we have any of
those left?  All functions that take cidr also have an inet version, so
I don't see how an implicit cast to cidr could happen.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Cleaning up the INET/CIDR mess

From

Tom Lane

Date:

01 February 2006, 12:14:52

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Is it only implicit casts you are worried about?  Do we have any of
> those left?  All functions that take cidr also have an inet version, so
> I don't see how an implicit cast to cidr could happen.

The cast to cidr isn't implicit anymore anyway.  What I currently have
it marked as is "assignment".  You could make the argument that it
should be marked "explicit only" to avoid silent loss of data.  But
we have the numeric downcasts marked as "assignment" so I don't see
why this case is different.  If you do
insert into int4_tbl values(7.7);

what's inserted into the integer column is 8, and I've not heard anyone
complaining that that represents unacceptable data loss.
        regards, tom lane