Thread: Cleaning up the INET/CIDR mess
We've had previous discussions about how the distinction between INET and CIDR isn't very well thought out, for instance http://archives.postgresql.org/pgsql-hackers/2005-01/msg01021.php http://archives.postgresql.org/pgsql-hackers/2006-01/msg00233.php The basic problem is that the code is schizophrenic about whether these types are "the same" or not. The fact that we have implicit binary (no function) coercions in both directions makes them effectively the same, so why are there two different type names in the catalogs? On the other hand, if they should be different (and they definitely have different I/O behavior), this coercion behavior is wrong. Also, if they are different types, it's redundant to have a flag inside the data structure saying which type a particular value is. After some consideration I've come to the conclusion that we really do want them to be separate types: the I/O behavior is settled (after quite some long discussions) and we don't want to change it, so we can't merge them into one type. That leads to the following proposals: Remove the internal is_cidr flag; it's a waste of space. (It doesn't actually cost anything today, because of alignment considerations, but it would cost 2 bytes if we implement the proposed 2-byte-length-word variant datum format.) Even more to the point, the presence of the flag has encouraged the sort of sloppy thinking and coding that got us into this mess. Whether it's an INET or a CIDR should be totally determined by the SQL type system. Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no function) conversion. However, inet-to-cidr has to either zero out bits to the right of the netmask, or error out if any are set. Joachim Wieland posted a patch that makes the coercion function just silently zero out any such bits. That's OK with me, but does anyone want to argue for an error? (If we do make the coercion function raise error, then we'd probably need to provide a separate function that supports the bit-zeroing conversion.) Currently, both directions of cast are implicit, but that is a bad idea. I propose keeping cidr-to-inet as implicit but making inet-to-cidr an assignment cast. This fits with the fact that inet can represent all values of cidr but not vice versa (compare int4 and int8). Given the implicit binary-compatible coercion, it's OK to have just a single function taking inet for any case where the function truly doesn't care if it's looking at inet or cidr input. For the cases where the code currently pays attention to is_cidr, we'd have to make two separate functions. AFAICT that's only abbrev(inet) and text(inet) among the current functions. Also, set_masklen(inet,integer) would have to come in two flavors since the output type should be the same as the input. The relationship of cidr and inet would be a little bit like the relation between varchar and text. For instance, varchar doesn't have any comparison operators of its own, but piggybacks on text's comparison operators, relying on the implicit cast from varchar to text to make this transparent to users. One other point is what to do with the binary I/O functions (send/receive) for inet and cidr. I think that we should continue to send the is_cidr flag byte for backwards-compatibility reasons. On receive, we could either ignore that byte entirely, or insist that it match the expected datatype. I'm inclined to ignore the byte but am willing to listen to arguments to raise an error instead. Comments? regards, tom lane
On Tue, Jan 24, 2006 at 01:23:17PM -0500, Tom Lane wrote: > Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no > function) conversion. However, inet-to-cidr has to either zero out bits > to the right of the netmask, or error out if any are set. Joachim Wieland > posted a patch that makes the coercion function just silently zero out any > such bits. That's OK with me, but does anyone want to argue for an error? > (If we do make the coercion function raise error, then we'd probably need > to provide a separate function that supports the bit-zeroing conversion.) I'd argue for an error, on correctness grounds (someone's bound to come back having misused these, and complain that it silently changed data. They'd have a point). > One other point is what to do with the binary I/O functions (send/receive) > for inet and cidr. I think that we should continue to send the is_cidr > flag byte for backwards-compatibility reasons. On receive, we could > either ignore that byte entirely, or insist that it match the expected > datatype. I'm inclined to ignore the byte but am willing to listen to > arguments to raise an error instead. If this is exposed to users in some way (I don't think it is, is it?) then I'd argue for erroring, on the same grounds of what I say above. But otherwise, I think you could ignore it. A -- Andrew Sullivan | ajs@crankycanuck.ca I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin
On 2006-01-24, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no > function) conversion. However, inet-to-cidr has to either zero out bits > to the right of the netmask, or error out if any are set. Joachim Wieland > posted a patch that makes the coercion function just silently zero out any > such bits. That's OK with me, but does anyone want to argue for an error? > (If we do make the coercion function raise error, then we'd probably need > to provide a separate function that supports the bit-zeroing conversion.) > > Currently, both directions of cast are implicit, but that is a bad idea. > I propose keeping cidr-to-inet as implicit but making inet-to-cidr an > assignment cast. This fits with the fact that inet can represent all > values of cidr but not vice versa (compare int4 and int8). If inet-to-cidr can zero out bits silently, then wouldn't making it an assignment cast be rather dangerous and error-prone? > Given the implicit binary-compatible coercion, it's OK to have just a > single function taking inet for any case where the function truly doesn't > care if it's looking at inet or cidr input. For the cases where the code > currently pays attention to is_cidr, we'd have to make two separate > functions. AFAICT that's only abbrev(inet) and text(inet) among the > current functions. Also, set_masklen(inet,integer) would have to come > in two flavors since the output type should be the same as the input. You sometimes need set_masklen(cidr,integer) returning inet, and I'd bet there's existing code that does that. > The relationship of cidr and inet would be a little bit like the relation > between varchar and text. For instance, varchar doesn't have any > comparison operators of its own, but piggybacks on text's comparison > operators, relying on the implicit cast from varchar to text to make this > transparent to users. Well, inet/cidr have far more justification for being separate types than text/varchar do - the text/varchar issue causes a great deal of confusion. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
This is exactly what I had in mind: split the types zero out the bits going to cidrno change going to inetmake functions take inet, which as not cast change --------------------------------------------------------------------------- Tom Lane wrote: > We've had previous discussions about how the distinction between INET > and CIDR isn't very well thought out, for instance > http://archives.postgresql.org/pgsql-hackers/2005-01/msg01021.php > http://archives.postgresql.org/pgsql-hackers/2006-01/msg00233.php > > The basic problem is that the code is schizophrenic about whether these > types are "the same" or not. The fact that we have implicit binary > (no function) coercions in both directions makes them effectively the > same, so why are there two different type names in the catalogs? > On the other hand, if they should be different (and they definitely > have different I/O behavior), this coercion behavior is wrong. Also, > if they are different types, it's redundant to have a flag inside the > data structure saying which type a particular value is. > > After some consideration I've come to the conclusion that we really do > want them to be separate types: the I/O behavior is settled (after quite > some long discussions) and we don't want to change it, so we can't merge > them into one type. That leads to the following proposals: > > Remove the internal is_cidr flag; it's a waste of space. (It doesn't > actually cost anything today, because of alignment considerations, but > it would cost 2 bytes if we implement the proposed 2-byte-length-word > variant datum format.) Even more to the point, the presence of the > flag has encouraged the sort of sloppy thinking and coding that got us > into this mess. Whether it's an INET or a CIDR should be totally > determined by the SQL type system. > > Without the flag, it's okay for cidr-to-inet to be a binary-compatible (no > function) conversion. However, inet-to-cidr has to either zero out bits > to the right of the netmask, or error out if any are set. Joachim Wieland > posted a patch that makes the coercion function just silently zero out any > such bits. That's OK with me, but does anyone want to argue for an error? > (If we do make the coercion function raise error, then we'd probably need > to provide a separate function that supports the bit-zeroing conversion.) > > Currently, both directions of cast are implicit, but that is a bad idea. > I propose keeping cidr-to-inet as implicit but making inet-to-cidr an > assignment cast. This fits with the fact that inet can represent all > values of cidr but not vice versa (compare int4 and int8). > > Given the implicit binary-compatible coercion, it's OK to have just a > single function taking inet for any case where the function truly doesn't > care if it's looking at inet or cidr input. For the cases where the code > currently pays attention to is_cidr, we'd have to make two separate > functions. AFAICT that's only abbrev(inet) and text(inet) among the > current functions. Also, set_masklen(inet,integer) would have to come > in two flavors since the output type should be the same as the input. > > The relationship of cidr and inet would be a little bit like the relation > between varchar and text. For instance, varchar doesn't have any > comparison operators of its own, but piggybacks on text's comparison > operators, relying on the implicit cast from varchar to text to make this > transparent to users. > > One other point is what to do with the binary I/O functions (send/receive) > for inet and cidr. I think that we should continue to send the is_cidr > flag byte for backwards-compatibility reasons. On receive, we could > either ignore that byte entirely, or insist that it match the expected > datatype. I'm inclined to ignore the byte but am willing to listen to > arguments to raise an error instead. > > Comments? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Andrew - Supernews <andrew+nonews@supernews.com> writes: > If inet-to-cidr can zero out bits silently, then wouldn't making it an > assignment cast be rather dangerous and error-prone? The proposal was to make cidr-to-inet an implicit cast (happens automatically without being requested) but make cidr-to-inet not implicit (you need to type myinet::cidr or the "CAST" syntax to get it to happen). Making the cast zero bits or making it error and having a separate function to handle the conversion both seem entirely reasonable approaches. I wonder if this would be an opportunity to fix Postgres's handling of addresses like '10.1'. The standard interpretation of this is the same as '10.0.0.1'. Currently Postgres interprets it as '10.1.0.0' for cidr and gives an error for inet. Perhaps if they become truly separate types then we could leave the cidr behaviour alone since it is common to write things like 10.1/16 to refer to the network prefix. But let inet have the standard interpretation. It might be confusing but actually it's entirely logical. cidr is a type for representing address ranges. '10.1/16'::cidr represents a range of addresses from 10.1.0.0-10.1.255.255. There would be no reasonable use case for 10.0.0.1 in a cidr unless it was /32 which would refer to a specific address, in which case it would make more sense to use an inet. On the other hand '10.1'::inet represents a specific address and it's clear which address it means -- it's even required to be interpreted this way by the POSIX functions. Erroring out as Postgres currently does means applications built on Postgres don't accept standard inernet address syntax. -- greg
Greg Stark <gsstark@mit.edu> writes: > I wonder if this would be an opportunity to fix Postgres's handling of > addresses like '10.1'. You've mistaken this for a proposal to change the I/O behavior, which it is specifically not. > The standard interpretation of this is the same as '10.0.0.1'. Standard according to whom? Paul Vixie evidently doesn't think that that's a standard abbreviation, else the code we borrowed from libbind would do it already. regards, tom lane
Tom Lane wrote: > Greg Stark <gsstark@mit.edu> writes: >> I wonder if this would be an opportunity to fix Postgres's handling >> of addresses like '10.1'. > > You've mistaken this for a proposal to change the I/O behavior, which > it is specifically not. > >> The standard interpretation of this is the same as '10.0.0.1'. > > Standard according to whom? Paul Vixie evidently doesn't think that > that's a standard abbreviation, else the code we borrowed from libbind > would do it already. We had a **LONG** discussion on the I/O formats back in the 7.2 timeframe. the current behavior is the result of that. Please do **NOT** change the I/O formats. LER -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 512-248-2683 E-Mail: ler@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Tom Lane wrote: > Greg Stark <gsstark@mit.edu> writes: > > I wonder if this would be an opportunity to fix Postgres's handling of > > addresses like '10.1'. > > You've mistaken this for a proposal to change the I/O behavior, which > it is specifically not. > > > The standard interpretation of this is the same as '10.0.0.1'. > > Standard according to whom? Paul Vixie evidently doesn't think that > that's a standard abbreviation, else the code we borrowed from libbind > would do it already. Agreed. 10.1 as 10.0.0.1 is an old behavior which has been removed from most modern versions of networking tools. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > Agreed. 10.1 as 10.0.0.1 is an old behavior which has been removed from > most modern versions of networking tools. Indeed so. However the current behaviour has neither the merit of being traditional nor the merit of being logical: => select '10.1'::cidr; cidr -------------10.1.0.0/16 (1 row) => select '128.1'::cidr; cidr --------------128.1.0.0/16 (1 row) => select '192.1'::cidr; cidr --------------192.1.0.0/24 (1 row) Having the behaviour be dependent on which part of the IP space is used is a total nonsense on the modern, CIDR, internet! The C in CIDR even stands for "Classless", so how can you ever justify introducing _new_, non-traditional, dependencies on the traditional classes? -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
Andrew - Supernews wrote: > On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > > Agreed. 10.1 as 10.0.0.1 is an old behavior which has been removed from > > most modern versions of networking tools. > > Indeed so. However the current behaviour has neither the merit of being > traditional nor the merit of being logical: > > => select '10.1'::cidr; > cidr > ------------- > 10.1.0.0/16 > (1 row) > > => select '128.1'::cidr; > cidr > -------------- > 128.1.0.0/16 > (1 row) > > => select '192.1'::cidr; > cidr > -------------- > 192.1.0.0/24 > (1 row) > > Having the behaviour be dependent on which part of the IP space is used > is a total nonsense on the modern, CIDR, internet! The C in CIDR even > stands for "Classless", so how can you ever justify introducing _new_, > non-traditional, dependencies on the traditional classes? This is coming from inet_net_pton.c, which we got from: * Copyright (c) 2004 by Internet Systems Consortium, Inc. ("ISC") * Copyright (c) 1996,1999 by Internet Software Consortium. ... if (bits == -1){ if (*odst >= 240) /* Class E */ bits = 32; else if (*odst >= 224) /* Class D */ bits = 8; else if (*odst >= 192) /* Class C */ bits = 24; else if (*odst >= 128) /* Class B */ bits= 16; else /* Class A */ bits = 8; /* If imputed mask is narrower than specified octets, widen.*/ if (bits < ((dst - odst) * 8)) bits = (dst - odst) * 8; ... test=> select '11'::cidr; cidr------------ 11.0.0.0/8(1 row) We are doing our best here to follow industry standards on how things should behave. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Andrew - Supernews <andrew+nonews@supernews.com> writes: > On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > > Agreed. 10.1 as 10.0.0.1 is an old behavior which has been removed from > > most modern versions of networking tools. On the contrary not only is it still widely used but it is *required* by POSIX for the relevant functions, inet_aton and getaddrinfo. Note that getaddrinfo was created from whole cloth by POSIX so there was no backwards compatibility need for it. This isn't an obscure old-fashioned thing. People really do use this syntax. > Indeed so. However the current behaviour has neither the merit of being > traditional nor the merit of being logical: Well for networks (cidr datatype) people do frequently refer to things like 10.1/16 and intend it to mean the network prefix. Sure you could argue having the netmask default to the old class-based addressing is anachronistic but what other default netmask would you suggest anyways? The only other reasonable default would be the longest 0-bit suffix which would produce some odd surprising results like '10.1/16' but '10.2/17'. -- greg
On 2006-01-25, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > Andrew - Supernews wrote: >> Having the behaviour be dependent on which part of the IP space is used >> is a total nonsense on the modern, CIDR, internet! The C in CIDR even >> stands for "Classless", so how can you ever justify introducing _new_, >> non-traditional, dependencies on the traditional classes? > > This is coming from inet_net_pton.c, which we got from: That the behaviour comes from ISC code does not make it correct. > We are doing our best here to follow industry standards on how things > should behave. This is not an "industry standard". -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
I have a question in a different direction. What is the meaning of the network mask in the inet data type anyways? Hosts don't have network masks, only networks. If we could store inet in four bytes it would be vastly more efficient both in disk space usage and in cpu at runtime. I think it would also clear up the perpetual user confusion between the two datatypes. I posit that the main source of the confusion is that currently Postgres lets you use inet for everything, even if what you're really storing is a network address range which is what the cidr datatype is really for. -- greg
On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote: > This isn't an obscure old-fashioned thing. People really do use this syntax. Given how little code now supports 10.1 meaning 10.0.0.1, that seems a questionable point. >> Indeed so. However the current behaviour has neither the merit of being >> traditional nor the merit of being logical: > > Well for networks (cidr datatype) people do frequently refer to things like > 10.1/16 and intend it to mean the network prefix. Do you mean they refer to '10.1' and intend it to mean '10.1/16'? If so I agree; but in that case, not only should '10.1' mean '10.1/16', but also '192.1' should mean '192.1/16' and _NOT_ '192.1/24'. > Sure you could argue having > the netmask default to the old class-based addressing is anachronistic but > what other default netmask would you suggest anyways? The one implied by the number of octets specified, assuming you are going to accept the abbreviated forms at all. (FWIW, ip4r at this time does not even accept '10.1/16', it insists on '10.1.0.0/16'.) -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
* Greg Stark (gsstark@mit.edu) wrote: > I have a question in a different direction. What is the meaning of the network > mask in the inet data type anyways? Hosts don't have network masks, only > networks. > > If we could store inet in four bytes it would be vastly more efficient both in > disk space usage and in cpu at runtime. > > I think it would also clear up the perpetual user confusion between the two > datatypes. I posit that the main source of the confusion is that currently > Postgres lets you use inet for everything, even if what you're really storing > is a network address range which is what the cidr datatype is really for. I wholeheartedly agree with this. It also makes the only cast option from inet to cidr to be with a /32 and thus there's no zeroing of bits. It would then be nice to have a function to which you pass in a cidr and a netmask and say "give me the CIDR this CIDR is in with this mask". With the inet-to-cidr implicit cast you could use this function to get the CIDR you want without having to worry about the implicit cast throwing data away. The cidr-to-inet is then throwing away the mask and so I'm not sure we should have an implicit cast in that direction but honestly I wouldn't complain if we did. Thanks, Stephen
On 2006-01-25, Greg Stark <gsstark@MIT.EDU> wrote: > I have a question in a different direction. What is the meaning of the > network mask in the inet data type anyways? Hosts don't have network masks, > only networks. As far as I can tell, the "inet" semantics are supposed to represent a network interface, rather than just an address. So it designates a network and a host within that network. This is a significant semantic overload, which is not relevent to many applications and may be counterproductive (for example, if you had a database of hosts and networks, the network info would more correctly be accessed via a reference to a separate table than embedded in the host address). > If we could store inet in four bytes it would be vastly more efficient both > in disk space usage and in cpu at runtime. That's not reasonable for inet/cidr due to the need to support ipv6. If you want that for ip4-only apps, that's why pgfoundry.org/projects/ip4r exists. It is possible that ip4r will be extended to ipv6 addresses, but most unlikely that it will ever implement the overloaded "inet" semantics. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
On Jan 25, 2006, at 10:30 AM, Andrew - Supernews wrote: > On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote: >> This isn't an obscure old-fashioned thing. People really do use >> this syntax. > > Given how little code now supports 10.1 meaning 10.0.0.1, that seems a > questionable point. > All code that uses gethostbyname() on, at least, Linux, Solaris, Windows XP, OS X and (I think) the BSDs and anything else that's even vaguely posix uses it. I don't think that's terribly relevant to the PG inet types, though. Rejecting any input format that's not a dotted-quad seems the safest thing to do, and doesn't lose any useful functionality. Given the number of people in this thread who think that the (non-standard, archaic) behaviour of bind is correct it's clear that accepting anything other than a real dotted-quad will lead to an inconsistency between what the data represents and what the user thinks it represents, and that's bound to cause problems. Cheers, Steve
"Larry Rosenman" <ler@lerctr.org> writes: > We had a **LONG** discussion on the I/O formats back in the 7.2 timeframe. > the current > behavior is the result of that. Well I wasn't around for 7.2 but I was for a discussion around 7.3, maybe it's the same one. Regardless, back then there was an implied assumption that any change would affect both types. I couldn't convince people because 10.1/16 for a network address range really is a reasonable abbreviation for 10.1.0.0. It turns out network address ranges and hosts really don't have the same needs at all. 10.1 expanding to 10.1.0.0 is really nonsensical for hosts and 10.1 expanding to 10.0.0.1 is nonsensical for network address ranges. Now that they're going to be two different types proper I think it's not a bad idea at all to revisit any design decisions that were made as a consequence of them being mashed into one pseudo type. Note that this would be entirely backwards compatible since inet doesn't actually accept any abbreviated syntaxes at all currently. Only cidr does. -- greg
Greg Stark <gsstark@MIT.EDU> writes: > If we could store inet in four bytes it would be vastly more efficient both in > disk space usage and in cpu at runtime. You forgot IPv6. regards, tom lane
On Wed, Jan 25, 2006 at 06:30:47PM -0000 I heard the voice of Andrew - Supernews, and lo! it spake thus: > On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote: > > This isn't an obscure old-fashioned thing. People really do use > > this syntax. > > Given how little code now supports 10.1 meaning 10.0.0.1, that seems > a questionable point. (ttyp7):{200}% ping 10.1 PING 10.1 (10.0.0.1): 56 data bytes Given that it's how I learned v4 addresses shorten, and that it's roughly similar to 0-minimization in v6 addresses, I would be surprised as heck to find any other behavior. OTOH, I never use it myself, because knowing the answer I still find it requiring me to stop and think about what it means, because (unlike the v6 version) there's no visual indication that there are 0's and where they go. I recently wrote up a C library to parse v4/v6 CIDR forms, and explicitly chose not to support those shortened v4 forms. -- Matthew Fuller (MF4839) | fullermd@over-yonder.net Systems/Network Administrator | http://www.over-yonder.net/~fullermd/ On the Internet, nobody can hear you scream.
Andrew - Supernews <andrew+nonews@supernews.com> writes: > On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote: > > This isn't an obscure old-fashioned thing. People really do use this syntax. > > Given how little code now supports 10.1 meaning 10.0.0.1, that seems a > questionable point. I've reported the bug in the one instance I've found. What have you found with this omission? It would be passing strange since most software just passes the text to inet_aton or inet_pton. Incidentally I mixed up getaddrinfo and inet_pton in a previous message. Sorry for the confusion. -- greg
On 2006-01-25, Greg Stark <gsstark@mit.edu> wrote: > I've reported the bug in the one instance I've found. > What have you found with this omission? > > It would be passing strange since most software just passes the text to > inet_aton or inet_pton. STANDARDS The inet_ntop() and inet_pton() functions conform to X/Open Networking Services Issue 5.2 (``XNS5.2''). Note that inet_pton() does not accept 1-, 2-, or 3-part dotted addresses; all four parts must be specified and areinterpreted only as decimal values. This is a narrower input set than that accepted by inet_aton(). The spec is quite explicit that inet_pton is not expected to accept the abbreviated forms or any non-decimal values. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
Andrew - Supernews <andrew+nonews@supernews.com> writes: > The spec is quite explicit that inet_pton is not expected to accept the > abbreviated forms or any non-decimal values. Hum. That distinctly doesn't match my memory but it seems you're right. The spec mandates inet_ntoa and inet_addr support it but not inet_ntop. Odd. -- greg
On Jan 25, 2006, at 9:29 AM, Bruce Momjian wrote: > Tom Lane wrote: >> Greg Stark <gsstark@mit.edu> writes: >>> I wonder if this would be an opportunity to fix Postgres's >>> handling of >>> addresses like '10.1'. >> >> You've mistaken this for a proposal to change the I/O behavior, which >> it is specifically not. >> >>> The standard interpretation of this is the same as '10.0.0.1'. >> >> Standard according to whom? Paul Vixie evidently doesn't think that >> that's a standard abbreviation, else the code we borrowed from >> libbind >> would do it already. > > Agreed. 10.1 as 10.0.0.1 is an old behavior which has been removed > from > most modern versions of networking tools. Whether PG should support it or not is another question (personally I think that anything other than a dotted quad should fail with an error) but it certainly hasn't been removed from most modern versions of networking tools. gethostbyname() is used by most networking tools, and on most unix OSes it believes "10.1" 'resolves to' "10.0.0.1". That includes current versions of linux, OS X, Solaris, Windows XP and I believe the BSDs. So the vast majority of applications on the vast majority of deployed platforms believe that "10.1" is the address 10.0.0.1. (As is often the case binds behaviour is inconsistent and can't really be used as "proof" of standard behaviour). Cheers, Steve
Tom Lane wrote: > Without the flag, it's okay for cidr-to-inet to be a > binary-compatible (no function) conversion. However, inet-to-cidr > has to either zero out bits to the right of the netmask, or error out > if any are set. Joachim Wieland posted a patch that makes the > coercion function just silently zero out any such bits. That's OK > with me, but does anyone want to argue for an error? Zero the bits if it's an explicit cast, raise an error if not. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane wrote: >> Without the flag, it's okay for cidr-to-inet to be a >> binary-compatible (no function) conversion. However, inet-to-cidr >> has to either zero out bits to the right of the netmask, or error out >> if any are set. Joachim Wieland posted a patch that makes the >> coercion function just silently zero out any such bits. That's OK >> with me, but does anyone want to argue for an error? > Zero the bits if it's an explicit cast, raise an error if not. I know there's precedent for such behavior in the SQL spec, but it always seemed pretty ugly to me :-(. The patch-as-committed just silently zeroes the bits during any inet->cidr cast. I'll change it if there's consensus that that's a bad idea, but I don't really see a reason to. BTW, there is another case I came across that wasn't discussed before: if you do set_masklen() on a cidr value that reduces the netmask length, there are the same options of either zeroing the excess bits or complaining if any aren't zero. I've got that doing the zeroing too. regards, tom lane
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > Tom Lane wrote: > >> Without the flag, it's okay for cidr-to-inet to be a > >> binary-compatible (no function) conversion. However, inet-to-cidr > >> has to either zero out bits to the right of the netmask, or error out > >> if any are set. Joachim Wieland posted a patch that makes the > >> coercion function just silently zero out any such bits. That's OK > >> with me, but does anyone want to argue for an error? > > > Zero the bits if it's an explicit cast, raise an error if not. > > I know there's precedent for such behavior in the SQL spec, but it > always seemed pretty ugly to me :-(. The patch-as-committed just > silently zeroes the bits during any inet->cidr cast. I'll change it > if there's consensus that that's a bad idea, but I don't really see > a reason to. I agree. Let's do the zeroing and see if people complain about it. Throwing an error seems extreme. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > I agree. Let's do the zeroing and see if people complain about it. I'm complaining. Losing data on a cast is not common behavior. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut wrote: > Bruce Momjian wrote: > > I agree. Let's do the zeroing and see if people complain about it. > > I'm complaining. Losing data on a cast is not common behavior. [ Sorry for the delay.] OK, that's clear. :-) I looked around to see if I could find any places where we implicitly cast and lose data. I tried factorial(3.2), and that cast failed. However, if I do this explicit cast: test=> SELECT '3.4'::float8::integer; int4------ 3(1 row) it works. Can't we consider this: test=> SELECT '1.2.3.4/24'::inet::cidr; cidr------------ 1.2.3.0/24(1 row) a similar situation? inet is like float8 (bits beyond the masklen), while cidr is like int4. Is it only implicit casts you are worried about? Do we have any of those left? All functions that take cidr also have an inet version, so I don't see how an implicit cast to cidr could happen. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Is it only implicit casts you are worried about? Do we have any of > those left? All functions that take cidr also have an inet version, so > I don't see how an implicit cast to cidr could happen. The cast to cidr isn't implicit anymore anyway. What I currently have it marked as is "assignment". You could make the argument that it should be marked "explicit only" to avoid silent loss of data. But we have the numeric downcasts marked as "assignment" so I don't see why this case is different. If you do insert into int4_tbl values(7.7); what's inserted into the integer column is 8, and I've not heard anyone complaining that that represents unacceptable data loss. regards, tom lane