Thread: A small problem with the new inet and cidr types

A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
It seems that we have a small problem with the operators and functions
in the inet and cidr types.  If the value in a field is null then the
backend crashes when you try to apply one of the operators or functions
on it.  I have a small patch ready to fix the functions but I'm not
sure what the behaviour should be for the operators.  For the functions
I return an empty string if the argument is null except for masklen()
which returns 0.  Should masklen return 32?

At Marc's request I am holding my patch until 6.4 is frozen.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Bruce Momjian
Date:
> It seems that we have a small problem with the operators and functions
> in the inet and cidr types.  If the value in a field is null then the
> backend crashes when you try to apply one of the operators or functions
> on it.  I have a small patch ready to fix the functions but I'm not
> sure what the behaviour should be for the operators.  For the functions
> I return an empty string if the argument is null except for masklen()
> which returns 0.  Should masklen return 32?
> 
> At Marc's request I am holding my patch until 6.4 is frozen.

Huh.  You have a backend crash fix, and we can't apply it until after
6.4 is out the door?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] A small problem with the new inet and cidr types

From
The Hermit Hacker
Date:
On Sun, 1 Nov 1998, Bruce Momjian wrote:

> > It seems that we have a small problem with the operators and functions
> > in the inet and cidr types.  If the value in a field is null then the
> > backend crashes when you try to apply one of the operators or functions
> > on it.  I have a small patch ready to fix the functions but I'm not
> > sure what the behaviour should be for the operators.  For the functions
> > I return an empty string if the argument is null except for masklen()
> > which returns 0.  Should masklen return 32?
> > 
> > At Marc's request I am holding my patch until 6.4 is frozen.
> 
> Huh.  You have a backend crash fix, and we can't apply it until after
> 6.4 is out the door?
D'Arcy is going to release a patch for the fix on the same day as
the release, as it is something that only affects a very small aspect of
the server, and will affect very few ppl...

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Bruce Momjian
> > At Marc's request I am holding my patch until 6.4 is frozen.
> 
> Huh.  You have a backend crash fix, and we can't apply it until after
> 6.4 is out the door?

Well, it wasn't complete anyway since I wasn't sure what to do about
the operators.  However, I have since figured it out, I think.  I
think that if either argument to an operator function is NULL then
the function should return FALSE.  My reasoning is that it should be
consistent and people should be able to decide the sense of the test
for the results they expect.  Every test can be expressed as the
negation of another test so the user can decide whether they want
nulls to be included or not.  For example;

SELECT c < i FROM inet_tbl;
SELECT NOT (c >= i) FROM inet_tbl;

While both of the above selects are nominally equivalent, in the first
case rows where i1 or i2 is null will return 0 while in the second case
they return 1.  Does this seem right?

Now, after making the above changes to my local copy I tested it and I
found that when one of the fields was null, the bool result is blank, not
t or f.  I'm sure it goes to the functions as it crashes without the
changes so I'm not sure why it is blank after.

One more thing, I'm not sure how this gets fixed but the function results
are still right justified.  Weren't we going to left justify them?

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake The Hermit Hacker
> > Huh.  You have a backend crash fix, and we can't apply it until after
> > 6.4 is out the door?
> 
>     D'Arcy is going to release a patch for the fix on the same day as
> the release, as it is something that only affects a very small aspect of
> the server, and will affect very few ppl...

Not so sure about that, Marc.  It is, however, easy to work around.  You
just have to set NOT NULL for any inet or cidr type but I already have
a use for null ip numbers.  That's how I found the problem.

Of course, it's such a new type that I'm sure it won't be heavily used for
a little while anyway.

I just had another thought.  Do any other types exhibit this problem?
I think I'll go check enhance a few regression tests.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake D'Arcy J.M. Cain
> I just had another thought.  Do any other types exhibit this problem?
> I think I'll go check enhance a few regression tests.

OK, there are more problems.  If you apply the following patch to the
regression tests you will crash the backend in a number of places.  I
also added some null inserts in places that it didn't crash (the date
and time functions are pretty clean) but it will, of course, change
the expected output.

I won't send this to patches since it crashes the backend.  I'll let others
fix the tests for the types that they fix.


*** ../sql/abstime.sql    Sat May 31 22:30:19 1997
--- abstime.sql    Mon Nov  2 09:17:48 1998
***************
*** 23,28 ****
--- 23,30 ----  INSERT INTO ABSTIME_TBL (f1) VALUES ('May 10, 1947 23:59:12'); 
+ INSERT INTO ABSTIME_TBL (f1) VALUES (null);
+   -- what happens if we specify slightly misformatted abstime?  INSERT INTO ABSTIME_TBL (f1) VALUES ('Feb 35, 1946
10:00:00');
*** ../sql/char.sql    Sun Nov 30 21:46:01 1997
--- char.sql    Mon Nov  2 09:24:51 1998
***************
*** 30,35 ****
--- 30,38 ---- -- zero-length char  INSERT INTO CHAR_TBL (f1) VALUES (''); 
+ -- null input
+ INSERT INTO CHAR_TBL (f1) VALUES (null);
+  -- try char's of greater than 1 length  INSERT INTO CHAR_TBL (f1) VALUES ('cd'); 
*** ../sql/circle.sql    Tue Jul 29 12:22:44 1997
--- circle.sql    Mon Nov  2 09:26:18 1998
***************
*** 16,21 ****
--- 16,23 ----  INSERT INTO CIRCLE_TBL VALUES ('<(100,0),100>'); 
+ INSERT INTO CIRCLE_TBL VALUES (null);
+  -- bad values  INSERT INTO CIRCLE_TBL VALUES ('<(-100,0),-100>');
*** ../sql/datetime.sql    Fri Nov 14 21:55:57 1997
--- datetime.sql    Mon Nov  2 09:30:37 1998
***************
*** 18,23 ****
--- 18,24 ---- INSERT INTO DATETIME_TBL VALUES ('tomorrow'); INSERT INTO DATETIME_TBL VALUES ('tomorrow EST'); INSERT
INTODATETIME_TBL VALUES ('tomorrow zulu');
 
+ INSERT INTO DATETIME_TBL VALUES (null);  SELECT count(*) AS one FROM DATETIME_TBL WHERE d1 = 'today'::datetime;
SELECTcount(*) AS one FROM DATETIME_TBL WHERE d1 = 'tomorrow'::datetime;
 
***************
*** 42,47 ****
--- 43,49 ---- INSERT INTO DATETIME_TBL VALUES ('-infinity'); INSERT INTO DATETIME_TBL VALUES ('infinity'); INSERT INTO
DATETIME_TBLVALUES ('epoch');
 
+ INSERT INTO DATETIME_TBL VALUES (null);  -- Postgres v6.0 standard output format INSERT INTO DATETIME_TBL VALUES
('MonFeb 10 17:32:01 1997 PST');
 
*** ../sql/horology.sql    Sat Sep 20 12:34:07 1997
--- horology.sql    Mon Nov  2 09:34:55 1998
***************
*** 15,20 ****
--- 15,22 ----   WHERE d1 BETWEEN '13-jun-1957' AND '1-jan-1997'    OR d1 BETWEEN '1-jan-1999' AND '1-jan-2010'; 
+ INSERT INTO TEMP_DATETIME (f1) VALUES (null);
+  SELECT '' AS ten, f1 AS datetime   FROM TEMP_DATETIME   ORDER BY datetime;
*** ../sql/inet.sql    Thu Oct 29 13:13:03 1998
--- inet.sql    Mon Nov  2 09:36:06 1998
***************
*** 15,20 ****
--- 15,21 ---- INSERT INTO INET_TBL (c, i) VALUES ('10', '10.1.2.3/8'); INSERT INTO INET_TBL (c, i) VALUES ('10',
'11.1.2.3/8');INSERT INTO INET_TBL (c, i) VALUES ('10', '9.1.2.3/8');
 
+ INSERT INTO INET_TBL (c, i) VALUES (null, null);  SELECT '' AS ten, c AS cidr, i AS inet FROM INET_TBL; 
*** ../sql/lseg.sql    Tue Jul 29 12:22:46 1997
--- lseg.sql    Mon Nov  2 09:36:46 1998
***************
*** 10,15 ****
--- 10,16 ---- INSERT INTO LSEG_TBL VALUES ('10,-10 ,-3,-4'); INSERT INTO LSEG_TBL VALUES ('[-1e6,2e2,3e5, -4e1]');
INSERTINTO LSEG_TBL VALUES ('(11,22,33,44)');
 
+ INSERT INTO LSEG_TBL VALUES (null);  -- bad values for parser testing INSERT INTO LSEG_TBL VALUES ('(3asdf,2
,3,4r2)');
*** ../sql/name.sql    Mon Apr 27 09:50:03 1998
--- name.sql    Mon Nov  2 09:37:57 1998
***************
*** 28,33 ****
--- 28,35 ----  INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ'); 
+ INSERT INTO NAME_TBL(f1) VALUES (null);
+   SELECT '' AS seven, NAME_TBL.*; 
*** ../sql/path.sql    Tue Jun  3 10:23:37 1997
--- path.sql    Mon Nov  2 09:39:36 1998
***************
*** 22,27 ****
--- 22,29 ----  INSERT INTO PATH_TBL VALUES ('(11,12,13,14)'); 
+ INSERT INTO PATH_TBL VALUES (null);
+  -- bad values for parser testing INSERT INTO PATH_TBL VALUES ('[(,2),(3,4)]'); 
*** ../sql/point.sql    Wed Sep 24 13:55:38 1997
--- point.sql    Mon Nov  2 09:40:49 1998
***************
*** 12,17 ****
--- 12,19 ----  INSERT INTO POINT_TBL(f1) VALUES ('(-5.0,-12.0)'); 
+ INSERT INTO POINT_TBL(f1) VALUES (null);
+  -- bad format points  INSERT INTO POINT_TBL(f1) VALUES ('asdfasdf'); 
*** ../sql/polygon.sql    Tue Jul 29 12:22:48 1997
--- polygon.sql    Mon Nov  2 09:41:50 1998
***************
*** 25,30 ****
--- 25,32 ----  INSERT INTO POLYGON_TBL(f1) VALUES ('(0.0,1.0),(0.0,1.0)'); 
+ INSERT INTO POLYGON_TBL(f1) VALUES (null);
+  -- bad polygon input strings  INSERT INTO POLYGON_TBL(f1) VALUES ('0.0'); 
*** ../sql/reltime.sql    Thu May  8 23:26:51 1997
--- reltime.sql    Mon Nov  2 09:43:27 1998
***************
*** 12,17 ****
--- 12,19 ----  INSERT INTO RELTIME_TBL (f1) VALUES ('@ 14 seconds ago'); 
+ INSERT INTO RELTIME_TBL (f1) VALUES (null);
+   -- badly formatted reltimes:    INSERT INTO RELTIME_TBL (f1) VALUES ('badly formatted reltime');
*** ../sql/timespan.sql    Thu May  8 23:26:56 1997
--- timespan.sql    Mon Nov  2 09:46:19 1998
***************
*** 10,15 ****
--- 10,16 ---- INSERT INTO TIMESPAN_TBL (f1) VALUES ('6 years'); INSERT INTO TIMESPAN_TBL (f1) VALUES ('5 months');
INSERTINTO TIMESPAN_TBL (f1) VALUES ('5 months 12 hours');
 
+ INSERT INTO TIMESPAN_TBL (f1) VALUES (null);  -- badly formatted timespan:    INSERT INTO TIMESPAN_TBL (f1) VALUES
('badlyformatted timespan');
 
*** ../sql/tinterval.sql    Sat Sep 20 12:33:24 1997
--- tinterval.sql    Mon Nov  2 09:47:23 1998
***************
*** 15,20 ****
--- 15,22 ---- INSERT INTO TINTERVAL_TBL (f1)    VALUES ('["Feb 15 1990 12:15:03" "current"]'); 
+ INSERT INTO TINTERVAL_TBL (f1) VALUES (null);
+   -- badly formatted tintervals  INSERT INTO TINTERVAL_TBL (f1)

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
>> Huh.  You have a backend crash fix, and we can't apply it until after
>> 6.4 is out the door?

>     D'Arcy is going to release a patch for the fix on the same day as
> the release, as it is something that only affects a very small aspect of
> the server, and will affect very few ppl...

Of course, the other side of that argument is that D'Arcy's fix is a
very low-risk affair for the people who won't be using the inet types.
For the people who will be using them, it seems rather critical.

I think shipping the release without this fix is a mistake.

It sounds like D'Arcy could use a little advice/help from someone who
knows about proper handling of null inputs to functions and operators
(not me...).   But if someone like that will step up to the plate, I
don't see why we can't squeeze in a solid fix.
        regards, tom lane


Re: [HACKERS] A small problem with the new inet and cidr types

From
The Hermit Hacker
Date:
On Mon, 2 Nov 1998, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> Huh.  You have a backend crash fix, and we can't apply it until after
> >> 6.4 is out the door?
> 
> >     D'Arcy is going to release a patch for the fix on the same day as
> > the release, as it is something that only affects a very small aspect of
> > the server, and will affect very few ppl...
> 
> Of course, the other side of that argument is that D'Arcy's fix is a
> very low-risk affair for the people who won't be using the inet types.
> For the people who will be using them, it seems rather critical.
right, which is why we will be releasing a patch at the same time
as we release v6.4 ...

> I think shipping the release without this fix is a mistake.
and I think incorporating it at this late a date in the release
cycle would also be a mistake.  I'd rather a known bug with a patch then
an unknown bug created by fixing the known bug...
...for 99% of the users, they won't be using the INET/CIDR
stuff...for the 1% that will, they can easily download and install that
one patch to fix the problem...

> It sounds like D'Arcy could use a little advice/help from someone who
> knows about proper handling of null inputs to functions and operators
> (not me...).   But if someone like that will step up to the plate, I
> don't see why we can't squeeze in a solid fix.
Because we are less then 2 days away from a release, and we have
already over-shoot our release date by 1 month *because* of this feature.
In the future, as has been discussed by the core, this will not happen
again...we should never have held up the release for the INET/CIDR types.

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Lane
Date:
darcy@druid.net (D'Arcy J.M. Cain) writes:
> OK, there are more problems.  If you apply the following patch to the
> regression tests you will crash the backend in a number of places.

Yipes!

I must withdraw my prior opinion that we should shoehorn in a repair to
the INET datatypes for this case.  It's clear that we have a wideranging
problem that ought to be fixed more globally.  But presumably it's
been there for quite a while, and we didn't know it; therefore it's not
critical enough to hold up the release.

My guess is that maybe this should not be fixed in the individual
datatypes at all; instead the generic function and operator code should
be modified so that if any input value is NULL, then NULL is returned as
the result without ever calling the datatype-specific code.

There might be specific operators for which this is not the right
behavior (although none spring to mind immediately).  In that case,
I think the best bet would be to have a per-operator flag, defaulting
to OFF, which could be turned on for those specific operators that are
prepared to cope with null inputs.

Thoughts?
        regards, tom lane


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
> ... we have
> already over-shoot our release date by 1 month *because* of this feature.
> In the future, as has been discussed by the core, this will not happen
> again...we should never have held up the release for the INET/CIDR types.

Well, maybe not, but don't beat yourselves up about it.  An awful lot of
small but good things have happened in the last month in the way of bug
fixes and configure improvements.  Not to mention docs.  If you'd
released 6.4 a month ago, it would've been of substantially lower
quality than what will go out the door this week.  I don't think the
delay was such a bad thing.
        regards, tom lane


Re: [HACKERS] A small problem with the new inet and cidr types

From
"Thomas G. Lockhart"
Date:
> Well, maybe not, but don't beat yourselves up about it.  An awful lot 
> of small but good things have happened in the last month in the way of 
> bug fixes and configure improvements.  Not to mention docs.  If you'd
> released 6.4 a month ago, it would've been of substantially lower
> quality than what will go out the door this week.  I don't think the
> delay was such a bad thing.

That's been pointed out :)

But every release we try to learn something new (or are forced to
relearn something old). In this case, imho we delayed focusing the
developer's group on release-specific issues while waiting for these
additions, and could have/should have covered the same territory
starting Sept 1 rather than Oct 15. otoh, you did a lot of these fixes
and improvements over the last month, and only you can guess if you
would have been prepared/willing/in the right frame of mind to do the
same in September. Also, I got some great docs contributions since Sept
15 (about when I had planned on freezing the docs), and those things
would have had to wait for v6.5, at least for hardcopy versions.

So, the only issue really is trying to be a bit more decisive about
which features will go into a release, feature-freezing when we had
planned to (more or less), and then doing the right thing for testing
and packaging once we have frozen.

And then move on to the next release cycle with clear heads and great
plans ;)
                    - Tom


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Tom Lane
> My guess is that maybe this should not be fixed in the individual
> datatypes at all; instead the generic function and operator code should
> be modified so that if any input value is NULL, then NULL is returned as
> the result without ever calling the datatype-specific code.

Could it be tied to the return type?  IOW, functions or operators
that return bool return FALSE, text return "", etc.

> There might be specific operators for which this is not the right
> behavior (although none spring to mind immediately).  In that case,
> I think the best bet would be to have a per-operator flag, defaulting
> to OFF, which could be turned on for those specific operators that are
> prepared to cope with null inputs.

Obviously that will have to wait for 6.5 since it requires an initdb
to add the field.  Do we want to wait that long?

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


RE: [HACKERS] A small problem with the new inet and cidr types

From
"Taral"
Date:
> My guess is that maybe this should not be fixed in the individual
> datatypes at all; instead the generic function and operator code should
> be modified so that if any input value is NULL, then NULL is returned as
> the result without ever calling the datatype-specific code.

AFAICT, the function code returns blank when the input is NULL, regardless
of the function definition... this came up before when someone tried to
extend the functions and found that func(NULL) called func, but disregarded
the return value...

Taral



Re: [HACKERS] A small problem with the new inet and cidr types

From
Bruce Momjian
Date:
> Thus spake Tom Lane
> > My guess is that maybe this should not be fixed in the individual
> > datatypes at all; instead the generic function and operator code should
> > be modified so that if any input value is NULL, then NULL is returned as
> > the result without ever calling the datatype-specific code.
> 
> Could it be tied to the return type?  IOW, functions or operators
> that return bool return FALSE, text return "", etc.
> 
> > There might be specific operators for which this is not the right
> > behavior (although none spring to mind immediately).  In that case,
> > I think the best bet would be to have a per-operator flag, defaulting
> > to OFF, which could be turned on for those specific operators that are
> > prepared to cope with null inputs.
> 
> Obviously that will have to wait for 6.5 since it requires an initdb
> to add the field.  Do we want to wait that long?

The only thing I can add here is to look at the other functions, and do
what they do.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Lane
Date:
darcy@druid.net (D'Arcy J.M. Cain) writes:
>> My guess is that maybe this should not be fixed in the individual
>> datatypes at all; instead the generic function and operator code should
>> be modified so that if any input value is NULL, then NULL is returned as
>> the result without ever calling the datatype-specific code.

> Could it be tied to the return type?  IOW, functions or operators
> that return bool return FALSE, text return "", etc.

That strikes me as really dangerous.  What you are basically proposing
is to overload a perfectly good value of each type as meaning "maybe
it's really this data value, and maybe it's the result of an operation
on NULL".  We have a good way to represent an unknown/undefined result,
and that is to return NULL; we shouldn't replace that with a kluge.

Compare for example the IEEE floating point standards, in which there
are specific "exceptional values" (Not a Numbers, or NaNs).  NaNs
propagate through operations --- for example, NaN + 23 yields NaN, not
0 or any other ordinary data value.  The people who defined it that way
knew what they were doing.  Practically every formal computational model
invented in the last twenty years has had a similar concept of an
"undefined" or "bottom" value, and they all treat it that way.

Now, any Postgres implementation running on an IEEE-float platform will
automatically do the right things in the float4 and float8 operators,
so NaNs in a database should do the right things.  (BTW, do the input
and output conversion operators for floats support a representation for
NaNs?  If not that should be on the to-do list...)  But we need a
similar concept for all the other data types.  As far as I can see,
NULL should be it.

(In fact, I'd be surprised if the SQL standard doesn't mandate that
behavior.  I don't have a copy to look at, however.)

It sounds like the function/operator support already gets this almost
right, but it shouldn't call the datatype-specific routine at all if
it's not going to use the result (IMHO anyway).

>> There might be specific operators for which this is not the right
>> behavior (although none spring to mind immediately).  In that case,
>> I think the best bet would be to have a per-operator flag, defaulting
>> to OFF, which could be turned on for those specific operators that are
>> prepared to cope with null inputs.

> Obviously that will have to wait for 6.5 since it requires an initdb
> to add the field.  Do we want to wait that long?

I haven't yet heard an example case demonstrating that it's necessary
at all...
        regards, tom lane


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Bruce Momjian
> The only thing I can add here is to look at the other functions, and do
> what they do.

Uh, they crash the backend.  :-)

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Taral
> > My guess is that maybe this should not be fixed in the individual
> > datatypes at all; instead the generic function and operator code should
> > be modified so that if any input value is NULL, then NULL is returned as
> > the result without ever calling the datatype-specific code.
> 
> AFAICT, the function code returns blank when the input is NULL, regardless
> of the function definition... this came up before when someone tried to
> extend the functions and found that func(NULL) called func, but disregarded
> the return value...

Well that sure fits with my observations.  Sure seems wrong though.  We
should either use the return value or don't call the function in the
first place.  I vote for the latter even though I have spent the time
fixing inet.  It seems like the proper method.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Bruce Momjian
Date:
> Thus spake Bruce Momjian
> > The only thing I can add here is to look at the other functions, and do
> > what they do.
> 
> Uh, they crash the backend.  :-)

Oh.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Ivar Helbekkmo
Date:
The Hermit Hacker <scrappy@hub.org> writes:

>     and I think incorporating it at this late a date in the release
> cycle would also be a mistake.

While we're talking of release cycles, I'd like to take the liberty to
post something that I've been keeping around for a while, which was
written by Dave Burgess (of *BSD FAQ fame) some time back.  He was
posting to a NetBSD mailing list, so there is wording in here that's
specific to that environment, but there is much good, general truth
here.  With respect to the CIDR case, note especially his "get it
right or get it out" rule.  Translated to PostgreSQL 6.4 terms, we
get: "Whoah!  This wasn't what we wanted!  Decide quick: do we ship
what we have, or do we ditch it and do it right for 6.5?".  We chose
to do neither, and there's a lesson to be learned from this.

Let me also take this opportunity to submit a plug for "The Mythical
Man-Month" by Frederick P. Brooks.  Anyone involved in software
development who hasn't yet read it, should.  Get the 1995 special
edition, expanded and edited from the original.

Anyway, here's Dave's writing on release engineering:

| 1.  When the interface to a system changes radically, the major number
| should be bumped.  The minor number should be reset to 0.
| 
| 2.  When more than 50% of the code in a system has been changed, the
| major number should be bumped EXCEPT if the minor number is 0.  This
| takes into account the case where a recent interface change drives a
| large code churn.
| 
| 3.  There should be a target list of changes for each release.  When
| those changes are accomplished, the minor number should be bumped unless
| the minor number was reset to 0 by 1 or 2 (above).  The target list
| should also have a target date (six months is a good consensus
| timeframe, from what I've heard). 
| 
| NOTE:  If the list only has one item on it, but a bunch of other
| projects have been started, then when that single item is ready to roll,
| everyone has to be ready to either get it right or get it out.
| 
| 4.  Once the target list is complete, all changes in progress are
| brought to closure and the system is Alpha released.  The Alpha baseline
| is established.  This means either backing out the work in progress or
| bringing the work to a state where it is working or will not impact
| performance.
| 
| 5.  The bugs from the Alpha test cycle are documented and analyzed.  If
| there are no Class A or Class B problems (A being "causes damage to
| equipment or files" and B being "causes reproducable system halts and
| panics") the code should be promoted to Beta.  If there are, these
| problems are addressed.  The Alpha is then re-released.  All SPR driven 
| changes are made to the Alpha Baseline code and the -current code.
| 
| 6.  Once the Beta has been released, a firm release date is established.
| For most of the Govt project I've worked on, that is 28 days out.  If no
| new Class A or Class B SPRs are received, the code is released on the
| release date.  If there are Class A or Class B SPRs, the changes are
| made.  The release date does not slip.  For us, a 14 day lead should be
| plenty.
| 
| 7.  If the release date cannot be met, a new Beta is released and a new
| release date is established.
| 
| The advantage of this is that there are a few driving upgrades that
| force new releases.  As an example, we could have established a short
| list of "2.0 candidate targets."  which could have included "bounce
| buffer support" and "working 100MB/s drivers for all existing network
| cards."
| 
| The logistics for this are daunting, but are workable.  We are already
| working using good CM practices, our testing is good, and the source
| code baseline could literally happen almost anytime.
| 
| The trick, then, is to find someone willing to do the Release
| Engineering.  It's a shame we don't know anyone with years of experience
| doing this....
| 
| Now for the schedule in real terms:
| 
| Day 0 - The list of existing "work in progress" and "need to fix"
| problems is culled for the next release.
| 
| Day 20 - The list is agreed upon; if anyone disagrees with the
| feasability of an item for the releas in 5 months, it is defered.
| 
| 100 days pass for people to work on their projects.  This should be
| enough to at least get to the point that you know whether you will make
| the deadline.
| 
| Day 120 - The release candidate is checked for completion of the list.
| The list is reviewed for possible deletions.  Work continues.
| 
| Day 150 - The release candidate is checked.  This is the "drop dead" for
| the projects in work.  Everything that is working stays, everything else
| is either removed or neutered.  This is the Alpha release.  At this
| point, the -current tree is baselined, as is.  Work on -current
| continues while the Alpha and Beta trees are used strictly for Problem
| reports.  Yes, this involves making the changes twice; once in the Alpha
| tree and once in -current.
| 
| Day 165 - The Beta is released.  Changes to the Alpha source tree are
| included and the Beta tree is baselined.
| 
| Day 180 - The release is cut and the next version list is started,  Day
| 0 is today for release +.1.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Tom Ivar Helbekkmo
> While we're talking of release cycles, I'd like to take the liberty to
> post something that I've been keeping around for a while, which was
> written by Dave Burgess (of *BSD FAQ fame) some time back.  He was
> posting to a NetBSD mailing list, so there is wording in here that's
> specific to that environment, but there is much good, general truth
> here.  With respect to the CIDR case, note especially his "get it
> right or get it out" rule.  Translated to PostgreSQL 6.4 terms, we
> get: "Whoah!  This wasn't what we wanted!  Decide quick: do we ship
> what we have, or do we ditch it and do it right for 6.5?".  We chose
> to do neither, and there's a lesson to be learned from this.

Perhaps but let me point out a few things.
 - The code wasn't that late.  Other things were holding up release   and the inet/cidr stuff was in by the drop dead
datewe were given   at the time.
 
 - We may not have followed the release method you quote but that   wasn't our release method at the time.  Perhaps it
shouldbe   but let's not be blinded by hindsight.
 
 - The latest problem has turned out to be a general problem with   the system.  Rather than being an added bug, having
thisin now   just allowed us to find an existing problem sooner than we might   have otherwise.
 
 - Because of a few late night coding sessions we now have a really   cool feature and we beat the market.  That's
somethingwe can use in   our announcement.  Putting this feature in, even if we pushed the   deadline a little, didn't
weakenthe product, it strengthened it.
 

I don't want to sound defensive but people have been sounding negative
about having this type in.  I just want to make sure that we all see
the positive side of this too.

> Let me also take this opportunity to submit a plug for "The Mythical
> Man-Month" by Frederick P. Brooks.  Anyone involved in software
> development who hasn't yet read it, should.  Get the 1995 special
> edition, expanded and edited from the original.

I'll second that.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
The Hermit Hacker
Date:
On Mon, 2 Nov 1998, D'Arcy J.M. Cain wrote:

> Perhaps but let me point out a few things.
> 
>   - The code wasn't that late.  Other things were holding up release
>     and the inet/cidr stuff was in by the drop dead date we were given
>     at the time.
No disagreement on that from here...

>   - Because of a few late night coding sessions we now have a really
>     cool feature and we beat the market.  That's something we can use in
>     our announcement.  Putting this feature in, even if we pushed the
>     deadline a little, didn't weaken the product, it strengthened it.
No disagreement here either...to a certain extent.  What we shoudl
have done was followed our original release schedualed and looked at
getting the INET/CIDR stuff into v6.5 instead.  But, hindsight is 20/20
and its in now...

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] A small problem with the new inet and cidr types

From
jwieck@debis.com (Jan Wieck)
Date:
>
> Thus spake Taral
> > > My guess is that maybe this should not be fixed in the individual
> > > datatypes at all; instead the generic function and operator code should
> > > be modified so that if any input value is NULL, then NULL is returned as
> > > the result without ever calling the datatype-specific code.
> >
> > AFAICT, the function code returns blank when the input is NULL, regardless
> > of the function definition... this came up before when someone tried to
> > extend the functions and found that func(NULL) called func, but disregarded
> > the return value...
>
> Well that sure fits with my observations.  Sure seems wrong though.  We
> should either use the return value or don't call the function in the
> first place.  I vote for the latter even though I have spent the time
> fixing inet.  It seems like the proper method.

    Not  calling  a  function  if  one of it's arguments is NULL?
    Isn't NULL a legal value?

    I know that the function manager interface is  damned  stupid
    in  the  case of NULL's. Some of the interface functions pass
    isNull as in/out value and some do not. And the in value only
    tells if any of the arguments are NULL, not which of them. It
    hit me when building PL/pgSQL and PL/Tcl.

    Let's redesign the function call interface  and  define  that
    any  function  has  to handle NULL arguments properly. Yes, I
    know what that means :-).


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] A small problem with the new inet and cidr types

From
Hannu Krosing
Date:
Jan Wieck wrote:
> 
>     Not  calling  a  function  if  one of it's arguments is NULL?
>     Isn't NULL a legal value?
> 
>     I know that the function manager interface is  damned  stupid
>     in  the  case of NULL's. Some of the interface functions pass
>     isNull as in/out value and some do not. And the in value only
>     tells if any of the arguments are NULL, not which of them. It
>     hit me when building PL/pgSQL and PL/Tcl.
> 
>     Let's redesign the function call interface  and  define  that
>     any  function  has  to handle NULL arguments properly. Yes, I
>     know what that means :-).

An easier way would be _not_ to call a function with NULL arguments, 
_unless_ it declares that it can handle them.

It would probably do the right thing in many (most?) places.

But we do need to redesign not only the function _call_ interface, but
most likely also the function _definition_ mechanisms as well.

On the whole I would like the functions to be more object-like, 
meaning that they should have methods (metadata) that could tell 
things about them. In addition to an method that tells if the 
function can take NULL as an argument and in what positions, 
I envision we could also use a method that would tell the max print 
length and other "field" attribute of the function return value given 
the attributes of function arguments. For example we would be able to 
determine that the max length of concatenating varchar(5) and varchar(7) 
is varchar(12) and not a varchar of infinite length as we have to 
assume now.

Is'nt PostgreSQL supposed to be somewhat OO DBMS ? 

----------
Hannu


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Jan Wieck
> > Well that sure fits with my observations.  Sure seems wrong though.  We
> > should either use the return value or don't call the function in the
> > first place.  I vote for the latter even though I have spent the time
> > fixing inet.  It seems like the proper method.
> 
>     Not  calling  a  function  if  one of it's arguments is NULL?
>     Isn't NULL a legal value?

Sure but what is the reasonable thing to do if we perform a function
on a null?  Let's take the inet/cidr functions that started this.  If
i is a null field then these seem reasonable conversions.
 host(i) ==>  null network(i) ==> null broadcast(i) ==> null

...etc.

There may be cases where a function of a null is not null as some people
have pointed out but so far no one has come up with a practical example.

I suggested that the actual return value could depend on the return
type.  The only reason I suggested that was for the case of boolean
returns.  I see some merit in being able to decide in my select whether
or not to include rows where one or more operators is null but even
there it is probably of marginal utility.  If it is absolutely essential
to display every row then you can always constrain the field to NOT
NULL and use a marker value for null.

>     Let's redesign the function call interface  and  define  that
>     any  function  has  to handle NULL arguments properly. Yes, I
>     know what that means :-).

Well, let's hurry up and decide this so I know whether or not to
clean out my local patches to inet/cidr.  :-)

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Hannu Krosing
Date:
D'Arcy J.M. Cain wrote:
> 
> There may be cases where a function of a null is not null as some people
> have pointed out but so far no one has come up with a practical example.

isnull(field)

is_any_null(field1,field2,field3)

are_all_nulls(field1,field2,field3)

value_or_default(NULL,defaultvalue)

---------
Hannu


Re: [HACKERS] A small problem with the new inet and cidr types

From
darcy@druid.net (D'Arcy J.M. Cain)
Date:
Thus spake Hannu Krosing
> D'Arcy J.M. Cain wrote:
> > There may be cases where a function of a null is not null as some people
> > have pointed out but so far no one has come up with a practical example.
> 
> isnull(field)
> 
> is_any_null(field1,field2,field3)
> 
> are_all_nulls(field1,field2,field3)
> 
> value_or_default(NULL,defaultvalue)

I meant in the specific type functions.  These functions seem like they
can easily be handled at a higher level and still never call the type
function code.  IOW, if these functions are considered useful, they
should be implemented at the function dispatch level.

That last one seems particularly useful to me and, in fact, could handle
the issue of requiring functions to handle nulls all by itself.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Lane
Date:
darcy@druid.net (D'Arcy J.M. Cain) writes:
> Thus spake Hannu Krosing
>> D'Arcy J.M. Cain wrote:
>>>> There may be cases where a function of a null is not null as some people
>>>> have pointed out but so far no one has come up with a practical example.
>> 
>> isnull(field)
>> 
>> is_any_null(field1,field2,field3)
>> 
>> are_all_nulls(field1,field2,field3)
>> 
>> value_or_default(NULL,defaultvalue)

> I meant in the specific type functions.  These functions seem like they
> can easily be handled at a higher level and still never call the type
> function code.  IOW, if these functions are considered useful, they
> should be implemented at the function dispatch level.

In fact they would *have* to be implemented as generics.  If we tried
to implement them at the type-specific level, then we'd have to assign
specific types to the input and output values --- in other words, we'd
need a separate implementation for every possible combination of input
types.

The question at hand is whether any ordinary type-specific operators
or functions need to be non-strict (ie, capable of returning a non-NULL
output given one or more NULL inputs).  Before we go to the trouble
of looking at/fixing every single type-specific operator or function
routine in the system, I'd like to see some concrete examples why
just turning off the spigot at the type-independent function dispatch
routine isn't good enough.

NOTE: aggregates definitely must be non-strict; I don't want SUM()
to come back with a NULL if I happen to include a NULL in its inputs.
But there are a lot fewer aggregate functions to look at than plain
old scalar functions.
        regards, tom lane


Re: [HACKERS] A small problem with the new inet and cidr types

From
Bruce Momjian
Date:
>     Not  calling  a  function  if  one of it's arguments is NULL?
>     Isn't NULL a legal value?
> 
>     I know that the function manager interface is  damned  stupid
>     in  the  case of NULL's. Some of the interface functions pass
>     isNull as in/out value and some do not. And the in value only
>     tells if any of the arguments are NULL, not which of them. It
>     hit me when building PL/pgSQL and PL/Tcl.
> 
>     Let's redesign the function call interface  and  define  that
>     any  function  has  to handle NULL arguments properly. Yes, I
>     know what that means :-).
> 

Added to TODO list:
* redesign the function call interface to handle NULLs better


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] A small problem with the new inet and cidr types

From
"Thomas G. Lockhart"
Date:
> * redesign the function call interface to handle NULLs better

I was planning on looking at this for v6.5, at least in the context of
trying to solve the problem of returning NULL for pass-by-value types.

We should have some discussion of pass-by-value vs. pass-by-reference
and whether it is worth having both mechanisms for common data types. As
it is, functions which return int2 or int4 cannot return NULL because
there is no way to represent that with these types. I was thinking of
implementing true smallint/integer pass-by-reference types to clean this
up.
                     - Tom


Re: [HACKERS] A small problem with the new inet and cidr types

From
jwieck@debis.com (Jan Wieck)
Date:
>
> > * redesign the function call interface to handle NULLs better
>
> I was planning on looking at this for v6.5, at least in the context of
> trying to solve the problem of returning NULL for pass-by-value types.
>
> We should have some discussion of pass-by-value vs. pass-by-reference
> and whether it is worth having both mechanisms for common data types. As
> it is, functions which return int2 or int4 cannot return NULL because
> there is no way to represent that with these types. I was thinking of
> implementing true smallint/integer pass-by-reference types to clean this
> up.

    Actually  they  can  -  but  only  if  they  take exactly one
    argument.  fmgr_c()  calls  those  ones  with  an  additional
    isNull bool pointer.

    The  mess  is,  that we have different entry points to call a
    function, some of them pass information about NULL  and  some
    not.  What I had in mind was to finally have one single entry
    point that handles NULL for any argument and for  the  return
    value.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] A small problem with the new inet and cidr types

From
Bruce Momjian
Date:
I am replying to something posted by D'Arcy on November 2nd, so I will
quote the whole thing:

> Thus spake Tom Ivar Helbekkmo
> > While we're talking of release cycles, I'd like to take the liberty to
> > post something that I've been keeping around for a while, which was
> > written by Dave Burgess (of *BSD FAQ fame) some time back.  He was
> > posting to a NetBSD mailing list, so there is wording in here that's
> > specific to that environment, but there is much good, general truth
> > here.  With respect to the CIDR case, note especially his "get it
> > right or get it out" rule.  Translated to PostgreSQL 6.4 terms, we
> > get: "Whoah!  This wasn't what we wanted!  Decide quick: do we ship
> > what we have, or do we ditch it and do it right for 6.5?".  We chose
> > to do neither, and there's a lesson to be learned from this.
> 
> Perhaps but let me point out a few things.
> 
>   - The code wasn't that late.  Other things were holding up release
>     and the inet/cidr stuff was in by the drop dead date we were given
>     at the time.
> 
>   - We may not have followed the release method you quote but that
>     wasn't our release method at the time.  Perhaps it should be
>     but let's not be blinded by hindsight.
> 
>   - The latest problem has turned out to be a general problem with
>     the system.  Rather than being an added bug, having this in now
>     just allowed us to find an existing problem sooner than we might
>     have otherwise.
> 
>   - Because of a few late night coding sessions we now have a really
>     cool feature and we beat the market.  That's something we can use in
>     our announcement.  Putting this feature in, even if we pushed the
>     deadline a little, didn't weaken the product, it strengthened it.
> 
> I don't want to sound defensive but people have been sounding negative
> about having this type in.  I just want to make sure that we all see
> the positive side of this too.
> 
> > Let me also take this opportunity to submit a plug for "The Mythical
> > Man-Month" by Frederick P. Brooks.  Anyone involved in software
> > development who hasn't yet read it, should.  Get the 1995 special
> > edition, expanded and edited from the original.
> 
> I'll second that.

OK.  I want to take responsibility for the one-month delay of 6.4.

If people remember, I posted the mega-patch mid-August, which added
multi-key indexes to the system tables, improved the heap API with
proper locking, and improved system cache usage.

This patch generated several problems, some of them bugs, some of them
areas I missed in adding multi-key system indexes.  We couldn't go live
with beta on September 1 because we had major problems doing initdb on
some platforms.  We kind of started the beta, but without all platforms
participating, it was pretty useless, and we started to let features in
during the month while we struggled to get the mega-patch problems
resolved.

We didn't start serious beta until October 1, and D'Arcy got caught with
Paul Vixie defining/adding the IP type.  D'Arcy was just left holding
the bag when people started to worry about the delay.  It was not his
fault.

It was me, putting a 18k line patch in two weeks before beta.  Of
course, I started earlier than mid-August, but it takes me a while to
generate 18k lines of diff.  :-)

Just wanted to clear that up.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] A small problem with the new inet and cidr types

From
"D'Arcy" "J.M." Cain
Date:
Thus spake Bruce Momjian
> > I don't want to sound defensive but people have been sounding negative
> > about having this type in.  I just want to make sure that we all see
> > the positive side of this too.
> > 
> OK.  I want to take responsibility for the one-month delay of 6.4.

No hair shirts, please.  :-)

> We didn't start serious beta until October 1, and D'Arcy got caught with
> Paul Vixie defining/adding the IP type.  D'Arcy was just left holding
> the bag when people started to worry about the delay.  It was not his
> fault.

Just to be clear, I was defending the type, not myself.  We can't take
any minor disagreements personally.  If we do we start splintering the
group and that's no good for the project as a whole.

The bottom line is that there was a little bit of a problem getting
6.4 out the door but we wound up with a superior product.  We also
have some ideas for improving the release methodology for next time.
Given what we have I don't believe anyone needs to beat themself up
over some minor problem that we had getting there.

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] A small problem with the new inet and cidr types

From
Tom Ivar Helbekkmo
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:

> Just wanted to clear that up.

Just for the record, I was _not_ attacking D'Arcy or anyone else!  If
anyone, I'm the one who's to blame for the delays and misunderstanding
over the IP (CIDR) data type, since what I implemented (based on code
from Paul Vixie), was not actually in accordance with prior consensus
on pgsql-hackers.

My posting, the one that D'Arcy followed up and you then commented on
now, was meant to suggest that when it finally became clear that I'd
misunderstood, and implemented something different from what was
wanted, it was really too close to the release date to try to fix it.
The best handling would, in my opinion, have been to quickly decide
whether to use it anyway or postpone it until the next release.

As it turned out, D'Arcy did have time to modify and extend the code
into an acceptable solution, so everything actually worked out well.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"