Thread: Odd behavior with NULL value

Odd behavior with NULL value

From
bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.)
Date:
See example below of using NULL values with type DATE.  It behaves
strangely in expressions, "(x <> null)" gives an entirely different
result than "not(x = null)".  Is this intended behavior, if so, why?
If not, is this a bug?

On a related note, does anyone know if 'current' works with DATE?  With
TIMESTAMP it always evaluates to the time as of the retrieval of the
value, not as of the insertion, but for DATE it looks like it gets
evaluated at insertion (I can't tell for sure from my test db for another
8 hours or so, I could just roll the date forward on the server, but I'd
like to keep my job...)

Thanks to anyone who can shed some light on this!

rsj=> \d test
          Table "test"
 Attribute |  Type   | Modifier
-----------+---------+----------
 key       | integer |
 date      | date    |

rsj=> select * from test;
 key |    date
-----+------------
   1 | 2001-12-20
   2 | 2001-12-20
   3 |
(3 rows)

rsj=> select * from test where date = null;
 key | date
-----+------
   3 |
(1 row)

rsj=> select * from test where date <> null;
 key | date
-----+------
(0 rows)

rsj=> select * from test where not(date = null);
 key |    date
-----+------------
   1 | 2001-12-20
   2 | 2001-12-20
(2 rows)

rsj=>


   |\      _,,,---,,_        Bob Smith
   /,`.-'`'    -.  ;-;;,_    Hammett & Edison, Inc.
  |,4-  ) )-,_. ,\ (  `'-'   bsmith@h-e.com
 '---''(_/--'  `-'\_)

Re: Odd behavior with NULL value

From
Bruno Wolff III
Date:
On Thu, Dec 20, 2001 at 04:24:33PM -0800,
  "Bob Smith, Hammett & Edison, Inc." <bsmith@h-e.com> wrote:
> See example below of using NULL values with type DATE.  It behaves
> strangely in expressions, "(x <> null)" gives an entirely different
> result than "not(x = null)".  Is this intended behavior, if so, why?
> If not, is this a bug?

There is a hack which will be off by default in 7.2 that changes
'x = null' into 'x is null' for compatibility with sqlserver.

Re: Odd behavior with NULL value

From
Phill Kenoyer
Date:
I think that this page will explain it all:
http://www.postgresql.org/idocs/index.php?functions-comparison.html

|On 011220 17:05
|Bob Smith, Hammett & Edison, Inc. (bsmith@h-e.com) wrote the following...
|
|See example below of using NULL values with type DATE.  It behaves
|strangely in expressions, "(x <> null)" gives an entirely different
|result than "not(x = null)".  Is this intended behavior, if so, why?
|If not, is this a bug?
|
|On a related note, does anyone know if 'current' works with DATE?  With
|TIMESTAMP it always evaluates to the time as of the retrieval of the
|value, not as of the insertion, but for DATE it looks like it gets
|evaluated at insertion (I can't tell for sure from my test db for another
|8 hours or so, I could just roll the date forward on the server, but I'd
|like to keep my job...)
|
|Thanks to anyone who can shed some light on this!
|
|rsj=> \d test
|          Table "test"
| Attribute |  Type   | Modifier
|-----------+---------+----------
| key       | integer |
| date      | date    |
|
|rsj=> select * from test;
| key |    date
|-----+------------
|   1 | 2001-12-20
|   2 | 2001-12-20
|   3 |
|(3 rows)
|
|rsj=> select * from test where date = null;
| key | date
|-----+------
|   3 |
|(1 row)
|
|rsj=> select * from test where date <> null;
| key | date
|-----+------
|(0 rows)
|
|rsj=> select * from test where not(date = null);
| key |    date
|-----+------------
|   1 | 2001-12-20
|   2 | 2001-12-20
|(2 rows)
--
Reason: Typo in the code
                                               _   |       _
                                              (_()(|('.|)('||.|()|`|(

Re: Odd behavior with NULL value

From
Tom Lane
Date:
bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.) writes:
> On a related note, does anyone know if 'current' works with DATE?

DATE does not have an internal representation of 'current', so the DATE
input parser treats it the same as 'now'.  AFAIR only timestamp (nee
datetime) has that concept.

FYI, the concept of 'current' has been deemed broken and removed
altogether for 7.2.  See discussions in the pgsql-hackers archives
if you want to know why.

            regards, tom lane

Re: Odd behavior with NULL value

From
Stephan Szabo
Date:
On Thu, 20 Dec 2001, Bruno Wolff III wrote:

> On Thu, Dec 20, 2001 at 04:24:33PM -0800,
>   "Bob Smith, Hammett & Edison, Inc." <bsmith@h-e.com> wrote:
> > See example below of using NULL values with type DATE.  It behaves
> > strangely in expressions, "(x <> null)" gives an entirely different
> > result than "not(x = null)".  Is this intended behavior, if so, why?
> > If not, is this a bug?
>
> There is a hack which will be off by default in 7.2 that changes
> 'x = null' into 'x is null' for compatibility with sqlserver.

As a note, I think it was actually for some feature of Access. :)


Re: Odd behavior with NULL value

From
bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.)
Date:
On Thu, Dec 20, 2001, 18:55:18 Tom Lane wrote:

>bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.) writes:
>> On a related note, does anyone know if 'current' works with DATE?
>
>DATE does not have an internal representation of 'current', so the DATE
>input parser treats it the same as 'now'.  AFAIR only timestamp (nee
>datetime) has that concept.
>
>FYI, the concept of 'current' has been deemed broken and removed
>altogether for 7.2.  See discussions in the pgsql-hackers archives
>if you want to know why.
>
>            regards, tom lane

Here is the problem I'm trying to solve, perhaps someone can help.  For an
invoicing system database, I have a table that defines employees.  Each has a
begin and end date defining the employment period.  For current employees, the
end date is "open".  How do I express that to keep queries as simple as
possible?  The three choices I came up with are:

  (1)  Use end_date = 'current'

  (2)  Store end_date as TEXT and cast it to DATE in expressions, so it can
       contain the text constant 'today' for current employees

  (3)  Use end_date = NULL and have an extra expression in the queries

Because 'current' doesn't work for DATE types, (1) is a bust.  (2) and (3)
both work, but I'm not sure which is better from a performance point of view.
 For example, if I want all the employees that are current as of a particular
date, for (2) it would be:

  SELECT * FROM employee WHERE (foo >= employee.start_date) AND
                               (foo <= employee.end_date::DATE)

and for (3) it would be:

  SELECT... WHERE (foo >= employee.start_date) AND
                  ((foo <= employee.end_date) OR (employee.end_date IS NULL))

(Thanks to all who posted with explanations of why (x IS NULL) should be used
instead of (x = NULL)).  The cast has a performance penalty, but then so does
using OR, especially in a join.  Which would be worse?

I just noticed that (3) does have one advantage over (2); if the system allows
end_date to be set into the future, (3) works for dates in the future, but (2)
does not.  But that isn't one of my requirements so it isn't a deciding
factor.

Any opinions on which approach is better, or does anyone see a fourth
alternative?

Thanks!


   |\      _,,,---,,_        Bob Smith
   /,`.-'`'    -.  ;-;;,_    Hammett & Edison, Inc.
  |,4-  ) )-,_. ,\ (  `'-'   bsmith@h-e.com
 '---''(_/--'  `-'\_)

Re: Odd behavior with NULL value

From
Tom Lane
Date:
bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.) writes:
> Here is the problem I'm trying to solve, perhaps someone can help.
> For an invoicing system database, I have a table that defines
> employees.  Each has a begin and end date defining the employment
> period.  For current employees, the end date is "open".  How do I
> express that to keep queries as simple as possible?

NULL seems like the most obvious and portable answer, though you might
have to phrase queries carefully to make it work.  Another possibility
is to use timestamp rather than date, so you could use "+infinity" for
the end date of active employees.

            regards, tom lane

Re: Odd behavior with NULL value

From
Bill Cunningham
Date:
I apologize if this is sent twice.


The simplest solution would be to use a view over the data hiding the
mess of the case statement so that ad-hoc users' queries would still work.

- Bill

Bob Smith, Hammett & Edison, Inc. wrote:

>On Thu, Dec 20, 2001, 18:55:18 Tom Lane wrote:
>
>>bsmith@h-e.com (Bob Smith, Hammett & Edison, Inc.) writes:
>>
>>>On a related note, does anyone know if 'current' works with DATE?
>>>
>>DATE does not have an internal representation of 'current', so the DATE
>>input parser treats it the same as 'now'.  AFAIR only timestamp (nee
>>datetime) has that concept.
>>
>>FYI, the concept of 'current' has been deemed broken and removed
>>altogether for 7.2.  See discussions in the pgsql-hackers archives
>>if you want to know why.
>>
>>            regards, tom lane
>>
>
>Here is the problem I'm trying to solve, perhaps someone can help.  For an
>invoicing system database, I have a table that defines employees.  Each has a
>begin and end date defining the employment period.  For current employees, the
>end date is "open".  How do I express that to keep queries as simple as
>possible?  The three choices I came up with are:
>
>  (1)  Use end_date = 'current'
>
>  (2)  Store end_date as TEXT and cast it to DATE in expressions, so it can
>       contain the text constant 'today' for current employees
>
>  (3)  Use end_date = NULL and have an extra expression in the queries
>
>Because 'current' doesn't work for DATE types, (1) is a bust.  (2) and (3)
>both work, but I'm not sure which is better from a performance point of view.
> For example, if I want all the employees that are current as of a particular
>date, for (2) it would be:
>
>  SELECT * FROM employee WHERE (foo >= employee.start_date) AND
>                               (foo <= employee.end_date::DATE)
>
>and for (3) it would be:
>
>  SELECT... WHERE (foo >= employee.start_date) AND
>                  ((foo <= employee.end_date) OR (employee.end_date IS NULL))
>
>(Thanks to all who posted with explanations of why (x IS NULL) should be used
>instead of (x = NULL)).  The cast has a performance penalty, but then so does
>using OR, especially in a join.  Which would be worse?
>
>I just noticed that (3) does have one advantage over (2); if the system allows
>end_date to be set into the future, (3) works for dates in the future, but (2)
>does not.  But that isn't one of my requirements so it isn't a deciding
>factor.
>
>Any opinions on which approach is better, or does anyone see a fourth
>alternative?
>
>Thanks!
>
>
>   |\      _,,,---,,_        Bob Smith
>   /,`.-'`'    -.  ;-;;,_    Hammett & Edison, Inc.
>  |,4-  ) )-,_. ,\ (  `'-'   bsmith@h-e.com
> '---''(_/--'  `-'\_)
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>