Thread: SQL question

SQL question

From
Carolyn Lu Wong
Date:
I have table with the following definition:
create table table1(    account_no    int4,    start_date_tme    datetime    ....);

The table may contain null values for start_date_time.

When I run the following SQL query, it fails:
select * from table1where start_date_time::date >= '01/01/2000'::date and start_date_time::date <= '01/01/2001'::date;

I get error message 'Unable to convert null datetime to date.

It's fine if I run the same SQL query with added condition as follows:
select * from table1
>    where account_no = 1and start_date_time::date >= '01/01/2000'::date and start_date_time::date <=
'01/01/2001'::date;

Is this a bug? Or there's an logical explaination for this?


Re: SQL question

From
Philip Warner
Date:
At 12:07 17/07/00 +1000, Carolyn Lu Wong wrote:
>I have table with the following definition:
>
>    create table table1(
>        account_no    int4,
>        start_date_tme    datetime
>        ....
>    );
>
>The table may contain null values for start_date_time.
>
>When I run the following SQL query, it fails:
>
>    select * from table1
>    where start_date_time::date >= '01/01/2000'::date 
>    and start_date_time::date <= '01/01/2001'::date;
>
>I get error message 'Unable to convert null datetime to date.
>
>It's fine if I run the same SQL query with added condition as follows:
>
>    select * from table1
>>    where account_no = 1
>    and start_date_time::date >= '01/01/2000'::date 
>    and start_date_time::date <= '01/01/2001'::date;
>
>Is this a bug? Or there's an logical explaination for this?

The most logical explanation is that there are no null values in
start_date_time when account_no = 1.

Try    select count(*) from table1 where account_no = 1 and start_date_time is
null;

and see if you get 0.



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: SQL question

From
Carolyn Lu Wong
Date:

Philip Warner wrote:
> 
> At 12:07 17/07/00 +1000, Carolyn Lu Wong wrote:
> >I have table with the following definition:
> >
> >       create table table1(
> >               account_no      int4,
> >               start_date_tme  datetime
> >               ....
> >       );
> >
> >The table may contain null values for start_date_time.
> >
> >When I run the following SQL query, it fails:
> >
> >       select * from table1
> >       where start_date_time::date >= '01/01/2000'::date
> >       and start_date_time::date <= '01/01/2001'::date;
> >
> >I get error message 'Unable to convert null datetime to date.
> >
> >It's fine if I run the same SQL query with added condition as follows:
> >
> >       select * from table1
> >>      where account_no = 1
> >       and start_date_time::date >= '01/01/2000'::date
> >       and start_date_time::date <= '01/01/2001'::date;
> >
> >Is this a bug? Or there's an logical explaination for this?
> 
> The most logical explanation is that there are no null values in
> start_date_time when account_no = 1.
> 
> Try
>     select count(*) from table1 where account_no = 1 and start_date_time is
> null;
> 
> and see if you get 0.

Yes, i get 0 from running the above query, but it fails if i re-arrange
the where clause to:
select * from table1where start_date_time::date >= '01/01/2000'::dateand start_date_time::date <= '01/01/2001'::dateand
account_no= 1;
 

with the same error message.

> 
> ----------------------------------------------------------------
> Philip Warner                    |     __---_____
> Albatross Consulting Pty. Ltd.   |----/       -  \
> (A.C.N. 008 659 498)             |          /(@)   ______---_
> Tel: (+61) 0500 83 82 81         |                 _________  \
> Fax: (+61) 0500 83 82 82         |                 ___________ |
> Http://www.rhyme.com.au          |                /           \|
>                                  |    --________--
> PGP key available upon request,  |  /
> and from pgp5.ai.mit.edu:11371   |/


Re: SQL question

From
Philip Warner
Date:
At 13:41 17/07/00 +1000, Carolyn Lu Wong wrote:
>> Try
>>     select count(*) from table1 where account_no = 1 and start_date_time is
>> null;
>> 
>> and see if you get 0.
>
>Yes, i get 0 from running the above query, but it fails if i re-arrange
>the where clause to:
>
>    select * from table1
>    where start_date_time::date >= '01/01/2000'::date
>    and start_date_time::date <= '01/01/2001'::date
>    and account_no = 1;
>
>with the same error message.
>

I think that there is no guarantee of the order of evaluation of the
components of a predicate, but that putting 'account_no=1' early in the
statement means the row is excluded before it needs to evaluate the rest of
the statement. I don't think you should rely on this behaviour - it might
be classified as "it's a feature, not a bug". Maybe.

You probably need to tell me what NULLs in the start_date mean. If, eg,
they mean 'not started', then you could create a view:
   create view started_things as select * from table1 where not start_date
is null;

then use:
       select * from started_things        where start_date_time::date >= '01/01/2000'::date    and
start_date_time::date<= '01/01/2001'::date    and account_no = 1;
 

AFAICT, this will still use nice indexes etc, but I could be wrong.

Another alternative would be to define a 'coalesce' function (I don't think
PG has one), which takes an arbitrary number of arguments and returns the
first non-null one. You could then say "where coalesce(start_date_time,
'1/1/1500')::date >= '01/01/2000'::date" etc, but then I think you will
lose the effectiveness of indexes.

Maybe someone else has a better idea...



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: SQL question

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Another alternative would be to define a 'coalesce' function (I don't think
> PG has one), which takes an arbitrary number of arguments and returns the
> first non-null one.

We surely do have that!  It even works pretty well in 7.0 ;-)
(I think there were still some bugs lurking in 6.5.)

The immediate cause of this gripe was discussed just a day or so ago
on one or another of the pgsql lists.  The timestamp-to-date conversion
routine has this weird idea that it should kick out an error instead
of returning NULL when presented with a NULL timestamp.  That's a bug
IMHO, and I've already changed the code in current sources.

COALESCE would be a good workaround if you don't want to patch your
local copy of 7.0.*.
        regards, tom lane


Re: SQL question

From
Thomas Lockhart
Date:
> The immediate cause of this gripe was discussed just a day or so ago
> on one or another of the pgsql lists.  The timestamp-to-date conversion
> routine has this weird idea that it should kick out an error instead
> of returning NULL when presented with a NULL timestamp.  That's a bug
> IMHO, and I've already changed the code in current sources.

That's not a bug, that was a feature, sort of. At least when I coded it,
Postgres *refused* to call any routine with NULL input, assuming that
NULL would be returned. A clever short-circuit, and the elog(ERROR) in
the conversion routine was just a safety net. Because it was also the
case that any routine returning a NULL pointer crashed the backend.

Now that those things aren't true, we are rewriting history to say that
they were bugs all along, eh? ;)
                     - Thomas


Re: SQL question

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> The immediate cause of this gripe was discussed just a day or so ago
>> on one or another of the pgsql lists.  The timestamp-to-date conversion
>> routine has this weird idea that it should kick out an error instead
>> of returning NULL when presented with a NULL timestamp.  That's a bug
>> IMHO, and I've already changed the code in current sources.

> That's not a bug, that was a feature, sort of. At least when I coded it,
> Postgres *refused* to call any routine with NULL input, assuming that
> NULL would be returned.

Well before my time, I guess --- as long as I've been paying attention,
the function manager's approach was to call the routine first and *then*
insert a NULL result ... if the routine hadn't crashed first.  That's
about as braindead a choice as I can think of, but that's what it did.

> A clever short-circuit, and the elog(ERROR) in
> the conversion routine was just a safety net. Because it was also the
> case that any routine returning a NULL pointer crashed the backend.
> Now that those things aren't true, we are rewriting history to say that
> they were bugs all along, eh? ;)

Fixing that one routine to behave that way, when none of the hundreds
of others that might see a NULL input do the same, qualifies as a bug
IMHO.  But it's all water over the dam, now that fmgr has been
redesigned.
        regards, tom lane


Re: SQL question

From
Philip Warner
Date:
At 02:26 17/07/00 -0400, Tom Lane wrote:
>
>Well before my time, I guess --- as long as I've been paying attention,
>the function manager's approach was to call the routine first and *then*
>insert a NULL result ... if the routine hadn't crashed first.  That's
>about as braindead a choice as I can think of, but that's what it did.
>

Out of curiosity, what does it do now? 


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: SQL question

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> At 02:26 17/07/00 -0400, Tom Lane wrote:
>> Well before my time, I guess --- as long as I've been paying attention,
>> the function manager's approach was to call the routine first and *then*
>> insert a NULL result ... if the routine hadn't crashed first.  That's
>> about as braindead a choice as I can think of, but that's what it did.

> Out of curiosity, what does it do now? 

As of current sources there is a distinction between "strict" and
"non-strict" functions.  A strict function must return NULL if any
input is NULL, therefore the function manager won't call it at all
if there is a NULL input value, but just assume the NULL result.
(Some other DBMSes have the same concept under different names,
like "NOT NULL CALL".)

A non-strict function is assumed to be able to take care of itself.
It gets called anyway and must check to see if any of its inputs
are NULL, then decide what it wants to do about it.

Both strict and nonstrict functions can return NULL if they wish,
though I've not seen many cases where a strict function would want to.

This is all predicated on a new function call interface that provides
an explicit isNull flag for each input, as well as an isNull flag for
the function result.  The real problem with the old code was that we
didn't have that, and there is no non-kluge workaround for not having
the information.

If I understand Thomas' remarks correctly, at one time in the past
the function-call code operated as though all functions were strict.
I suppose that foundered on the little problem that certain operations
like IS NULL and IS NOT NULL *must* be non-strict.  So someone changed
it to the opposite convention, but didn't follow through to the bitter
end: there has to be an explicit null flag for *each* argument, as
well as a clean way for the function to say whether it is returning
a null or not.

Defaulting to non-strict also created a ton of potential crash sites
in routines that couldn't cope with null-pointer inputs.  We've been
gradually "fixing" those by adding explicit tests for nulls, but it's
always been a stopgap solution IMHO.  Now there's a better way.
99% of the built-in functions in the backend are "strict" and so
will no longer need special checks to defend against null inputs,
because they'll never see 'em again.
        regards, tom lane