Re: Null comparisons (was Re: checksum) - Mailing list pgsql-general

From Greg Stark
Subject Re: Null comparisons (was Re: checksum)
Date
Msg-id 873c126rqo.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Null comparisons (was Re: checksum)  (Marco Colombo <pgsql@esiway.net>)
Responses Re: Null comparisons (was Re: checksum)
List pgsql-general
Marco Colombo <pgsql@esiway.net> writes:

> (a = b or (a is null and b is null))
>
> that raises a flag for me. It seems that NULL is used as a special value,
> which is not.

Well, as I said, it raised a flag for me too. However, it's not good to be too
dogmatic about things. General rules are useful guiding principles but you
have to recognize when it's worth it to break them. We don't know enough about
his problem to say another approach would be any better.

For example, in one application I have a table that *does* have "unknown"
values. However I do need to look up records that match criteria including
having "unknown" values in specific positions. For most queries using NULL is
convenient and it's perfectly appropriate. But I have queries like this user
does and I use coalesce since I find the resulting expression much clearer
than using the three-way logical expression above.

Incidentally, coalesce(a,0)=coalesce(b,0) has the advantage over all the other
suggestions that you can build an index on coalesce(a,0) and/or coalesce(b,0)
and use them for the join or for individual record lookups.

> BTW,
>
>    coalesce(a,0) = coalesce(b,0)
>
> is wrong, since it assumes 0 is a special value, never used in the
> table. If so, it's better use it from the start instead of NULL for
> those special rows.

I specifically said you had to use a special value in my suggestion. Saying
something is "wrong" when it does what's needed just because it violates some
abstract design principle is just short-sighted.

Using 0 in the table might violate unique constraints or foreign key
constraints. I try to avoid having a single quirky table propagate its
quirkiness to the rest of the system.

For example, creating a bogus "0" record in some other table just to satisfy
the foreign key constraint then having the rest of the application have to
work around this bogus record results in a much less workable system than
simply using NULL instead of 0 for the special value.

>    coalesce(a = b, a is null and b is null)
>
> is correct, and maybe slightly better than the original
>
>    (a = b) or (a is null and b is null)
>
> if the implementation is smart enough to evaluate its arguments only
> when needed. The or operator needs to evaluate the right side when
> the left side is either false or null, COALESCE only when it's null.
> I think the docs mention that.

Actually it's only the latter expression that will be able to avoid evaluating
the extra expression, not the coalesce example. In any case the time to
evaluate the "a is null and b is null" part is negligible. And the fact that
neither can use any indexes is the only relevant performance question. It's
possible that's not a concern, but if it is they both lose.

That's one possible argument in favour of a === operator. It would be easy (I
think?) to make === use a btree index without even having to build a
functional index like with coalesce(a,0).

--
greg

pgsql-general by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: Foreign key order evaluation
Next
From: "Scrappy"
Date:
Subject: Re: Hello