Re: Null comparisons (was Re: checksum) - Mailing list pgsql-general
From | Greg Stark |
---|---|
Subject | Re: Null comparisons (was Re: checksum) |
Date | |
Msg-id | 873c126rqo.fsf@stark.xeocode.com Whole thread Raw |
In response to | Re: Null comparisons (was Re: checksum) (Marco Colombo <pgsql@esiway.net>) |
Responses |
Re: Null comparisons (was Re: checksum)
|
List | pgsql-general |
Marco Colombo <pgsql@esiway.net> writes: > (a = b or (a is null and b is null)) > > that raises a flag for me. It seems that NULL is used as a special value, > which is not. Well, as I said, it raised a flag for me too. However, it's not good to be too dogmatic about things. General rules are useful guiding principles but you have to recognize when it's worth it to break them. We don't know enough about his problem to say another approach would be any better. For example, in one application I have a table that *does* have "unknown" values. However I do need to look up records that match criteria including having "unknown" values in specific positions. For most queries using NULL is convenient and it's perfectly appropriate. But I have queries like this user does and I use coalesce since I find the resulting expression much clearer than using the three-way logical expression above. Incidentally, coalesce(a,0)=coalesce(b,0) has the advantage over all the other suggestions that you can build an index on coalesce(a,0) and/or coalesce(b,0) and use them for the join or for individual record lookups. > BTW, > > coalesce(a,0) = coalesce(b,0) > > is wrong, since it assumes 0 is a special value, never used in the > table. If so, it's better use it from the start instead of NULL for > those special rows. I specifically said you had to use a special value in my suggestion. Saying something is "wrong" when it does what's needed just because it violates some abstract design principle is just short-sighted. Using 0 in the table might violate unique constraints or foreign key constraints. I try to avoid having a single quirky table propagate its quirkiness to the rest of the system. For example, creating a bogus "0" record in some other table just to satisfy the foreign key constraint then having the rest of the application have to work around this bogus record results in a much less workable system than simply using NULL instead of 0 for the special value. > coalesce(a = b, a is null and b is null) > > is correct, and maybe slightly better than the original > > (a = b) or (a is null and b is null) > > if the implementation is smart enough to evaluate its arguments only > when needed. The or operator needs to evaluate the right side when > the left side is either false or null, COALESCE only when it's null. > I think the docs mention that. Actually it's only the latter expression that will be able to avoid evaluating the extra expression, not the coalesce example. In any case the time to evaluate the "a is null and b is null" part is negligible. And the fact that neither can use any indexes is the only relevant performance question. It's possible that's not a concern, but if it is they both lose. That's one possible argument in favour of a === operator. It would be easy (I think?) to make === use a btree index without even having to build a functional index like with coalesce(a,0). -- greg
pgsql-general by date: