On 28 October 2012 10:37, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> On 10/28/2012 11:21 AM, Thomas Munro wrote:
>>
>> On 28 October 2012 09:43, Hannu Krosing <hannu@2ndquadrant.com> wrote:
>>>
>>> This is how PostgreSQL currently works -
>>>
>>> test=# select 'NaN'::float = 'NaN'::float as must_be_false;
>>> must_be_false
>>> ----------
>>> t
>>> (1 row)
>>>
>>> I think that PostgreSQL's behaviour of comparing two
>>> NaN-s as equal is wrong and Iwe should follow the IEEE 754 spec here
>>>
>>> As per IEEE 754 a NaN behaves similar to NULL in SQL.
>>
>> FWIW there is a note in the documentation about this:
>>
>> "Note: IEEE754 specifies that NaN should not compare equal to any
>> other floating-point value (including NaN). In order to allow
>> floating-point values to be sorted and used in tree-based indexes,
>> PostgreSQL treats NaN values as equal, and greater than all non-NaN
>> values."
>
> I wonder how hard it would be to start treating NaNs as NULLs
> so you could say ORDER BY fvalue NULLS AND NANS LAST :)
If you want to treat NaN like NULL, there are some other situations to
think about:
hack=> create table foo (x float);
CREATE TABLE
hack=> insert into foo values ('NaN'::float), (42);
INSERT 0 2
hack=> select sum(x) from foo;sum
-----NaN
(1 row)
That is the same result as evaluating 'NaN'::float + 42.
hack=> delete from foo;
DELETE 2
hack=> insert into foo values (null), (42);
INSERT 0 2
hack=> select sum(x) from foo;sum
----- 42
(1 row)
That is not the same results as evaluating NULL + 42 (the SUM
aggregate skips NULLs).
The trouble is, NULL is not a value within the type T (float in this
case), it's more like a nullable value has type NULL | T. The algebra(set of operators including SUM, +, < etc) for
thealgebraic type
NULL | T doesn't behave the same way as the algebra for T, whereas NaN
is one of the values that IEEE float can hold, so it's part of the T
in this case. In other words, for NaNs, if you have one set of
operators used by aggregates, ORDER BY, btrees and another set of
operators <, >, = for use in user expressions, that would be (at least
subtly) different than what happens today with NULL.
Well, I'm not that sure of what I'm saying. But I have thought about
it for a while, as I have been (very slowly) working on an extension
which provides IEEE 754-2008 decimal number types DECIMAL32,
DECIMAL64, DECIMAL 128 (they have the same NaN, -0, +0, -Inf, +Inf
shenanigans, and some more fun problems as well, like non-normal
numbers, so that you can represent 1.002 and 1.0020 as distinct bit
patterns, and then have to argue with yourself about what it all
means).