Re: Re: [GENERAL] +/- Inf for float8's - Mailing list pgsql-hackers

From Ross J. Reedstrom
Subject Re: Re: [GENERAL] +/- Inf for float8's
Date
Msg-id 20000820170828.A31805@rice.edu
Whole thread Raw
In response to Re: Re: [GENERAL] +/- Inf for float8's  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: Re: [GENERAL] +/- Inf for float8's  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Re: [GENERAL] +/- Inf for float8's  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
On Sun, Aug 20, 2000 at 12:33:00AM +0200, Peter Eisentraut wrote:
<snip side comment about bug tracking. My input: for an email controllable
system, take a look at the debian bug tracking system>

> Show me a system where it doesn't work and we'll get it to work.
> UNSAFE_FLOATS as it stands it probably not the most appropriate behaviour;
> it intends to speed things up, not make things portable.
>

I agree. In the previous thread on this, Thomas suggested creating a flag
that would allow control turning the  CheckFloat8Val function calls into
a macro NOOP. Sound slike a plan to me.

>
> > > NULL and NaN are not quite the same thing imho. If we are allowing NaN
> > > in columns, then it is *known* to be NaN.
> >
> > For the purposes of ordering, however, they are very similar.
>
> Then we can also treat them similar, i.e. sort them all last or all first.
> If you have NaN's in your data you wouldn't be interested in ordering
> anyway.

Right, but the problem is that NULLs are an SQL language feature, and
there for rightly special cased directly in the sorting apparatus. NaN is
type specific, and I'd be loath to special case it in the same place. As
it happens, I've spent some time this weekend groveling through the sort
(and index, as it happens) code, and have an idea for a type specific fix.

Here's the deal, and an actual, honest to goodness bug in the current code.

As it stands, we allow one non-finite to be stored in a float8 field:
NaN, with partial parsing of 'Infinity'.

As I reported last week, NaNs break sorts: they act as barriers, creating
sorted subsections in the output.  As those familiar with the code have
already guessed, there is a more serious bug: NaNs break indicies on
float8 fields, essentially chopping the index off at the first NaN.

Fixing this turns out to be a one liner to btfloat8cmp.

Fixing sorts is a bit tricker, but can be done: Currently, I've hacked
the float8lt and float8gt code to sort NaN to after +/-Infinity. (since
NULLs are special cased, they end up sorting after NaN). I don't see
any problems with this solution, and it give the desired behavior.

I've attached a patch which fixes all the sort and index problems, as well
as adding input support for -Infinity. This is not a complete solution,
since I haven't done anything with the CheckFloat8Val test. On my
system (linux/glibc2.1) compiling with UNSAFE_FLOATS seems to work fine
for testing.

>
> Side note 2: The paper "How Java's floating point hurts everyone
> everywhere" provides for good context reading.

http://http/cs.berkeley.edu/~wkahan/JAVAhurt.pdf ? I'll take a look at it
when I get in to work Monday.

>
> Side note 3: Once you read that paper you will agree that using floating
> point with Postgres is completely insane as long as the FE/BE protocol is
> text-based.

Probably. But it's not our job to enforce sanity, right? Another way to think
about it is fixing the implementation so the deficiencies of the FE/BE stand
out in a clearer light. ;-)

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


Attachment

pgsql-hackers by date:

Previous
From: Ben Adida
Date:
Subject: Re: Bug tracking (was Re: +/- Inf for float8's)
Next
From: Hannu Krosing
Date:
Subject: Re: Optimisation deficiency: currval('seq')-->seq scan, constant-->index scan