Re: Re: [GENERAL] +/- Inf for float8's - Mailing list pgsql-hackers
From | Ross J. Reedstrom |
---|---|
Subject | Re: Re: [GENERAL] +/- Inf for float8's |
Date | |
Msg-id | 20000820170828.A31805@rice.edu Whole thread Raw |
In response to | Re: Re: [GENERAL] +/- Inf for float8's (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: Re: [GENERAL] +/- Inf for float8's
(Tom Lane <tgl@sss.pgh.pa.us>)
Re: Re: [GENERAL] +/- Inf for float8's (Bruce Momjian <pgman@candle.pha.pa.us>) |
List | pgsql-hackers |
On Sun, Aug 20, 2000 at 12:33:00AM +0200, Peter Eisentraut wrote: <snip side comment about bug tracking. My input: for an email controllable system, take a look at the debian bug tracking system> > Show me a system where it doesn't work and we'll get it to work. > UNSAFE_FLOATS as it stands it probably not the most appropriate behaviour; > it intends to speed things up, not make things portable. > I agree. In the previous thread on this, Thomas suggested creating a flag that would allow control turning the CheckFloat8Val function calls into a macro NOOP. Sound slike a plan to me. > > > > NULL and NaN are not quite the same thing imho. If we are allowing NaN > > > in columns, then it is *known* to be NaN. > > > > For the purposes of ordering, however, they are very similar. > > Then we can also treat them similar, i.e. sort them all last or all first. > If you have NaN's in your data you wouldn't be interested in ordering > anyway. Right, but the problem is that NULLs are an SQL language feature, and there for rightly special cased directly in the sorting apparatus. NaN is type specific, and I'd be loath to special case it in the same place. As it happens, I've spent some time this weekend groveling through the sort (and index, as it happens) code, and have an idea for a type specific fix. Here's the deal, and an actual, honest to goodness bug in the current code. As it stands, we allow one non-finite to be stored in a float8 field: NaN, with partial parsing of 'Infinity'. As I reported last week, NaNs break sorts: they act as barriers, creating sorted subsections in the output. As those familiar with the code have already guessed, there is a more serious bug: NaNs break indicies on float8 fields, essentially chopping the index off at the first NaN. Fixing this turns out to be a one liner to btfloat8cmp. Fixing sorts is a bit tricker, but can be done: Currently, I've hacked the float8lt and float8gt code to sort NaN to after +/-Infinity. (since NULLs are special cased, they end up sorting after NaN). I don't see any problems with this solution, and it give the desired behavior. I've attached a patch which fixes all the sort and index problems, as well as adding input support for -Infinity. This is not a complete solution, since I haven't done anything with the CheckFloat8Val test. On my system (linux/glibc2.1) compiling with UNSAFE_FLOATS seems to work fine for testing. > > Side note 2: The paper "How Java's floating point hurts everyone > everywhere" provides for good context reading. http://http/cs.berkeley.edu/~wkahan/JAVAhurt.pdf ? I'll take a look at it when I get in to work Monday. > > Side note 3: Once you read that paper you will agree that using floating > point with Postgres is completely insane as long as the FE/BE protocol is > text-based. Probably. But it's not our job to enforce sanity, right? Another way to think about it is fixing the implementation so the deficiencies of the FE/BE stand out in a clearer light. ;-) Ross -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
Attachment
pgsql-hackers by date: