Re: infinite histogram bounds and nan (Re: comment regarding double timestamps; and, infinite timestamps and NaN) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: infinite histogram bounds and nan (Re: comment regarding double timestamps; and, infinite timestamps and NaN)
Date
Msg-id 12957.1577974305@sss.pgh.pa.us
Whole thread Raw
In response to infinite histogram bounds and nan (Re: comment regarding doubletimestamps; and, infinite timestamps and NaN)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Mon, Dec 30, 2019 at 02:18:17PM -0500, Tom Lane wrote:
>> This answer is simply broken.  You've caused it to estimate half
>> of the bucket, which is an insane estimate for the given bucket
>> boundaries and WHERE constraint.

> I'm fine if the isnan() logic changes, but the comment indicates it's intended
> to be hit for an infinite histogram bound, but that doesn't work for timestamps
> (convert_to_scalar() should return (double)INFINITY and not
> (double)INT64_MIN/MAX).

I suppose the code you're looking at is

                        binfrac = (val - low) / (high - low);

                        /*
                         * Watch out for the possibility that we got a NaN or
                         * Infinity from the division.  This can happen
                         * despite the previous checks, if for example "low"
                         * is -Infinity.
                         */
                        if (isnan(binfrac) ||
                            binfrac < 0.0 || binfrac > 1.0)
                            binfrac = 0.5;

This doesn't really have any goals beyond "make sure we get a result
between 0.0 and 1.0, even if the calculation went pear-shaped for
some reason".  You could make an argument that it should be like

                        if (isnan(binfrac))
                            binfrac = 0.5;   /* throw up our hands for NaN */
                        else if (binfrac <= 0.0)
                            binfrac = 0.0;   /* clamp in case of -Inf or -0 */
                        else if (binfrac > 1.0)
                            binfrac = 1.0;   /* clamp in case of +Inf */

which would probably produce saner results in edge cases like these.
I think it'd also obviate the need for fooling with the conversion in
convert_to_scalar: while DT_NOBEGIN/DT_NOEND wouldn't produce exactly
the same result (hard 0.0 or 1.0) as an infinity, they'd produce
results very close to that.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Removal of support for OpenSSL 0.9.8 and 1.0.0
Next
From: Robert Haas
Date:
Subject: Re: Disallow cancellation of waiting for synchronous replication