Re: [HACKERS] indexes and floats - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] indexes and floats
Date
Msg-id 29332.902242131@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] indexes and floats  ("Thomas G. Lockhart" <lockhart@alumni.caltech.edu>)
Responses Re: [HACKERS] indexes and floats  (Vince Vielhaber <vev@michvhf.com>)
List pgsql-hackers
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> Oh, you have nailed it! This is interesting because (probably) a query
> like
>   select f4 from t4 where f4 = 500.0;
> is being automatically "upgraded" in the parser backend to
>   select f4 from t4 where float8(f4) = 500.0;
> So, since there is no functional index float8(f4) on the table we cannot
> use an existing index on f4 to advantage.

OK, that sounds plausible.  But in my examples,

    play=> explain select x from f8 where x = 500;
    NOTICE:  QUERY PLAN:
    Seq Scan on f8  (cost=40.00 size=100 width=8)

Your explanation implies that here, the parser is converting to
     select x from f8 where int4(x) = 500;
which is wrong for the same accuracy-loss reasons you cite later.
(And if that isn't what it's doing, what then?)

I think it would be a good idea if someone actually dug into this
and verified what's going on.  I have found some other cases that
lead me to think there's more to this than we understand just yet.
With an index on an int4 field, I get

tree=> explain select * from marketorderhistory where sequenceNo = 140000;
NOTICE:  QUERY PLAN:
Index Scan using marketorderhistory_sequenceno_i on marketorderhistory
  (cost=2.05 size=1 width=100)

tree=> explain select * from marketorderhistory where sequenceNo > 140000;
NOTICE:  QUERY PLAN:
Seq Scan on marketorderhistory  (cost=63.38 size=449 width=100)

which doesn't look like it could be explained by parser type coercions.
Perhaps this one just indicates an omission from the list of
type-specific routines that can be used for index comparisons?  If so,
maybe there are other omissions affecting the results for other types.

> ... you want to handle the following query correctly (I'll
> switch to an int column to make it clearer):
>   select i4 from t4 where i4 < 500.1;
> Now, if we do the "optimizable thing" blindly, then we would transform
> this to
>   select i4 from t4 where i4 < int4(500.1);
> But of course this would produce the wrong result if the table contains
> a value of 500. Perhaps something a bit different could be implemented,
> but it probably wouldn't generalize very well with the extensible type
> system.

That's a good point.  Still, it would be nice if the system had some
reasonable amount of smarts about the "primitive" types that the parser
has constant syntax for.  In particular I think an automatic coercion of
an int constant to float where needed would be a reasonable thing to
expect.  That's not happening now, see my example above.

> So, is there a problem to fix, or just documentation to write?

This one is most certainly a bug:

play=> select x from f4 where x = 500.0 :: float4;
ERROR:  parser_typecast: cannot cast this expression to type 'float4'

Beyond that, if I can force the right thing to happen by casting
the constant to the type of the field, then I can live with it.
I have seen a number of cases where the system wouldn't use an index
even with a cast, however, so I'm not a happy camper yet.

            regards, tom lane

pgsql-hackers by date:

Previous
From: Maarten Boekhold
Date:
Subject: Re: [HACKERS] How do I construct a varlena?
Next
From: Bruce Momjian
Date:
Subject: EXPLAIN VERBOSE