Re: [HACKERS] Help/advice/suggestions on query optimizer for a large table - Mailing list pgsql-hackers

From Thomas Lockhart
Subject Re: [HACKERS] Help/advice/suggestions on query optimizer for a large table
Date
Msg-id 372888D9.9BEE178C@alumni.caltech.edu
Whole thread Raw
In response to Help/advice/suggestions on query optimizer for a large table  (Martin Weinberg <weinberg@osprey.phast.umass.edu>)
Responses Re: [HACKERS] Help/advice/suggestions on query optimizer for a large table  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
> Anyway, if I submit a query such as:
>         select * from mytable where x=3.14 and y=6.28;
> it takes about 3 minutes to return the record.  Both x and y are indexed:
> And "explain" on the select query above says it's doing a sequential scan.
> However if I say:
>         select * from mytable where x='3.14'::float4 and y='6.28'::float4;
> it takes about 3 seconds!  And now "explain" says it's doing an indexed
> scan.
> My understanding is that the query optimizer should pick the index
> scan for this query based on the cost.

This is a known feature. The Postgres parser converts an unquoted 3.14
to a float8, which is not the same as the float4 column you have
indexed. And the optimizer is not (yet) bright enough to convert
constants to the column type, and then use the available indexes.

In fact, the apparently more desirable strategy is not particularly
easy to get right. Look at this example:
 create table t1 (i int4); (insert a bunch of data) create index tx on t1 using btree(i); vacuum; select * from t1
wherei < 3.5;
 

In this case, we can't convert the 3.5 to an integer (3) without
changing the comparison operator to "<=". And in your case,
"downconverting" the float8 to a float4 probably would risk the same
problem. So Postgres *promotes* the float4s to float8s, and has to do
a sequential scan along the way.

Anyway, afaik you have two options. The first is to surround the
"3.14" in your example with single quotes (probably the coersion to
float4 is unnecessary). The second is to create a function index on
your table, to allow your queries to use float8 arguments
transparently:
 create index mx on mytable using btree (float8(x)); create index my on mytable using btree (float8(y));

If you are hiding all of the queries inside an app, then I'd suggest
the first technique. If you are going to be doing a lot of direct SQL,
then you might want to use the second.
                          - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] v6.5 Release Date ...
Next
From: Brian P Millett
Date:
Subject: SIGBUS in AllocSetAlloc & jdbc