Thread: Query with rightmost function does not use index

Query with rightmost function does not use index

From
- -
Date:
For a table where column col has an index, the query:
     SELECT ... FROM ... WHERE col = MD5('')
doesn't seem to use an index but
     SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
does.

Is this a gotcha?
I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit


Re: Query with rightmost function does not use index

From
Pavel Stehule
Date:
Hello

use a functional index

http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html

Regards

Pavel Stehule

2011/8/8 - - <loh.law@hotmail.com>:
> For a table where column col has an index, the query:
>      SELECT ... FROM ... WHERE col = MD5('')
> doesn't seem to use an index but
>      SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
> does.
>
> Is this a gotcha?
> I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
> gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit
>
>
>

Re: Query with rightmost function does not use index

From
- -
Date:
But why?  The expression is not on the left side of the WHERE clause.

> Hello
>
> use a functional index
>
> http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html
>
> Regards
>
> Pavel Stehule
>
> 2011/8/8 - - <loh.law@hotmail.com>:
> > For a table where column col has an index, the query:
> >      SELECT ... FROM ... WHERE col = MD5('')
> > doesn't seem to use an index but
> >      SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
> > does.
> >
> > Is this a gotcha?
> > I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
> > gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit
> >
> >
> >

Re: Query with rightmost function does not use index

From
pasman pasmański
Date:
Here may be other problem.
Show us the result of EXPLAIN Analyze.

2011/8/8, Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> use a functional index
>
> http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html
>
> Regards
>
> Pavel Stehule
>
> 2011/8/8 - - <loh.law@hotmail.com>:
>> For a table where column col has an index, the query:
>>      SELECT ... FROM ... WHERE col = MD5('')
>> doesn't seem to use an index but
>>      SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
>> does.
>>
>> Is this a gotcha?
>> I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
>> gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit
>>
>>
>>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


--
------------
pasman

Re: Query with rightmost function does not use index

From
- -
Date:
Here are the EXPLAIN ANALYZE outputs:


explain analyze select * from filter_item where filter_hash = MD5('');

                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on filter_item  (cost=0.00..424644.96 rows=86108 width=49) (ac tual time=8177.807..12421.921 rows=77 loops=1)
   Filter: ((filter_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e'::text)
 Total runtime: 12421.959 ms
(3 rows)


explain analyze select * from filter_item where filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'

                                                             QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on filter_item  (cost=77.92..6609.02 rows=3534 width=49) (actual time=0.055..0.100 rows=77 loops=1)
   Recheck Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
   ->  Bitmap Index Scan on filter_item__filter_hash  (cost=0.00..77.04 rows=3534 width=0) (actual time=0.049..0.049 rows=77 loops=1)
         Index Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
 Total runtime: 0.130 ms
(5 rows)

Re: Query with rightmost function does not use index

From
"David Johnston"
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of - -
Sent: Monday, August 08, 2011 3:40 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Query with rightmost function does not use index

 

Here are the EXPLAIN ANALYZE outputs:


explain analyze select * from filter_item where filter_hash = MD5('');

                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on filter_item  (cost=0.00..424644.96 rows=86108 width=49) (ac tual time=8177.807..12421.921 rows=77 loops=1)
   Filter: ((filter_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e'::text)
 Total runtime: 12421.959 ms
(3 rows)


explain analyze select * from filter_item where filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'

                                                             QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on filter_item  (cost=77.92..6609.02 rows=3534 width=49) (actual time=0.055..0.100 rows=77 loops=1)
   Recheck Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
   ->  Bitmap Index Scan on filter_item__filter_hash  (cost=0.00..77.04 rows=3534 width=0) (actual time=0.049..0.049 rows=77 loops=1)
         Index Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
 Total runtime: 0.130 ms
(5 rows)

 

 

The filter_hash index uses a “character(n)” data type – the ::bpchar. The second query is of unknown type and thus is converted to “character” and then used in the index.  The first query use a function that outputs a “text”.  Since the output type is known the left-side of the equals is casted to that known type.  Since the index is one the “character” version of the filter_hash but the comparison requires a “text” version the index cannot be used.  You would need to manually cast the result of the md5 function call to “character” in order to get the index usage; or convert the filter_hash column to text, the latter option probably being preferred.

 

It is not a bug, in cases of uncertainty the types of the value and the indexed field must be the same, but it could possibly be more user-friendly.

 

I’ll leave it to other to comment on whether this is different in more recent versions.  Text-character are binary compatible and so it is not be unreasonable to assume, like you did, that indexes of one should be usable by the other.

 

David J.

 

 

 

Re: Query with rightmost function does not use index

From
Andres Freund
Date:
Hi,

On Monday, August 08, 2011 15:40:20 - - wrote:
> explain analyze select * from filter_item where filter_hash = MD5('');
>
>                                                       QUERY PLAN
> ----------------------------------------------------------------------------
> ------------------------------------------- Seq Scan on filter_item
> (cost=0.00..424644.96 rows=86108 width=49) (actual time=8177.807..12421.921
> rows=77 loops=1) Filter: ((filter_hash)::text =
> 'd41d8cd98f00b204e9800998ecf8427e'::text) Total runtime: 12421.959 ms
> (3 rows)
>
>
> explain analyze select * from filter_item where filter_hash =
> 'd41d8cd98f00b204e9800998ecf8427e'
>
>                                                              QUERY PLAN
> ----------------------------------------------------------------------------
> ---------------------------------------------------------- Bitmap Heap Scan
> on filter_item  (cost=77.92..6609.02 rows=3534 width=49) (actual
> time=0.055..0.100 rows=77 loops=1) Recheck Cond: (filter_hash =
> 'd41d8cd98f00b204e9800998ecf8427e'::bpchar) ->  Bitmap Index Scan on
> filter_item__filter_hash  (cost=0.00..77.04 rows=3534 width=0) (actual
> time=0.049..0.049 rows=77 loops=1) Index Cond: (filter_hash =
> 'd41d8cd98f00b204e9800998ecf8427e'::bpchar) Total runtime: 0.130 ms
> (5 rows)
The problem is that your filter_hash columns seems to be of type char(n). Thats
not directly compatible with text (which is the type returned by the md5
function).
So either change the column type or cast the return type of md5 to char(n).

I do have to admit that this is somewhat strange.

Greetings,

Andres