Using indexes for like foo% type queries when foo isn't constant (not a locale issue) - Mailing list pgsql-performance

From Greg Stark
Subject Using indexes for like foo% type queries when foo isn't constant (not a locale issue)
Date
Msg-id 87of37oa57.fsf@stark.dyndns.tv
Whole thread Raw
List pgsql-performance
In the following query postgres doesn't use the index. In the hard-coded
version below it does. I suppose it can't because it's possible the "target"
could have wildcards etc in them. Is there any way to indicate the postgres
that that won't happen?

This is going to be even more of an issue when preparsed queries happen
because even in the hard coded example it will be an issue. I know in Oracle
if you parse a query with a LIKE :1||'%' type expression it still plans to use
the index and that's extremely useful. I don't know what it does if there's a
% in the parameter, it either takes the performance hit or it doesn't treat
them as special?

db=> explain analyze select postalcode, abs(substr(target,6,1)::integer-substr(postalcode,6,1)::integer) as dist from
postalcodes,(select 'L6C2M6'::text as target) as t where postalcode like substr(target,1,5)||'%' order by dist asc
limit2; 
                                                              QUERY PLAN
              

--------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12182.77..12182.77 rows=2 width=42) (actual time=9226.17..9226.18 rows=2 loops=1)
   ->  Sort  (cost=12182.77..12186.16 rows=1359 width=42) (actual time=9226.16..9226.16 rows=2 loops=1)
         Sort Key: abs(((substr(t.target, 6, 1))::integer - (substr((postalcodes.postalcode)::text, 6, 1))::integer))
         ->  Nested Loop  (cost=0.00..12112.04 rows=1359 width=42) (actual time=3262.89..9205.25 rows=8 loops=1)
               Join Filter: ("inner".postalcode ~~ (substr("outer".target, 1, 5) || '%'::text))
               ->  Subquery Scan t  (cost=0.00..0.01 rows=1 width=0) (actual time=0.04..0.05 rows=1 loops=1)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.02..0.02 rows=1 loops=1)
               ->  Seq Scan on postalcodes  (cost=0.00..7335.69 rows=271769 width=10) (actual time=5.52..3268.74
rows=271769loops=1) 
 Total runtime: 9241.92 msec
(9 rows)

db=> explain analyze select postalcode, abs(substr('L6C2M6',6,1)::integer-substr(postalcode,6,1)::integer) as dist from
postalcodeswhere postalcode like substr('L6C2M6',1,5)||'%' order by dist asc limit 2; 
                                                                      QUERY PLAN
                               

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3.29..3.29 rows=1 width=10) (actual time=36.54..36.55 rows=2 loops=1)
   ->  Sort  (cost=3.29..3.29 rows=1 width=10) (actual time=36.53..36.54 rows=2 loops=1)
         Sort Key: abs((6 - (substr((postalcode)::text, 6, 1))::integer))
         ->  Index Scan using idx_postalcodes_postalcodeon on postalcodes  (cost=0.00..3.28 rows=1 width=10) (actual
time=35.91..36.33rows=8 loops=1) 
               Index Cond: ((postalcode >= 'L6C2M'::bpchar) AND (postalcode < 'L6C2N'::bpchar))
               Filter: (postalcode ~~ 'L6C2M%'::text)
 Total runtime: 36.93 msec
(7 rows)

--
greg

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Do Views offer any performance advantage?
Next
From: Andreas Pflug
Date:
Subject: Re: Do Views offer any performance advantage?