Thread: LIKE not using indexes (due to locale issue?)

LIKE not using indexes (due to locale issue?)

From

Ow Mun Heng

Date:

25 June 2008, 01:06:53

explain select * from d_trr where revision like '^B2.%.SX'
--where ast_revision  = 'B2.M.SX'

Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
  Filter: ((revision)::text ~~ '^B2.%.SX'::text)

show lc_collate;
en_US.UTF-8

Is it that this is handled by tsearch2? Or I need to do the locale to
"C" for this to function?

the revision is indexed.

Bitmap Heap Scan on d_trr  (cost=4492.43..547709.26 rows=193453
width=16)
  Recheck Cond: ((revision)::text = 'B2.M.SX'::text)
  ->  Bitmap Index Scan on idx_d_trr_iw_ast  (cost=0.00..4444.06
rows=193453 width=0)
        Index Cond: ((revision)::text = 'B2.M.SX'::text)

Re: LIKE not using indexes (due to locale issue?)

From

Klint Gore

Date:

25 June 2008, 01:58:49

Ow Mun Heng wrote:
> explain select * from d_trr where revision like '^B2.%.SX'
> --where ast_revision  = 'B2.M.SX'
>
> Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
>   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
>
> show lc_collate;
> en_US.UTF-8
>
> Is it that this is handled by tsearch2? Or I need to do the locale to
> "C" for this to function?
>
See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html.
It tells you how to create an index that like might use in non-C locales.

As a side note, the ^ in the string above has no special meaning for like.

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

Re: LIKE not using indexes (due to locale issue?)

From

Ow Mun Heng

Date:

25 June 2008, 03:44:31

On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote:
> Ow Mun Heng wrote:
> > explain select * from d_trr where revision like '^B2.%.SX'
> > --where ast_revision  = 'B2.M.SX'
> >
> > Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
> >   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
> >
> > show lc_collate;
> > en_US.UTF-8
> >
> > Is it that this is handled by tsearch2? Or I need to do the locale to
> > "C" for this to function?
> >
> See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html.
> It tells you how to create an index that like might use in non-C locales.

Just more information. This columns is created with the varchar type.

original index is created using

CREATE INDEX idx_d_ast
  ON xmms.d_trh
  USING btree
  (revision varchar_pattern_ops);


CREATE INDEX idx_d_ast2
  ON xmms.d_trh
  USING btree
  (revision);

after creating it, seems like it is still doing the seq_scan.
So what gives? Can I get more clues here?

Re: LIKE not using indexes (due to locale issue?)

From

Klint Gore

Date:

25 June 2008, 04:14:25

Ow Mun Heng wrote:
> On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote:
> > Ow Mun Heng wrote:
> > > explain select * from d_trr where revision like '^B2.%.SX'
> > > --where ast_revision  = 'B2.M.SX'
> > >
> > > Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
> > >   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
> > >
> > > show lc_collate;
> > > en_US.UTF-8
> > >
> > > Is it that this is handled by tsearch2? Or I need to do the locale to
> > > "C" for this to function?
> > >
> > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html.
> > It tells you how to create an index that like might use in non-C locales.
>
> Just more information. This columns is created with the varchar type.
>
> original index is created using
>
> CREATE INDEX idx_d_ast
>   ON xmms.d_trh
>   USING btree
>   (revision varchar_pattern_ops);
>
>
> CREATE INDEX idx_d_ast2
>   ON xmms.d_trh
>   USING btree
>   (revision);
>
> after creating it, seems like it is still doing the seq_scan.
> So what gives? Can I get more clues here?
>
Post your query and the explain analyze of it and how many rows are in
the table.

In my database, there's 7200 rows in items and I know that none of the
identifiers for them start with 'xb'.  As you can see below, the 1st
query is sequential and the 2nd one is using the new index.  (v8.3.0)

postgres=# show lc_collate;
       lc_collate
------------------------
 English_Australia.1252
(1 row)

postgres=# explain analyse select * from items where identifier like 'xb%';
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Seq Scan on items  (cost=0.00..160.18 rows=1 width=113) (actual
time=4.966..4.966 rows=0 loops=1)
   Filter: ((identifier)::text ~~ 'xb%'::text)
 Total runtime: 5.029 ms
(3 rows)

postgres=# create index anindex on items(identifier varchar_pattern_ops);
CREATE INDEX
postgres=# explain analyse select * from items where identifier like 'xb%';
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Index Scan using anindex on items  (cost=0.00..8.27 rows=1 width=113)
(actual time=0.165..0.165 rows=0 loops=1)
   Index Cond: (((identifier)::text ~>=~ 'xb'::text) AND
((identifier)::text ~<~ 'xc'::text))
   Filter: ((identifier)::text ~~ 'xb%'::text)
 Total runtime: 0.255 ms
(4 rows)


--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

Re: LIKE not using indexes (due to locale issue?)

From

Ow Mun Heng

Date:

25 June 2008, 04:50:52

On Wed, 2008-06-25 at 17:00 +1000, Klint Gore wrote:
> Ow Mun Heng wrote:
> > On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote:
> > > Ow Mun Heng wrote:
> > > > explain select * from d_trr where revision like '^B2.%.SX'
> > > > --where ast_revision  = 'B2.M.SX'
> > > >
> > > > Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
> > > >   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
> > > >
> > > > show lc_collate;
> > > > en_US.UTF-8
> > > >
> > > > Is it that this is handled by tsearch2? Or I need to do the locale to
> > > > "C" for this to function?
> > > >
> > > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html.
> > > It tells you how to create an index that like might use in non-C locales.
> >
> > Just more information. This columns is created with the varchar type.
> >
> > original index is created using
> >
> > CREATE INDEX idx_d_ast
> >   ON xmms.d_trh
> >   USING btree
> >   (revision varchar_pattern_ops);
> >
> >
> > CREATE INDEX idx_d_ast2
> >   ON xmms.d_trh
> >   USING btree
> >   (revision);
> >
> > after creating it, seems like it is still doing the seq_scan.
> > So what gives? Can I get more clues here?
> >
> Post your query and the explain analyze of it and how many rows are in
> the table.


explain analyse select count(*) from d_trr_iw
--where ast_revision like '^B2.%.SX'
where ast_revision  = 'B2.P.SX'

QUERY PLAN
Aggregate  (cost=353955.35..353955.36 rows=1 width=0) (actual time=54.565..54.566 rows=1 loops=1)
  ->  Bitmap Heap Scan on d_trr_iw  (cost=3150.63..353593.31 rows=144813 width=0) (actual time=54.557..54.557 rows=0
loops=1)
        Recheck Cond: ((ast_revision)::text = 'B2.P.SX'::text)
        ->  Bitmap Index Scan on idx_d_trr_iw_ast  (cost=0.00..3114.42 rows=144813 width=0) (actual time=54.520..54.520
rows=0loops=1) 
              Index Cond: ((ast_revision)::text = 'B2.P.SX'::text)
Total runtime: 54.662 ms


>
> In my database, there's 7200 rows in items and I know that none of the
> identifiers for them start with 'xb'.  As you can see below, the 1st
> query is sequential and the 2nd one is using the new index.  (v8.3.0)
>
There's approx 29million rows in there and using the LIKE condition will
churn it for a good 20-30min I suppose. (didn't try - live database)

> postgres=# show lc_collate;
>        lc_collate
> ------------------------
>  English_Australia.1252
> (1 row)
>
> postgres=# explain analyse select * from items where identifier like 'xb%';
>                                              QUERY PLAN
> -----------------------------------------------------------------------------------------------------
>  Seq Scan on items  (cost=0.00..160.18 rows=1 width=113) (actual
> time=4.966..4.966 rows=0 loops=1)
>    Filter: ((identifier)::text ~~ 'xb%'::text)
>  Total runtime: 5.029 ms
> (3 rows)
>
> postgres=# create index anindex on items(identifier varchar_pattern_ops);
> CREATE INDEX
> postgres=# explain analyse select * from items where identifier like 'xb%';
>                                                     QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
>  Index Scan using anindex on items  (cost=0.00..8.27 rows=1 width=113)
> (actual time=0.165..0.165 rows=0 loops=1)
>    Index Cond: (((identifier)::text ~>=~ 'xb'::text) AND
> ((identifier)::text ~<~ 'xc'::text))
>    Filter: ((identifier)::text ~~ 'xb%'::text)
>  Total runtime: 0.255 ms
> (4 rows)
>
>

Could it be that it's not able to determine the B2.%.SX in there?

explain select count(*) from d_trr_iw where ast_revision like 'B2.P.SX'

even this will result in a seq_scan.

Re: LIKE not using indexes (due to locale issue?)

From

Klint Gore

Date:

25 June 2008, 05:22:34

Ow Mun Heng wrote:
> Could it be that it's not able to determine the B2.%.SX in there?
>
> explain select count(*) from d_trr_iw where ast_revision like 'B2.P.SX'
>
> even this will result in a seq_scan.
>
How many values have you got that start with 'B2.'?  If it's more than
about 5% then it's probably quicker to sequential scan anyway.

Since the equals query says there's one row with a P in the middle, what
does this say?

explain select count(*) from d_trr_iw where ast_revision like 'B2.P%.SX'

or pick a value that you know can't exist e.g. if ast_revision can never start with WW then

explain select count(*) from d_trr_iw where ast_revision like 'WW.%.SX'

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au