Thread: LIKE not using indexes (due to locale issue?)
explain select * from d_trr where revision like '^B2.%.SX' --where ast_revision = 'B2.M.SX' Seq Scan on d_trr (cost=0.00..2268460.98 rows=1 width=16) Filter: ((revision)::text ~~ '^B2.%.SX'::text) show lc_collate; en_US.UTF-8 Is it that this is handled by tsearch2? Or I need to do the locale to "C" for this to function? the revision is indexed. Bitmap Heap Scan on d_trr (cost=4492.43..547709.26 rows=193453 width=16) Recheck Cond: ((revision)::text = 'B2.M.SX'::text) -> Bitmap Index Scan on idx_d_trr_iw_ast (cost=0.00..4444.06 rows=193453 width=0) Index Cond: ((revision)::text = 'B2.M.SX'::text)
Ow Mun Heng wrote: > explain select * from d_trr where revision like '^B2.%.SX' > --where ast_revision = 'B2.M.SX' > > Seq Scan on d_trr (cost=0.00..2268460.98 rows=1 width=16) > Filter: ((revision)::text ~~ '^B2.%.SX'::text) > > show lc_collate; > en_US.UTF-8 > > Is it that this is handled by tsearch2? Or I need to do the locale to > "C" for this to function? > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. It tells you how to create an index that like might use in non-C locales. As a side note, the ^ in the string above has no special meaning for like. klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au
On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote: > Ow Mun Heng wrote: > > explain select * from d_trr where revision like '^B2.%.SX' > > --where ast_revision = 'B2.M.SX' > > > > Seq Scan on d_trr (cost=0.00..2268460.98 rows=1 width=16) > > Filter: ((revision)::text ~~ '^B2.%.SX'::text) > > > > show lc_collate; > > en_US.UTF-8 > > > > Is it that this is handled by tsearch2? Or I need to do the locale to > > "C" for this to function? > > > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. > It tells you how to create an index that like might use in non-C locales. Just more information. This columns is created with the varchar type. original index is created using CREATE INDEX idx_d_ast ON xmms.d_trh USING btree (revision varchar_pattern_ops); CREATE INDEX idx_d_ast2 ON xmms.d_trh USING btree (revision); after creating it, seems like it is still doing the seq_scan. So what gives? Can I get more clues here?
Ow Mun Heng wrote: > On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote: > > Ow Mun Heng wrote: > > > explain select * from d_trr where revision like '^B2.%.SX' > > > --where ast_revision = 'B2.M.SX' > > > > > > Seq Scan on d_trr (cost=0.00..2268460.98 rows=1 width=16) > > > Filter: ((revision)::text ~~ '^B2.%.SX'::text) > > > > > > show lc_collate; > > > en_US.UTF-8 > > > > > > Is it that this is handled by tsearch2? Or I need to do the locale to > > > "C" for this to function? > > > > > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. > > It tells you how to create an index that like might use in non-C locales. > > Just more information. This columns is created with the varchar type. > > original index is created using > > CREATE INDEX idx_d_ast > ON xmms.d_trh > USING btree > (revision varchar_pattern_ops); > > > CREATE INDEX idx_d_ast2 > ON xmms.d_trh > USING btree > (revision); > > after creating it, seems like it is still doing the seq_scan. > So what gives? Can I get more clues here? > Post your query and the explain analyze of it and how many rows are in the table. In my database, there's 7200 rows in items and I know that none of the identifiers for them start with 'xb'. As you can see below, the 1st query is sequential and the 2nd one is using the new index. (v8.3.0) postgres=# show lc_collate; lc_collate ------------------------ English_Australia.1252 (1 row) postgres=# explain analyse select * from items where identifier like 'xb%'; QUERY PLAN ----------------------------------------------------------------------------------------------------- Seq Scan on items (cost=0.00..160.18 rows=1 width=113) (actual time=4.966..4.966 rows=0 loops=1) Filter: ((identifier)::text ~~ 'xb%'::text) Total runtime: 5.029 ms (3 rows) postgres=# create index anindex on items(identifier varchar_pattern_ops); CREATE INDEX postgres=# explain analyse select * from items where identifier like 'xb%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Index Scan using anindex on items (cost=0.00..8.27 rows=1 width=113) (actual time=0.165..0.165 rows=0 loops=1) Index Cond: (((identifier)::text ~>=~ 'xb'::text) AND ((identifier)::text ~<~ 'xc'::text)) Filter: ((identifier)::text ~~ 'xb%'::text) Total runtime: 0.255 ms (4 rows) -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au
On Wed, 2008-06-25 at 17:00 +1000, Klint Gore wrote: > Ow Mun Heng wrote: > > On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote: > > > Ow Mun Heng wrote: > > > > explain select * from d_trr where revision like '^B2.%.SX' > > > > --where ast_revision = 'B2.M.SX' > > > > > > > > Seq Scan on d_trr (cost=0.00..2268460.98 rows=1 width=16) > > > > Filter: ((revision)::text ~~ '^B2.%.SX'::text) > > > > > > > > show lc_collate; > > > > en_US.UTF-8 > > > > > > > > Is it that this is handled by tsearch2? Or I need to do the locale to > > > > "C" for this to function? > > > > > > > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. > > > It tells you how to create an index that like might use in non-C locales. > > > > Just more information. This columns is created with the varchar type. > > > > original index is created using > > > > CREATE INDEX idx_d_ast > > ON xmms.d_trh > > USING btree > > (revision varchar_pattern_ops); > > > > > > CREATE INDEX idx_d_ast2 > > ON xmms.d_trh > > USING btree > > (revision); > > > > after creating it, seems like it is still doing the seq_scan. > > So what gives? Can I get more clues here? > > > Post your query and the explain analyze of it and how many rows are in > the table. explain analyse select count(*) from d_trr_iw --where ast_revision like '^B2.%.SX' where ast_revision = 'B2.P.SX' QUERY PLAN Aggregate (cost=353955.35..353955.36 rows=1 width=0) (actual time=54.565..54.566 rows=1 loops=1) -> Bitmap Heap Scan on d_trr_iw (cost=3150.63..353593.31 rows=144813 width=0) (actual time=54.557..54.557 rows=0 loops=1) Recheck Cond: ((ast_revision)::text = 'B2.P.SX'::text) -> Bitmap Index Scan on idx_d_trr_iw_ast (cost=0.00..3114.42 rows=144813 width=0) (actual time=54.520..54.520 rows=0loops=1) Index Cond: ((ast_revision)::text = 'B2.P.SX'::text) Total runtime: 54.662 ms > > In my database, there's 7200 rows in items and I know that none of the > identifiers for them start with 'xb'. As you can see below, the 1st > query is sequential and the 2nd one is using the new index. (v8.3.0) > There's approx 29million rows in there and using the LIKE condition will churn it for a good 20-30min I suppose. (didn't try - live database) > postgres=# show lc_collate; > lc_collate > ------------------------ > English_Australia.1252 > (1 row) > > postgres=# explain analyse select * from items where identifier like 'xb%'; > QUERY PLAN > ----------------------------------------------------------------------------------------------------- > Seq Scan on items (cost=0.00..160.18 rows=1 width=113) (actual > time=4.966..4.966 rows=0 loops=1) > Filter: ((identifier)::text ~~ 'xb%'::text) > Total runtime: 5.029 ms > (3 rows) > > postgres=# create index anindex on items(identifier varchar_pattern_ops); > CREATE INDEX > postgres=# explain analyse select * from items where identifier like 'xb%'; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------- > Index Scan using anindex on items (cost=0.00..8.27 rows=1 width=113) > (actual time=0.165..0.165 rows=0 loops=1) > Index Cond: (((identifier)::text ~>=~ 'xb'::text) AND > ((identifier)::text ~<~ 'xc'::text)) > Filter: ((identifier)::text ~~ 'xb%'::text) > Total runtime: 0.255 ms > (4 rows) > > Could it be that it's not able to determine the B2.%.SX in there? explain select count(*) from d_trr_iw where ast_revision like 'B2.P.SX' even this will result in a seq_scan.
Ow Mun Heng wrote: > Could it be that it's not able to determine the B2.%.SX in there? > > explain select count(*) from d_trr_iw where ast_revision like 'B2.P.SX' > > even this will result in a seq_scan. > How many values have you got that start with 'B2.'? If it's more than about 5% then it's probably quicker to sequential scan anyway. Since the equals query says there's one row with a P in the middle, what does this say? explain select count(*) from d_trr_iw where ast_revision like 'B2.P%.SX' or pick a value that you know can't exist e.g. if ast_revision can never start with WW then explain select count(*) from d_trr_iw where ast_revision like 'WW.%.SX' klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au