Thread: How to raise index points when equal and like is used with gist?
Hello, I have a problem with query and index scan based on pg_trgm module. Here is few examples: First example is with equal: explain analyze SELECT * FROM table WHERE firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on abonats_tbl (cost=34.42..6043.65 rows=1 width=601) (actual time=2.885..14.062 rows=1 loops=1) Recheck Cond: (firstname = 'OLEG'::text) Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) Rows Removed by Filter: 1731 -> Bitmap Index Scan on table_firstname_idx (cost=0.00..34.42 rows=1690 width=0) (actual time=0.699..0.699 rows=1732 loops=1) Index Cond: (firstname = 'OLEG'::text) Total runtime: 14.126 ms (7 rows) But if I add one like with gist index result is not filtered here is example: explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on abonats_tbl (cost=1638.89..1816.65 rows=1 width=601) (actual time=219.793..219.793 rows=0 loops=1) Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ '12%'::text)) Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) Rows Removed by Filter: 65 -> BitmapAnd (cost=1638.89..1638.89 rows=45 width=0) (actual time=219.197..219.197 rows=0 loops=1) -> Bitmap Index Scan on table_firstname_idx (cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867 rows=1732 loops=1) Index Cond: (firstname = 'OLEG'::text) -> Bitmap Index Scan on table_phonegist_idx (cost=0.00..1604.22 rows=33995 width=0) (actual time=217.639..217.639 rows=33256 loops=1) Index Cond: (phone ~~ '12%'::text) Total runtime: 220.426 ms My question is: Is there any way how to make postgresql first to search from field that is with equal I have index there and then to filter result based to other conditions first gist and then other. I think may be I should play with index points. Im using postgresql 9.2.1 x86_64 Regards, C.
On Thu, Oct 11, 2012 at 2:23 AM, Condor <condor@stz-bg.com> wrote: > explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND firstname = > 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%'; > QUERY > PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on abonats_tbl (cost=1638.89..1816.65 rows=1 width=601) > (actual time=219.793..219.793 rows=0 loops=1) > Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ '12%'::text)) > Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) > Rows Removed by Filter: 65 > -> BitmapAnd (cost=1638.89..1638.89 rows=45 width=0) (actual > time=219.197..219.197 rows=0 loops=1) > -> Bitmap Index Scan on table_firstname_idx (cost=0.00..34.42 > rows=1690 width=0) (actual time=0.867..0.867 rows=1732 loops=1) > Index Cond: (firstname = 'OLEG'::text) > -> Bitmap Index Scan on table_phonegist_idx (cost=0.00..1604.22 > rows=33995 width=0) (actual time=217.639..217.639 rows=33256 loops=1) > Index Cond: (phone ~~ '12%'::text) > Total runtime: 220.426 ms > > > My question is: Is there any way how to make postgresql first to search from > field that is with equal I have index there and then to filter result based > to other conditions first gist and then other. > I think may be I should play with index points. What about dropping table_phonegist_idx index? Is it used somewhere else? ps. BTW how do you cope with the pg_trgm ASCII alphanumeric restriction? Transliteration? -- Sergey Konoplev a database and software architect http://www.linkedin.com/in/grayhemp Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984
Condor <condor@stz-bg.com> writes: > explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND > firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%'; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on abonats_tbl (cost=1638.89..1816.65 rows=1 > width=601) (actual time=219.793..219.793 rows=0 loops=1) > Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ > '12%'::text)) > Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) > Rows Removed by Filter: 65 > -> BitmapAnd (cost=1638.89..1638.89 rows=45 width=0) (actual > time=219.197..219.197 rows=0 loops=1) > -> Bitmap Index Scan on table_firstname_idx > (cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867 rows=1732 > loops=1) > Index Cond: (firstname = 'OLEG'::text) > -> Bitmap Index Scan on table_phonegist_idx > (cost=0.00..1604.22 rows=33995 width=0) (actual time=217.639..217.639 > rows=33256 loops=1) > Index Cond: (phone ~~ '12%'::text) > Total runtime: 220.426 ms You sure that server is 9.2? Because that looks like a planner bug we squelched some time ago, wherein it was way too enthusiastic about adding more indexes to a BitmapAnd. If it is 9.2, please send a self-contained test case, that is some test data (and settings, if you're using nondefault ones) that makes it do this. regards, tom lane
On 2012-10-12 01:14, Sergey Konoplev wrote: > On Thu, Oct 11, 2012 at 2:23 AM, Condor <condor@stz-bg.com> wrote: >> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND >> firstname = >> 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%'; >> >> QUERY >> PLAN >> >> ----------------------------------------------------------------------------------------------------------------------------------------------------- >> Bitmap Heap Scan on abonats_tbl (cost=1638.89..1816.65 rows=1 >> width=601) >> (actual time=219.793..219.793 rows=0 loops=1) >> Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ >> '12%'::text)) >> Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) >> Rows Removed by Filter: 65 >> -> BitmapAnd (cost=1638.89..1638.89 rows=45 width=0) (actual >> time=219.197..219.197 rows=0 loops=1) >> -> Bitmap Index Scan on table_firstname_idx >> (cost=0.00..34.42 >> rows=1690 width=0) (actual time=0.867..0.867 rows=1732 loops=1) >> Index Cond: (firstname = 'OLEG'::text) >> -> Bitmap Index Scan on table_phonegist_idx >> (cost=0.00..1604.22 >> rows=33995 width=0) (actual time=217.639..217.639 rows=33256 >> loops=1) >> Index Cond: (phone ~~ '12%'::text) >> Total runtime: 220.426 ms >> >> >> My question is: Is there any way how to make postgresql first to >> search from >> field that is with equal I have index there and then to filter >> result based >> to other conditions first gist and then other. >> I think may be I should play with index points. > > What about dropping table_phonegist_idx index? Is it used somewhere > else? > Ill try this night, no isn't used elsewhere. > ps. BTW how do you cope with the pg_trgm ASCII alphanumeric > restriction? Transliteration? The tel field has alphanumeric values and md5 hash values sometimes of some phone. Server is setup and started in CP1251 encoding.
On 2012-10-12 03:27, Tom Lane wrote: > Condor <condor@stz-bg.com> writes: >> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND >> firstname = 'OLEG' AND middlename || lastname LIKE >> '%KUZNICOV%IGORU%'; > >> QUERY PLAN >> >> ----------------------------------------------------------------------------------------------------------------------------------------------------- >> Bitmap Heap Scan on abonats_tbl (cost=1638.89..1816.65 rows=1 >> width=601) (actual time=219.793..219.793 rows=0 loops=1) >> Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ >> '12%'::text)) >> Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) >> Rows Removed by Filter: 65 >> -> BitmapAnd (cost=1638.89..1638.89 rows=45 width=0) (actual >> time=219.197..219.197 rows=0 loops=1) >> -> Bitmap Index Scan on table_firstname_idx >> (cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867 >> rows=1732 >> loops=1) >> Index Cond: (firstname = 'OLEG'::text) >> -> Bitmap Index Scan on table_phonegist_idx >> (cost=0.00..1604.22 rows=33995 width=0) (actual >> time=217.639..217.639 >> rows=33256 loops=1) >> Index Cond: (phone ~~ '12%'::text) >> Total runtime: 220.426 ms > > You sure that server is 9.2? Because that looks like a planner bug > we > squelched some time ago, wherein it was way too enthusiastic about > adding more indexes to a BitmapAnd. Yes, Im sure: PostgreSQL 9.2.1 on x86_64-slackware-linux-gnu, compiled by x86_64-slackware-linux-gcc (GCC) 4.7.1, 64-bit > > If it is 9.2, please send a self-contained test case, that is some > test > data (and settings, if you're using nondefault ones) that makes it do > this. > Hm ... strange problem I catch. When I try to reproduce the problem, with test table, I made a very little table http://pastebin.com/nEK3cRr2 When I run the same type of query results is different: Seq Scan on users (cost=0.00..1.12 rows=1 width=26) (actual time=0.014..0.016 rows=1 loops=1) Filter: ((tel ~~ '09%'::text) AND (firstname = 'GREG'::text) AND ((middlename || lastname) ~~ '%%'::text)) Rows Removed by Filter: 5 Total runtime: 0.042 ms (4 rows) Okay, may be the problem is because I use cp1251 encoding .. lets change the data values, drop table, insert cp1251 values, start vacuum and result was the same speed Total runtime: 0.052 ms the same type of scan was used: Seq Scan on users (cost=0.00..1.14 rows=1 width=132) (actual time=0.019..0.021 rows=1 loops=1) Filter: ((tel ~~ '09%'::text) AND (firstname = 'CP1251 CHARS HERE'::text) AND ((middlename || lastname) ~~ '%%'::text)) Rows Removed by Filter: 6 Total runtime: 0.052 ms Even without tel filed result and type of scan is the same (Seq Scan). Now first name is write in cyrillic and mean "GREG" (I replace it with CP1251 CHARS HERE, because some ppl did not have cyrillic encoding). When I run the same query on the same database but different table that give strange result Bitmap Heap Scan. Index field is the same like test table from pastebin, no difference. And here I must say the history of the table. That table was made on psql 7.3 version and migrate on every major upgrade of the server that require dump/restore of database if that information is valuable. Any one has ideas what is going wrong on that table ? Why the same query on two different table with the same data gives me different scan results ? Regards, C
On Fri, Oct 12, 2012 at 1:20 AM, Condor <condor@stz-bg.com> wrote: > Even without tel filed result and type of scan is the same (Seq Scan). This is because your table has to few rows and it is easier to seq scan. Add more rows, eg. 100 000, then ANALYZE the table and run tests. Use random() and generate_series() to generate the data. > > Now first name is write in cyrillic and mean "GREG" (I replace it with > CP1251 CHARS HERE, because some ppl did not have cyrillic encoding). When I > run the same query on the same database but different table that give > strange result Bitmap Heap Scan. Index field is the same like test table > from pastebin, no difference. > > > And here I must say the history of the table. That table was made on psql > 7.3 version and migrate on every major upgrade of the server that require > dump/restore of database if that information is valuable. > > Any one has ideas what is going wrong on that table ? Why the same query on > two different table with the same data gives me different scan results ? > > Regards, > > C > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Sergey Konoplev a database and software architect http://www.linkedin.com/in/grayhemp Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984
On 2012-10-12 11:30, Sergey Konoplev wrote: > On Fri, Oct 12, 2012 at 1:20 AM, Condor <condor@stz-bg.com> wrote: >> Even without tel filed result and type of scan is the same (Seq >> Scan). > > This is because your table has to few rows and it is easier to seq > scan. Add more rows, eg. 100 000, then ANALYZE the table and run > tests. Use random() and generate_series() to generate the data. > You was right, when I read documentation of pg_trgm I see how much time will take to search in 100 000 rows, but I was misled myself because did not expect to change the search scan. Seq to Bitmap. I understand my mistake and change query to: EXPLAIN ANALYZE WITH AS ab (SELECT * FROM tables WHERE firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%') SELECT * FROM ab WHERE tel LIKE '12%'; CTE Scan on ab (cost=6490.15..6531.14 rows=9 width=965) (actual time=2.256..20.017 rows=43 loops=1) Filter: (tel ~~ '12%'::text) Rows Removed by Filter: 1690 CTE ab -> Bitmap Heap Scan on tables (cost=39.87..6490.15 rows=1822 width=600) (actual time=1.789..17.817 rows=1733 loops=1) Recheck Cond: (firstname = 'OLEG'::text) Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text) -> Bitmap Index Scan on tables_firstname_idx (cost=0.00..39.42 rows=1823 width=0) (actual time=1.178..1.178 rows=1733 loops=1) Index Cond: (firstname = 'OLEG'::text) Total runtime: 20.278 ms Now is much better 20 ms vs 220 ms. Thanks for your help. Cheers, C