Thread: How to raise index points when equal and like is used with gist?

How to raise index points when equal and like is used with gist?

From
Condor
Date:
Hello,
I have a problem with query and index scan based on pg_trgm module.

Here is few examples:
First example is with equal:


  explain analyze SELECT * FROM table WHERE firstname = 'OLEG' AND
middlename || lastname LIKE '%KUZNICOV%IGORU%';
                                                              QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on abonats_tbl  (cost=34.42..6043.65 rows=1
width=601) (actual time=2.885..14.062 rows=1 loops=1)
    Recheck Cond: (firstname = 'OLEG'::text)
    Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
    Rows Removed by Filter: 1731
    ->  Bitmap Index Scan on table_firstname_idx  (cost=0.00..34.42
rows=1690 width=0) (actual time=0.699..0.699 rows=1732 loops=1)
          Index Cond: (firstname = 'OLEG'::text)
  Total runtime: 14.126 ms
(7 rows)

But if I add one like with gist index result is not filtered here is
example:


explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND
firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%';

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on abonats_tbl  (cost=1638.89..1816.65 rows=1
width=601) (actual time=219.793..219.793 rows=0 loops=1)
    Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~
'12%'::text))
    Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
    Rows Removed by Filter: 65
    ->  BitmapAnd  (cost=1638.89..1638.89 rows=45 width=0) (actual
time=219.197..219.197 rows=0 loops=1)
          ->  Bitmap Index Scan on table_firstname_idx
(cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867 rows=1732
loops=1)
                Index Cond: (firstname = 'OLEG'::text)
          ->  Bitmap Index Scan on table_phonegist_idx
(cost=0.00..1604.22 rows=33995 width=0) (actual time=217.639..217.639
rows=33256 loops=1)
                Index Cond: (phone ~~ '12%'::text)
  Total runtime: 220.426 ms


My question is: Is there any way how to make postgresql first to search
from field that is with equal I have index there and then to filter
result based to other conditions first gist and then other.
I think may be I should play with index points.


Im using postgresql 9.2.1 x86_64


Regards,
C.


Re: How to raise index points when equal and like is used with gist ?

From
Sergey Konoplev
Date:
On Thu, Oct 11, 2012 at 2:23 AM, Condor <condor@stz-bg.com> wrote:
> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND firstname =
> 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%';
>                                                                      QUERY
> PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on abonats_tbl  (cost=1638.89..1816.65 rows=1 width=601)
> (actual time=219.793..219.793 rows=0 loops=1)
>    Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~ '12%'::text))
>    Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
>    Rows Removed by Filter: 65
>    ->  BitmapAnd  (cost=1638.89..1638.89 rows=45 width=0) (actual
> time=219.197..219.197 rows=0 loops=1)
>          ->  Bitmap Index Scan on table_firstname_idx  (cost=0.00..34.42
> rows=1690 width=0) (actual time=0.867..0.867 rows=1732 loops=1)
>                Index Cond: (firstname = 'OLEG'::text)
>          ->  Bitmap Index Scan on table_phonegist_idx  (cost=0.00..1604.22
> rows=33995 width=0) (actual time=217.639..217.639 rows=33256 loops=1)
>                Index Cond: (phone ~~ '12%'::text)
>  Total runtime: 220.426 ms
>
>
> My question is: Is there any way how to make postgresql first to search from
> field that is with equal I have index there and then to filter result based
> to other conditions first gist and then other.
> I think may be I should play with index points.

What about dropping table_phonegist_idx index? Is it used somewhere else?

ps. BTW how do you cope with the pg_trgm ASCII alphanumeric
restriction? Transliteration?


--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984


Re: How to raise index points when equal and like is used with gist?

From
Tom Lane
Date:
Condor <condor@stz-bg.com> writes:
> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND
> firstname = 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%';

> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>   Bitmap Heap Scan on abonats_tbl  (cost=1638.89..1816.65 rows=1
> width=601) (actual time=219.793..219.793 rows=0 loops=1)
>     Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~
> '12%'::text))
>     Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
>     Rows Removed by Filter: 65
>     ->  BitmapAnd  (cost=1638.89..1638.89 rows=45 width=0) (actual
> time=219.197..219.197 rows=0 loops=1)
>           ->  Bitmap Index Scan on table_firstname_idx
> (cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867 rows=1732
> loops=1)
>                 Index Cond: (firstname = 'OLEG'::text)
>           ->  Bitmap Index Scan on table_phonegist_idx
> (cost=0.00..1604.22 rows=33995 width=0) (actual time=217.639..217.639
> rows=33256 loops=1)
>                 Index Cond: (phone ~~ '12%'::text)
>   Total runtime: 220.426 ms

You sure that server is 9.2?  Because that looks like a planner bug we
squelched some time ago, wherein it was way too enthusiastic about
adding more indexes to a BitmapAnd.

If it is 9.2, please send a self-contained test case, that is some test
data (and settings, if you're using nondefault ones) that makes it do
this.

            regards, tom lane


Re: How to raise index points when equal and like is usedwith gist ?

From
Condor
Date:
On 2012-10-12 01:14, Sergey Konoplev wrote:
> On Thu, Oct 11, 2012 at 2:23 AM, Condor <condor@stz-bg.com> wrote:
>> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND
>> firstname =
>> 'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%';
>>
>> QUERY
>> PLAN
>>
>>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Bitmap Heap Scan on abonats_tbl  (cost=1638.89..1816.65 rows=1
>> width=601)
>> (actual time=219.793..219.793 rows=0 loops=1)
>>    Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~
>> '12%'::text))
>>    Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
>>    Rows Removed by Filter: 65
>>    ->  BitmapAnd  (cost=1638.89..1638.89 rows=45 width=0) (actual
>> time=219.197..219.197 rows=0 loops=1)
>>          ->  Bitmap Index Scan on table_firstname_idx
>> (cost=0.00..34.42
>> rows=1690 width=0) (actual time=0.867..0.867 rows=1732 loops=1)
>>                Index Cond: (firstname = 'OLEG'::text)
>>          ->  Bitmap Index Scan on table_phonegist_idx
>> (cost=0.00..1604.22
>> rows=33995 width=0) (actual time=217.639..217.639 rows=33256
>> loops=1)
>>                Index Cond: (phone ~~ '12%'::text)
>>  Total runtime: 220.426 ms
>>
>>
>> My question is: Is there any way how to make postgresql first to
>> search from
>> field that is with equal I have index there and then to filter
>> result based
>> to other conditions first gist and then other.
>> I think may be I should play with index points.
>
> What about dropping table_phonegist_idx index? Is it used somewhere
> else?
>
Ill try this night, no isn't used elsewhere.

> ps. BTW how do you cope with the pg_trgm ASCII alphanumeric
> restriction? Transliteration?

The tel field has alphanumeric values and md5 hash values sometimes of
some phone.
Server is setup and started in CP1251 encoding.



Re: How to raise index points when equal and like is usedwith gist ?

From
Condor
Date:
On 2012-10-12 03:27, Tom Lane wrote:
> Condor <condor@stz-bg.com> writes:
>> explain analyze SELECT * FROM table WHERE phone LIKE '12%' AND
>> firstname = 'OLEG' AND middlename || lastname LIKE
>> '%KUZNICOV%IGORU%';
>
>> QUERY PLAN
>>
>>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>>   Bitmap Heap Scan on abonats_tbl  (cost=1638.89..1816.65 rows=1
>> width=601) (actual time=219.793..219.793 rows=0 loops=1)
>>     Recheck Cond: ((firstname = 'OLEG'::text) AND (phone ~~
>> '12%'::text))
>>     Filter: ((middlename || lastname) ~~ '%KUZNICOV%IGORU%'::text)
>>     Rows Removed by Filter: 65
>>     ->  BitmapAnd  (cost=1638.89..1638.89 rows=45 width=0) (actual
>> time=219.197..219.197 rows=0 loops=1)
>>           ->  Bitmap Index Scan on table_firstname_idx
>> (cost=0.00..34.42 rows=1690 width=0) (actual time=0.867..0.867
>> rows=1732
>> loops=1)
>>                 Index Cond: (firstname = 'OLEG'::text)
>>           ->  Bitmap Index Scan on table_phonegist_idx
>> (cost=0.00..1604.22 rows=33995 width=0) (actual
>> time=217.639..217.639
>> rows=33256 loops=1)
>>                 Index Cond: (phone ~~ '12%'::text)
>>   Total runtime: 220.426 ms
>
> You sure that server is 9.2?  Because that looks like a planner bug
> we
> squelched some time ago, wherein it was way too enthusiastic about
> adding more indexes to a BitmapAnd.

Yes, Im sure:
  PostgreSQL 9.2.1 on x86_64-slackware-linux-gnu, compiled by
x86_64-slackware-linux-gcc (GCC) 4.7.1, 64-bit

>
> If it is 9.2, please send a self-contained test case, that is some
> test
> data (and settings, if you're using nondefault ones) that makes it do
> this.
>

Hm ... strange problem I catch. When I try to reproduce the problem,
with test table, I made a very little table

http://pastebin.com/nEK3cRr2

When I run the same type of query results is different:
  Seq Scan on users  (cost=0.00..1.12 rows=1 width=26) (actual
time=0.014..0.016 rows=1 loops=1)
    Filter: ((tel ~~ '09%'::text) AND (firstname = 'GREG'::text) AND
((middlename || lastname) ~~ '%%'::text))
    Rows Removed by Filter: 5
  Total runtime: 0.042 ms
(4 rows)


Okay, may be the problem is because I use cp1251 encoding .. lets
change the data values, drop table, insert cp1251 values,
start vacuum and result was the same speed  Total runtime: 0.052 ms the
same type of scan was used:

  Seq Scan on users  (cost=0.00..1.14 rows=1 width=132) (actual
time=0.019..0.021 rows=1 loops=1)
    Filter: ((tel ~~ '09%'::text) AND (firstname = 'CP1251 CHARS
HERE'::text) AND ((middlename || lastname) ~~ '%%'::text))
    Rows Removed by Filter: 6
  Total runtime: 0.052 ms

Even without tel filed result and type of scan is the same (Seq Scan).

Now first name is write in cyrillic and mean "GREG" (I replace it with
CP1251 CHARS HERE, because some ppl did not have cyrillic encoding).
When I run the same query on the same database but different table that
give strange result Bitmap Heap Scan. Index field is the same like test
table from pastebin, no difference.


And here I must say the history of the table. That table was made on
psql 7.3 version and migrate on every major upgrade of the server that
require dump/restore of database if that information is valuable.

Any one has ideas what is going wrong on that table ? Why the same
query on two different table with the same data gives me different scan
results ?

Regards,
C




Re: How to raise index points when equal and like is used with gist ?

From
Sergey Konoplev
Date:
On Fri, Oct 12, 2012 at 1:20 AM, Condor <condor@stz-bg.com> wrote:
> Even without tel filed result and type of scan is the same (Seq Scan).

This is because your table has to few rows and it is easier to seq
scan. Add more rows, eg. 100 000, then ANALYZE the table and run
tests. Use random() and generate_series() to generate the data.

>
> Now first name is write in cyrillic and mean "GREG" (I replace it with
> CP1251 CHARS HERE, because some ppl did not have cyrillic encoding). When I
> run the same query on the same database but different table that give
> strange result Bitmap Heap Scan. Index field is the same like test table
> from pastebin, no difference.
>
>
> And here I must say the history of the table. That table was made on psql
> 7.3 version and migrate on every major upgrade of the server that require
> dump/restore of database if that information is valuable.
>
> Any one has ideas what is going wrong on that table ? Why the same query on
> two different table with the same data gives me different scan results ?
>
> Regards,
>
> C
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984


Re: How to raise index points when equal and like is usedwith gist ?

From
Condor
Date:
On 2012-10-12 11:30, Sergey Konoplev wrote:
> On Fri, Oct 12, 2012 at 1:20 AM, Condor <condor@stz-bg.com> wrote:
>> Even without tel filed result and type of scan is the same (Seq
>> Scan).
>
> This is because your table has to few rows and it is easier to seq
> scan. Add more rows, eg. 100 000, then ANALYZE the table and run
> tests. Use random() and generate_series() to generate the data.
>

You was right,
when I read documentation of pg_trgm I see how much time will take to
search in 100 000 rows, but I was misled myself because did not expect
to change the search scan. Seq to Bitmap. I understand my mistake and
change
query to:

EXPLAIN ANALYZE WITH AS ab (SELECT * FROM tables WHERE firstname =
'OLEG' AND middlename || lastname LIKE '%KUZNICOV%IGORU%')
SELECT * FROM ab WHERE tel LIKE '12%';

  CTE Scan on ab  (cost=6490.15..6531.14 rows=9 width=965) (actual
time=2.256..20.017 rows=43 loops=1)
    Filter: (tel ~~ '12%'::text)
    Rows Removed by Filter: 1690
    CTE ab
      ->  Bitmap Heap Scan on tables (cost=39.87..6490.15 rows=1822
width=600) (actual time=1.789..17.817 rows=1733 loops=1)
            Recheck Cond: (firstname = 'OLEG'::text)
            Filter: ((middlename || lastname) ~~
'%KUZNICOV%IGORU%'::text)
            ->  Bitmap Index Scan on tables_firstname_idx
(cost=0.00..39.42 rows=1823 width=0) (actual time=1.178..1.178 rows=1733
loops=1)
                  Index Cond: (firstname = 'OLEG'::text)
  Total runtime: 20.278 ms



Now is much better 20 ms vs 220 ms.


Thanks for your help.


Cheers,
C