Thread: LIKE Query performance

LIKE Query performance

From
"Hari, Balaji"
Date:

Hi,

 

I am relatively new to PostgreSQL(8.1) and facing the following problem.

 

We have indexes defined on timestamp and description (create index description_idx on event using btree (description varchar_pattern_ops))

 

EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name, root_session_number, severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description like '%mismatch%' ORDER BY timestamp desc;

                                                      QUERY PLAN                                                     

----------------------------------------------------------------------------------------------------------------------

 Sort  (cost=36267.09..36272.73 rows=2256 width=314) (actual time=19255.075..20345.774 rows=647537 loops=1)

   Sort Key: "timestamp"

   Sort Method:  external merge  Disk: 194080kB

   ->  Seq Scan on event  (cost=0.00..36141.44 rows=2256 width=314) (actual time=0.080..1475.041 rows=647537 loops=1)

         Filter: ((description)::text ~~ '%mismatch%'::text)

 Total runtime: 22547.292 ms

(6 rows)

 

But startsWith query use indexes.

 

EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name, root_session_number, severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description like 'mismatch%' ORDER BY timestamp desc;

                                                          QUERY PLAN                                                          

-------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=9.26..9.27 rows=1 width=314) (actual time=0.766..0.766 rows=0 loops=1)

   Sort Key: "timestamp"

   Sort Method:  quicksort  Memory: 17kB

   ->  Index Scan using description_idx on event  (cost=0.00..9.25 rows=1 width=314) (actual time=0.741..0.741 rows=0 loops=1)

         Index Cond: (((description)::text ~>=~ 'mismatch'::text) AND ((description)::text ~<~ 'mismatci'::text))

         Filter: ((description)::text ~~ 'mismatch%'::text)

 Total runtime: 0.919 ms

(7 rows)

 

Is there any tweaks to force pgsql to use index on description?

 

Balaji

 

P.S The event database has 700k records.

Re: LIKE Query performance

From
Oleg Bartunov
Date:
Only wildspeed http://www.sai.msu.su/~megera/wiki/wildspeed
has index support for %text%
But, it has limitations.

Oleg
On Tue, 27 Jan 2009, Hari, Balaji wrote:

> Hi,
>
> I am relatively new to PostgreSQL(8.1) and facing the following problem.
>
> We have indexes defined on timestamp and description (create index description_idx on event using btree (description
varchar_pattern_ops))
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'%mismatch%' ORDER BY timestamp desc; 
>                                                      QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
> Sort  (cost=36267.09..36272.73 rows=2256 width=314) (actual time=19255.075..20345.774 rows=647537 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  external merge  Disk: 194080kB
>   ->  Seq Scan on event  (cost=0.00..36141.44 rows=2256 width=314) (actual time=0.080..1475.041 rows=647537 loops=1)
>         Filter: ((description)::text ~~ '%mismatch%'::text)
> Total runtime: 22547.292 ms
> (6 rows)
>
> But startsWith query use indexes.
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'mismatch%' ORDER BY timestamp desc; 
>                                                          QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=9.26..9.27 rows=1 width=314) (actual time=0.766..0.766 rows=0 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  quicksort  Memory: 17kB
>   ->  Index Scan using description_idx on event  (cost=0.00..9.25 rows=1 width=314) (actual time=0.741..0.741 rows=0
loops=1)
>         Index Cond: (((description)::text ~>=~ 'mismatch'::text) AND ((description)::text ~<~ 'mismatci'::text))
>         Filter: ((description)::text ~~ 'mismatch%'::text)
> Total runtime: 0.919 ms
> (7 rows)
>
> Is there any tweaks to force pgsql to use index on description?
>
> Balaji
>
> P.S The event database has 700k records.
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: LIKE Query performance

From
Marcin Stępnicki
Date:
On Wed, Jan 28, 2009 at 12:41 AM, Hari, Balaji <Balaji_Hari@bmc.com> wrote:
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number,
> description, event_type_id, realm_name, root_session_number, severity,
> source_name, target_key, target_name, timestamp, jdo_version FROM event
> WHERE description like '%mismatch%' ORDER BY timestamp desc;
(...)
> Is there any tweaks to force pgsql to use index on description?

How long is usually the description? For me it sounds like the job for
tsearch2 module, which should be in "contrib" section in 8.1.

Re: LIKE Query performance

From
"A. Kretschmer"
Date:
In response to Hari, Balaji :
> Hi,
>
>
>
> I am relatively new to PostgreSQL(8.1) and facing the following problem.

Sure? 8.1? Your explain looks like 8.2 or 8.3. 8.3 contains a
full-text-search.

If really not 8.3 you can use tsearch2, it is a contrib-module.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: LIKE Query performance

From
Robert Haas
Date:
> Is there any tweaks to force pgsql to use index on description?

Even if you could force it to use the index, it wouldn't make the
query run faster.

As others have pointed out, what you really need is a different kind of index...

...Robert

Re: LIKE Query performance

From
"Hari, Balaji"
Date:
Is there a patch to make Wildspeed work with postgresql version 8.3.1?

P.S
My bad, the version number was incorrect in my previous mail.
-----Original Message-----
From: Oleg Bartunov [mailto:oleg@sai.msu.su]
Sent: Wednesday, January 28, 2009 1:27 AM
To: Hari, Balaji
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] LIKE Query performance

Only wildspeed http://www.sai.msu.su/~megera/wiki/wildspeed
has index support for %text%
But, it has limitations.

Oleg
On Tue, 27 Jan 2009, Hari, Balaji wrote:

> Hi,
>
> I am relatively new to PostgreSQL(8.1) and facing the following problem.
>
> We have indexes defined on timestamp and description (create index description_idx on event using btree (description
varchar_pattern_ops))
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'%mismatch%' ORDER BY timestamp desc; 
>                                                      QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
> Sort  (cost=36267.09..36272.73 rows=2256 width=314) (actual time=19255.075..20345.774 rows=647537 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  external merge  Disk: 194080kB
>   ->  Seq Scan on event  (cost=0.00..36141.44 rows=2256 width=314) (actual time=0.080..1475.041 rows=647537 loops=1)
>         Filter: ((description)::text ~~ '%mismatch%'::text)
> Total runtime: 22547.292 ms
> (6 rows)
>
> But startsWith query use indexes.
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'mismatch%' ORDER BY timestamp desc; 
>                                                          QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=9.26..9.27 rows=1 width=314) (actual time=0.766..0.766 rows=0 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  quicksort  Memory: 17kB
>   ->  Index Scan using description_idx on event  (cost=0.00..9.25 rows=1 width=314) (actual time=0.741..0.741 rows=0
loops=1)
>         Index Cond: (((description)::text ~>=~ 'mismatch'::text) AND ((description)::text ~<~ 'mismatci'::text))
>         Filter: ((description)::text ~~ 'mismatch%'::text)
> Total runtime: 0.919 ms
> (7 rows)
>
> Is there any tweaks to force pgsql to use index on description?
>
> Balaji
>
> P.S The event database has 700k records.
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: LIKE Query performance

From
"Hari, Balaji"
Date:
We won't need full text searching capabilities as in documents as the data type is varchar.

Wildspeed will exactly fit our needs.

-----Original Message-----
From: Oleg Bartunov [mailto:oleg@sai.msu.su]
Sent: Wednesday, January 28, 2009 1:27 AM
To: Hari, Balaji
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] LIKE Query performance

Only wildspeed http://www.sai.msu.su/~megera/wiki/wildspeed
has index support for %text%
But, it has limitations.

Oleg
On Tue, 27 Jan 2009, Hari, Balaji wrote:

> Hi,
>
> I am relatively new to PostgreSQL(8.1) and facing the following problem.
>
> We have indexes defined on timestamp and description (create index description_idx on event using btree (description
varchar_pattern_ops))
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'%mismatch%' ORDER BY timestamp desc; 
>                                                      QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
> Sort  (cost=36267.09..36272.73 rows=2256 width=314) (actual time=19255.075..20345.774 rows=647537 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  external merge  Disk: 194080kB
>   ->  Seq Scan on event  (cost=0.00..36141.44 rows=2256 width=314) (actual time=0.080..1475.041 rows=647537 loops=1)
>         Filter: ((description)::text ~~ '%mismatch%'::text)
> Total runtime: 22547.292 ms
> (6 rows)
>
> But startsWith query use indexes.
>
> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'mismatch%' ORDER BY timestamp desc; 
>                                                          QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=9.26..9.27 rows=1 width=314) (actual time=0.766..0.766 rows=0 loops=1)
>   Sort Key: "timestamp"
>   Sort Method:  quicksort  Memory: 17kB
>   ->  Index Scan using description_idx on event  (cost=0.00..9.25 rows=1 width=314) (actual time=0.741..0.741 rows=0
loops=1)
>         Index Cond: (((description)::text ~>=~ 'mismatch'::text) AND ((description)::text ~<~ 'mismatci'::text))
>         Filter: ((description)::text ~~ 'mismatch%'::text)
> Total runtime: 0.919 ms
> (7 rows)
>
> Is there any tweaks to force pgsql to use index on description?
>
> Balaji
>
> P.S The event database has 700k records.
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: LIKE Query performance

From
Oleg Bartunov
Date:
On Thu, 29 Jan 2009, Hari, Balaji wrote:

> Is there a patch to make Wildspeed work with postgresql version 8.3.1?

unfortunately, no.

>
> P.S
> My bad, the version number was incorrect in my previous mail.
> -----Original Message-----
> From: Oleg Bartunov [mailto:oleg@sai.msu.su]
> Sent: Wednesday, January 28, 2009 1:27 AM
> To: Hari, Balaji
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] LIKE Query performance
>
> Only wildspeed http://www.sai.msu.su/~megera/wiki/wildspeed
> has index support for %text%
> But, it has limitations.
>
> Oleg
> On Tue, 27 Jan 2009, Hari, Balaji wrote:
>
>> Hi,
>>
>> I am relatively new to PostgreSQL(8.1) and facing the following problem.
>>
>> We have indexes defined on timestamp and description (create index description_idx on event using btree (description
varchar_pattern_ops))
>>
>> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'%mismatch%' ORDER BY timestamp desc; 
>>                                                      QUERY PLAN
>>
----------------------------------------------------------------------------------------------------------------------
>> Sort  (cost=36267.09..36272.73 rows=2256 width=314) (actual time=19255.075..20345.774 rows=647537 loops=1)
>>   Sort Key: "timestamp"
>>   Sort Method:  external merge  Disk: 194080kB
>>   ->  Seq Scan on event  (cost=0.00..36141.44 rows=2256 width=314) (actual time=0.080..1475.041 rows=647537 loops=1)
>>         Filter: ((description)::text ~~ '%mismatch%'::text)
>> Total runtime: 22547.292 ms
>> (6 rows)
>>
>> But startsWith query use indexes.
>>
>> EXPLAIN ANALYZE SELECT event_id, category, current_session_number, description, event_type_id, realm_name,
root_session_number,severity, source_name, target_key, target_name, timestamp, jdo_version FROM event WHERE description
like'mismatch%' ORDER BY timestamp desc; 
>>                                                          QUERY PLAN
>>
-------------------------------------------------------------------------------------------------------------------------------
>> Sort  (cost=9.26..9.27 rows=1 width=314) (actual time=0.766..0.766 rows=0 loops=1)
>>   Sort Key: "timestamp"
>>   Sort Method:  quicksort  Memory: 17kB
>>   ->  Index Scan using description_idx on event  (cost=0.00..9.25 rows=1 width=314) (actual time=0.741..0.741 rows=0
loops=1)
>>         Index Cond: (((description)::text ~>=~ 'mismatch'::text) AND ((description)::text ~<~ 'mismatci'::text))
>>         Filter: ((description)::text ~~ 'mismatch%'::text)
>> Total runtime: 0.919 ms
>> (7 rows)
>>
>> Is there any tweaks to force pgsql to use index on description?
>>
>> Balaji
>>
>> P.S The event database has 700k records.
>>
>
>     Regards,
>         Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83