Re: index on ILIKE/LIKE - PostgreSQL 9.2 - Mailing list pgsql-general

From Melvin Davidson
Subject Re: index on ILIKE/LIKE - PostgreSQL 9.2
Date
Msg-id CANu8FizFEiaF=W_0NrO1u3LFJadaL+_1MZ3XFbqsMZiCoooccA@mail.gmail.com
Whole thread Raw
In response to index on ILIKE/LIKE - PostgreSQL 9.2  (Lucas Possamai <drum.lucas@gmail.com>)
Responses Re: index on ILIKE/LIKE - PostgreSQL 9.2
List pgsql-general


On Wed, May 11, 2016 at 8:30 PM, Lucas Possamai <drum.lucas@gmail.com> wrote:
Hi there!

I've got a simple but slow query:

 SELECT DISTINCT title  
FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
and clientid = 31239  AND time_job > 1457826264
order BY title
limit 10

Explain analyze: 

Limit  (cost=5946.40..5946.41 rows=1 width=19) (actual time=2746.759..2746.772 rows=1 loops=1)
  ->  Unique  (cost=5946.40..5946.41 rows=1 width=19) (actual time=2746.753..2746.763 rows=1 loops=1)
        ->  Sort  (cost=5946.40..5946.41 rows=1 width=19) (actual time=2746.750..2746.754 rows=4 loops=1)
              Sort Key: "title"
              Sort Method: quicksort  Memory: 25kB
              ->  Bitmap Heap Scan on "ja_jobs"  (cost=49.02..5946.39 rows=1 width=19) (actual time=576.275..2746.609 rows=4 loops=1)
                    Recheck Cond: (("clientid" = 31239) AND ("time_job" > 1457826264))
                    Filter: (("title")::"text" ~~* '%
RYAN WER%'::"text")
                    Rows Removed by Filter: 791
                    ->  Bitmap Index Scan on "ix_jobs_client_times"  (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870 rows=795 loops=1)
                          Index Cond: (("clientid" = 31239) AND ("time_job" > 1457826264))
Total runtime: 2746.879 ms

Then, I created a trgm index:

CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title gin_trgm_ops);

Explain analyze after the index: (Yes, I ran the analyze)

Limit  (cost=389.91..389.91 rows=1 width=20) (actual time=3720.511..3720.511 rows=0 loops=1)
  ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual time=3720.507..3720.507 rows=0 loops=1)
        ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual time=3720.505..3720.505 rows=0 loops=1)
              Sort Key: "title"
              Sort Method: quicksort  Memory: 25kB
              ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90 rows=1 width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
                    Recheck Cond: (("clientid" = 31239) AND ("time_job" > 1457826264) AND (("title")::"text" ~~ '%
RYAN WER%'::"text"))
                    Rows Removed by Index Recheck: 4
                    ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0) (actual time=3720.469..3720.469 rows=0 loops=1)
                          ->  Bitmap Index Scan on "ix_jobs_client_times"  (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795 loops=1)
                                Index Cond: (("clientid" = 31239) AND ("time_job" > 1457826264))
                          ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"  (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213 rows=32 loops=1)
                                Index Cond: (("title")::"text" ~~ '%
RYAN WER%'::"text")
Total runtime: 3720.653 ms 


so.. the query is still slow.. 
Do you guys  know what can be done ? related to the ILIKE?

cheers
Lucas

The main problem is WHERE title ILIKE '%RYAN WER%'
When you put a % on the left of the text, there is no way to optimize that, so yes, it will be slow.

If you can eliminate the leading percent and just have trailing, it will be much faster.


--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

pgsql-general by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Scaling Database for heavy load
Next
From: Melvin Davidson
Date:
Subject: Re: Scaling Database for heavy load