Re: Help with Query Tuning - Mailing list pgsql-performance

From Kenneth Marshall
Subject Re: Help with Query Tuning
Date
Msg-id 20110316163653.GF11541@aart.is.rice.edu
Whole thread Raw
In response to Help with Query Tuning  (Adarsh Sharma <adarsh.sharma@orkash.com>)
Responses Re: Help with Query Tuning  (Adarsh Sharma <adarsh.sharma@orkash.com>)
Re: Help with Query Tuning  (Adarsh Sharma <adarsh.sharma@orkash.com>)
List pgsql-performance
On Wed, Mar 16, 2011 at 02:43:38PM +0530, Adarsh Sharma wrote:
> Dear all,
>
> I am facing a problem while  creating the index to make the below query run
> faster. My table  size is near about 1065 MB and 428467 rows.
>
> explain analyze select  count(*)  from page_content where publishing_date
> like '%2010%' and content_language='en'  and content is not null and
> isprocessable = 1 and (content like '%Militant%'
> OR content like '%jihad%' OR  content like '%Mujahid%'  OR
> content like '%fedayeen%' OR content like '%insurgent%'  OR content like
> '%terrorist%' OR
>  content like '%cadre%'  OR content like '%civilians%' OR content like
> '%police%' OR content like '%defence%' OR content like '%cops%' OR content
> like '%crpf%' OR content like '%dsf%' OR content like '%ssb%') AND (content
> like '%kill%' or content like '%injure%');
>
> *Output:
>
> * Aggregate  (cost=107557.78..107557.79 rows=1 width=0) (actual
> time=18564.631..18564.631 rows=1 loops=1)
>   ->  Seq Scan on page_content  (cost=0.00..107466.82 rows=36381 width=0)
> (actual time=0.146..18529.371 rows=59918 loops=1)
>         Filter: ((content IS NOT NULL) AND (publishing_date ~~
> '%2010%'::text) AND (content_language = 'en'::bpchar) AND (isprocessable =
> 1) AND (((content)
> ::text ~~ '%kill%'::text) OR ((content)::text ~~ '%injure%'::text)) AND
> (((content)::text ~~ '%Militant%'::text) OR ((content)::text ~~
> '%jihad%'::text) OR (
> (content)::text ~~ '%Mujahid%'::text) OR ((content)::text ~~
> '%fedayeen%'::text) OR ((content)::text ~~ '%insurgent%'::text) OR
> ((content)::text ~~ '%terrori
> st%'::text) OR ((content)::text ~~ '%cadre%'::text) OR ((content)::text ~~
> '%civilians%'::text) OR ((content)::text ~~ '%police%'::text) OR
> ((content)::text
> ~~ '%defence%'::text) OR ((content)::text ~~ '%cops%'::text) OR
> ((content)::text ~~ '%crpf%'::text) OR ((content)::text ~~ '%dsf%'::text)
> OR ((content)::text
> ~~ '%ssb%'::text)))
> Total runtime: 18564.673 ms
>
>
> *Index on that Table :
>
> *CREATE INDEX idx_page_id
>  ON page_content
>  USING btree
>  (crawled_page_id);
>
> *Index I create :*
> CREATE INDEX idx_page_id_content
>  ON page_content
>  USING btree
>  (crawled_page_id,content_language,publishing_date,isprocessable);
>
> *Index that fail to create:
>
> *CREATE INDEX idx_page_id_content1
>  ON page_content
>  USING btree
>  (crawled_page_id,content);
>
> Error :-ERROR:  index row requires 13240 bytes, maximum size is 8191
> ********** Error **********
>
> ERROR: index row requires 13240 bytes, maximum size is 8191
> SQL state: 54000
>
> How to resolve this error
> Please give any suggestion to tune the query.
>
> Thanks & best Regards,
>
> Adarsh Sharma
>

You should probably be looking at using full-text indexing:

http://www.postgresql.org/docs/9.0/static/textsearch.html

or limit the size of content for the index.

Cheers,
Ken

pgsql-performance by date:

Previous
From: Adarsh Sharma
Date:
Subject: Help with Query Tuning
Next
From: "Kevin Grittner"
Date:
Subject: Re: Adding additional index causes 20,000x slowdown for certain select queries - postgres 9.0.3