Home > mailing lists

Re: Why Index is not used - Mailing list pgsql-performance

From	Adarsh Sharma
Subject	Re: Why Index is not used
Date	March 25, 2011 04:04:48
Msg-id	4D8C3FAB.7070001@orkash.com Whole thread Raw
In response to	Re: Why Index is not used (Chetan Suttraway <chetan.suttraway@enterprisedb.com>)
Responses	Re: Why Index is not used Re: Why Index is not used
List	pgsql-performance

Tree view

Chetan Suttraway wrote:

On Fri, Mar 25, 2011 at 12:05 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
Dear all,

Today I got to run a query internally from my application by more than 10 connections.

But The query performed very badly. A the data size of tables are as :

pdc_uima=# select pg_size_pretty(pg_total_relation_size('clause2'));
pg_size_pretty
----------------
5858 MB
(1 row)

pdc_uima=# select pg_size_pretty(pg_total_relation_size('svo2')); pg_size_pretty
----------------
4719 MB
(1 row)

I explain the query as after making the indexes as :

pdc_uima=# explain select c.clause, s.* from clause2 c, svo2 s where c.clause_id=s.clause_id and s.doc_id=c.source_id and c.
pdc_uima-# sentence_id=s.sentence_id ;
QUERY PLAN --------------------------------------------------------------------------------------------------------------
Merge Join (cost=5673831.05..34033959.87 rows=167324179 width=2053)
Merge Cond: ((s.clause_id = c.clause_id) AND (s.doc_id = c.source_id) AND (s.sentence_id = c.sentence_id))
-> Index Scan using idx_svo2 on svo2 s (cost=0.00..24489343.65 rows=27471560 width=1993)
-> Materialize (cost=5673828.74..6071992.29 rows=31853084 width=72)
-> Sort (cost=5673828.74..5753461.45 rows=31853084 width=72)
Sort Key: c.clause_id, c.source_id, c.sentence_id
-> Seq Scan on clause2 c (cost=0.00..770951.84 rows=31853084 width=72)

Indexes are :

CREATE INDEX idx_clause ON clause2 USING btree (clause_id, source_id, sentence_id);
CREATE INDEX idx_svo2 ON svo2 USING btree (clause_id, doc_id, sentence_id);

I don't know why it not uses the index scan for clause2 table.

In this case, there are no predicates or filters on individual table. (maybe something like c.source_id=10)
so either of the 2 tables will have to go for simple scan.

Are you expecting seq. scan on svo2 and index scan on clause2?

As per the size consideration and the number of rows, I think index scan on clause2 is better.

Your constraint is valid but I need to perform this query faster.
What is the reason behind the seq scan of clause2.

Regards,
Adarsh

pgsql-performance by date:

From: Chetan Suttraway
Date: 25 March 2011, 03:56:50
Subject: Re: Why Index is not used

From: Thomas Kellerer
Date: 25 March 2011, 04:24:57
Subject: Re: Why Index is not used

Re: Why Index is not used - Mailing list pgsql-performance

Previous

Next