planner question.. - Mailing list pgsql-sql

From Rajesh Kumar Mallah
Subject planner question..
Date
Msg-id 200304171404.10306.mallah@trade-india.com
Whole thread Raw
Responses Re: planner question..  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: planner question..  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql

Hi,

For a distribution of data like below why does the planner
choses to do an index scan by default for source = 'REGIS' when > 50%
of the rows are having source='REGIS'.

Of course index produced better results 22 sec versus 4 secs.
but can someone explain this case.

tradein_clients=# SELECT source,count(*) from  profile_master group by source;
+----------+--------+
|  source  | count  |
+----------+--------+
| BRANDING |  64008 |
| CATALOG  |    711 |
| EYP      |  10380 |
| IID      |    349 |
| IP       |    493 |
| REGIS    | 102090 |
+----------+--------+
(6 rows)

tradein_clients=# SELECT count(*) from  profile_master ;
+--------+
| count  |
+--------+
| 178031 |
+--------+
(1 row)


tradein_clients=# SET enable_indexscan=off;
tradein_clients=# explain analyze SELECT count(*) from  profile_master where source='REGIS';

+------------------------------------------------------------------------------------------------------------------------------+
|                                                          QUERY PLAN
      |
 

+------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=23982.58..23982.58 rows=1 width=0) (actual time=22872.97..22872.97 rows=1 loops=1)
      |
 
|   ->  Seq Scan on profile_master  (cost=0.00..23970.40 rows=4871 width=0) (actual time=328.44..22730.69 rows=102090
loops=1)|
 
|         Filter: (source = 'REGIS'::character varying)
      |
 
| Total runtime: 22873.03 msec
      |
 

+------------------------------------------------------------------------------------------------------------------------------+
(4 rows)

tradein_clients=# SET enable_indexscan=on;
SET
tradein_clients=# explain analyze SELECT count(*) from  profile_master where source='REGIS';

+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                          QUERY PLAN
                                       |
 

+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=18225.48..18225.48 rows=1 width=0) (actual time=5919.24..5919.24 rows=1 loops=1)
                                       |
 
|   ->  Index Scan using profile_master_index_source on profile_master  (cost=0.00..18213.31 rows=4871 width=0) (actual
time=9.43..5786.15rows=102090 loops=1) |
 
|         Index Cond: (source = 'REGIS'::character varying)
                                       |
 
| Total runtime: 5919.31 msec
                                       |
 

+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
(4 rows)

-- 
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.



pgsql-sql by date:

Previous
From: Rajesh Kumar Mallah
Date:
Subject: analyse question..
Next
From: Rajesh Kumar Mallah
Date:
Subject: Re: IN Qeury Problem