Re: planner question.. - Mailing list pgsql-sql
From | Rajesh Kumar Mallah |
---|---|
Subject | Re: planner question.. |
Date | |
Msg-id | 200304172040.06654.mallah@trade-india.com Whole thread Raw |
In response to | Re: planner question.. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: planner question..
|
List | pgsql-sql |
On Thursday 17 Apr 2003 7:56 pm, you wrote: > Rajesh Kumar Mallah <mallah@trade-india.com> writes: > > For a distribution of data like below why does the planner > > choses to do an index scan by default for source = 'REGIS' when > 50% > > of the rows are having source='REGIS'. > > Are there a huge number of dead rows in the table? ("vacuum verbose" > would give some info) > Yes as you predicted after a vacuum verbose the planner did switch to seq scan and was only marginally slower than index scan. (do u think we shud live with it?) so whats the moral ? ANALYZE itself is not sufficient in updating pg_statistic, we must do VACUUM ANALYZE always , if so why would a seperate ANALYZE command exist ? here is the info (in the same order as commands were executed) tradein_clients=# explain analyze SELECT count(*) from general.profile_master where source='REGIS'; +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | QUERY PLAN | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Aggregate (cost=18225.48..18225.48 rows=1 width=0) (actual time=6295.33..6295.34 rows=1 loops=1) | | -> Index Scan using profile_master_index_source on profile_master (cost=0.00..18213.31 rows=4871 width=0) (actual time=14.54..6132.61rows=102090 loops=1) | | Index Cond: (source = 'REGIS'::character varying) | | Total runtime: 6295.41 msec | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ (4 rows) tradein_clients=# VACUUM VERBOSE general.profile_master; INFO: --Relation general.profile_master-- INFO: Index profile_master_profile_id_pkey: Pages 426; Tuples 178031: Deleted 11802. CPU 0.02s/0.12u sec elapsed 0.24sec. INFO: Index profile_master_index_userid: Pages 431; Tuples 178031: Deleted 8940. CPU 0.00s/0.13u sec elapsed 0.14sec. INFO: Index profile_master_index_creation_date: Pages 420; Tuples 178031: Deleted 8940. CPU 0.02s/0.12u sec elapsed0.20 sec. INFO: Index profile_master_index_company_id: Pages 419; Tuples 178031: Deleted 8940. CPU 0.01s/0.10u sec elapsed 0.15sec. INFO: Index profile_master_index_eyp_list_id: Pages 419; Tuples 178031: Deleted 8940. CPU 0.01s/0.11u sec elapsed0.45 sec. INFO: Index profile_master_index_iid_list_id: Pages 420; Tuples 178031: Deleted 8940. CPU 0.01s/0.10u sec elapsed0.46 sec. INFO: Index profile_master_index_ip_list_id: Pages 421; Tuples 178031: Deleted 8940. CPU 0.00s/0.11u sec elapsed 0.12sec. INFO: Index profile_master_index_catalog_company_id: Pages 419; Tuples 178031: Deleted 8940. CPU 0.00s/0.12u sec elapsed0.46 sec. INFO: Index profile_master_index_source: Pages 619; Tuples 178031: Deleted 8940. CPU 0.02s/0.09u sec elapsed 0.29sec. INFO: Removed 11802 tuples in 495 pages. CPU 0.01s/0.03u sec elapsed 0.14 sec. INFO: Pages 23923: Changed 36, Empty 0; Tup 178031: Vac 11802, Keep 0, UnUsed 325029. Total CPU 0.96s/1.19u sec elapsed11.97 sec. INFO: --Relation pg_toast.pg_toast_122045388-- INFO: Pages 31: Changed 0, Empty 0; Tup 189: Vac 0, Keep 0, UnUsed 0. Total CPU 0.00s/0.00u sec elapsed 0.03 sec. VACUUM tradein_clients=# explain analyze SELECT count(*) from general.profile_master where source='REGIS'; +-----------------------------------------------------------------------------------------------------------------------------+ | QUERY PLAN | +-----------------------------------------------------------------------------------------------------------------------------+ | Aggregate (cost=26398.37..26398.37 rows=1 width=0) (actual time=8355.76..8355.76 rows=1 loops=1) | | -> Seq Scan on profile_master (cost=0.00..26148.39 rows=99994 width=0) (actual time=16.23..8237.53 rows=102090 loops=1)| | Filter: (source = 'REGIS'::character varying) | | Total runtime: 8355.88 msec | +-----------------------------------------------------------------------------------------------------------------------------+ (4 rows) > The given result seems suspect; an indexscan couldn't possibly read >50% > of the rows in less than a quarter of the time for a seqscan. Unless > (a) the table contains vast amounts of empty space that the seqscan has to > slog through, or Sorry i dont understand the vacuum verbose output throughly , does it looks like though ? (b) your second measurement is bogus due to caching > performed by the first measurement. I am not sure this time but i have the habit of running EXPLAIN ANALYZE thrice and post the middle one ;-) > > Also, might the table be in order by the "source" column? A > sufficiently high correlation might have persuaded the planner to try an > indexscan even if point (a) isn't true. Yes the data is loaded from one source completely before loading from another source. So they were in order but i did a lot of updates and deletes. regds mallah. > > regards, tom lane > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- Rajesh Kumar Mallah, Project Manager (Development) Infocom Network Limited, New Delhi phone: +91(11)6152172 (221) (L) ,9811255597 (M) Visit http://www.trade-india.com , India's Leading B2B eMarketplace.