Query optimizer 8.0.1 (and 8.0) - Mailing list pgsql-hackers
From | pgsql@mohawksoft.com |
---|---|
Subject | Query optimizer 8.0.1 (and 8.0) |
Date | |
Msg-id | 16398.24.91.171.78.1107555018.squirrel@mail.mohawksoft.com Whole thread Raw |
Responses |
Re: Query optimizer 8.0.1 (and 8.0)
Re: Query optimizer 8.0.1 (and 8.0) |
List | pgsql-hackers |
Here's one: I have the USA census TIGER database loaded, the WHOLE THING, the whole country. It isn't the biggest database, but it is about 40G before indexes. Every table is over a few million rows. I can quite honestly say, a sequential scan is almost never the right thing to do. Info below. Here is the query: select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr = 2186 or zipl=2186); Well, when you look at the explain, obviously it looks like the sequential scan is better, but trust me, it takes a good few minutes for the query to respond with sequential scan enabled. I've run analyze and everything. I suspect that analyze only samples a very small amount of the database and gets the wrong idea about it. Is there a way to force analyze to sample more rows? tiger=# analyze verbose rt2 ; INFO: analyzing "public.rt2" INFO: "rt2": scanned 3000 of 1139825 pages, containing 84160 live rows and 0 dead rows; 3000 rows in sample, 31975891 estimated total rows ANALYZE tiger=# analyze verbose rt1 ; INFO: analyzing "public.rt1" INFO: "rt1": scanned 3000 of 1527360 pages, containing 90951 live rows and 0 dead rows; 3000 rows in sample, 46304973 estimated total rows ANALYZE tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr = 2186 or zipl=2186); QUERY PLAN -----------------------------------------------------------------------------------------------------Hash Join (cost=121978.81..3996190.89rows=21118 width=520) Hash Cond: ("outer".tlid = "inner".tlid) -> Seq Scan on rt2 (cost=0.00..1459291.36rows=31946636 width=218) -> Hash (cost=120662.36..120662.36 rows=30579 width=302) -> IndexScan using rt1_zipr, rt1_zipl on rt1 (cost=0.00..120662.36 rows=30579 width=302) Index Cond: ((zipr = 2186) OR (zipl = 2186)) (6 rows) tiger=# set enable_seqscan=no; SET tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr = 2186 or zipl=2186); QUERY PLAN -----------------------------------------------------------------------------------------------Nested Loop (cost=0.00..46868256.00rows=21118 width=520) -> Index Scan using rt1_zipr, rt1_zipl on rt1 (cost=0.00..120662.36 rows=30579 width=302) Index Cond: ((zipr = 2186) OR (zipl = 2186)) -> Index Scan using rt2_tlid on rt2 (cost=0.00..1523.80rows=396 width=218) Index Cond: ("outer".tlid = rt2.tlid) Table "public.rt1" Column | Type | Modifiers -----------+-------------------+-----------tlid | integer |side1 | character varying |source | charactervarying |fedirp | character varying |fename | character varying |fetype | character varying |fedirs | character varying |cfcc | character varying |fraddl | character varying |toaddl | character varying |fraddr | character varying |toaddr | character varying |friaddl | character varying |toiaddl | character varying|friaddr | character varying |toiaddr | character varying |zipl | integer |zipr | integer |aianhhfpl | character varying |aianhhfpr | character varying |aihhtlil | character varying |aihhtlir | character varying |statel | character varying |stater | character varying |countyl | character varying |countyr | character varying |cousubl | character varying |cousubr | character varying |submcdl | character varying|submcdr | character varying |placel | character varying |placer | character varying |tractl | charactervarying |tractr | character varying |blockl | character varying |blockr | character varying |frlong | numeric |frlat | numeric |tolong | numeric |tolat | numeric | Indexes: "rt1_fename" btree (fename) "rt1_frlat" btree (frlat) "rt1_frlong" btree (frlong) "rt1_tlid" btree (tlid) "rt1_tolat" btree (tolat) "rt1_tolong" btree (tolong) "rt1_zipl" btree (zipl) "rt1_zipr" btree (zipr) Table "public.rt2"Column | Type | Modifiers --------+---------+-----------tlid | integer |rtsq | integer |long1 | numeric |lat1 | numeric |long2 | numeric |lat2 | numeric |long3 | numeric |lat3 | numeric |long4 | numeric |lat4 | numeric |long5 | numeric |lat5 | numeric|long6 | numeric |lat6 | numeric |long7 | numeric |lat7 | numeric |long8 | numeric |lat8 | numeric |long9 | numeric |lat9 | numeric |long10 | numeric |lat10 | numeric | Indexes: "rt2_tlid" btree (tlid)
pgsql-hackers by date: