Query optimizer 8.0.1 (and 8.0) - Mailing list pgsql-hackers
| From | pgsql@mohawksoft.com |
|---|---|
| Subject | Query optimizer 8.0.1 (and 8.0) |
| Date | |
| Msg-id | 16398.24.91.171.78.1107555018.squirrel@mail.mohawksoft.com Whole thread Raw |
| Responses |
Re: Query optimizer 8.0.1 (and 8.0)
Re: Query optimizer 8.0.1 (and 8.0) |
| List | pgsql-hackers |
Here's one:
I have the USA census TIGER database loaded, the WHOLE THING, the whole
country. It isn't the biggest database, but it is about 40G before
indexes. Every table is over a few million rows. I can quite honestly say,
a sequential scan is almost never the right thing to do.
Info below.
Here is the query:
select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr = 2186 or
zipl=2186);
Well, when you look at the explain, obviously it looks like the sequential
scan is better, but trust me, it takes a good few minutes for the query to
respond with sequential scan enabled. I've run analyze and everything.
I suspect that analyze only samples a very small amount of the database
and gets the wrong idea about it. Is there a way to force analyze to
sample more rows?
tiger=# analyze verbose rt2 ;
INFO: analyzing "public.rt2"
INFO: "rt2": scanned 3000 of 1139825 pages, containing 84160 live rows
and 0 dead rows; 3000 rows in sample, 31975891 estimated total rows
ANALYZE
tiger=# analyze verbose rt1 ;
INFO: analyzing "public.rt1"
INFO: "rt1": scanned 3000 of 1527360 pages, containing 90951 live rows
and 0 dead rows; 3000 rows in sample, 46304973 estimated total rows
ANALYZE
tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr
= 2186 or zipl=2186); QUERY PLAN
-----------------------------------------------------------------------------------------------------Hash Join
(cost=121978.81..3996190.89rows=21118 width=520) Hash Cond: ("outer".tlid = "inner".tlid) -> Seq Scan on rt2
(cost=0.00..1459291.36rows=31946636 width=218) -> Hash (cost=120662.36..120662.36 rows=30579 width=302) ->
IndexScan using rt1_zipr, rt1_zipl on rt1
(cost=0.00..120662.36 rows=30579 width=302) Index Cond: ((zipr = 2186) OR (zipl = 2186))
(6 rows)
tiger=# set enable_seqscan=no;
SET
tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr
= 2186 or zipl=2186); QUERY PLAN
-----------------------------------------------------------------------------------------------Nested Loop
(cost=0.00..46868256.00rows=21118 width=520) -> Index Scan using rt1_zipr, rt1_zipl on rt1 (cost=0.00..120662.36
rows=30579 width=302) Index Cond: ((zipr = 2186) OR (zipl = 2186)) -> Index Scan using rt2_tlid on rt2
(cost=0.00..1523.80rows=396
width=218) Index Cond: ("outer".tlid = rt2.tlid)
Table "public.rt1" Column | Type | Modifiers
-----------+-------------------+-----------tlid | integer |side1 | character varying |source |
charactervarying |fedirp | character varying |fename | character varying |fetype | character varying |fedirs
| character varying |cfcc | character varying |fraddl | character varying |toaddl | character varying
|fraddr | character varying |toaddr | character varying |friaddl | character varying |toiaddl | character
varying|friaddr | character varying |toiaddr | character varying |zipl | integer |zipr |
integer |aianhhfpl | character varying |aianhhfpr | character varying |aihhtlil | character varying |aihhtlir
| character varying |statel | character varying |stater | character varying |countyl | character varying
|countyr | character varying |cousubl | character varying |cousubr | character varying |submcdl | character
varying|submcdr | character varying |placel | character varying |placer | character varying |tractl |
charactervarying |tractr | character varying |blockl | character varying |blockr | character varying |frlong
| numeric |frlat | numeric |tolong | numeric |tolat | numeric |
Indexes: "rt1_fename" btree (fename) "rt1_frlat" btree (frlat) "rt1_frlong" btree (frlong) "rt1_tlid" btree
(tlid) "rt1_tolat" btree (tolat) "rt1_tolong" btree (tolong) "rt1_zipl" btree (zipl) "rt1_zipr" btree (zipr)
Table "public.rt2"Column | Type | Modifiers
--------+---------+-----------tlid | integer |rtsq | integer |long1 | numeric |lat1 | numeric |long2 | numeric
|lat2 | numeric |long3 | numeric |lat3 | numeric |long4 | numeric |lat4 | numeric |long5 | numeric |lat5 |
numeric|long6 | numeric |lat6 | numeric |long7 | numeric |lat7 | numeric |long8 | numeric |lat8 | numeric
|long9 | numeric |lat9 | numeric |long10 | numeric |lat10 | numeric |
Indexes: "rt2_tlid" btree (tlid)
pgsql-hackers by date: