Thread: weird query plan
I have a table: webdigest=# \d wd_urlusermaps 表 "public.wd_urlusermaps" 字段名 | 类型 | 修饰词 ---------+-----------------------------+------------------------------------------------------------- id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass) urlid | integer | not null tag | character varying(512) | title | character varying(512) | summary | character varying(1024) | comment | character varying(1024) | ctime | timestamp without time zone | mtime | timestamp without time zone | share | smallint | userid | integer | import | smallint | default 0 索引: "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid) "urlusermaps_urlid" btree (urlid) "urlusermaps_userid" btree (userid) "wd_urlusermaps_ctime_idx" btree (ctime) "wd_urlusermaps_share_idx" btree ("share") and target statistic set to 1000, and two different query plan: webdigest=# explain analyze select A.id as fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC limit 20 ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..4932.56 rows=20 width=96) (actual time=730.461..2374.435 rows=20 loops=1) -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367 rows=20 loops=1) Filter: (("share" = 1) AND (userid = 219177)) Total runtime: 2374.513 ms (4 rows) webdigest=# explain analyze select A.id as fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC limit 40 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905 rows=40 loops=1) -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual time=5.726..5.785 rows=40 loops=1) Sort Key: id -> Index Scan using urlusermaps_userid on wd_urlusermaps a (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616 rows=41 loops=1) Index Cond: (userid = 219177) Filter: ("share" = 1) Total runtime: 6.013 ms (7 rows) the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query shows so much difference? any hint would be greatly appreciated. -laser
sorry, forgot to mention our version, it's postgresql 8.2.3 -laser > I have a table: > webdigest=# \d wd_urlusermaps > 表 "public.wd_urlusermaps" > 字段名 | 类型 | 修饰词 > ---------+-----------------------------+------------------------------------------------------------- > id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass) > urlid | integer | not null > tag | character varying(512) | > title | character varying(512) | > summary | character varying(1024) | > comment | character varying(1024) | > ctime | timestamp without time zone | > mtime | timestamp without time zone | > share | smallint | > userid | integer | > import | smallint | default 0 > 索引: > "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER > "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid) > "urlusermaps_urlid" btree (urlid) > "urlusermaps_userid" btree (userid) > "wd_urlusermaps_ctime_idx" btree (ctime) > "wd_urlusermaps_share_idx" btree ("share") > > and target statistic set to 1000, and two different query plan: > > webdigest=# explain analyze select A.id as > fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from > wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC > limit 20 ; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..4932.56 rows=20 width=96) (actual > time=730.461..2374.435 rows=20 loops=1) > -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a > (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367 > rows=20 loops=1) > Filter: (("share" = 1) AND (userid = 219177)) > Total runtime: 2374.513 ms > (4 rows) > > webdigest=# explain analyze select A.id as > fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from > wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC > limit 40 ; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905 > rows=40 loops=1) > -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual > time=5.726..5.785 rows=40 loops=1) > Sort Key: id > -> Index Scan using urlusermaps_userid on wd_urlusermaps a > (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616 > rows=41 loops=1) > Index Cond: (userid = 219177) > Filter: ("share" = 1) > Total runtime: 6.013 ms > (7 rows) > > the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query > shows so much difference? > > any hint would be greatly appreciated. > > -laser > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > > >
I changed the query to : EXPLAIN ANALYZE select id from wd_urlusermaps where id in (select id from wd_urlusermaps where share =1 and userid='219177') order by id desc limit 20; and it's much better now (from real execute time), but the cost report higher then slower one above, may be I should do some tunning on planner parameter or is it a planner bug? QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=16118.83..16118.88 rows=20 width=4) (actual time=17.539..17.619 rows=20 loops=1) -> Sort (cost=16118.83..16121.57 rows=1094 width=4) (actual time=17.534..17.560 rows=20 loops=1) Sort Key: public.wd_urlusermaps.id -> Nested Loop (cost=6753.28..16063.61 rows=1094 width=4) (actual time=16.739..17.439 rows=41 loops=1) -> HashAggregate (cost=6753.28..6764.22 rows=1094 width=4) (actual time=16.707..16.786 rows=41 loops=1) -> Index Scan using urlusermaps_userid on wd_urlusermaps (cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563 rows=41 loops=1) Index Cond: (userid = 219177) Filter: ("share" = 1) -> Index Scan using wd_urlusermaps_pkey on wd_urlusermaps (cost=0.00..8.49 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=41) Index Cond: (public.wd_urlusermaps.id = public.wd_urlusermaps.id) Total runtime: 17.762 ms (11 rows) > sorry, forgot to mention our version, it's postgresql 8.2.3 > > -laser > >> I have a table: >> webdigest=# \d wd_urlusermaps >> 表 "public.wd_urlusermaps" >> 字段名 | 类型 | 修饰词 >> ---------+-----------------------------+------------------------------------------------------------- >> id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass) >> urlid | integer | not null >> tag | character varying(512) | >> title | character varying(512) | >> summary | character varying(1024) | >> comment | character varying(1024) | >> ctime | timestamp without time zone | >> mtime | timestamp without time zone | >> share | smallint | >> userid | integer | >> import | smallint | default 0 >> 索引: >> "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER >> "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid) >> "urlusermaps_urlid" btree (urlid) >> "urlusermaps_userid" btree (userid) >> "wd_urlusermaps_ctime_idx" btree (ctime) >> "wd_urlusermaps_share_idx" btree ("share") >> >> and target statistic set to 1000, and two different query plan: >> >> webdigest=# explain analyze select A.id as >> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from >> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC >> limit 20 ; >> QUERY PLAN >> -------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> Limit (cost=0.00..4932.56 rows=20 width=96) (actual >> time=730.461..2374.435 rows=20 loops=1) >> -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a >> (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367 >> rows=20 loops=1) >> Filter: (("share" = 1) AND (userid = 219177)) >> Total runtime: 2374.513 ms >> (4 rows) >> >> webdigest=# explain analyze select A.id as >> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from >> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC >> limit 40 ; >> QUERY PLAN >> --------------------------------------------------------------------------------------------------------------------------------------------------------- >> Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905 >> rows=40 loops=1) >> -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual >> time=5.726..5.785 rows=40 loops=1) >> Sort Key: id >> -> Index Scan using urlusermaps_userid on wd_urlusermaps a >> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616 >> rows=41 loops=1) >> Index Cond: (userid = 219177) >> Filter: ("share" = 1) >> Total runtime: 6.013 ms >> (7 rows) >> >> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query >> shows so much difference? >> >> any hint would be greatly appreciated. >> >> -laser >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> >> >> >> > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > > >
continue digging shows: set cpu_tuple_cost to 0.1; explain analyze select * from wd_urlusermaps where share =1 and userid='219177' order by id desc limit 20; SET 时间: 0.256 ms QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=7063.98..7064.03 rows=20 width=110) (actual time=6.047..6.130 rows=20 loops=1) -> Sort (cost=7063.98..7066.71 rows=1094 width=110) (actual time=6.043..6.070 rows=20 loops=1) Sort Key: id -> Index Scan using urlusermaps_userid on wd_urlusermaps (cost=0.00..7008.76 rows=1094 width=110) (actual time=0.710..5.838 rows=41 loops=1) Index Cond: (userid = 219177) Filter: ("share" = 1) Total runtime: 6.213 ms (7 rows) now it's what i need, which means we should increase cpu_tuple_cost for large RAM node (we got 16G RAN and the table only serveral hundred M) to avoid sort happened too early. is it true? -laser > I changed the query to : > EXPLAIN ANALYZE select id from wd_urlusermaps where id in (select id > from wd_urlusermaps where share =1 and userid='219177') order by id desc > limit 20; > > and it's much better now (from real execute time), but the cost report > higher > then slower one above, may be I should do some tunning on planner > parameter or > is it a planner bug? > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=16118.83..16118.88 rows=20 width=4) (actual > time=17.539..17.619 rows=20 loops=1) > -> Sort (cost=16118.83..16121.57 rows=1094 width=4) (actual > time=17.534..17.560 rows=20 loops=1) > Sort Key: public.wd_urlusermaps.id > -> Nested Loop (cost=6753.28..16063.61 rows=1094 width=4) (actual > time=16.739..17.439 rows=41 loops=1) > -> HashAggregate (cost=6753.28..6764.22 rows=1094 width=4) (actual > time=16.707..16.786 rows=41 loops=1) > -> Index Scan using urlusermaps_userid on wd_urlusermaps > (cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563 > rows=41 loops=1) > Index Cond: (userid = 219177) > Filter: ("share" = 1) > -> Index Scan using wd_urlusermaps_pkey on wd_urlusermaps > (cost=0.00..8.49 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=41) > Index Cond: (public.wd_urlusermaps.id = public.wd_urlusermaps.id) > Total runtime: 17.762 ms > (11 rows) > > >> sorry, forgot to mention our version, it's postgresql 8.2.3 >> >> -laser >> >> >>> I have a table: >>> webdigest=# \d wd_urlusermaps >>> 表 "public.wd_urlusermaps" >>> 字段名 | 类型 | 修饰词 >>> ---------+-----------------------------+------------------------------------------------------------- >>> id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass) >>> urlid | integer | not null >>> tag | character varying(512) | >>> title | character varying(512) | >>> summary | character varying(1024) | >>> comment | character varying(1024) | >>> ctime | timestamp without time zone | >>> mtime | timestamp without time zone | >>> share | smallint | >>> userid | integer | >>> import | smallint | default 0 >>> 索引: >>> "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER >>> "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid) >>> "urlusermaps_urlid" btree (urlid) >>> "urlusermaps_userid" btree (userid) >>> "wd_urlusermaps_ctime_idx" btree (ctime) >>> "wd_urlusermaps_share_idx" btree ("share") >>> >>> and target statistic set to 1000, and two different query plan: >>> >>> webdigest=# explain analyze select A.id as >>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from >>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC >>> limit 20 ; >>> QUERY PLAN >>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> Limit (cost=0.00..4932.56 rows=20 width=96) (actual >>> time=730.461..2374.435 rows=20 loops=1) >>> -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a >>> (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367 >>> rows=20 loops=1) >>> Filter: (("share" = 1) AND (userid = 219177)) >>> Total runtime: 2374.513 ms >>> (4 rows) >>> >>> webdigest=# explain analyze select A.id as >>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from >>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC >>> limit 40 ; >>> QUERY PLAN >>> --------------------------------------------------------------------------------------------------------------------------------------------------------- >>> Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905 >>> rows=40 loops=1) >>> -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual >>> time=5.726..5.785 rows=40 loops=1) >>> Sort Key: id >>> -> Index Scan using urlusermaps_userid on wd_urlusermaps a >>> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616 >>> rows=41 loops=1) >>> Index Cond: (userid = 219177) >>> Filter: ("share" = 1) >>> Total runtime: 6.013 ms >>> (7 rows) >>> >>> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query >>> shows so much difference? >>> >>> any hint would be greatly appreciated. >>> >>> -laser >>> >>> >>> ---------------------------(end of broadcast)--------------------------- >>> TIP 1: if posting/reading through Usenet, please send an appropriate >>> subscribe-nomail command to majordomo@postgresql.org so that your >>> message can get through to the mailing list cleanly >>> >>> >>> >>> >>> >> ---------------------------(end of broadcast)--------------------------- >> TIP 5: don't forget to increase your free space map settings >> >> >> >> > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > > >
"weiping" <laserlist@pgsqldb.com> writes: > -> Index Scan using urlusermaps_userid on wd_urlusermaps > (cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563 rows=41 loops=1) > Index Cond: (userid = 219177) > Filter: ("share" = 1) It's estimating 1094 rows and getting 41 rows. You might considering raising the statistics target for that table. Does it get accurate estimates for the number of rows for each of these? explain analyze select * from wd_urlusermaps where userid=219177 explain analyze select * from wd_urlusermaps where share=1 (the latter might take a while) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
weiping <laserlist@pgsqldb.com> writes: > -> Index Scan using urlusermaps_userid on wd_urlusermaps a > (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616 > rows=41 loops=1) > Index Cond: (userid = 219177) > Filter: ("share" = 1) > the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query > shows so much difference? Probably because the rowcount estimate is so far off (1094 vs 41). Possibly boosting the statistics target would help. regards, tom lane