Re: severe performance issue with planner - Mailing list pgsql-performance
From | Eric Brown |
---|---|
Subject | Re: severe performance issue with planner |
Date | |
Msg-id | BAY13-F143A1R1q0iLo000112cf@hotmail.com Whole thread Raw |
In response to | severe performance issue with planner ("Eric Brown" <bigwhitecow@hotmail.com>) |
Responses |
Re: severe performance issue with planner
|
List | pgsql-performance |
ok, at the advice of jurka, I will post new results: Here's the query as I have changed it now: SELECT w8.wid, w8.variant, w8.num_variants, sum_text(w8.unicode) as unicodes, sum_text(w8.pinyin) as pinyins FROM (words as w8 JOIN (words as w7 JOIN (words as w6 JOIN (words as w5 JOIN (words as w4 JOIN (words as w3 JOIN (words as w2 JOIN (words as w0 JOIN words as w1 ON(w1.wid = w0.wid AND w1.variant = w0.variant AND w1.sequence = w0.sequence + 1 AND w1.pinyin LIKE 'fu_')) ON(w2.wid = w1.wid AND w2.variant = w1.variant AND w2.sequence = w1.sequence + 1 AND w2.pinyin LIKE 'ji_')) ON(w3.wid = w2.wid AND w3.variant = w2.variant AND w3.sequence = w2.sequence + 1 AND w3.pinyin LIKE 'guan_')) ON(w4.wid = w3.wid AND w4.variant = w3.variant AND w4.sequence = w3.sequence + 1 AND w4.pinyin LIKE 'kai_')) ON(w5.wid = w4.wid AND w5.variant = w4.variant AND w5.sequence = w4.sequence + 1 AND w5.pinyin LIKE 'fang_')) ON(w6.wid = w5.wid AND w6.variant = w5.variant AND w6.sequence = w5.sequence + 1 AND w6.pinyin LIKE 'xi_')) ON(w7.wid = w6.wid AND w7.variant = w6.variant AND w7.sequence = w6.sequence + 1 AND w7.pinyin LIKE 'tong_')) ON(w8.wid = w7.wid AND w8.variant = w7.variant)) WHERE w0.wid > 0 AND w0.pinyin = 'zheng4' AND w0.def_exists = 't' AND w0.sequence = 0 GROUP BY w8.wid, w8.variant, w8.num_variants, w8.page_order, w0.sequence , w1.sequence , w2.sequence , w3.sequence , w4.sequence , w5.sequence , w6.sequence , w7.sequence ORDER BY w8.page_order; And here's the output of explain analyze: QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=54.26..54.26 rows=1 width=43) (actual time=14.916..14.917 rows=1 loops=1) Sort Key: w8.page_order -> HashAggregate (cost=54.24..54.25 rows=1 width=43) (actual time=14.891..14.892 rows=1 loops=1) -> Nested Loop (cost=0.00..54.10 rows=4 width=43) (actual time=3.676..14.446 rows=12 loops=1) -> Nested Loop (cost=0.00..48.09 rows=1 width=64) (actual time=3.638..14.269 rows=1 loops=1) Join Filter: ("outer"."sequence" = ("inner"."sequence" + 1)) -> Nested Loop (cost=0.00..42.06 rows=1 width=56) (actual time=3.581..14.181 rows=1 loops=1) Join Filter: (("inner".wid = "outer".wid) AND ("inner".variant = "outer".variant) AND ("inner"."sequence" = ("outer"."sequence" + 1)) AND ("outer"."sequence" = ("inner"."sequence" + 1))) -> Nested Loop (cost=0.00..36.05 rows=1 width=48) (actual time=2.152..12.443 rows=1 loops=1) Join Filter: (("outer"."sequence" = ("inner"."sequence" + 1)) AND ("inner"."sequence" = ("outer"."sequence" + 1))) -> Nested Loop (cost=0.00..30.03 rows=1 width=40) (actual time=2.104..12.368 rows=1 loops=1) Join Filter: (("outer".variant = "inner".variant) AND ("outer".wid = "inner".wid) AND ("inner"."sequence" = ("outer"."sequence" + 1))) -> Nested Loop (cost=0.00..24.02 rows=1 width=32) (actual time=2.040..11.226 rows=1 loops=1) Join Filter: ("outer"."sequence" = ("inner"."sequence" + 1)) -> Nested Loop (cost=0.00..18.00 rows=1 width=24) (actual time=1.979..11.147 rows=1 loops=1) Join Filter: (("inner".variant = "outer".variant) AND ("inner".wid = "outer".wid)) -> Nested Loop (cost=0.00..12.00 rows=1 width=16) (actual time=0.258..8.765 rows=1 loops=1) -> Index Scan using pinyin_index on words w3 (cost=0.00..5.99 rows=1 width=8) (actual time=0.084..2.399 rows=304 loops=1) Index Cond: (((pinyin)::text >= 'guan'::character varying) AND ((pinyin)::text < 'guao'::character varying)) Filter: ((pinyin)::text ~~ 'guan_'::text) -> Index Scan using words2_pkey on words w1 (cost=0.00..6.00 rows=1 width=8) (actual time=0.018..0.018 rows=0 loops=304) Index Cond: (("outer".wid = w1.wid) AND ("outer".variant = w1.variant)) Filter: ((pinyin)::text ~~ 'fu_'::text) -> Index Scan using pinyin_index on words w7 (cost=0.00..5.99 rows=1 width=8) (actual time=0.025..1.863 rows=338 loops=1) Index Cond: (((pinyin)::text >= 'tong'::character varying) AND ((pinyin)::text < 'tonh'::character varying)) Filter: ((pinyin)::text ~~ 'tong_'::text) -> Index Scan using words2_pkey on words w6 (cost=0.00..6.00 rows=1 width=8) (actual time=0.037..0.052 rows=1 loops=1) Index Cond: ((w6.wid = "outer".wid) AND (w6.variant = "outer".variant)) Filter: ((pinyin)::text ~~ 'xi_'::text) -> Index Scan using pinyin_index on words w4 (cost=0.00..5.99 rows=1 width=8) (actual time=0.028..0.874 rows=165 loops=1) Index Cond: (((pinyin)::text >= 'kai'::character varying) AND ((pinyin)::text < 'kaj'::character varying)) Filter: ((pinyin)::text ~~ 'kai_'::text) -> Index Scan using words2_pkey on words w2 (cost=0.00..6.00 rows=1 width=8) (actual time=0.023..0.047 rows=1 loops=1) Index Cond: (("outer".wid = w2.wid) AND ("outer".variant = w2.variant)) Filter: ((pinyin)::text ~~ 'ji_'::text) -> Index Scan using pinyin_index on words w5 (cost=0.00..5.99 rows=1 width=8) (actual time=0.025..1.436 rows=259 loops=1) Index Cond: (((pinyin)::text >= 'fang'::character varying) AND ((pinyin)::text < 'fanh'::character varying)) Filter: ((pinyin)::text ~~ 'fang_'::text) -> Index Scan using words2_pkey on words w0 (cost=0.00..6.01 rows=1 width=8) (actual time=0.030..0.058 rows=1 loops=1) Index Cond: (("outer".wid = w0.wid) AND (w0.wid > 0) AND ("outer".variant = w0.variant)) Filter: (((pinyin)::text = 'zheng4'::text) AND (def_exists = true) AND ("sequence" = 0)) -> Index Scan using words2_pkey on words w8 (cost=0.00..6.00 rows=1 width=27) (actual time=0.019..0.103 rows=12 loops=1) Index Cond: ((w8.wid = "outer".wid) AND (w8.variant = "outer".variant)) Total runtime: 15.987 ms (44 rows) Time: 838.446 ms As you can see, there still appears to be some issue with planning taking ~820ms, while the actual query should only take 16ms (and as you can see i used the "folding" technique which takes some load off of the planner. Here is the old query/analysis as run right now: SELECT w8.wid, w8.variant, w8.num_variants, sum_text(w8.unicode) as unicodes, sum_text(w8.pinyin) as pinyins FROM words as w0, words as w1, words as w2, words as w3, words as w4, words as w5, words as w6, words as w7, words as w8 WHERE w0.wid > 0 AND w0.pinyin = 'zheng4' AND w0.def_exists = 't' AND w0.sequence = 0 AND w1.wid = w0.wid AND w1.pinyin LIKE 'fu_' AND w1.variant = w0.variant AND w1.sequence = (w0.sequence + 1) AND w2.wid = w1.wid AND w2.pinyin LIKE 'ji_' AND w2.variant = w1.variant AND w2.sequence = (w1.sequence + 1) AND w3.wid = w2.wid AND w3.pinyin LIKE 'guan_' AND w3.variant = w2.variant AND w3.sequence = (w2.sequence + 1) AND w4.wid = w3.wid AND w4.pinyin LIKE 'kai_' AND w4.variant = w3.variant AND w4.sequence = (w3.sequence + 1) AND w5.wid = w4.wid AND w5.pinyin LIKE 'fang_' AND w5.variant = w4.variant AND w5.sequence = (w4.sequence + 1) AND w6.wid = w5.wid AND w6.pinyin LIKE 'xi_' AND w6.variant = w5.variant AND w6.sequence = (w5.sequence + 1) AND w7.wid = w6.wid AND w7.pinyin LIKE 'tong_' AND w7.variant = w6.variant AND w7.sequence = (w6.sequence + 1) AND w8.wid = w7.wid AND w8.variant = w7.variant GROUP BY w8.wid, w8.variant, w8.num_variants, w8.page_order , w0.sequence , w1.sequence , w2.sequence , w3.sequence , w4.sequence , w5.sequence , w6.sequence , w7.sequence ORDER BY w8.page_order; wepy=> EXPLAIN ANALYZE wepy=> explain ANALYZE QUERY PLAN wepy-> wepy-> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- wepy-> wepy-> Sort (cost=54.15..54.16 rows=1 width=43) (actual time=1043.148..1043.149 rows=1 loops=1) wepy-> wepy-> Sort Key: w8.page_order wepy-> wepy-> -> HashAggregate (cost=54.14..54.14 rows=1 width=43) (actual time=1043.121..1043.122 rows=1 loops=1) wepy-> wepy-> -> Nested Loop (cost=0.00..54.10 rows=1 width=43) (actual time=9.627..1042.565 rows=12 loops=1) wepy-> wepy-> Join Filter: (("inner"."sequence" = ("outer"."sequence" + 1)) AND ("outer"."sequence" = ("inner"."sequence" + 1))) wepy-> wepy-> -> Nested Loop (cost=0.00..48.08 rows=1 width=83) (actual time=9.557..1041.784 rows=12 loops=1) wepy-> wepy-> Join Filter: (("outer".wid = "inner".wid) AND ("outer".variant = "inner".variant) AND ("outer"."sequence" = ("inner"."sequence" + 1)) AND ("inner"."sequence" = ("outer"."sequence" + 1))) wepy-> wepy-> -> Nested Loop (cost=0.00..11.99 rows=1 width=16) (actual time=3.152..290.176 rows=6 loops=1) wepy-> wepy-> Join Filter: (("inner".wid = "outer".wid) AND ("inner".variant = "outer".variant) AND ("inner"."sequence" = ("outer"."sequence" + 1))) wepy-> wepy-> -> Index Scan using pinyin_index on words w4 (cost=0.00..5.99 rows=1 width=8) (actual time=0.084..1.233 rows=165 loops=1) wepy-> wepy-> Index Cond: (((pinyin)::text >= 'kai'::character varying) AND ((pinyin)::text < 'kaj'::character varying)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'kai_'::text) wepy-> wepy-> -> Index Scan using pinyin_index on words w5 (cost=0.00..5.99 rows=1 width=8) (actual time=0.018..1.411 rows=259 loops=165) wepy-> wepy-> Index Cond: (((pinyin)::text >= 'fang'::character varying) AND ((pinyin)::text < 'fanh'::character varying)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'fang_'::text) wepy-> wepy-> -> Nested Loop (cost=0.00..36.06 rows=1 width=67) (actual time=4.500..125.184 rows=12 loops=6) wepy-> wepy-> -> Nested Loop (cost=0.00..30.05 rows=1 width=40) (actual time=4.446..125.003 rows=1 loops=6) wepy-> wepy-> Join Filter: ("inner"."sequence" = ("outer"."sequence" + 1)) wepy-> wepy-> -> Nested Loop (cost=0.00..24.03 rows=1 width=32) (actual time=4.391..124.920 rows=1 loops=6) wepy-> wepy-> -> Nested Loop (cost=0.00..18.01 rows=1 width=24) (actual time=4.339..124.781 rows=3 loops=6) wepy-> wepy-> Join Filter: (("inner".variant = "outer".variant) AND ("inner".wid = "outer".wid) AND ("inner"."sequence" = ("outer"."sequence" + 1))) wepy-> wepy-> -> Nested Loop (cost=0.00..12.00 rows=1 width=16) (actual time=0.154..10.898 rows=18 loops=6) wepy-> wepy-> -> Index Scan using pinyin_index on words w3 (cost=0.00..5.99 rows=1 width=8) (actual time=0.027..2.358 rows=304 loops=6) wepy-> wepy-> Index Cond: (((pinyin)::text >= 'guan'::character varying) AND ((pinyin)::text < 'guao'::character varying)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'guan_'::text) wepy-> wepy-> -> Index Scan using words2_pkey on words w6 (cost=0.00..6.00 rows=1 width=8) (actual time=0.025..0.025 rows=0 loops=1824) wepy-> wepy-> Index Cond: ((w6.wid = "outer".wid) AND (w6.variant = "outer".variant)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'xi_'::text) wepy-> wepy-> -> Index Scan using pinyin_index on words w7 (cost=0.00..5.99 rows=1 width=8) (actual time=0.017..5.788 rows=338 loops=108) wepy-> wepy-> Index Cond: (((pinyin)::text >= 'tong'::character varying) AND ((pinyin)::text < 'tonh'::character varying)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'tong_'::text) wepy-> wepy-> -> Index Scan using words2_pkey on words w0 (cost=0.00..6.01 rows=1 width=8) (actual time=0.026..0.035 rows=0 loops=18) wepy-> wepy-> Index Cond: (("outer".wid = w0.wid) AND (w0.wid > 0) AND ("outer".variant = w0.variant)) wepy-> wepy-> Filter: (((pinyin)::text = 'zheng4'::text) AND (def_exists = true) AND ("sequence" = 0)) wepy-> wepy-> -> Index Scan using words2_pkey on words w1 (cost=0.00..6.00 rows=1 width=8) (actual time=0.021..0.047 rows=1 loops=6) wepy-> wepy-> Index Cond: (("outer".wid = w1.wid) AND ("outer".variant = w1.variant)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'fu_'::text) wepy-> wepy-> -> Index Scan using words2_pkey on words w8 (cost=0.00..6.00 rows=1 width=27) (actual time=0.021..0.085 rows=12 loops=6) wepy-> wepy-> Index Cond: ((w8.wid = "outer".wid) AND (w8.variant = "outer".variant)) wepy-> wepy-> -> Index Scan using words2_pkey on words w2 (cost=0.00..6.00 rows=1 width=8) (actual time=0.021..0.046 rows=1 loops=12) wepy-> wepy-> Index Cond: ((w2.wid = "outer".wid) AND (w2.variant = "outer".variant)) wepy-> wepy-> Filter: ((pinyin)::text ~~ 'ji_'::text) wepy-> wepy-> Total runtime: 1044.239 ms wepy-> wepy-> (43 rows) wepy-> wepy-> wepy-> wepy-> Time: 3939.938 ms Even the total runtime has improved dramatically with the use of explicit joins and folding. However, I still need to cut the query time down much more... My last resort is to dig through the source code and see if I can understand how the planner works, but I suspect it's thousands of lines of code :\ BTW, I have since upgraded to 7.4.2 (from 7.4.1), and you can verify these planner times even with an empty table. The table spec is below: Table "public.words" Column | Type | Modifiers --------------+----------------------+----------- wid | integer | not null sequence | smallint | not null variant | smallint | not null char_count | smallint | not null unicode | character varying(5) | not null pinyin | character varying(8) | not null simpvar | character varying(5) | zvar | character varying(5) | compatvar | character varying(5) | def_exists | boolean | not null num_variants | smallint | page_order | integer | pinyins | character varying | unicodes | character varying | Indexes: "words2_pkey" primary key, btree (wid, variant, "sequence") "page_index" btree (page_order) "pinyin_index" btree (pinyin) "unicode_index" btree (unicode) _________________________________________________________________ Get rid of annoying pop-up ads with the new MSN Toolbar � FREE! http://clk.atdmt.com/AVE/go/onm00200414ave/direct/01/
pgsql-performance by date: