Re: Query plan for very large number of joins - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Query plan for very large number of joins |
Date | |
Msg-id | 6292.1117819743@sss.pgh.pa.us Whole thread Raw |
In response to | Query plan for very large number of joins (<philb@vodafone.ie>) |
List | pgsql-performance |
<philb@vodafone.ie> writes: > I've attached the schema and query text, hopefully it will be of some use > to you. Note that both are taken from the HyperUBL project > (https://hyperubl.dev.java.net/). Sadly, at this stage I think it's > time for me to try alternatives to either Hibernate or Postgresql. Thanks. Profiling on 7.4 I get this for an EXPLAIN (after vacuum analyzing the database): % cumulative self self total time seconds seconds calls Ks/call Ks/call name 61.66 618.81 618.81 2244505819 0.00 0.00 compare_path_costs 15.01 769.44 150.63 1204882 0.00 0.00 add_path 8.08 850.57 81.13 772077 0.00 0.00 nth 3.76 888.27 37.70 1113598 0.00 0.00 nconc 2.59 914.30 26.03 233051 0.00 0.00 find_joininfo_node 2.23 936.70 22.40 30659124 0.00 0.00 bms_equal 1.14 948.14 11.44 39823463 0.00 0.00 equal 0.77 955.84 7.70 83300 0.00 0.00 find_base_rel This is with no special planner settings. Obviously the problem is that it's considering way too many different paths. We did do something about that in 8.0 (basically, throw away paths with "nearly the same" cost) ... but the bottom line didn't improve a whole lot. CVS tip profile for the same case is % cumulative self self total time seconds seconds calls s/call s/call name 38.37 176.41 176.41 53344348 0.00 0.00 list_nth_cell 35.26 338.52 162.11 196481 0.00 0.00 get_rte_attribute_is_dropped 5.42 363.44 24.92 233051 0.00 0.00 find_joininfo_node 4.72 385.14 21.70 30659416 0.00 0.00 bms_equal 4.09 403.95 18.81 53344348 0.00 0.00 list_nth 2.31 414.58 10.63 37347920 0.00 0.00 equal 1.40 421.03 6.45 83299 0.00 0.00 find_base_rel 1.08 426.01 4.98 617917 0.00 0.00 SearchCatCache 0.90 430.13 4.12 5771640 0.00 0.00 AllocSetAlloc The get_rte_attribute_is_dropped calls (and list_nth/list_nth_cell, which are mostly being called from there) arise from a rather hastily added patch that prevents failure when a JOIN list in a stored view refers to a since-dropped column of an underlying relation. I had not realized that that check could have O(N^2) behavior in deeply nested joins, but it does. Obviously we'll have to rethink that. After that it looks like the next hotspot is find_joininfo_node (and bms_equal which is mostly getting called from there). We could maybe fix that by rethinking the joininfo data structure --- right now it's a collection of simple Lists, which betrays the planner's Lispy heritage ;-). Again, that's not something I've ever seen at the top of a profile before --- there may be some O(N^2) behavior involved here too, but I've not analyzed it in detail. It does look like 8.0 would be about a factor of 2 faster for you than 7.4, but the real fix will probably have to wait for 8.1. Also: the 8.0 problem is definitely an O(N^2) type of deal, which means if you could reduce the depth of nesting by a factor of 2 the cost would go down 4x. You said this was an automatically generated query, so there may not be much you can do about it, but if you could parenthesize the FROM list a bit more intelligently the problem would virtually go away. What you have is effectively FROM ((((a left join b) left join c) left join d) .... so the nesting goes all the way down. With something like FROM ((a left join b) left join c ...) left join ((d left join e) left join f ...) the max nesting depth would be halved. I don't understand your schema at all so I'm not sure what an appropriate nesting might look like, but maybe there is a short-term workaround to be found there. (This will *not* help on 7.4, as the bottleneck there is completely different.) regards, tom lane
pgsql-performance by date: