Re: Query plan for very large number of joins - Mailing list pgsql-performance

From Tom Lane
Subject Re: Query plan for very large number of joins
Date
Msg-id 6292.1117819743@sss.pgh.pa.us
Whole thread Raw
In response to Query plan for very large number of joins  (<philb@vodafone.ie>)
List pgsql-performance
<philb@vodafone.ie> writes:
> I've attached the schema and query text, hopefully it will be of some use
> to you. Note that both are taken from the HyperUBL project
> (https://hyperubl.dev.java.net/).  Sadly, at this stage I think it's
> time for me to try alternatives to either Hibernate or Postgresql.

Thanks.  Profiling on 7.4 I get this for an EXPLAIN (after vacuum
analyzing the database):

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ks/call  Ks/call  name
 61.66    618.81   618.81 2244505819     0.00     0.00  compare_path_costs
 15.01    769.44   150.63  1204882     0.00     0.00  add_path
  8.08    850.57    81.13   772077     0.00     0.00  nth
  3.76    888.27    37.70  1113598     0.00     0.00  nconc
  2.59    914.30    26.03   233051     0.00     0.00  find_joininfo_node
  2.23    936.70    22.40 30659124     0.00     0.00  bms_equal
  1.14    948.14    11.44 39823463     0.00     0.00  equal
  0.77    955.84     7.70    83300     0.00     0.00  find_base_rel

This is with no special planner settings.  Obviously the problem is that
it's considering way too many different paths.  We did do something
about that in 8.0 (basically, throw away paths with "nearly the same"
cost) ... but the bottom line didn't improve a whole lot.  CVS tip
profile for the same case is

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 38.37    176.41   176.41 53344348     0.00     0.00  list_nth_cell
 35.26    338.52   162.11   196481     0.00     0.00  get_rte_attribute_is_dropped
  5.42    363.44    24.92   233051     0.00     0.00  find_joininfo_node
  4.72    385.14    21.70 30659416     0.00     0.00  bms_equal
  4.09    403.95    18.81 53344348     0.00     0.00  list_nth
  2.31    414.58    10.63 37347920     0.00     0.00  equal
  1.40    421.03     6.45    83299     0.00     0.00  find_base_rel
  1.08    426.01     4.98   617917     0.00     0.00  SearchCatCache
  0.90    430.13     4.12  5771640     0.00     0.00  AllocSetAlloc

The get_rte_attribute_is_dropped calls (and list_nth/list_nth_cell,
which are mostly being called from there) arise from a rather hastily
added patch that prevents failure when a JOIN list in a stored view
refers to a since-dropped column of an underlying relation.  I had not
realized that that check could have O(N^2) behavior in deeply nested
joins, but it does.  Obviously we'll have to rethink that.

After that it looks like the next hotspot is find_joininfo_node
(and bms_equal which is mostly getting called from there).  We could
maybe fix that by rethinking the joininfo data structure --- right now
it's a collection of simple Lists, which betrays the planner's Lispy
heritage ;-).  Again, that's not something I've ever seen at the top
of a profile before --- there may be some O(N^2) behavior involved
here too, but I've not analyzed it in detail.

It does look like 8.0 would be about a factor of 2 faster for you
than 7.4, but the real fix will probably have to wait for 8.1.

Also: the 8.0 problem is definitely an O(N^2) type of deal, which means
if you could reduce the depth of nesting by a factor of 2 the cost would
go down 4x.  You said this was an automatically generated query, so
there may not be much you can do about it, but if you could parenthesize
the FROM list a bit more intelligently the problem would virtually go
away.  What you have is effectively

    FROM ((((a left join b) left join c) left join d) ....

so the nesting goes all the way down.  With something like

    FROM ((a left join b) left join c ...)
             left join
             ((d left join e) left join f ...)

the max nesting depth would be halved.  I don't understand your schema
at all so I'm not sure what an appropriate nesting might look like, but
maybe there is a short-term workaround to be found there.  (This will
*not* help on 7.4, as the bottleneck there is completely different.)

            regards, tom lane

pgsql-performance by date:

Previous
From: "J. Andrew Rogers"
Date:
Subject: Re: Filesystem
Next
From: Steve Poe
Date:
Subject: Postgresql and Software RAID/LVM