Re: Planning time grows exponentially with levels of nested views - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Planning time grows exponentially with levels of nested views
Date
Msg-id 140747.1618776847@sss.pgh.pa.us
Whole thread Raw
Responses Re: Planning time grows exponentially with levels of nested views  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Planning time grows exponentially with levels of nested views  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers
[ redirecting to -hackers so the cfbot can see it ]

"Joel Jacobson" <joel@compiler.org> writes:
> I assumed the cost for each nested VIEW layer would grow linear,
> but my testing shows it appears to grow exponentially:

I think it's impossible to avoid less-than-O(N^2) growth on this sort
of case.  For example, the v2 subquery initially has RTEs for v2 itself
plus v1.  When we flatten v1 into v2, v2 acquires the RTEs from v1,
namely v1 itself plus foo.  Similarly, once vK-1 is pulled up into vK,
there are going to be order-of-K entries in vK's rtable, and that stacking
makes for O(N^2) work overall just in manipulating the rtable.

We can't get rid of these rtable entries altogether, since all of them
represent table privilege checks that the executor will need to do.
It occurs to me though that we don't need the rte->subquery trees anymore
once those are flattened, so maybe we could do something like the
attached.  For me, this reduces the slowdown in your example from
O(N^3) to O(N^2).

I'm slightly worried though by this comment earlier in
pull_up_simple_subquery:

    /*
     * Need a modifiable copy of the subquery to hack on.  Even if we didn't
     * sometimes choose not to pull up below, we must do this to avoid
     * problems if the same subquery is referenced from multiple jointree
     * items (which can't happen normally, but might after rule rewriting).
     */

If multiple references are actually possible then this'd break it.  There
seem to be no such cases in the regression tests though, and I'm having a
hard time wrapping my brain around what would cause it.  "git blame"
traces this text to my own commit f44639e1b, which has the log entry

    Don't crash if subquery appears multiple times in jointree.  This should
    not happen anyway, but let's try not to get completely confused if it does
    (due to rewriter bugs or whatever).

so I'm thinking that this was only hypothetical.

It's possible that we could do something similar in the sibling functions
pull_up_simple_union_all etc, but I'm not sure it's worth troubling over.
TBH, for the size of effect you're showing here, I wouldn't be worried
at all; it's only because it seems to be a one-liner to improve it that
I'm interested in doing something.

            regards, tom lane

diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 62a1668796..f93c037778 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1174,6 +1174,14 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
     Assert(root->placeholder_list == NIL);
     Assert(subroot->placeholder_list == NIL);

+    /*
+     * We no longer need the RTE's copy of the subquery's query tree.  Getting
+     * rid of it saves nothing in particular so far as this level of query is
+     * concerned; but if this query level is in turn pulled up into a parent,
+     * we'd waste cycles copying the now-unused query tree.
+     */
+    rte->subquery = NULL;
+
     /*
      * Miscellaneous housekeeping.
      *

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: More info on pg_stat_activity Wait Event Name when is DataFileRead
Next
From: Yura Sokolov
Date:
Subject: Re: Old Postgresql version on i7-1165g7