Re: [HACKERS] CTE inlining - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] CTE inlining
Date
Msg-id 21113.1494364459@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] CTE inlining  (Ilya Shkuratov <motr.ilya@ya.ru>)
Responses Re: [HACKERS] CTE inlining  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Ilya Shkuratov <motr.ilya@ya.ru> writes:
> Ok, it seems that most people in discussion are agree that removing optimization
> fence is a right thing to do.
> Nonetheless I still hoping to discuss the algorithm and its implementation.

Yeah, so far we've mainly discussed whether to do that and how to control
it, not what the actual results would be.

> I suppose, in case of a single reference we can validate CTE subquery and inline it
> just before SS_process_ctes() in subquery_planner() and then process remaining
> CTEs as before.

Yeah, something like that ought to be possible.  I'd prefer to think of it
as related to the work prepjointree.c does, ie pull_up_sublinks, which
raises the question why we do SS_process_ctes before that not after it.

You might want to start by postponing planning of CTE subqueries as late
as possible, without any other changes.  In particular, I think it'd be
a good thing to postpone Path-to-Plan conversion of the subqueries
until much later, ideally not till the final toplevel create_plan() call.
This is tied up with early Plan creation for regular subqueries too, so
that might have to be an all-or-nothing conversion.  But we aren't really
going to have much flexibility of planning for subqueries until we do
that.

> The case of multiple reference is more interesting.
> Ideally, we would decide whether to inline just before pull_up_sublinks(), so all
> the optimizations can be applied to inlined subquery. But It is impossible as we
> have no information to build subquery paths and estimate they costs at this point.

TBH, I would just ignore that completely, at least until the simpler
case is done and committed.  Trying to bite that off as part of the
initial patch is likely to lead to never getting anything done at all.
And I'm not exactly convinced either that it will win often enough to
be worth the trouble, or that users would thank you for rewriting their
queries that way.

When and if we get to this, you could imagine tackling it a bit like
preprocess_minmax_aggregates, or like what I did with OR clauses in
<22002.1487099934@sss.pgh.pa.us>, ie just repeat a lot of the work
over again to get several complete Path trees, and then pick one.
This is kind of expensive and ugly but it's hard to see how to do it
sensibly in a bottom-up fashion.
        regards, tom lane



pgsql-hackers by date:

Previous
From: "Daniel Verite"
Date:
Subject: Re: [HACKERS] export import bytea from psql
Next
From: Alvaro Herrera
Date:
Subject: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking inserializable transactions