Thread: add_path() for Path without InitPlan: cost comparison vs. Paths that require one

add_path() for Path without InitPlan: cost comparison vs. Paths that require one

From

Dent John

Date:

25 July 2019, 11:17:54

Hi folks,

I’ve run into a planning conundrum with my query rewriting extension for MVs when attempting to rewrite a RECURSIVE
CTE.

RECURSIVE CTEs are expensive — and presumably tricky to optimise — and so a good use case for query rewrite against an
MV;all the more so if Yugo’s Incremental View Maintenance concept gets traction.

I want to add an alternative Path for the UPPERREL_FINAL of the CTE root, but my new MV scan path (which is actually a
thinCustomScan atop a scan of the MV) is rejected in favour of the existing paths.

This seems to be because my Path is more expensive than the Rel’s existing Paths when considered alone. (The CTE’s
finalscan is actually a scan Path over a worktable, so it really is much lighter.)

However, if I factor back in the cost of the InitPlan, things net out much more in favour of a scan against the MV. Of
course,the add_path() comparison logic doesn’t include the InitPlan cost, so the point is moot.

I’m wondering how I should approach this problem. First pass, I can’t see how to achieve an amicable solution with
existinginfrastructure.

I have a few possible solutions. Do any of the following make sense?

1. Override the add_path() logic to force my Path to win? This was initially my least favourite approach, but perhaps
it’sactually the most pragmatic. Advantage is I think I could do this entirely in my EXTENSION.

2. Make a new version of add_path() which is more aware of dependencies.

Seems #2 could have utility in PG generally. If I’m not wrong, my guess is that one of the reasons for the
>=2-references-for-materialising-a-CTE;1-for-inliningpolicy is that we don’t have the planner logic to trade off
materialisationversus inlining.

Also, I am wondering if my MV rewrite logic misses cases where the planner decides to materialise an intermediate
resultas an InitPlan for later processing.

3. I considered creating a new root PlannerInfo structure, and burying the existing one another level down, alongside
myMV scan, in a Gather-like arrangement. That coverts the costing conundrum to a choice between roots. Obviously that
willinclude the InitPlan costs. I figured I could eliminate one sub-root much as Path elimination works. But on
reflection,I’m not sure PG has enough flexibility in the Path concept to support this route forward.

I’d welcome any view, ideas or advice.

Re: add_path() for Path without InitPlan: cost comparison vs. Paths that require one

From

Tom Lane

Date:

25 July 2019, 16:25:16

Dent John <denty@qqdd.eu> writes:
> However, if I factor back in the cost of the InitPlan, things net out much more in favour of a scan against the MV.
Ofcourse, the add_path() comparison logic doesn’t include the InitPlan cost, so the point is moot.  

Please explain yourself.  InitPlans will, as a rule, get stuck into the
same place in the plan tree regardless of which paths are chosen; that's
why we need not consider them in path cost comparisons.  Moreover, once
the initplan's own subplan is determined, it's going to be the same
regardless of the shape of the outer query --- so if we did factor it
in, it'd contribute the same cost to every outer path, and thus it
still wouldn't change any decisions.  So I don't follow what you're
on about here.

            regards, tom lane

Re: add_path() for Path without InitPlan: cost comparison vs. Paths that require one

From

Dent John

Date:

25 July 2019, 19:52:01

Hi Tom,

> On 25 Jul 2019, at 14:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Please explain yourself.  InitPlans will, as a rule, get stuck into the
> same place in the plan tree regardless of which paths are chosen; that's
> why we need not consider them in path cost comparisons.

Ah that’s true. I didn’t realise that at the time I wrote.

But I think my problem is still real...

>  Moreover, once
> the initplan's own subplan is determined, it's going to be the same
> regardless of the shape of the outer query ---

Yes that’s true too.

> so if we did factor it
> in, it'd contribute the same cost to every outer path, and thus it
> still wouldn't change any decisions.

I think I’m exposed to the problem because I’m changing how certain queries are fulfilled.

And in the case of a RECURSIVE CTE, the plan ends up being an InitPlan that materializes the CTE, and then a scan of
thatmaterialized result. 

The problem is that I can fulfil the entire query with a scan against an MV table. Point is it’s an alternative that
achievesboth the InitPlan (because it’s unnecessary) and the final scan. 

But the cost comparison during add_path() is only taking into account the cost of the final scan, which is so cheap
thatit is preferable even to a simple scan of an MV.  

> So I don't follow what you're
> on about here.

Hmm. Having written the above, I realise I’m not clear on why my extension isn’t offered the opportunity to materialise
thework table for the InitPlan. 

Sorry. I should have thought about that question first. It might just be an error in my code. I’ll follow up with an
answer.

>
>            regards, tom lane