Thread: BUG #17721: A completely unused CTE negatively affect Query Plan

BUG #17721: A completely unused CTE negatively affect Query Plan

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17721
Logged by:          Nathaniel Hazelton
Email address:      nh.dev@sturdy.ai
PostgreSQL version: 15.1
Operating system:   Docker
Description:

I've boiled down an issue we have in production to a simple query that can
demonstrate it.  I've run this on 13,14 and 15 locally in a docker container
with the same results.  If a CTE that is completely unexecuted exists in a
subquery (or view in our production case) it affects the query plan VERY
negatively.  The first explain shows a sequential scan, where the second
explain shows an index scan, just by the existence of the (obviously)
unexecuted CTE.

In this example, the plans might not differ much in performance.  However,
in our production system, the parallel to this is VERY costly.  In fresh
local docker Postgres, the random page cost must be set to 1 to reflect this
issue.  With our production data, the page cost is the default 4, and
reflects the issue.



drop table if exists conv, conv_acc;

select setseed(0);

create temp table conv as (select id, 'meta' as meta from generate_series
(1, 1000) id);

create temp table conv_acc as (
    select conv.id, acc_id
    from conv
    left join lateral (
        select conv.id, floor(random()*1000) as acc_id
        from generate_series (1, 4) limit floor(random()*4+1)::int
    ) acc on true
);


create index conv_acc_id_index
        on conv_acc(id);

analyze conv; analyze conv_acc;

set random_page_cost to 1;

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
select * from
    (select id from conv limit 10) limitconv
    left join (
    with someccte as materialized (select 1/0)
        select id
        from    conv_acc ca
    ) vca
    on vca.id = limitconv.id;

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
select * from
    (select id from conv limit 10) limitconv
    left join (
    --with someccte as materialized (select 1/0)
        select id
        from    conv_acc ca
    ) vca
    on vca.id = limitconv.id;


Re: BUG #17721: A completely unused CTE negatively affect Query Plan

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> I've boiled down an issue we have in production to a simple query that can
> demonstrate it.  I've run this on 13,14 and 15 locally in a docker container
> with the same results.  If a CTE that is completely unexecuted exists in a
> subquery (or view in our production case) it affects the query plan VERY
> negatively.  The first explain shows a sequential scan, where the second
> explain shows an index scan, just by the existence of the (obviously)
> unexecuted CTE.

The presence of the CTE prevents flattening of the subquery.
If you don't like it, don't attach the CTE right there.

            regards, tom lane



Re: BUG #17721: A completely unused CTE negatively affect Query Plan

From
Nathaniel Hazelton
Date:
The CTE in that location was emulating a view in which we have a CTE - which responds the same way.  Does this mean
CTEscan never be used in views that could be flattened without affecting performance?  I'm able to find any mention of
this. The only mention of CTEs being optimization fences goes back when all CTEs were materialized.
 

Is there some way to know when it is safe or not to use a CTE in a view / subquery?  It would be great to have some
documentationon that.
 

On 12/14/22, 11:25 AM, "Tom Lane" <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote:


PG Bug reporting form <noreply@postgresql.org <mailto:noreply@postgresql.org>> writes:
> I've boiled down an issue we have in production to a simple query that can
> demonstrate it. I've run this on 13,14 and 15 locally in a docker container
> with the same results. If a CTE that is completely unexecuted exists in a
> subquery (or view in our production case) it affects the query plan VERY
> negatively. The first explain shows a sequential scan, where the second
> explain shows an index scan, just by the existence of the (obviously)
> unexecuted CTE.


The presence of the CTE prevents flattening of the subquery.
If you don't like it, don't attach the CTE right there.


regards, tom lane




Re: BUG #17721: A completely unused CTE negatively affect Query Plan

From
Tom Lane
Date:
Nathaniel Hazelton <nathaniel@sturdyai.com> writes:
> The CTE in that location was emulating a view in which we have a CTE -
> which responds the same way.  Does this mean CTEs can never be used in
> views that could be flattened without affecting performance?

You've got a view with an unused CTE?  Why?

But at any rate, yes, the presence of a CTE is an optimization
fence: we can't flatten the subquery containing it without losing
the query level at which the CTE should be evaluated.  Perhaps there
are special cases where that could be relaxed, but nobody's tried
very hard.

            regards, tom lane



Re: BUG #17721: A completely unused CTE negatively affect Query Plan

From
David Rowley
Date:
On Thu, 15 Dec 2022 at 11:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Nathaniel Hazelton <nathaniel@sturdyai.com> writes:
> > The CTE in that location was emulating a view in which we have a CTE -
> > which responds the same way.  Does this mean CTEs can never be used in
> > views that could be flattened without affecting performance?
>
> You've got a view with an unused CTE?  Why?

hmm, but the CTE isn't unused.  It's just that we don't push quals
down due to the "materialized". Perhaps the view only contains the
CTE? Strange way to write a view... Perhaps there was some reason to
materialise it that's not obvious to us.

> But at any rate, yes, the presence of a CTE is an optimization
> fence: we can't flatten the subquery containing it without losing
> the query level at which the CTE should be evaluated.  Perhaps there
> are special cases where that could be relaxed, but nobody's tried
> very hard.

Just to expand on that a little.  Prior to v12, CTEs could never have
quals pushed down into them.  That was relaxed in v12 and the original
behaviour can be obtained using "materialized". Without that, the
planner is more free to perform its typical optimisations such as
pulling up subqueries or pushing quals down into subqueries.

David



Re: BUG #17721: A completely unused CTE negatively affect Query Plan

From
Nathaniel Hazelton
Date:


On 12/14/22, 10:46 PM, "David Rowley" <dgrowleyml@gmail.com <mailto:dgrowleyml@gmail.com>> wrote:

On Thu, 15 Dec 2022 at 11:47, Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote:
>
> Nathaniel Hazelton <nathaniel@sturdyai.com <mailto:nathaniel@sturdyai.com>> writes:
> > The CTE in that location was emulating a view in which we have a CTE -
> > which responds the same way. Does this mean CTEs can never be used in
> > views that could be flattened without affecting performance?
>
> You've got a view with an unused CTE? Why?


hmm, but the CTE isn't unused. It's just that we don't push quals
down due to the "materialized". Perhaps the view only contains the
CTE? Strange way to write a view... Perhaps there was some reason to
materialise it that's not obvious to us.

NH: Oops.  I posted a slightly incorrect version of what I was trying to show.  Remove the 'materialized', and it acts
thesame way.  Given that it selects 1/0, it's clearly not executed, materialized or not.  BUT it does affect the query
plan,materialized OR NOT, which is the part I couldn't understand.  We do not have an unused CTE in our view.  I was
justattempting to boil what I saw as a bug down to its minimal example.
 


> But at any rate, yes, the presence of a CTE is an optimization
> fence: we can't flatten the subquery containing it without losing
> the query level at which the CTE should be evaluated. Perhaps there
> are special cases where that could be relaxed, but nobody's tried
> very hard.


Just to expand on that a little. Prior to v12, CTEs could never have
quals pushed down into them. That was relaxed in v12 and the original
behaviour can be obtained using "materialized". Without that, the
planner is more free to perform its typical optimisations such as
pulling up subqueries or pushing quals down into subqueries.

NH;  We do use a non-materialized CTE in the view, because we want to use a CTE for what it's for - organizing our
code. Given that it's not materialized, we thought there should be no ill effect.  Remove the materialized from the
originalpost, and you will see it DOES affect the query plan.
 

NH: I will post a more complete example with views today.

Nathaniel