Thread: incomplete removal of not referenced CTEs

incomplete removal of not referenced CTEs

From
Tomas Vondra
Date:
Hi,

While investigating a CTE-related query, I've noticed that we don't
really remove all unreachable CTEs. For example, for this query
   with a as (select 1),        b as (select * from a),        c as (select * from b)   select 2;

where none of the CTEs if (directly or indirectly) referenced from the
query, we get a plan like this:
                      QUERY PLAN
---------------------------------------------------------Result  (cost=0.03..0.04 rows=1 width=4)  CTE a    ->  Result
(cost=0.00..0.01rows=1 width=4)  CTE b    ->  CTE Scan on a  (cost=0.00..0.02 rows=1 width=4)
 
(5 rows)

So we only remove the top-level CTE, but we fail to remove the other
CTEs because we don't tweak the refcount in SS_process_ctes().

Of course, it's harmless as none of those CTEs gets actually executed,
but is this intentional, or do we want/need to fix it? I don't see
anything about this in the docs, but it seems a bit awkward and
confusing to remove only some of the CTEs - I think we should either
remove all or none of them.

I don't think that should be particularly difficult - ISTM we need to
make SS_process_ctes a bit smarter, essentially by adding a loop to
remove the CTEs recursively (and decrease the refcount).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: incomplete removal of not referenced CTEs

From
Andres Freund
Date:
On 2016-09-01 21:36:13 +0200, Tomas Vondra wrote:
> Of course, it's harmless as none of those CTEs gets actually executed,
> but is this intentional, or do we want/need to fix it? I don't see
> anything about this in the docs, but it seems a bit awkward and
> confusing to remove only some of the CTEs - I think we should either
> remove all or none of them.
> 
> I don't think that should be particularly difficult - ISTM we need to
> make SS_process_ctes a bit smarter, essentially by adding a loop to
> remove the CTEs recursively (and decrease the refcount).

I don't really see a lot of benefit in expanding energy on
this. Skipping the CTE in the easy case saves som eplan cycles. Making more
effort to remove CTEs recursively probably doesn't...

Greetings,

Andres Freund



Re: incomplete removal of not referenced CTEs

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> While investigating a CTE-related query, I've noticed that we don't
> really remove all unreachable CTEs.

We expend a grand total of three lines of code on making that happen.
I'm pretty much -1 on adding a great deal more code or complexity
to make it happen recursively; the case simply doesn't arise in
reasonably well written queries.
        regards, tom lane



Re: incomplete removal of not referenced CTEs

From
Andres Freund
Date:
On 2016-09-01 15:46:45 -0400, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> > While investigating a CTE-related query, I've noticed that we don't
> > really remove all unreachable CTEs.
> 
> We expend a grand total of three lines of code on making that happen.
> I'm pretty much -1 on adding a great deal more code or complexity
> to make it happen recursively;

Agreed. And the consequences are pretty much harmless.


> the case simply doesn't arise in reasonably well written queries.

Well, it might, when the CTE reference can be removed due to some other
part of the query (e.g. plan time evaluation of immutable function).


Andres



Re: incomplete removal of not referenced CTEs

From
Tomas Vondra
Date:

On 09/01/2016 09:58 PM, Andres Freund wrote:
> On 2016-09-01 15:46:45 -0400, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> While investigating a CTE-related query, I've noticed that we don't
>>> really remove all unreachable CTEs.
>>
>> We expend a grand total of three lines of code on making that happen.
>> I'm pretty much -1 on adding a great deal more code or complexity
>> to make it happen recursively;
> 
> Agreed. And the consequences are pretty much harmless.
> 

I'm not sure I agree with that. The trouble is that we have customers,
and they don't always write "reasonably well written queries",
particularly when those queries are a bit more complex. And I have to
debug issues with those queries from time to time.

I'd actually be much happier if we haven't removed any CTEs at all,
instead of removing just some of them.

This incomplete CTE removal actually confused the hell out of me today,
which is why I started this thread. I only realized some of the CTEs are
useless after the EXPLAIN ANALYZE completed (as this was a performance
issue, it took a very long time).

> 
>> the case simply doesn't arise in reasonably well written queries.
> 
> Well, it might, when the CTE reference can be removed due to some other
> part of the query (e.g. plan time evaluation of immutable function).
> 

We also never remove CTEs with INSERT/UPDATE/DELETE commands, which
makes it a bit more complicated (but those are easy to spot).

FWIW, I'm not planning to hack on this, but I was thinking that this
might be a nice topic for a first patch for new PostgreSQL hackers
(something like EasyHacks from LibreOffice).

regards


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services