Re: CTE Inline On TPC-DS Query 95 - Mailing list pgsql-performance

From David Rowley
Subject Re: CTE Inline On TPC-DS Query 95
Date
Msg-id CAApHDvouVmVBA8KvHuQHjsBS4v3dg9a_hGDQecA2BPYZgNhseQ@mail.gmail.com
Whole thread Raw
In response to CTE Inline On TPC-DS Query 95  (Ba Jinsheng <bajinsheng@u.nus.edu>)
List pgsql-performance
On Tue, 26 Nov 2024 at 10:03, Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> If we specify ``NOT MATERIALIZED`` for the query, the execution time is reduced from 1min to 1s due to the usage of
inlineCTE. It seems expected as described in the PostgreSQL documentation. However, from the code:
https://github.com/postgres/postgres/blob/REL_17_STABLE/src/backend/optimizer/plan/subselect.c#L939,I understand that
thisquery does not contain volatile functions and includes simple predicates. I am wondering whether we can relax this
conditionchecking, to enable inline CTE for such queries? Because it seems the performance benefit is huge on a
standardbenchmark. 

The relevant part of the documentation [1] seems to be "By default,
this happens if the parent query references the WITH query just once,
but not if it references the WITH query more than once."

If you're proposing that we change the heuristics for when to inline
CTEs, then I suggest you go and check the archives for why we made
this decision. You'll need to prove to us why your newly proposed
heuristic is better than what we have. That's going to take more than
one example query to show that your heuristics are better.

However, just to save you from wasting any time, I highly doubt an
adjustment to the CTE inlining heuristics alone is enough.  Just
imagine the CTE evaluation performed a Seq Scan on a 1 billion row
table and found 1 matching row. Let's assume the remaining part of the
query is very cheap. It's obviously going to be better to materialise
the CTE so that we perform the Seq Scan once rather than twice (or
however many times the CTE is referenced). Adjusting the heuristics
here is just going to make some queries faster at the expense of
making other queries slower. That's never going to pass community
standards.

If you truly wanted to improve things here, the heuristics would need
to be swapped out with a cost-based decision. I can tell you now, that
doing that is a very difficult task as it basically requires
performing the join search and quite a large portion of planning once
for each combination of CTE inlined vs not-inlined. If you had a query
with a dozen or so CTEs, that's going to be a very costly thing to
plan.

David

[1] https://www.postgresql.org/docs/current/queries-with.html



pgsql-performance by date:

Previous
From: David Rowley
Date:
Subject: Re: Performance of TPC-DS Query 95
Next
From: David Mullineux
Date:
Subject: Re: Has gen_random_uuid() gotten much slower in v17?