Thread: CTE Inline On TPC-DS Query 95

CTE Inline On TPC-DS Query 95

From
Ba Jinsheng
Date:
Hi all,

Still for the query 95:

with ws_wh as
(select ws1.ws_order_number,ws1.ws_warehouse_sk wh1,ws2.ws_warehouse_sk wh2
 from web_sales ws1,web_sales ws2
 where ws1.ws_order_number = ws2.ws_order_number
   and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
 select  
   count(distinct ws_order_number) as "order count"
  ,sum(ws_ext_ship_cost) as "total shipping cost"
  ,sum(ws_net_profit) as "total net profit"
from
   web_sales ws1
  ,date_dim
  ,customer_address
  ,web_site
where
    d_date between '1999-5-01' and
           (cast('1999-5-01' as date) + interval '60 days')
and ws1.ws_ship_date_sk = d_date_sk
and ws1.ws_ship_addr_sk = ca_address_sk
and ca_state = 'TX'
and ws1.ws_web_site_sk = web_site_sk
and web_company_name = 'pri'
and ws1.ws_order_number in (select ws_order_number
                            from ws_wh)
and ws1.ws_order_number in (select wr_order_number
                            from web_returns,ws_wh
                            where wr_order_number = ws_wh.ws_order_number)
order by count(distinct ws_order_number)
limit 100;


If we specify ``NOT MATERIALIZED`` for the query, the execution time is reduced from 1min to 1s due to the usage of inline CTE. It seems expected as described in the PostgreSQL documentation. However, from the code: https://github.com/postgres/postgres/blob/REL_17_STABLE/src/backend/optimizer/plan/subselect.c#L939, I understand that this query does not contain volatile functions and includes simple predicates. I am wondering whether we can relax this condition checking, to enable inline CTE for such queries? Because it seems the performance benefit is huge on a standard benchmark.



Best regards,

Jinsheng Ba

 

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.

Re: CTE Inline On TPC-DS Query 95

From
David Rowley
Date:
On Tue, 26 Nov 2024 at 10:03, Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> If we specify ``NOT MATERIALIZED`` for the query, the execution time is reduced from 1min to 1s due to the usage of
inlineCTE. It seems expected as described in the PostgreSQL documentation. However, from the code:
https://github.com/postgres/postgres/blob/REL_17_STABLE/src/backend/optimizer/plan/subselect.c#L939,I understand that
thisquery does not contain volatile functions and includes simple predicates. I am wondering whether we can relax this
conditionchecking, to enable inline CTE for such queries? Because it seems the performance benefit is huge on a
standardbenchmark. 

The relevant part of the documentation [1] seems to be "By default,
this happens if the parent query references the WITH query just once,
but not if it references the WITH query more than once."

If you're proposing that we change the heuristics for when to inline
CTEs, then I suggest you go and check the archives for why we made
this decision. You'll need to prove to us why your newly proposed
heuristic is better than what we have. That's going to take more than
one example query to show that your heuristics are better.

However, just to save you from wasting any time, I highly doubt an
adjustment to the CTE inlining heuristics alone is enough.  Just
imagine the CTE evaluation performed a Seq Scan on a 1 billion row
table and found 1 matching row. Let's assume the remaining part of the
query is very cheap. It's obviously going to be better to materialise
the CTE so that we perform the Seq Scan once rather than twice (or
however many times the CTE is referenced). Adjusting the heuristics
here is just going to make some queries faster at the expense of
making other queries slower. That's never going to pass community
standards.

If you truly wanted to improve things here, the heuristics would need
to be swapped out with a cost-based decision. I can tell you now, that
doing that is a very difficult task as it basically requires
performing the join search and quite a large portion of planning once
for each combination of CTE inlined vs not-inlined. If you had a query
with a dozen or so CTEs, that's going to be a very costly thing to
plan.

David

[1] https://www.postgresql.org/docs/current/queries-with.html



RE: CTE Inline On TPC-DS Query 95

From
Date:
Hi,

The choice between inlining and materializing is not a question of syntax. It matters if the parent query references
theCTE expression more than once but this is not a decisive factor... 

Sometimes one is better and sometimes the other is best.
In Oracle, the choice is done by COST. This should be the same for PostgreSQL. In essence, it is the same thing for
views:inlining or materializing... 

Michel SALAIS


-----Message d'origine-----
De : David Rowley <dgrowleyml@gmail.com>
Envoyé : mardi 26 novembre 2024 05:03
À : Ba Jinsheng <bajinsheng@u.nus.edu>
Cc : Andrei Lepikhov <lepihov@gmail.com>; pgsql-performance@lists.postgresql.org
Objet : Re: CTE Inline On TPC-DS Query 95

On Tue, 26 Nov 2024 at 10:03, Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> If we specify ``NOT MATERIALIZED`` for the query, the execution time is reduced from 1min to 1s due to the usage of
inlineCTE. It seems expected as described in the PostgreSQL documentation. However, from the code:
https://github.com/postgres/postgres/blob/REL_17_STABLE/src/backend/optimizer/plan/subselect.c#L939,I understand that
thisquery does not contain volatile functions and includes simple predicates. I am wondering whether we can relax this
conditionchecking, to enable inline CTE for such queries? Because it seems the performance benefit is huge on a
standardbenchmark. 

The relevant part of the documentation [1] seems to be "By default, this happens if the parent query references the
WITHquery just once, but not if it references the WITH query more than once." 

If you're proposing that we change the heuristics for when to inline CTEs, then I suggest you go and check the archives
forwhy we made this decision. You'll need to prove to us why your newly proposed heuristic is better than what we have.
That'sgoing to take more than one example query to show that your heuristics are better. 

However, just to save you from wasting any time, I highly doubt an adjustment to the CTE inlining heuristics alone is
enough. Just imagine the CTE evaluation performed a Seq Scan on a 1 billion row table and found 1 matching row. Let's
assumethe remaining part of the query is very cheap. It's obviously going to be better to materialise the CTE so that
weperform the Seq Scan once rather than twice (or however many times the CTE is referenced). Adjusting the heuristics
hereis just going to make some queries faster at the expense of making other queries slower. That's never going to pass
communitystandards. 

If you truly wanted to improve things here, the heuristics would need to be swapped out with a cost-based decision. I
cantell you now, that doing that is a very difficult task as it basically requires performing the join search and quite
alarge portion of planning once for each combination of CTE inlined vs not-inlined. If you had a query with a dozen or
soCTEs, that's going to be a very costly thing to plan. 

David

[1] https://www.postgresql.org/docs/current/queries-with.html





Re: CTE Inline On TPC-DS Query 95

From
David Rowley
Date:
On Sat, 28 Dec 2024 at 00:34, <msalais@msym.fr> wrote:
> Sometimes one is better and sometimes the other is best.
> In Oracle, the choice is done by COST. This should be the same for PostgreSQL. In essence, it is the same thing for
views:inlining or materializing...
 

I don't doubt you. However, I did already mention the cost part in the
final paragraph of my email and I also hinted on why it might be more
difficult than you might imagine to implement. I don't want to put
anyone off making improvements in this area. I only aim to highlight
that it's not trivial to do so.

David