Thread: second CTE kills perf
Hello, I'm trying to understand this behaviour and the limits of CTE, when they reach the limits, when they cannot receive parametersfrom a caller, ... I'm running a query on PostgreSQL 10 with a cte. the query runs in ~ 10 seconds. If I add asecond CTE with the same query as the previous one and select * from second_cte as query, it now runs in ~ 10 minutes. oversimplified example: 10 seconds version: | WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z FROM cte1 WHERE x=32; 10 minutes version: | WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number() over(),x,y,z FROM cte1) SELECT * FROM cte2 WHERE x=32; The real examples, with query plans: https://explain.dalibo.com/plan/98A https://explain.dalibo.com/plan/o6X4 Thanks for your time, Nicolas Seinlet.
Attachment
On Tue, 22 Jun 2021 at 13:50, Nicolas Seinlet <nicolas@seinlet.com> wrote:
Hello,
oversimplified example:
10 seconds version:
| WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z FROM cte1 WHERE x=32;
10 minutes version:
| WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number() over(),x,y,z FROM cte1) SELECT * FROM cte2 WHERE x=32;
A simplified setup of the above issue.
I was able to reproduce this
postgres=# -- create table t(id int primary key, name text);
postgres=# -- insert into t select x, x::text from generate_series(1, 1000000) x;
--pushdown
postgres=# explain (analyze,buffers) with cte as (select * from t) select *, row_number() over () from cte where id = 10; QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.42..8.46 rows=1 width=18) (actual time=0.014..0.016 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using t_pkey on t (cost=0.42..8.44 rows=1 width=10) (actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (id = 10)
Buffers: shared hit=4
Planning Time: 0.074 ms
Execution Time: 0.029 ms
(7 rows)
--no pushdown
postgres=# explain (analyze,buffers) with cte as (select * from t), cte2 as (select *, row_number() over () from cte) select * from cte2 where id = 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Subquery Scan on cte2 (cost=0.00..40405.00 rows=1 width=18) (actual time=0.017..224.461 rows=1 loops=1)
Filter: (cte2.id = 10)
Rows Removed by Filter: 999999
Buffers: shared hit=609 read=4796
-> WindowAgg (cost=0.00..27905.00 rows=1000000 width=18) (actual time=0.012..185.554 rows=1000000 loops=1)
Buffers: shared hit=609 read=4796
-> Seq Scan on t (cost=0.00..15405.00 rows=1000000 width=10) (actual time=0.007..45.168 rows=1000000 loops=1)
Buffers: shared hit=609 read=4796
Planning Time: 0.068 ms
Execution Time: 224.479 ms
(10 rows)
-- without aggregate, pushdown works even with multiple ctes
(analyze,buffers) with cte as (select * from t), cte2 as (select * from cte where id < 100) select * from cte2 where id = 10; QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.42..8.45 rows=1 width=10) (actual time=0.005..0.006 rows=1 loops=1)
Index Cond: ((id < 100) AND (id = 10))
Buffers: shared hit=4
Planning:
Buffers: shared hit=4
Planning Time: 0.074 ms
Execution Time: 0.015 ms
(7 rows)
--with window aggregate, even at the top cte, predicate is not applied
explain (analyze,buffers) with cte as (select *, row_number() over () from t), cte2 as (select * from cte where id < 100) select * from cte2 where id = 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Subquery Scan on cte (cost=0.00..42905.00 rows=1 width=18) (actual time=0.013..226.454 rows=1 loops=1)
Rows Removed by Filter: 999999
Buffers: shared hit=673 read=4732
-> WindowAgg (cost=0.00..27905.00 rows=1000000 width=18) (actual time=0.009..187.550 rows=1000000 loops=1)
Buffers: shared hit=673 read=4732
-> Seq Scan on t (cost=0.00..15405.00 rows=1000000 width=10) (actual time=0.005..44.613 rows=1000000 loops=1)
Buffers: shared hit=673 read=4732
Planning Time: 0.055 ms
Execution Time: 226.468 ms
--without cte predicate is applied before window aggregate ?
postgres=# explain (analyze,buffers) select *, row_number() over () from t where id = 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.42..8.46 rows=1 width=18) (actual time=0.018..0.020 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using t_pkey on t (cost=0.42..8.44 rows=1 width=10) (actual time=0.013..0.014 rows=1 loops=1)
Index Cond: (id = 10)
Buffers: shared hit=4
Planning Time: 0.053 ms
Execution Time: 0.037 ms
(7 rows)
Thank you for raising this, I hope i'll gain something here.
Thanks,
Vijay
Mumbai, India
Nicolas Seinlet <nicolas@seinlet.com> writes: > I'm trying to understand this behaviour and the limits of CTE, when they reach the limits, when they cannot receive parametersfrom a caller, ... I'm running a query on PostgreSQL 10 with a cte. the query runs in ~ 10 seconds. If I add asecond CTE with the same query as the previous one and select * from second_cte as query, it now runs in ~ 10 minutes. > oversimplified example: > 10 seconds version: > | WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z FROM cte1 WHERE x=32; > 10 minutes version: > | WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number() over(),x,y,z FROM cte1) SELECT * FROM cte2 WHEREx=32; [ shrug... ] You are asking for two different computations, and the second one is far more expensive. In the first case, the WHERE x=32 clause is applied before the window function, so we can (indeed must) filter out all rows not having x=32 before doing the window function. In the second case, WHERE x=32 is applied above/after the window function. We cannot push down the WHERE to before the window function. (In this case, filtering beforehand would obviously change the results of row_number, but in general we don't know enough about window function behavior to risk such changes.) So row_number has to be computed over the entire contents of the "table", and that's not cheap. It does surprise me a bit that row_number is quite *that* expensive, but if you are expecting equivalent results from these two queries, you're simply wrong. regards, tom lane
Hi, thanks for all. I replaced row_number() with some computed int which speeds up a lot the query. ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Tuesday, June 22nd, 2021 at 15:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Nicolas Seinlet nicolas@seinlet.com writes: > > > I'm trying to understand this behaviour and the limits of CTE, when they reach the limits, when they cannot receive parametersfrom a caller, ... I'm running a query on PostgreSQL 10 with a cte. the query runs in ~ 10 seconds. If I add asecond CTE with the same query as the previous one and select * from second_cte as query, it now runs in ~ 10 minutes. > > > oversimplified example: > > > > 10 seconds version: > > > > | WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z FROM cte1 WHERE x=32; > > > 10 minutes version: > > > > | WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number() over(),x,y,z FROM cte1) SELECT * FROM cte2 WHEREx=32; > > [ shrug... ] You are asking for two different computations, and the > > second one is far more expensive. > > In the first case, the WHERE x=32 clause is applied before the window > > function, so we can (indeed must) filter out all rows not having x=32 > > before doing the window function. > > In the second case, WHERE x=32 is applied above/after the window > > function. We cannot push down the WHERE to before the window function. > > (In this case, filtering beforehand would obviously change the results > > of row_number, but in general we don't know enough about window function > > behavior to risk such changes.) So row_number has to be computed over > > the entire contents of the "table", and that's not cheap. > > It does surprise me a bit that row_number is quite that expensive, > > but if you are expecting equivalent results from these two queries, > > you're simply wrong. > > regards, tom lane