Thread: Selection not "pushed down into" CTE

Selection not "pushed down into" CTE

From
Clemens Eisserer
Date:
Hi,

running postgresql 15.5 I was recently surpised postgresql didn't
perform an optimization which I thought would be easy to apply.
so in this case I don't have an actual performance problem but I am
rather curious if this is limitation in postgresql or whether there is
a semantic difference in the two queries below.

running the following query results in a full sort (caused by lead
over order by) as the ts > '2024-01-04' selection doesn't seem to be
applied to the CTE but only later:

with cte as (select ts, lead(ts, 1) over (order by ts) as ts2 from smartmeter)
select ts, ts2 from cte where ts > '2024-01-04' and  extract(epoch
from ts2) - extract(epoch from ts) > 9;

--------
 Subquery Scan on cte  (cost=1116514.38..1419735.26 rows=253 width=16)
(actual time=117487.536..117999.668 rows=10 loops=1)
   Filter: ((cte.ts > '2024-01-04 00:00:00+00'::timestamp with time
zone) AND ((EXTRACT(epoch FROM cte.ts2) - EXTRACT(epoch FROM cte.ts))
> '9'::numeric))
   Rows Removed by Filter: 7580259
   ->  WindowAgg  (cost=1116514.38..1249173.52 rows=7580522 width=16)
(actual time=67016.787..114141.495 rows=7580269 loops=1)
         ->  Sort  (cost=1116514.38..1135465.69 rows=7580522 width=8)
(actual time=67016.685..81802.822 rows=7580269 loops=1)
               Sort Key: smartmeter.ts
               Sort Method: external merge  Disk: 89024kB
               ->  Seq Scan on smartmeter  (cost=0.00..146651.22
rows=7580522 width=8) (actual time=7.251..56715.002 rows=7580269
loops=1)
 Planning Time: 0.502 ms
 Execution Time: 118100.528 ms


whereas if ts > '2024-01-04' is already filtered in the CTE the query
performs a lot better:

with cte as (select ts, lead(ts, 1) over (order by ts) as ts2 from
smartmeter where ts > '2024-01-04')
select ts, ts2 from cte where extract(epoch from ts2) - extract(epoch
from ts) > 9;


-----------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on cte  (cost=74905.42..74933.84 rows=253 width=16)
(actual time=334.654..804.286 rows=10 loops=1)
   Filter: ((EXTRACT(epoch FROM cte.ts2) - EXTRACT(epoch FROM cte.ts))
> '9'::numeric)
   Rows Removed by Filter: 57021
   ->  WindowAgg  (cost=74905.42..74918.68 rows=758 width=16) (actual
time=263.950..550.566 rows=57031 loops=1)
         ->  Sort  (cost=74905.42..74907.31 rows=758 width=8) (actual
time=263.893..295.188 rows=57031 loops=1)
               Sort Key: smartmeter.ts
               Sort Method: quicksort  Memory: 1537kB
               ->  Bitmap Heap Scan on smartmeter
(cost=16.37..74869.16 rows=758 width=8) (actual time=170.485..243.591
rows=57031 loops=1)
                     Recheck Cond: (ts > '2024-01-04
00:00:00+00'::timestamp with time zone)
                     Rows Removed by Index Recheck: 141090
                     Heap Blocks: lossy=1854
                     ->  Bitmap Index Scan on smartmeter_ts_idx
(cost=0.00..16.18 rows=76345 width=0) (actual time=1.142..1.144
rows=18540 loops=1)
                           Index Cond: (ts > '2024-01-04
00:00:00+00'::timestamp with time zone)
 Planning Time: 0.565 ms
 Execution Time: 804.474 ms
(15 rows)

Thanks a lot, Clemens


The DDL of the table in question is:

CREATE TABLE public.smartmeter (
   leistungsfaktor real,
   momentanleistung integer,
   spannungl1 real,
   spannungl2 real,
   spannungl3 real,
   stroml1 real,
   stroml2 real,
   stroml3 real,
   wirkenergien real,
   wirkenergiep real,
   ts timestamp with time zone NOT NULL
);
CREATE INDEX smartmeter_ts_idx ON public.smartmeter USING brin (ts);



Re: Selection not "pushed down into" CTE

From
Tom Lane
Date:
Clemens Eisserer <linuxhippy@gmail.com> writes:
> running postgresql 15.5 I was recently surpised postgresql didn't
> perform an optimization which I thought would be easy to apply.

It is not.

> running the following query results in a full sort (caused by lead
> over order by) as the ts > '2024-01-04' selection doesn't seem to be
> applied to the CTE but only later:

> with cte as (select ts, lead(ts, 1) over (order by ts) as ts2 from smartmeter)
> select ts, ts2 from cte where ts > '2024-01-04' and  extract(epoch
> from ts2) - extract(epoch from ts) > 9;

The ts restriction is not pushed down because of the rules in
allpaths.c:

 * 4. If the subquery has any window functions, we must not push down quals
 * that reference any output columns that are not listed in all the subquery's
 * window PARTITION BY clauses.  We can push down quals that use only
 * partitioning columns because they should succeed or fail identically for
 * every row of any one window partition, and totally excluding some
 * partitions will not change a window function's results for remaining
 * partitions.  (Again, this also requires nonvolatile quals, but
 * subquery_is_pushdown_safe handles that.)

To conclude that it'd be safe with this particular window function
requires deep knowledge of that function's semantics, which the
planner has not got.

            regards, tom lane



Re: Selection not "pushed down into" CTE

From
Clemens Eisserer
Date:
Hi Tom,

Thanks for the detailed explanation what is preventing the
optimization and the view behind the scenes.

Best regards, Clemens