Hi,
it is definitively possible to get nested loop joins on successively
aggregated CTEs. However, for the index to be used, it must exist. And
you can only create the index on a real table, not on the intermediate
CTEs.
> WITH series1h AS MATERIALIZED (SELECT generate_series AS ts FROM
> generate_series('1990-01-01 00:00', '1990-12-31 23:59', INTERVAL '1
> hour') ORDER BY ts),
> series15m AS MATERIALIZED (SELECT generate_series AS ts FROM
> generate_series('1990-01-01 00:00', '1990-12-31 23:59', INTERVAL '15
> minutes') ORDER BY ts)
> SELECT count(*) FROM (SELECT h1.ts, count(*) FROM series1h h1 JOIN
> series15m m15 ON (m15.ts > (h1.ts - INTERVAL '1 hour') AND m15.ts <=
> h1.ts ) GROUP BY h1.ts ORDER BY h1.ts);
Here is an example based on the query above:
CREATE TEMP TABLE series15m AS
(SELECT
generate_series('1990-01-01 00:00',
'1990-12-31 23:59',
INTERVAL '15 minutes')::timestamp as ts,
random() as your_data
ORDER BY 1);
CREATE INDEX short_period ON series15m (ts);
CREATE INDEX long_period ON series15m (date_trunc('hour',ts));
EXPLAIN (costs 'off')
WITH series1h AS (SELECT date_trunc('hour',ts) as ts FROM series15m)
SELECT
h1.ts,
sum(your_data)
FROM series1h h1
JOIN series15m m15 ON
(m15.ts >(h1.ts - INTERVAL '1 hour') AND m15.ts <= h1.ts )
GROUP BY h1.ts
ORDER BY h1.ts ;
QUERY PLAN
-------------------------------------------------------------------
GroupAggregate
Group Key: date_trunc('hour'::text, series15m.ts)
-> Nested Loop
-> Index Scan using long_period on series15m
-> Index Scan using short_period on series15m m15
Index Cond: (...)
So, the general idea is to create functional indexes (long_period in
this example) on the base table that will cover the aggregate keys of
intermediate CTEs. This approach works as long as the aggregate keys
depend only on one table.
Best regards,
Renan Fonseca