On Sun, 9 Jul 2023 at 05:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> More generally, it's not clear to me why we should need to look inside
> lateral PHVs in the first place. Wouldn't the lateral PHV itself
> serve fine as a cache key?
For Memoize specifically, I purposefully made it so the expression was
used as a cache key rather than extracting the Vars from it and using
those. The reason for that was that the expression may result in
fewer distinct values to cache tuples for. For example:
create table t1 (a int primary key);
create table t2 (a int primary key);
create statistics on (a % 10) from t2;
insert into t2 select x from generate_Series(1,1000000)x;
insert into t1 select x from generate_Series(1,1000000)x;
analyze t1,t2;
explain (analyze, costs off) select * from t1 inner join t2 on t1.a=t2.a%10;
QUERY PLAN
--------------------------------------------------------------------------------------------
Nested Loop (actual time=0.015..212.798 rows=900000 loops=1)
-> Seq Scan on t2 (actual time=0.006..33.479 rows=1000000 loops=1)
-> Memoize (actual time=0.000..0.000 rows=1 loops=1000000)
Cache Key: (t2.a % 10)
Cache Mode: logical
Hits: 999990 Misses: 10 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Only Scan using t1_pkey on t1 (actual
time=0.001..0.001 rows=1 loops=10)
Index Cond: (a = (t2.a % 10))
Heap Fetches: 0
Planning Time: 0.928 ms
Execution Time: 229.621 ms
(11 rows)