A performance issue with Memoize - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | A performance issue with Memoize |
Date | |
Msg-id | CAMbWs48XHJEK1Q1CzAQ7L9sTANTs9W1cepXu8=Kc0quUL+tg4Q@mail.gmail.com Whole thread Raw |
Responses |
Re: A performance issue with Memoize
Re: A performance issue with Memoize Re: A performance issue with Memoize Re: A performance issue with Memoize |
List | pgsql-hackers |
I noticed $subject with the query below.
set enable_memoize to off;
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.050..59578.053 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.027..2.703 rows=10000 loops=1)
-> Subquery Scan on s (actual time=0.004..4.819 rows=5000 loops=10000)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.002..3.834 rows=10000 loops=10000)
Planning Time: 0.666 ms
Execution Time: 60937.899 ms
(8 rows)
set enable_memoize to on;
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.061..122684.607 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.026..3.367 rows=10000 loops=1)
-> Memoize (actual time=0.011..9.821 rows=5000 loops=10000)
Cache Key: t1.two, t1.two
Cache Mode: binary
Hits: 0 Misses: 10000 Evictions: 9999 Overflows: 0 Memory Usage: 1368kB
-> Subquery Scan on s (actual time=0.008..5.188 rows=5000 loops=10000)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.004..4.081 rows=10000 loops=10000)
Planning Time: 0.607 ms
Execution Time: 124431.388 ms
(12 rows)
The execution time (best of 3) is 124431.388 VS 60937.899 with and
without memoize.
The Memoize runtime stats 'Hits: 0 Misses: 10000 Evictions: 9999'
seems suspicious to me, so I've looked into it a little bit, and found
that the MemoizeState's keyparamids and its outerPlan's chgParam are
always different, and that makes us have to purge the entire cache each
time we rescan the memoize node.
But why are they always different? Well, for the query above, we have
two NestLoopParam nodes, one (with paramno 1) is created when we replace
outer-relation Vars in the scan qual 't1.two = s.two', the other one
(with paramno 0) is added from the subquery's subplan_params, which was
created when we replaced uplevel vars with Param nodes for the subquery.
That is to say, the chgParam would be {0, 1}.
When it comes to replace outer-relation Vars in the memoize keys, the
two 't1.two' Vars are both replaced with the NestLoopParam with paramno
1, because it is the first NLP we see in root->curOuterParams that is
equal to the Vars in memoize keys. That is to say, the memoize node's
keyparamids is {1}.
I haven't thought thoroughly about the fix yet. But one way I'm
thinking is that in create_subqueryscan_plan() we can first add the
subquery's subplan_params to root->curOuterParams, and then replace
outer-relation Vars in scan_clauses afterwards. That can make us be
able to share the same PARAM_EXEC slot for the same Var that both
belongs to the subquery's uplevel vars and to the NestLoop's
outer-relation vars. To be concrete, something like attached.
With this change the same query runs much faster and the Memoize runtime
stats looks more normal.
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
--------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.063..21177.476 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.025..5.415 rows=10000 loops=1)
-> Memoize (actual time=0.001..0.234 rows=5000 loops=10000)
Cache Key: t1.two, t1.two
Cache Mode: binary
Hits: 9998 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 2735kB
-> Subquery Scan on s (actual time=0.009..5.169 rows=5000 loops=2)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.006..4.050 rows=10000 loops=2)
Planning Time: 0.593 ms
Execution Time: 22486.621 ms
(12 rows)
Any thoughts?
Thanks
Richard
set enable_memoize to off;
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.050..59578.053 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.027..2.703 rows=10000 loops=1)
-> Subquery Scan on s (actual time=0.004..4.819 rows=5000 loops=10000)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.002..3.834 rows=10000 loops=10000)
Planning Time: 0.666 ms
Execution Time: 60937.899 ms
(8 rows)
set enable_memoize to on;
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.061..122684.607 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.026..3.367 rows=10000 loops=1)
-> Memoize (actual time=0.011..9.821 rows=5000 loops=10000)
Cache Key: t1.two, t1.two
Cache Mode: binary
Hits: 0 Misses: 10000 Evictions: 9999 Overflows: 0 Memory Usage: 1368kB
-> Subquery Scan on s (actual time=0.008..5.188 rows=5000 loops=10000)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.004..4.081 rows=10000 loops=10000)
Planning Time: 0.607 ms
Execution Time: 124431.388 ms
(12 rows)
The execution time (best of 3) is 124431.388 VS 60937.899 with and
without memoize.
The Memoize runtime stats 'Hits: 0 Misses: 10000 Evictions: 9999'
seems suspicious to me, so I've looked into it a little bit, and found
that the MemoizeState's keyparamids and its outerPlan's chgParam are
always different, and that makes us have to purge the entire cache each
time we rescan the memoize node.
But why are they always different? Well, for the query above, we have
two NestLoopParam nodes, one (with paramno 1) is created when we replace
outer-relation Vars in the scan qual 't1.two = s.two', the other one
(with paramno 0) is added from the subquery's subplan_params, which was
created when we replaced uplevel vars with Param nodes for the subquery.
That is to say, the chgParam would be {0, 1}.
When it comes to replace outer-relation Vars in the memoize keys, the
two 't1.two' Vars are both replaced with the NestLoopParam with paramno
1, because it is the first NLP we see in root->curOuterParams that is
equal to the Vars in memoize keys. That is to say, the memoize node's
keyparamids is {1}.
I haven't thought thoroughly about the fix yet. But one way I'm
thinking is that in create_subqueryscan_plan() we can first add the
subquery's subplan_params to root->curOuterParams, and then replace
outer-relation Vars in scan_clauses afterwards. That can make us be
able to share the same PARAM_EXEC slot for the same Var that both
belongs to the subquery's uplevel vars and to the NestLoop's
outer-relation vars. To be concrete, something like attached.
With this change the same query runs much faster and the Memoize runtime
stats looks more normal.
explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
--------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=0.063..21177.476 rows=50000000 loops=1)
-> Seq Scan on tenk1 t1 (actual time=0.025..5.415 rows=10000 loops=1)
-> Memoize (actual time=0.001..0.234 rows=5000 loops=10000)
Cache Key: t1.two, t1.two
Cache Mode: binary
Hits: 9998 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 2735kB
-> Subquery Scan on s (actual time=0.009..5.169 rows=5000 loops=2)
Filter: (t1.two = s.two)
Rows Removed by Filter: 5000
-> Seq Scan on tenk1 t2 (actual time=0.006..4.050 rows=10000 loops=2)
Planning Time: 0.593 ms
Execution Time: 22486.621 ms
(12 rows)
Any thoughts?
Thanks
Richard
Attachment
pgsql-hackers by date: