Thread: BUG #17844: Memory consumption for memoize node
The following bug has been logged on the website:

Bug reference:      17844
Logged by:          Alexey Ermakov
Email address:      alexey.ermakov@dataegret.com
PostgreSQL version: 14.7
Operating system:   Ubuntu 20.04
Description:

Hello, recently I found an interesting query which consumes a lot of memory
(much more than expected with the current work_mem and hash_mem_multiplier
settings) and triggers the OOM killer. After simplifying, it looks like there
is a problem in the part where we join a table on a condition like:

    id = (((test_json.big_json_column)::json ->> 'id'::text))::integer

and there is a Memoize node in the plan. Without the Memoize node the query
works fine.

I was able to create a reproducible test case (on a machine with default
config and PostgreSQL 14.7):

------------------------------------------------------------------------------
create extension pgcrypto;

--create function to generate dummy data
CREATE OR REPLACE FUNCTION generate_random_string(size INT) RETURNS TEXT AS $$
DECLARE
  characters TEXT := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  bytes BYTEA := gen_random_bytes(size);
  l INT := length(characters);
  i INT := 0;
  output TEXT := '';
BEGIN
  WHILE i < size LOOP
    output := output || substr(characters, get_byte(bytes, i) % l + 1, 1);
    i := i + 1;
  END LOOP;
  RETURN output;
END;
$$ LANGUAGE plpgsql VOLATILE;

--create table with 200k rows and text column with content like
--`{"random_string":"....", "id":1}`
create table test_json as
select id,
       ('{"random_string":"' || generate_random_string(500) || '", "id":' || id || '}')::text as big_json_column
from generate_series(1,200000) gs(id);

--duplicate rows and update statistics (to trigger memoize node)
insert into test_json select * from test_json;
analyze test_json;

--another table with similar structure and 100k rows
create table test_json2 as
select id,
       ('{"random_string":"' || generate_random_string(500) || '", "id":' || id || '}')::json as big_json_column
from generate_series(1,100000) gs(id);

alter table test_json2 add primary key(id);

--table sizes: test_json - 223MB, test_json2 - 56MB

--change settings to trigger plan with memoize node
set work_mem = '128MB';
set enable_hashjoin = off;
set enable_mergejoin = off;
set jit = off;

explain (analyze, buffers)
select test_json.id, test_json2.id
from test_json
left join test_json2 on test_json2.id = ((test_json.big_json_column::json)->>'id')::int;

 Nested Loop Left Join  (cost=0.32..108047.22 rows=400000 width=8) (actual time=0.031..3155.261 rows=400000 loops=1)
   Buffers: shared hit=415739 read=12834
   I/O Timings: read=18.254
   ->  Seq Scan on test_json  (cost=0.00..32572.00 rows=400000 width=540) (actual time=0.005..57.248 rows=400000 loops=1)
         Buffers: shared hit=15738 read=12834
         I/O Timings: read=18.254
   ->  Memoize  (cost=0.32..0.34 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=400000)
         Cache Key: (((test_json.big_json_column)::json ->> 'id'::text))::integer
         Cache Mode: logical
         Hits: 200000  Misses: 200000  Evictions: 0  Overflows: 0  Memory Usage: 16797kB
         Buffers: shared hit=400001
         ->  Index Only Scan using test_json2_pkey on test_json2  (cost=0.30..0.33 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=200000)
               Index Cond: (id = (((test_json.big_json_column)::json ->> 'id'::text))::integer)
               Heap Fetches: 0
               Buffers: shared hit=400001
 Planning Time: 0.069 ms
 Execution Time: 3227.078 ms
------------------------------------------------------------------------------

During execution I looked at "avail Mem" in top's output on the test machine
to check how much memory the process consumed. It looked different each time,
usually hundreds of MB, sometimes around 1.5GB (which is even bigger than the
table size). I was able to trigger the OOM killer with this query and a
bigger test_json table with similar data.

I'm wondering:
1) Is it a known bug? Does it relate to json parsing somehow?
2) Is it possible to show such memory consumption in explain (analyze,
buffers) output for easier troubleshooting?

--
Thanks,
Alexey Ermakov
On Thu, 16 Mar 2023 at 01:12, PG Bug reporting form
<noreply@postgresql.org> wrote:
> During execution I looked on "avail Mem" in top output on test machine to
> check how much memory process consume. It looked different each time,
> usually hundreds of MB, sometime around 1.5GB (which is even bigger than
> table size).
> I was able to trigger OOM killer with this query and bigger test_json table
> with similar data.

Thank you for raising this and for your effort with the script to reproduce it.

> I'm wondering:
> 1) Is it a known bug ? Does it relate to json parsing somehow ?

It was unknown prior to this report. It's not related to json parsing.
It seems to be related to a bug in nodeMemoize.c where we're
evaluating the cache key expressions in the ExecutorState context. We
should really be in a more temporary context that gets reset early in
cache_lookup() before the call to prepare_probe_slot(). I'll need to
look in a bit more detail about what that context actually should be.

Another thing that came to mind is that we don't track the memory for
the cache key. So that could account for some additional memory usage
with Memoize. I have a patch locally to fix that. Likely that would be
a master-only fix, however. I doubt that's accounting for much of the
extra memory you're reporting anyway. In hindsight, we should be
tracking that, but I think at the time I was writing this code, I had
thoughts that it wasn't much memory compared to storing the cached
tuples. I now think differently.

It may be a few more days before any patch appears here.

David

> 2) Is it possible to show such memory consumption in explain (analyze,
> buffers) output for easier troubleshooting ?
>
> --
> Thanks,
> Alexey Ermakov
>
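The leak pattern described above can be illustrated outside the executor with a toy model of per-query vs. per-lookup memory contexts. This is only a sketch: the `ToyContext` type and function names are invented for illustration, not PostgreSQL's real MemoryContext API (which lives in utils/mmgr), and the context only counts bytes rather than allocating them.

```c
#include <stddef.h>

/* Toy stand-in for a memory context: it only tracks how many bytes are
 * notionally live, which is enough to show the growth pattern. */
typedef struct ToyContext { size_t live_bytes; } ToyContext;

static void toy_alloc(ToyContext *cxt, size_t n) { cxt->live_bytes += n; }
static void toy_reset(ToyContext *cxt) { cxt->live_bytes = 0; }

/* Leaky pattern: each cache lookup evaluates the key expression in the
 * query-lifetime context, so every iteration's derived datum survives
 * until executor shutdown. Returns bytes still live at end of query. */
size_t leaky_lookups(int nloops, size_t key_bytes)
{
    ToyContext executor_state = {0};
    for (int i = 0; i < nloops; i++)
        toy_alloc(&executor_state, key_bytes);
    return executor_state.live_bytes;
}

/* Fixed pattern: evaluate the key in a short-lived context that is reset
 * at the start of each lookup; live memory stays bounded by one key. */
size_t bounded_lookups(int nloops, size_t key_bytes)
{
    ToyContext per_lookup = {0};
    size_t high_water = 0;
    for (int i = 0; i < nloops; i++)
    {
        toy_reset(&per_lookup);
        toy_alloc(&per_lookup, key_bytes);
        if (per_lookup.live_bytes > high_water)
            high_water = per_lookup.live_bytes;
    }
    return high_water;
}
```

With the reported query's 400,000 outer rows and a few hundred bytes per detoasted json key, the leaky variant finishes the query with hundreds of megabytes still live in ExecutorState, while the bounded variant never holds more than one key's worth at a time.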
On Sat, 18 Mar 2023 at 13:22, David Rowley <dgrowleyml@gmail.com> wrote:
> It seems to be related to a bug in nodeMemoize.c where we're
> evaluating the cache key expressions in the ExecutorState context. We
> should really be in a more temporary context that gets reset early in
> cache_lookup() before the call to prepare_probe_slot(). I'll need to
> look in a bit more detail about what that context actually should be.

I've attached fix_memoize_memory_leak.patch to address this.

Using your test case, here are the memory stats before and after the
fix (taken during ExecEndMemoize).

Before:

TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
  PortalHoldContext: 24640 total in 2 blocks; 7384 free (0 chunks); 17256 used
  PortalContext: 51264 total in 10 blocks; 7320 free (11 chunks); 43944 used: <unnamed>
    ExecutorState: 1770040136 total in 223 blocks; 3728312 free (87 chunks); 1766311824 used
      MemoizeHashTable: 46137408 total in 15 blocks; 6353568 free (5 chunks); 39783840 used
      ExprContext: 8192 total in 1 blocks; 3512 free (0 chunks); 4680 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used

After:

TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
  PortalHoldContext: 24640 total in 2 blocks; 7384 free (0 chunks); 17256 used
  PortalContext: 51264 total in 10 blocks; 7320 free (11 chunks); 43944 used: <unnamed>
    ExecutorState: 76616 total in 5 blocks; 13528 free (8 chunks); 63088 used
      MemoizeHashTable: 46137408 total in 15 blocks; 6353568 free (5 chunks); 39783840 used
      ExprContext: 8192 total in 1 blocks; 3512 free (0 chunks); 4680 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used

> Another thing that came to mind is that we don't track the memory for
> the cache key. So that could account for some additional memory usage
> with Memoize. I have a patch locally to fix that. Likely that would be
> a master-only fix, however. I doubt that's accounting for much of the
> extra memory you're reporting anyway. In hindsight, we should be
> tracking that, but I think at the time I was writing this code, I had
> thoughts that it wasn't much memory compared to storing the cached
> tuples. I now think differently.

I've also attached the have_memoize_track_cachekey_memory.patch to
address this. I intend this one for master only. I considered if
maybe the executor changes without the planner changes could be
backpatched, but I don't think that's a good idea. It wouldn't cause
plan stability problems, but it could cause executor performance
changes if we start evicting more cache entries due to memory
pressure.

David
On Sun, Mar 19, 2023 at 5:33 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Another thing that came to mind is that we don't track the memory for
> the cache key. So that could account for some additional memory usage
> with Memoize. I have a patch locally to fix that. Likely that would be
> a master-only fix, however. I doubt that's accounting for much of the
> extra memory you're reporting anyway. In hindsight, we should be
> tracking that, but I think at the time I was writing this code, I had
> thoughts that it wasn't much memory compared to storing the cached
> tuples. I now think differently.
> I've also attached the have_memoize_track_cachekey_memory.patch to
> address this. I intend this one for master only. I considered if
> maybe the executor changes without the planner changes could be
> backpatched, but I don't think that's a good idea. It wouldn't cause
> plan stability problems, but it could cause executor performance
> changes if we start evicting more cache entries due to memory
> pressure.
  mstate->mem_used -= EMPTY_ENTRY_MEMORY_BYTES(entry);
+ mstate->mem_used -= sizeof(MemoizeKey) + GetMemoryChunkSpace(key->params);
It seems that the memory used by key is already accounted for in
EMPTY_ENTRY_MEMORY_BYTES. I wonder if this change is needed.
Also I'm kinda confused about using MinimalTuple->t_len vs. using
GetMemoryChunkSpace(MinimalTuple). Why do we choose t_len rather than
GetMemoryChunkSpace in EMPTY_ENTRY_MEMORY_BYTES and CACHE_TUPLE_BYTES?
Thanks
Richard
On Mon, 20 Mar 2023 at 16:10, Richard Guo <guofenglinux@gmail.com> wrote:
>   mstate->mem_used -= EMPTY_ENTRY_MEMORY_BYTES(entry);
> + mstate->mem_used -= sizeof(MemoizeKey) + GetMemoryChunkSpace(key->params);
>
> It seems that the memory used by key is already accounted for in
> EMPTY_ENTRY_MEMORY_BYTES. I wonder if this change is needed.

Yeah, I noticed this earlier when looking at the patch again. I
remembered I'd not taken the cache key memory into account; it seems I
should have remembered that I only did that for the planner. I just
pushed the patch to fix that part.

> Also I'm kinda confused about using MinimalTuple->t_len vs. using
> GetMemoryChunkSpace(MinimalTuple). Why do we choose t_len rather than
> GetMemoryChunkSpace in EMPTY_ENTRY_MEMORY_BYTES and CACHE_TUPLE_BYTES?

It's true that these can be wildly different, but at least slightly
less so than it was in v15 due to c6e0fe1f2a. I understand
GetMemoryChunkSpace is what we generally use in nodeAgg.c, so probably
it would be more accurate to use that. However, it's not like making
that change would make it perfect. Once we start evicting cache
entries and adding new ones, aset.c may malloc() new blocks when the
items we've pfree'd are on a free list that is not useful for new
allocations.

It would be interesting to see if there's any performance hit from
using GetMemoryChunkSpace(). I expect that's slower; it's just a
question of whether we can measure it.

David
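To make the t_len vs. GetMemoryChunkSpace() gap concrete, here is a deliberately simplified model of aset.c's sizing for small requests: the allocator rounds a request up to the next power of two before adding a per-chunk header. Both the 16-byte header and the minimum chunk size below are illustrative assumptions, not PostgreSQL's exact constants (which vary by version, and shrank in c6e0fe1f2a), and the real allocator handles large requests differently.

```c
#include <stddef.h>

/* Illustrative per-chunk header overhead; the real value depends on
 * PostgreSQL version and platform. */
#define TOY_CHUNK_HDR 16

/* aset.c-style sizing (simplified): small requests are rounded up to
 * the next power of two (minimum 8 bytes), then the header is added. */
size_t toy_chunk_space(size_t request)
{
    size_t size = 8;
    while (size < request)
        size <<= 1;
    return size + TOY_CHUNK_HDR;
}
```

So a MinimalTuple with t_len = 530 occupies around toy_chunk_space(530) = 1040 bytes in this model, nearly double what a t_len-based CACHE_TUPLE_BYTES would count, which is the "wildly different" case above.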
On 2023-03-19 15:33, David Rowley wrote:
> On Sat, 18 Mar 2023 at 13:22, David Rowley <dgrowleyml@gmail.com> wrote:
>> It seems to be related to a bug in nodeMemoize.c where we're
>> evaluating the cache key expressions in the ExecutorState context. We
>> should really be in a more temporary context that gets reset early in
>> cache_lookup() before the call to prepare_probe_slot(). I'll need to
>> look in a bit more detail about what that context actually should be.
>
> I've attached fix_memoize_memory_leak.patch to address this.
>
> Using your test case, here are the memory stats before and after the
> fix (taken during ExecEndMemoize).

Thanks a lot! Could you please point me in the direction of how to get
such memory stats? I might know of another memory leak I'd like to
check, but sadly I don't know the right tools to measure memory usage.

--
Thanks,
Alexey Ermakov
On Mon, 20 Mar 2023 at 19:23, Alexey Ermakov
<alexey.ermakov@dataegret.com> wrote:
> Thanks a lot! Could you please point me in the direction how to get such
> memory stats ? I might know another memory leak I'd like to check but
> sadly I don't know right tools to measure memory usage.

I attached to the backend with a debugger and set a breakpoint in
ExecEndMemoize. When the breakpoint was hit, using the debugger, I
executed MemoryContextStatsDetail(TopMemoryContext, 100, 1) and
grabbed the memory statistics output from the log.

David
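For reference, that debugger session looks roughly like the following. The PID is a placeholder, and the MemoryContextStatsDetail() call matches the signature used above (context, max_children, print_to_stderr as of PostgreSQL 14; the function's arguments differ in other major versions):

```gdb
# In the session under test, find its backend PID:
#   SELECT pg_backend_pid();
# Then attach from a shell (example PID):
#   gdb -p 12345

(gdb) break ExecEndMemoize
(gdb) continue
# ...run the query in the attached session; when the breakpoint fires:
(gdb) call MemoryContextStatsDetail(TopMemoryContext, 100, 1)
(gdb) detach
```

The per-context output lands in the server log (or stderr). Note that for inspecting your own backend without a debugger, PostgreSQL 14 also added the pg_backend_memory_contexts system view (columns include name, parent, total_bytes and used_bytes), though it can't be pointed at another backend mid-query.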