BUG #18172: High memory usage in tSRF function context - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18172: High memory usage in tSRF function context
Date
Msg-id 18172-9b8c5fc1d676ded3@postgresql.org
Whole thread Raw
Responses Re: BUG #18172: High memory usage in tSRF function context  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18172
Logged by:          Sergei Kornilov
Email address:      sk@zsrv.org
PostgreSQL version: Unsupported/Unknown
Operating system:   various Linux
Description:

Hello

While researching several cases of OOM, I discovered an atypically high
memory consumption in this case:

/*
 * test data looks like {"pay": [{"id": 1, "test": 1}], "name": "foo1"}
 * about 1,3GiB total
 */
create table test (id serial, data jsonb);
insert into test (data) select jsonb_build_object('name', 'foo'||i, 'pay',
jsonb_build_array(jsonb_build_object('id', i, 'test', i % 10))) from
generate_series(1,1e7,1) as i;

Using gdb I called MemoryContextStats(TopPortalContext) on break point
standard_ExecutorEnd

statement: select distinct (jsonb_array_elements(data->'pay2')->>'test')
from test;
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 16384 total in 2 blocks; 1760 free (5 chunks); 14624
used
      HashAgg meta context: 24576 total in 2 blocks; 13880 free (2 chunks);
10696 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 1442840576 total in 182 blocks; 2833096 free
(356 chunks); 1440007480 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 1442956288 bytes in 196 blocks; 2920648 free (363 chunks);
1440035640 used

statement: select distinct (jsonb_array_elements(data->'pay2')->>'test')
from test where data ? 'pay2';
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 32768 total in 3 blocks; 16680 free (6 chunks); 16088
used
      HashAgg meta context: 57408 total in 2 blocks; 5744 free (0 chunks);
51664 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 173120 bytes in 16 blocks; 102288 free (6 chunks); 70832 used

statement: select distinct (jsonb_array_elements(data->'pay')->>'test') from
test;
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 32768 total in 3 blocks; 17912 free (10 chunks); 14856
used
      HashAgg meta context: 1581120 total in 2 blocks; 5744 free (0 chunks);
1575376 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7472 free (0 chunks); 720 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 1696832 bytes in 16 blocks; 103040 free (10 chunks); 1593792
used

The first query consumes much more memory; note that it refers to a
non-existent json key. Querying for a key that exists in JSON or checking
for the existence of a key in where clause corrects memory consumption. But,
unfortunately, data analysts write the query exactly like the first
statement, not like the second statement.

Queries, of course, are corrected later, but maybe it is possible to improve
memory consumption for this case?

MemoryContextStats above are from my local 17devel
(83510534d5f3f116efa035639b9b62b8c6c4df34 commit), occurs for all supported
versions.

regards, Sergei


pgsql-bugs by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: BUG #18170: Unexpected error: no relation entry for relid 3
Next
From: hubert depesz lubaczewski
Date:
Subject: Logical replication is missing block of rows when sending initial sync?