Thread: Multiple calls to json_array_elements slow down nonlinearly

Multiple calls to json_array_elements slow down nonlinearly

From
Craig Ringer
Date:
Hi all

I ran into this performance report over the weekend:
   http://stackoverflow.com/q/21507127/398670

and wanted to mention it here.

json_array_elements seems to spend about 97% of its time in
MemoryContextReset(...).

Given dummy data:

test=> create table g as select (select json_agg(random()) json  from
generate_series(0, (r1*4)::int))  from (select random() r1 from
generate_series(1,20000)) aux;

Compare these two methods of producing the same result set:

test=> create table q as select json->x foo from g,
generate_series(0,json_array_length(g.json)-1) x;
SELECT 60103
Time: 157.702 ms
test=> create table p as select json_array_elements(json) foo from g;
SELECT 60103
Time: 4254.494 ms


The issue is reproducible and scales non-linearly with row count, which
is a clue. At 100k rows input, the lateral query takes 592ms vs 179959ms
(3 minutes) for json_array_elements.

Whenever I grab a backtrace it looks like:

> #0  0x000000000072dd7d in MemoryContextReset (context=0x2a02dc90) at mcxt.c:130
> #1  0x000000000072dd90 in MemoryContextResetChildren (context=<optimized out>) at mcxt.c:155
> #2  MemoryContextReset (context=0x1651220) at mcxt.c:131
> #3  0x00000000005817f9 in ExecScan (node=node@entry=0x164e1a0, accessMtd=accessMtd@entry=0x592040 <SeqNext>,
recheckMtd=recheckMtd@entry=0x592030<SeqRecheck>)
 
>     at execScan.c:155

(Sorry for the quote-paste; only way to make @#$ Thunderbird not wrap
mail, I need to switch clients or fix that).

"perf top" on the process shows:
96.92%  postgres      [.] MemoryContextReset 0.15%  [kernel]      [k] cpuacct_account_field 0.09%  [kernel]      [k]
update_cfs_rq_blocked_load0.09%  postgres      [.] AllocSetAlloc
 


At a guess, we're looking at a case where a new child context is created
at every call, so every MemoryContextResetChildren call has to deal with
more child contexts. I'm going to take a quick look now, I just wanted
to get this written up before I got sidetracked.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: bugfix patch for json_array_elements

From
Craig Ringer
Date:
On 02/03/2014 09:09 AM, Craig Ringer wrote:

> At a guess, we're looking at a case where a new child context is created
> at every call, so every MemoryContextResetChildren call has to deal with
> more child contexts.

That would be "yes". After a short run, I see 32849 lines like:

json_array_elements temporary cxt: 8192 total in 1 blocks; 8160 free (0
chunks); 32 used

under the context:

  PortalMemory: 8192 total in 1 blocks
    PortalHeapMemory: 7168 total in 3 blocks
      ExecutorState: 65600 total in 4 blocks
        ExprContext: 8192 total in 1 blocks
          json_array_elements temporary cxt: 8192 total in 1 blocks;
8160 free (0 chunks); 32 used


The attached patch deletes the context after use, bringing performance
back into line. It should be backpatched to 9.3.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: bugfix patch for json_array_elements

From
Andrew Dunstan
Date:
On 02/02/2014 08:54 PM, Craig Ringer wrote:
> On 02/03/2014 09:09 AM, Craig Ringer wrote:
>
>> At a guess, we're looking at a case where a new child context is created
>> at every call, so every MemoryContextResetChildren call has to deal with
>> more child contexts.
> That would be "yes". After a short run, I see 32849 lines like:
>
> json_array_elements temporary cxt: 8192 total in 1 blocks; 8160 free (0
> chunks); 32 used
>
> under the context:
>
>    PortalMemory: 8192 total in 1 blocks
>      PortalHeapMemory: 7168 total in 3 blocks
>        ExecutorState: 65600 total in 4 blocks
>          ExprContext: 8192 total in 1 blocks
>            json_array_elements temporary cxt: 8192 total in 1 blocks;
> 8160 free (0 chunks); 32 used
>
>
> The attached patch deletes the context after use, bringing performance
> back into line. It should be backpatched to 9.3.

[...]
> +    MemoryContextDelete(state->tmp_cxt);
> +    state->tmp_cxt = NULL;
> +
>       PG_RETURN_NULL();



Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the 
end of the function since it's a child of the CurrentMemoryContext. But 
if I got that wrong I'm happy to fix it. I don't see the point in 
setting state->tmp_cxt to NULL since it's about to go out of scope anyway.


cheers

andrew



Re: bugfix patch for json_array_elements

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 02/02/2014 08:54 PM, Craig Ringer wrote:
>> The attached patch deletes the context after use, bringing performance
>> back into line. It should be backpatched to 9.3.

> Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the 
> end of the function since it's a child of the CurrentMemoryContext.

The executor does MemoryContextReset, not
MemoryContextResetAndDeleteChildren, on the per-tuple context.  That means
that child contexts will be reset, not deleted.  I seem to recall some
discussions about changing that, or even redefining MemoryContextReset to
automatically delete child contexts; but it would take a fair amount of
research to be sure such a change was safe.
        regards, tom lane



Re: bugfix patch for json_array_elements

From
Andrew Dunstan
Date:
On 02/03/2014 11:12 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 02/02/2014 08:54 PM, Craig Ringer wrote:
>>> The attached patch deletes the context after use, bringing performance
>>> back into line. It should be backpatched to 9.3.
>> Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the
>> end of the function since it's a child of the CurrentMemoryContext.
> The executor does MemoryContextReset, not
> MemoryContextResetAndDeleteChildren, on the per-tuple context.  That means
> that child contexts will be reset, not deleted.  I seem to recall some
> discussions about changing that, or even redefining MemoryContextReset to
> automatically delete child contexts; but it would take a fair amount of
> research to be sure such a change was safe.
>

Good to know.

Is it worth a note in src/backend/utils/mmgr/README so people are warned 
against making the same mistake I did? Both of these are set-returning 
functions operating in materialize mode - not sure if that makes any 
difference.

cheers

andrew