Multiple calls to json_array_elements slow down nonlinearly - Mailing list pgsql-hackers

From Craig Ringer
Subject Multiple calls to json_array_elements slow down nonlinearly
Date
Msg-id 52EEEC37.9040305@2ndquadrant.com
Whole thread Raw
Responses Re: bugfix patch for json_array_elements
List pgsql-hackers
Hi all

I ran into this performance report over the weekend:
   http://stackoverflow.com/q/21507127/398670

and wanted to mention it here.

json_array_elements seems to spend about 97% of its time in
MemoryContextReset(...).

Given dummy data:

test=> create table g as select (select json_agg(random()) json  from
generate_series(0, (r1*4)::int))  from (select random() r1 from
generate_series(1,20000)) aux;

Compare these two methods of producing the same result set:

test=> create table q as select json->x foo from g,
generate_series(0,json_array_length(g.json)-1) x;
SELECT 60103
Time: 157.702 ms
test=> create table p as select json_array_elements(json) foo from g;
SELECT 60103
Time: 4254.494 ms


The issue is reproducible and scales non-linearly with row count, which
is a clue. At 100k rows input, the lateral query takes 592ms vs 179959ms
(3 minutes) for json_array_elements.

Whenever I grab a backtrace it looks like:

> #0  0x000000000072dd7d in MemoryContextReset (context=0x2a02dc90) at mcxt.c:130
> #1  0x000000000072dd90 in MemoryContextResetChildren (context=<optimized out>) at mcxt.c:155
> #2  MemoryContextReset (context=0x1651220) at mcxt.c:131
> #3  0x00000000005817f9 in ExecScan (node=node@entry=0x164e1a0, accessMtd=accessMtd@entry=0x592040 <SeqNext>,
recheckMtd=recheckMtd@entry=0x592030<SeqRecheck>)
 
>     at execScan.c:155

(Sorry for the quote-paste; only way to make @#$ Thunderbird not wrap
mail, I need to switch clients or fix that).

"perf top" on the process shows:
96.92%  postgres      [.] MemoryContextReset 0.15%  [kernel]      [k] cpuacct_account_field 0.09%  [kernel]      [k]
update_cfs_rq_blocked_load0.09%  postgres      [.] AllocSetAlloc
 


At a guess, we're looking at a case where a new child context is created
at every call, so every MemoryContextResetChildren call has to deal with
more child contexts. I'm going to take a quick look now, I just wanted
to get this written up before I got sidetracked.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Making strxfrm() blobs in indexes work
Next
From: Craig Ringer
Date:
Subject: Re: narwhal and PGDLLIMPORT