Re: BUG #7916: memory leak with array_agg - Mailing list pgsql-bugs

From Sergey Burladyan
Subject Re: BUG #7916: memory leak with array_agg
Date
Msg-id 87ehftierx.fsf@home.progtech.ru
Whole thread Raw
In response to Re: BUG #7916: memory leak with array_agg  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Tom Lane <tgl@sss.pgh.pa.us> writes:

> AFAICT there's no actual leak here; array_agg is just optimized for
> speed rather than space.  It eats about 8K per hashtable entry.
> While the planner knows that, it's got no good idea how many groups
> will be produced by the query, so it underestimates the space needed
> --- and the HashAggregate code is not currently capable of spilling
> the hashtable to disk, so the table balloons well past the intended
> work_mem limit.

Aha, I found this "8K per hashtable entry" with ltrace and now understand
what is going on here:
  === array_agg malloc calls count ===  === myagg malloc calls count===
      3 malloc(1024)                        3 malloc(1024)
      3 malloc(1048576)                     3 malloc(1048576)
      3 malloc(131072)                      3 malloc(131072)
      1 malloc(16056)                       4 malloc(16384)
      5 malloc(16384)                       2 malloc(16440)
      2 malloc(16440)                       1 malloc(2048)
      1 malloc(2048)                        3 malloc(2097152)
      3 malloc(2097152)                     3 malloc(262144)
      3 malloc(262144)                      3 malloc(32768)
      3 malloc(32768)                       1 malloc(32824)
      1 malloc(32824)                       1 malloc(4096)
      1 malloc(4096)                        3 malloc(4194304)
      3 malloc(4194304)                     3 malloc(524288)
      3 malloc(524288)                      3 malloc(65536)
      3 malloc(65536)                      12 malloc(8192)
 724151 malloc(8192)                        1 malloc(8296)
      1 malloc(8296)                       29 malloc(8360)
     44 malloc(8360)                       16 malloc(8388608)
      8 malloc(8388608)

Thank you for answer Tom!

> Although no real fix for this is within easy reach, it strikes me
> that we could possibly ameliorate things a bit by tweaking the
> memory context size parameters used by accumArrayResult().
> It would likely be reasonable to set the min size to 1K or so not 8K.
> This would make things a trifle slower when the actual space requirement
> exceeds 1K, but probably not by enough to notice.

Looks good.

> BTW, I don't believe your assertion that the handmade aggregate does
> this in 7MB.  Even a very optimistic calculation puts the space needed
> for 700000 10-element integer arrays at forty-some MB, and when I try
> it I see more like 100MB consumed thanks to hashtable overhead.

Yes you are right, Tom. My mistake.

--
Sergey Burladyan

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #7916: memory leak with array_agg
Next
From: adrianopatrick@gmail.com
Date:
Subject: BUG #7918: limitation of pagination with LIMIT and OFFSET