Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion - Mailing list pgsql-bugs

From Frank van Vugt
Subject Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Date
Msg-id 1615181.fDHOPIPOzk@techfox.foxi
Whole thread Raw
In response to Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
List pgsql-bugs
Hi,

Interesting read so far!

Op zondag 20 oktober 2013 04:48:01 schreef Tomas Vondra:
> because the example is constructed so that groups with multiple elements
> are very unlikely).

yep, the original intention of the 'experiment' was to investigate how often
random() would 'clash' on various machines with/without hardware number
generators, etc

> the report says that it consumed ~32GB RAM and swap (not sure how
> much, but probably not a small amount). On my machine it easily ate 8GB
> of RAM and 4GB of swap (and then got shot by OOM).

amount of swap is 32GB as well (see output of 'free' on top of report ;) )

> Anyway, I disabled the preallocation (i.e. 1 element initially, +1 for
> each iteration) which should be ~80MB of data, but even then I was
> unable to execute that query.
>
> The effect on smaller queries (say, 1e6 rows) was negligible too - it
> consumed more or less the same amount of memory, irrespectedly of the
> preallocation.

on my setup, running the array_agg() over 1e6 records uses up ~5GB

> With 1e6 groups that's ~8GB (not really far from what I see here), and
> with 1e7 groups it's ~80GB. Not the most efficient approach for 80MB of
> values.

exactly, the 'scaling' of memory hunger was not what I'd expect from such a
recordset, which made me write up the report

> The failing query is slightly artificial, but pretty much any array_agg
> query with large number of groups is going to fail exactly the same. So
> if we could improve that somehow, that'd be nice.

exactly

> But let's say it's a minor issue, and by switching to a shared memory
> context we've already saved ~50% memory for such these cases (single
> element in a group, 1kB chunk - 64*8B = 512B).

I never ran into this earlier (and I've been using PostgreSQL for quite some
time now) and even now, this is not about a 'production query', so I'm happy
with any and all benefits for future versions that come from this ;)

Having said that, should you want me to check the effects of some patch, just
let me know, I'd be happy to do that.




--

Best,




Frank.

pgsql-bugs by date:

Previous
From: Frank van Vugt
Date:
Subject: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Next
From: "Tomas Vondra"
Date:
Subject: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion