On Fri, Oct 6, 2017 at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lucas Fairchild-Madar <lucas.madar@gmail.com> writes: > Sure, here's a bash script. Assumes you have PGDATABASE, etc set. > Verified this also blows up on pg 10. I haven't verified the new patch.
Oh, I see the problem. This is a different animal, because it's actually an intra-row leak, as it were. What you've got is select data->'id', jsonb_array_elements(data->'items') from kaboom;
where the SRF jsonb_array_elements() emits a lot of values. For each of those values, data->'id' gets evaluated over again, and we can't reclaim memory in the per-tuple context until we've finished the whole cycle for the current row of "kaboom". So a leak would occur in any case ... but it's particularly awful in this case, because data->'id' involves detoasting the rather wide value of "data", which is then promptly leaked. So the total memory consumption is more or less proportional to O(N^2) in the length of "data".
This has been like this since forever, and it's probably impractical to do anything about it pre-v10, given the unstructured way that targetlist SRFs are handled. You could dodge the problem by moving the SRF to a lateral FROM item:
select data->'id', ja from kaboom, lateral jsonb_array_elements(data->'items') as ja;
(The LATERAL keyword is optional here, but I like it because it makes it clearer what's happening.)
As of v10, it might be possible to fix this for the tlist case as well, by doing something like using a separate short-lived context for the non-SRF tlist items.
Is there any sort of setting right now that can defend against this? A way to prevent a query from using 20+GB of memory? I'd prefer the query fail before the database system is kill -9'd.