Re: Out of Memory errors are frustrating as heck! - Mailing list pgsql-performance

From Gunther
Subject Re: Out of Memory errors are frustrating as heck!
Date
Msg-id ddd4cc94-83a4-3210-bda3-70afd535cd53@gusw.net
Whole thread Raw
In response to Re: Out of Memory errors are frustrating as heck!  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Out of Memory errors are frustrating as heck!
Re: Out of Memory errors are frustrating as heck!
Re: Out of Memory errors are frustrating as heck!
List pgsql-performance

Wow, we are getting somewhere.

Tom (BTW, your mail server rejects my direct mail, but I'm glad you got it through the list), you say:

I'm pretty sure that's not the droid we're looking for.
ExecHashJoinGetSavedTuple does palloc a new tuple, but it immediately
sticks it into a TupleTableSlot that will be responsible for freeing
it (when the next tuple is stuck into the same slot).  I'd suggest
continuing a few times and looking for other code paths leading
to AllocSetAlloc in this context.

I did continue a "few times", but few as in a dozen, it's always the same

(gdb) bt 6
#0  AllocSetAlloc (context=0x1168230, size=375) at aset.c:715
#1  0x000000000084e6cd in palloc (size=375) at mcxt.c:938
#2  0x000000000061019c in ExecHashJoinGetSavedTuple (file=file@entry=0x21df688, hashvalue=hashvalue@entry=0x7fff2e4ca76c,   tupleSlot=0x10856b8, hjstate=0x11688e0) at nodeHashjoin.c:1277
#3  0x0000000000610c83 in ExecHashJoinNewBatch (hjstate=0x11688e0) at nodeHashjoin.c:1042
#4  ExecHashJoinImpl (parallel=false, pstate=0x11688e0) at nodeHashjoin.c:539
#5  ExecHashJoin (pstate=0x11688e0) at nodeHashjoin.c:565
(More stack frames follow...)

So I decided to just let it go until it exits the ExecHashJoin function:

#6  0x00000000005fde68 in ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:461
461             result = node->ExecProcNodeReal(node);
(gdb) list
456     {
457             TupleTableSlot *result;
458
459             InstrStartNode(node->instrument);
460
461             result = node->ExecProcNodeReal(node);
462
463             InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);
464
465             return result;
(gdb) break 463
Breakpoint 3 at 0x5fde68: file execProcnode.c, line 463.
(gdb) disable 2
(gdb) cont
Continuing.

Breakpoint 3, ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:463
463             InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);

oops, that was fast, so up further ...

(gdb) cont
Continuing.

Breakpoint 4, ExecSort (pstate=0x11687d0) at nodeSort.c:109
109                             if (TupIsNull(slot))
(gdb) cont
Continuing.

Breakpoint 3, ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:463
463             InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);
(gdb) cont
Continuing.

Breakpoint 4, ExecSort (pstate=0x11687d0) at nodeSort.c:109
109                             if (TupIsNull(slot))
(gdb) up
#1  0x00000000005fde68 in ExecProcNodeInstr (node=0x11687d0) at execProcnode.c:461
461             result = node->ExecProcNodeReal(node);
(gdb) up
#2  0x000000000061d2c4 in ExecProcNode (node=0x11687d0) at ../../../src/include/executor/executor.h:247
247             return node->ExecProcNode(node);
(gdb) up
#3  ExecUnique (pstate=0x11685e0) at nodeUnique.c:73
73                      slot = ExecProcNode(outerPlan);
(gdb) list
68              for (;;)
69              {
70                      /*
71                       * fetch a tuple from the outer subplan
72                       */
73                      slot = ExecProcNode(outerPlan);
74                      if (TupIsNull(slot))
75                      {
76                              /* end of subplan, so we're done */
77                              ExecClearTuple(resultTupleSlot);

... but whatever I do, ultimately I get to that allocation routine through the same path.

Since that is the bulk of the activity, and memory was still growing while we come through this path, I assume that this is it.

My first thought on noticing the SELECT DISTINCT was that you might be
hitting the grouping-function-related leak that Andres fixed in 9cf37a527;
but that fix did make it into 11.2 (by just a couple of days...).  Still,
maybe there's another issue in the same area.

I don't know about that one, I only know that I am running 11.2 freshly compiled.

The change suggested by Alvaro Herrera wasn't applicable.

Jeff Janes had more

Breakpoint 2, AllocSetAlloc (context=0x1168230, size=8272) at aset.c:715
715     {
(gdb) p context->name
$8 = 0x96ce5b "ExecutorState"

I think that the above one might have been the one you wanted.
Not sure how you could tell that? It's the same place as everything else. If we can find out what you're looking for, may be we can set a break point earlier up the call chain?

I guess I should run this for a little longer. So I disable my breakpoints 

it went up pretty quick from 1.2 GB to 1.5 GB, but then it stopped growing fast, so now back to gdb and break:

Unfortunately, I think this means you missed your opportunity and are now getting backtraces of the innocent bystanders.
But why? If I see the memory still go up insanely fast, isn't that a sign for the leak?
Particularly since you report that the version using nested loops rather than hash joins also leaked, so it is probably not the hash-join specific code that is doing it.
How about it's in the DISTINCT? I noticed while peeking up the call chain, that it was already in the UNIQUE sort thing also.  I guess it's streaming the results from the hash join right into the unique sort step.
What I've done before is compile with the comments removed from 
src/backend/utils/mmgr/aset.c:/* #define HAVE_ALLOCINFO */
I have just done that and it creates an insane amount of output from all the processes, I'm afraid there will be no way to keep that stuff separated. If there was a way of turning that one and off for one process only, then we could probably get more info...

Everything is also extremely slow that way. Like in a half hour the memory didn't even reach 100 MB.

and then look for allocations sizes which are getting allocated but not freed, and then you can go back to gdb to look for allocations of those specific sizes. 
I guess I should look for both, address and size to match it better.
This generates a massive amount of output, and it bypasses the logging configuration and goes directly to stderr--so it might not end up where you expect.
Yes, massive, like I said. Impossible to use. File system fills up rapidly. I made it so that it can be turned on and off, with the debugger.
int _alloc_info = 0;
#ifdef HAVE_ALLOCINFO
#define AllocFreeInfo(_cxt, _chunk) \       if(_alloc_info) \           fprintf(stderr, "AllocFree: %s: %p, %zu\n", \               (_cxt)->header.name, (_chunk), (_chunk)->size)
#define AllocAllocInfo(_cxt, _chunk) \       if(_alloc_info) \           fprintf(stderr, "AllocAlloc: %s: %p, %zu\n", \               (_cxt)->header.name, (_chunk), (_chunk)->size)
#else
#define AllocFreeInfo(_cxt, _chunk)
#define AllocAllocInfo(_cxt, _chunk)
#endif

so with this I do

(gdb) b AllocSetAlloc
(gdb) cont
(gdb) set _alloc_info=1
(gdb) disable
(gdb) cont

then I wait, ... until it crashes again ... no, it's too much. It fills up my filesystem in no time with the logs.  It produced 3 GB in just a minute of run time.

And also, I doubt we can find anything specifically by allocation size. It's just going to be 512 or whatever.

Isn't there some other way?

I'm going to try without that DISTINCT step, or perhaps by dismantling this query until it works without this excessive memory growth.

-Gunther



pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Out of Memory errors are frustrating as heck!
Next
From: Mark Kirkwood
Date:
Subject: Re: PostgreSQL upgrade.