Re: Out of Memory errors are frustrating as heck! - Mailing list pgsql-performance

From Gunther
Subject Re: Out of Memory errors are frustrating as heck!
Msg-id 2256ca91-9dac-1fe1-6b23-02c899f8395f@gusw.net
In response to Re: Out of Memory errors are frustrating as heck!  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Out of Memory errors are frustrating as heck!  (Justin Pryzby <pryzby@telsasoft.com>)
Re: Out of Memory errors are frustrating as heck!  (Jeff Janes <jeff.janes@gmail.com>)
Re: Out of Memory errors are frustrating as heck!  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-performance
On 4/14/2019 23:24, Tom Lane wrote:
>>         ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
> Oooh, that looks like a memory leak right enough.  The ExecutorState
> should not get that big for any reasonable query.
2.2 GB is massive, yes.
> Your error and stack trace show a failure in HashBatchContext,
> which is probably the last of these four:
>
>>             HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
>>             HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>>             HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>>             HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
> Perhaps that's more than it should be, but it's silly to obsess over 100M
> when there's a 2.2G problem elsewhere.
Yes.
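To put the two numbers side by side, here is a small Python sketch that
parses the context lines quoted above and ranks them by bytes used. The
regular expression and the helper name parse_context_line are my own
assumptions based only on the shape of these five lines; they are not
part of PostgreSQL and may not cover every line of a full dump.

```python
import re

# Assumed line shape, inferred from the dump lines quoted in this thread:
#   <name>: <total> total in <blocks> blocks; <free> free (<chunks> chunks); <used> used
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+(?P<total>\d+) total in (?P<blocks>\d+) blocks; "
    r"(?P<free>\d+) free \((?P<chunks>\d+) chunks\); (?P<used>\d+) used\s*$"
)


def parse_context_line(line):
    """Return (name, total, free, used) for one context line, else None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return (
        m.group("name").strip(),
        int(m.group("total")),
        int(m.group("free")),
        int(m.group("used")),
    )


# The five context lines quoted in this message, copied verbatim.
dump = """
        ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
            HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
            HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
            HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
            HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
"""

# Keep only the lines that match the assumed shape.
contexts = [c for c in map(parse_context_line, dump.splitlines()) if c]

# Largest consumers first.
ranked = sorted(contexts, key=lambda c: c[3], reverse=True)

for name, total, free, used in ranked:
    print(f"{name:<18} {used / 2**20:10.1f} MiB used")
```

On these five lines, ExecutorState comes out roughly 22 times larger
than the biggest HashBatchContext, which is the point Tom makes above.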
>    I think it's likely that it was
> just coincidence that the failure happened right there.  Unfortunately,
> that leaves us with no info about where the actual leak is coming from.

Strange, though, that the vmstat tracking never showed the memory
allocated to cache going much below 6 GB. Even if this 2.2 GB memory
leak is there, and even if I had 2 GB of shared_buffers, there should
still be enough memory left for the OS to give me.

Is there any suspicion that this might be a problem with Linux? Because
if you want, I can whip out a FreeBSD machine, compile pgsql, attach
the same disk, and try it there. I am longing to have a reason to move
back to FreeBSD anyway. But I have tons of stuff to do, so unless you
have reason to suspect that Linux is at fault here, I would rather skip
that futile attempt.

> The memory map shows that there were three sorts and four hashes going
> on, so I'm not sure I believe that this corresponds to the query plan
> you showed us before.
Like I said, the first explain was not using the same constraints (no
NL). What I sent last should all be consistent: the memory dump, the
explain plan, and the gdb backtrace.
> Any chance of extracting a self-contained test case that reproduces this?

With 18 million rows involved in the base tables, hardly.

But I am ready to try whatever else you would like me to try with the
debugger. If we have a memory leak, we might just as well try to plug
it!

I could even give one of you access to the system that runs this.

thanks,
-Gunther


