From: Thomas Munro
Subject: Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker
Msg-id: CAEepm=1nemj1wGLsnNrVqBLmqV9N6+6WToo=prGxAXafPYiTrA@mail.gmail.com
In response to: Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker (Frits Jalvingh <jal@etc.to>)
Responses: Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker (Frits Jalvingh <jal@etc.to>)
List: pgsql-bugs
On Tue, Jun 5, 2018 at 11:08 PM, Frits Jalvingh <jal@etc.to> wrote:
> Hi Thomas, thanks for your help.
>
> The problem seems to be not very consistent. Without any changes to either
> database or query I now get:
> org.postgresql.util.PSQLException: ERROR: could not resize shared memory
> segment "/PostgreSQL.347733883" to 2147483648 bytes: No space left on device

That's what it looks like when the OS just can't allocate any more
virtual memory (posix_fallocate() on a shared memory region failed).
The solution to that problem is to turn down work_mem.  Under our
current memory management model, each backend may use up to work_mem
for every sort or hash join node.  So when you set work_mem = 2GB
and max_parallel_workers = 2 (+ 1 leader process = 3 backends) and
joined N tables together (a very hand-wavy approximation of the
number of memory-hungry nodes), you said it was OK to use a total of
2GB * (N - 1) * 3.  That's apparently too much for this rig.  Does
it work if you turn work_mem down?

> Just executing the statement WITHOUT the explain does work properly, so it
> is the explain part that seems to trigger the issue.

That is weird.  I don't have a theory to explain that yet.

> I set enable_parallel_hash =  off in the code and now the explain part works
> properly. The plan XML looks like this:

>           <Node-Type>Sort</Node-Type>
>           <Plan-Rows>1462</Plan-Rows>
>           <Actual-Rows>184162324</Actual-Rows>

>               <Node-Type>Hash Join</Node-Type>
>               <Plan-Rows>1462</Plan-Rows>
>               <Actual-Rows>184162324</Actual-Rows>

>                               <Node-Type>Hash</Node-Type>
>                               <Plan-Rows>47373</Plan-Rows>
>                               <Actual-Rows>731320569</Actual-Rows>

Thanks.  Hmm.  I haven't tried to understand this plan in detail,
but it is clearly underestimating cardinality dramatically, and the
actual row counts are large enough to produce an nbuckets value of
roughly the right order of magnitude.

If the first error you reported results from an attempt to allocate a
very large bucket array (rather than, say, a monster tuple >= 1GB), as
I speculated, then there would need to be a flaw in the logic that
clamps nbuckets to fit in MaxAllocSize.  I will do some digging today.
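
To show the kind of clamp I mean, here is a simplified sketch.  It is
not the real ExecChooseHashTableSize() logic, though MaxAllocSize
matches the definition in memutils.h:

    #include <stdio.h>
    #include <stddef.h>

    #define MaxAllocSize ((size_t) 0x3fffffff) /* 1GB - 1, memutils.h */

    /* Pick a power-of-two bucket count for the expected tuple count,
     * but never one whose bucket-pointer array exceeds MaxAllocSize. */
    static size_t
    choose_nbuckets(double ntuples)
    {
        /* largest power of two whose pointer array still fits */
        size_t max_buckets = 1;
        while (max_buckets * 2 * sizeof(void *) <= MaxAllocSize)
            max_buckets *= 2;   /* ends at 2^26 with 8-byte pointers */

        size_t nbuckets = 1;
        while ((double) nbuckets < ntuples && nbuckets < max_buckets)
            nbuckets *= 2;
        return nbuckets;
    }

    int main(void)
    {
        /* actual row count from the Hash node above */
        printf("nbuckets = %zu\n", choose_nbuckets(731320569.0));
        return 0;
    }

Without that clamp, the ~731 million actual rows above would round up
to 2^30 buckets, an 8GB pointer array; a code path that misses the
clamp is one plausible way to end up with an oversized request like
the 1073741824-byte DSA allocation in the original report.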

-- 
Thomas Munro
http://www.enterprisedb.com

