Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker - Mailing list pgsql-bugs

From Frits Jalvingh
Subject Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker
Date
Msg-id CAKhTGFWWVo8uZJP+YXyOObVxhaQ+ngmRh9oYRMTO_CiEj-OMYw@mail.gmail.com
In response to Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #15225: [XX000] ERROR: invalid DSA memory alloc request size 1073741824 / Where: parallel worker
List pgsql-bugs
Thanks for your explanation.

>> - explain seems to somehow influence the plan that is being made.
>That should absolutely *not* be the case.
I could not agree more ;)
The "good" news is that this effect is apparently caused by something else. I fired those statements through an IDE (IntelliJ), written in Java and using JDBC. Something odd seems to be going on in there, because when I paste the query into psql the behavior with and without explain looks the same: three processes, two of them "parallel workers", doing enormous amounts of I/O at 50..90% CPU. I will try to find out what easter egg in either the IDE or the JDBC driver causes this 8-/.

> this does not prove that the thing is non-parallel, ....
I do not understand. When it runs without explain there are no "parallel worker" processes at all, just a single backend process running the select at 100% CPU (there are of course the normal Postgres processes, but those are all but idle, and none of them are background workers). If the work were merely badly distributed, wouldn't that just mean the workers are less busy, not absent?
Related question: since Postgres does not use threading, I assume background workers are visible as separate processes, so even if they are not very busy I would expect to see them with ps -ef.
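For the record, this is roughly how I'm checking (a sketch; log_min_messages needs superuser, and the exact wording of the worker-start log lines may differ between versions):

```sql
-- Make worker launches visible in the server log, then run the
-- statement and look for "starting background worker process
-- \"parallel worker ...\"" entries:
SET log_min_messages = debug1;

-- Meanwhile, from a shell, workers should also show up as processes:
--   ps -ef | grep 'parallel worker'
```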

>work_mem has always been like that.
I know. My observation is that this behavior hurts more on newer Postgres versions: because of the increased parallelism (and the apparent license for every node to grab work_mem when it sees fit), newer versions use memory less effectively than older ones, because you have to decrease the parameter. The 2GB value I had worked fine on 10 and helped a lot with speeding up my workload. Now, for the same workload, I have to set it to 512MB, so all queries that do just one sort are slower, and memory is used less well. All in all, the system might perform worse despite the parallelism, because you have to prevent queries from aborting.
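To make my concern concrete (my own back-of-the-envelope arithmetic, not anything from the docs):

```sql
-- work_mem is a per-sort/hash-node, per-process limit, so the worst
-- case for one query is roughly:
--   work_mem x (sort/hash nodes in the plan) x (leader + workers)
-- e.g. with the settings below and a plan containing 2 sort nodes:
--   512MB x 2 x 3 = ~3GB for a single query
SET work_mem = '512MB';
SET max_parallel_workers_per_gather = 2;
```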

Regards,
Frits Jalvingh

On Wed, Jun 6, 2018 at 4:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Frits Jalvingh <jal@etc.to> writes:
> As far as that explain thing goes, the odd thing seems to be that /without/
> the explain the database uses a non parallel plan. If I execute the exact
> same statement without explain I see one postgres process running at 100%
> for a long time, then it produces its output proper.

This doesn't prove that the thing is non-parallel, only that the work is
badly distributed.  You might try cranking up log_min_messages to the
point that worker-process launching gets logged, and then see whether
or not it's really non-parallel.

> - explain seems to somehow influence the plan that is being made.

That should absolutely *not* be the case.

> - there is a big problem with postgres memory assignment, because with the
> increase of parallelism having work_mem work per process means that you
> cannot effectively control the amount of memory that postgres uses anymore.

work_mem has always been like that.  You cannot set it to any very large
fraction of your system's available memory, at least not globally across
a whole bunch of queries.

                        regards, tom lane
