Re: pg13dev: explain partial, parallel hashagg, and memory use - Mailing list pgsql-hackers

From	James Coleman
Subject	Re: pg13dev: explain partial, parallel hashagg, and memory use
Date	August 5, 2020 05:01:18
Msg-id	CAAaqYe-sg7cHgayWwKWZtSyFr5LQEiMExiqmjeHUOKXxHKxWjQ@mail.gmail.com
In response to	Re: pg13dev: explain partial, parallel hashagg, and memory use (David Rowley <dgrowleyml@gmail.com>)
List	pgsql-hackers

On Tue, Aug 4, 2020 at 9:44 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > I'm testing with a customer's data on pg13dev and got output for which Peak
> > Memory doesn't look right/useful.  I reproduced it on 565f16902.
>
> Likely the sanity of those results depends on whether you think that
> the Memory Usage reported outside of the workers is meant to be the
> sum of all processes or the memory usage for the leader backend.
>
> All that's going on here is that the Parallel Append is using some
> parallel safe paths and giving one to each worker. The 2 workers take
> the first 2 subpaths and the leader takes the third.  The memory usage
> reported helps confirm that's the case.
>
> Can you explain what you'd want to see changed about this?   Or do you
> want to see the non-parallel worker memory be the sum of all workers?
> Sort does not seem to do that, so I'm not sure if we should consider
> hash agg as an exception to that.

I've always found the way we report parallel workers in EXPLAIN quite
confusing. I realize it matches the actual implementation model (the
leader often is also "another worker"), but I think the natural
expectation from a user perspective would be that you'd show as
workers all backends (including the leader) that did work, and then
aggregate into a summary line (where the leader is displayed now).

In the current output there's nothing really to hint to the user that
the model is leader + workers and that the "summary" line is really
the leader. If I were to design this from scratch, I'd want to propose
doing what I said above (summary aggregate line + treat leader as a
worker line, likely with a "leader" tag), but that seems like a big
change to make now. On the other hand, perhaps designating what looks
like a summary line as the "leader" or some such would help clear up
the confusion? Perhaps it could also say "Participating" or
"Non-participating"?
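
To make the "summary aggregate line + leader as a worker line" idea
concrete, here is a rough sketch of how such output might be assembled.
Everything in it is hypothetical: the ProcessStats type, the
format_node_memory function, the label text, and the memory figures are
invented for illustration and are not PostgreSQL's actual EXPLAIN code
or format. It also assumes the summary should be the sum over
participating backends, which is exactly the open question in this
thread.

```python
from dataclasses import dataclass


@dataclass
class ProcessStats:
    """Per-backend figures for one plan node (hypothetical structure)."""
    label: str            # "Leader" or "Worker N"
    participating: bool   # whether this backend executed the node at all
    peak_memory_kb: int


def format_node_memory(stats):
    """Return a summary line followed by one line per backend, leader included."""
    # Assumption: the summary is the sum across participating backends.
    total_kb = sum(s.peak_memory_kb for s in stats if s.participating)
    lines = [f"Peak Memory Usage (all processes): {total_kb}kB"]
    for s in stats:
        state = "Participating" if s.participating else "Non-participating"
        lines.append(
            f"  {s.label} ({state}): Peak Memory Usage: {s.peak_memory_kb}kB"
        )
    return lines


# Made-up numbers, not taken from the plan discussed in this thread.
example = [
    ProcessStats("Leader", True, 4096),
    ProcessStats("Worker 0", True, 8192),
    ProcessStats("Worker 1", True, 8192),
]
report = format_node_memory(example)
print("\n".join(report))
```

The point of the layout is that the first line is unambiguously an
aggregate, and the leader shows up alongside the workers with an
explicit tag instead of occupying the position that looks like a
summary.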

James
