Thread: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

18 June 2020, 03:37:21

Hi,

Now that HashAgg can spill to disk, we see a few more details in
EXPLAIN ANALYZE than we did previously, e.g. Peak Memory Usage, Disk
Usage.  However, the new code neglected to make EXPLAIN ANALYZE show
these new details for parallel workers.

Additionally, the new properties all were using
ExplainPropertyInteger() which meant that we got output like:

                             QUERY PLAN
---------------------------------------------------------------------
 HashAggregate (actual time=31.724..87.638 rows=1000 loops=1)
   Group Key: a
   Peak Memory Usage: 97 kB
   Disk Usage: 3928 kB
   HashAgg Batches: 798
   ->  Seq Scan on ab (actual time=0.006..9.243 rows=100000 loops=1)

Where all the properties were on a line by themselves.  This does not
really follow what other nodes are doing, e.g sort:

                                QUERY PLAN
---------------------------------------------------------------------------
 GroupAggregate (actual time=47.530..70.935 rows=1000 loops=1)
   Group Key: a
   ->  Sort (actual time=47.500..59.344 rows=100000 loops=1)
         Sort Key: a
         Sort Method: external merge  Disk: 2432kB
         ->  Seq Scan on ab (actual time=0.004..8.476 rows=100000 loops=1)

Where we stick all the properties on a single line.

The attached patch fixes both the missing parallel worker information
and puts the properties on a single line for format=text.

I'd like to push this in the next few days.

David

Attachment

hashagg_explain_fix.patch

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Justin Pryzby

Date:

18 June 2020, 13:45:14

On Thu, Jun 18, 2020 at 03:37:21PM +1200, David Rowley wrote:
> Now that HashAgg can spill to disk, we see a few more details in
> EXPLAIN ANALYZE than we did previously, e.g. Peak Memory Usage, Disk
> Usage.  However, the new code neglected to make EXPLAIN ANALYZE show
> these new details for parallel workers.
> 
> Additionally, the new properties all were using
> ExplainPropertyInteger() which meant that we got output like:
> 
>    Group Key: a
>    Peak Memory Usage: 97 kB
>    Disk Usage: 3928 kB
>    HashAgg Batches: 798
> 
> Where all the properties were on a line by themselves.  This does not
> really follow what other nodes are doing, e.g sort:
> 
>    ->  Sort (actual time=47.500..59.344 rows=100000 loops=1)
>          Sort Key: a
>          Sort Method: external merge  Disk: 2432kB
> 
> Where we stick all the properties on a single line.

Note that "incremental sort" is also new, and splits things up more than sort.

See in particular 6a918c3ac8a6b1d8b53cead6fcb7cbd84eee5750, which splits things
up even more.

    ->  Incremental Sort (actual rows=70 loops=1)
          Sort Key: t.a, t.b
          Presorted Key: t.a
-         Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Methods: top-N
heapsort,quicksort Memory: avg=NNkB peak=NNkB
 
+         Full-sort Groups: 1  Sort Method: quicksort  Average Memory: NNkB  Peak Memory: NNkB
+         Pre-sorted Groups: 5  Sort Methods: top-N heapsort, quicksort  Average Memory: NNkB  Peak Memory: NNkB

That's not really a "precedent" and I don't think that necessarily invalidates
your change.

-- 
Justin

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

19 June 2020, 02:02:29

On Fri, 19 Jun 2020 at 01:45, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Note that "incremental sort" is also new, and splits things up more than sort.
>
> See in particular 6a918c3ac8a6b1d8b53cead6fcb7cbd84eee5750, which splits things
> up even more.
>
>     ->  Incremental Sort (actual rows=70 loops=1)
>           Sort Key: t.a, t.b
>           Presorted Key: t.a
> -         Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Methods:
top-Nheapsort, quicksort Memory: avg=NNkB peak=NNkB
 
> +         Full-sort Groups: 1  Sort Method: quicksort  Average Memory: NNkB  Peak Memory: NNkB
> +         Pre-sorted Groups: 5  Sort Methods: top-N heapsort, quicksort  Average Memory: NNkB  Peak Memory: NNkB
>
> That's not really a "precedent" and I don't think that necessarily invalidates
> your change.

I imagine you moved "Per-sorted Groups" to a new line due to the lines
becoming too long? Or was there something else special about that
property to warrant putting it on a new line?

If it's due to the length of the line, then I don't think there are
quite enough properties for HashAgg to warrant wrapping them to
another line.

Perhaps there's some merit having something else decide when we should
wrap to a new line. e.g once we've put 4 properties on a single line
with the text format. However, it seems like we're pretty inconsistent
with the normal form of properties. Some have multiple values per
property, e.g:

if (es->format == EXPLAIN_FORMAT_TEXT)
{
ExplainIndentText(es);
appendStringInfo(es->str, "Sort Method: %s  %s: %ldkB\n",
sortMethod, spaceType, spaceUsed);
}
else
{
ExplainPropertyText("Sort Method", sortMethod, es);
ExplainPropertyInteger("Sort Space Used", "kB", spaceUsed, es);
ExplainPropertyText("Sort Space Type", spaceType, es);
}

So spaceType is a "Sort Method" in the text format, but it's "Sort
Space Type" in other formats.  It might not be easy to remove all the
special casing for the text format out of explain.c without changing
the output.


As for this patch, I don't think it's unreasonable to have the 3
possible HashAgg properties on a single line Other people might
disagree, so here's an example of what the patch changes it to:

postgres=# explain analyze select a,sum(b) from ab group by a;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=175425.12..194985.79 rows=988 width=12) (actual
time=1551.920..5281.670 rows=1000 loops=1)
   Group Key: a
   Peak Memory Usage: 97 kB Disk Usage: 139760 kB HashAgg Batches: 832
   ->  Seq Scan on ab  (cost=0.00..72197.00 rows=5005000 width=8)
(actual time=0.237..451.228 rows=5005000 loops=1)

Master currently does:

                            QUERY PLAN
---------------------------------------------------------------------
 HashAggregate (actual time=31.724..87.638 rows=1000 loops=1)
   Group Key: a
   Peak Memory Usage: 97 kB
   Disk Usage: 3928 kB
   HashAgg Batches: 798
   ->  Seq Scan on ab (actual time=0.006..9.243 rows=100000 loops=1)

David

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Justin Pryzby

Date:

19 June 2020, 02:20:01

On Fri, Jun 19, 2020 at 02:02:29PM +1200, David Rowley wrote:
> On Fri, 19 Jun 2020 at 01:45, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Note that "incremental sort" is also new, and splits things up more than sort.
> >
> > See in particular 6a918c3ac8a6b1d8b53cead6fcb7cbd84eee5750, which splits things
> > up even more.
> >
> >     ->  Incremental Sort (actual rows=70 loops=1)
> >           Sort Key: t.a, t.b
> >           Presorted Key: t.a
> > -         Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Methods:
top-Nheapsort, quicksort Memory: avg=NNkB peak=NNkB
 
> > +         Full-sort Groups: 1  Sort Method: quicksort  Average Memory: NNkB  Peak Memory: NNkB
> > +         Pre-sorted Groups: 5  Sort Methods: top-N heapsort, quicksort  Average Memory: NNkB  Peak Memory: NNkB
> >
> > That's not really a "precedent" and I don't think that necessarily invalidates
> > your change.
> 
> I imagine you moved "Per-sorted Groups" to a new line due to the lines
> becoming too long? Or was there something else special about that
> property to warrant putting it on a new line?

https://www.postgresql.org/message-id/20200419023625.GP26953@telsasoft.com
|I still think Pre-sorted groups should be on a separate line, as in 0002.
|In addition to looking better (to me), and being easier to read, another reason
|is that there are essentially key=>values here, but the keys are repeated (Sort
|Method, etc).

> postgres=# explain analyze select a,sum(b) from ab group by a;
>  HashAggregate  (cost=175425.12..194985.79 rows=988 width=12) (actual time=1551.920..5281.670 rows=1000 loops=1)
>    Group Key: a
>    Peak Memory Usage: 97 kB Disk Usage: 139760 kB HashAgg Batches: 832

Please be sure to use two spaces between each field !

See earlier discussions (and commits referenced by the Opened Items page).
https://www.postgresql.org/message-id/20200402054120.GC14618@telsasoft.com
https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com

+                       appendStringInfoChar(es->str, ' ');
+
+               appendStringInfo(es->str, "Peak Memory Usage: " INT64_FORMAT " kB", memPeakKb);
+
+               if (aggstate->hash_batches_used > 0)
+                       appendStringInfo(es->str, " Disk Usage: " UINT64_FORMAT " kB HashAgg Batches: %d",

-- 
Justin

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

19 June 2020, 03:03:41

On Fri, 19 Jun 2020 at 14:20, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Please be sure to use two spaces between each field !
>
> See earlier discussions (and commits referenced by the Opened Items page).
> https://www.postgresql.org/message-id/20200402054120.GC14618@telsasoft.com
> https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com

Thanks. I wasn't aware of that conversion.

v2 attached.

David

Attachment

hashagg_explain_fix_v2.patch

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Jeff Davis

Date:

19 June 2020, 03:28:30

On Thu, 2020-06-18 at 15:37 +1200, David Rowley wrote:
> The attached patch fixes both the missing parallel worker information
> and puts the properties on a single line for format=text.

Thank you. 

I noticed some strange results in one case where one worker had a lot
more batches than another. After I investigated, I think everything is
fine -- it seems that one worker was getting more tuples from the
underlying parallel scan. So everything looks good to me.

Regards,
    Jeff Davis

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Justin Pryzby

Date:

19 June 2020, 04:06:24

On Fri, Jun 19, 2020 at 03:03:41PM +1200, David Rowley wrote:
> On Fri, 19 Jun 2020 at 14:20, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Please be sure to use two spaces between each field !
> >
> > See earlier discussions (and commits referenced by the Opened Items page).
> > https://www.postgresql.org/message-id/20200402054120.GC14618@telsasoft.com
> > https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com
> 
> Thanks. I wasn't aware of that conversion.

To be clear, there wasn't a "conversion".  There were (and are still) different
formats (which everyone agrees isn't ideal) used by "explain(BUFFERS)" vs Sort
and Hash.  The new incr sort changed from output that looked like Buffers (one
space, and equals) to output that looked like Sort/Hashjoin (two spaces and
colons).  And the new explain(WAL) originally used a hybrid (which, on my
request, used two spaces), but it was changed (back) to use one space, for
consistency with explain(BUFFERS).

Some minor nitpicks now that we've dealt with the important issue of
whitespace:

+               bool    gotone = false;

=> Would you consider calling that "found" ?

I couldn't put my finger on it at first why this felt so important to ask, but
realized that my head was tripping over a variable whose name starts with
"goto", and spending 0.2 seconds trying to figure out what you might have meant
by "goto ne".

+               int n;
+
+               for (n = 0; n < aggstate->shared_info->num_workers; n++)

=> Maybe use a C99 declaration ?

+                               if (hash_batches_used > 0)
+                               {
+                                       ExplainPropertyInteger("Disk Usage", "kB", hash_disk_used,
+                                                                                  es);
+                                       ExplainPropertyInteger("HashAgg Batches", NULL,
+                                                                                  hash_batches_used, es);
+                               }

=> Shouldn't those *always* be shown in nontext format ?  I realize that's not
a change new to your patch, and maybe should be a separate commit.

+   size = offsetof(SharedAggInfo, sinstrument)
+       + pcxt->nworkers * sizeof(AggregateInstrumentation);

=> There's a couple places where I'd prefer to see "+" at the end of the
preceding line (but YMMV).

-- 
Justin

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

19 June 2020, 04:21:18

On Fri, 19 Jun 2020 at 16:06, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Jun 19, 2020 at 03:03:41PM +1200, David Rowley wrote:
> > On Fri, 19 Jun 2020 at 14:20, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > Please be sure to use two spaces between each field !
> > >
> > > See earlier discussions (and commits referenced by the Opened Items page).
> > > https://www.postgresql.org/message-id/20200402054120.GC14618@telsasoft.com
> > > https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com
> >
> > Thanks. I wasn't aware of that conversion.
>
> To be clear, there wasn't a "conversion".

Sorry, I meant to write "conversation".

>
> Some minor nitpicks now that we've dealt with the important issue of
> whitespace:
>
> +               bool    gotone = false;
>
> => Would you consider calling that "found" ?

I'll consider it.  I didn't really invent the name. There's plenty of
other places that use that name for the same thing.  I think of
"found" as more often used when we're looking for something and need
to record if we found it or not.  That's not really happening here. I
just want to record if we've added a property yet.

> +               int n;
> +
> +               for (n = 0; n < aggstate->shared_info->num_workers; n++)
>
> => Maybe use a C99 declaration ?

Maybe.  It'll certainly save a couple of lines.

> +                               if (hash_batches_used > 0)
> +                               {
> +                                       ExplainPropertyInteger("Disk Usage", "kB", hash_disk_used,
> +                                                                                  es);
> +                                       ExplainPropertyInteger("HashAgg Batches", NULL,
> +                                                                                  hash_batches_used, es);
> +                               }
>
> => Shouldn't those *always* be shown in nontext format ?  I realize that's not
> a change new to your patch, and maybe should be a separate commit.

Perhaps a separate commit. I don't want to overload the debate with
too many things. I'd rather push forward with just the originally
proposed change since nobody has shown any objection for it.

> +   size = offsetof(SharedAggInfo, sinstrument)
> +       + pcxt->nworkers * sizeof(AggregateInstrumentation);
>
> => There's a couple places where I'd prefer to see "+" at the end of the
> preceding line (but YMMV).

I pretty much just copied the whole of that code from nodeSort.c. I'm
more inclined to just keep it as similar to that as possible. However,
if pgindent decides otherwise, then I'll go with that. I imagine it
won't move it though as that code has already been through indentation
a few times before.

Thanks for the review.

David

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

19 June 2020, 05:27:13

On Fri, 19 Jun 2020 at 15:28, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Thu, 2020-06-18 at 15:37 +1200, David Rowley wrote:
> > The attached patch fixes both the missing parallel worker information
> > and puts the properties on a single line for format=text.
>
> Thank you.
>
> I noticed some strange results in one case where one worker had a lot
> more batches than another. After I investigated, I think everything is
> fine -- it seems that one worker was getting more tuples from the
> underlying parallel scan. So everything looks good to me.

Thanks for having a look at this. I just pushed it.

David

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Alvaro Herrera

Date:

19 June 2020, 19:14:45

On 2020-Jun-19, David Rowley wrote:


> > +   size = offsetof(SharedAggInfo, sinstrument)
> > +       + pcxt->nworkers * sizeof(AggregateInstrumentation);
> >
> > => There's a couple places where I'd prefer to see "+" at the end of the
> > preceding line (but YMMV).
> 
> I pretty much just copied the whole of that code from nodeSort.c. I'm
> more inclined to just keep it as similar to that as possible. However,
> if pgindent decides otherwise, then I'll go with that. I imagine it
> won't move it though as that code has already been through indentation
> a few times before.

pgindent won't change your choice here.  This seems an ideological
issue; each committer has got their style.  Some people prefer to put
operators at the end of the line, others do it this way.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

Justin Pryzby

Date:

08 July 2020, 16:30:21

On Fri, Jun 19, 2020 at 02:02:29PM +1200, David Rowley wrote:
> if (es->format == EXPLAIN_FORMAT_TEXT)
> {
> ExplainIndentText(es);
> appendStringInfo(es->str, "Sort Method: %s  %s: %ldkB\n",
...

> As for this patch, I don't think it's unreasonable to have the 3
> possible HashAgg properties on a single line Other people might
> disagree, so here's an example of what the patch changes it to:
> 
> postgres=# explain analyze select a,sum(b) from ab group by a;
>                                                      QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=175425.12..194985.79 rows=988 width=12) (actual
> time=1551.920..5281.670 rows=1000 loops=1)
>    Group Key: a
>    Peak Memory Usage: 97 kB Disk Usage: 139760 kB HashAgg Batches: 832
>    ->  Seq Scan on ab  (cost=0.00..72197.00 rows=5005000 width=8) (actual time=0.237..451.228 rows=5005000 loops=1)

I've just noticed another inconsistency:
For "Sort", there's no space before "kB", but your patch (9bdb300d) uses a
space for text mode.

+               appendStringInfo(es->str, "Peak Memory Usage: " INT64_FORMAT " kB",
+                                                memPeakKb);
+
+               if (aggstate->hash_batches_used > 0)
+                       appendStringInfo(es->str, "  Disk Usage: " UINT64_FORMAT " kB  HashAgg Batches: %d",

...

+                               appendStringInfo(es->str, "Peak Memory Usage: " INT64_FORMAT " kB",
+                                                                memPeakKb);
+
+                               if (hash_batches_used > 0)
+                                       appendStringInfo(es->str, "  Disk Usage: " UINT64_FORMAT " kB  HashAgg Batches:
%d",

Hopefully there's the final whitespace inconsistency for this release.

-- 
Justin

Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans

From

David Rowley

Date:

08 July 2020, 22:08:05

On Thu, 9 Jul 2020 at 04:30, Justin Pryzby <pryzby@telsasoft.com> wrote:
> I've just noticed another inconsistency:
> For "Sort", there's no space before "kB", but your patch (9bdb300d) uses a
> space for text mode.

Thanks for the report. I just pushed a fix for that.

David