Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans - Mailing list pgsql-hackers

From David Rowley
Subject Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans
Date
Msg-id CAApHDvpsd7Un=ZGdLPDA4QFEEdtXSoiwTRG8Y3zr5JEFpMfhOA@mail.gmail.com
In response to Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans
Re: Missing HashAgg EXPLAIN ANALYZE details for parallel plans
List pgsql-hackers
On Fri, 19 Jun 2020 at 01:45, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Note that "incremental sort" is also new, and splits things up more than sort.
>
> See in particular 6a918c3ac8a6b1d8b53cead6fcb7cbd84eee5750, which splits things
> up even more.
>
>     ->  Incremental Sort (actual rows=70 loops=1)
>           Sort Key: t.a, t.b
>           Presorted Key: t.a
> -         Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Methods: top-N heapsort, quicksort Memory: avg=NNkB peak=NNkB
> +         Full-sort Groups: 1  Sort Method: quicksort  Average Memory: NNkB  Peak Memory: NNkB
> +         Pre-sorted Groups: 5  Sort Methods: top-N heapsort, quicksort  Average Memory: NNkB  Peak Memory: NNkB
>
> That's not really a "precedent" and I don't think that necessarily invalidates
> your change.

I imagine you moved "Pre-sorted Groups" to a new line due to the lines
becoming too long? Or was there something else special about that
property to warrant putting it on a new line?

If it's due to the length of the line, then I don't think there are
quite enough properties for HashAgg to warrant wrapping them to
another line.

Perhaps there's some merit in having something else decide when we
should wrap to a new line, e.g. once we've put 4 properties on a single
line in the text format. However, it seems like we're pretty
inconsistent with the normal form of properties. Some have multiple
values per property, e.g:

    if (es->format == EXPLAIN_FORMAT_TEXT)
    {
        ExplainIndentText(es);
        appendStringInfo(es->str, "Sort Method: %s  %s: %ldkB\n",
                         sortMethod, spaceType, spaceUsed);
    }
    else
    {
        ExplainPropertyText("Sort Method", sortMethod, es);
        ExplainPropertyInteger("Sort Space Used", "kB", spaceUsed, es);
        ExplainPropertyText("Sort Space Type", spaceType, es);
    }

So in the text format spaceType is used as a label on the "Sort Method"
line, but in the other formats it's the value of "Sort Space Type".  It
might not be easy to remove all the special casing for the text format
out of explain.c without changing the output.
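
To make the wrapping idea mentioned above a bit more concrete, here is a
rough standalone sketch of a helper that decides when to start a new line
once a fixed number of properties has been written. Everything in it
(PropLine, propline_init, propline_add, propline_finish) is hypothetical
and does not exist in explain.c; it writes into a plain char buffer rather
than a StringInfo so that it compiles on its own. The two-space separator
is an assumption modelled on the "Sort Method: %s  %s: %ldkB" format
string quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of explain.c: collects "Label: value"
 * pairs and starts a new line once per_line pairs are on the current one.
 */
typedef struct PropLine
{
    char       *buf;        /* caller-supplied output buffer */
    size_t      size;       /* total size of buf in bytes */
    size_t      len;        /* bytes used so far, excluding the NUL */
    int         indent;     /* spaces emitted at the start of each line */
    int         per_line;   /* wrap after this many properties */
    int         on_line;    /* properties already on the current line */
} PropLine;

/* Append s to the buffer, truncating rather than overflowing. */
static void
pl_put(PropLine *pl, const char *s)
{
    size_t      avail = pl->size - pl->len - 1;
    size_t      n = strlen(s);

    if (n > avail)
        n = avail;
    memcpy(pl->buf + pl->len, s, n);
    pl->len += n;
    pl->buf[pl->len] = '\0';
}

static void
propline_init(PropLine *pl, char *buf, size_t size, int indent, int per_line)
{
    assert(size > 0 && per_line > 0 && indent >= 0);
    pl->buf = buf;
    pl->size = size;
    pl->len = 0;
    pl->indent = indent;
    pl->per_line = per_line;
    pl->on_line = 0;
    buf[0] = '\0';
}

/* Add one property, wrapping first if the current line is already full. */
static void
propline_add(PropLine *pl, const char *label, const char *value)
{
    if (pl->on_line == pl->per_line)
    {
        pl_put(pl, "\n");
        pl->on_line = 0;
    }

    if (pl->on_line == 0)
    {
        for (int i = 0; i < pl->indent; i++)
            pl_put(pl, " ");
    }
    else
        pl_put(pl, "  ");       /* assumed separator between properties */

    pl_put(pl, label);
    pl_put(pl, ": ");
    pl_put(pl, value);
    pl->on_line++;
}

/* Terminate the last line, if anything was written to it. */
static void
propline_finish(PropLine *pl)
{
    if (pl->on_line > 0)
        pl_put(pl, "\n");
}
```

With a limit of 4 per line, the three HashAgg properties (Peak Memory
Usage, Disk Usage, HashAgg Batches) would all stay on one line, while a
node with five or more properties would wrap after the fourth. This only
covers where to break lines; it does nothing about the label differences
between the text format and the other formats.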


As for this patch, I don't think it's unreasonable to have the 3
possible HashAgg properties on a single line. Other people might
disagree, so here's an example of what the patch changes it to:

postgres=# explain analyze select a,sum(b) from ab group by a;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=175425.12..194985.79 rows=988 width=12) (actual time=1551.920..5281.670 rows=1000 loops=1)
   Group Key: a
   Peak Memory Usage: 97 kB Disk Usage: 139760 kB HashAgg Batches: 832
   ->  Seq Scan on ab  (cost=0.00..72197.00 rows=5005000 width=8) (actual time=0.237..451.228 rows=5005000 loops=1)

Master currently does:

                            QUERY PLAN
---------------------------------------------------------------------
 HashAggregate (actual time=31.724..87.638 rows=1000 loops=1)
   Group Key: a
   Peak Memory Usage: 97 kB
   Disk Usage: 3928 kB
   HashAgg Batches: 798
   ->  Seq Scan on ab (actual time=0.006..9.243 rows=100000 loops=1)

David


