Re: Open Item: Should non-text EXPLAIN always show properties? - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Open Item: Should non-text EXPLAIN always show properties?
Date
Msg-id 20200723141454.GF4286@telsasoft.com
Whole thread Raw
In response to Re: Open Item: Should non-text EXPLAIN always show properties?  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Thu, Jun 25, 2020 at 08:41:43AM -0400, James Coleman wrote:
> On Thu, Jun 25, 2020 at 5:15 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > Over on [1] Justin mentions that the non-text EXPLAIN ANALYZE should
> > always show the "Disk Usage" and "HashAgg Batches" properties.  I
> > agree with this. show_wal_usage() is a good example of how we normally
> > do things.  We try to keep the text format as humanly readable as
> > possible but don't really expect humans to be commonly reading the
> > other supported formats, so we care less about including additional
> > details there.
> >
> > There's also an open item regarding this for Incremental Sort, so I've
> > CC'd James and Tomas here. This seems like a good place to discuss
> > both.
> 
> Yesterday I'd replied [1] to Justin's proposal for this WRT
> incremental sort and expressed my opinion that including both
> unnecessarily (i.e., including disk when an in-memory sort was used)
> is undesirable and confusing and leads to shortcuts I believe to be
> bad habits when using the data programmatically.

I have gone back and forth about this.

The current non-text output for Incremental Sort is like:

       Sort Methods Used:            +
         - "quicksort"               +
       Sort Space Memory:            +
         Average Sort Space Used: 26 +
         Peak Sort Space Used: 26    +

explain.c determines whether to output in non-text mode by checking:
| if (groupInfo->maxDiskSpaceUsed > 0)

Which I think is per se wrong.  Either it should use a test like:
| if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT != 0)
or it should output the "Sort Space" unconditionally.

It does seem wrong if Incr Sort says "Sort Space Disk / Average: 0, Peak: 0"
when there was no disk sort at all, and it wasn't listed as a "Sort Method".

On the other hand, that's determined during execution, right?  (Based on things
like table content and table order and tuple width) ?  David's argument in
making the HashAgg's explain output unconditionally show Disk/Batch was that
this is not known until execution time (based on table content).

HashAgg now shows:

SET work_mem='64 MB'; explain(format yaml, analyze) SELECT a ,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
...
     Disk Usage: 0                       +
     HashAgg Batches: 0                  +

So I think I still think incr sort should do the same, showing Disk:0.

Otherwise, I think it should at least use a test like this, rather than (DiskSpaceUsed > 0):
| if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT != 0)

-- 
Justin



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Building 12.3 from source on Mac
Next
From: Paul Förster
Date:
Subject: Re: Building 12.3 from source on Mac