On Thu, Jul 07, 2022 at 04:21:37PM -0400, Robert Haas wrote:
> I mean, what I really want here if I'm honest is to not have the
> system divide the number of rows by the loop count. And it sort of
> sounds like maybe that's what you want, too. You want to know whether
> the loop count is actually zero, not whether it's close to zero when
> you divide it by some number that might be gigantic.
...
> involves a dozen or two different nested loops, and if we didn't
> insist on dividing the time by the loop count, it would be MUCH EASIER
> to figure out whether the time spent in the Index Scan is a
> significant percentage of the total time or not.
I think the guiding principle for what to do should be to reduce how much
needs to be explained about how to interpret what EXPLAIN is showing...
The docs say this:
| In such cases, the loops value reports the total number of executions of the
| node, and the actual time and rows values shown are averages per-execution.
| This is done to make the numbers comparable with the way that the cost
| estimates are shown. Multiply by the loops value to get the total time
| actually spent in the node.
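
To illustrate why "multiply by the loops value" does not always recover the
total, here is a minimal Python sketch. It assumes the per-loop row count is
displayed rounded to a whole number and the per-loop time to three decimals;
the function names and the sample numbers are made up for illustration.

```python
def shown_per_loop(total_rows, total_ms, loops):
    """Return (rows, time_ms) as per-loop averages, rounded for display.

    Assumption: rows are shown as a whole number and time with three
    decimal places.
    """
    rows = round(total_rows / loops)
    time_ms = round(total_ms / loops, 3)
    return rows, time_ms


def reconstructed_totals(rows, time_ms, loops):
    """Follow the docs' advice: multiply the displayed values by loops."""
    return rows * loops, time_ms * loops


# Hypothetical node: 3 rows and 40 ms in total, spread over 100000 loops.
rows, time_ms = shown_per_loop(total_rows=3, total_ms=40.0, loops=100000)
print(rows, time_ms)
print(reconstructed_totals(rows, time_ms, 100000))
```

With these inputs both displayed averages round to zero, so multiplying by
loops gives back zero rather than the real totals.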
On Thu, Jul 07, 2022 at 01:45:19PM -0700, Peter Geoghegan wrote:
> Plus you could probably
> make some kind of concession in the direction of maintaining
> compatibility with the current approach if you had to. Right?
The minimum would be to show the information in a way that makes it clear that
it's "new style" output showing a total and not an average, so that a person
who sees it knows how to interpret it (same for the web "explain tools").
A concession would be to show the current information *plus* total/raw values.
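
As a rough sketch of what "current plus total" could look like, the Python
below formats both the per-loop average and the raw total for one node. The
label layout is purely hypothetical and is not a proposal for the actual
EXPLAIN syntax; `format_both` and its parameters are names invented here.

```python
def format_both(total_rows, total_ms, loops):
    """Format per-loop averages alongside raw totals.

    Hypothetical layout, for illustration only: averages first (as today),
    followed by the unrounded totals in parentheses.
    """
    avg_rows = total_rows / loops
    avg_ms = total_ms / loops
    return (
        f"rows={avg_rows:.0f} (total {total_rows}) "
        f"time={avg_ms:.3f} ms (total {total_ms:.3f} ms) "
        f"loops={loops}"
    )


# Hypothetical node: 3 rows and 40 ms in total, spread over 100000 loops.
print(format_both(total_rows=3, total_ms=40.0, loops=100000))
```

The averages still read as zero, but the totals next to them make it plain
that the node did return rows and did take time.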
This thread is about how to display the existing values. But note that there's
a CF entry for also collecting more values to show things like min/max rows per
loop.
https://commitfest.postgresql.org/38/2765/
Add extra statistics to explain for Nested Loop
--
Justin