Re: EXPLAIN of Parallel Append - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: EXPLAIN of Parallel Append
Date
Msg-id CAA4eK1+0g8FYr+b-XmBuxAariK79mo9tM7_MiW0aHfEh03esGA@mail.gmail.com
In response to Re: EXPLAIN of Parallel Append  (Jesper Pedersen <jesper.pedersen@redhat.com>)
List pgsql-hackers
On Mon, Jul 9, 2018 at 7:00 PM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:
> On 07/07/2018 01:08 AM, Amit Kapila wrote:
>>
>> On Wed, Mar 14, 2018 at 8:58 PM, Jesper Pedersen
>>>
>>> Parallel Append's ntuples is 1, but given nloops is 3 you end up with the
>>> slightly confusing "(actual ... *rows=0* loops=3)".
>>>
>>
>> The number of rows displayed is total_rows / loops due to which you
>> are seeing these numbers.  This behavior is the same for all parallel
>> nodes, nothing specific to Parallel Append.
>>
>
> Thanks !
>
> Maybe something like the attached patch for the documentation is needed.
>

-    performance characteristics of the plan.
+    performance characteristics of the plan. Note, that the parallel nodes
+    may report zero rows returned due internal calculations when one or more
+    rows are actually being returned.

Typo: s/due/due to/

I think it is quite unclear what you mean by internal calculations.
If you can come up with something similar to how we have already
explained a similar thing for Nested Loop joins [1], that would be
great.  You can add something like what you have written at the end of
the paragraph, after explaining the actual calculation.  This has been
quite a common confusion ever since parallel query was developed; if you
can write a nice example and text, it would be really helpful.


[1] -
https://www.postgresql.org/docs/devel/static/using-explain.html#USING-EXPLAIN-ANALYZE

Refer to the text below on that link:
"In some query plans, it is possible for a subplan node to be executed
more than once. For example, the inner index scan will be executed
once per outer row in the above nested-loop plan. In such cases, the
loops value reports the total number of executions of the node, and
the actual time and rows values shown are averages per-execution. This
is done to make the numbers comparable with the way that the cost
estimates are shown. Multiply by the loops value to get the total time
actually spent in the node."
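
To illustrate the arithmetic for the case reported upthread (1 row in
total, loops=3), here is a minimal sketch in Python.  It assumes the
displayed value is simply total_rows / loops printed with no decimal
places; the function name displayed_rows is made up for illustration
and this is not the actual EXPLAIN code.

```python
def displayed_rows(total_rows: float, loops: int) -> str:
    """Return the per-loop row average as a string with no decimal places.

    Assumption: the "rows" figure is total_rows / loops, formatted
    without a fractional part. Hypothetical helper, not PostgreSQL code.
    """
    if loops <= 0:
        # A node that was never executed has nothing to average.
        return "0"
    return format(total_rows / loops, ".0f")


# One row produced in total across three loops: 1 / 3 = 0.33..., shown as 0.
print("rows=" + displayed_rows(1, 3) + " loops=3")

# Nine rows across three loops average to exactly 3 per loop.
print("rows=" + displayed_rows(9, 3) + " loops=3")
```

With these assumptions, a node that returned one row overall is shown
as rows=0 when loops=3, which is the confusing output discussed above.
Multiplying the displayed value back by loops only recovers the total
approximately, because the average has already been rounded.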

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

