Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com
Date
Msg-id CAA4eK1LxyBgkFYxtoW-ZTaC_jkk6Ck8UB+3XHHT1a4EnZ52Bcg@mail.gmail.com
Whole thread Raw
In response to Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (hubert depesz lubaczewski <depesz@depesz.com>)
Responses Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (hubert depesz lubaczewski <depesz@depesz.com>)
Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Nov 27, 2017 at 11:26 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> On Mon, Nov 27, 2017 at 05:24:49PM +0530, Amit Kapila wrote:
>> I think it is "actual_time * 1" for anything below Gather.
>
> Well, I think I found another problem.
>
> Please take a look at:
> https://explain.depesz.com/s/Bs8c
>
> Node 15 is Gather with loops = 2974 (because it was ran by nested loop).
>
> There were 4 workers, so 5 threads working, but the loops on parallel
> seq scan is 17.698 million ?!
>

The number of loops displayed on parallel seq scan seems to be wrong, see below.

> The problem is that for explain.depesz.com I'm calculating how much time
> pg actually spent doing given thing.
>
> So, if an operation has actual time of 1ms, but 100 loops, it took
> 100ms.
>
> In case of parallel operations it looks like I can't realistically find
> the information anywhere?
>
> In the explain above, we see that Nested loop took, in total, around
> 2 million miliseconds (step 9).
> Out of which, ~ 7000 ms was hash join and it's subnodes.
> This leaves most of the time for Gather step, which was called 2974
> times with actual time reported as 618ms.
> 2974 * 618 is 1837932 miliseconds, which seems reasonable.
>
> But then - how much of this time was spent in Parallel Seq Scan in step
> #16 ?
>
> 2974 * 609ms?
>

Yeah.

>Why is loops there 17 million?
>

That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
14870 as you have seen in other cases.  The problem is that during
rescan, we generally reinitialize the required state, but we forgot to
reinitialize the instrumentation related memory which is used in the
accumulation of stats, so changing that would fix some part of this
problem which is that at Parallel node, you won't see wrong values.
However, we also need to ensure that the per-worker details also get
accumulated across rescans.  Attached patch should fix the problem you
are seeing.  I think this needs some more analysis and testing to see
if everything works in the desired way.

Is it possible for you to test the attached patch and see if you are
still seeing any unexpected values?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Rajkumar Raghuwanshi
Date:
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Next
From: Antonin Houska
Date:
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers