Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com
Date
Msg-id CAA4eK1J7XJ6Cw2jvgNTH1vkeTxGOWtuk+Y2r2P1H3Tn33cZ_DQ@mail.gmail.com
In response to Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Nov 29, 2017 at 2:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Nov 28, 2017 at 9:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Nov 28, 2017 at 2:23 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
>>> 14870 as you have seen in other cases.  The problem is that during
>>> rescan, we generally reinitialize the required state, but we forgot to
>>> reinitialize the instrumentation related memory which is used in the
>>> accumulation of stats, so changing that would fix some part of this
>>> problem which is that at Parallel node, you won't see wrong values.
>>> However, we also need to ensure that the per-worker details also get
>>> accumulated across rescans.  Attached patch should fix the problem you
>>> are seeing.  I think this needs some more analysis and testing to see
>>> if everything works in the desired way.
>>>
>>> Is it possible for you to test the attached patch and see if you are
>>> still seeing any unexpected values?
>>
>> FWIW, this looks sensible to me.  Not sure if there's any good way to
>> write a regression test for it.
>>
>
> I think so, but not 100% sure.  I will give it a try and report back.
>

The attached patch contains a regression test as well.  Note that I have
carefully disabled all variable stats by using (analyze, timing off,
summary off, costs off) and then selected a parallel sequential scan on
the right side of the join, so that nloops and rows are the only stats
left in the output, and those should remain constant.
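
To make the expected arithmetic concrete, here is a toy model of the
accumulation discussed above.  It is only a sketch and not the
PostgreSQL code: the names total_rows, shared and total are
hypothetical, and the single counter merely stands in for the shared
instrumentation memory.  It assumes 5 scans of 2974 rows each, as in
the quoted numbers.

```python
ROWS_PER_SCAN = 2974  # rows returned by each scan (figure from the thread)
RESCANS = 5           # number of times the node is executed


def total_rows(reset_on_rescan: bool) -> int:
    """Return the row total a leader would report in this toy model.

    `shared` is a hypothetical stand-in for the shared instrumentation
    counter that workers write to; `total` is the leader's running sum.
    """
    shared = 0
    total = 0
    for _ in range(RESCANS):
        if reset_on_rescan:
            shared = 0            # reinitialize the shared counter on rescan
        shared += ROWS_PER_SCAN   # workers report the rows of this scan
        total += shared           # leader folds the shared counter into its total
    return total


print(total_rows(reset_on_rescan=True))   # 14870, i.e. 2974 * 5
print(total_rows(reset_on_rescan=False))  # inflated: stale counts are re-added
```

In this model, skipping the reset makes every scan re-add the counts
of the earlier scans, so the total no longer equals 2974 * 5.  The
real code paths may fail differently; the point is only why the state
has to be reinitialized before it is accumulated again.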

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
