Thread: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com

Hi,

Up to the introduction of parallel execution, when a node in explain
analyze showed "loops=x" with x greater than 1, it meant that the "actual
time" had to be multiplied by loops to get the real time spent in the node.

For example, check step 13 in https://explain.depesz.com/s/gNBd

It shows a time of 3ms, but 1873 loops, so the real time spent is ~ 5600ms.

But with parallel execution this seems to no longer be the case.

For example:
https://explain.depesz.com/s/LTMp
or
https://explain.depesz.com/s/QHRi

It looks like the actual time is really the actual time, and loops is
"number of workers + 1".

Is that really the case? Should I, for explain.depesz.com, use "loops=1"
when calculating exclusive/inclusive time for Partial* and Parallel*
nodes? Always? For some other nodes as well?

Or am I missing something here?

Best regards,

depesz


On Fri, Nov 24, 2017 at 4:51 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> Hi,
>
> up to parallel executions, when we had node in explain analyze showing
> "loops=x" with x more than 1, it meant that the "actual time" had to be
> multiplied by loops to get real time spent in a node.
>
> For example, check step 13 in https://explain.depesz.com/s/gNBd
>
> It shows time of 3ms, but loops of 1873, so the actual time is ~ 5600ms.
>
> But with parallel execution it seems to be no longer the case.
>
> For example:
> https://explain.depesz.com/s/LTMp
> or
> https://explain.depesz.com/s/QHRi
>
> It looks that the actual time is really actual time, and loops is
> "worker nodes + 1".
>
> Is that really the case?
>

I think so.

> Should I, for explain.depesz.com, when dealing
> with partial* and parallel* nodes, use "loops=1" for calculation of
> exclusive/inclusive time? always? some other nodes?
>

I am not sure what exactly inclusive or exclusive means here, but for
parallel nodes, total stats are accumulated across participants, so you
are seeing loops as 'number of workers + 1'.  Now, as the workers
presumably run in parallel, I think the actual time will be what is shown
in the node, not actual time * nloops.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Sat, Nov 25, 2017 at 7:08 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> It looks that the actual time is really actual time, and loops is
>> "worker nodes + 1".
>>
>> Is that really the case?
>>
>
> I think so.
>

To add to what Amit has explained: the + 1 is for the leader (the
backend running the whole query), which also runs the parallel plan if
it gets time between fetching rows from the workers.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


On Sat, Nov 25, 2017 at 07:08:03AM +0530, Amit Kapila wrote:
> > For example, check step 13 in https://explain.depesz.com/s/gNBd
> >
> > It shows time of 3ms, but loops of 1873, so the actual time is ~ 5600ms.
> >
> > But with parallel execution it seems to be no longer the case.
> >
> > For example:
> > https://explain.depesz.com/s/LTMp
> > or
> > https://explain.depesz.com/s/QHRi
> > Should I, for explain.depesz.com, when dealing
> > with partial* and parallel* nodes, use "loops=1" for calculation of
> > exclusive/inclusive time? always? some other nodes?
> >
> 
> I am not sure what exactly inclusive or exclusive means, but for
> parallel nodes, total stats are accumulated so you are seeing loops as
> 'worker nodes + 1'.  Now, as presumably workers run parallelly, so I
> think the actual time will be what will be shown in the node not
> actual time * nloops.

Please check the plans:
https://explain.depesz.com/s/gNBd (step 13)
and https://explain.depesz.com/s/LTMp (step 3)

Inclusive time is basically "loops * actual time", so for the Index Scan,
which had 1873 loops and an actual time of 3.002..3.016, we got 1873
* 3.016 = 5648.968ms.

In the case of parallel workers, it looks like the inclusive time is
basically just the upper value of the actual time.

The question now is: how can I tell which nodes should use "actual_time
* 1" and which should use "actual_time * loops"?

Anything "below" a "Gather"?
Anything starting with "Partial"?
Anything starting with "Parallel"?
Anything with "Worker" in the node "description" in explain?

depesz


On Mon, Nov 27, 2017 at 2:45 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> On Sat, Nov 25, 2017 at 07:08:03AM +0530, Amit Kapila wrote:
>> > For example, check step 13 in https://explain.depesz.com/s/gNBd
>> >
>> > It shows time of 3ms, but loops of 1873, so the actual time is ~ 5600ms.
>> >
>> > But with parallel execution it seems to be no longer the case.
>> >
>> > For example:
>> > https://explain.depesz.com/s/LTMp
>> > or
>> > https://explain.depesz.com/s/QHRi
>> > Should I, for explain.depesz.com, when dealing
>> > with partial* and parallel* nodes, use "loops=1" for calculation of
>> > exclusive/inclusive time? always? some other nodes?
>> >
>>
>> I am not sure what exactly inclusive or exclusive means, but for
>> parallel nodes, total stats are accumulated so you are seeing loops as
>> 'worker nodes + 1'.  Now, as presumably workers run parallelly, so I
>> think the actual time will be what will be shown in the node not
>> actual time * nloops.
>
> Please check the plans:
> https://explain.depesz.com/s/gNBd (step 13)
> and https://explain.depesz.com/s/LTMp (step 3)
>
> Inclusive time is basically "loops * actual time", so for Index Scan,
> which had 1873 loops and actual time of 3.002..3.016, we got 1873
> * 3.016 = 5648.968ms.
>
> In case of parallel workers it looks like the inclusive time is
> basically the upper value from actual time.
>
> The question now is: how can I tell which nodes should use "actual_time
> * 1" and which "actual_time * loops" time?
>
> Anything "below" "Gather"?
>

I think it is "actual_time * 1" for anything below Gather.
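
To make this concrete, here is a minimal sketch of that heuristic in C
(purely illustrative; this is neither explain.depesz.com's nor PostgreSQL's
code, and the PlanNode struct, its fields, and the under_gather flag are
assumptions invented for the example):

#include <stdbool.h>
#include <stdio.h>

typedef struct PlanNode
{
    double actual_total_ms;  /* upper bound of "actual time=a..b" */
    double nloops;           /* the "loops=" value printed by EXPLAIN */
    bool   under_gather;     /* does this node run below a Gather? */
} PlanNode;

static double
inclusive_time_ms(const PlanNode *node)
{
    /*
     * Below Gather, the per-loop actual time already reflects work spread
     * across the leader and the workers running concurrently, so treat it
     * as a single loop instead of multiplying by nloops.
     */
    if (node->under_gather)
        return node->actual_total_ms;

    /* Classic rule: per-loop actual time multiplied by the number of loops. */
    return node->actual_total_ms * node->nloops;
}

int
main(void)
{
    PlanNode index_scan = {3.016, 1873, false};  /* step 13 of plan gNBd */
    PlanNode below_gather = {100.0, 5, true};    /* hypothetical, made-up numbers */

    printf("index scan inclusive time:   %.3f ms\n", inclusive_time_ms(&index_scan));
    printf("below-Gather inclusive time: %.3f ms\n", inclusive_time_ms(&below_gather));
    return 0;
}

For the Index Scan discussed above this reproduces the ~5649 ms figure,
while a node below a Gather simply keeps its reported per-loop time.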

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Mon, Nov 27, 2017 at 05:24:49PM +0530, Amit Kapila wrote:
> I think it is "actual_time * 1" for anything below Gather.

Well, I think I found another problem.

Please take a look at:
https://explain.depesz.com/s/Bs8c

Node 15 is a Gather with loops = 2974 (because it was run by a nested loop).

There were 4 workers, so 5 processes working, but loops on the parallel
seq scan is 17.698 million?!

The problem is that for explain.depesz.com I'm calculating how much time
Pg actually spent doing a given thing.

So, if an operation has an actual time of 1ms, but 100 loops, it took
100ms.

In the case of parallel operations, it looks like I can't realistically
find that information anywhere?

In the explain above, we see that the Nested Loop took, in total, around
2 million milliseconds (step 9).
Of that, ~ 7000 ms was the hash join and its subnodes.
This leaves most of the time for the Gather step, which was called 2974
times with an actual time reported as 618ms.
2974 * 618 is 1837932 milliseconds, which seems reasonable.

But then - how much of this time was spent in the Parallel Seq Scan in
step #16?

2974 * 609ms? And why is loops there 17 million?

Best regards,

depesz



On Mon, Nov 27, 2017 at 11:26 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> On Mon, Nov 27, 2017 at 05:24:49PM +0530, Amit Kapila wrote:
>> I think it is "actual_time * 1" for anything below Gather.
>
> Well, I think I found another problem.
>
> Please take a look at:
> https://explain.depesz.com/s/Bs8c
>
> Node 15 is Gather with loops = 2974 (because it was ran by nested loop).
>
> There were 4 workers, so 5 threads working, but the loops on parallel
> seq scan is 17.698 million ?!
>

The number of loops displayed on parallel seq scan seems to be wrong, see below.

> The problem is that for explain.depesz.com I'm calculating how much time
> pg actually spent doing given thing.
>
> So, if an operation has actual time of 1ms, but 100 loops, it took
> 100ms.
>
> In case of parallel operations it looks like I can't realistically find
> the information anywhere?
>
> In the explain above, we see that Nested loop took, in total, around
> 2 million miliseconds (step 9).
> Out of which, ~ 7000 ms was hash join and it's subnodes.
> This leaves most of the time for Gather step, which was called 2974
> times with actual time reported as 618ms.
> 2974 * 618 is 1837932 miliseconds, which seems reasonable.
>
> But then - how much of this time was spent in Parallel Seq Scan in step
> #16 ?
>
> 2974 * 609ms?
>

Yeah.

>Why is loops there 17 million?
>

That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
14870, as you have seen in other cases.  The problem is that during a
rescan we generally reinitialize the required state, but we forgot to
reinitialize the instrumentation-related memory which is used in the
accumulation of stats.  Changing that fixes one part of the problem:
you won't see wrong values at the Parallel node anymore.  However, we
also need to ensure that the per-worker details get accumulated across
rescans.  The attached patch should fix the problem you are seeing.  I
think this needs some more analysis and testing to see if everything
works in the desired way.

Is it possible for you to test the attached patch and see if you are
still seeing any unexpected values?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tue, Nov 28, 2017 at 12:53:41PM +0530, Amit Kapila wrote:
> Is it possible for you to test the attached patch and see if you are
> still seeing any unexpected values?

Well, not really.  The explains I had were posted by people on
explain.depesz.com, so I have neither the original queries nor their datasets.

Best regards,

depesz



On Mon, Nov 27, 2017 at 6:54 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Anything "below" "Gather"?
>>
> I think it is "actual_time * 1" for anything below Gather.

The actual time amounts below Gather show total elapsed time divided
by loop count, just as they do anywhere else in the plan.  The loop
count tracks the number of times the plan was executed, just as it
does anywhere else in the plan.  So for example if there are 5
participants which each execute the node once, and the times spent are
5 s, 60 s, 60 s, 60 s, and 60 s, you'll get actual time = 49 s, loops
= 5.

If you want to know the total amount of time spent under the node,
then you have to multiply the actual time (49 s in this example) by
the loop count (5 in this example) just as you would for any other
plan node.  However, you have to keep in mind that, for a parallel
query, the total time spent under the node is not the same as the
elapsed time.  In this example, if all 5 workers started at the same
time and ran the node continuously without a break, the *elapsed* time
until they all finished would be 60 s, not 49 s, a value that EXPLAIN
will report nowhere.  But they can also start and stop executing under
that node repeatedly and they need not all start at the same time,
making the concept of elapsed time a bit unclear -- earliest start to
latest finish would be one way, but that will perhaps be misleading
if they spend 5 seconds apiece but start staggered at 4-second
intervals.
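
To spell out the arithmetic in that example:

    total time under the node = 5 s + 60 s + 60 s + 60 s + 60 s = 245 s
    actual time (per loop)    = 245 s / 5 loops = 49 s
    elapsed wall-clock time   = 60 s (if all five start together and run
                                without a break), which EXPLAIN reports
                                nowhere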

It's really hard to understand what's actually going on with a
parallel query unless you look at each worker individually, and even
then sometimes it's not as clear as it could be.  I'm not sure what to
do about that.  Elsewhere, trying to show the leader's information
separately when VERBOSE is used has been discussed, and I think that's
a good idea, but it may not be enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Nov 28, 2017 at 2:23 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
> 14870 as you have seen in other cases.  The problem is that during
> rescan, we generally reinitialize the required state, but we forgot to
> reinitialize the instrumentation related memory which is used in the
> accumulation of stats, so changing that would fix some part of this
> problem which is that at Parallel node, you won't see wrong values.
> However, we also need to ensure that the per-worker details also get
> accumulated across rescans.  Attached patch should fix the problem you
> are seeing.  I think this needs some more analysis and testing to see
> if everything works in the desired way.
>
> Is it possible for you to test the attached patch and see if you are
> still seeing any unexpected values?

FWIW, this looks sensible to me.  Not sure if there's any good way to
write a regression test for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Nov 28, 2017 at 9:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 28, 2017 at 2:23 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
>> 14870 as you have seen in other cases.  The problem is that during
>> rescan, we generally reinitialize the required state, but we forgot to
>> reinitialize the instrumentation related memory which is used in the
>> accumulation of stats, so changing that would fix some part of this
>> problem which is that at Parallel node, you won't see wrong values.
>> However, we also need to ensure that the per-worker details also get
>> accumulated across rescans.  Attached patch should fix the problem you
>> are seeing.  I think this needs some more analysis and testing to see
>> if everything works in the desired way.
>>
>> Is it possible for you to test the attached patch and see if you are
>> still seeing any unexpected values?
>
> FWIW, this looks sensible to me.  Not sure if there's any good way to
> write a regression test for it.
>

I think so, but not 100% sure.  I will give it a try and report back.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Tue, Nov 28, 2017 at 9:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 27, 2017 at 6:54 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Anything "below" "Gather"?
>>>
>> I think it is "actual_time * 1" for anything below Gather.
>
> The actual time amounts below Gather show total elapsed time divided
> by loop count, just as they do anywhere else in the plan.  The loop
> count tracks the number of times the plan was executed, just as it
> does anywhere else in the plan.  So for example if there are 5
> participants which each execute the node once, and the times spent are
> 5 s, 60 s, 60 s, 60 s, and 60 s, you'll get actual time = 49 s, loops
> = 5.
>
> If you want to know the total amount of time spent under the node,
> then you have to multiply the actual time (49 s in this example) by
> the loop count (5 in this example) just as you would for any other
> plan node.  However, you have to keep in mind that, for a parallel
> query, the total time spent under the node is not the same as the
> elapsed time.  In this example, if all 5 workers started at the same
> time and ran the node continuously without a break, the *elapsed* time
> until they all finished would be 60 s, not 49 s, a value that EXPLAIN
> will report nowhere.  But they can also start and stop executing under
> that node repeatedly and they need not all start at the same time,
> making the concept of elapsed time a bit unclear -- earliest start to
> latest finish would be one way, but that will be misleading (perhaps)
> if they spend 5 seconds a piece but start staggered at 4 second
> intervals.
>
> It's really hard to understand what's actually going on with a
> parallel query unless you look at each worker individually, and even
> then sometimes it's not as clear as it could be.
>

I think the stats at the Gather node itself give a reasonably accurate picture.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Wed, Nov 29, 2017 at 2:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Nov 28, 2017 at 9:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Nov 28, 2017 at 2:23 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
>>> 14870 as you have seen in other cases.  The problem is that during
>>> rescan, we generally reinitialize the required state, but we forgot to
>>> reinitialize the instrumentation related memory which is used in the
>>> accumulation of stats, so changing that would fix some part of this
>>> problem which is that at Parallel node, you won't see wrong values.
>>> However, we also need to ensure that the per-worker details also get
>>> accumulated across rescans.  Attached patch should fix the problem you
>>> are seeing.  I think this needs some more analysis and testing to see
>>> if everything works in the desired way.
>>>
>>> Is it possible for you to test the attached patch and see if you are
>>> still seeing any unexpected values?
>>
>> FWIW, this looks sensible to me.  Not sure if there's any good way to
>> write a regression test for it.
>>
>
> I think so, but not 100% sure.  I will give it a try and report back.
>

The attached patch contains a regression test as well.  Note that I have
carefully disabled all variable stats by using (analyze, timing off,
summary off, costs off) and then placed the parallel sequential scan on
the right side of the join, so that the only variable stats left are
nloops and rows, and those should remain constant.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Sat, Dec 2, 2017 at 8:04 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Attached patch contains regression test as well.  Note that I have
> carefully disabled all variable stats by using (analyze, timing off,
> summary off, costs off) and then selected parallel sequential scan on
> the right of join so that we have nloops and rows as variable stats
> and those should remain constant.

The regression test contains a whitespace error about which git diff
--check complains.

Also, looking at this again, shouldn't the reinitialization of the
instrumentation arrays happen in ExecParallelReinitialize rather than
ExecParallelFinish, so that we don't spend time doing it unless the
Gather is actually re-executed?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Mon, Dec 4, 2017 at 11:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Dec 2, 2017 at 8:04 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Attached patch contains regression test as well.  Note that I have
>> carefully disabled all variable stats by using (analyze, timing off,
>> summary off, costs off) and then selected parallel sequential scan on
>> the right of join so that we have nloops and rows as variable stats
>> and those should remain constant.
>
> The regression test contains a whitespace error about which git diff
> --check complains.
>

Oops, a silly mistake on my side.

> Also, looking at this again, shouldn't the reinitialization of the
> instrumentation arrays happen in ExecParallelReinitialize rather than
> ExecParallelFinish, so that we don't spend time doing it unless the
> Gather is actually re-executed?
>

Yeah, that sounds better, so I have modified the patch accordingly.

I have another observation in a somewhat related area.  From the
code, it looks like we might have a problem with displaying sort
info for workers for rescans.  I think the problem with the sort info
is that ExecSortRetrieveInstrumentation replaces the shared info with
local memory, after which it can no longer access the values in shared
memory that are changed by workers during rescans.  We might be able
to fix it by having some local_info, analogous to shared_info, in the
sort node.  But the main problem is how to accumulate stats for
workers across rescans.  The type of sort method can change across
rescans.  We might be able to accumulate the memory used, though I am
not sure that is right.  Although this appears to be somewhat related
to the problem being discussed in this thread, I think it can be dealt
with separately if we want to fix it.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tue, Dec 5, 2017 at 6:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have one another observation in the somewhat related area.  From the
> code, it looks like we might have some problem with displaying sort
> info for workers for rescans.  I think the problem with the sortinfo
> is that it initializes shared info with local memory in
> ExecSortRetrieveInstrumentation after which it won't be able to access
> the values in shared memory changed by workers in rescans.  We might
> be able to fix it by having some local_info same as sahred_info in
> sort node.  But the main problem is how do we accumulate stats for
> workers across rescans.  The type of sort method can change across
> rescans.  We might be able to accumulate the size of Memory though,
> but not sure if that is right.  I think though this appears to be
> somewhat related to the problem being discussed in this thread, it can
> be dealt separately if we want to fix it.

Yeah, that's broken.  ExecSortRetrieveInstrumentation() is run for
each loop, and after the first loop we've lost track of the pointer
into shared memory because we replaced it with a palloc'd copy.  We
could do what you said, or we could reinstate the pointer into the DSM
in ExecSortReInitializeDSM() by looking it up in the TOC.

The reason I've popped up out of nowhere on this thread to say this is
that I just realised that my nearby patch that adds support for Hash
instrumentation has the same bug, because I copied the way
ExecSortRetrieveInstrumentation() works to make
ExecHashRetrieveInstrumentation().  I initially assumed this would run
just once, after the final loop, in a rescan situation, but on testing I
see that it runs repeatedly, and of course loses instrumentation
during rescans.  I'll go and fix that.  We should choose one approach
for both cases.  Do you prefer a separate variable for the local copy,
or reinstating the pointer into the DSM during
ExecXXXReInitializeDSM()?

As for how to aggregate the information, isn't it reasonable to show
data from the last loop on the basis that it's representative?
Summing wouldn't make too much sense, because you didn't use that much
memory all at once.

-- 
Thomas Munro
http://www.enterprisedb.com


On Tue, Dec 5, 2017 at 8:49 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Tue, Dec 5, 2017 at 6:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I have one another observation in the somewhat related area.  From the
>> code, it looks like we might have some problem with displaying sort
>> info for workers for rescans.  I think the problem with the sortinfo
>> is that it initializes shared info with local memory in
>> ExecSortRetrieveInstrumentation after which it won't be able to access
>> the values in shared memory changed by workers in rescans.  We might
>> be able to fix it by having some local_info same as sahred_info in
>> sort node.  But the main problem is how do we accumulate stats for
>> workers across rescans.  The type of sort method can change across
>> rescans.  We might be able to accumulate the size of Memory though,
>> but not sure if that is right.  I think though this appears to be
>> somewhat related to the problem being discussed in this thread, it can
>> be dealt separately if we want to fix it.
>
> Yeah, that's broken.  ExecSortRetrieveInstrumentation() is run for
> each loop, and after the first loop we've lost track of the pointer
> into shared memory because we replaced it with palloc'd copy.  We
> could do what you said, or we could reinstate the pointer into the DSM
> in ExecSortReInitializeDSM() by looking it up in the TOC.

Or would it be an option to change when
ExecXXXRetrieveInstrumentation() is called so that it is run only once?

-- 
Thomas Munro
http://www.enterprisedb.com


On Tue, Dec 5, 2017 at 1:29 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Tue, Dec 5, 2017 at 8:49 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Tue, Dec 5, 2017 at 6:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> I have one another observation in the somewhat related area.  From the
>>> code, it looks like we might have some problem with displaying sort
>>> info for workers for rescans.  I think the problem with the sortinfo
>>> is that it initializes shared info with local memory in
>>> ExecSortRetrieveInstrumentation after which it won't be able to access
>>> the values in shared memory changed by workers in rescans.  We might
>>> be able to fix it by having some local_info same as sahred_info in
>>> sort node.  But the main problem is how do we accumulate stats for
>>> workers across rescans.  The type of sort method can change across
>>> rescans.  We might be able to accumulate the size of Memory though,
>>> but not sure if that is right.  I think though this appears to be
>>> somewhat related to the problem being discussed in this thread, it can
>>> be dealt separately if we want to fix it.
>>
>> Yeah, that's broken.  ExecSortRetrieveInstrumentation() is run for
>> each loop, and after the first loop we've lost track of the pointer
>> into shared memory because we replaced it with palloc'd copy.  We
>> could do what you said, or we could reinstate the pointer into the DSM
>> in ExecSortReInitializeDSM() by looking it up in the TOC.
>
> Or would it be an option to change the time
> ExecXXXRetrieveInstrumentation() is called so it is run only once?
>

To me, that doesn't sound like a bad option.  I think if we do so, then
we don't even need to reinitialize the shared sort stats.  I think
something like the attached should work if we want to go this route.  We
can add a regression test if we think this is a good idea.
Having said that, one problem I see with doing things this way is that,
in general, we will display the accumulated stats of each worker, but for
sort or some other special nodes (like hash), we will show the
information from only the last loop.  I am not sure if that is a matter
of concern, but if we want to do it this way, then we should probably
explain this in the documentation as well.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tue, Dec 5, 2017 at 2:49 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> As for how to aggregate the information, isn't it reasonable to show
> data from the last loop on the basis that it's representative?
> Summing wouldn't make too much sense, because you didn't use that much
> memory all at once.

Sorts can be rescanned even without parallel query, so I guess we
should try to make the parallel case kinda like the non-parallel case.
If I'm not wrong, that will just use the stats from the most recent
execution (i.e. the last loop) -- see show_sort_info() in explain.c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Dec 5, 2017 at 12:39 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah, that sounds better, so modified the patch accordingly.

I committed this to master and REL_10_STABLE, but it conflicts all
over the place on 9.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Wed, Dec 6, 2017 at 12:02 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Dec 5, 2017 at 1:29 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Or would it be an option to change the time
>> ExecXXXRetrieveInstrumentation() is called so it is run only once?
>
> To me, that doesn't sound like a bad option.  I think if do so, then
> we don't even need to reinitialize the shared sort stats.  I think
> something, like attached, should work if we want to go this route.  We
> can add regression test if this is what we think is a good idea.
> Having said that, one problem I see doing thing this way is that in
> general, we will display the accumulated stats of each worker, but for
> sort or some other special nodes (like hash), we will show the
> information of only last loop.  I am not sure if that is a matter of
> concern, but if we want to do this way, then probably we should
> explain this in documentation as well.

The hash version of this code is now committed as 5bcf389e.  Here is a
patch for discussion that adds some extra tests to join.sql to
exercise rescans of a hash join under a Gather node.  It fails on
head, because it loses track of the instrumentation pointer after the
first loop as you described (since the Hash coding is the same as the
Sort coding), so it finishes up with no instrumentation data.  If you
move ExecParallelRetrieveInstrumentation() to ExecParallelCleanup() as
you showed in your patch, then it passes.  The way I'm asserting that
instrumentation data is making its way back to the leader is by
turning off leader participation and then checking if it knows how
many batches there were.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment
On Wed, Dec 6, 2017 at 12:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 5, 2017 at 2:49 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> As for how to aggregate the information, isn't it reasonable to show
>> data from the last loop on the basis that it's representative?
>> Summing wouldn't make too much sense, because you didn't use that much
>> memory all at once.
>
> Sorts can be rescanned even without parallel query, so I guess we
> should try to make the parallel case kinda like the non-parallel case.
> If I'm not wrong, that will just use the stats from the most recent
> execution (i.e. the last loop) -- see show_sort_info() in explain.c.
>

Right, and with that in mind I have prepared a patch (posted above [1])
which fixes it so that it resembles the non-parallel case.
Ideally, it would have obviated the need for my previous patch which
got committed as 778e78ae.  However, now that that is committed, I can
think of the following options:

1. Rebase it atop what is committed; I have actually done that in the
attached patch.  I have also prepared a regression test case patch just
to show the output with and without the patch.
2. For the sort node, we can fix it by having a local_info, analogous to
shared_info, in the sort node and copying shared_info into it, or we
could reinstate the pointer to the DSM in ExecSortReInitializeDSM() by
looking it up in the TOC as suggested by Thomas.  If we go this way,
then we need a similar fix for the hash node as well.


[1] - https://www.postgresql.org/message-id/CAA4eK1JBj4YCEQKeTua5%3DBMXy7zW7zNOvoXomzBP%3Dkb_aqMF7w%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Wed, Dec 6, 2017 at 1:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 5, 2017 at 12:39 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Yeah, that sounds better, so modified the patch accordingly.
>
> I committed this to master and REL_10_STABLE, but it conflicts all
> over the place on 9.6.
>

I will try to prepare the patch for 9.6, but I think it might be
better if we first decide what to do about the open issue for the sort
and hash nodes, as there can be some overlap depending on which
approach we choose to fix it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Wed, Dec 6, 2017 at 1:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Right and seeing that I have prepared the patch (posted above [1])
> which fixes it such that it will resemble the non-parallel case.
> Ideally, it would have obviated the need for my previous patch which
> got committed as 778e78ae.  However, now that is committed, I could
> think of below options:
>
> 1. I shall rebase it atop what is committed and actually, I have done
> that in the attached patch.  I have also prepared a regression test
> case patch just to show the output with and without the patch.
> 2. For sort node, we can fix it by having some local_info same as
> shared_info in sort node and copy the shared_info in that or we could
> reinstate the pointer to the DSM in ExecSortReInitializeDSM() by
> looking it up in the TOC as suggested by Thomas. If we go this way,
> then we need a similar fix for hash node as well.

Well, the patch you've actually attached makes the bug go away by
removing a net of 53 lines of code.  The other approach would probably
add code.  So I am tempted to go with the approach you have here.  I
would probably change the T_HashState and T_SortState cases in
ExecParallelReInitializeDSM so that they still exist, but just do
something like this:

    case T_HashState:
    case T_SortState:
        /* these nodes have DSM state, but no reinitialization is required */
        break;

That way, it will be more clear to future readers of this code that
the lack of a reinitialize function is not an oversight, and the
compiler should optimize these cases away, merging them with the
default case.

I was a little worried when I first opened this patch that it might be
imposing a one-size-fits-all solution; just because sort and hash want
to report details from the last execution, there could be some other
executor node that wants to do otherwise.  But on reflection, that's
just fine: an individual executor node is free to accumulate stats
across rescans if it wants to do so.  It's merely that sort and hash
don't want to do that.  In fact, it's really the current system that
is imposing a straightjacket: no matter what the individual node types
choose to do, rescans are a pile of fail.

Long story short, I like the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Fri, Dec 8, 2017 at 7:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 1:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Right and seeing that I have prepared the patch (posted above [1])
>> which fixes it such that it will resemble the non-parallel case.
>
> Long story short, I like the patch.

LGTM.  There might be an argument for clearing the instrumentation
every time on the basis that you might finish up keeping data from a
non-final loop when a worker opted not to do anything in the final
loop, but I'm not going to make that argument because I don't think it
matters.   The patch makes the tests in
test-hash-join-rescan-instr-v1.patch pass (from my previous message).
Please also consider that test patch for commit.

-- 
Thomas Munro
http://www.enterprisedb.com


On Fri, Dec 8, 2017 at 12:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 1:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Right and seeing that I have prepared the patch (posted above [1])
>> which fixes it such that it will resemble the non-parallel case.
>> Ideally, it would have obviated the need for my previous patch which
>> got committed as 778e78ae.  However, now that is committed, I could
>> think of below options:
>>
>> 1. I shall rebase it atop what is committed and actually, I have done
>> that in the attached patch.  I have also prepared a regression test
>> case patch just to show the output with and without the patch.
>> 2. For sort node, we can fix it by having some local_info same as
>> shared_info in sort node and copy the shared_info in that or we could
>> reinstate the pointer to the DSM in ExecSortReInitializeDSM() by
>> looking it up in the TOC as suggested by Thomas. If we go this way,
>> then we need a similar fix for hash node as well.
>
> Well, the patch you've actually attached makes the bug go away by
> removing a net of 53 lines of code.  The other approach would probably
> add code.  So I am tempted to go with the approach you have here.  I
> would probably change the T_HashState and T_SortState cases in
> ExecParallelReInitializeDSM so that they still exist, but just do
> something like this:
>
> case T_HashState:
> case T_SortState:
> /* these nodes have DSM state, but no reinitialization is required */
> break;
>
> That way, it will be more clear to future readers of this code that
> the lack of a reinitialize function is not an oversight, and the
> compiler should optimize these cases away, merging them with the
> default case.
>

Okay, I have adjusted the patch accordingly.  I have also added a
regression test which should produce the same result across different
runs.  Please see if that looks okay to you; if so, it would be good to
add such a test as well.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Fri, Dec 8, 2017 at 9:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Dec 8, 2017 at 12:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Dec 6, 2017 at 1:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Right and seeing that I have prepared the patch (posted above [1])
>>> which fixes it such that it will resemble the non-parallel case.
>>> Ideally, it would have obviated the need for my previous patch which
>>> got committed as 778e78ae.  However, now that is committed, I could
>>> think of below options:
>>>
>>> 1. I shall rebase it atop what is committed and actually, I have done
>>> that in the attached patch.  I have also prepared a regression test
>>> case patch just to show the output with and without the patch.
>>> 2. For sort node, we can fix it by having some local_info same as
>>> shared_info in sort node and copy the shared_info in that or we could
>>> reinstate the pointer to the DSM in ExecSortReInitializeDSM() by
>>> looking it up in the TOC as suggested by Thomas. If we go this way,
>>> then we need a similar fix for hash node as well.
>>
>> Well, the patch you've actually attached makes the bug go away by
>> removing a net of 53 lines of code.  The other approach would probably
>> add code.  So I am tempted to go with the approach you have here.  I
>> would probably change the T_HashState and T_SortState cases in
>> ExecParallelReInitializeDSM so that they still exist, but just do
>> something like this:
>>
>> case T_HashState:
>> case T_SortState:
>> /* these nodes have DSM state, but no reinitialization is required */
>> break;
>>
>> That way, it will be more clear to future readers of this code that
>> the lack of a reinitialize function is not an oversight, and the
>> compiler should optimize these cases away, merging them with the
>> default case.
>>
>
> Okay, I have adjusted the patch accordingly.  I have also added a
> regression test which should produce the same result across different
> runs, see if that looks okay to you, then it is better to add such a
> test as well.
>

The regression test added by the patch needs cleanup at the end, which
I have added in the attached patch.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> The regression test added by patch needs cleanup at the end which I
> have added in the attached patch.

OK, so you've got a test case now, but Thomas independently submitted
a test case patch.  Which one is more awesome?  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Sat, Dec 9, 2017 at 6:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> The regression test added by patch needs cleanup at the end which I
>> have added in the attached patch.
>
> OK, so you've got a test case now, but Thomas independently submitted
> a test case patch.  Which one is more awesome?  :-)

My test case is for Hash nodes and Amit's is for Sort nodes.  Surely
both should be covered?

-- 
Thomas Munro
http://www.enterprisedb.com


On Fri, Dec 8, 2017 at 2:22 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sat, Dec 9, 2017 at 6:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> The regression test added by patch needs cleanup at the end which I
>>> have added in the attached patch.
>>
>> OK, so you've got a test case now, but Thomas independently submitted
>> a test case patch.  Which one is more awesome?  :-)
>
> My test case is for Hash nodes and Amit's is for Sort nodes.  Surely
> both should be covered?

Oh, yeah.  I totally missed that detail, sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Okay, I have adjusted the patch accordingly.  I have also added a
>> regression test which should produce the same result across different
>> runs, see if that looks okay to you, then it is better to add such a
>> test as well.
>
> The regression test added by patch needs cleanup at the end which I
> have added in the attached patch.

Hmm.  If we're going this way, then shouldn't we revert the changes
commit 2c09a5c12a66087218c7f8cba269cd3de51b9b82 made to
ExecParallelRetrieveInstrumentation?  If that function is only ever
called once, then there's no point doing InstrInit + InstrAggNode, or
checking whether worker_instrument is already initialized.  We can
just palloc + memcpy as the code did previously.

I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Sat, Dec 9, 2017 at 1:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Okay, I have adjusted the patch accordingly.  I have also added a
>>> regression test which should produce the same result across different
>>> runs, see if that looks okay to you, then it is better to add such a
>>> test as well.
>>
>> The regression test added by patch needs cleanup at the end which I
>> have added in the attached patch.
>
> Hmm.  If we're going this way, then shouldn't we revert the changes
> commit 2c09a5c12a66087218c7f8cba269cd3de51b9b82 made to
> ExecParallelRetrieveInstrumentation?
>

Yeah, it is better to revert it, as ideally it is not required after
this patch; that is what I tried to convey above with "Ideally, it
would have obviated the need for my previous patch which got committed
as 778e78ae" (the commit id is for the v10 branch; otherwise it is the
same one you mention).  I have locally reverted that patch and rebased
the new one on top of that.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Sat, Dec 9, 2017 at 5:30 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Dec 9, 2017 at 1:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> Okay, I have adjusted the patch accordingly.  I have also added a
>>>> regression test which should produce the same result across different
>>>> runs, see if that looks okay to you, then it is better to add such a
>>>> test as well.
>>>
>>> The regression test added by patch needs cleanup at the end which I
>>> have added in the attached patch.
>>
>> Hmm.  If we're going this way, then shouldn't we revert the changes
>> commit 2c09a5c12a66087218c7f8cba269cd3de51b9b82 made to
>> ExecParallelRetrieveInstrumentation?
>>
>
> Yeah, it is better to revert it as ideally that is not required after
> this patch and that is what I have tried to convey above ("Ideally, it
> would have obviated the need for my previous patch which
> got committed as 778e78ae." (The commit id is for branch 10,
> otherwise, it is same as what you mention.)).  I have locally reverted
> that patch and then rebased it on top of that.

Uh, should I just revert that commit entirely first, and then we can
commit the new fix afterward?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Wed, Dec 13, 2017 at 3:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Dec 9, 2017 at 5:30 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Sat, Dec 9, 2017 at 1:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Fri, Dec 8, 2017 at 5:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>> Okay, I have adjusted the patch accordingly.  I have also added a
>>>>> regression test which should produce the same result across different
>>>>> runs, see if that looks okay to you, then it is better to add such a
>>>>> test as well.
>>>>
>>>> The regression test added by patch needs cleanup at the end which I
>>>> have added in the attached patch.
>>>
>>> Hmm.  If we're going this way, then shouldn't we revert the changes
>>> commit 2c09a5c12a66087218c7f8cba269cd3de51b9b82 made to
>>> ExecParallelRetrieveInstrumentation?
>>>
>>
>> Yeah, it is better to revert it as ideally that is not required after
>> this patch and that is what I have tried to convey above ("Ideally, it
>> would have obviated the need for my previous patch which
>> got committed as 778e78ae." (The commit id is for branch 10,
>> otherwise, it is same as what you mention.)).  I have locally reverted
>> that patch and then rebased it on top of that.
>
> Uh, should I just revert that commit entirely first, and then we can
> commit the new fix afterward?
>

Yes.  I have already extracted the test case from that commit into the
new patch, which is all we need from that commit.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Tue, Dec 12, 2017 at 9:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Uh, should I just revert that commit entirely first, and then we can
>> commit the new fix afterward?
>
> Yes. I have already extracted the test case of that commit to the new
> patch which is what we need from that commit.

OK, done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Thu, Dec 14, 2017 at 2:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 12, 2017 at 9:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Uh, should I just revert that commit entirely first, and then we can
>>> commit the new fix afterward?
>>
>> Yes. I have already extracted the test case of that commit to the new
>> patch which is what we need from that commit.
>
> OK, done.
>

Thanks.  I think we can now proceed with
fix_accum_instr_parallel_workers_v8.patch posted above, which will fix
the original issue and the problem we have found in the sort and hash
nodes.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Wed, Dec 13, 2017 at 9:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Thanks.  I think now we can proceed with
> fix_accum_instr_parallel_workers_v8.patch posted above which will fix
> the original issue and the problem we have found in sort and hash
> nodes.

Committed and back-patched to v10.  While I was doing that, I couldn't
help wondering if this code doesn't also need to be moved:

    /*
     * Next, accumulate buffer usage.  (This must wait for the workers to
     * finish, or we might get incomplete data.)
     */
    for (i = 0; i < nworkers; i++)
        InstrAccumParallelQuery(&pei->buffer_usage[i]);

It seems that it doesn't, because the way that instrumentation data
gets accumulated is via InstrStartParallelQuery and
InstrEndParallelQuery, the latter of which clobbers the entry in the
buffer_usage array rather than adding to it:

void
InstrEndParallelQuery(BufferUsage *result)
{
    memset(result, 0, sizeof(BufferUsage));
    BufferUsageAccumDiff(result, &pgBufferUsage, &save_pgBufferUsage);
}

But we could think about choosing to make that work the same way; that
is, move the code block to ExecParallelCleanup, remove the memset()
call from InstrEndParallelQuery, and change the code that allocates
PARALLEL_KEY_BUFFER_USAGE to initialize the space.  That would make
the handling of this more consistent with what we're now doing for the
instrumentation data, although I can't see that it fixes any live bug.
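
For clarity, here is a sketch of what InstrEndParallelQuery would look like
under that proposal (assuming the PARALLEL_KEY_BUFFER_USAGE space is zeroed
when it is allocated, so that accumulating into it is safe):

void
InstrEndParallelQuery(BufferUsage *result)
{
    /* accumulate into the shared slot instead of overwriting it */
    BufferUsageAccumDiff(result, &pgBufferUsage, &save_pgBufferUsage);
}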

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Dec 5, 2017 at 4:23 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> The hash version of this code is now committed as 5bcf389e.  Here is a
> patch for discussion that adds some extra tests to join.sql to
> exercise rescans of a hash join under a Gather node.  It fails on
> head, because it loses track of the instrumentation pointer after the
> first loop as you described (since the Hash coding is the same is the
> Sort coding), so it finishes up with no instrumentation data.  If you
> move ExecParallelRetrieveInstrumentation() to ExecParallelCleanup() as
> you showed in your patch, then it passes.  The way I'm asserting that
> instrumentation data is making its way back to the leader is by
> turning off leader participation and then checking if it knows how
> many batches there were.

In a later email in this thread, you asked me to consider this patch
for commit, but it doesn't apply.  I thought that might be the result
of conflicts with Amit's patch which I just committed, but I think
that's not the real explanation, because it touches the 'join'
regression test, not 'select_parallel'.  Well, I thought, I'll just
find the place where the SQL should be inserted and stick it in there
-- trivial rebase, right?

Well, not really, because the context surrounding the lines you've
added seems to refer to SQL that I can't find in join.sql or anywhere
else in the tree.  So my suspicion is that this patch is based on your
parallel hash patch set rather than master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Dec 19, 2017 at 1:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, not really, because the context surrounding the lines you've
> added seems to refer to SQL that I can't find in join.sql or anywhere
> else in the tree.  So my suspicion is that this patch is based on your
> parallel hash patch set rather than master.

Thomas pinged me off-list and showed patch -p1 < $file working fine
for him and ... I'm dumb.  I was trying to apply the patch to v10, not
master.  It applies fine to master; committed there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Tue, Dec 19, 2017 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 13, 2017 at 9:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Thanks.  I think now we can proceed with
>> fix_accum_instr_parallel_workers_v8.patch posted above which will fix
>> the original issue and the problem we have found in sort and hash
>> nodes.
>
> Committed and back-patched to v10.  While I was doing that, I couldn't
> help wondering if this code doesn't also need to be moved:
>
>     /*
>      * Next, accumulate buffer usage.  (This must wait for the workers to
>      * finish, or we might get incomplete data.)
>      */
>     for (i = 0; i < nworkers; i++)
>         InstrAccumParallelQuery(&pei->buffer_usage[i]);
>
> It seems that it doesn't, because the way that instrumentation data
> gets accumulated is via InstrStartParallelQuery and
> InstrEndParallelQuery, the latter of which clobbers the entry in the
> buffer_usage array rather than adding to it:
>

Right, and that is the reason I have not touched it in the patch.

> But we could think about choosing to make that work the same way; that
> is, move the code block to ExecParallelCleanup, remove the memset()
> call from InstrEndParallelQuery, and change the code that allocates
> PARALLEL_KEY_BUFFER_USAGE to initialize the space.
>

Do you mean to initialize it only the first time it is allocated, or
both the first time and during rescans?

>  That would make
> the handling of this more consistent with what we're now doing for the
> instrumentation data, although I can't see that it fixes any live bug.
>

The change you are proposing makes sense to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Tue, Dec 19, 2017 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 13, 2017 at 9:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Thanks.  I think now we can proceed with
>> fix_accum_instr_parallel_workers_v8.patch posted above which will fix
>> the original issue and the problem we have found in sort and hash
>> nodes.
>
> Committed and back-patched to v10.
>

Thanks, and attached find the rebased version that can be applied to
v9.6.  I had to change the test case to produce stable output; the
reason for the change is that 9.6 doesn't have the 'summary off'
option for EXPLAIN.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Wed, Dec 20, 2017 at 3:38 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Dec 19, 2017 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Dec 13, 2017 at 9:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Thanks.  I think now we can proceed with
>>> fix_accum_instr_parallel_workers_v8.patch posted above which will fix
>>> the original issue and the problem we have found in sort and hash
>>> nodes.
>>
>> Committed and back-patched to v10.
>
> Thanks and attached find the rebased version that can be applied to
> v9.6.  I have to change the test case to produce a stable output and
> the reason for the change is that 9.6 doesn't have 'summary off'
> option for Explain.

This thread got lost in my inbox over Christmas, but I've now
committed this back-port to REL9_6_STABLE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Thu, Jan 4, 2018 at 11:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 20, 2017 at 3:38 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Tue, Dec 19, 2017 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Dec 13, 2017 at 9:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> Thanks.  I think now we can proceed with
>>>> fix_accum_instr_parallel_workers_v8.patch posted above which will fix
>>>> the original issue and the problem we have found in sort and hash
>>>> nodes.
>>>
>>> Committed and back-patched to v10.
>>
>> Thanks and attached find the rebased version that can be applied to
>> v9.6.  I have to change the test case to produce a stable output and
>> the reason for the change is that 9.6 doesn't have 'summary off'
>> option for Explain.
>
> This thread got lost in my inbox over Christmas, but I've now
> committed this back-port to REL9_6_STABLE.
>

Thanks!



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com