Re: Buffers from parallel workers not accumulated to upper nodes with gather merge - Mailing list pgsql-bugs

From Jehan-Guillaume de Rorthais
Subject Re: Buffers from parallel workers not accumulated to upper nodes with gather merge
Date
Msg-id 20200721173457.6e869d46@firost
Whole thread Raw
In response to Re: Buffers from parallel workers not accumulated to upper nodes with gather merge  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Buffers from parallel workers not accumulated to upper nodes with gather merge
Re: Buffers from parallel workers not accumulated to upper nodes with gather merge
List pgsql-bugs
On Mon, 20 Jul 2020 15:29:00 +0530
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Mon, Jul 20, 2020 at 1:29 PM Jehan-Guillaume de Rorthais
> <jgdr@dalibo.com> wrote:
> >
> > On Mon, 20 Jul 2020 11:30:34 +0530
> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > On Sat, Jul 18, 2020 at 7:32 PM Jehan-Guillaume de Rorthais
> > > <jgdr@dalibo.com> wrote:
> > [...]
> > > > The Merge node works correctly because it calls
> > > > ExecShutdownGatherWorkers as soon as the workers are exhausted from
> > > > gather_readnext. Because of this, buffers from workers are already
> > > > accounted and propagated to upper nodes before the recursive call of
> > > > ExecShutdownNode on each nodes. There's no similar call to
> > > > ExecShutdownGatherMergeWorkers for Gather Merge. Adding a call to
> > > > ExecShutdownGatherMergeWorkers in gather_merge_getnext when workers are
> > > > exhausted seems to fix the issue, but I feel like this is the wrong
> > > > place to fix this issue.
> > >
> > > Why do you think so?
> >
> > Because the fix seemed too specific to the Gather Merge node alone. This bug
> > might exist for some other nodes (I didn't look for them) and the trap will
> > stay for futur ones.
> >
> > The fix in ExecShutdownNode recursion have a broader impact on all present
> > and futur nodes.
> >
> > > I think irrespective of whether we want to call
> > > ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know
> > > we are done aka binaryheap_empty) to fix this particular issue, it is
> > > better to shutdown the workers as soon as we are done similar to what
> > > we do for Gather node.  It is good to release resources as soon as we
> > > can.
> >
> > Indeed. I was focusing on the bug and I didn't thought about that. The
> > patch I test was really just a quick test. I'll have a closer look at this,
> > but I suppose this might be considered as a separate commit?
> >
> 
> Good Question.  Initially, I thought we will have it in a same commit,
> but now on again thinking about it, we might want to keep this one for
> the HEAD only and ExecShutdownNode related change in the back-branches.

OK.

I'll send a patch proposal for Gather Merge on HEAD tomorrow.

> BTW, can you please test if the problem exist in back-branches, and does
> your change fix it there as well?

Gather Merge appears in v10, but I haven't been able to produce a plan using it.
The SQL scenario expose the same bug in v11, v12, v13 and head.

Nevertheless, the recursion in ExecShutdownNode always happen before
InstrStartNode back to v10. The code in 9.6 is fine, the recursion happen
after both InstrStartNode and InstrStopNode.

You'll find in attachment patches for versions from 10 to head.

Regards,

Attachment

pgsql-bugs by date:

Previous
From: Francisco Olarte
Date:
Subject: Re: BUG #16548: Order by on array element giving disparity in result
Next
From: Pavel Stehule
Date:
Subject: Re: BUG #16549: "CASE" not work properly , the function works properly on PostgreSQL 9.6.8