Re: SegFault on 9.6.14 - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: SegFault on 9.6.14 |
| Date | |
| Msg-id | CAA4eK1+U9oecKb6VXSay1QtznnMgiwEnuUL3MwraWa=yuz1-1Q@mail.gmail.com |
| In response to | Re: SegFault on 9.6.14 (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses | Re: SegFault on 9.6.14 |
| List | pgsql-hackers |
On Mon, Sep 2, 2019 at 4:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 9, 2019 at 6:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > But beyond that, the issue here is that the Limit node is shutting
> > down the Gather node too early, and the right fix must be to stop
> > doing that, not to change the definition of what it means to shut down
> > a node, as this patch does. So maybe a possible approach here - which
> > I think is more or less what Tom is proposing - is:
> >
> > 1. Remove the code from ExecLimit() that calls ExecShutdownNode().
>
> Attached patch does that. I have also added one test as a separate
> patch so that later, if we introduce shutting down resources in the Limit
> node, we don't break anything. As of now, I have kept it separate for
> easy verification, but if we decide to go with this approach and the test
> appears fine, we can merge it along with the fix.

I have merged the code change and the test case patch, as I felt it is good
to cover this case. I have slightly changed the test case to make its
output predictable (made the inner scan ordered so that the query always
produces the same result).

One more thing: I am not able to come up with a predictable test case for
the 9.6 branch, as it doesn't support Gather Merge, which is required for
this particular test to always produce predictable output. There could be
some better way to write this test, so any input in that regard or
otherwise is welcome. So, if we commit this patch, the containing test
case will be for branches HEAD~10, but the code will be for HEAD~9.6.

> > 2. Adjust ExecutePlan() so that it ensures that ExecuteShutdownNode()
> > gets called at the very end of execution, at least when execute_once
> > is set, before exiting parallel mode.
>
> I am not sure if I completely understand this point. AFAICS,
> ExecuteShutdownNode is called whenever we are done getting the tuples.
> One place where it is not there in that function is when we assume the
> destination is closed, basically the below code:
>
> ExecutePlan()
> {
> ..
> if (!dest->receiveSlot(slot, dest))
>     break;
> ..
> }
>
> Do you expect this case to be dealt with as well, or do you have
> something else in mind?

It still appears problematic, but I couldn't come up with a test case to
reproduce the problem. I'll try some more on this, but I think this can
anyway be done separately once we have a test to show the problem.

> The other possibility could be that we move the shutdown of the
> node to the end of the function, where we exit parallel mode, but doing
> that led to some regression failures on my machine. I will
> investigate the same.

This was failing because the use_parallel_mode flag in ExecutePlan()
won't be set for workers, and hence they won't get a chance to
accumulate their stats.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com