Re: SegFault on 9.6.14 - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: SegFault on 9.6.14 |
Date | |
Msg-id | CAA4eK1Jke-evw+XrL_CYuDDpcn3nH3s_8CF=it0fE1td4jZ95g@mail.gmail.com Whole thread Raw |
In response to | Re: SegFault on 9.6.14 (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: SegFault on 9.6.14
|
List | pgsql-hackers |
On Tue, Jul 23, 2019 at 9:11 AM Thomas Munro <thomas.munro@gmail.com> wrote: > > On Fri, Jul 19, 2019 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I am thinking that why not we remove the part of destroying the > > parallel context (and shared memory) from ExecShutdownGather (and > > ExecShutdownGatherMerge) and then do it at the time of ExecEndGather > > (and ExecEndGatherMerge)? This should fix the bug in hand and seems > > to be more consistent with our overall design principles. I have not > > tried to code it to see if there are any other repercussions of the > > same but seems worth investigating. What do you think? > > I tried moving ExecParallelCleanup() into ExecEndGather(). The first > problem is that ExecutePlan() wraps execution in > EnterParallelMode()/ExitParallelMode(), but ExitParallelMode() fails > an assertion that no parallel context is active because > ExecEndGather() hasn't run yet. The enclosing > ExecutorStart()/ExecutorEnd() calls are further down the call stack, > in ProcessQuery(). So some more restructuring might be needed to exit > parallel mode later, but then I feel like you might be getting way out > of back-patchable territory, especially if it involves moving code to > the other side of the executor hook boundary. Is there an easier way? > If we have to follow the solution on these lines, then I don't see an easier way. One idea could be that we relax the assert in ExitParallelMode so that it doesn't expect parallel context to be gone by that time, but not sure if that is a good idea because it is used in some other places as well. I feel in general it is a good assertion that before we leave parallel mode, the parallel context should be gone as that ensures we won't do any parallel activity after that. > Another idea from the band-aid-solutions-that-are-easy-to-back-patch > department: in ExecutePlan() where we call ExecShutdownNode(), we > could write EXEC_FLAG_DONE into estate->es_top_eflags, and then have > ExecGatherShutdown() only run ExecParallelCleanup() if it sees that > flag. That's not beautiful, but it's less churn that the 'bool final' > argument we discussed before, and could be removed in master when we > have a better way. > Right, that will be lesser code churn and it can also work. However, one thing that needs some thought is till now es_top_eflags is only set in ExecutorStart and same is mentioned in comments where it is declared and it seems we are going to change that with this idea. How about having a separate function ExecBlahShutdown which will clean up resources as parallel context and can be called only from ExecutePlan where we are calling ExecShutdownNode? I think both these and the other solution we have discussed are on similar lines and another idea could be to relax the assert which again is not a superb idea. > Stepping back a bit, it seems like we need two separate tree-walking > calls: one to free resources not needed anymore by the current rescan > (workers), and another to free resources not needed ever again > (parallel context). That could be spelled ExecShutdownNode(false) and > ExecShutdownNode(true), or controlled with the EXEC_FLAG_DONE kluge, > or a new additional ExecSomethingSomethingNode() function, or as you > say, perhaps the second thing could be incorporated into > ExecEndNode(). I suspect that the Shutdown callbacks for Hash, Hash > Join, Custom Scan and Foreign Scan might not be needed anymore if we > could keep the parallel context around until after the run > ExecEndNode(). > I think we need those to collect instrumentation information. I guess that has to be done before we call InstrStopNode, otherwise, we might miss some instrumentation information. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: