Re: [HACKERS] why not parallel seq scan for slow functions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] why not parallel seq scan for slow functions |
Date | |
Msg-id | CAA4eK1K3pxVknO5yLjHTi2ciSO9Yabi2yCRnrw5=PTJR-FoQ7g@mail.gmail.com |
In response to | Re: [HACKERS] why not parallel seq scan for slow functions (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: [HACKERS] why not parallel seq scan for slow functions
Re: [HACKERS] why not parallel seq scan for slow functions |
List | pgsql-hackers |
On Wed, Jul 12, 2017 at 11:20 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Jul 11, 2017 at 10:25 PM, Amit Kapila <amit.kapila16@gmail.com>
> wrote:
>>
>> On Wed, Jul 12, 2017 at 1:50 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> > On Mon, Jul 10, 2017 at 9:51 PM, Dilip Kumar <dilipbalaut@gmail.com>
>> > wrote:
>> >>
>> >> So because of this high projection cost the seqpath and parallel path
>> >> both have fuzzily same cost but seqpath is winning because it's
>> >> parallel safe.
>> >
>> > I think you are correct.  However, unless parallel_tuple_cost is set
>> > very low, apply_projection_to_path never gets called with the Gather
>> > path as an argument.  It gets ruled out at some earlier stage,
>> > presumably because it assumes the projection step cannot make it win
>> > if it is already behind by enough.
>> >
>>
>> I think that is genuine because tuple communication cost is very high.
>
> Sorry, I don't know which you think is genuine, the early pruning or my
> complaint about the early pruning.
>

Early pruning.  Currently, we don't have a way to maintain both
parallel and non-parallel paths until a later stage and then decide
which one is better.  If we want to maintain both parallel and
non-parallel paths, it can increase planning cost substantially in the
case of joins.  Still, it can surely be beneficial in many cases, so it
is a worthwhile direction to pursue.

> I agree that the communication cost is high, which is why I don't want to
> have to set parallel_tuple_cost very low.  For example, to get the benefit
> of your patch, I have to set parallel_tuple_cost to 0.0049 or less (in my
> real-world case, not the dummy test case I posted, although the numbers are
> around the same for that one too).  But with a setting that low, all kinds
> of other things also start using parallel plans, even if they don't benefit
> from them and are harmed.
>
> I realize we need to do some aggressive pruning to avoid an exponential
> explosion in planning time, but in this case it has some rather unfortunate
> consequences.  I wanted to explore it, but I can't figure out where this
> particular pruning is taking place.
>
> By the time we get to planner.c line 1787, current_rel->pathlist already
> does not contain the parallel plan if parallel_tuple_cost >= 0.0050, so the
> pruning is happening earlier than that.
>

Check generate_gather_paths.

>>
>> If your table is reasonably large, then you might want to try
>> increasing parallel workers (Alter Table ... Set (parallel_workers =
>> ..))
>
> Setting parallel_workers to 8 changes the threshold for the parallel plan
> to even be considered from parallel_tuple_cost <= 0.0049 to <= 0.0076.  So
> it is going in the correct direction, but not by enough to matter.
>

You might want to play with cpu_tuple_cost and/or seq_page_cost.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
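
The threshold behaviour discussed in this message can be illustrated with a toy cost model. This is only a sketch: the formulas are a simplified reading of how the planner costs a sequential scan and a Gather node (no quals, no target-list cost, no fuzzy comparison in add_path), the function names are hypothetical, and the table size (10,000 pages, 1,000,000 tuples) is made up. It does not reproduce the numbers reported in the thread.

```python
"""Toy model of the serial-vs-Gather cost comparison (illustrative only).

Assumptions, not taken from the thread: simplified cost formulas, default
planner constants, and a made-up table size. All function names are
hypothetical helpers, not PostgreSQL APIs.
"""


def parallel_divisor(workers: int) -> float:
    # Assumed rule: the leader helps less as workers are added
    # (30% less per worker), and not at all once that reaches zero.
    leader = max(0.0, 1.0 - 0.3 * workers)
    return workers + leader


def seq_scan_cost(pages, tuples, *, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    # Serial scan: read every page, process every tuple.
    return seq_page_cost * pages + cpu_tuple_cost * tuples


def partial_scan_cost(pages, tuples, workers, *,
                      seq_page_cost=1.0, cpu_tuple_cost=0.01):
    # Parallel scan: only the CPU part is divided among the processes.
    cpu = cpu_tuple_cost * tuples / parallel_divisor(workers)
    return seq_page_cost * pages + cpu


def gather_cost(pages, tuples, workers, parallel_tuple_cost, *,
                parallel_setup_cost=1000.0, **scan_costs):
    # Gather: setup cost, the partial scan, and a per-tuple transfer cost.
    partial = partial_scan_cost(pages, tuples, workers, **scan_costs)
    return parallel_setup_cost + partial + parallel_tuple_cost * tuples


def break_even_parallel_tuple_cost(pages, tuples, workers, *,
                                   parallel_setup_cost=1000.0, **scan_costs):
    # Largest parallel_tuple_cost at which Gather is no more expensive
    # than the serial scan in this toy model.
    serial = seq_scan_cost(pages, tuples, **scan_costs)
    partial = partial_scan_cost(pages, tuples, workers, **scan_costs)
    return (serial - parallel_setup_cost - partial) / tuples


if __name__ == "__main__":
    for workers in (2, 8):
        limit = break_even_parallel_tuple_cost(10_000, 1_000_000, workers)
        print(f"workers={workers}: break-even parallel_tuple_cost = {limit:.5f}")
    raised = break_even_parallel_tuple_cost(
        10_000, 1_000_000, 2, cpu_tuple_cost=0.02)
    print(f"workers=2, cpu_tuple_cost=0.02: break-even = {raised:.5f}")
```

In this model the break-even value works out to cpu_tuple_cost * (1 - 1/divisor) - parallel_setup_cost / tuples, so it rises with more workers and with a higher cpu_tuple_cost, but it stays far below the default parallel_tuple_cost of 0.1. Because the expensive target-list function is not part of either cost at this stage, it cannot help the Gather path survive the comparison.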