Re: Asynchronous execution on FDW - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Asynchronous execution on FDW |
Date | |
Msg-id | 20150724.151059.102807210.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | Re: Asynchronous execution on FDW (Kouhei Kaigai <kaigai@ak.jp.nec.com>) |
Responses | Re: Asynchronous execution on FDW |
List | pgsql-hackers |
Hello,

At Thu, 23 Jul 2015 09:38:39 +0000, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote in <9A28C8860F777E439AA12E8AEA7694F80111BCEC@BPXM15GP.gisp.nec.co.jp>
> I expected workloads like single shot scan on a partitioned large
> fact table on DWH system. Yep, if workload is expected to rescan
> so frequently, its expected cost shall be higher (by the cost to
> launch bgworker) than existing Append, then planner will kick out
> this path.
>
> Regarding of interaction between Limit and ParallelMergeAppend,
> it is probably the best scenario, isn't it? If Limit picks up
> the least 1000rows from a partitioned table consists of 20 child
> tables, ParallelMergeAppend can launch 20 parallel jobs that
> picks up the least 1000rows from the child relations for each.
> Probably, it is same job done in pass_down_bound() of nodeLimit.c.

Yes, I was a bit confused. That scenario is one of the least
problematic cases.

> > As for ForeignScan, it is merely an API for FDW and does nothing
> > substantial so it would have nothing special to do. As for
> > postgres_fdw, current patch restricts one execution per one
> > foreign server at once by itself. We would have to provide
> > another execution management if we want to have two or more
> > simultaneous scans per one foreign server at once.
> >
> Yep, your 4th patch defines a new callback to FdwRoutines and
> 5th patch implements postgres_fdw specific portion.
> It shall work for distributed / shaded database environment well,
> however, its benefit is around ForeignScan only.
> Once management node kicks underlying SeqScan, ForeignScan or
> others in parallel, it also enables to run local heap scan
> asynchronously.

I suppose SeqScan doesn't need an async kick, since its startup cost
is next to nothing. (Would prefetching the first several pages boost
seqscans?) On the other hand, sort/hash would be an area where
asynchronous execution is effective.

> > Sorry for the focusless discussion but does this answer some of
> > your question?
> >
> Hmm... Its advantage is still unclear for me. However, it is not
> fair to hijack this thread by my idea.

It would be more advantageous once join/sort pushdown on FDW arrives,
where the start-up cost could be extremely high...

> I'll submit my design proposal about ParallelAppend towards the
> next commit-fest. Please comment on.

OK, I'll be there.

> > > Expected waste of CPU or I/O is common problem to be solved, however, it does
> > > not need to add a special case handling to ForeignScan, I think.
> > > How about your opinion?
> >
> > I agree with you that ForeignScan as the wrapper for FDWs don't
> > need anything special for the case. I suppose for now that
> > avoiding the penalty from abandoning too many speculatively
> > executed scans (or other works on bg worker like sorts) would be
> > a business of the upper node of FDWs, or somewhere else.
> >
> > However, I haven't dismissed the possibility that some common
> > works related to resource management could be integrated into
> > executor (or even into planner), but I see none for now.
> >
> I also agree with it is "eventually" needed, but may not be supported
> in the first version.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
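
For reference, the Limit over ParallelMergeAppend scenario quoted above (20 child tables, least 1000 rows) can be sketched roughly as below. This is only an illustration in Python, not executor code: the names child_top_n and limited_merge_append are hypothetical, a thread pool stands in for background workers, and in-memory lists stand in for child relations.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def child_top_n(rows, limit):
    """Return the `limit` smallest rows of one child, in ascending order.

    Hypothetical stand-in for a bounded sort on a child relation.
    """
    return heapq.nsmallest(limit, rows)


def limited_merge_append(children, limit):
    """Run a bounded top-N on every child concurrently, then merge.

    Each child needs to return at most `limit` rows, because no row beyond
    a child's own first `limit` can appear in the global first `limit`.
    """
    with ThreadPoolExecutor(max_workers=len(children)) as pool:
        partials = list(pool.map(lambda rows: child_top_n(rows, limit), children))
    # heapq.merge lazily merges the already-sorted partial results;
    # islice stops after `limit` rows, like a Limit node on top.
    return list(islice(heapq.merge(*partials), limit))


# 20 child "tables" of 5000 rows each, stored in descending order so that
# every child really has to select its smallest rows.
children = [[i * 20 + c for i in reversed(range(5000))] for c in range(20)]
result = limited_merge_append(children, 1000)
```

Every child does its own bounded work independently, so the jobs can be started at the same time; the upper node only merges at most 20 x 1000 presorted rows.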