Re: Asynchronous execution on FDW - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Asynchronous execution on FDW |
Date | |
Msg-id | 20150707.101935.28049720.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | Re: Asynchronous execution on FDW (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Asynchronous execution on FDW |
List | pgsql-hackers |
Hello, thank you for looking at this.

If it is acceptable to reconstruct the executor nodes to have an additional return state PREP_RUN or such (which means it needs one more call for the first tuple), I'll modify the whole executor to handle the state in the next patch. I haven't taken the advice I had so far in this sense. But I came to think that it is the most reasonable way to solve this.

======

> > - It was a problem when to give the first kick for async exec. It
> >   is not in ExecInit phase, and ExecProc phase does not fit,
> >   too. An extra phase ExecPreProc or something is too
> >   invasive. So I tried "pre-exec callback".
> >
> >   Any init-node can register callbacks on their turn, then the
> >   registered callbacks are called just before ExecProc phase in
> >   executor. The first patch adds functions and structs to enable
> >   this.
>
> At a quick glance, I think this has all the same problems as starting
> the execution at ExecInit phase. The correct way to do this is to kick
> off the queries in the first IterateForeignScan() call. You said that
> "ExecProc phase does not fit" - why not?

Execution nodes are expected to return the first tuple if available. But asynchronous execution cannot return the first tuple immediately. Simultaneous execution for the first tuple on every foreign node is more crucial than asynchronous fetching in many cases, especially for cases like sort/agg pushdown on FDW.

The reason why ExecProc does not fit is that a first loop that returns no tuple looks to impact too large a portion of the executor.

It is my mistake that the patch doesn't address the problem of parameterized paths. Parameterized paths should be executed within ExecProc loops, so this patch would be like the following.

- To gain the advantage of kicking execution before the first ExecProc loop, non-parameterized paths are started using the callback feature this patch provides.

- Parameterized paths need the upper nodes executed before they start execution, so they should be started in the ExecProc loop, but run asynchronously if possible.

This is rather a makeshift solution for the problem, but considering the current trend of parallelism, it might be the time to make the executor fit parallel execution.

If it is acceptable to reconstruct the executor nodes to have an additional return state PREP_RUN or such (which means it needs one more call for the first tuple), I'll modify the whole executor to handle the state in the next patch.

I hate my stupidity if you suggested this kind of solution by "do it in ExecProc" :(

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
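For readers following the thread, the two ideas in this message (a pre-exec callback that kicks non-parameterized scans before the first ExecProc loop, and a PREP_RUN return state for scans that can only be started inside the loop) can be sketched roughly as below. This is an illustrative toy, not code from the patch and not the PostgreSQL executor API: ScanNode, ExecStatus, EXEC_PREP_RUN, init_node, run_pre_exec_callbacks, exec_node and drain_node are all hypothetical names, and "sending the remote query" is simulated by setting a flag.

```c
#include <assert.h>

/* Hypothetical result codes. EXEC_PREP_RUN means "execution was just
 * started; call again to get the first tuple". */
typedef enum { EXEC_TUPLE, EXEC_PREP_RUN, EXEC_EOF } ExecStatus;

/* Hypothetical stand-in for a foreign scan node. */
typedef struct ScanNode {
    const char *name;
    int parameterized;  /* needs values from upper nodes before starting */
    int started;        /* has the (simulated) remote query been sent? */
    int remaining;      /* tuples still to be returned */
} ScanNode;

typedef void (*PreExecCallback)(ScanNode *node);

#define MAX_CALLBACKS 8
static PreExecCallback callback_fn[MAX_CALLBACKS];
static ScanNode *callback_arg[MAX_CALLBACKS];
static int ncallbacks = 0;

/* Simulates sending the query to the remote server without waiting. */
static void start_remote_query(ScanNode *node)
{
    node->started = 1;
}

/* Init phase: only non-parameterized nodes register an early kick. */
static void init_node(ScanNode *node)
{
    if (node->parameterized)
        return;
    assert(ncallbacks < MAX_CALLBACKS);
    callback_fn[ncallbacks] = start_remote_query;
    callback_arg[ncallbacks] = node;
    ncallbacks++;
}

/* Called once, after all nodes are initialized and before the first
 * exec_node() call, so every registered scan starts at the same time. */
static void run_pre_exec_callbacks(void)
{
    for (int i = 0; i < ncallbacks; i++)
        callback_fn[i](callback_arg[i]);
    ncallbacks = 0;
}

/* Exec phase: a node that was not kicked early starts itself on the
 * first call and reports EXEC_PREP_RUN instead of a tuple. */
static ExecStatus exec_node(ScanNode *node)
{
    if (!node->started) {
        start_remote_query(node);
        return EXEC_PREP_RUN;
    }
    if (node->remaining > 0) {
        node->remaining--;
        return EXEC_TUPLE;
    }
    return EXEC_EOF;
}

/* Pulls a node to completion; returns the number of tuples produced and
 * adds the number of EXEC_PREP_RUN results seen to *prep_runs. */
static int drain_node(ScanNode *node, int *prep_runs)
{
    int tuples = 0;
    for (;;) {
        ExecStatus st = exec_node(node);
        if (st == EXEC_EOF)
            break;
        if (st == EXEC_PREP_RUN)
            (*prep_runs)++;
        else
            tuples++;
    }
    return tuples;
}
```

In this sketch a caller would run init_node() on every node, then run_pre_exec_callbacks() once, then pull tuples with exec_node(). A non-parameterized node never reports EXEC_PREP_RUN because it was already started by the callback, while a parameterized node reports it exactly once, which is the "one more call for the first tuple" cost that every upper node would have to learn to handle.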