Re: asynchronous execution - Mailing list pgsql-hackers

From:           Kyotaro HORIGUCHI
Subject:        Re: asynchronous execution
Msg-id:         20161018.103051.30820907.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Re: asynchronous execution (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses:      Re: asynchronous execution (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List:           pgsql-hackers
Hello, this works, but ExecAppend suffers a bit of degradation.

At Mon, 03 Oct 2016 19:46:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20161003.194632.204401048.horiguchi.kyotaro@lab.ntt.co.jp>
> > Some notes:
> >
> > - EvalPlanQual rechecks are broken.

This is fixed by adding (restoring) async-cancelation.

> > - EXPLAIN ANALYZE instrumentation is broken.

EXPLAIN ANALYZE seems to be working, but async-specific information is not available yet.

> > - ExecReScanAppend is broken, because the async stuff needs some way
> >   of canceling an async request and I didn't invent anything like that
> >   yet.

Fixed in the same way as EvalPlanQual.

> > - The postgres_fdw changes pretend to be async but aren't actually.
> >   It's just a demo of (part of) the interface at this point.

Applied my previous patch with some modifications.

> > - The postgres_fdw changes also report all pg-fdw paths as
> >   async-capable, but actually the direct-modify ones aren't, so the
> >   regression tests fail.

All actions other than scans do vacate_connection() to use a connection.

> > - Errors in the executor can leak the WaitEventSet. Probably we need
> >   to modify ResourceOwners to be able to own WaitEventSets.

The WaitEventSet itself is not leaked, but the epoll fd should be closed on failure. This seems doable by TRY-CATCHing in ExecAsyncEventLoop. (not done yet)

> > - There are probably other bugs, too.
> >
> > Whee!
> >
> > Note that I've tried to solve the re-entrancy problems by (1) putting
> > all of the event loop's state inside the EState rather than in local
> > variables and (2) having the function that is called to report arrival
> > of a result be thoroughly different than the function that is used to
> > return a tuple to a synchronous caller.
> >
> > Comments welcome, if you're feeling brave enough to look at anything
> > this half-baked.

This doesn't cause reentry, since it no longer bubbles tuples up through async-unaware nodes.
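The two separate entry points mentioned in the quoted note -- one for a synchronous caller requesting a tuple, a thoroughly different one through which the event loop reports an arrived result -- can be sketched as a toy model. All struct and function names below are invented for illustration; this is not the patch's actual code, only the shape of the idea: the result travels through the request struct itself (a private channel between requestor and requestee), never up through unrelated plan nodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an async request; all names are invented. */
typedef struct SketchAsyncRequest
{
    bool        complete;   /* has a result been delivered? */
    int         result;     /* stands in for the returned tuple */
} SketchAsyncRequest;

/* Entry point 1: the requestor asks; the requestee may answer
 * immediately (fast path) or go asynchronous. */
static bool
sketch_async_request(SketchAsyncRequest *areq, bool ready_now, int value)
{
    if (ready_now)
    {
        areq->result = value;
        areq->complete = true;
        return true;        /* tuple immediately available */
    }
    areq->complete = false;
    return false;           /* caller must go through the event loop */
}

/* Entry point 2: the event loop delivers a late result.  Note this is
 * deliberately not the function a synchronous caller uses, which is
 * what avoids re-entering async-unaware nodes. */
static void
sketch_async_response(SketchAsyncRequest *areq, int value)
{
    areq->result = value;
    areq->complete = true;
}
```

In the real patch the corresponding state lives in the EState and the request structures, not in local variables, for the re-entrancy reasons quoted above.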
This framework passes tuples through private channels between requestors and requestees.

Anyway, I amended this and made postgres_fdw async, and then finally all regression tests passed with minor modifications. The attached patches are the following:

0001-robert-s-2nd-framework.patch
    The patch Robert showed upthread.

0002-Fix-some-bugs.patch
    A small patch to fix compilation errors in 0001.

0003-Modify-async-execution-infrastructure.patch
    Several modifications to the infrastructure. The details are
    shown after the measurement below.

0004-Make-postgres_fdw-async-capable.patch
    True-async postgres_fdw.

gentblr.sql, testrun.sh, calc.pl
    Performance test script suite: gentblr.sql creates the test
    tables, testrun.sh does a single test run, and calc.pl drives
    testrun.sh and summarizes its results.

I measured performance and had the following results.

t0  - SELECT sum(a) FROM <local single table>;
pl  - SELECT sum(a) FROM <4 local children>;
pf0 - SELECT sum(a) FROM <4 foreign children on a single connection>;
pf1 - SELECT sum(a) FROM <4 foreign children on dedicated connections>;

The results are written as "time <ms> (stddev <ms>)".

sync
  t0:  3820.33 (  1.88)
  pl:  1608.59 ( 12.06)
  pf0: 7928.29 ( 46.58)
  pf1: 8023.16 ( 26.43)

async
  t0:  3806.31 (  4.49)   0.4% faster (should be within error)
  pl:  1629.17 (  0.29)   1.3% slower
  pf0: 6447.07 ( 25.19)  18.7% faster
  pf1: 1876.80 ( 47.13)  76.6% faster

t0 is not affected since the ExecProcNode stuff has gone. pl gets a bit slower (almost the same as the simple seqscan case with the previous patch); this should be a misprediction penalty. pf0 and pf1 are faster, as expected.

========

The following is a summary of the modifications made by the 0002 and 0003 patches.

execAsync.c, execnodes.h:

- Added #include "pgstat.h" to use WAIT_EVENT_ASYNC_WAIT.

- Changed the interface of ExecAsyncRequest to return whether a tuple
  is immediately available or not.

- Made ExecAsyncConfigureWait return whether it registered at least
  one wait event or not.
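The ConfigureWait contract can be sketched with a toy model (all names below are invented, not the patch's code): nodes sharing one connection can register only one wait event between them, and the caller must confirm that at least one event was registered before waiting, because waiting on an empty set would block forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a shared connection; only one wait event may be
 * registered per connection at a time. */
typedef struct SketchConn
{
    bool        event_registered;
} SketchConn;

/* Toy analogue of a node's ConfigureWait callback: returns whether it
 * actually added a wait event for this connection. */
static bool
sketch_configure_wait(SketchConn *conn)
{
    if (conn->event_registered)
        return false;           /* a sibling sharing this conn already did */
    conn->event_registered = true;
    return true;
}

/* Caller side, mirroring the "for safety" check: poll every node, and
 * only proceed to wait if at least one event was registered. */
static bool
sketch_safe_to_wait(SketchConn *conn, int nnodes)
{
    bool        any = false;

    for (int i = 0; i < nnodes; i++)
        any |= sketch_configure_wait(conn);
    return any;                 /* false would mean: do not wait at all */
}
```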
This return value lets the caller (ExecAsyncEventWait) know whether there is any event to wait for (for safety). If two or more postgres_fdw nodes share one connection, only one of them can be waited on at a time. It is the FDW driver's responsibility to ensure that at least one wait event is added; on failure, WaitEventSetWait silently waits forever.

- There were separate areq->callback_pending and areq->request_complete
  flags, but they always change together, so they are replaced with one
  state variable, areq->state. A new enum AsyncRequestState for
  areq->state is added in execnodes.h.

nodeAppend.c:

- Return a tuple immediately if ExecAsyncRequest says that one is
  available.

- Reduced the nesting level of the for(;;) loop.

nodeForeignscan.[ch], fdwapi.h, execProcnode.c:

- Calling postgresIterateForeignScan directly can yield tuples in the
  wrong shape. Call ExecForeignScan instead.

- Changed the interface of AsyncConfigureWait as in execAsync.c.

- Added a ShutdownForeignScan interface.

createplan.c, ruleutils.c, plannodes.h:

- With Robert's change, EXPLAIN shows somewhat odd plans in which the
  Output of an Append is named after a non-parent child. This does no
  harm but is confusing, so I added the index of the parent in
  Append.referent to make it reasonable (but this looks ugly..). The
  children in EXPLAIN output are still in a different order from the
  definition. (expected/postgres_fdw.out is edited.)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
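P.S. The consolidation of areq->callback_pending and areq->request_complete into a single areq->state can be illustrated with a toy model. AsyncRequestState and the two old member names come from the description above; the enum value names and the helper function below are guesses, not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: two flags that always change together. */
typedef struct OldAsyncRequest
{
    bool        callback_pending;
    bool        request_complete;
} OldAsyncRequest;

/* After: one state variable.  The patch adds an AsyncRequestState enum
 * in execnodes.h; the exact value names here are invented. */
typedef enum AsyncRequestStateSketch
{
    ASYNCREQ_IDLE,              /* no request outstanding */
    ASYNCREQ_WAITING,           /* request made, callback pending */
    ASYNCREQ_COMPLETE           /* result delivered */
} AsyncRequestStateSketch;

/* Mapping the old flag pairs onto the single state variable; the
 * point is that only these three combinations ever occur. */
static AsyncRequestStateSketch
sketch_flags_to_state(const OldAsyncRequest *old)
{
    if (old->callback_pending)
        return ASYNCREQ_WAITING;
    if (old->request_complete)
        return ASYNCREQ_COMPLETE;
    return ASYNCREQ_IDLE;
}
```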