Re: asynchronous execution - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: asynchronous execution
Date
Msg-id 20161018.103051.30820907.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: asynchronous execution  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: asynchronous execution  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Hello, this works but ExecAppend gets a bit degradation.

At Mon, 03 Oct 2016 19:46:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20161003.194632.204401048.horiguchi.kyotaro@lab.ntt.co.jp>
> > Some notes:
> > 
> > - EvalPlanQual rechecks are broken.

This is fixed by adding (restoring) async-cancelation.

> > - EXPLAIN ANALYZE instrumentation is broken.

EXPLAIN ANALYE seems working but async-specific information is
not available yet.

> > - ExecReScanAppend is broken, because the async stuff needs some way
> > of canceling an async request and I didn't invent anything like that
> > yet.

Fixed as EvalPlanQual.

> > - The postgres_fdw changes pretend to be async but aren't actually.
> > It's just a demo of (part of) the interface at this point.

Applied my previous patch with some modification.

> > - The postgres_fdw changes also report all pg-fdw paths as
> > async-capable, but actually the direct-modify ones aren't, so the
> > regression tests fail.

All actions other than scan does vacate_connection() to use a
connection.

> > - Errors in the executor can leak the WaitEventSet.  Probably we need
> > to modify ResourceOwners to be able to own WaitEventSets.

WaitEventSet itself is not leaked but epoll-fd should be closed
at failure. This seems doable with TRY-CATCHing in
ExecAsyncEventLoop. (not yet)

> > - There are probably other bugs, too.
> > 
> > Whee!
> > 
> > Note that I've tried to solve the re-entrancy problems by (1) putting
> > all of the event loop's state inside the EState rather than in local
> > variables and (2) having the function that is called to report arrival
> > of a result be thoroughly different than the function that is used to
> > return a tuple to a synchronous caller.
> > 
> > Comments welcome, if you're feeling brave enough to look at anything
> > this half-baked.

This doesn't cause reentry since this no longer bubbles up
tupples through async-unaware nodes. This framework passes tuples
through private channels for requestor and requestees.

Anyway, I amended this and made postgres_fdw async and then
finally all regtests passed with minor modifications. The
attached patches are the following.

0001-robert-s-2nd-framework.patchThe patch Robert shown upthread

0002-Fix-some-bugs.patchA small patch to fix complation errors of 0001.

0003-Modify-async-execution-infrastructure.patchSeveral modifications on the infrastructure. The details areshown after
themeasurement below.
 

0004-Make-postgres_fdw-async-capable.patchTrue-async postgres-fdw.

gentblr.sql, testrun.sh, calc.plPerformance test script suite.
  gentblr.sql - creates test tables.  testrun.sh - does single test run and  calc.pl - drives testrunc.sh and summirize
itsresults.
 


I measured performance and had the following result.

t0  - SELECT sum(a) FROM <local single table>;
pl  - SELECT sum(a) FROM <4 local children>;
pf0 - SELECT sum(a) FROM <4 foreign children on single connection>;
pf1 - SELECT sum(a) FROM <4 foreign children on dedicate connections>;

The result is written as "time<ms> (std dev <ms>)"

sync t0: 3820.33 (  1.88) pl: 1608.59 ( 12.06)pf0: 7928.29 ( 46.58)pf1: 8023.16 ( 26.43)

async t0: 3806.31 (  4.49)    0.4% faster (should be error) pl: 1629.17 (  0.29)    1.3% slowerpf0: 6447.07 ( 25.19)
18.7%fasterpf1: 1876.80 ( 47.13)   76.6% faster
 

t0 is not affected since the ExecProcNode stuff has gone.

pl is getting a bit slower. (almost the same to simple seqscan of
the previous patch) This should be a misprediction penalty.

pf0, pf1 are faster as expected.


========

The below is a summary of modifications made by 0002 and 0003 patch.

execAsync.c, execnodes.h:
 - Added include "pgstat.h" to use WAIT_EVENT_ASYNC_WAIT.
 - Changed the interface of ExecAsyncRequest to return if a tuple is   immediately available or not.
 - Made ExecAsyncConfigureWait to return if it registered at least   one waitevent or not. This is used to know the
caller  (ExecAsyncEventWait) has a event to wait (for safety).
 
   If two or more postgres_fdw nodes are sharing one connection,   only one of them can be waited at once. It is a
responsibilityto the FDW drivers to ensure at least one wait   event to be added but on failure WaitEventSetWait
silently  waits forever.
 
 - There were separate areq->callback_pending and   areq->request_complete but they are altering together so they are
replacedwith one state variable areq->state. New enum   AsyncRequestState for areq->state in execnodes.h.
 


nodeAppend.c:
 - Return a tuple immediately if ExecAsyncRequest says that a   tuple is available.
 - Reduced nest level of for(;;).

nodeForeignscan.[ch], fdwapi.h, execProcnode.c::
 - Calling postgresIterateForeignScan can yield tuples in wrong   shape. Call ExecForeignScan instead.
 - Changed the interface of AsyncConfigureWait as execAsync.c.
 - Added ShutdownForeignScan interface.

createplan.c, ruleutils.c, plannodes.h:
 - With the Rebert's change, explain shows somewhat odd plans   where the Output of Append is named after non-parent
child.This does not harm but uneasy. Added index of the   parent in Append.referent to make it reasoable. (But this
looksugly..). Still children in explain are in different   order from definition. (expected/postgres_fdw.out is
edited)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: emergency outage requiring database restart
Next
From: Gavin Flower
Date:
Subject: Re: emergency outage requiring database restart