Re: Asynchronous execution on FDW - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Asynchronous execution on FDW
Date
Msg-id 20150710.163231.255890296.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Asynchronous execution on FDW  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Asynchronous execution on FDW
List pgsql-hackers
Hello, This is the new version of this patch.

At Tue, 07 Jul 2015 10:19:35 +0900, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote
> This is rather a makeshift solution for the problem, but
> considering current trend of parallelism, it might the time to
> make the executor to fit parallel execution.
> 
> If it is acceptable to reconstruct the executor nodes to have
> additional return state PREP_RUN or such (which means it needs
> one more call for the first tuple) , I'll modify the whole
> executor to handle the state in the next patch to do so.

I made a patchset to do this. The details of it and some examples
are shown after the summary below.

- I provided an infrastructure for asynchronous (simultaneous) execution of multiple execnodes belonging one node, like
joins.

- It (should) have addressed the "parameterized plan" problem.

- The infrastructure is a bit intrusive but simple, and it will be usable by any nodes that supports asynchronous
execution(none so far except fdw, needs some modification in core). So the async exec for Postgres-FDW now became an
exapmlefor the infrastructure. It might be nice to start backend worker for promising async resuest for a sort node.
 

- The postgres_fdw part is almost the same as the previous one.


The detailed explanation of the patchset follows.

============

I made a patchset to do this. It consists of five patches (plus
one for debug message).

1. Add infrastructure for run-state of executor node.
Currently executor nodes have binary run-states, one is!TupIsNull(slot) which indicates that the next tuple may
comefromthe node, and the another is TupIsNull(slot) whichindicates that no more tuple will be come.
 
This patch expands it to four-state and have the value inPlanState struct.
  Inited : it is just after initialized.
  Started: it is startd execution but no tuple retrieved. This           could be skipped.
  Running: it is returning tuples.
  Done   : it has no more tuple to return. This is equivalent to           TupIsNull(slot).
The nodes Group, ModifyTable, SetOp and WindowAgg had their ownstate flag replaceable by the new states in their own
*Statepartso they are moved to this new state set in this patch. Thispatch does not change the current behavior.
 


2. Change all tuple-returning execnodes to maintain the new  run-state appropriately.
The rest nodes are modified by this patch to maintain the stateto be consistent with the TupIsNull() state at the
ExecProcNodelevel. This patch does not change the current behavior, too. (Ifeel that the state Done would be no other
thanan encumbrancein maintenance. The state is not referred in nowhere)
 

3. Add a feature to start node asynchronously.
All nodes that have more than one child node can execute thechildren asynchronously by this patch. It tries start
childrenasynchronouslyif the state is "Inited" when entering Exec*functions. Async request for nodes which has just one
childissimply propagated to the child, and leaf nodes such as scanswill decide whether to be async or not. Currently no
leafnodecan be async except postgres_fdw.
 
NestLoop may run parameterized plan so it is specially treatedin StartNestLoop so that parameterized plans will not
beasynchronouslystarted.
 
In StartHashJoin, whether the inner (hash) node is executed ornot is judged by the similar logic with ExecHashJoin.
Even after this patch applied, no leaf node can startasynchronously so the behavior of the executor still beunchanged.


4. Add StartForeignScan to FdwRoutine
Add new entry function to accept the asynchronous executionrequest from the core.


5. Allow asynchronous remote query of postgres_fdw.
This is almost the same as the previous version. Except that itruns on the new infrastructure, and added new
server/foreigntableoption allow_async.
 
The first foreign scan on the same server will be asynchronouslystarted execution if requested. And apart from the
asyncstart,every successive fetches for the same foreign scan will beasynchronously fetched.
 

Currently there's no means to observe what it is doing from
outside, so the additional sixth patch is to output debug
messages about asynchronous execution.

However, currently it is no test code for that but I'm at a loss
what to do as the test..


FWIW I provided two exaples of running asynchronous exexution.

regards,


===== Example
CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE SERVER sv2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER sv1;
CREATE USER MAPPING FOR CURRENT_USER SERVER sv2;
CREATE TABLE lp (a int, b int);
CREATE TABLE lt1 () INHERITS (lp);
CREATE TABLE lt2 () INHERITS (lp);
CREATE TABLE lt3 () INHERITS (lp);
CREATE TABLE lt4 () INHERITS (lp);
CREATE TABLE fp (LIKE lp);
CREATE FOREIGN TABLE ft1 () INHERITS (fp) SERVER sv1 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft2 () INHERITS (fp) SERVER sv1 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft3 () INHERITS (fp) SERVER sv2 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft4 () INHERITS (fp) SERVER sv2 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, a FROM generate_series(0, 999) a);
INSERT INTO lt2 (SELECT a+1000, a FROM generate_series(0, 999) a);
INSERT INTO lt3 (SELECT a+2000, a FROM generate_series(0, 999) a);
INSERT INTO lt4 (SELECT a+3000, a FROM generate_series(0, 999) a);

;; TEST FOR SIMPLE APPEND
=# SELECT * FROM fp;
1  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async exec started.
2  LOG:  pg_fdw: [ft2/sv1/0x293a580] Async exec denied.
3  LOG:  pg_fdw: [ft3/sv2/0x2898c70] Async exec started.
4  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Async exec denied.
5  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async fetch  ....
6  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async fetch
7  LOG:  pg_fdw: [ft2/sv1/0x293a580] Sync fetch.
8  LOG:  pg_fdw: [ft2/sv1/0x293a580] Async fetch  ...
9  LOG:  pg_fdw: [ft2/sv1/0x293a580] Async fetch
10 LOG:  pg_fdw: [ft3/sv2/0x2898c70] Async fetch  ....
11 LOG:  pg_fdw: [ft3/sv2/0x2898c70] Async fetch
12 LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
14 LOG:  pg_fdw: [ft4/sv2/0x2898c70] Async fetch  ...
15 LOG:  pg_fdw: [ft4/sv2/0x2898c70] Async fetch

;;  The notation inside the square bracket is
;;       <table name>/<server name>/<ponter of connection>.
;;
;;  1-4 foreign servers denied async for the second scan for each (ft2/ft4).
;;
;; At 7, reading different table from 6 made it sync fetch but
;;   the successive fetches afterward are async.
;;
;; ft2 and ft3 was on different server so 10 is async fetch for
;;   the query executed asynchronously at 3.
;;
;; At 12 the same thing to 7 occurred.


;; TEST FOR PARAMETERIZED NESTLOOP
=# SET enable_hashjoin TO false;
=# SET enable_mergejoin TO false;
=# SET enable_material TO false;
=# ALTER FOREIGN TABLE ft4 OPTIONS (ADD use_remote_estimate 'true');
=# SELECT ft4.a FROM ft1 JOIN ft4 ON ft1.b = ft4.b WHERE ft1.a BETWEEN 800 AND 1000;
1  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async exec started.
2  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async fetch
3  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
4  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.  ...
5  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
6  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async fetch
7  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.  ...
8  LOG:  pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
9  LOG:  pg_fdw: [ft1/sv1/0x293a580] Async fetch

;; ft4 did not even try to async since the inner(ft4) is parameterized.
;; All fetches for inner(ft4) was executed synchronously.
;;
;; Meanwhile, ft1 was continuously reading asynchronously.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

Previous
From: Sawada Masahiko
Date:
Subject: Re: Freeze avoidance of very large table.
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Asynchronous execution on FDW