Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Asynchronous Append on postgres_fdw nodes. |
Date | |
Msg-id | CAPmGK16+y8mEX9AT1LXVLksbTyDnYWZXm0uDxZ8bza153Wey9A@mail.gmail.com Whole thread Raw |
In response to | Re: Asynchronous Append on postgres_fdw nodes. (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Asynchronous Append on postgres_fdw nodes.
|
List | pgsql-hackers |
On Thu, Aug 20, 2020 at 4:36 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > This is rebased version. Thanks for the rebased version! > Come to think of "complex", ExecAsync stuff in this patch might be > too-much for a short-term solution until executor overhaul, if it > comes shortly. (the patch of mine here as a whole is like that, > though..). The queueing stuff in postgres_fdw is, too. Here are some review comments on “ExecAsync stuff” (the 0002 patch): @@ -192,10 +196,20 @@ ExecInitAppend(Append *node, EState *estate, int eflags) i = -1; while ((i = bms_next_member(validsubplans, i)) >= 0) { Plan *initNode = (Plan *) list_nth(node->appendplans, i); + int sub_eflags = eflags; + + /* Let async-capable subplans run asynchronously */ + if (i < node->nasyncplans) + { + sub_eflags |= EXEC_FLAG_ASYNC; + nasyncplans++; + } This would be more ambitious than Thomas’ patch: his patch only allows foreign scan nodes beneath an Append node to be executed asynchronously, but your patch allows any plan nodes beneath it (e.g., local child joins between foreign tables). Right? I think that would be great, but I’m not sure how we execute such plan nodes asynchronously as other parts of your patch seem to assume that only foreign scan nodes beneath an Append are considered as async-capable. Maybe I’m missing something, though. Could you elaborate on that? Your patch (and the original patch by Robert [1]) modified ExecAppend() so that it can process local plan nodes while waiting for the results from remote queries, which would be also a feature that’s not supported by Thomas’ patch, but I’d like to know performance results. Did you do performance testing on that? I couldn’t find that from the archive. +bool +is_async_capable_path(Path *path) +{ + switch (nodeTag(path)) + { + case T_ForeignPath: + { + FdwRoutine *fdwroutine = path->parent->fdwroutine; + + Assert(fdwroutine != NULL); + if (fdwroutine->IsForeignPathAsyncCapable != NULL && + fdwroutine->IsForeignPathAsyncCapable((ForeignPath *) path)) + return true; + } Do we really need to introduce the FDW API IsForeignPathAsyncCapable()? I think we could determine whether a foreign path is async-capable, by checking whether the FDW has the postgresForeignAsyncConfigureWait() API. In relation to the first comment, I noticed this change in the postgres_fdw regression tests: HEAD: EXPLAIN (VERBOSE, COSTS OFF) SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1; QUERY PLAN ------------------------------------------------------------------------ Sort Output: t1.a, (count(((t1.*)::pagg_tab))) Sort Key: t1.a -> Append -> HashAggregate Output: t1.a, count(((t1.*)::pagg_tab)) Group Key: t1.a Filter: (avg(t1.b) < '22'::numeric) -> Foreign Scan on public.fpagg_tab_p1 t1 Output: t1.a, t1.*, t1.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 -> HashAggregate Output: t1_1.a, count(((t1_1.*)::pagg_tab)) Group Key: t1_1.a Filter: (avg(t1_1.b) < '22'::numeric) -> Foreign Scan on public.fpagg_tab_p2 t1_1 Output: t1_1.a, t1_1.*, t1_1.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 -> HashAggregate Output: t1_2.a, count(((t1_2.*)::pagg_tab)) Group Key: t1_2.a Filter: (avg(t1_2.b) < '22'::numeric) -> Foreign Scan on public.fpagg_tab_p3 t1_2 Output: t1_2.a, t1_2.*, t1_2.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 (25 rows) Patched: EXPLAIN (VERBOSE, COSTS OFF) SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1; QUERY PLAN ------------------------------------------------------------------------ Sort Output: t1.a, (count(((t1.*)::pagg_tab))) Sort Key: t1.a -> HashAggregate Output: t1.a, count(((t1.*)::pagg_tab)) Group Key: t1.a Filter: (avg(t1.b) < '22'::numeric) -> Append Async subplans: 3 -> Async Foreign Scan on public.fpagg_tab_p1 t1_1 Output: t1_1.a, t1_1.*, t1_1.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 -> Async Foreign Scan on public.fpagg_tab_p2 t1_2 Output: t1_2.a, t1_2.*, t1_2.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 -> Async Foreign Scan on public.fpagg_tab_p3 t1_3 Output: t1_3.a, t1_3.*, t1_3.b Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 (18 rows) So, your patch can only handle foreign scan nodes beneath an Append for now? Anyway, I think this would lead to the improved efficiency, considering performance results from Movead [2]. And I think planner changes to make this happen would be a good thing in your patch. That’s all I have for now. Sorry for the delay. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com [2] https://www.postgresql.org/message-id/2020011417113872105895%40highgo.ca
pgsql-hackers by date: