Re: Defer selection of asynchronous subplans until the executor initialization stage - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Defer selection of asynchronous subplans until the executor initialization stage
Msg-id CAPmGK15AshMEtgUqs3yZfD5mzW87wyn=_uAXbBeRLG0nod7vHA@mail.gmail.com
In response to Re: Defer selection of asynchronous subplans until the executor initialization stage  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Responses Re: Defer selection of asynchronous subplans until the executor initialization stage
List pgsql-hackers
On Wed, Jun 30, 2021 at 1:50 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> I have completely rewritten this patch.
>
> Main idea:
>
> The async_capable field of a plan node inform us that this node could
> work in async mode. Each node sets this field based on its own logic.
> The actual mode of a node is defined by the async_capable of PlanState
> structure. It is made at the executor initialization stage.
> In this patch, only an append node could define async behaviour for its
> subplans.

I finally reviewed the patch.  One thing I noticed about the patch is
that it would break ordered Appends.  Here is such an example using
the patch:

create table pt (a int) partition by range (a);
create table loct1 (a int);
create table loct2 (a int);
create foreign table p1 partition of pt for values from (10) to (20)
server loopback1 options (table_name 'loct1');
create foreign table p2 partition of pt for values from (20) to (30)
server loopback2 options (table_name 'loct2');

explain verbose select * from pt order by a;
                       QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=200.00..440.45 rows=5850 width=4)
   ->  Async Foreign Scan on public.p1 pt_1  (cost=100.00..205.60 rows=2925 width=4)
         Output: pt_1.a
         Remote SQL: SELECT a FROM public.loct1 ORDER BY a ASC NULLS LAST
   ->  Async Foreign Scan on public.p2 pt_2  (cost=100.00..205.60 rows=2925 width=4)
         Output: pt_2.a
         Remote SQL: SELECT a FROM public.loct2 ORDER BY a ASC NULLS LAST
(7 rows)

This would not always return tuples in the required order, because
async execution returns them from the subplans in whatever order they
happen to arrive.  I think doing the planning work at execution time
would be not only too late but also inefficient (consider executing
generic plans!), so I think we should avoid doing so.  (The cost of
doing that work for simple foreign scans is small, but if we support
async execution for upper plan nodes such as NestLoop, as discussed
before, I think the cost for such plan nodes would no longer be small.)
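
To illustrate the ordering problem described above, here is a minimal
sketch that is not PostgreSQL code.  It assumes two hypothetical row
sets for p1 and p2, each already sorted by the remote ORDER BY, and it
models async arrival as a simple round-robin, which is only one of many
possible arrival orders.  The function names are made up for
illustration.

```python
from itertools import chain, zip_longest

# Hypothetical per-partition results, each already sorted remotely.
p1_rows = [10, 12, 15]   # partition p1: values in [10, 20)
p2_rows = [20, 21, 25]   # partition p2: values in [20, 30)


def sync_append(subplans):
    """Drain each subplan completely before moving on to the next one."""
    return list(chain.from_iterable(subplans))


def async_append(subplans):
    """Emit rows as they 'arrive'; arrival is modeled here as round-robin."""
    missing = object()
    out = []
    for batch in zip_longest(*subplans, fillvalue=missing):
        out.extend(row for row in batch if row is not missing)
    return out


def is_sorted(rows):
    """Return True if rows are in non-decreasing order."""
    return all(a <= b for a, b in zip(rows, rows[1:]))


sync_result = sync_append([p1_rows, p2_rows])
async_result = async_append([p1_rows, p2_rows])

print(sync_result, is_sorted(sync_result))    # [10, 12, 15, 20, 21, 25] True
print(async_result, is_sorted(async_result))  # [10, 20, 12, 21, 15, 25] False
```

Draining the subplans one after another preserves the order because the
partition bounds do not overlap, whereas interleaving them does not.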

So that the executor just executes what was planned, I think we should
return to the patch in [1].  That patch was created for Horiguchi-san’s
async-execution patch, so I modified it to work with HEAD and added a
simplified version of your test cases.  Please find the patch attached.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/7fe10f95-ac6c-c81d-a9d3-227493eb9055@postgrespro.ru
