Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Asynchronous Append on postgres_fdw nodes.
Date
Msg-id CAPmGK15c--uUJfGqnxKn9U8S3EfvSTRGKKquM05PV-2UpfMm8w@mail.gmail.com
Whole thread Raw
In response to Re: Asynchronous Append on postgres_fdw nodes.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Asynchronous Append on postgres_fdw nodes.  (Etsuro Fujita <etsuro.fujita@gmail.com>)
List pgsql-hackers
On Mon, Sep 28, 2020 at 10:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Sat, 26 Sep 2020 19:45:39 +0900, Etsuro Fujita <etsuro.fujita@gmail.com> wrote in
> > Here are some review comments on “ExecAsync stuff” (the 0002 patch):
> >
> > @@ -192,10 +196,20 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
> >
> >     i = -1;
> >     while ((i = bms_next_member(validsubplans, i)) >= 0)
> >     {
> >         Plan       *initNode = (Plan *) list_nth(node->appendplans, i);
> > +       int         sub_eflags = eflags;
> > +
> > +       /* Let async-capable subplans run asynchronously */
> > +       if (i < node->nasyncplans)
> > +       {
> > +           sub_eflags |= EXEC_FLAG_ASYNC;
> > +           nasyncplans++;
> > +       }
> >
> > This would be more ambitious than Thomas’ patch: his patch only allows
> > foreign scan nodes beneath an Append node to be executed
> > asynchronously, but your patch allows any plan nodes beneath it (e.g.,
> > local child joins between foreign tables).  Right?  I think that would
>
> Right. It is intended to work any place,

> > be great, but I’m not sure how we execute such plan nodes
> > asynchronously as other parts of your patch seem to assume that only
> > foreign scan nodes beneath an Append are considered as async-capable.
> > Maybe I’m missing something, though.  Could you elaborate on that?
>
> Right about this patch.  As a trial at hand, in my faint memory, some
> join methods and some aggregaioion can be async-aware but they are not
> included in this patch not to bloat it with more complex stuff.

Yeah.  I’m concerned about what was discussed in [1] as well.  I think
it would be better only to allow foreign scan nodes beneath an Append,
as in Thomas’ patch (and the original patch by Robert), at least in
the first cut of this feature.

BTW: I noticed that you changed the ExecProcNode() API so that an
Append calling FDWs can know wether they return tuples immediately or
not:

+   while ((i = bms_first_member(needrequest)) >= 0)
+   {
+       TupleTableSlot *slot;
+       PlanState *subnode = node->appendplans[i];
+
+       slot = ExecProcNode(subnode);
+       if (subnode->asyncstate == AS_AVAILABLE)
+       {
+           if (!TupIsNull(slot))
+           {
+               node->as_asyncresult[node->as_nasyncresult++] = slot;
+               node->as_needrequest = bms_add_member(node->as_needrequest, i);
+           }
+       }
+       else
+           node->as_pending_async = bms_add_member(node->as_pending_async, i);
+   }

In the case of postgres_fdw:

 /*
  * postgresIterateForeignScan
- *     Retrieve next row from the result set, or clear tuple slot to indicate
- *     EOF.
+ *     Retrieve next row from the result set.
+ *
+ *     For synchronous nodes, returns clear tuple slot means EOF.
+ *
+ *     For asynchronous nodes, if clear tuple slot is returned, the caller
+ *     needs to check async state to tell if all tuples received
+ *     (AS_AVAILABLE) or waiting for the next data to come (AS_WAITING).
  */

That is, 1) in postgresIterateForeignScan() postgres_fdw sets the new
PlanState’s flag asyncstate to AS_AVAILABLE/AS_WAITING depending on
whether it returns a tuple immediately or not, and then 2) the Append
knows that from the new flag when the callback routine returns.  I’m
not sure this is a good idea, because it seems likely that the
ExecProcNode() change would affect many other places in the executor,
making maintenance and/or future development difficult.  I think the
FDW callback routines proposed in the original patch by Robert would
provide a cleaner way to do asynchronous execution of FDWs without
changing the ExecProcNode() API, IIUC:

+On the other hand, nodes that wish to produce tuples asynchronously
+generally need to implement three methods:
+
+1. When an asynchronous request is made, the node's ExecAsyncRequest callback
+will be invoked; it should use ExecAsyncSetRequiredEvents to indicate the
+number of file descriptor events for which it wishes to wait and whether it
+wishes to receive a callback when the process latch is set. Alternatively,
+it can instead use ExecAsyncRequestDone if a result is available immediately.
+
+2. When the event loop wishes to wait or poll for file descriptor events and
+the process latch, the ExecAsyncConfigureWait callback is invoked to configure
+the file descriptor wait events for which the node wishes to wait.  This
+callback isn't needed if the node only cares about the process latch.
+
+3. When file descriptors or the process latch become ready, the node's
+ExecAsyncNotify callback is invoked.

What is the reason for not doing like this in your patch?

Thanks for the explanation!

Best regards,
Etsuro Fujita

[1]
https://www.postgresql.org/message-id/CA%2BTgmoYrbgTBnLwnr1v%3Dpk%2BC%3DznWg7AgV9%3DM9ehrq6TDexPQNw%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: BLOB / CLOB support in PostgreSQL
Next
From: John Naylor
Date:
Subject: Re: WIP: BRIN multi-range indexes