Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Asynchronous Append on postgres_fdw nodes.
Msg-id 20200619.120540.1819958483484550401.horikyota.ntt@gmail.com
In response to Re: Asynchronous Append on postgres_fdw nodes.  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
List pgsql-hackers
At Wed, 17 Jun 2020 15:01:08 +0500, "Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru> wrote in 
> On 6/16/20 1:30 PM, Kyotaro Horiguchi wrote:
> > They return 25056 rows, which is far more than 9741 rows. So remote
> > join won.
> > Of course the number of returning rows is not the only factor of the
> > cost change but is the most significant factor in this case.
> > 
> Thanks for the attention.
> I see one slight flaw of this approach to asynchronous append:
> AsyncAppend works only for ForeignScan subplans. if we have
> PartialAggregate, Join or another more complicated subplan, we can't
> use asynchronous machinery.

Yes, the asynchronous append works only when it has at least one
async-capable immediate subnode. Currently there's only one
async-capable node, ForeignScan.

> I imagine an Append node, that can switch current subplan from time to
> time and all ForeignScan nodes of the overall plan are added to one
> queue. The scan buffer can be larger than a cursor fetch size and each
> IterateForeignScan() call can induce asynchronous scan of another
> ForeignScan node if buffer is not full.
> But these are only thoughts, not an proposal. I have no questions to
> your patch right now.

A major property of async-capable nodes is yieldability(?), that is,
a node ought to be able to give way to other nodes when it is not ready
to return a tuple. That means such nodes are state machines rather than
functions.  Fortunately, ForeignScan is natively a kind of state machine
in a sense, so it is easily turned into an async-capable node. Append is
also a state machine in the same sense, but currently no other node
can use it as an async-capable node.
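
To illustrate what "yieldable" means here, the following is only a toy
sketch in Python, not the patch's code. The names ToyAsyncScan,
toy_async_append and Status are made up for this example, and a None in
the script stands for "the remote side has not answered yet". A real
implementation would wait on sockets instead of polling in a loop.

```python
from enum import Enum, auto


class Status(Enum):
    READY = auto()      # a tuple is available
    NOT_READY = auto()  # nothing yet; let another node run
    DONE = auto()       # no more tuples


class ToyAsyncScan:
    """Hypothetical stand-in for an async-capable scan node.

    `script` is a list in which None means "no answer yet" and any
    other value is a tuple that has arrived.
    """

    def __init__(self, script):
        self._script = list(script)
        self._pos = 0  # explicit state kept between calls

    def poll(self):
        if self._pos >= len(self._script):
            return Status.DONE, None
        item = self._script[self._pos]
        self._pos += 1
        if item is None:
            return Status.NOT_READY, None  # give way instead of blocking
        return Status.READY, item


def toy_async_append(children):
    """Round-robin over the children, skipping any that are not ready."""
    active = list(children)
    out = []
    while active:
        for child in list(active):
            status, tup = child.poll()
            if status is Status.READY:
                out.append(tup)
            elif status is Status.DONE:
                active.remove(child)
            # NOT_READY: simply move on to the next child
    return out


a = ToyAsyncScan([None, "a1", None, "a2"])
b = ToyAsyncScan(["b1", "b2"])
result = toy_async_append([a, b])
```

Child `b` returns its tuples while `a` is still waiting, so the append
never blocks on a single subnode.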

For example, an Agg or Sort node generally needs two or more tuples
from its subnode to generate a tuple to be returned to its parent.  Some
working memory is needed while generating a result tuple.  If the
node has taken in a tuple from a subnode but has not yet generated a
result tuple, it must yield CPU time to other nodes. These nodes are not
state machines at all, and it is somewhat hard to make them so.  It gets
quite complex in WindowAgg, since it calls subnodes at arbitrary call
depths within its component functions.
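
As a rough picture of the difficulty, here is a toy resumable SUM
aggregate in Python. It is a sketch under made-up names (ToyChild,
ToySumAgg and the NOT_READY/DONE sentinels), not anything from the
executor. The point is that the partial result has to live in the node's
state, because the node may have to return "not ready" in the middle of
its input.

```python
NOT_READY = object()  # sentinel: the child has nothing yet
DONE = object()       # sentinel: the input is exhausted


class ToyChild:
    """Hypothetical child; None in the script models "no tuple yet"."""

    def __init__(self, script):
        self._items = iter(script)

    def poll(self):
        item = next(self._items, DONE)
        return NOT_READY if item is None else item


class ToySumAgg:
    """Hypothetical SUM aggregate that can be suspended and resumed."""

    def __init__(self, child):
        self._child = child
        self._total = 0        # working memory that must survive a yield
        self._emitted = False

    def poll(self):
        if self._emitted:
            return DONE
        while True:
            item = self._child.poll()
            if item is NOT_READY:
                return NOT_READY   # give way; _total is kept for next call
            if item is DONE:
                self._emitted = True
                return self._total
            self._total += item


agg = ToySumAgg(ToyChild([1, None, 2, 3, None]))
trace = []
while True:
    r = agg.poll()
    if r is DONE:
        break
    trace.append("not ready" if r is NOT_READY else r)
```

A single accumulator is easy to carry across calls; a node that calls
its subnode from deep inside nested helper functions would have to save
and restore that whole call position as well.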

A further issue is that leaf scan nodes, SeqScan, IndexScan, etc.,
would also need to be asynchronous.

In the end, the executor would turn from the current Volcano (pull)
style into a push-up style.
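
The difference between the two styles can be sketched with a toy filter
in Python. Both pull_filter and PushFilter are hypothetical names for
this illustration only: in the pull style the parent asks the child for
the next tuple, while in the push style the child hands each tuple up to
its parent whenever it has one.

```python
def pull_filter(source, pred):
    """Pull style: the consumer drives by asking for the next tuple."""
    for tup in source:          # each iteration pulls from the child
        if pred(tup):
            yield tup


class PushFilter:
    """Push style: the producer drives by handing tuples to its parent."""

    def __init__(self, pred, parent):
        self._pred = pred
        self._parent = parent   # any callable that accepts a tuple

    def __call__(self, tup):
        if self._pred(tup):
            self._parent(tup)


def is_even(x):
    return x % 2 == 0


rows = [1, 2, 3, 4, 5, 6]

pulled = list(pull_filter(iter(rows), is_even))

pushed = []
sink = PushFilter(is_even, pushed.append)
for row in rows:                # the leaf pushes whenever it has a tuple
    sink(row)
```

Both produce the same rows; what changes is which side decides when work
happens, which is what matters when a leaf becomes ready at an arbitrary
time.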

I tried all of that (perhaps except scan nodes) a couple of years ago
but the result was a kind of crap^^;

In the end, I returned to the current shape.  It doesn't seem bad, as
Thomas proposed the same thing.


*1: async-aware is defined (here) as a node that can have
    async-capable subnodes.

> It may lead to a situation than small difference in a filter constant
> can cause a big difference in execution time.

Isn't that what we usually see?  We could get a big win under certain
conditions, without a loss otherwise.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


