Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Asynchronous Append on postgres_fdw nodes.
Date
Msg-id CAPmGK150qZzNbZFKSX2H0Cn5U+YX=2iGzwipUkcMdwDiJ7Oh8A@mail.gmail.com
Whole thread Raw
In response to Re: Asynchronous Append on postgres_fdw nodes.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Asynchronous Append on postgres_fdw nodes.
List pgsql-hackers
On Fri, Nov 20, 2020 at 3:51 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Tue, 17 Nov 2020 18:56:02 +0900, Etsuro Fujita <etsuro.fujita@gmail.com> wrote in
> > * In Robert's patch [1] (and Horiguchi-san's, which was created based
> > on Robert's), ExecAppend() was modified to retrieve tuples from
> > async-aware children *before* the tuples will be needed, but I don't
>
> The "retrieve" means the move of a tuple from fdw to executor
> (ExecAppend or ExecAsync) layer?

Yes, that's what I mean.

> > think that's really a good idea, because the query might complete
> > before returning the tuples.  So I modified that function so that a
>
> I'm not sure how it matters. Anyway the fdw holds up to tens of tuples
> before the executor actually make requests for them.  The reason for
> the early fetching is letting fdw send the next request as early as
> possible. (However, I didn't measure the effect of the
> nodeAppend-level prefetching.)

I agree that that would lead to an improved efficiency in some cases,
but I still think that that would be useless in some other cases like
SELECT * FROM sharded_table LIMIT 1.  Also, I think the situation
would get worse if we support Append on top of joins or aggregates
over ForeignScans, which would be more expensive to perform than these
ForeignScans.

If we do prefetching, I think it would be better that it’s the
responsibility of the FDW to do prefetching, and I think that that
could be done by letting the FDW to start another data fetch,
independently of the core, in the ForeignAsyncNotify callback routine,
which I revived from Robert's original patch.  I think that that would
be more efficient, because the FDW would no longer need to wait until
all buffered tuples are returned to the core.  In the WIP patch, I
only allowed the callback routine to put the corresponding ForeignScan
node into a state where it’s either ready for a new request or needing
a callback for another data fetch, but I think we could probably relax
the restriction so that the ForeignScan node can be put into another
state where it’s ready for a new request while needing a callback for
the prefetch.

> > tuple is retrieved from an async-aware child *when* it is needed, like
> > Thomas' patch.  I used FDW callback functions proposed by Robert, but
> > introduced another FDW callback function ForeignAsyncBegin() for each
> > async-aware child to start an asynchronous data fetch at the first
> > call to ExecAppend() after ExecInitAppend() or ExecReScanAppend().
>
> Even though the terminology is not officially determined, in the past
> discussions "async-aware" meant "can handle async-capable subnodes"
> and "async-capable" is used as "can run asynchronously".

Thanks for the explanation!

> Likewise you
> seem to have changed the meaning of as_needrequest from "subnodes that
> needs to request for the next tuple" to "subnodes that already have
> got query-send request and waiting for the result to come".

No.  I think I might slightly change the original definition of
as_needrequest, though.

> I would
> argue to use the words and variables (names) in such meanings.

I think the word "aware" has a broader meaning, so the naming as
proposed would be OK IMO.  But actually, I don't have any strong
opinion about that, so I'll change it as explained.

> > * For EvalPlanQual, I modified the patch so that async-aware children
> > are treated as if they were synchronous when executing EvalPlanQual.
>
> Doesn't async execution accelerate the epq-fetching?  Or does
> async-execution goes into trouble in the EPQ path?

The reason why I disabled async execution when executing EPQ is to
avoid sending asynchronous queries to the remote sides, which would be
useless, because scan tuples for an EPQ recheck are obtained in a
dedicated way.

> > * In Robert's patch, all async-aware children below Append nodes in
> > the query waiting for events to occur were managed by a single EState,
> > but I modified the patch so that such children are managed by each
> > Append node, like Horiguchi-san's patch and Thomas'.
>
> Managing in Estate give advantage for push-up style executor but
> managing in node_state is simpler.

What do you mean by "push-up style executor"?

Thanks for the review!  Sorry for the delay.

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Wolfgang Walther
Date:
Subject: Suggestion: optionally return default value instead of error on failed cast
Next
From: Etsuro Fujita
Date:
Subject: Re: Asynchronous Append on postgres_fdw nodes.