Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Asynchronous Append on postgres_fdw nodes. |
Date | |
Msg-id | 20201214.175623.1287257830379316172.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Asynchronous Append on postgres_fdw nodes. (Etsuro Fujita <etsuro.fujita@gmail.com>) |
Responses |
Re: Asynchronous Append on postgres_fdw nodes.
|
List | pgsql-hackers |
At Sat, 12 Dec 2020 19:06:51 +0900, Etsuro Fujita <etsuro.fujita@gmail.com> wrote in > On Fri, Nov 20, 2020 at 8:16 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > I looked through the nodeAppend.c and postgres_fdw.c part and those > > are I think the core of this patch. > > Thanks again for the review! > > > - * figure out which subplan we are currently processing > > + * try to get a tuple from async subplans > > + */ > > + if (!bms_is_empty(node->as_needrequest) || > > + (node->as_syncdone && !bms_is_empty(node->as_asyncpending))) > > + { > > + if (ExecAppendAsyncGetNext(node, &result)) > > + return result; > > > > The function ExecAppendAsyncGetNext() is a function called only here, > > and contains only 31 lines. It doesn't seem to me that the separation > > makes the code more readable. > > Considering the original ExecAppend() is about 50 lines long, 31 lines > of code would not be small. So I'd vote for separating it into > another function as proposed. Ok, I no longer oppose to separating some part from ExecAppend(). > > - /* choose new subplan; if none, we're done */ > > - if (!node->choose_next_subplan(node)) > > + /* wait or poll async events */ > > + if (!bms_is_empty(node->as_asyncpending)) > > + { > > + Assert(!node->as_syncdone); > > + Assert(bms_is_empty(node->as_needrequest)); > > + ExecAppendAsyncEventWait(node); > > > > You moved the function to wait for events from execAsync to > > nodeAppend. The former is a generic module that can be used from any > > kind of executor nodes, but the latter is specialized for nodeAppend. > > In other words, the abstraction level is lowered here. What is the > > reason for the change? > > The reason is just because that function is only called from > ExecAppend(). I put some functions only called from nodeAppend.c in > execAsync.c, though. (I think) You told me that you preferred the genericity of the original interface, but you're doing the opposite. If you think we can move such a generic feature to a part of Append node, all other features can be move the same way. I guess there's a reason you want only the this specific feature out of all of them be Append-spcific and I want to know that. > > + /* Perform the actual callback. */ > > + ExecAsyncRequest(areq); > > + if (ExecAppendAsyncResponse(areq)) > > + { > > + Assert(!TupIsNull(areq->result)); > > + *result = areq->result; > > > > Putting aside the name of the functions, the first two function are > > used only this way at only two places. ExecAsyncRequest(areq) tells > > fdw to store the first tuple among the already received ones to areq, > > and ExecAppendAsyncResponse(areq) is checking the result is actually > > set. Finally the result is retrieved directory from areq->result. > > What is the reason that the two functions are separately exists? > > I think that when an async-aware node gets a tuple from an > async-capable node, they should use ExecAsyncRequest() / > ExecAyncHogeResponse() rather than ExecProcNode() [1]. I modified the > patch so that ExecAppendAsyncResponse() is called from Append, but to > support bubbling up the plan tree discussed in [2], I think it should > be called from ForeignScans (the sides of async-capable nodes). Am I > right? Anyway, I’ll rename ExecAppendAyncResponse() to the one > proposed in Robert’s original patch. Even though I understand the concept but to make work it we need to remember the parent *async* node somewhere. In my faint memory the very early patch did something like that. So I think just providing ExecAsyncResponse() doesn't make it true. But if we make it true, it would be something like partially-reversed steps from what the current Exec*()s do for some of the existing nodes and further code is required for some other nodes like WindowFunction. Bubbling up works only in very simple cases where a returned tuple is thrown up to further parent as-is or at least when the node convers a tuple into another shape. If an async-receiver node wants to process multiple tuples from a child or from multiple children, it is no longer be just a bubbling up. That being said, we could avoid passing (a-kind-of) side-channel information when ExecProcNode is called by providing ExecAsyncResponse(). But I don't think the "side-channel" is not a problem since it is just another state of the node. And.. I think the reason I feel uneasy for the patch may be that the patch uses the interface names in somewhat different context. Origianlly the fraemework resides in-between executor nodes, not on a node of either side. ExecAsyncNotify() notifies the requestee about an event and ExecAsyncResonse() notifies the requestor about a new tuple. I don't feel strangeness in this usage. But this patch feels to me using the same names in different (and somewhat wrong) context. > > + /* Perform the actual callback. */ > > + ExecAsyncNotify(areq); > > > > Mmm. The usage of the function (or its name) looks completely reverse > > to me. I think FDW should NOTIFY to exec nodes that the new tuple > > gets available but the reverse is nonsense. What the function is > > actually doing is to REQUEST fdw to fetch tuples that are expected to > > have arrived, which is different from what the name suggests. > > As mentioned in a previous email, this is an FDW callback routine > revived from Robert’s patch. I think the naming is reasonable, > because the callback routine notifies the FDW of readiness of a file > descriptor. And actually, the callback routine tells the core whether > the corresponding ForeignScan node is ready for a new request or not, > by setting the callback_pending flag accordingly. Hmm. Agreed. The word "callback" is also used there [3]... I remember and it seems reasonable that the core calls AsyncNotify() on FDW and the FDW calls ExecForeignScan as a response to it and notify back to core of that using ExecAsyncRequestDone(). But the patch here feels a little strange, or uneasy, to me. [3] https://www.postgresql.org/message-id/20161018.103051.30820907.horiguchi.kyotaro%40lab.ntt.co.jp > > postgres_fdw.c > > > > > postgresIterateForeignScan(ForeignScanState *node) > > > { > > > PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state; > > > TupleTableSlot *slot = node->ss.ss_ScanTupleSlot; > > > > > > /* > > > * If this is the first call after Begin or ReScan, we need to create the > > > * cursor on the remote side. > > > */ > > > if (!fsstate->cursor_exists) > > > create_cursor(node); > > > > With the patch, cursors are also created in another place so at least > > the comment is wrong. > > Good catch! Will fix. > > > That being said, I think we should unify the > > code except the differences between async and sync. For example, if > > the fetch_more_data_begin() needs to be called only for async > > fetching, the cursor should be created before calling the function, in > > the code path common with sync fetching. > > I think that that would make the code easier to understand, but I’m > not 100% sure we really need to do so. And I believe that we don't tolerate even the slightest performance degradation. > Best regards, > Etsuro Fujita > > [1] https://www.postgresql.org/message-id/CA%2BTgmoYrbgTBnLwnr1v%3Dpk%2BC%3DznWg7AgV9%3DM9ehrq6TDexPQNw%40mail.gmail.com > [2] https://www.postgresql.org/message-id/CA%2BTgmoZSWnhy%3DSB3ggQcB6EqKxzbNeNn%3DEfwARnCS5tyhhBNcw%40mail.gmail.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: