Re: [HACKERS] asynchronous execution - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: [HACKERS] asynchronous execution |
Date | |
Msg-id | 6448.1499761731@localhost Whole thread Raw |
In response to | Re: [HACKERS] asynchronous execution (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: [HACKERS] asynchronous execution
|
List | pgsql-hackers |
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > Just one idea that I had while reading the code. > > > > In ExecAsyncEventLoop you iterate estate->es_pending_async, then move the > > complete requests to the end and finaly adjust estate->es_num_pending_async so > > that the array no longer contains the complete requests. I think the point is > > that then you can add new requests to the end of the array. > > > > I wonder if a set (Bitmapset) of incomplete requests would make the code more > > efficient. The set would contain position of each incomplete request in > > estate->es_num_pending_async (I think it's the myindex field of > > PendingAsyncRequest). If ExecAsyncEventLoop used this set to retrieve the > > requests subject to ExecAsyncNotify etc, then the compaction of > > estate->es_pending_async wouldn't be necessary. > > > > ExecAsyncRequest would use the set to look for space for new requests by > > iterating it and trying to find the first gap (which corresponds to completed > > request). > > > > And finally, item would be removed from the set at the moment the request > > state is being set to ASYNCREQ_COMPLETE. > > Effectively it is a waiting-queue followed by a > completed-list. The point of the compaction is keeping the order > of waiting or not-yet-completed requests, which is crucial to > avoid kind-a precedence inversion. We cannot keep the order by > using bitmapset in such way. > The current code waits all waiters at once and processes all > fired events at once. The order in the waiting-queue is > inessential in the case. On the other hand I suppoese waiting on > several-tens to near-hundred remote hosts is in a realistic > target range. Keeping the order could be crucial if we process a > part of the queue at once in the case. > > Putting siginificance on the deviation of response time of > remotes, process-all-at-once is effective. In turn we should > consider the effectiveness of the lifecycle of the larger wait > event set. ok, I missed the fact that the order of es_pending_async entries is important. I think this is worth adding a comment. Actually the reason I thought of simplification was that I noticed small inefficiency in the way you do the compaction. In particular, I think it's not always necessary to swap the tail and head entries. Would something like this make sense? /* If any node completed, compact the array. */ if (any_node_done) { int hidx = 0, tidx; /* * Swap all non-yet-completed items to the start of the array. * Keep them in the same order. */ for (tidx = 0; tidx < estate->es_num_pending_async; ++tidx) { PendingAsyncRequest *tail= estate->es_pending_async[tidx]; Assert(tail->state != ASYNCREQ_CALLBACK_PENDING); if (tail->state == ASYNCREQ_COMPLETE) continue; /* * If the array starts with one or more incomplete requests, * both head and tail pointat the same item, so there's no * point in swapping. */ if (tidx > hidx) { PendingAsyncRequest *head = estate->es_pending_async[hidx]; /* * Once the tail got ahead, it should only leave * ASYNCREQ_COMPLETE behind.Only those can then be seen * by head. */ Assert(head->state == ASYNCREQ_COMPLETE); estate->es_pending_async[tidx] = head; estate->es_pending_async[hidx] = tail; } ++hidx; } estate->es_num_pending_async = hidx; } And besides that, I think it'd be more intuitive if the meaning of "head" and "tail" was reversed: if the array is iterated from lower to higher positions, then I'd consider head to be at higher position, not tail. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
pgsql-hackers by date: