Re: asynchronous execution - Mailing list pgsql-hackers

From Robert Haas
Subject Re: asynchronous execution
Date
Msg-id CA+TgmoYMoB4OG1W6KZjgRda1J9=Lo1fXpH0YXjzvSEwU5rqhVA@mail.gmail.com
In response to Re: asynchronous execution  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: asynchronous execution  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Wed, Sep 28, 2016 at 12:30 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 24 September 2016 at 06:39, Robert Haas <robertmhaas@gmail.com> wrote:
>> Since Kyotaro Horiguchi found that my previous design had a
>> system-wide performance impact due to the ExecProcNode changes, I
>> decided to take a different approach here: I created an async
>> infrastructure where both the requestor and the requestee have to be
>> specifically modified to support parallelism, and then modified Append
>> and ForeignScan to cooperate using the new interface.  Hopefully that
>> means that anything other than those two nodes will suffer no
>> performance impact.  Of course, it might have other problems....
>
> I see that the reason why you re-designed the asynchronous execution
> implementation is because the earlier implementation showed
> performance degradation in local sequential and local parallel scans.
> But I checked that the ExecProcNode() changes were not that
> significant as to cause the degradation.

I think we need some testing to prove that one way or the other.  If
you can do some - say on a plan with multiple nested loop joins with
inner index-scans, which will call ExecProcNode() a lot - that would
be great.  I don't think we can just rely on "it doesn't seem like it
should be slower", though - ExecProcNode() is too important a function
for us to guess at what the performance will be.

The thing I'm really worried about with either implementation is what
happens when we start to add asynchronous capability to multiple
nodes.  For example, if you imagine a plan like this:

Append
-> Hash Join
  -> Foreign Scan
  -> Hash
    -> Seq Scan
-> Hash Join
  -> Foreign Scan
  -> Hash
    -> Seq Scan

In order for this to run asynchronously, you need not only Append and
Foreign Scan to be async-capable, but also Hash Join.  That's true in
either approach.  Things are slightly better with the original
approach, but the basic problem is there in both cases.  So it seems
we need an approach that will make adding async capability to a node
really cheap, which seems like it might be a problem.
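
To make the point about the plan above concrete, here is a minimal toy model, not PostgreSQL code. The PlanNode class, the async_capable flag, and both functions are hypothetical names invented for illustration. It assumes the simplest possible rule: a leaf scan can be driven asynchronously only if every node on the path from the root down to it cooperates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy plan node; async_capable is a hypothetical flag, not a PostgreSQL field."""
    name: str
    async_capable: bool = False
    children: List["PlanNode"] = field(default_factory=list)


def async_reachable(node: PlanNode) -> List[str]:
    """Return the leaves that can run asynchronously under the assumed rule:
    every node from the root down to the leaf must be async-capable."""
    if not node.async_capable:
        # The chain is broken here, so everything below runs synchronously.
        return []
    if not node.children:
        return [node.name]
    found: List[str] = []
    for child in node.children:
        found.extend(async_reachable(child))
    return found


def build_plan(hash_join_async: bool) -> PlanNode:
    """Build the Append-over-two-Hash-Joins plan from the example."""
    def branch() -> PlanNode:
        return PlanNode("Hash Join", hash_join_async, [
            PlanNode("Foreign Scan", True),
            PlanNode("Hash", False, [PlanNode("Seq Scan", False)]),
        ])
    return PlanNode("Append", True, [branch(), branch()])


if __name__ == "__main__":
    print(async_reachable(build_plan(hash_join_async=False)))
    print(async_reachable(build_plan(hash_join_async=True)))
```

With Hash Join left unmodified, neither Foreign Scan is reachable asynchronously even though Append and Foreign Scan both cooperate. Once Hash Join is flagged as well, both are. In this model each new node type that can sit between the requestor and the requestee is one more node that has to be modified, which is why the per-node cost matters.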

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


