Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions
Date
Msg-id CA+TgmoYx9nP=LERwi+nxb02pe=dEkT_XZQytg6-PAFgH-nAg8g@mail.gmail.com
In response to Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions  (Rafia Sabih <rafia.sabih@enterprisedb.com>)
Responses Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions  (Rafia Sabih <rafia.sabih@enterprisedb.com>)
List pgsql-hackers
On Fri, Mar 10, 2017 at 7:08 AM, Rafia Sabih
<rafia.sabih@enterprisedb.com> wrote:
> I wanted to clarify a few things here, I noticed that call of ExecutorRun in
> postquel_getnext() uses !es->lazyEval as execute_once, this is confusing, as
> it is true even in cases when a simple query like "select count(*) from t"
> is used in a sql function. Hence, restricting parallelism for cases when it
> shouldn't. It seems to me that es->lazyEval is not set properly or it should
> not be true for simple select statements. I found that in the definition of
> execution_state
> bool lazyEval; /* true if should fetch one row at a time */
> and in init_execution_state, there is a comment saying,
> * Mark the last canSetTag query as delivering the function result; then,
> * if it is a plain SELECT, mark it for lazy evaluation. If it's not a
> * SELECT we must always run it to completion.
>
> I find these two things contradictory to each other. So, is this point
> missed or is there some deep reasoning behind that?

I don't understand what you think is contradictory.  I think the idea
is that if it's not a SELECT, we have to run it to completion because
it might have side effects, but if it is a SELECT, we assume (granted,
it might be wrong) that there are no side effects, and therefore we
can just run it until it produces the number of rows of output that we
need.

Note this:
       if (completed || !fcache->returnsSet)
               postquel_end(es);

When the SQL function doesn't return a set, then we can allow
parallelism even when lazyEval is set, because we'll only call
ExecutorStart() once.  But my impression is that something like this:

SELECT * FROM blah() LIMIT 3

...will trigger three separate calls to ExecutorRun(), which is a
problem if the plan is a parallel plan.

I have not verified this; the above thoughts are just based on code-reading.
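
To make the reasoning above concrete, here is a toy model in C. It is
only a sketch and not PostgreSQL source: ToyQuery, toy_executor_run(),
toy_execute_once() and toy_call_function() are hypothetical names, and
the rule "execute once unless the query is lazily evaluated and the
function returns a set" is my reading of the discussion, not something
confirmed against the tree. It also assumes, as above, that a
set-returning function is driven one row per call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a running query; NOT PostgreSQL's QueryDesc. */
typedef struct ToyQuery
{
    int  rows_total;     /* rows the query would produce if run fully */
    int  rows_emitted;   /* rows handed back so far */
    int  run_calls;      /* how often toy_executor_run() was entered */
    bool parallel_used;  /* did a call run in "parallel mode"? */
} ToyQuery;

/*
 * Hypothetical analogue of ExecutorRun(): emit up to `count` rows,
 * where 0 means "run to completion". Parallel mode is modelled as
 * allowed only when the caller promises a single call.
 */
int
toy_executor_run(ToyQuery *q, int count, bool execute_once)
{
    int remaining = q->rows_total - q->rows_emitted;
    int n = (count == 0 || count > remaining) ? remaining : count;

    q->run_calls++;
    if (execute_once)
    {
        /* A second call would break the single-execution promise. */
        assert(q->run_calls == 1);
        q->parallel_used = true;
    }
    q->rows_emitted += n;
    return n;
}

/* Assumed rule: only lazy evaluation of a set-returning function re-enters. */
bool
toy_execute_once(bool lazy_eval, bool returns_set)
{
    return !lazy_eval || !returns_set;
}

/* Drive one function invocation for a caller that wants `wanted` rows. */
void
toy_call_function(ToyQuery *q, bool lazy_eval, bool returns_set, int wanted)
{
    bool once = toy_execute_once(lazy_eval, returns_set);

    if (!lazy_eval)
    {
        toy_executor_run(q, 0, once);   /* run to completion */
        return;
    }
    if (!returns_set)
    {
        toy_executor_run(q, 1, once);   /* one row, then shut down */
        return;
    }
    /* Lazy and set-returning: one row per call, as with LIMIT 3 above. */
    while (q->rows_emitted < wanted && q->rows_emitted < q->rows_total)
        toy_executor_run(q, 1, once);
}
```

In this model, a lazy set-returning call asked for three rows enters
toy_executor_run() three times and never uses parallel mode, while a
lazy call that does not return a set enters it once and may.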

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


