Re: parallel.c is not marked as test covered - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: parallel.c is not marked as test covered
Date
Msg-id CAA4eK1K8kaO_jRk42-o2rmhSRbKV-3mR+iNVcONLdbcSXW5TfQ@mail.gmail.com
Whole thread Raw
In response to Re: parallel.c is not marked as test covered  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: parallel.c is not marked as test covered
List pgsql-hackers
On Thu, Jun 16, 2016 at 8:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jun 15, 2016 at 10:48 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > exec_stmt_execsql() is used to execute SQL statements insider plpgsql which
> > includes dml statements as well, so probably you wanted to play safe by not
> > allowing parallel option from that place.  However, I think there shouldn't
> > be a problem in using CURSOR_OPT_PARALLEL_OK from this place as we have a
> > check in standard_planner which will take care of whether to choose parallel
> > mode or not for a particular statement.  If you want, I can do more detailed
> > analysis and prepare a patch.
>
> That would be great.  I have a vague recollection that that function
> might be called in some contexts where execution can be suspended,
> which would be no good for parallel query.  But that might be wrong.
>

I have done analysis on this and didn't found any use case where passing CURSOR_OPT_PARALLEL_OK in exec_stmt_execsql() can help in parallelizing the queries.  Basically, there seems to be three ways in which function exec_stmt_execsql can be used inside plpgsql.

a. In execution of statement used in open cursor (via exec_stmt_open())
b. In execution of statement when used in for loop (via exec_stmt_forc())
c. In execution of SQL statement (via direct call to exec_stmt_execsql())

For the cases (a) and (b), one can construct a case where execution needs to be suspended, so using parallel mode for those cases will not be meaningful.  For case (c), the Select statement seems to execute successfully only when there is a *into* target, else it will give an error after execution as below:
ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.

Now, if one has used into clause, the number of rows will be restricted in which cases parallelism won't be enabled.

Do, let me know if you see any case where generating a parallel path is meaningful via this path?  I am of opinion, lets leave this code as is or may be we can add a comment why we have decided not to enable parallelism in this path.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Bug in to_timestamp().
Next
From: Tom Lane
Date:
Subject: Re: Bug in to_timestamp().