Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1L3CZ-27h5dOCbcg-x=UC8GVfHg2Eo7u65trAB2yxVXMQ@mail.gmail.com
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sat, Oct 17, 2015 at 2:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Oct 16, 2015 at 2:29 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Yeah, but I think the scenario is legitimate.  When a query gets run
> >> from within PL/pgsql, parallelism is an option, at least as we have
> >> the code today.  So if a Gather were present, and the query used a
> >> parameter, then you could have this issue.  For example:
> >>
> >> SELECT * FROM bigtable WHERE unindexed_column = some_plpgsql_variable;
> >>
> >
> > I don't think for such statements the control flow will set up an unshared
> > param list.  I have tried couple of such statements [1] and found that
> > always such parameters are set up by setup_param_list().  I think there
> > are only two possibilities which could lead to setting up of unshared
> > params:
> >
> > 1. Usage of cursors - This is already prohibited for parallel-mode.
> > 2. Usage of read-write-param - This only happens for expressions like
> > x := array_append(x, foo) (Refer exec_check_rw_parameter()).  Read-write
> > params are not used for SQL statements. So this also won't be used for
> > parallel-mode
> >
> > There is a chance that I might be missing some case where unshared
> > params will be required for parallel-mode (as of today), but if not then
> > I think we can live without current changes.
>
> *shrug*
>
> The gather-test stuff isn't failing for no reason.  Either PL/pgsql
> shouldn't be passing CURSOR_OPT_PARALLEL_OK, or having a parallel plan
> get generated there should work.  There's not a third option.
>

Agreed. Looking at the code below, I think such a problem could happen if we
pass parallelOK as true in cases where portalP is non-NULL.


static int
exec_run_select(PLpgSQL_execstate *estate,
                PLpgSQL_expr *expr, long maxtuples, Portal *portalP,
                bool parallelOK)
{
    ParamListInfo paramLI;
    int rc;

    /*
     * On the first call for this expression generate the plan
     */
    if (expr->plan == NULL)
        exec_prepare_plan(estate, expr, parallelOK ?
                          CURSOR_OPT_PARALLEL_OK : 0);

    /*
     * If a portal was requested, put the query into the portal
     */
    if (portalP != NULL)
    {
        /*
         * Set up short-lived ParamListInfo
         */
        paramLI = setup_unshared_param_list(estate, expr);

        *portalP = SPI_cursor_open_with_paramlist(NULL, expr->plan,
                                                  paramLI,
                                                  estate->readonly_func);

and one such case is

exec_stmt_return_query()
{
..
    if (stmt->query != NULL)
    {
        /* static query */
        exec_run_select(estate, stmt->query, 0, &portal, true);
..
}


This function fetches tuples through a controlled fetch mechanism (50 rows
per fetch), which we initially decided not to support for parallelism because
such a mechanism is not built for parallel workers. That is the reason we
want to prohibit parallelism wherever a cursor is created.
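
To make the controlled fetch pattern concrete, here is a minimal, self-contained
sketch of a loop that drains a cursor in fixed-size batches until a fetch
returns no rows. It is only an illustration: MockCursor, mock_cursor_fetch()
and drain_in_batches() are hypothetical names invented for this sketch and are
not PostgreSQL's Portal or SPI functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cursor; this is NOT PostgreSQL's Portal. */
typedef struct MockCursor
{
    size_t total;   /* rows the query would produce in total */
    size_t pos;     /* rows already handed back to the caller */
} MockCursor;

/* Hand back at most 'count' rows; a return value of 0 means exhausted. */
static size_t
mock_cursor_fetch(MockCursor *cur, size_t count)
{
    size_t remaining;
    size_t n;

    assert(cur != NULL);
    remaining = cur->total - cur->pos;
    n = remaining < count ? remaining : count;
    cur->pos += n;
    return n;
}

/*
 * Drain the cursor in batches of 'batch' rows.  Returns the number of rows
 * fetched and stores the number of fetch calls made in *calls (including
 * the final call that returned nothing).
 */
static size_t
drain_in_batches(MockCursor *cur, size_t batch, size_t *calls)
{
    size_t fetched = 0;

    *calls = 0;
    for (;;)
    {
        size_t n = mock_cursor_fetch(cur, batch);

        (*calls)++;
        if (n == 0)
            break;          /* nothing left, stop fetching */
        fetched += n;
    }
    return fetched;
}
```

The caller ends up with every row, but the producer is asked for them in
several separate requests rather than being run to completion in one go.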

Do we want to support parallelism for this case on the basis that this API
will eventually fetch all the tuples using the controlled fetch mechanism?
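
If the answer is no, one possible shape for the restriction is to ignore the
caller's parallelOK whenever a portal is requested. The sketch below is
illustrative only and is not taken from any patch: sketch_cursor_options() and
SKETCH_CURSOR_OPT_PARALLEL_OK are hypothetical names, and the flag value is
made up (the real CURSOR_OPT_PARALLEL_OK comes from PostgreSQL's headers).

```c
#include <stdbool.h>
#include <stddef.h>

/* Made-up flag value for this sketch only. */
#define SKETCH_CURSOR_OPT_PARALLEL_OK 0x0100

/*
 * Decide which cursor options to plan with.  When the caller wants a
 * portal (portalP != NULL), never allow a parallel plan, regardless of
 * what parallelOK says.
 */
static int
sketch_cursor_options(bool parallelOK, const void *portalP)
{
    if (portalP != NULL)
        return 0;
    return parallelOK ? SKETCH_CURSOR_OPT_PARALLEL_OK : 0;
}
```

With this shape, a call such as the one in exec_stmt_return_query() would
plan without the parallel flag even though it passes true.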


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
