Re: Parallel Seq Scan - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel Seq Scan
Date	October 17, 2015 09:16:09
Msg-id	CAA4eK1L3CZ-27h5dOCbcg-x=UC8GVfHg2Eo7u65trAB2yxVXMQ@mail.gmail.com
In response to	Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Sat, Oct 17, 2015 at 2:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Oct 16, 2015 at 2:29 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Yeah, but I think the scenario is legitimate. When a query gets run
> >> from within PL/pgsql, parallelism is an option, at least as we have
> >> the code today. So if a Gather were present, and the query used a
> >> parameter, then you could have this issue. For example:
> >>
> >> SELECT * FROM bigtable WHERE unindexed_column = some_plpgsql_variable;
> >>
> >
> > I don't think for such statements the control flow will set up an unshared
> > param list. I have tried couple of such statements [1] and found that
> > always such parameters are set up by setup_param_list(). I think there
> > are only two possibilities which could lead to setting up of unshared
> > params:
> >
> > 1. Usage of cursors - This is already prohibited for parallel-mode.
> > 2. Usage of read-write-param - This only happens for expressions like
> > x := array_append(x, foo) (Refer exec_check_rw_parameter()). Read-write
> > params are not used for SQL statements. So this also won't be used for
> > parallel-mode
> >
> > There is a chance that I might be missing some case where unshared
> > params will be required for parallel-mode (as of today), but if not then
> > I think we can live without current changes.
>
> *shrug*
>
> The gather-test stuff isn't failing for no reason. Either PL/pgsql
> shouldn't be passing CURSOR_OPT_PARALLEL_OK, or having a parallel plan
> get generated there should work. There's not a third option.
>

Agreed. Looking at the code below, I think such a problem can happen
when parallelOK is passed as true in the cases where the Portal
(portalP) is non-null.

static int
exec_run_select(PLpgSQL_execstate *estate,
                PLpgSQL_expr *expr, long maxtuples, Portal *portalP,
                bool parallelOK)
{
    ParamListInfo paramLI;
    int         rc;

    /*
     * On the first call for this expression generate the plan
     */
    if (expr->plan == NULL)
        exec_prepare_plan(estate, expr, parallelOK ?
                          CURSOR_OPT_PARALLEL_OK : 0);

    /*
     * If a portal was requested, put the query into the portal
     */
    if (portalP != NULL)
    {
        /*
         * Set up short-lived ParamListInfo
         */
        paramLI = setup_unshared_param_list(estate, expr);

        *portalP = SPI_cursor_open_with_paramlist(NULL, expr->plan,
                                                  paramLI,
                                                  estate->readonly_func);

and one such case is

exec_stmt_return_query()
{
    if (stmt->query != NULL)
    {
        /* static query */
        exec_run_select(estate, stmt->query, 0, &portal, true);
    }

In exec_stmt_return_query(), we fetch the tuples with a controlled fetch
mechanism (a count of 50 at a time). We initially thought of not
supporting this for parallelism, because such a mechanism is not built
for parallel workers; that is the reason we want to prohibit parallelism
wherever a cursor is created.
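
To make the controlled fetch pattern concrete, here is a toy model of the control flow only: the caller pulls rows in fixed-size batches until a fetch returns nothing, so every row is eventually consumed. ToyCursor, toy_cursor_fetch() and drain_in_batches() are hypothetical names made up for this sketch; nothing here calls SPI or the executor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cursor: just a row count and a position. */
typedef struct ToyCursor
{
    size_t      total_rows;     /* rows the query would produce */
    size_t      next_row;       /* rows already handed out */
} ToyCursor;

/* Return up to "count" rows' worth of progress; 0 means exhausted. */
static size_t
toy_cursor_fetch(ToyCursor *cursor, size_t count)
{
    size_t      remaining = cursor->total_rows - cursor->next_row;
    size_t      fetched = (remaining < count) ? remaining : count;

    cursor->next_row += fetched;
    return fetched;
}

/*
 * Drain the cursor in batches of batch_size.  Returns the total number of
 * rows consumed and reports how many fetch calls were needed, including
 * the final call that returned zero rows.
 */
static size_t
drain_in_batches(ToyCursor *cursor, size_t batch_size, size_t *fetch_calls)
{
    size_t      total = 0;

    assert(batch_size > 0);
    *fetch_calls = 0;

    for (;;)
    {
        size_t      got = toy_cursor_fetch(cursor, batch_size);

        (*fetch_calls)++;
        if (got == 0)
            break;              /* nothing left to fetch */
        total += got;
    }
    return total;
}
```

With 120 rows and a batch size of 50, the loop makes four fetch calls (50, 50, 20, 0) and consumes all 120 rows. The point of the model is that the consumer, not the producer, decides when the next batch is produced.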

Do we want to support parallelism for this case, on the basis that this
API will eventually fetch all the tuples through the controlled fetch
mechanism?
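
If the answer is no, one possible shape for the prohibition is a guard at plan time: ask for a parallel-capable plan only when the caller does not request a portal. The following is a minimal sketch under that assumption, not a proposed patch. The helper plan_cursor_options() and the flag value SKETCH_CURSOR_OPT_PARALLEL_OK are stand-ins invented for illustration and are not PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag value for illustration; not taken from PostgreSQL headers. */
#define SKETCH_CURSOR_OPT_PARALLEL_OK 0x0100

/*
 * Hypothetical helper: choose the cursor options to plan with.
 *
 * A parallel-capable plan is requested only if the caller allows it and
 * no portal is requested, since a portal implies cursor-style fetching.
 */
static int
plan_cursor_options(bool parallelOK, bool portal_requested)
{
    if (parallelOK && !portal_requested)
        return SKETCH_CURSOR_OPT_PARALLEL_OK;
    return 0;
}

/* Usage check: only "parallel allowed, no portal" gets the flag. */
static void
check_plan_cursor_options(void)
{
    assert(plan_cursor_options(true, false) == SKETCH_CURSOR_OPT_PARALLEL_OK);
    assert(plan_cursor_options(true, true) == 0);
    assert(plan_cursor_options(false, false) == 0);
    assert(plan_cursor_options(false, true) == 0);
}
```

Under this guard, a call such as the one in exec_stmt_return_query(), which passes true together with a portal, would be planned without the parallel flag.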

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
