Thread: [HACKERS] parallel "return query" is no good

[HACKERS] parallel "return query" is no good

From
Robert Haas
Date:
Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
statement in a PL/pgSQL block.  As it turns out, the analysis that led
to this decision was totally wrong-headed, because the plan will
always be executed using SPI_cursor_fetch(portal, true, 50), which
will cause ExecutePlan() to get invoked with a count of 50, which will
cause it to run the parallel plan serially, without workers.
Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
can do is cause us to pick a parallel plan that's slow when executed
serially instead of the best serial plan.
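
To make the argument above concrete, here is a toy model of it.  This
is only a sketch, not PostgreSQL code: ToyPlan, choose_plan,
execution_cost, the flag value and all of the cost numbers are made up
for illustration.  The one behavior it takes from the description above
is that a nonzero fetch count makes the plan run without workers.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the real flag; the value is illustrative only.
CURSOR_OPT_PARALLEL_OK = 0x0001


@dataclass
class ToyPlan:
    name: str
    uses_workers: bool    # True for a parallel plan
    planned_cost: float   # made-up cost the planner assumes (workers launch)
    serial_cost: float    # made-up cost when run in a single process


def choose_plan(cursor_options, candidates):
    """Pick the cheapest plan; parallel plans compete only if the flag is set."""
    parallel_ok = bool(cursor_options & CURSOR_OPT_PARALLEL_OK)
    allowed = [p for p in candidates if parallel_ok or not p.uses_workers]
    return min(allowed, key=lambda p: p.planned_cost)


def execution_cost(plan, count):
    """A nonzero fetch count forces the plan to run serially, without workers."""
    launches_workers = plan.uses_workers and count == 0
    return plan.planned_cost if launches_workers else plan.serial_cost


# Made-up numbers: the parallel plan wins only if workers actually launch.
serial = ToyPlan("best serial plan", False, 100.0, 100.0)
parallel = ToyPlan("parallel plan", True, 40.0, 130.0)

# RETURN QUERY always fetches in batches of 50, so count is never 0 here.
with_flag = execution_cost(choose_plan(CURSOR_OPT_PARALLEL_OK, [serial, parallel]), 50)
without_flag = execution_cost(choose_plan(0, [serial, parallel]), 50)
```

With these made-up numbers, with_flag is 130.0 and without_flag is
100.0: setting the flag can only make this code path slower, because
the fetch count is never zero.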

The attached patch fixes it.  I plan to commit this and back-patch it
to 9.6, barring objections or better ideas.

I previously remarked on this in
http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
but I wasn't quite so clear what the whole picture was in that email
as I am now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] parallel "return query" is no good

From
Robert Haas
Date:
On Thu, Mar 23, 2017 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
> plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
> statement in a PL/pgSQL block.  As it turns out, the analysis that led
> to this decision was totally wrong-headed, because the plan will
> always be executed using SPI_cursor_fetch(portal, true, 50), which
> will cause ExecutePlan() to get invoked with a count of 50, which will
> cause it to run the parallel plan serially, without workers.
> Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
> can do is cause us to pick a parallel plan that's slow when executed
> serially instead of the best serial plan.
>
> The attached patch fixes it.  I plan to commit this and back-patch it
> to 9.6, barring objections or better ideas.

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse.  On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it.  So maybe I
should do this only in master?  Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] parallel "return query" is no good

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Mar 23, 2017 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
> > plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
> > statement in a PL/pgSQL block.  As it turns out, the analysis that led
> > to this decision was totally wrong-headed, because the plan will
> > always be executed using SPI_cursor_fetch(portal, true, 50), which
> > will cause ExecutePlan() to get invoked with a count of 50, which will
> > cause it to run the parallel plan serially, without workers.
> > Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
> > can do is cause us to pick a parallel plan that's slow when executed
> > serially instead of the best serial plan.
> >
> > The attached patch fixes it.  I plan to commit this and back-patch it
> > to 9.6, barring objections or better ideas.
>
> I guess the downside of back-patching this is that it could cause a
> plan change for somebody which ends up being worse.  On the whole,
> serial execution of queries intended to be run in parallel isn't
> likely to work out well, but it's always possible somebody has a case
> where it happens to be winning, and this could break it.  So maybe I
> should do this only in master?  Thoughts?

For my 2c, I'd back-patch it.

Thanks!

Stephen

Re: [HACKERS] parallel "return query" is no good

From
Andres Freund
Date:
On 2017-03-23 13:03:19 -0400, Robert Haas wrote:
> On Thu, Mar 23, 2017 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
> > plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
> > statement in a PL/pgSQL block.  As it turns out, the analysis that led
> > to this decision was totally wrong-headed, because the plan will
> > always be executed using SPI_cursor_fetch(portal, true, 50), which
> > will cause ExecutePlan() to get invoked with a count of 50, which will
> > cause it to run the parallel plan serially, without workers.
> > Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
> > can do is cause us to pick a parallel plan that's slow when executed
> > serially instead of the best serial plan.
> >
> > The attached patch fixes it.  I plan to commit this and back-patch it
> > to 9.6, barring objections or better ideas.
> 
> I guess the downside of back-patching this is that it could cause a
> plan change for somebody which ends up being worse.  On the whole,
> serial execution of queries intended to be run in parallel isn't
> likely to work out well, but it's always possible somebody has a case
> where it happens to be winning, and this could break it.  So maybe I
> should do this only in master?  Thoughts?

I'm +0.5 for backpatching.

- Andres



Re: [HACKERS] parallel "return query" is no good

From
"Joshua D. Drake"
Date:
On 03/23/2017 10:03 AM, Robert Haas wrote:
> On Thu, Mar 23, 2017 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
>> plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
>> statement in a PL/pgSQL block.  As it turns out, the analysis that led
>> to this decision was totally wrong-headed, because the plan will
>> always be executed using SPI_cursor_fetch(portal, true, 50), which
>> will cause ExecutePlan() to get invoked with a count of 50, which will
>> cause it to run the parallel plan serially, without workers.
>> Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
>> can do is cause us to pick a parallel plan that's slow when executed
>> serially instead of the best serial plan.
>>
>> The attached patch fixes it.  I plan to commit this and back-patch it
>> to 9.6, barring objections or better ideas.
>
> I guess the downside of back-patching this is that it could cause a
> plan change for somebody which ends up being worse.  On the whole,
> serial execution of queries intended to be run in parallel isn't
> likely to work out well, but it's always possible somebody has a case
> where it happens to be winning, and this could break it.  So maybe I
> should do this only in master?  Thoughts?

I think the greater good of a fix applies here. +1 to 9.6.


-- 
Command Prompt, Inc.                  http://the.postgres.company/                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.



Re: [HACKERS] parallel "return query" is no good

From
Alvaro Herrera
Date:
Robert Haas wrote:

> I guess the downside of back-patching this is that it could cause a
> plan change for somebody which ends up being worse.  On the whole,
> serial execution of queries intended to be run in parallel isn't
> likely to work out well, but it's always possible somebody has a case
> where it happens to be winning, and this could break it.  So maybe I
> should do this only in master?  Thoughts?

I think the chances that someone depends on a parallel plan which, run
serially by accident, happens to beat the non-parallel plan are pretty
slim.

+1 for back-patching.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: parallel "return query" is no good

From
Robert Haas
Date:
On Thu, Mar 23, 2017 at 1:53 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> I guess the downside of back-patching this is that it could cause a
>> plan change for somebody which ends up being worse.  On the whole,
>> serial execution of queries intended to be run in parallel isn't
>> likely to work out well, but it's always possible somebody has a case
>> where it happens to be winning, and this could break it.  So maybe I
>> should do this only in master?  Thoughts?
>
> I think the chances that someone depends on a parallel plan which, run
> serially by accident, happens to beat the non-parallel plan are pretty
> slim.
>
> +1 for back-patching.

All right, done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company