Re: [HACKERS] Enabling parallelism for queries coming from SQL orother PL functions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Enabling parallelism for queries coming from SQL orother PL functions
Date
Msg-id CA+TgmoZFmDrhSbbbpBb3AEFA3Zwj9izmXrHi9OqP-83J3GEjLg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Enabling parallelism for queries coming from SQL orother PL functions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Enabling parallelism for queries coming from SQL orother PL functions
List pgsql-hackers
On Sun, Feb 26, 2017 at 6:34 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Feb 25, 2017 at 9:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> On Sat, Feb 25, 2017 at 5:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Sure, but that should only happen if the function is *not* declared as
>>> parallel safe (aka in parallel safe functions, we should not generate
>>> parallel plans).
>>
>> So basically we want to put a restriction that parallel-safe function
>> can not use the parallel query? This will work but it seems too
>> restrictive to me. Because by marking function parallel safe we enable
>> it to be used with the outer parallel query that is fine. But, that
>> should not restrict the function from using the parallel query if it's
>> used with the other outer query which is not having the parallel
>> plan(or function is executed directly).
>
> I think if the user is explicitly marking a function as parallel-safe,
> then it doesn't make much sense to allow parallel query in such
> functions as it won't be feasible for the planner (or at least it will
> be quite expensive) to detect the same.  By the way, if the user has
> any such expectation from a function, then he can mark the function as
> parallel-restricted or parallel-unsafe.

However, if a query is parallel-safe, it might not end up getting run
in parallel.  In that case, it could still benefit from parallelism
internally.  I think we want to allow that.  For example, suppose you
run a query like:

SELECT x, sum(somewhat_expensive_function(y)) FROM sometab GROUP BY 1;

If sometab isn't very big, it's probably better to use a non-parallel
plan for this query, because then somewhat_expensive_function() can
still use parallelism internally, which might be better. However, if
sometab is large enough, then it might be better to parallelize the
whole query using a Partial/FinalizeAggregate and force each call to
somewhat_expensive_function() to run serially.  So I don't think a
hard-and-fast rule that parallel-safe functions can't use parallelism
internally is a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Allow pg_dumpall to work without pg_authid
Next
From: Robins Tharakan
Date:
Subject: Re: [HACKERS] Allow pg_dumpall to work without pg_authid