Re: Rename max_parallel_degree? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Rename max_parallel_degree?
Date
Msg-id CA+Tgmob3q28ELsMyVDHOOPjhVwFk6E3dP_t===Fnr5z8CTYi0w@mail.gmail.com
In response to Re: Rename max_parallel_degree?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Rename max_parallel_degree?  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Mon, Apr 25, 2016 at 4:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>> What about calling it something even simpler, such as "max_parallelism"?
>>> This avoids such cargo cult, and there's no implication that it's
>>> per-query.
>
>> So what would we call the "parallel_degree" member of the Path data
>> structure, and the "parallel_degree" reloption?  I don't think
>> renaming either of those to "parallelism" is going to be an
>> improvement.
>
> I think we should rename all of these to something based on the concept of
> "number of worker processes", and adjust the code if necessary to match.

"Worker processes" is a more general term than "parallel workers".  I
think if you conflate those two things, which I've tried quite hard to
keep separate in code and comments, there will be no end of confusion.
All parallel workers are background workers, aka worker processes, but
the reverse is false.

Also, when you think about the parallel degree of a path or plan, it
is the number of workers for which that path or plan has been costed,
which may or may not match the number we actually get.  I think of the
parallel_degree as the DESIRED number of workers for a path or plan
more than the actual number.  If we call it "number of parallel
workers" or something like that, we had better think about whether
we're implying something other than that.  Perhaps the meaning of the
existing term "parallel degree" isn't spelled out as fully as would be
desirable, but replacing it with a term that definitely doesn't mean
what this one was intended to mean would not be an improvement.
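
To make the desired-versus-actual distinction concrete, here is a
minimal sketch in C.  The struct and function names are invented for
illustration and are not PostgreSQL's actual code; the only assumption
is that the number of workers obtained at run time is the costed
number capped by whatever worker slots happen to be free.

```c
#include <assert.h>

/* Hypothetical stand-in for a path; not PostgreSQL's Path struct. */
typedef struct SketchPath
{
    int parallel_degree;    /* workers the path was costed for (desired) */
} SketchPath;

/*
 * Workers actually obtained at execution time: the desired count,
 * capped by the number of free worker slots (an assumed input here).
 */
static int
sketch_workers_launched(const SketchPath *path, int free_worker_slots)
{
    assert(path->parallel_degree >= 0);
    assert(free_worker_slots >= 0);

    return path->parallel_degree < free_worker_slots
        ? path->parallel_degree
        : free_worker_slots;
}
```

A path costed for 4 workers still carries parallel_degree = 4 even if
only 2 slots are free when it runs; the sketch returns 2 in that case
and 4 when 8 slots are free.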

> I think the "degree" terminology is fundamentally tainted by the question
> of whether or not it counts the leader, and that we will have bugs (or
> indeed may have them today) caused by getting that wrong.

This theory does not seem very plausible to me.  I don't really see
how that could happen, although perhaps I'm blinded by being too close
to the feature.  Also, you haven't presented entirely convincing
evidence that other people are united in how they view this, or that
their view differs from PostgreSQL's; and I've submitted some contrary
evidence.  Even if Oracle, for example, does do it differently from
what I've done here, slavishly following Oracle has never been a
prerequisite for regarding a PostgreSQL feature as well-designed.  I
think it is far more likely that going and offsetting the value of
parallel_degree by 1 everywhere, as Peter has proposed, is going to
introduce subtle bugs.
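
For reference, the two readings of "degree" at issue can be sketched
as below.  This is only an illustration with hypothetical function
names, assuming the disagreement is simply whether the leader is
counted; it is not code from the tree.

```c
#include <assert.h>

/* Reading A: degree counts workers only; the leader is extra. */
static int
workers_if_degree_excludes_leader(int degree)
{
    assert(degree >= 0);
    return degree;
}

/* Reading B: degree counts every participating process, leader included. */
static int
workers_if_degree_includes_leader(int degree)
{
    assert(degree >= 1);
    return degree - 1;
}
```

Under reading A a degree of 2 means two workers plus the leader; under
reading B the same value means one worker plus the leader.  Switching
from one reading to the other means applying that offset of 1 at every
place the value is produced or consumed.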

BTW, I am told by some of my colleagues that hinting /* PARALLEL 0 */
in Oracle requests no parallelism, and that hinting /* PARALLEL 1 */
requests the use of one worker in addition to the main process.  I
haven't tried to verify this myself.

> Your arguments
> for not changing it seem to me not to address that point; you've merely
> focused on the question of whether we have the replacement terminology
> right.  If we don't, let's make it so, but the current situation is not
> good.

I don't agree that the current situation isn't good, but I'm willing
to change it anyway if we can come up with something that's actually
better.  In the absence of that, it makes no sense to change anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


