Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Parallel Seq Scan
Date
Msg-id 20141219200035.GD29570@tamriel.snowman.net
In response to Re: Parallel Seq Scan  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers

* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
> On 12/19/2014 04:39 PM, Stephen Frost wrote:
> >* Marko Tiikkaja (marko@joh.to) wrote:
> >>I'd be perfectly (that means 100%) happy if it just defaulted to
> >>off, but I could turn it up to 11 whenever I needed it.  I don't
> >>believe I'm the only one with this opinion, either.
> >
> >Perhaps we should reconsider our general position on hints then and
> >add them so users can define the plan to be used.  For my part, I don't
> >see this as all that much different.
> >
> >Consider if we were just adding HashJoin support today as an example.
> >Would we be happy if we had to default to enable_hashjoin = off?  Or if
> >users had to do that regularly because our costing was horrid?  It's bad
> >enough that we have to resort to those tweaks today in rare cases.
>
> This is somewhat different. Imagine that we achieve perfect
> parallelization, so that when you set enable_parallel_query=8, every
> query runs exactly 8x faster on an 8-core system, by using all eight
> cores.

To be clear, as I mentioned to Robert just now, I'm not objecting to a
GUC being added to turn off or control parallelization.  I don't want
such a GUC to be a crutch for us to lean on when it comes to questions
about the optimizer though.  We need to work through the optimizer
questions of "should this be parallelized?" and, perhaps later, "how
many ways is it sensible to parallelize this?"  I'm worried we'll take
such a GUC as a directive along the lines of "we are being told to
parallelize to exactly this level every time and for every query that
can be parallelized."  The GUC should be an input into the planner/optimizer, much the
way enable_hashjoin is, unless it's being done as a *limiting* factor
for the administrator to be able to control, but we've generally avoided
doing that (see: work_mem) and, if we're going to start, we should
probably come up with an approach that addresses the considerations for
other resources too.

Thanks,
    Stephen
