Re: [HACKERS] Parallel Hash take II - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: [HACKERS] Parallel Hash take II
Date	November 16, 2017 03:11:25
Msg-id	CAEepm=0GBJVFRdsjbhtjCWuQk=QzxrTUhhySnezq6FvYTdb=1A@mail.gmail.com
In response to	Re: [HACKERS] Parallel Hash take II (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Thu, Nov 16, 2017 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Nov 15, 2017 at 1:35 PM, Andres Freund <andres@anarazel.de> wrote:
>> But this does bug me, and I think it's what made me pause here to make a
>> bad joke.  The way that parallelism treats work_mem makes it even more
>> useless of a config knob than it was before.  Parallelism, especially
>> after this patch, shouldn't compete / be benchmarked against a
>> single-process run with the same work_mem. To make it "fair" you need to
>> compare parallelism against a single threaded run with work_mem *
>> max_parallelism.
>
> I don't really know how to do a fair comparison between a parallel
> plan and a non-parallel plan.  Even if the parallel plan contains zero
> nodes that use work_mem, it might still use more memory than the
> non-parallel plan, because a new backend uses a bunch of memory.  If
> you really want a comparison that is fair on the basis of memory
> usage, you have to take that into account somehow.
>
> But even then, the parallel plan is also almost certainly consuming
> more CPU cycles to produce the same results.  Parallelism is all about
> trading away efficiency for execution time.  Not just because of
> current planner and executor limitations, but intrinsically, parallel
> plans are less efficient.  The globally optimal solution on a system
> that is short on either memory or CPU cycles is to turn parallelism
> off.

The guys who worked on the first attempt at Parallel Query for
Berkeley POSTGRES (and then ripped that out, moving to another project
called XPRS, which I have found no trace of; perhaps it ended up in
some commercial RDBMS) wrote this[1]:

"The objective function that XPRS uses for query optimization is a
combination of resource consumption and response time as follows:
 cost = resource consumption + w * response time

Here w is a system-specific weighting factor. A small w mostly
optimizes resource consumption, while a large w mostly optimizes
response time. Resource consumption is measured by the number of disk
pages accessed and number of tuples processed, while response time is
the elapsed time for executing the query."
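
To make the quoted objective function concrete, here is a minimal sketch
of how the weight w changes which plan wins. The Plan class, the
function names and all of the numbers are made up for illustration;
none of them come from the XPRS paper or from PostgreSQL's planner, and
the units are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Hypothetical plan summary; units are arbitrary and illustrative."""
    name: str
    resource_consumption: float  # e.g. pages accessed plus tuples processed
    response_time: float         # elapsed time for executing the query


def xprs_cost(plan: Plan, w: float) -> float:
    # cost = resource consumption + w * response time, as in the quote.
    return plan.resource_consumption + w * plan.response_time


def cheapest(plans: list[Plan], w: float) -> Plan:
    # Pick the plan with the lowest combined cost for a given weight w.
    return min(plans, key=lambda p: xprs_cost(p, w))


# Assumed numbers: the parallel plan does more total work (130 vs 100)
# but finishes sooner (30 vs 100).
serial = Plan("serial", resource_consumption=100.0, response_time=100.0)
parallel = Plan("parallel", resource_consumption=130.0, response_time=30.0)

for w in (0.1, 1.0):
    best = cheapest([serial, parallel], w)
    print(f"w={w}: {best.name} (cost {xprs_cost(best, w):.1f})")
```

With these assumed numbers the serial plan wins at w = 0.1 and the
parallel plan wins at w = 1.0; the break-even point is w = 30/70, about
0.43.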

[1] http://db.cs.berkeley.edu/papers/ERL-M93-28.pdf

-- 
Thomas Munro
http://www.enterprisedb.com
