Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Parallel Seq Scan
Date
Msg-id 5482001E.20001@BlueTreble.com
In response to Re: Parallel Seq Scan  (José Luis Tallón <jltallon@adv-solutions.net>)
Responses Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 12/5/14, 9:08 AM, José Luis Tallón wrote:
>
> Moreover, when load goes up, the relative cost of parallel working should go up as well.
> Something like:
>      c = number of cores
>      l = 1min-load
>
>      additional_cost = tuple estimate * cpu_tuple_cost * (l+1)/(c-1)
>
> (for c>1, of course)
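
As a rough illustration of the formula quoted above, here is a minimal Python sketch. The function and helper names are hypothetical and this is not PostgreSQL planner code; the only value taken from PostgreSQL is the default cpu_tuple_cost of 0.01. The helper that reads the load average assumes a Unix-like system.

```python
import os


def parallel_additional_cost(tuple_estimate, cpu_tuple_cost, load_1min, cores):
    """additional_cost = tuple_estimate * cpu_tuple_cost * (l + 1) / (c - 1).

    Hypothetical helper: l is the 1-minute load average, c the core count.
    """
    if cores <= 1:
        # The quoted formula is only stated for c > 1.
        raise ValueError("formula is only defined for c > 1")
    return tuple_estimate * cpu_tuple_cost * (load_1min + 1) / (cores - 1)


def current_load_and_cores():
    """Read l and c from the OS (assumption: os.getloadavg is Unix-only)."""
    load_1min = os.getloadavg()[0]
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return load_1min, cores


# Same scan, same 8-core machine, idle versus loaded.
idle = parallel_additional_cost(1_000_000, 0.01, load_1min=0.0, cores=8)
busy = parallel_additional_cost(1_000_000, 0.01, load_1min=6.0, cores=8)
```

With these inputs the penalty on the loaded machine is seven times the idle one, since only the (l + 1) factor changes.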

...

> The parallel seq scan nodes are definitely the best approach for "parallel query", since the planner can optimize them based on cost.
 
> I'm wondering about the ability to modify the implementation of some methods themselves once at execution time: given a previously planned query, chances are that, at execution time (I'm specifically thinking about prepared statements here), a different implementation of the same "node" might be more suitable and could be used instead while the condition holds.
 

These comments got me wondering... would it be better to decide on parallelism during execution instead of at plan
time? That would allow us to dynamically scale parallelism based on system load. If we don't even consider parallelism
until we've pulled some number of tuples/pages from a relation, this would also eliminate all parallel overhead on small
relations.
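
To make the idea concrete, here is a minimal Python sketch of an execution-time decision. Everything in it is an assumption for illustration: the function name, the page threshold, and the "idle cores" heuristic are hypothetical and do not correspond to any existing PostgreSQL executor interface.

```python
import math


def choose_workers(pages_read, page_threshold, load_1min, cores, max_workers):
    """Return how many extra workers to start; 0 means keep scanning serially.

    Hypothetical policy: stay serial until page_threshold pages have been
    read, then use at most as many workers as there appear to be idle cores.
    """
    if pages_read < page_threshold:
        # Small relation (so far): pay no parallel setup overhead at all.
        return 0
    # Assumed heuristic: one core for the leader, the rest minus current load.
    idle_cores = math.floor(cores - 1 - load_1min)
    return max(0, min(max_workers, idle_cores))


# A scan that has read only 10 pages never considers parallelism.
small = choose_workers(10, page_threshold=64, load_1min=0.0, cores=8, max_workers=4)
# Past the threshold, the worker count shrinks as system load rises.
light = choose_workers(100, page_threshold=64, load_1min=0.5, cores=8, max_workers=4)
heavy = choose_workers(100, page_threshold=64, load_1min=6.2, cores=8, max_workers=4)
```

In this sketch the check runs inside the scan itself, so a prepared statement planned once could still end up with a different degree of parallelism on each execution.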
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


