Re: Parallel Seq Scan - Mailing list pgsql-hackers

From	Jim Nasby
Subject	Re: Parallel Seq Scan
Date	January 27, 2015 23:43:55
Msg-id	54C822A4.7040106@BlueTreble.com
In response to	Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On 1/26/15 11:11 PM, Amit Kapila wrote:
> On Tue, Jan 27, 2015 at 3:18 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>  >
>  > On 1/23/15 10:16 PM, Amit Kapila wrote:
>  >>
>  >> Further, if we want to just get the benefit of parallel I/O, then
>  >> I think we can get that by parallelising partition scan where different
>  >> table partitions reside on different disk partitions, however that is
>  >> a matter of separate patch.
>  >
>  >
>  > I don't think we even have to go that far.
>  >
>  >
>  > We'd be a lot less sensitive to IO latency.
>  >
>  > I wonder what kind of gains we would see if every SeqScan in a query spawned a worker just to read tuples and shove them in a queue (or shove a pointer to a buffer in the queue).
>  >
>
> Here IIUC, you want to say that just get the read done by one parallel
> worker and then all expression calculation (evaluation of qualification
> and target list) in the main backend, it seems to me that by doing it
> that way, the benefit of parallelisation will be lost due to tuple
> communication overhead (may be the overhead is less if we just
> pass a pointer to buffer but that will have another kind of problems
> like holding buffer pins for a longer period of time).
>
> I could see the advantage of testing on lines as suggested by Tom Lane,
> but that seems to be not directly related to what we want to achieve by
> this patch (parallel seq scan) or if you think otherwise then let me know?

There's some low-hanging fruit when it comes to improving our IO performance (or more specifically, decreasing our
sensitivity to IO latency). Perhaps the way to do that is with the parallel infrastructure, perhaps not. But I think
it's premature to look at parallelism for increasing IO performance, or worrying about things like how many IO threads
we should have before we at least look at simpler things we could do. We shouldn't assume there's nothing to be gained
short of a full parallelization implementation.
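
To make the "one reader feeding a queue" idea quoted above concrete, here is a toy sketch. It is Python, not PostgreSQL code, and every name in it (slow_read_tuples, prefetching_scan, evaluate) is hypothetical. It assumes IO latency and expression evaluation can both be modelled as sleeps, and it says nothing about tuple communication overhead or buffer pins, which are the real costs raised in this thread.

```python
import queue
import threading
import time

_DONE = object()  # sentinel the reader uses to signal end of scan


def slow_read_tuples(n_tuples, io_latency):
    """Hypothetical stand-in for a seq scan: each tuple costs one IO wait."""
    for i in range(n_tuples):
        time.sleep(io_latency)  # simulated IO latency
        yield i


def prefetching_scan(source, depth=64):
    """Yield items from `source`, read ahead by a background thread.

    `depth` bounds the queue so the reader cannot run arbitrarily far ahead.
    """
    q = queue.Queue(maxsize=depth)

    def reader():
        try:
            for item in source:
                q.put(item)  # blocks while the queue is full
        finally:
            q.put(_DONE)  # always wake the consumer, even on error

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item


def evaluate(tup, cpu_cost):
    """Hypothetical stand-in for qual and target list evaluation."""
    time.sleep(cpu_cost)
    return tup * 2


if __name__ == "__main__":
    n, io, cpu = 50, 0.002, 0.002
    t0 = time.perf_counter()
    serial = [evaluate(t, cpu) for t in slow_read_tuples(n, io)]
    t1 = time.perf_counter()
    overlapped = [evaluate(t, cpu) for t in prefetching_scan(slow_read_tuples(n, io))]
    t2 = time.perf_counter()
    assert serial == overlapped  # same tuples, same order
    print(f"serial: {t1 - t0:.3f}s, with reader thread: {t2 - t1:.3f}s")
```

The only thing this shows is the overlap: while the consumer evaluates one tuple, the reader is already waiting on the next read. Whether that wins in a real backend depends on the queueing cost, which the toy ignores.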
 

That's not to say there's nothing else we could use parallelism for. Sort, merge and hash operations come to mind.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
