Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Parallel Seq Scan
Date
Msg-id 54BEBCFB.4000304@BlueTreble.com
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 1/19/15 7:20 AM, Robert Haas wrote:
>> >Another thing is that I think prefetching is not supported on all platforms
>> >(Windows) and for such systems as per above algorithm we need to
>> >rely on block-by-block method.
> Well, I think we should try to set up a test to see if this is hurting
> us.  First, do a sequential scan of a relation at least twice
> as large as RAM.  Then, do a parallel sequential scan of the same
> relation with 2 workers.  Repeat these in alternation several times.
> If the operating system is accomplishing meaningful readahead, and the
> parallel sequential scan is breaking it, then since the test is
> I/O-bound I would expect to see the parallel scan actually being
> slower than the normal way.
>
> Or perhaps there is some other test that would be better (ideas
> welcome) but the point is we may need something like this, but we
> should try to figure out whether we need it before spending too much
> time on it.
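
For illustration, the alternating test described above could be driven by a small harness like the one below. This is only a sketch: `alternate_runs`, `parallel_is_slower`, `run_serial` and `run_parallel` are hypothetical names, and the two callables are assumed to run the plain scan and the 2-worker scan of the same relation. Nothing here comes from the patch itself.

```python
import time
from statistics import median


def alternate_runs(run_serial, run_parallel, rounds=5):
    """Time two scan variants in alternation and return per-variant timings.

    run_serial / run_parallel are hypothetical callables supplied by the
    tester, e.g. functions that execute the query over a database connection.
    """
    timings = {"serial": [], "parallel": []}
    for _ in range(rounds):
        # Alternate the variants so neither one always runs on a warmer cache.
        for name, run in (("serial", run_serial), ("parallel", run_parallel)):
            start = time.perf_counter()
            run()
            timings[name].append(time.perf_counter() - start)
    return timings


def parallel_is_slower(timings):
    """Compare medians so a single outlier run does not decide the result."""
    return median(timings["parallel"]) > median(timings["serial"])
```

If the parallel scan were breaking OS readahead on an I/O-bound test, `parallel_is_slower` would be expected to return True across repeated runs.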

I'm guessing that not all supported platforms have prefetching that actually helps us... but it would be good to
actually know if that's the case.
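
As a rough illustration of the platform question, here is a sketch of a reader that issues a prefetch hint where one is available and otherwise falls back to a block-by-block read. It is not PostgreSQL code: `prefetch_supported`, `read_blocks` and `prefetch_distance` are made-up names, and the 8192-byte block size is just PostgreSQL's default used as an assumption. Python's `os.posix_fadvise` is only present on platforms that provide the underlying call, which is why the check uses `hasattr`.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for illustration (PostgreSQL's default)


def prefetch_supported():
    """True if this platform's Python exposes posix_fadvise (not on Windows)."""
    return hasattr(os, "posix_fadvise") and hasattr(os, "POSIX_FADV_WILLNEED")


def read_blocks(path, block_numbers, prefetch_distance=4):
    """Read the given blocks in order, hinting upcoming ones when possible.

    Returns the blocks as a list of bytes objects. Where prefetch is not
    available, this degrades to a plain block-by-block read.
    """
    blocks = list(block_numbers)
    use_prefetch = prefetch_supported()
    out = []
    # O_BINARY only exists on Windows; elsewhere the flag is simply 0.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        for i, blkno in enumerate(blocks):
            ahead = i + prefetch_distance
            if use_prefetch and ahead < len(blocks):
                # Hint the block prefetch_distance positions ahead.
                # Advisory only: the kernel is free to ignore it.
                os.posix_fadvise(fd, blocks[ahead] * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
            os.lseek(fd, blkno * BLOCK_SIZE, os.SEEK_SET)
            out.append(os.read(fd, BLOCK_SIZE))
    finally:
        os.close(fd)
    return out
```

Whether the hint actually helps is exactly what would need measuring; the sketch only shows that both code paths return the same data.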
 

Where I think this gets a lot more interesting is if we could apply this to an index scan. My thought is that would
result in one worker mostly being responsible for advancing the index scan itself while the other workers were issuing
(and waiting on) heap IO. So even if this doesn't turn out to be a win for seqscan, there are other places we might well
want to use it.
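
To make the division of labour concrete, here is a toy sketch of that shape: one thread advances the "index" and hands heap block numbers to a pool of workers that do the fetches. It is a model of the idea only, using threads and a queue rather than anything resembling backend workers. `parallel_index_scan`, `index_entries` and `fetch_heap_block` are hypothetical names, and there is no error handling.

```python
import queue
import threading


def parallel_index_scan(index_entries, fetch_heap_block, n_workers=2):
    """One thread walks the index; n_workers threads do the heap fetches.

    index_entries: iterable of (key, heap_block_number), in index order.
    fetch_heap_block: hypothetical callable standing in for a heap read.
    Returns a dict mapping key to whatever fetch_heap_block returned.
    """
    work = queue.Queue(maxsize=64)  # bounded so the index scan cannot run far ahead
    results = {}
    lock = threading.Lock()
    done = object()  # sentinel telling a worker to stop

    def advance_index():
        for entry in index_entries:
            work.put(entry)
        for _ in range(n_workers):
            work.put(done)  # one stop marker per worker

    def heap_worker():
        while True:
            item = work.get()
            if item is done:
                return
            key, blkno = item
            row = fetch_heap_block(blkno)  # the (potentially slow) heap I/O
            with lock:
                results[key] = row

    threads = [threading.Thread(target=advance_index)]
    threads += [threading.Thread(target=heap_worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Note that the workers complete in arbitrary order, so a real implementation that needs index order would have to restore it; the sketch sidesteps that by returning a dict.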
 
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


