
Re: Parallel Seq Scan - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Parallel Seq Scan
Date	January 26, 2015 22:39:49
Msg-id	17752.1422311968@sss.pgh.pa.us
In response to	Re: Parallel Seq Scan (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List	pgsql-hackers


Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 1/23/15 10:16 PM, Amit Kapila wrote:
>> Further, if we want to just get the benefit of parallel I/O, then
>> I think we can get that by parallelising partition scan where different
>> table partitions reside on different disk partitions, however that is
>> a matter of separate patch.

> I don't think we even have to go that far.

> My experience with Postgres is that it is *very* sensitive to IO latency (not bandwidth). I believe this is the case
> because complex queries tend to interleave CPU-intensive code in between IO requests. So we see this pattern:

> Wait 5ms on IO
> Compute for a few ms
> Wait 5ms on IO
> Compute for a few ms
> ...

> We blindly assume that the kernel will magically do read-ahead for us, but I've never seen that work so great. It
> certainly falls apart on something like an index scan.

> If we could instead do this:

> Wait for first IO, issue second IO request
> Compute
> Already have second IO request, issue third
> ...

> We'd be a lot less sensitive to IO latency.

It would take about five minutes of coding to prove or disprove this:
stick a PrefetchBuffer call into heapgetpage() to launch a request for the
next page as soon as we've read the current one, and then see if that
makes any obvious performance difference.  I'm not convinced that it will,
but if it did then we could think about how to make it work for real.
        regards, tom lane
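
For readers who want to try the idea outside the backend, here is a minimal standalone sketch of the same experiment. It is not PostgreSQL code: scan_with_prefetch() and its prefetch flag are hypothetical names, and the sketch assumes a Linux-like system where posix_fadvise() with POSIX_FADV_WILLNEED is available (PrefetchBuffer ends up issuing the same kind of hint on such platforms). It reads a file one 8 kB block at a time and, as soon as a block has arrived, hints the kernel about the next one before "computing" on the current one.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600		/* expose pread() and posix_fadvise() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192			/* PostgreSQL's default block size */

/*
 * Hypothetical helper, not part of PostgreSQL: sequentially read "path"
 * block by block.  When "prefetch" is nonzero, ask the kernel to start
 * fetching the next block right after the current one has been read.
 * Returns the number of bytes read, or -1 on error.
 */
long
scan_with_prefetch(const char *path, int prefetch)
{
	char		buf[BLOCK_SIZE];
	long		total = 0;
	off_t		offset = 0;
	ssize_t		n;
	int			fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	while ((n = pread(fd, buf, sizeof(buf), offset)) > 0)
	{
		offset += n;

		/* Advisory only: the kernel is free to ignore the hint. */
		if (prefetch)
			(void) posix_fadvise(fd, offset, BLOCK_SIZE,
								 POSIX_FADV_WILLNEED);

		/* Stand-in for the CPU work done on the current block. */
		total += n;
	}

	close(fd);
	return (n < 0) ? -1 : total;
}
```

Timing scan_with_prefetch(path, 0) against scan_with_prefetch(path, 1) on a file larger than RAM, with a cold cache, gives a rough feel for whether the hint helps on a given system. A purely sequential read like this one is exactly the case where kernel read-ahead already tends to work, so the sketch may well show no difference; it says nothing definitive about index scans.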
