Re: Parallel query execution - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: Parallel query execution
Date
Msg-id CAGTBQpbRYp9GNe2JjbXXCAO1-OsYXdh2fUyZVka+GXY22dj+iQ@mail.gmail.com
In response to Re: Parallel query execution  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Parallel query execution
List pgsql-hackers
On Wed, Jan 16, 2013 at 12:55 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> If memory serves me correctly (and it does, I suffered it a lot), the
>> performance hit is quite considerable. Enough to make it "a lot worse"
>> rather than "not as good".
>
> I feel like we must not be communicating very well.
>
> If the CPU is pegged at 100% and the I/O system is at 20%, adding
> another CPU at 100% will bring the I/O load up to 40% and you're now
> processing data twice as fast overall

Well, there's the flaw in your logic. It won't be that linear. Adding
another sequential scan will decrease the bandwidth the I/O system can
deliver. If it was doing, say, 10MB/s at 20% load, it will now be doing
20MB/s at 80% load (maybe even worse). Quite suddenly you'll hit
diminishing returns, and the I/O subsystem, which wasn't the bottleneck,
will become it, bandwidth being the key. You might end up with less
bandwidth than you started with if you go far enough past that knee.
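
To make the shape of that argument concrete, here is a toy model in
Python. It is only a sketch under made-up assumptions: the 50MB/s peak,
the 10MB/s per-scan demand, the seek_penalty factor and all function
names are hypothetical, and the penalty formula is an illustration, not
a measurement of any real disk.

```python
def disk_capacity_mb_s(n_scans, peak_mb_s=50.0, seek_penalty=0.6):
    """Assumed capacity of one disk serving n interleaved sequential scans.

    Hypothetical model: every extra concurrent scan adds seeking, so the
    deliverable bandwidth shrinks as the number of scans grows.
    """
    return peak_mb_s / (1.0 + seek_penalty * (n_scans - 1))


def aggregate_mb_s(n_scans, per_scan_demand_mb_s=10.0, **disk):
    """Total throughput: CPU-bound demand, capped by what the disk delivers."""
    demand = n_scans * per_scan_demand_mb_s
    return min(demand, disk_capacity_mb_s(n_scans, **disk))


def knee(max_scans=16, **kw):
    """Largest scan count reached before adding one more stops helping."""
    best = 1
    for n in range(2, max_scans + 1):
        if aggregate_mb_s(n, **kw) <= aggregate_mb_s(best, **kw):
            break
        best = n
    return best


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, round(aggregate_mb_s(n), 2))
    print("knee at", knee(), "scans")
```

With these invented numbers the second scan doubles throughput as in
the linear picture, the third barely helps, and by eight scans the total
is below what a single scan achieved. Where the knee falls depends
entirely on the assumed parameters, which is the hard part in practice.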

Add some concurrent operations (connections) to the mix and it just gets worse.

Figuring out where the knee is may be the hardest problem you'll face.
I don't think it'll be predictable enough to make I/O parallelization
in that case worth the effort.

If you instead think of parallelizing random I/O (say index scans
within nested loops), that might work (or it might not). Again it
depends a helluva lot on what else is contending for the I/O
resources and how far past the optimum you push it. I've faced this
problem when trying to prefetch on index scans. If you try to prefetch
too much, you induce extra delays and it's a bad tradeoff.
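
For illustration, bounded prefetching of that kind could look roughly
like the Python sketch below. It is not the code I was testing, just a
minimal example assuming a Unix system where os.posix_fadvise and
os.pread are available; BLOCK_SIZE, prefetch_distance and the function
name are hypothetical. The point is the cap on how far ahead the hints
run.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for this sketch


def read_blocks_with_prefetch(fd, block_numbers, prefetch_distance=4):
    """Yield (block_no, data) in the given order, hinting the kernel ahead.

    block_numbers stands in for the heap blocks an index scan would visit.
    At most prefetch_distance blocks beyond the current one are hinted.
    """
    can_advise = hasattr(os, "posix_fadvise")  # not present on every platform
    next_hint = 0  # index of the next entry that has not been hinted yet
    for i, block_no in enumerate(block_numbers):
        limit = min(len(block_numbers), i + 1 + prefetch_distance)
        next_hint = max(next_hint, i + 1)
        while next_hint < limit:
            if can_advise:
                os.posix_fadvise(fd, block_numbers[next_hint] * BLOCK_SIZE,
                                 BLOCK_SIZE, os.POSIX_FADV_WILLNEED)
            next_hint += 1
        # The actual read; it may already be cached thanks to an earlier hint.
        yield block_no, os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)
```

Setting prefetch_distance to 0 disables the hints, which gives a
baseline to compare against when looking for the point where more
prefetching starts to hurt.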

Feel free to do your own testing.


