Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Parallel Seq Scan
Msg-id CAM3SWZTZnhu0awWe6QO8xN2kw_rKS1NpdpxRfaES3LJ_MnEMfA@mail.gmail.com
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Jul 22, 2015 at 10:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> One thing I noticed that is a bit dismaying is that we don't get a lot
> of benefit from having more workers.  Look at the 0.1 data.  At 2
> workers, if we scaled perfectly, we would be 3x faster (since the
> master can do work too), but we are actually 2.4x faster.  Each
> process is on the average 80% efficient.  That's respectable.  At 4
> workers, we would be 5x faster with perfect scaling; here we are 3.5x
> faster.   So the third and fourth worker were about 50% efficient.
> Hmm, not as good.  But then going up to 8 workers bought us basically
> nothing.

...sorry for bumping up this mail from July...

I don't think you meant to imply it, but why should we be able to
scale perfectly? Even when the table fits entirely in shared_buffers,
I would expect memory bandwidth to become the bottleneck before a
large number of workers are added. Context switching might also be
problematic.
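
For reference, the arithmetic behind the quoted figures can be sketched as below. This is only an illustration: the function names are hypothetical, the 2.4x and 3.5x speedups are the ones quoted above, and the "workers + 1" ideal assumes, as the quoted mail does, that the master also does a full share of the scan.

```python
def average_efficiency(speedup, workers):
    """Measured speedup divided by the ideal speedup.

    The ideal is workers + 1 because the master process scans too
    (assumption taken from the quoted message).
    """
    return speedup / (workers + 1)


def marginal_efficiency(speedup_before, speedup_after, workers_added):
    """Extra speedup contributed per additional worker."""
    return (speedup_after - speedup_before) / workers_added


# Speedups quoted above for the 0.1 data set: {workers: speedup}
measured = {2: 2.4, 4: 3.5}

avg_2 = average_efficiency(measured[2], 2)
avg_4 = average_efficiency(measured[4], 4)
marginal_3_4 = marginal_efficiency(measured[2], measured[4], 2)

print(f"average efficiency at 2 workers: {avg_2:.2f}")
print(f"average efficiency at 4 workers: {avg_4:.2f}")
print(f"marginal efficiency of workers 3 and 4: {marginal_3_4:.2f}")
```

The marginal figure works out to 0.55, which is the "about 50%" in the quoted text. The sketch says nothing about where the lost efficiency goes, so it cannot tell memory bandwidth apart from context switching.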

I have almost no sense of whether this is below or above par, which is
what I'm really curious about. FWIW, I think that parallel sort will
scale somewhat better.

-- 
Peter Geoghegan
