Re: Support Parallel Query Execution in Executor - Mailing list pgsql-hackers

From Gregory Maxwell
Subject Re: Support Parallel Query Execution in Executor
Date
Msg-id e692861c0604081743k7b3e8f7bu45241ecc34d26bcf@mail.gmail.com
In response to Re: Support Parallel Query Execution in Executor  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Support Parallel Query Execution in Executor  ("Luke Lonergan" <llonergan@greenplum.com>)
Re: Support Parallel Query Execution in Executor  (Myron Scott <lister@sacadia.com>)
Re: Support Parallel Query Execution in Executor  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4/8/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This is exactly the bit of optimism I was questioning.  We've already
> been sweating blood trying to reduce multiprocessor contention on data
> structures in which collisions ought to be avoidable (ie, buffer arrays
> where you hope not everyone is hitting the same buffer at once).  I
> think passing large volumes of data between different processes is going
> to incur quite a lot of locking overhead, pipeline stalls for cache line
> transfers, etc, etc, because heavy contention for the transfer buffer is
> simply not going to be avoidable.

We should consider true parallel execution and overlapping execution
with I/O as distinct cases.

For example, one case made in this thread involved bursty performance
with seqscans, presumably because I/O stalled while processing was
being performed.  In general, this can be avoided without parallel
execution by using non-blocking I/O and making an effort to keep the
request pipeline full.
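
As a rough illustration of keeping the request pipeline full from a
single process, here is a minimal Python sketch.  It is not PostgreSQL
code: the function name, the 8192-byte block size and the window
parameter are all assumptions made for the example.  It uses
os.posix_fadvise() with POSIX_FADV_WILLNEED as a stand-in for
non-blocking I/O, hinting the kernel about upcoming blocks so the reads
can overlap with processing, and it falls back to plain reads where
that call is unavailable.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for this sketch


def scan_with_readahead(path, process_block, window=8):
    """Read a file block by block, hinting the kernel about upcoming blocks."""
    # posix_fadvise is not available on every platform; skip the hints if absent.
    advise = getattr(os, "posix_fadvise", None)
    willneed = getattr(os, "POSIX_FADV_WILLNEED", None)
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        hinted_up_to = 0  # byte offset up to which hints have been issued
        offset = 0
        while offset < size:
            # Keep the request pipeline full: stay `window` blocks ahead
            # of the block that is about to be read.
            target = min(size, offset + (window + 1) * BLOCK_SIZE)
            if advise is not None and willneed is not None and hinted_up_to < target:
                start = max(hinted_up_to, offset)
                advise(fd, start, target - start, willneed)
                hinted_up_to = target
            block = os.pread(fd, BLOCK_SIZE, offset)
            if not block:
                break
            # Processing happens here while the hinted reads proceed.
            process_block(block)
            offset += len(block)
    finally:
        os.close(fd)
```

The hint is only advisory, so whether it smooths out the stalls depends
on the OS and the storage underneath.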

There are other cases where it is useful to perform parallel I/O
without parallel processing.  For example, a query that performs an
index lookup per row can benefit from running some number of those
lookups in parallel in order to hide the lookup latency and give the
OS and disk elevators a chance to make the random accesses a little
more orderly.  This can be accomplished without true parallel
processing.  (Perhaps PG does this already?)
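
To make the idea concrete, here is a small Python sketch that keeps
several lookups outstanding while the caller consumes results in order
from a single thread.  It is only an illustration under stated
assumptions: overlapped_lookups, lookup and in_flight are hypothetical
names, lookup stands for any blocking per-row index probe, and the
worker threads exist only to wait on I/O, not to spread CPU work.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def overlapped_lookups(keys, lookup, in_flight=4):
    """Yield (key, lookup(key)) in input order with several lookups pending."""
    keys = iter(keys)
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        # Prime the window with the first `in_flight` requests.
        pending = deque((k, pool.submit(lookup, k)) for k in islice(keys, in_flight))
        while pending:
            key, future = pending.popleft()
            # Blocks only if this particular lookup has not finished yet;
            # the other pending lookups keep running in the meantime.
            result = future.result()
            # Refill the window so up to `in_flight` requests stay pending.
            for k in islice(keys, 1):
                pending.append((k, pool.submit(lookup, k)))
            yield key, result
```

A caller would iterate over overlapped_lookups(row_keys, probe_index),
where row_keys and probe_index are again placeholders.  With several
requests pending at once, the OS and the disk scheduler see more than
one request at a time and are free to reorder them.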

Parallel execution to get access to more CPU and memory bandwidth is a
fine thing, and worth the costs in many cases, but it shouldn't be
used as an easy way to get parallel I/O without careful consideration.

