Re: PostgreSQL Parallel Processing ! - Mailing list pgsql-performance

From Claudio Freire
Subject Re: PostgreSQL Parallel Processing !
Msg-id CAGTBQpbm2uSfsQ=B-ASH36dKMknnoYmJFwe+OmSqjX6X_vvgUg@mail.gmail.com
In response to Re: PostgreSQL Parallel Processing !  (sridhar bamandlapally <sridhar.bn1@gmail.com>)
Responses Re: PostgreSQL Parallel Processing !  (sridhar bamandlapally <sridhar.bn1@gmail.com>)
Re: PostgreSQL Parallel Processing !  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On Wed, Jan 25, 2012 at 6:18 AM, sridhar bamandlapally
<sridhar.bn1@gmail.com> wrote:
> I just want to illustrate an idea may possible for bringing up
> parallel process in PostgreSQL at SQL-Query level
>
> The PARALLEL option in Oracle really give great improvment in
> performance, multi-thread concept has great possibilities
>
> In Oracle we have hints ( see below ) :
> SELECT /*+PARALLEL( e, 2 )*/ e.* FROM EMP e ;
>
> PostgreSQL ( may if possible in future ) :
> SELECT e.* FROM EMP PARALLEL ( e, 2) ;

It makes little sense (and is contrary to pg policy of no hinting) to
do it like that.

In fact, I've been musing for a long time on leveraging pg's
sophisticated planner to do the parallelization:
 * Synchronized scans mean that whenever a table has to be scanned
twice, the two scans can be done with two threads.
 * Knowing whether a scan will hit mostly disk or memory can help in
deciding whether to run scans in parallel or not (memory access can be
parallelized, and interleaved memory access isn't so bad, but
interleaved disk access is disastrous).
 * Big sorts can be parallelized quite easily
 * Number of threads to use can be a tunable or automatically set to
the number of processors on the system
 * Pipelining is another useful plan transformation: parallelize
I/O-bound nodes with CPU-bound ones.
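
To make the pipelining item concrete, here is a minimal sketch in Python, not PostgreSQL code. It assumes a producer that stands in for an I/O-bound node (say, a scan) and a consumer that stands in for a CPU-bound node (say, an expression evaluation), joined by a bounded queue. The names `pipelined`, `produce`, `consume` and `depth` are hypothetical, chosen only for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the producer's output


def pipelined(produce, consume, depth=4):
    """Overlap an I/O-bound producer with a CPU-bound consumer.

    produce: callable returning an iterable of items (the "scan" side).
    consume: callable applied to each item (the "compute" side).
    depth:   hypothetical tunable bounding how far the producer may run ahead.
    """
    buf = queue.Queue(maxsize=depth)

    def reader():
        try:
            for item in produce():
                buf.put(item)  # blocks when the consumer falls behind
        finally:
            buf.put(_DONE)  # always signal completion, even on error

    t = threading.Thread(target=reader, daemon=True)
    t.start()

    results = []
    while True:
        item = buf.get()  # blocks while the producer is waiting on I/O
        if item is _DONE:
            break
        results.append(consume(item))
    t.join()
    return results
```

For example, `pipelined(lambda: range(5), lambda x: x * x)` returns the squares in input order. The bounded queue is the design choice that matters: it lets the reader stay a few items ahead without buffering the whole input in memory.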

I know squat about how to implement this, but I've been considering
picking the low hanging fruit on that tree and patching up PG to try
the concept. Many of the items above would require a thread-safe
execution engine, which may be quite hard to achieve and may carry a
significant performance hit. Some don't, like parallel sort.
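
As a rough illustration of why parallel sort is the easy case, here is a sketch in Python rather than anything resembling PG internals. It splits the input into runs, sorts each run in its own worker, and does a k-way merge at the end; the workers share no state, which is why no thread-safe engine is needed. The function name `parallel_sort` and the `workers` tunable are assumptions for the example. Note that CPython threads will not give a real CPU speedup here because of the interpreter lock, so this shows the structure only.

```python
import heapq
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(rows, workers=None):
    """Sort a list by sorting independent runs concurrently, then merging."""
    # Hypothetical tunable: default to the number of processors on the system.
    workers = workers or os.cpu_count() or 1
    if not rows:
        return []

    # Split the input into roughly equal runs, at most one per worker.
    run_size = -(-len(rows) // workers)  # ceiling division
    runs = [rows[i:i + run_size] for i in range(0, len(rows), run_size)]

    # Each run is sorted independently; the workers share no mutable state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))

    # k-way merge of the sorted runs into one sorted list.
    return list(heapq.merge(*sorted_runs))
```

Calling `parallel_sort(data, workers=2)` gives the same result as `sorted(data)`; only the final merge step is sequential.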

Also, it's worth noting that parallelization will create some
priority inversion issues. Simple, non-parallelizable queries will
suffer from resource starvation when contending against more complex,
parallelizable ones.
