Thread: Parallel query execution for UNION ALL Queries

Parallel query execution for UNION ALL Queries

From
"Benjamin Arai"
Date:
Hi,

If I have a query such as:

SELECT * FROM ((SELECT * FROM A) UNION ALL (SELECT * FROM B)) AS u WHERE
blah='food';

Assuming that tables A and B both have the same attributes and the data
between the tables is not partitioned in any special way, does PostgreSQL
execute WHERE blah='food' on both tables simultaneously or what?  If not, is
there a way to execute the query on both in parallel and then aggregate the
results?
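
For reference, a minimal sketch of how such a query could be generated for any number of tables, with the filter written into each branch explicitly. The helper name build_union_all and the "%s" placeholder style are assumptions made for illustration (the placeholder syntax depends on the client driver); nothing here comes from PostgreSQL itself.

```python
def build_union_all(tables, column, placeholder="%s"):
    """Return one UNION ALL statement with the filter repeated per branch."""
    if not tables:
        raise ValueError("need at least one table")
    for name in list(tables) + [column]:
        # Identifiers cannot be bound as parameters, so accept only plain names.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    branches = [
        f"SELECT * FROM {table} WHERE {column} = {placeholder}"
        for table in tables
    ]
    return "\nUNION ALL\n".join(branches)


# Hypothetical table and column names taken from the example query.
sql = build_union_all(["a", "b"], "blah")
params = ["food"] * 2  # one bound value per branch
print(sql)
```

The value to compare against is passed separately as a bound parameter, once per branch, rather than being pasted into the SQL text.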

To give some context, I have a very large amount of new data being loaded
each week.  Currently I am partitioning the data into a new table every
month, which is working great from an indexing standpoint.  But I want to
parallelize searches if possible to reduce the performance loss of having
multiple tables.

Benjamin


Re: Parallel query execution for UNION ALL Queries

From
"Jonah H. Harris"
Date:
On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> But I want to parallelize searches if possible to reduce
> the performance loss of having multiple tables.

PostgreSQL does not support parallel query.  Parallel query on top of
PostgreSQL is provided by ExtenDB and PGPool-II.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | jharris@enterprisedb.com
Iselin, New Jersey 08830            | http://www.enterprisedb.com/

Re: Parallel query execution for UNION ALL Queries

From
"Scott Marlowe"
Date:
On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> Hi,
>
> If I have a query such as:
>
> SELECT * FROM ((SELECT * FROM A) UNION ALL (SELECT * FROM B)) AS u WHERE
> blah='food';
>
> Assuming that tables A and B both have the same attributes and the data
> between the tables is not partitioned in any special way, does PostgreSQL
> execute WHERE blah='food' on both tables simultaneously or what?  If not, is
> there a way to execute the query on both in parallel and then aggregate the
> results?
>
> To give some context, I have a very large amount of new data being loaded
> each week.  Currently I am partitioning the data into a new table every
> month, which is working great from an indexing standpoint.  But I want to
> parallelize searches if possible to reduce the performance loss of having
> multiple tables.

Most of the time, the real issue would be the I/O throughput for such
queries, not the CPU capability.

If you have only one disk for your data storage, you're likely to get
WORSE performance if you have two queries running at once, since the
heads would now be going back and forth from one data set to the
other.

EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
query parallelization, but it comes at a cost, and that cost is making
sure you have enough spindles / I/O bandwidth that you won't be
actually slowing your system down.
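
Since the outcome depends on the storage underneath, it is worth timing both strategies on the real hardware before committing to either. A minimal harness for that comparison might look like the sketch below. All names in it are hypothetical, and fake_scan merely sleeps as a placeholder for a per-table query: sleeps overlap perfectly, which a single disk would not, so the numbers printed here say nothing about real I/O behaviour.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(run):
    """Return (elapsed_seconds, result) for a zero-argument callable."""
    start = time.perf_counter()
    result = run()
    return time.perf_counter() - start, result


def run_sequential(tasks):
    """Run the tasks one after another."""
    return [task() for task in tasks]


def run_concurrent(tasks):
    """Run the tasks at the same time, one thread per task."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(lambda task: task(), tasks))


def fake_scan(label, seconds=0.05):
    """Placeholder for a per-table query; sleeps instead of reading a disk."""
    def task():
        time.sleep(seconds)
        return label
    return task


tasks = [fake_scan("a"), fake_scan("b")]
seq_time, seq_rows = timed(lambda: run_sequential(tasks))
par_time, par_rows = timed(lambda: run_concurrent(tasks))
print(f"sequential {seq_time:.3f}s, concurrent {par_time:.3f}s")
```

To use it for real, replace the fake_scan tasks with callables that run the actual per-table queries, and repeat the measurement with a cold and a warm cache.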

Re: Parallel query execution for UNION ALL Queries

From
"Jim C. Nasby"
Date:
On Wed, Jul 18, 2007 at 11:30:48AM -0500, Scott Marlowe wrote:
> EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
> query parallelization, but it comes at a cost, and that cost is making
> sure you have enough spindles / I/O bandwidth that you won't be
> actually slowing your system down.

I think you're thinking of ExtenDB. :)
--
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [GENERAL] Parallel query execution for UNION ALL Queries

From
Dimitri Fontaine
Date:
Hi,

On Wednesday 18 July 2007, Jonah H. Harris wrote:
> On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> > But I want to parallelize searches if possible to reduce
> > the performance loss of having multiple tables.
>
> PostgreSQL does not support parallel query.  Parallel query on top of
> PostgreSQL is provided by ExtenDB and PGPool-II.

It seems to me that:
 - Greenplum provides a commercial parallel query engine on top of
   PostgreSQL,

 - plproxy could be a solution to the given problem.
   https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

Hope this helps,
--
dim

Re: [GENERAL] Parallel query execution for UNION ALL Queries

From
"Luke Lonergan"
Date:
Dimitri,

> It seems to me that:
>  - Greenplum provides a commercial parallel query engine on top of
>    PostgreSQL,

I certainly think so and so do our customers in production with 100s of
terabytes :-)

>  - plproxy could be a solution to the given problem.
>    https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

This is solving real-world problems at Skype, of a different kind than
those Greenplum addresses; well worth checking out.

- Luke