Re: is parallel union all possible over dblink? - Mailing list pgsql-performance

From Svetlin Manavski
Subject Re: is parallel union all possible over dblink?
Date
Msg-id BANLkTim7gViO93cRhBPCjd-+n96mTTrORw@mail.gmail.com
Whole thread Raw
In response to Re: is parallel union all possible over dblink?  (Greg Spiegelberg <gspiegelberg@gmail.com>)
List pgsql-performance
I am now a bit puzzled after the initial satisfaction by Marinos' reply.

1. what do you mean exactly by "to ensure your UNION succeeds". The dblink docs do not mention anything about issues using directly  the suggested dblink_send_query() + dblink_get_results(). What problems should I expect in using them as suggested by Marinos?

2. If I understand correctly your method, it is not applicable from inside a stored procedure, is it? I need to keep all the business logic within PostgreSQL and provide just a clean interface to a simple GUI layer

3. Unfortunately GridSQL and Pgpool-II do not seem mature and stable products to be used in commercial software. Neither one provides clear documentation. GridSQL has been discontinued and it is not clear what kind of future it will have. I have not tried GridSQL but I did try Pgpool-II. It is disappointing that it may stop working correctly even just because of the way you write the query (e.g. using uppercase in a field or using named field in group by, ecc.). Even worse, when it does not recognize something in the parallel query, it just provides incorrect result (from only the local DB) rather than raising an exception. So I guess Pgpool-II in its current state is good only for very simple applications, which are not supposed to be reliable at all.

Thank you,
Svetlin Manavski


On Wed, Jun 29, 2011 at 8:14 PM, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:

This does work however you'll need to add a little more to it to ensure your UNION succeeds.  In pseudo...

connection #1:
CREATE TABLE target_1 ...
BEGIN;
LOCK TABLE target_1 IN ACCESS EXCLUSIVE MODE;
INSERT INTO target_1 SELECT ...
COMMIT;

connection #2:
CREATE TABLE target_2 ...
BEGIN;
LOCK TABLE target_2 IN ACCESS EXCLUSIVE MODE;
INSERT INTO target_2 SELECT ...
COMMIT;

connection #3:
SELECT * FROM target_1 UNION SELECT * FROM target_2;

Connections 1 and 2 can be done in simultaneously and after both have reached the LOCK statement then the SELECT on connection 3 can be executed.  Same fundamentals if all three connections are to different databases and connection 3 uses dblink to pull the data.

Another alternative is to use GridSQL.  I haven't used it myself but seen it in action on a large install with 4 backend databases.  Pretty slick.

Greg


pgsql-performance by date:

Previous
From: Samuel Gendler
Date:
Subject: Re: near identical queries have vastly different plans
Next
From: Anthony Presley
Date:
Subject: Infinite Cache