[Fwd: dblink patch - Asynchronous queries and parallel execution] - Mailing list pgsql-hackers

From Joe Conway
Subject [Fwd: dblink patch - Asynchronous queries and parallel execution]
Date
Msg-id 44C50FA0.9060500@joeconway.com
Whole thread Raw
Responses Re: [Fwd: dblink patch - Asynchronous queries and parallel  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
(cross-posted from PATCHES without the attachement)

I just received this (offlist), and have not had a chance to review it
myself yet, but figured I should post it now in case others want to have
a look and comment or discuss before feature freeze.

If there are no major objections to the concept, I'll take
responsibility to review and commit once I'm through with the "Values
list-of-targetlists" stuff.

(I'm not sure where we finished off with the discussion of PATCHES vs
HACKERS list for this kind of stuff, so I'm going to send another copy
of this to HACKERS without the attachement)

Thanks,

Joe


-------- Original Message --------
Subject: dblink patch - Asynchronous queries and parallel execution
Date: Mon, 24 Jul 2006 12:47:51 +0200
From: Kai Londenberg <K.Londenberg@librics.de>
To: mail@joeconway.com

Hello,

I needed parallel query execution features for a project, and so I
modified the dblink module to add support
for asynchronous query execution.

I thought others might find these features useful as well, therefore I'd
like to contribute this to the
current Postgresql contrib/dblink codebase.

The code is based on the contrib/dblink code included with the current
8.1.4 version of PostgreSQL.
I'm including the entire modified contrib/dblink directory in archived form.

I modified dblink.c and dblink.sql.in, and created the file README.async

Hope you like it, and include it in a possible next version of dblink.

The code still needs some testing and code review. I made it work for
me, but I don't have any
experience writing Postgresql Extensions, and haven't touched C for a while.

The most important thing about this code is that it allows parallel
execution of queries on several
backend databases, and re-joining of their results. This solves a lot of
scalability problems.

This is my corresponding README.async file which describes my additions.

-----
dblink-async patch by Kai Londenberg (K.Londenberg@librics.de)
24.7.2006

All code is licensed under the same terms as the rest of the dblink
code.

SQL Function declarations have been added at the bottom of dblink.sql

Added functions:

int dblink_send_query(connstr text, sql text)      Sends a query to a remote server for asynchronous execution.
      returns immediately without waiting for results.
          returns 1 on success, or 1 on failure.      results *must* be fetched by dblink_get_result(connstr)      a
runningquery may be cancelled by dblink_cancel_query(connstr)
 


dblink_get_result(connstr text[,bool fail_on_error])      retrieves the result of a query started by
dblink_send_query.
      Blocks until a result gets available.
      This function *must* be called if dblink_send_query returned      a 1, even on cancelled queries - otherwise the
connection         can't be used anymore.
 

dblink_get_connections()      List all open dblink connections by name.      Returns a comma separated string of all
connectionnames.      Takes no params
 
      Example: SELECT string_to_array(dblink_get_connections(), ',');

int dblink_is_busy(connstr)      returns 1 if connection is busy, 0 if it is not busy.      If this function returns 0,
itis guaranteed that dblink_get_result      will not block.
 

text dblink_cancel_query(connstr)      Cancels a running query on a given connection.      returns "OK" on success, or
anerror message on failure.
 


Examples:

---- Example 1 - Union over parallel executed remote queries --

SELECT dblink_connect('dtest1', 'host=server1 port=5432 dbname=dtest_1
user=duser password=pass');
SELECT * from   dblink_send_query('dtest1', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest2', 'host=server2 port=5432 dbname=dtest_2
user=duser password=pass');
SELECT * from   dblink_send_query('dtest2', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest3', 'host=server3 port=5432 dbname=dtest_3
user=duser password=pass');
SELECT * from   dblink_send_query('dtest3', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

CREATE TEMPORARY TABLE result AS
(SELECT * from dblink_get_result('dtest1') as t1(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest2') as t2(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest3') as t3(country_code text, city
text))
ORDER by city DESC LIMIT 100;

SELECT dblink_disconnect('dtest1');
SELECT dblink_disconnect('dtest2');
SELECT dblink_disconnect('dtest3');
SELECT * from result;

--- End of Example 1 ---

best regards,

Kai Londenberg






pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [PATCHES] The vacuum-ignore-vacuum patch
Next
From: Peter Eisentraut
Date:
Subject: Re: Making config file parser available to add-ins