Home > mailing lists

FDW-based dblink - Mailing list pgsql-hackers

From	Itagaki Takahiro
Subject	FDW-based dblink
Date	August 13, 2009 10:35:34
Msg-id	20090813184143.7714.52131E4D@oss.ntt.co.jp Whole thread Raw
Responses	Re: FDW-based dblink (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) Re: FDW-based dblink (Alvaro Herrera <alvherre@commandprompt.com>)
List	pgsql-hackers

Tree view

Here is a proposal to integrate contrib/dblink and SQL/MED (foreign
data wrapper).

Dblink manages connections and transactions by itself at the moment,
but there are some benefits to split database connectors into FDW.
Dblink will uses those multiple connectors. For example, we will be
able to retrieve data from Oracle into PostgreSQL directly if we had
Oracle-connector.
New syntax in SQL
-------------------
CREATE FOREIGN DATA WRAPPER postgres   CONNECTOR pg_catalog.dblink_postgres;
or
CREATE FOREIGN DATA WRAPPER postgres   OPTIONS (connector 'pg_catalog.dblink_postgres')

We don't have to modify gram.y if we take the latter syntax, but need
to modify VALIDATORs to distinguish 'connector' and other options.
The 'connector' option should not be passed as connection string.
New interface in C
--------------------
pg_catalog.dblink_postgres is a function that havs folloing prototype:
   Connection *connector(List *defElems);

The argument 'defElems' is a concatenated connection options
in FDW, server, and user-mapping.

Also new two interfaces will be introduced:
   interface Connection    /* represents PGconn */   {       void    disconnect(self);       Cursor *open(self, query,
fetchsize);/* for SELECT */       int64   exec(self, query);            /* for UPDATE, INSERT, DELETE */       bool
transaction_command(self,type);   }
 
   interface Cursor    /* represents PGresult and server-side cursor */   {       bool fetch(self, OUT values);
voidclose(self);   }
 

They have some methods implemented with function pointers. The benefit
of using function pointers is that we only have to export one connector
function to pg_proc.

The Cursor interface represents both result-set and server-side cursor.
PostgreSQL has SQL-level cursor, but there are some database that have
protocol-level cursor. This abstraction layer is needed for dblink to
handle connectors to other databases.
Other features
----------------
Present dblink is a thin wrapper of libpq, but some of my customers
want automatic transaction managements. Remote transactions are
committed with 2PC when the local transaction is committed.
To achieve it, I think we need on-commit trigger is needed,
but it is hard to implement with current infrastructure.
(That is one of the reason I proposed to merge dblink into core.)
Other considerations
----------------------
The proposed method is a SQL-based connector. There might be another
approach -- ScanKey-based connector. It is similar to the index access
method interface (pg_am). It takes relation id and scankeys instead of
SQL text. The scanKey-based approach will work better if we try to pass
WHERE-clause to an external database. However, I think we need SQL-based
interface in any case. ScanKey will be converted to SQL and passed
to an external database.


I have a prototype of the feature. I'd like to submit it for 8.5.
Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

pgsql-hackers by date:

From: Pierre Frédéric Caillaud
Date: 13 August 2009, 09:59:53
Subject: Re: COPY speedup

From: Peter Eisentraut
Date: 13 August 2009, 10:48:00
Subject: trigger functions can only be called as triggers

FDW-based dblink - Mailing list pgsql-hackers

Previous

Next