Thread: FDW-based dblink (WIP)

FDW-based dblink (WIP)

From

Itagaki Takahiro

Date:

19 August 2009, 05:07:20

Here is a WIP patch for a foreign data wrapper based dblink.

It integrates dblink module into core and adds a new functionality,
automatic transaction management. The new interface of dblink is
exported by include/foreign/dblink.h. We can easily write a connector
module for another database because we can reuse transaction and
resource management parts in core.


Syntax to create FDW with connector is below:
    CREATE FOREIGN DATA WRAPPER postgresql
        VALIDATOR postgresql_fdw_validator
        CONNECTOR postgresql_fdw_connector
        OPTIONS (...);

contrib/dblink2 is a sample of postgres connector. It exports one function:
    CREATE FUNCTION postgresql_fdw_connector(options internal)
      RETURNS internal -- returns a connection object

Basic dblink functions are moved to postgres core:
            Name       | Result type  | Argument data types
    -------------------+--------------+----------------------
     dblink            | SETOF record | text, text
     dblink_close      | boolean      | integer
     dblink_connect    | boolean      | text
     dblink_connect    | boolean      | text, text
     dblink_disconnect | boolean      | text
     dblink_exec       | bigint       | text, text
     dblink_fetch      | SETOF record | integer, integer
     dblink_open       | integer      | text, text

The new dblink can work together closely with local transactions. If a
local transaction is committed or rollbacked, remote transactions take
the same status with the local one. Please set max_prepared_transactions
to 1 or greater if you could test the patch.

I want pretty much the automatic transaction management. It is useful to
write applied modules like materialized-view-over-network. But it should
be able to be turned off if we don't want it. I'll work on those parts next.

    -- connect
    CREATE SERVER server_postgres FOREIGN DATA WRAPPER postgresql;
    SELECT dblink_connect('conn_postgres', 'server_postgres');
    -- commit both local and remote transactions.
    BEGIN;
    SELECT dblink_exec('conn_postgres', 'UPDATE ...');
    COMMIT;
    -- rollback both local and remote transactions.
    BEGIN;
    SELECT dblink_exec('conn_postgres', 'UPDATE ...');
    ROLLBACK;
    -- disconnect
    SELECT dblink_disconnect('conn_postgres');

I've not ported all features in present dblink, but I'd like to hear
wheather the goal and the concepts are reasonable. Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

dblink2-20090819.patch

Re: FDW-based dblink (WIP)

From

Tom Lane

Date:

19 August 2009, 12:16:59

Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Here is a WIP patch for a foreign data wrapper based dblink.
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management.

I don't believe there is any consensus for integrating dblink into core,
and I for one will resist that strongly.  Keep it in contrib.
        regards, tom lane

Re: FDW-based dblink (WIP)

From

Pavel Stehule

Date:

19 August 2009, 12:21:52

2009/8/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>> Here is a WIP patch for a foreign data wrapper based dblink.
>> It integrates dblink module into core and adds a new functionality,
>> automatic transaction management.
>
> I don't believe there is any consensus for integrating dblink into core,
> and I for one will resist that strongly.  Keep it in contrib.

if integration means, so I could to write query like

SELECT * FROM otherdatabase.schema.table ....
UPDATE otherdb.table SET ...

I am for integration.

regards
Pavel Stehule
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: FDW-based dblink (WIP)

From

Tom Lane

Date:

19 August 2009, 12:29:38

Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2009/8/19 Tom Lane <tgl@sss.pgh.pa.us>:
>> I don't believe there is any consensus for integrating dblink into core,
>> and I for one will resist that strongly.  Keep it in contrib.

> if integration means, so I could to write query like
> SELECT * FROM otherdatabase.schema.table ....
> UPDATE otherdb.table SET ...
> I am for integration.

That is not what "integrating dblink" means --- what Itagaki-san is
talking about is moving the dblink_xxx functions into core.  What
you are talking about is actual SQL/MED functionality, which we should
indeed try to get into core someday.  But dblink is a dead end as far
as standards compliance goes.  Between that and the potential security
issues, we should not put it in core.
        regards, tom lane

Re: FDW-based dblink (WIP)

From

Heikki Linnakangas

Date:

19 August 2009, 12:31:47

Itagaki Takahiro wrote:
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management.

Why does it need to be integrated to core? I'd prefer to keep dblink out
of core, unless there's a reason.

> I want pretty much the automatic transaction management. It is useful to
> write applied modules like materialized-view-over-network. But it should
> be able to be turned off if we don't want it. I'll work on those parts next.

The transaction management is very unsafe as it is, for the reasons I
mentioned earlier.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: FDW-based dblink (WIP)

From

Pavel Stehule

Date:

19 August 2009, 12:33:09

2009/8/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2009/8/19 Tom Lane <tgl@sss.pgh.pa.us>:
>>> I don't believe there is any consensus for integrating dblink into core,
>>> and I for one will resist that strongly.  Keep it in contrib.
>
>> if integration means, so I could to write query like
>> SELECT * FROM otherdatabase.schema.table ....
>> UPDATE otherdb.table SET ...
>> I am for integration.
>
> That is not what "integrating dblink" means --- what Itagaki-san is
> talking about is moving the dblink_xxx functions into core.  What
> you are talking about is actual SQL/MED functionality, which we should
> indeed try to get into core someday.  But dblink is a dead end as far
> as standards compliance goes.  Between that and the potential security
> issues, we should not put it in core.
>

aha, - ok

regards
Pavel Stehule

>                        regards, tom lane
>

Re: FDW-based dblink (WIP)

From

Joe Conway

Date:

19 August 2009, 12:38:47

Tom Lane wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2009/8/19 Tom Lane <tgl@sss.pgh.pa.us>:
>>> I don't believe there is any consensus for integrating dblink into core,
>>> and I for one will resist that strongly. Â Keep it in contrib.
>
>> if integration means, so I could to write query like
>> SELECT * FROM otherdatabase.schema.table ....
>> UPDATE otherdb.table SET ...
>> I am for integration.
>
> That is not what "integrating dblink" means --- what Itagaki-san is
> talking about is moving the dblink_xxx functions into core.  What
> you are talking about is actual SQL/MED functionality, which we should
> indeed try to get into core someday.  But dblink is a dead end as far
> as standards compliance goes.  Between that and the potential security
> issues, we should not put it in core.

+1

Joe

Re: FDW-based dblink (WIP)

From

Itagaki Takahiro

Date:

19 August 2009, 23:36:21

Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I don't believe there is any consensus for integrating dblink into core,
> and I for one will resist that strongly.  Keep it in contrib.

OK, our consensus is that dblink should be replaced with SQL/MED interface
and then we'll start to consider integrating into core.

However, automatic transaction management needs help by core. Is it
acceptable to have two-phase callbacks? Registered callbacks are
called with TWOPHASE_EVENT_PRE_COMMIT when a transaction is about
to be committed or prepared. The argument gxact is NULL if the
transaction is committed without 2PC.

    typedef enum
    {
        TWOPHASE_EVENT_PRE_COMMIT,
        TWOPHASE_EVENT_POST_COMMIT,
        TWOPHASE_EVENT_POST_ABORT,
        TWOPHASE_EVENT_RECOVER,
    } TwoPhaseEvent;

    typedef void (*TwoPhaseEventCallback) (
        TwoPhaseEvent event, GlobalTransaction gxact, void *arg);

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

twophase_callbacks-20090820.patch

Re: FDW-based dblink (WIP)

From

Heikki Linnakangas

Date:

20 August 2009, 03:14:37

Itagaki Takahiro wrote:
> However, automatic transaction management needs help by core. Is it
> acceptable to have two-phase callbacks?

Probably, but it's far too early to decide what the interface should
look like.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: FDW-based dblink (WIP)

From

Peter Eisentraut

Date:

15 September 2009, 07:52:45

On Wed, 2009-08-19 at 17:07 +0900, Itagaki Takahiro wrote:
> Here is a WIP patch for a foreign data wrapper based dblink.
> 
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management. The new interface of dblink is
> exported by include/foreign/dblink.h. We can easily write a connector
> module for another database because we can reuse transaction and
> resource management parts in core.

This patch is listed in the commitfest, but I think the consensus was
that it needed some rework.  I think the idea is that we will have
support for syntax like

> Syntax to create FDW with connector is below:
>     CREATE FOREIGN DATA WRAPPER postgresql
>         VALIDATOR postgresql_fdw_validator
>         CONNECTOR postgresql_fdw_connector
>         OPTIONS (...);

in core, but the actual implementation of postgresql_fdw_connector would
be a loadable module.

Personally, I'm undecided whether the single-function connector
implementation is the best.  The other approach would be to use a
multiple-function interface based directly on the functions currently
provided by dblink.

More generally, what does this really buy us?  It doesn't advance the
SQL/MED implementation, because you are not adding, say, some kind of
CREATE FOREIGN TABLE support.  You are just changing the dblink
implementation to go through the FDW.  I would argue that it should be
the other way around: The FDW should go through dblink.

Re: FDW-based dblink (WIP)

From

Itagaki Takahiro

Date:

16 September 2009, 01:47:17

Peter Eisentraut <peter_e@gmx.net> wrote:

> This patch is listed in the commitfest, but I think the consensus was
> that it needed some rework.

No doubt, but SQL/MED will require a lot of works. Can we split the work
into small parts? I intended FDW-based dblink to be one of the split jobs.

Here are some random considerations:
 * Split dblink to connector and connection management layers.   Present dblink has own name-based connection
managementand error   handling routines, but I think we should share them amoung connectors.

 * CREATE FOREIGN TABLE supports only select query in SQL standard.   I thnk we will still need to have free-style SQL
executorlike dblink   even when we support SQL/MED It is not a waste to include dblink in core.

 * Consider interface of extensible connecter to be able to connect   other DBMSs. Especially, there are many
differencesin usage of 2PC.

 * Automatic 2PC is very useful if we supports non-select query in SQL/MED.   It would be better to have some
infrastructurefor it.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: FDW-based dblink (WIP)

From

Peter Eisentraut

Date:

20 September 2009, 13:49:00

On Wed, 2009-09-16 at 13:47 +0900, Itagaki Takahiro wrote:
> Peter Eisentraut <peter_e@gmx.net> wrote:
> 
> > This patch is listed in the commitfest, but I think the consensus was
> > that it needed some rework.
> 
> No doubt, but SQL/MED will require a lot of works. Can we split the work
> into small parts? I intended FDW-based dblink to be one of the split jobs.

Sure, but I think what you are doing here isn't on anyone's agenda.  I
would imagine that the next step would be to implement foreign tables
using something like dblink's interface as underlying interface.  What
you are proposing doesn't really move us closer to that, or I'm
misunderstanding what you are trying to achieve.  So what is the purpose
of this patch anyway?