Thread: FDW-based dblink

FDW-based dblink

From
Itagaki Takahiro
Date:
Here is a proposal to integrate contrib/dblink and SQL/MED (foreign
data wrapper).

Dblink currently manages connections and transactions by itself, but
there are some benefits to splitting the database connectors out into
FDWs. Dblink would then use those multiple connectors. For example, we
would be able to retrieve data from Oracle into PostgreSQL directly if
we had an Oracle connector.
New syntax in SQL
-------------------
CREATE FOREIGN DATA WRAPPER postgres   CONNECTOR pg_catalog.dblink_postgres;
or
CREATE FOREIGN DATA WRAPPER postgres   OPTIONS (connector 'pg_catalog.dblink_postgres')

We don't have to modify gram.y if we take the latter syntax, but we
need to modify the VALIDATORs to distinguish 'connector' from the other
options. The 'connector' option should not be passed as part of the
connection string.
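A validator along these lines could filter 'connector' out before the remaining options are assembled into a connection string. This is only a sketch; is_connection_option is an invented helper, not part of the prototype:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch: decide which FDW options may be forwarded to the
 * remote side as connection-string entries.  The 'connector' option
 * names the C entry point, so it must never be forwarded. */
static bool
is_connection_option(const char *optname)
{
    return strcmp(optname, "connector") != 0;
}
```

A real validator would iterate over the option list and raise an error for unknown options, but the filtering decision is the part that matters here.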
New interface in C
--------------------
pg_catalog.dblink_postgres is a function that has the following prototype:
   Connection *connector(List *defElems);

The argument 'defElems' is the concatenation of the connection options
from the FDW, server, and user-mapping.

Also, two new interfaces will be introduced:

   interface Connection    /* represents PGconn */
   {
       void    disconnect(self);
       Cursor *open(self, query, fetchsize); /* for SELECT */
       int64   exec(self, query);            /* for UPDATE, INSERT, DELETE */
       bool    transaction_command(self, type);
   }

   interface Cursor    /* represents PGresult and server-side cursor */
   {
       bool fetch(self, OUT values);
       void close(self);
   }

They have some methods implemented with function pointers. The benefit
of using function pointers is that we only have to export one connector
function to pg_proc.
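In C, the two interfaces could be laid out as structs of function pointers, roughly like the sketch below. The member layout, the argument types, and the stub connector are all illustrative, not the prototype's actual definitions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: each interface is a struct of function pointers, so only the
 * single connector function needs a pg_proc entry. */
typedef struct Cursor Cursor;
typedef struct Connection Connection;

struct Cursor
{
    bool  (*fetch)(Cursor *self, char **values);   /* OUT values */
    void  (*close)(Cursor *self);
};

struct Connection
{
    void    (*disconnect)(Connection *self);
    Cursor *(*open)(Connection *self, const char *query, int fetchsize);
    int64_t (*exec)(Connection *self, const char *query);
    bool    (*transaction_command)(Connection *self, int type);
};

/* A do-nothing stub showing the shape of a connector's result; the real
 * connector would parse defElems (a List * in the backend) and connect. */
static void
stub_disconnect(Connection *self)
{
    (void) self;
}

static Connection stub_conn = { stub_disconnect, NULL, NULL, NULL };

static Connection *
stub_connector(void *defElems)
{
    (void) defElems;
    return &stub_conn;
}
```

The caller never sees the connector's internals; it only dereferences the function pointers on the returned Connection.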

The Cursor interface represents both a result set and a server-side
cursor. PostgreSQL has SQL-level cursors, but some databases have
protocol-level cursors. This abstraction layer is needed for dblink to
handle connectors to other databases.
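To illustrate why the abstraction helps: the consumer loop is identical no matter how the remote cursor is implemented. Here is a toy cursor backed by an in-memory array; ArrayCursor and drain are invented for this sketch and are not part of the proposal:

```c
#include <stdbool.h>

/* Minimal Cursor interface plus a toy implementation over an array.
 * dblink would run the same drain() loop against any connector,
 * whether the remote cursor is SQL-level or protocol-level. */
typedef struct Cursor Cursor;
struct Cursor
{
    bool (*fetch)(Cursor *self, const char **values);
    void (*close)(Cursor *self);
};

typedef struct
{
    Cursor       base;
    const char **rows;
    int          pos;
    int          nrows;
} ArrayCursor;

static bool
array_fetch(Cursor *self, const char **values)
{
    ArrayCursor *ac = (ArrayCursor *) self;

    if (ac->pos >= ac->nrows)
        return false;
    *values = ac->rows[ac->pos++];
    return true;
}

static void
array_close(Cursor *self)
{
    ((ArrayCursor *) self)->pos = 0;
}

/* Consume all rows the way a caller of the Cursor interface would. */
static int
drain(Cursor *cur)
{
    const char *row;
    int         n = 0;

    while (cur->fetch(cur, &row))
        n++;                    /* hand one row to the local executor */
    cur->close(cur);
    return n;
}
```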
Other features
----------------
The present dblink is a thin wrapper around libpq, but some of my
customers want automatic transaction management: remote transactions
are committed with 2PC when the local transaction is committed.
To achieve that, I think we need an on-commit trigger,
but it is hard to implement with the current infrastructure.
(That is one of the reasons I proposed to merge dblink into core.)
Other considerations
----------------------
The proposed method is a SQL-based connector. There might be another
approach -- a ScanKey-based connector. It is similar to the index access
method interface (pg_am): it takes a relation id and scankeys instead of
SQL text. The ScanKey-based approach would work better if we try to push
a WHERE clause down to an external database. However, I think we need a
SQL-based interface in any case; ScanKeys would be converted to SQL and
passed to the external database.
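As a rough illustration of the ScanKey-to-SQL conversion: the key shape below (SimpleKey) and keys_to_where are stand-ins invented for this sketch; the backend's real ScanKeyData carries operator OIDs and Datums and is considerably more involved:

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for a scankey: a column, a comparison operator, and an
 * already-quoted literal value. */
typedef struct
{
    const char *column;
    const char *op;             /* e.g. "=", "<" */
    const char *value;          /* already-quoted SQL literal */
} SimpleKey;

/* Render the keys back into a WHERE clause to ship to the remote
 * database. */
static void
keys_to_where(const SimpleKey *keys, int nkeys, char *buf, size_t buflen)
{
    size_t      off = 0;

    buf[0] = '\0';
    for (int i = 0; i < nkeys; i++)
        off += snprintf(buf + off, buflen - off, "%s%s %s %s",
                        i == 0 ? "WHERE " : " AND ",
                        keys[i].column, keys[i].op, keys[i].value);
}
```

A production version would also need identifier quoting and proper literal escaping for the remote database's dialect.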


I have a prototype of the feature. I'd like to submit it for 8.5.
Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: FDW-based dblink

From
Heikki Linnakangas
Date:
Itagaki Takahiro wrote:
> Present dblink is a thin wrapper of libpq, but some of my customers
> want automatic transaction managements. Remote transactions are
> committed with 2PC when the local transaction is committed.
> To achieve it, I think we need on-commit trigger is needed,
> but it is hard to implement with current infrastructure.
> (That is one of the reason I proposed to merge dblink into core.)

Quite aside from the requirement for on-commit trigger, how exactly
would you use 2PC with the remote database? When would you issue PREPARE
TRANSACTION, and when would COMMIT PREPARED? What if the local database
crashes in between - is the remote transaction left hanging in prepared
state?

Making the remote transaction atomic with the local one is a lot harder
than it may seem at first glance. It's doable, but I think you'll need
to build a full-fledged transaction manager into dblink, or integrate
with a 3rd-party one.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: FDW-based dblink

From
Alvaro Herrera
Date:
Itagaki Takahiro wrote:

> Also new two interfaces will be introduced:
> 
>     interface Connection    /* represents PGconn */
>     {
>         void    disconnect(self);
>         Cursor *open(self, query, fetchsize); /* for SELECT */
>         int64   exec(self, query);            /* for UPDATE, INSERT, DELETE */
>         bool    transaction_command(self, type);
>     }

It's not good to return int64 in exec(), because it could have a
RETURNING clause.  (So it also needs a fetchsize).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: FDW-based dblink

From
David Fetter
Date:
On Thu, Aug 13, 2009 at 02:01:19PM +0300, Heikki Linnakangas wrote:
> Itagaki Takahiro wrote:
> > Present dblink is a thin wrapper of libpq, but some of my customers
> > want automatic transaction managements. Remote transactions are
> > committed with 2PC when the local transaction is committed.
> > To achieve it, I think we need on-commit trigger is needed,
> > but it is hard to implement with current infrastructure.
> > (That is one of the reason I proposed to merge dblink into core.)
> 
> Quite aside from the requirement for on-commit trigger, how exactly
> would you use 2PC with the remote database? When would you issue PREPARE
> TRANSACTION, and when would COMMIT PREPARED?

For what it's worth, in DBI-Link, I've allowed some of this by letting
people pass commands like BEGIN, COMMIT and ROLLBACK through to the
remote side.  However, it doesn't--can't, as far as I know--implement
the full 2PC.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: FDW-based dblink

From
Itagaki Takahiro
Date:
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> >         int64   exec(self, query);            /* for UPDATE, INSERT, DELETE */
> 
> It's not good to return int64 in exec(), because it could have a
> RETURNING clause.  (So it also needs a fetchsize).

We should use open() for a RETURNING query.
That is just the same as the present dblink_exec(), which only returns
a command tag.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: FDW-based dblink

From
Itagaki Takahiro
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Quite aside from the requirement for on-commit trigger, how exactly
> would you use 2PC with the remote database? When would you issue PREPARE
> TRANSACTION, and when would COMMIT PREPARED? What if the local database
> crashes in between - is the remote transaction left hanging in prepared
> state?

I'm thinking of preparing the remote transactions just before committing
the local transaction in CommitTransaction(). The pseudo code is
something like:

   1. Fire deferred triggers and do works for just-before-commit.
   2. AtEOXact_dblink()
       => prepare and commit remote transactions.
   3. HOLD_INTERRUPTS()
       We cannot rollback the local transaction after this.
   4. do works for commit

If we need more robust atomicity, we could use 2PC for the local
transaction too, if there are any remote transactions; i.e., expand the
COMMIT command into PREPARE TRANSACTION and COMMIT PREPARED internally:

   1. Fire deferred triggers and do works for just-before-commit.
   2. AtEOXact_dblink_prepare()            -- prepare remotes
   3. PrepareTransaction()                 -- prepare local
   4. AtEOXact_dblink_commit()             -- commit remotes
   5. FinishPreparedTransaction(commit)    -- commit local

I'm using a deferrable after trigger for this purpose in my present
prototype, and it seems to work as long as the trigger is called at the
end of the deferrable event and the local backend doesn't crash in the
final commit work -- and we already have some must-not-fail operations
in that final work (flushing WAL, etc.).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: FDW-based dblink

From
Heikki Linnakangas
Date:
2009/8/14 Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>

> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>
> > Quite aside from the requirement for on-commit trigger, how exactly
> > would you use 2PC with the remote database? When would you issue PREPARE
> > TRANSACTION, and when would COMMIT PREPARED? What if the local database
> > crashes in between - is the remote transaction left hanging in prepared
> > state?
>
> I'm thinking of preparing the remote transactions just before committing
> the local transaction in CommitTransaction(). The pseudo code is
> something like:
>
>    1. Fire deferred triggers and do works for just-before-commit.
>    2. AtEOXact_dblink()
>        => prepare and commit remote transactions.
>    3. HOLD_INTERRUPTS()
>        We cannot rollback the local transaction after this.
>    4. do works for commit
>
> If we need more robust atomicity, we could use 2PC for the local
> transaction too, if there are any remote transactions; i.e., expand the
> COMMIT command into PREPARE TRANSACTION and COMMIT PREPARED internally:
>
>    1. Fire deferred triggers and do works for just-before-commit.
>    2. AtEOXact_dblink_prepare()            -- prepare remotes
>    3. PrepareTransaction()                 -- prepare local
>    4. AtEOXact_dblink_commit()             -- commit remotes
>    5. FinishPreparedTransaction(commit)    -- commit local
>
> I'm using a deferrable after trigger for this purpose in my present
> prototype, and it seems to work as long as the trigger is called at the
> end of the deferrable event and the local backend doesn't crash in the
> final commit work -- and we already have some must-not-fail operations
> in that final work (flushing WAL, etc.).

You're completely missing the point. You need to be prepared for a crash
at any point in the sequence, and still recover into a coherent state
where all the local and remote transactions are either committed or
rolled back. Without some kind of a recovery system, you can end up in a
situation where some transactions are already committed and others
rolled back.

2PC makes it possible to write such a recovery system, but using 2PC
alone isn't enough to guarantee atomicity. In fact, by using 2PC without
a recovery system you can end up with a transaction that's prepared but
never committed or aborted, requiring an admin to remove it manually,
which is even worse than not using 2PC to begin with.

--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com

Re: FDW-based dblink

From
Itagaki Takahiro
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> In fact, by using 2PC without a recovery system you
> can end up with a transaction that's prepared but never committed or
> aborted, requiring an admin to remove it manually, which is even worse than
> not using 2PC to begin with.

Yes, true. But I think "hard-to-use" is better than "unable-to-do".
If needed, we can develop some recovery system that works together
with dblink, no?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center