Thread: FDW-based dblink
Here is a proposal to integrate contrib/dblink and SQL/MED (foreign
data wrappers). Dblink manages connections and transactions by itself
at the moment, but there are benefits to splitting the database
connectors out into FDWs; dblink would then use those multiple
connectors. For example, we would be able to retrieve data from Oracle
into PostgreSQL directly if we had an Oracle connector.

New syntax in SQL
-----------------
    CREATE FOREIGN DATA WRAPPER postgres
        CONNECTOR pg_catalog.dblink_postgres;
or
    CREATE FOREIGN DATA WRAPPER postgres
        OPTIONS (connector 'pg_catalog.dblink_postgres');

We don't have to modify gram.y if we take the latter syntax, but we
need to modify the VALIDATORs to distinguish 'connector' from other
options. The 'connector' option should not be passed as part of the
connection string.

New interface in C
------------------
pg_catalog.dblink_postgres is a function that has the following
prototype:

    Connection *connector(List *defElems);

The argument 'defElems' is the concatenation of the connection options
from the FDW, server, and user mapping.

Two new interfaces will also be introduced:

    interface Connection    /* represents PGconn */
    {
        void    disconnect(self);
        Cursor *open(self, query, fetchsize);  /* for SELECT */
        int64   exec(self, query);             /* for UPDATE, INSERT, DELETE */
        bool    transaction_command(self, type);
    }

    interface Cursor        /* represents PGresult and a server-side cursor */
    {
        bool    fetch(self, OUT values);
        void    close(self);
    }

Their methods are implemented with function pointers. The benefit of
using function pointers is that we only have to export one connector
function in pg_proc. The Cursor interface represents both a result set
and a server-side cursor: PostgreSQL has SQL-level cursors, but some
databases have protocol-level cursors. This abstraction layer is
needed for dblink to handle connectors to other databases.

Other features
--------------
The present dblink is a thin wrapper around libpq, but some of my
customers want automatic transaction management: remote transactions
are committed with 2PC when the local transaction is committed. To
achieve that, I think an on-commit trigger is needed, but it is hard
to implement with the current infrastructure. (That is one of the
reasons I proposed merging dblink into core.)

Other considerations
--------------------
The proposed method is a SQL-based connector. There might be another
approach -- a ScanKey-based connector, similar to the index access
method interface (pg_am), which takes a relation id and scan keys
instead of SQL text. The ScanKey-based approach would work better if
we try to pass WHERE clauses down to an external database. However, I
think we need the SQL-based interface in any case; scan keys would be
converted to SQL and passed to the external database.

I have a prototype of the feature. I'd like to submit it for 8.5.
Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
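To make the function-pointer design concrete, here is a minimal sketch
of what a libpq-based connector could look like. This is not code from
the prototype: the struct layouts and the helper names (PgConnection,
pg_disconnect, build_conninfo, dblink_postgres_connector) are
illustrative assumptions.

    /*
     * Minimal sketch of a libpq-based connector filling in the
     * Connection interface with function pointers.  Struct layouts
     * and helper names are illustrative, not code from the prototype.
     */
    #include "postgres.h"
    #include "nodes/pg_list.h"
    #include "libpq-fe.h"

    typedef struct Connection Connection;
    typedef struct Cursor Cursor;

    struct Connection
    {
        void    (*disconnect) (Connection *self);
        Cursor *(*open) (Connection *self, const char *query, int fetchsize);
        int64   (*exec) (Connection *self, const char *query);
        bool    (*transaction_command) (Connection *self, int type);
    };

    struct Cursor
    {
        bool    (*fetch) (Cursor *self, Datum *values);
        void    (*close) (Cursor *self);
    };

    /* connector-private "subclass" carrying the libpq handle */
    typedef struct PgConnection
    {
        Connection  base;       /* must be first so we can cast back */
        PGconn     *conn;
    } PgConnection;

    static void
    pg_disconnect(Connection *self)
    {
        PgConnection *pgc = (PgConnection *) self;

        PQfinish(pgc->conn);
        pfree(pgc);
    }

    /* the single function exported through pg_proc */
    Connection *
    dblink_postgres_connector(List *defElems)
    {
        PgConnection *pgc = palloc0(sizeof(PgConnection));
        char         *conninfo = build_conninfo(defElems);  /* hypothetical */

        pgc->conn = PQconnectdb(conninfo);
        pgc->base.disconnect = pg_disconnect;
        /* base.open, base.exec, base.transaction_command set likewise */
        return &pgc->base;
    }

The point of the cast-back pattern is that each connector can carry its
own private state (here a PGconn) behind the common Connection
interface, while pg_proc only ever sees the single connector function.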
Itagaki Takahiro wrote:
> The present dblink is a thin wrapper around libpq, but some of my
> customers want automatic transaction management: remote transactions
> are committed with 2PC when the local transaction is committed. To
> achieve that, I think an on-commit trigger is needed, but it is hard
> to implement with the current infrastructure. (That is one of the
> reasons I proposed merging dblink into core.)

Quite aside from the requirement for an on-commit trigger, how exactly
would you use 2PC with the remote database? When would you issue
PREPARE TRANSACTION, and when COMMIT PREPARED? What if the local
database crashes in between - is the remote transaction left hanging
in prepared state?

Making the remote transaction atomic with the local one is a lot
harder than it may seem at first glance. It's doable, but I think
you'll need to build a full-fledged transaction manager into dblink,
or integrate with a 3rd party one.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Itagaki Takahiro wrote:
> Two new interfaces will also be introduced:
>
>     interface Connection    /* represents PGconn */
>     {
>         void    disconnect(self);
>         Cursor *open(self, query, fetchsize);  /* for SELECT */
>         int64   exec(self, query);             /* for UPDATE, INSERT, DELETE */
>         bool    transaction_command(self, type);
>     }

It's not good to return int64 in exec(), because it could have a
RETURNING clause. (So it also needs a fetchsize.)

--
Alvaro Herrera                  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, Aug 13, 2009 at 02:01:19PM +0300, Heikki Linnakangas wrote:
> Itagaki Takahiro wrote:
> > The present dblink is a thin wrapper around libpq, but some of my
> > customers want automatic transaction management: remote transactions
> > are committed with 2PC when the local transaction is committed. To
> > achieve that, I think an on-commit trigger is needed, but it is hard
> > to implement with the current infrastructure. (That is one of the
> > reasons I proposed merging dblink into core.)
>
> Quite aside from the requirement for an on-commit trigger, how exactly
> would you use 2PC with the remote database? When would you issue
> PREPARE TRANSACTION, and when COMMIT PREPARED?

For what it's worth, in DBI-Link, I've allowed some of this by letting
people pass commands like BEGIN, COMMIT and ROLLBACK through to the
remote side. However, it doesn't--can't, as far as I know--implement
the full 2PC.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Alvaro Herrera <alvherre@commandprompt.com> wrote:
> >         int64   exec(self, query);             /* for UPDATE, INSERT, DELETE */
>
> It's not good to return int64 in exec(), because it could have a
> RETURNING clause. (So it also needs a fetchsize.)

We should use open() for a RETURNING query. exec() is just the same as
the present dblink_exec(), which only returns a command tag.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
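For illustration, here is a caller-side sketch of that split, reusing
the Connection/Cursor structs from the sketch above.
has_returning_clause() and MAX_REMOTE_COLS are hypothetical stand-ins,
not existing PostgreSQL symbols.

    /*
     * Illustrative caller: route queries with RETURNING through
     * open() so their rows can be fetched; everything else goes
     * through exec(), which yields only a row count.
     */
    #define MAX_REMOTE_COLS 64      /* arbitrary bound for the sketch */

    /* crude stand-in; a real version would inspect the parse tree */
    static bool
    has_returning_clause(const char *query)
    {
        return strstr(query, "RETURNING") != NULL;
    }

    static void
    run_remote_query(Connection *conn, const char *query, int fetchsize)
    {
        if (has_returning_clause(query))
        {
            Cursor *cur = conn->open(conn, query, fetchsize);
            Datum   values[MAX_REMOTE_COLS];

            while (cur->fetch(cur, values))
            {
                /* consume returned rows */
            }
            cur->close(cur);
        }
        else
        {
            int64   rows = conn->exec(conn, query);

            elog(DEBUG1, "remote command affected " INT64_FORMAT " rows",
                 rows);
        }
    }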
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Quite aside from the requirement for an on-commit trigger, how exactly
> would you use 2PC with the remote database? When would you issue
> PREPARE TRANSACTION, and when COMMIT PREPARED? What if the local
> database crashes in between - is the remote transaction left hanging
> in prepared state?

I'm thinking of preparing the remote transactions just before
committing the local transaction in CommitTransaction(). The pseudo
code is something like:

    1. Fire deferred triggers and do the just-before-commit work.
    2. AtEOXact_dblink()
       => prepare and commit remote transactions.
    3. HOLD_INTERRUPTS()
       We cannot roll back the local transaction after this.
    4. Do the work for commit.

If we need more robust atomicity, we could use 2PC for the local
transaction too when there are remote transactions, i.e., expand the
COMMIT command into PREPARE TRANSACTION and COMMIT PREPARED
internally:

    1. Fire deferred triggers and do the just-before-commit work.
    2. AtEOXact_dblink_prepare()         -- prepare remotes
    3. PrepareTransaction()              -- prepare local
    4. AtEOXact_dblink_commit()          -- commit remotes
    5. FinishPreparedTransaction(commit) -- commit local

I'm using a deferrable after trigger for this purpose in my present
prototype, and it seems to work as long as the trigger is called at
the end of the deferrable event and the local backend doesn't crash
during the final commit work -- and we already have some must-not-fail
operations in that final work (flushing WAL, etc.).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
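A sketch of what the two dblink hooks in the second sequence might
look like. remote_connections, XactCmdType, and the error handling are
illustrative assumptions about the prototype, not its actual code.

    /*
     * Sketch of the end-of-transaction hooks.  remote_connections is
     * assumed to track the Connection objects used in this
     * transaction; the command codes are hypothetical.
     */
    typedef enum XactCmdType
    {
        XACT_CMD_PREPARE,
        XACT_CMD_COMMIT_PREPARED
    } XactCmdType;

    static List *remote_connections = NIL;  /* Connections used this xact */

    void
    AtEOXact_dblink_prepare(void)
    {
        ListCell   *lc;

        foreach(lc, remote_connections)
        {
            Connection *conn = (Connection *) lfirst(lc);

            /* e.g. issue PREPARE TRANSACTION '<gid>' on the remote */
            if (!conn->transaction_command(conn, XACT_CMD_PREPARE))
                ereport(ERROR,
                        (errmsg("could not prepare remote transaction")));
        }
    }

    void
    AtEOXact_dblink_commit(void)
    {
        ListCell   *lc;

        foreach(lc, remote_connections)
        {
            Connection *conn = (Connection *) lfirst(lc);

            /* e.g. issue COMMIT PREPARED '<gid>' on the remote */
            (void) conn->transaction_command(conn, XACT_CMD_COMMIT_PREPARED);
        }
        remote_connections = NIL;
    }

Note the asymmetry: an error while preparing can still abort the whole
transaction, but a failure after the local commit point cannot be
rolled back -- which is exactly the window the next reply is about.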
2009/8/14 Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> Quite aside from the requirement for an on-commit trigger, how exactly
>> would you use 2PC with the remote database? When would you issue
>> PREPARE TRANSACTION, and when COMMIT PREPARED? What if the local
>> database crashes in between - is the remote transaction left hanging
>> in prepared state?
>
> I'm thinking of preparing the remote transactions just before
> committing the local transaction in CommitTransaction(). The pseudo
> code is something like:
>
>     1. Fire deferred triggers and do the just-before-commit work.
>     2. AtEOXact_dblink()
>        => prepare and commit remote transactions.
>     3. HOLD_INTERRUPTS()
>        We cannot roll back the local transaction after this.
>     4. Do the work for commit.
>
> If we need more robust atomicity, we could use 2PC for the local
> transaction too when there are remote transactions, i.e., expand the
> COMMIT command into PREPARE TRANSACTION and COMMIT PREPARED
> internally:
>
>     1. Fire deferred triggers and do the just-before-commit work.
>     2. AtEOXact_dblink_prepare()         -- prepare remotes
>     3. PrepareTransaction()              -- prepare local
>     4. AtEOXact_dblink_commit()          -- commit remotes
>     5. FinishPreparedTransaction(commit) -- commit local
>
> I'm using a deferrable after trigger for this purpose in my present
> prototype, and it seems to work as long as the trigger is called at
> the end of the deferrable event and the local backend doesn't crash
> during the final commit work -- and we already have some must-not-fail
> operations in that final work (flushing WAL, etc.).
You're completely missing the point. You need to be prepared for a crash at any point in the sequence, and still recover into a coherent state where all the local and remote transactions are either committed or rolled back. Without some kind of a recovery system, you can end up in a situation where some transactions are already committed and others rolled back. 2PC makes it possible to write such a recovery system, but using 2PC alone isn't enough to guarantee atomicity. In fact, by using 2PC without a recovery system you can end up with a transaction that's prepared but never committed or aborted, requiring an admin to remove it manually, which is even worse than not using 2PC to begin with.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> In fact, by using 2PC without a recovery system you can end up with a
> transaction that's prepared but never committed or aborted, requiring
> an admin to remove it manually, which is even worse than not using
> 2PC to begin with.

Yes, true. But I think "hard to use" is better than "unable to do".
If needed, we can develop a recovery system that works together with
dblink, no?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
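To make the recovery requirement concrete, such a system would at
minimum have to resolve in-doubt remote transactions after a crash:
scan each remote server for prepared transactions carrying dblink's
GIDs and finish them according to the local transaction's outcome. The
sketch below assumes a GID scheme of "dblink_<local_xid>";
xid_from_gid() and local_xact_committed() (a lookup against the local
commit log) are hypothetical helpers, while pg_prepared_xacts is the
real system view listing prepared transactions.

    /*
     * Illustrative in-doubt resolver, run against each remote server
     * after a crash.  GID scheme and helpers are assumptions.
     */
    extern bool local_xact_committed(TransactionId xid);   /* hypothetical */

    static TransactionId
    xid_from_gid(const char *gid)
    {
        /* hypothetical: the local xid is embedded after the prefix */
        return (TransactionId) strtoul(gid + strlen("dblink_"), NULL, 10);
    }

    static void
    resolve_in_doubt(Connection *conn)
    {
        Cursor *cur;
        Datum   values[1];

        /* find prepared transactions we left behind on this remote */
        cur = conn->open(conn,
                         "SELECT gid FROM pg_prepared_xacts "
                         "WHERE gid LIKE 'dblink_%'", 100);

        while (cur->fetch(cur, values))
        {
            const char *gid = DatumGetCString(values[0]);
            char        cmd[128];

            if (local_xact_committed(xid_from_gid(gid)))
                snprintf(cmd, sizeof(cmd), "COMMIT PREPARED '%s'", gid);
            else
                snprintf(cmd, sizeof(cmd), "ROLLBACK PREPARED '%s'", gid);

            (void) conn->exec(conn, cmd);
        }
        cur->close(cur);
    }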