Re: FDW-based dblink - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: FDW-based dblink
Date
Msg-id 7bbd365e0908132340r4ad2db43t7bd816e46542e644@mail.gmail.com
In response to Re: FDW-based dblink  (Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: FDW-based dblink  (Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers
2009/8/14 Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Quite aside from the requirement for on-commit trigger, how exactly
> would you use 2PC with the remote database? When would you issue PREPARE
> TRANSACTION, and when would COMMIT PREPARED? What if the local database
> crashes in between - is the remote transaction left hanging in prepared
> state?

I'm thinking of preparing remote transactions just before committing the local
transaction in CommitTransaction(). The pseudo code is something like:

   1. Fire deferred triggers and do the just-before-commit work.
   2. AtEOXact_dblink()
       => prepare and commit remote transactions.
   3. HOLD_INTERRUPTS()
       We cannot roll back the local transaction after this.
   4. Do the commit work.

If we need more robust atomicity, we could use 2PC against the local
transaction if there are some remote transactions, i.e., expand the COMMIT
command into PREPARE TRANSACTION and COMMIT PREPARED internally:

   1. Fire deferred triggers and do the just-before-commit work.
   2. AtEOXact_dblink_prepare()            -- prepare remotes
   3. PrepareTransaction()                 -- prepare local
   4. AtEOXact_dblink_commit()             -- commit remotes
   5. FinishPreparedTransaction(commit)    -- commit local
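
To make the crash windows in the five steps above concrete, here is a minimal
Python simulation. It is only a sketch, not PostgreSQL code: the names
run_commit, State and Crash are hypothetical, two remotes are assumed, and a
crash is modelled simply as "anything not yet prepared is aborted, anything
prepared or committed survives". Crashes are only injected between steps, not
between the individual remotes inside one step.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Crash(Exception):
    """Simulated crash of the local backend (hypothetical, for illustration)."""


def run_commit(crash_after_step=None):
    """Run the five-step sequence, optionally crashing after one step.

    Returns the state of every transaction as it would be found after
    the crash, before any recovery has been attempted.
    """
    txns = {"local": State.ACTIVE, "remote1": State.ACTIVE, "remote2": State.ACTIVE}
    remotes = ("remote1", "remote2")

    def set_state(names, state):
        for name in names:
            txns[name] = state

    steps = [
        lambda: None,                                   # 1. deferred triggers
        lambda: set_state(remotes, State.PREPARED),     # 2. prepare remotes
        lambda: set_state(["local"], State.PREPARED),   # 3. prepare local
        lambda: set_state(remotes, State.COMMITTED),    # 4. commit remotes
        lambda: set_state(["local"], State.COMMITTED),  # 5. commit local
    ]
    try:
        for number, step in enumerate(steps, start=1):
            step()
            if number == crash_after_step:
                raise Crash()
    except Crash:
        # Unprepared work is lost in a crash; prepared and committed
        # transactions are durable and survive it.
        for name, state in txns.items():
            if state is State.ACTIVE:
                txns[name] = State.ABORTED
    return txns


if __name__ == "__main__":
    for step in range(1, 6):
        outcome = run_commit(crash_after_step=step)
        print(step, {name: state.value for name, state in outcome.items()})
```

In this model a crash after step 2 leaves both remotes prepared while the local
transaction is aborted, and a crash after step 4 leaves the remotes committed
while the local transaction is still only prepared. Neither state resolves
itself without something acting on it afterwards.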

I'm using a deferrable after trigger for this purpose in my present
prototype, and it seems to work if the trigger is called at the
end of the deferred events and the local backend doesn't crash during the
final commit work -- and we already have some must-not-fail operations
in the final work (flushing WAL, etc.).

You're completely missing the point. You need to be prepared for a crash at any point in the sequence, and still recover into a coherent state where all the local and remote transactions are either committed or rolled back. Without some kind of a recovery system, you can end up in a situation where some transactions are already committed and others rolled back. 2PC makes it possible to write such a recovery system, but using 2PC alone isn't enough to guarantee atomicity. In fact, by using 2PC without a recovery system you can end up with a transaction that's prepared but never committed or aborted, requiring an admin to remove it manually, which is even worse than not using 2PC to begin with.
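
As a rough illustration of what such a recovery system has to do, here is a
minimal Python sketch. It assumes a presumed-abort rule that is not spelled out
in this thread: the coordinator durably records a "commit" decision for a global
transaction id before telling any participant to commit, so a prepared
transaction with no recorded decision can safely be rolled back. The function
recover, the decision_log dict and the participants dict are all hypothetical
stand-ins. In PostgreSQL the prepared transactions would be found in
pg_prepared_xacts and resolved with COMMIT PREPARED or ROLLBACK PREPARED.

```python
def recover(decision_log, participants):
    """Resolve in-doubt transactions after a crash (presumed abort).

    decision_log: maps a global transaction id to "commit" for every
        transaction whose commit decision was durably recorded.
    participants: maps a node name to {gid: state}, where state is
        "prepared", "committed" or "aborted".
    """
    for txns in participants.values():
        for gid, state in txns.items():
            if state != "prepared":
                continue  # already resolved, nothing to do
            if decision_log.get(gid) == "commit":
                txns[gid] = "committed"   # finish the recorded commit
            else:
                txns[gid] = "aborted"     # no decision recorded: roll back
    return participants


if __name__ == "__main__":
    log = {"gx1": "commit"}  # gx2 crashed before a decision was recorded
    nodes = {
        "local": {"gx1": "prepared", "gx2": "aborted"},
        "remote1": {"gx1": "committed", "gx2": "prepared"},
        "remote2": {"gx1": "prepared", "gx2": "prepared"},
    }
    print(recover(log, nodes))
```

Running recovery again on the result changes nothing, which matters because the
recovery process itself can crash and be restarted.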

--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com
