Re: Transactions involving multiple postgres foreign servers - Mailing list pgsql-hackers
From: Kevin Grittner
Subject: Re: Transactions involving multiple postgres foreign servers
Msg-id: 1355046515.3912543.1420740056866.JavaMail.yahoo@jws10049.mail.ne1.yahoo.com
In response to: Re: Transactions involving multiple postgres foreign servers (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> Andres is talking in my other ear suggesting that we ought to
>>> reuse the 2PC infrastructure to do all this.
>>
>> If you mean that the primary transaction and all FDWs in the
>> transaction must use 2PC, that is what I was saying, although
>> apparently not clearly enough. All nodes *including the local one*
>> must be prepared and committed with data about the nodes saved
>> safely off somewhere that it can be read in the event of a failure
>> of any of the nodes *including the local one*. Without that, I see
>> this whole approach as a train wreck just waiting to happen.
>
> Clearly, all the nodes other than the local one need to use 2PC. I am
> unconvinced that the local node must write a 2PC state file only to
> turn around and remove it again almost immediately thereafter.

The key point is that the distributed transaction data must be
flagged as needing to commit rather than roll back between the
prepare phase and the final commit. If you try to avoid the
PREPARE, flagging, COMMIT PREPARED sequence by building the
flagging of the distributed transaction metadata into the COMMIT
process, you still have the problem of what to do on crash
recovery. You really need to use 2PC to keep that clean, I think.

>> I'm not really clear on the mechanism that is being proposed for
>> doing this, but one way would be to have the PREPARE of the local
>> transaction be requested explicitly and to have that cause all
>> FDWs participating in the transaction to also be prepared. (That
>> might be what Andres meant; I don't know.)
>
> We want this to be client-transparent, so that the client just says
> COMMIT and everything Just Works.

What about the case where one or more nodes don't support 2PC? Do
we silently make the choice, without the client really knowing?
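The PREPARE, flag, COMMIT PREPARED sequence described above can be sketched as a small coordinator. This is only an illustrative simulation, not the patch under discussion: the class and function names (ParticipantNode, commit_distributed, durable_log) are hypothetical, and a real implementation would issue PREPARE TRANSACTION / COMMIT PREPARED over libpq connections to each foreign server.

```python
class ParticipantNode:
    """Stand-in for one server (local or FDW) in the distributed transaction."""
    def __init__(self, name):
        self.name = name
        self.state = "active"   # active -> prepared -> committed/aborted

    def prepare(self, gid):
        # Real server: PREPARE TRANSACTION '<gid>'
        self.state = "prepared"

    def commit_prepared(self, gid):
        # Real server: COMMIT PREPARED '<gid>'
        assert self.state == "prepared"
        self.state = "committed"

    def rollback_prepared(self, gid):
        # Real server: ROLLBACK PREPARED '<gid>'
        self.state = "aborted"


def commit_distributed(gid, nodes, durable_log):
    """PREPARE everywhere, durably flag the decision, then COMMIT PREPARED."""
    try:
        for node in nodes:                  # phase 1: prepare every node
            node.prepare(gid)
    except Exception:
        for node in nodes:                  # failure before the flag: abort all
            if node.state == "prepared":
                node.rollback_prepared(gid)
        raise
    # The commit decision must be flagged durably *between* the two
    # phases; after this point crash recovery must finish the commit,
    # never roll back.
    durable_log.append((gid, "commit", [n.name for n in nodes]))
    for node in nodes:                      # phase 2: commit every node
        node.commit_prepared(gid)


log = []
nodes = [ParticipantNode("local"), ParticipantNode("fdw1"), ParticipantNode("fdw2")]
commit_distributed("txn-42", nodes, log)
```

The point of the simulation is the ordering: the flag record lands in the durable log only after every node, including the local one, has prepared, and before any COMMIT PREPARED is sent.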
>> That doesn't strike me as the only possible mechanism to drive
>> this, but it might well be the simplest and cleanest. The
>> trickiest bit might be to find a good way to persist the
>> distributed transaction information in a way that survives the
>> failure of the main transaction -- or even the abrupt loss of the
>> machine it's running on.
>
> I'd be willing to punt on surviving a loss of the entire machine.
> But I'd like to be able to survive an abrupt reboot.

As long as people are aware that there is an urgent need to find
and fix all data stores to which clusters on the failed machine
were connected via FDW when there is a hard machine failure, I
guess it is OK. In essence we just document it and declare it to be
somebody else's problem. In general I would expect a distributed
transaction manager to behave well in the face of any
single-machine failure, but if there is one aspect of a
full-featured distributed transaction manager we could give up, I
guess that would be it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
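The reboot-survival question above comes down to how recovery resolves in-doubt prepared transactions from the persisted information. A minimal sketch, again with hypothetical names (resolve_in_doubt, the log tuple shape) rather than anything from the actual patch: any gid flagged "commit" in the surviving log is finished with COMMIT PREPARED, and a prepared transaction with no flagged decision can safely be rolled back, because the coordinator must have crashed before the flag was written.

```python
def resolve_in_doubt(durable_log, prepared_on_nodes):
    """Decide the fate of each in-doubt prepared transaction after a crash.

    durable_log: list of (gid, decision) records that survived the crash.
    prepared_on_nodes: {node_name: set of gids still in prepared state}.
    Returns the statements recovery must issue, one per in-doubt gid/node.
    """
    decisions = dict(durable_log)
    actions = []
    for node, gids in sorted(prepared_on_nodes.items()):
        for gid in sorted(gids):
            if decisions.get(gid) == "commit":
                # The flag made it to disk: the commit must be finished.
                actions.append((node, gid, "COMMIT PREPARED"))
            else:
                # No flagged decision: the coordinator crashed before
                # flagging, so no node can have committed; roll back.
                actions.append((node, gid, "ROLLBACK PREPARED"))
    return actions


# Example: the flag for txn-42 was written before the crash, txn-43's was not.
log = [("txn-42", "commit")]
prepared = {"fdw1": {"txn-42"}, "fdw2": {"txn-42", "txn-43"}}
actions = resolve_in_doubt(log, prepared)
```

This is also why losing the whole machine (and the log with it) is the hard case the thread punts on: without the durable decisions, the prepared transactions left on the foreign servers cannot be resolved automatically.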