Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
From: Masahiko Sawada
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date:
Msg-id: CA+fd4k7WqN1zYeazdCbuTi_jaCzdrsTiNPZRssvS-T86-bdygg@mail.gmail.com
In response to: Re: Transactions involving multiple postgres foreign servers, take 2 (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses: Re: Transactions involving multiple postgres foreign servers, take 2
List: pgsql-hackers

On Fri, 9 Oct 2020 at 14:55, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Fri, 9 Oct 2020 02:33:37 +0000, "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> wrote in
> > From: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> > > What about temporary network failures? I think there are users who
> > > don't want to give up resolving foreign transactions failed due to a
> > > temporary network failure. Or even they might want to wait for
> > > transaction completion until they send a cancel request. If we want to
> > > call the commit routine only once and therefore want FDW to retry
> > > connecting the foreign server within the call, it means we require all
> > > FDW implementors to write a retry loop code that is interruptible and
> > > ensures not to raise an error, which increases difficulty.
> > >
> > > Yes, but if we don't retry to resolve foreign transactions at all on
> > > an unreliable network environment, the user might end up requiring
> > > every transaction to check the status of foreign transactions of the
> > > previous distributed transaction before starts. If we allow to do
> > > retry, I guess we ease that somewhat.
> >
> > OK. As I said, I'm not against trying to cope with temporary network
> > failure. I just don't think it's mandatory. If the network failure is
> > really temporary and thus recovers soon, then the resolver will be able
> > to commit the transaction soon, too.
>
> I should missing something, though...
>
> I don't understand why we hate ERRORs from fdw-2pc-commit routine so
> much. I think remote-commits should be performed before local commit
> passes the point-of-no-return and the v26-0002 actually places
> AtEOXact_FdwXact() before the critical section.
>

So you're thinking the following sequence?

1. Prepare all foreign transactions.
2. Commit all the prepared foreign transactions.
3. Commit the local transaction.

Suppose we have the backend process call the commit routine: what if one
of the FDWs raises an ERROR while committing its foreign transaction,
after other foreign transactions have already been committed? The
transaction will end up aborting, but some foreign transactions are
already committed. Also, what if the backend process fails to commit the
local transaction? Since it has already committed all foreign
transactions, it cannot ensure global atomicity in this case either.

Therefore, I think we should commit distributed transactions in the
following sequence:

1. Prepare all foreign transactions.
2. Commit the local transaction.
3. Commit all the prepared foreign transactions.

But this is still not a perfect solution. If we have the backend process
call the commit routine and an error happens while executing an FDW's
commit routine (i.e., at step 3), it's too late to report an error to the
client because we have already committed the local transaction. So the
current solution is to have a background process commit the foreign
transactions, so that the backend can just wait without the possibility
of errors.

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
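
For illustration, the second sequence corresponds roughly to the following
statements at the SQL level. This is only a sketch: the 'fx_...' global
transaction identifiers are invented here, and in the patch the statements
would be issued through the FDW transaction callbacks rather than by hand.

    -- Step 1: prepare the foreign transaction on each participating server
    PREPARE TRANSACTION 'fx_1_srv1';   -- run on foreign server 1
    PREPARE TRANSACTION 'fx_1_srv2';   -- run on foreign server 2

    -- Step 2: commit the local transaction (the point of no return)
    COMMIT;                            -- run on the local server

    -- Step 3: resolve the prepared foreign transactions; if this fails it
    -- can be retried later (e.g. by the resolver process), never rolled back
    COMMIT PREPARED 'fx_1_srv1';       -- run on foreign server 1
    COMMIT PREPARED 'fx_1_srv2';       -- run on foreign server 2

A prepared transaction survives a connection loss or server crash and can
still be committed afterwards, which is what makes step 2 a safe point of
no return.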