Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Transactions involving multiple postgres foreign servers, take 2 |
Date | |
Msg-id | 20201009.145514.78253792462097980.horikyota.ntt@gmail.com |
In response to | RE: Transactions involving multiple postgres foreign servers, take 2 ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>) |
Responses |
RE: Transactions involving multiple postgres foreign servers, take 2
Re: Transactions involving multiple postgres foreign servers, take 2 |
List | pgsql-hackers |
At Fri, 9 Oct 2020 02:33:37 +0000, "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> wrote in
> From: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> > What about temporary network failures? I think there are users who
> > don't want to give up resolving foreign transactions failed due to a
> > temporary network failure. Or even they might want to wait for
> > transaction completion until they send a cancel request. If we want to
> > call the commit routine only once and therefore want FDW to retry
> > connecting the foreign server within the call, it means we require all
> > FDW implementors to write a retry loop code that is interruptible and
> > ensures not to raise an error, which increases difficulty.
> >
> > Yes, but if we don’t retry to resolve foreign transactions at all on
> > an unreliable network environment, the user might end up requiring
> > every transaction to check the status of foreign transactions of the
> > previous distributed transaction before starts. If we allow to do
> > retry, I guess we ease that somewhat.
>
> OK. As I said, I'm not against trying to cope with temporary network failure. I just don't think it's mandatory. If the network failure is really temporary and thus recovers soon, then the resolver will be able to commit the transaction soon, too.

I must be missing something, though... I don't understand why we hate ERRORs from the fdw-2pc-commit routine so much. I think remote commits should be performed before the local commit passes the point of no return, and v26-0002 actually places AtEOXact_FdwXact() before the critical section.

(FWIW, I think remote commits should be performed by backends, not by another process, because backends should wait for all remote commits to end anyway, and it is simpler. If we want to run multiple remote commits in parallel, we could do that by adding some async-waiting interface.)

> Then, we can have a commit retry timeout or retry count like the following WebLogic manual says.
> (I couldn't quickly find the English manual, so below is in Japanese. I quoted some text that got through machine translation, which appears a bit strange.)
>
> https://docs.oracle.com/cd/E92951_01/wls/WLJTA/trxcon.htm
> --------------------------------------------------
> Abandon timeout
> Specifies the maximum time (in seconds) that the transaction manager attempts to complete the second phase of a two-phase commit transaction.
>
> In the second phase of a two-phase commit transaction, the transaction manager attempts to complete the transaction until all resource managers indicate that the transaction is complete. After the abort transaction timer expires, no attempt is made to resolve the transaction. If the transaction enters a ready state before it is destroyed, the transaction manager rolls back the transaction and releases the held lock on behalf of the destroyed transaction.
> --------------------------------------------------

That's not a retry timeout but a timeout for the total time of all second-phase commits. But I think it would be sufficient. Even if an FDW could retry the 2PC commit, that is a matter for that FDW, and the core has nothing to do with it.

> > Also, what if the user sets the statement timeout to 60 sec and they
> > want to cancel the waits after 5 sec by pressing ctl-C? You mentioned
> > that client libraries of other DBMSs don't have asynchronous execution
> > functionality. If the SQL execution function is not interruptible, the
> > user will end up waiting for 60 sec, which seems not good.

I think fdw-2pc-commit can be safely interruptible as long as we run the remote commits before entering the critical section of the local commit.

> FDW functions can be uninterruptible in general, aren't they? We experienced that odbc_fdw didn't allow cancellation of SQL execution.

At least postgres_fdw is interruptible while waiting for the remote.

create view lt as select 1 as slp from (select pg_sleep(10)) t;
create foreign table ft(slp int) server sv1 options (table_name 'lt');
select * from ft;
^CCancel request sent
ERROR: canceling statement due to user request

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center