Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
| From | Masahiko Sawada |
| --- | --- |
| Subject | Re: Transactions involving multiple postgres foreign servers, take 2 |
| Date | |
| Msg-id | CA+fd4k7VD7ueNQJ5UBcH4B3YqCm4GeLkJDat2gXb=NYrpKJVkg@mail.gmail.com |
| In response to | RE: Transactions involving multiple postgres foreign servers, take 2 ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>) |
| Responses | Re: Transactions involving multiple postgres foreign servers, take 2; RE: Transactions involving multiple postgres foreign servers, take 2 |
| List | pgsql-hackers |
On Fri, 17 Jul 2020 at 14:22, tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> > I have briefly checked only oracle_fdw, but in general I think that
> > if an existing FDW supports transaction begin, commit, and rollback,
> > these can be ported to the new FDW transaction APIs easily.
>
> Does oracle_fdw support begin, commit and rollback?
>
> And most importantly, do other major DBMSs, including Oracle, provide the
> API for preparing a transaction? In other words, will the FDWs other than
> postgres_fdw really be able to take advantage of the new FDW functions to
> join the 2PC processing? I think we need to confirm that there are
> concrete examples.

I also believe they do. But I'm concerned that some FDWs need to start a
transaction differently when using 2PC. For instance, IIUC MySQL also
supports 2PC, but a transaction must be started with "XA START id" when it
is going to be prepared. A transaction started with XA START is closed by
XA END, followed by XA PREPARE or XA COMMIT ONE PHASE. This means that when
starting a new transaction, the FDW already needs a transaction identifier
and needs to know that 2PC might be used. That is quite different from
PostgreSQL, where we can start a transaction with BEGIN and end it with
PREPARE TRANSACTION, COMMIT, or ROLLBACK; the transaction identifier is
required only at PREPARE TRANSACTION.

With MySQL, I guess the FDW needs a way to be told that the (next)
transaction has to be started with XA START so that it can be prepared. It
could be a custom GUC or an SQL function. Then, when starting a new
transaction on the MySQL server, the FDW can generate a transaction
identifier and store it somewhere alongside the connection. At the prepare
phase, it passes that identifier to the core via the GetPrepareId() API. I
haven't tested the above yet; it's just a desk plan.

It's definitely a good idea to try integrating this 2PC feature into FDWs
other than postgres_fdw, to see whether the design and interfaces are
sufficiently well thought out.
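For illustration, here is a rough sketch of the two command sequences side
by side; the table and the transaction identifier 'fx_1' are made-up
examples, not something defined by the patch:

```sql
-- PostgreSQL: the identifier is needed only at PREPARE TRANSACTION,
-- so the same BEGIN works whether or not 2PC ends up being used.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'fx_1';   -- or plain COMMIT / ROLLBACK instead
COMMIT PREPARED 'fx_1';

-- MySQL: the identifier must be chosen before any work is done.
XA START 'fx_1';
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
XA END 'fx_1';
XA PREPARE 'fx_1';            -- or XA COMMIT 'fx_1' ONE PHASE
XA COMMIT 'fx_1';
```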
> What I'm worried about is that if only postgres_fdw can implement the
> prepare function, it's a sign that the FDW interface will be riddled with
> functions only for Postgres. That is, the FDW interface is getting away
> from its original purpose, "access external data as a relation", and
> becoming complex. Tomas Vondra raised this concern as follows:
>
> Horizontal scalability/sharding
> https://www.postgresql.org/message-id/flat/CANP8%2BjK%3D%2B3zVYDFY0oMAQKQVJ%2BqReDHr1UPdyFEELO82yVfb9A%40mail.gmail.com#2c45f0ee97855449f1f7fedcef1d5e11
>
> [Tomas Vondra's remarks]
> --------------------------------------------------
> > This strikes me as a bit of a conflict of interest with FDW which
> > seems to want to hide the fact that it's foreign; the FDW
> > implementation makes its own optimization decisions which might
> > make sense for single table queries but breaks down in the face of
> > joins.
>
> +1 to these concerns
>
> In my mind, FDW is a wonderful tool to integrate PostgreSQL with
> external data sources, and it's nicely shaped for this purpose, which
> implies the abstractions and assumptions in the code.
>
> The truth however is that many current uses of the FDW API are actually
> using it for different purposes because there's no other way to do that,
> not because FDWs are the "right way". And this includes the attempts to
> build sharding on FDW, I think.
>
> Situations like this result in "improvements" of the API that seem to
> improve the API for the second group, but make the life harder for the
> original FDW API audience by making the API needlessly complex. And I
> say "seem to improve" because the second group eventually runs into the
> fundamental abstractions and assumptions the API is based on anyway.
>
> And based on the discussions at pgcon, I think this is the main reason
> why people cringe when they hear "FDW" and "sharding" in the same
> sentence.
>
> ...
>
> My other worry is that we'll eventually mess up the FDW infrastructure,
> making it harder to use for the original purpose. Granted, most of the
> improvements proposed so far look sane and useful for FDWs in general,
> but sooner or later that ceases to be the case - there will be changes
> needed merely for the sharding. Those will be tough decisions.
> --------------------------------------------------
>
> > Regarding the comparison between FDW transaction APIs and transaction
> > callbacks, I think one of the benefits of providing FDW transaction
> > APIs is that the core is able to manage the status of foreign
> > transactions. We need to track the status of individual foreign
> > transactions to support atomic commit. If we use the transaction
> > callbacks (XactCallback) that many FDWs are using, I think we will end
> > up calling the transaction callback and leaving the transaction work to
> > the FDWs, so that the core is not able to know the return value of
> > PREPARE TRANSACTION, for example. We could add more arguments to the
> > transaction callbacks to get the return value from the FDWs, but I
> > don't think that's a good idea, as transaction callbacks are used not
> > only by FDWs but also by other external modules.
>
> To track the foreign transaction status, we can add GetTransactionStatus()
> to the FDW interface as an alternative, can't we?

I haven't thought about such an interface, but it sounds like the
transaction status would then be managed by both the core and the FDWs.
Could you elaborate on that?

> > With the current version of the patch (v23), it supports only
> > INSERT/DELETE/UPDATE. But I'm going to change the patch so that it
> > supports other write SQL, as Fujii-san also pointed out.
>
> OK. I've just read that Fujii-san already pointed out a similar thing.
> But I wonder if we can know that a UDF executed on the foreign server has
> updated data. Maybe we can know or guess it by calling
> txid_current_if_any() or checking the transaction status in the FE/BE
> protocol, but can we deal with FDWs other than postgres_fdw?

Ah, my answer was not enough. It was only about tracking local writes.
Regarding tracking of writes on the foreign server, I think there are
restrictions. Currently, the executor registers a foreign server as a
participant in 2PC before calling BeginForeignInsert(),
BeginForeignModify(), BeginForeignScan(), etc., with a flag indicating
whether a write is going to happen on that foreign server. So even if a UDF
in a SELECT statement that could update data were to be pushed down to the
foreign server, the foreign server would be marked as *not* modified. I've
not tested it yet, but since the FDW is also allowed to register the
foreign server along with that flag at any time before commit, I guess the
FDW can forcibly change that flag if it knows the SELECT query is going to
modify data on the remote server.
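To make the restriction concrete with a made-up example (the foreign table
ft and the remote function log_access() are hypothetical, not part of the
patch): if an FDW pushed the function call below down to the remote server,
the statement would write data there, yet the executor would only see a
foreign scan and would register the server as not modified, so the FDW
itself would have to re-register the server with the modified flag.

```sql
-- Hypothetical setup: ft is a foreign table, and log_access() is a
-- function defined on the remote server that INSERTs into an audit
-- table there as a side effect.
SELECT id, log_access(id)    -- if pushed down, this writes on the remote
FROM ft                      -- server, but locally it is only a scan, so
WHERE id < 10;               -- the server is registered as "not modified"
```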
> > No, in the current design, the backend who received a query from the
> > client does PREPARE, and then the transaction resolver process, a
> > background worker, does COMMIT PREPARED.
>
> This "No" means the current implementation cannot group commits from
> multiple transactions?

Yes.

> Does the transaction resolver send COMMIT PREPARED and wait for its
> response for each transaction, one by one? For example,
>
> [local server]
> Transactions T1 and T2 perform 2PC at the same time.
> The transaction resolver sends COMMIT PREPARED for T1 and then waits for the response.
> T1 writes a COMMIT PREPARED record locally and syncs the WAL.
> The transaction resolver sends COMMIT PREPARED for T2 and then waits for the response.
> T2 writes a COMMIT PREPARED record locally and syncs the WAL.
>
> [foreign server]
> T1 writes a COMMIT PREPARED record locally and syncs the WAL.
> T2 writes a COMMIT PREPARED record locally and syncs the WAL.

Just to be clear, the transaction resolver writes FDWXACT_REMOVE records,
rather than COMMIT PREPARED records, to remove the foreign transaction
entries. But yes, the transaction resolver works as you explained above.

> If the WAL records of multiple concurrent transactions are written and
> synced separately, i.e. group commit doesn't take effect, then the OLTP
> transaction performance will be unacceptable.

I agree that it will be a large performance penalty. I'd like to have group
commit, but I'm not sure we should have it in the first version, from the
perspective of complexity. Since 2PC is inherently expensive, in my opinion
the user should avoid it as much as possible for performance reasons;
especially in OLTP, its cost will directly affect latency. I'd suggest
designing the database schema so that a transaction touches only one
foreign server. But do you have a concrete OLTP use case that normally
requires 2PC, and how many servers would be involved in a distributed
transaction?

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services