RE: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
From | tsunakawa.takay@fujitsu.com |
---|---|
Subject | RE: Transactions involving multiple postgres foreign servers, take 2 |
Date | |
Msg-id | TYAPR01MB2990B802DB4CFC5DD5678859FE390@TYAPR01MB2990.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Transactions involving multiple postgres foreign servers, take 2 (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
List | pgsql-hackers |
From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> > The way I am looking at is to put the parallelism in the resolution > worker and not in the FDW. If we use multiple resolution workers, they > can fire commit/abort on multiple foreign servers at a time. From a single session's view, yes. However, the requests from multiple sessions are processed one at a time within eachresolver, because the resolver has to call the synchronous FDW prepare/commit routines and wait for the response fromthe remote server. That's too limiting. > But if we want parallelism within a single resolution worker, we will > need a separate FDW APIs for firing asynchronous commit/abort prepared > txn and fetching their results resp. But given the variety of FDWs, > not all of them will support asynchronous API, so we have to support > synchronous API anyway, which is what can be targeted in the first > version. I agree in that most FDWs will be unlikely to have asynchronous prepare/commit functions, as demonstrated by the fact thateven Oracle and Db2 don't implement XA asynchronous APIs. That's one problem of using FDW for Postgres scale-out. Whenwe enhance FDW, we have to take care of other DBMSs to make the FDW interface practical. OTOH, we want to make maximumuse of Postgres features, such as libpq asynchronous API, to make Postgres scale-out as performant as possible. Butthe scale-out design is bound by the FDW interface. I don't feel accepting such less performant design is an attitudeof this community, as people here are strict against even 1 or 2 percent performance drop. > Thinking more about it, the core may support an API which accepts a > list of prepared transactions, their foreign servers and user mappings > and let FDW resolve all those either in parallel or one by one. So > parallelism is responsibility of FDW and not the core. But then we > loose parallelism across FDWs, which may not be a common case. Hmm, I understand asynchronous FDW relation scan is being developed now, in the form of cooperation between the FDW and theexecutor. If we make just the FDW responsible for prepare/commit parallelism, the design becomes asymmetric. As yousay, I'm not sure if the parallelism is wanted among different types, say, Postgres and Oracle. In fact, major DBMSsdon't implement XA asynchronous API. But such lack of parallelism may be one cause of the bad reputation that 2PC (ofXA) is slow. > Given the complications around this, I think we should go ahead > supporting synchronous API first and in second version introduce > optional asynchronous API. How about the following? * Add synchronous and asynchronous versions of prepare/commit/abort routines and a routine to wait for completion of asynchronousexecution in FdwRoutine. They are optional. postgres_fdw can implement the asynchronous routines using libpq asynchronous functions. Other DBMSs can implement XA asynchronousAPI for them in theory. * The client backend uses asynchronous FDW routines if available: /* Issue asynchronous prepare | commit | rollback to FDWs that support it */ foreach (per each foreign server used in the transaction) { if (fdwroutine->{prepare | commit | rollback}_async_func) fdwroutine->{prepare | commit | rollback}_async_func(...); } /* Wait for completion of asynchronous prepare | commit | rollback */ foreach (per each foreign server used in the transaction) { if (fdwroutine->{prepare | commit | rollback}_async_func) ret = fdwroutine->wait_for_completion(...); } /* Issue synchronous prepare | commit | rollback to FDWs that don't support it */ foreach (per each foreign server used in the transaction) { if (fdwroutine->{prepare | commit | rollback}_async_func == NULL) ret = fdwroutine->{prepare | commit | rollback}_func(...); } * The client backend asks the resolver to commit or rollback the remote transaction only when the remote transaction fails(due to the failure of remote server or network.) That is, the resolver is not involved during normal operation. This will not be complex, and can be included in the first version, if we really want to use FDW for Postgres scale-out. Regards Takayuki Tsunakawa
pgsql-hackers by date: