
From tsunakawa.takay@fujitsu.com
Subject RE: Transactions involving multiple postgres foreign servers, take 2
Msg-id TYAPR01MB2990DBB9A8281DCA014FFA1CFE270@TYAPR01MB2990.jpnprd01.prod.outlook.com
In response to Re: Transactions involving multiple postgres foreign servers, take 2  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Transactions involving multiple postgres foreign servers, take 2  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Alexey-san, Sawada-san,
cc: Fujii-san,


From: Fujii Masao <masao.fujii@oss.nttdata.com>
> But if we
> implement 2PC as an improvement to FDW, independently of PostgreSQL
> sharding, I think it's necessary to support other FDWs. And this is our
> direction, isn't it?

I understand it the same way as Fujii-san.  2PC for FDW is itself useful, so I think we should pursue a tidy FDW
interface and good performance within the FDW framework.  "Tidy" means that many other FDWs should be able to
implement it.  I guess XA/JTA is the only material we can use to consider whether the FDW interface is good.
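
As a reference point, a "tidy" interface shaped after the XA verbs xa_prepare/xa_commit/xa_rollback might look
roughly like the sketch below.  This is only my illustration in C; the struct and callback names are hypothetical
and are not the API of Sawada-san's or Alexey-san's patch.

/*
 * Purely hypothetical sketch of a 2PC-capable FDW callback set, modeled
 * on XA's xa_prepare/xa_commit/xa_rollback.  Not from either patch.
 */
#include "postgres.h"

typedef struct FdwXactCallbacks
{
    /* Phase 1: persist the remote transaction so it survives a crash. */
    bool        (*prepare_foreign_xact) (Oid serverid, Oid userid,
                                         const char *gid);

    /* Phase 2: resolve the prepared transaction one way or the other. */
    bool        (*commit_prepared_xact) (Oid serverid, Oid userid,
                                         const char *gid);
    bool        (*rollback_prepared_xact) (Oid serverid, Oid userid,
                                           const char *gid);
} FdwXactCallbacks;

If everything XA/JTA needs maps onto three callbacks like these, that is at least one sign the interface is
implementable by FDWs other than postgres_fdw.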


> Sawada-san's patch supports that case by implementing some components
> for that also in PostgreSQL core. For example, with the patch, all the remote
> transactions that participate in the transaction are managed by PostgreSQL
> core instead of the postgres_fdw layer.
> 
> Therefore, at least regarding the difference 2), I think that Sawada-san's
> approach is better. Thoughts?

I think so.  Sawada-san's patch needs to address the design issues I posed before we dig into the code for a
thorough review, though.
 

BTW, is there something Sawada-san can take from Alexey-san's patch?  I'm concerned about the performance for
practical use.  Do you two differ on these points, for instance?  The first two items are often cited to evaluate
an algorithm's performance, as you know.

* The number of round trips to remote nodes.
* The number of disk I/Os on each node and all nodes in total (WAL, two-phase file, pg_subtrans file, CLOG?).
* Are prepare and commit executed in parallel on remote nodes? (Serious DBMSs do so; see the libpq sketch after this list.)
* Is there any serialization point in the processing? (Sawada-san's has one)
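
On the parallelism point, the difference is easy to see with libpq's asynchronous API: dispatch PREPARE TRANSACTION
to every participant first and only then wait, so the waits overlap.  The sketch below is mine, with made-up host
names and GIDs; real code would of course prepare transactions that have done actual work.

/*
 * Minimal sketch: run PREPARE TRANSACTION on all remote nodes in parallel
 * using libpq's asynchronous API.  Hosts and GIDs are illustrative only.
 */
#include <stdio.h>
#include <libpq-fe.h>

#define NNODES 3

int
main(void)
{
    const char *conninfo[NNODES] = {
        "host=node1 dbname=postgres",   /* hypothetical participants */
        "host=node2 dbname=postgres",
        "host=node3 dbname=postgres"
    };
    PGconn     *conns[NNODES];

    for (int i = 0; i < NNODES; i++)
    {
        conns[i] = PQconnectdb(conninfo[i]);
        if (PQstatus(conns[i]) != CONNECTION_OK)
        {
            fprintf(stderr, "%s", PQerrorMessage(conns[i]));
            return 1;
        }
        /* Open the transaction; the real work would happen here. */
        PQclear(PQexec(conns[i], "BEGIN"));
    }

    /* Phase 1: dispatch every PREPARE without waiting for any reply. */
    for (int i = 0; i < NNODES; i++)
    {
        char        sql[64];

        snprintf(sql, sizeof(sql), "PREPARE TRANSACTION 'demo_%d'", i);
        if (PQsendQuery(conns[i], sql) == 0)
            fprintf(stderr, "%s", PQerrorMessage(conns[i]));
    }

    /* Only now collect the replies; the waits overlap, so total latency
     * is roughly one round trip instead of NNODES sequential ones. */
    for (int i = 0; i < NNODES; i++)
    {
        PGresult   *res;

        while ((res = PQgetResult(conns[i])) != NULL)
        {
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
                fprintf(stderr, "PREPARE failed: %s",
                        PQerrorMessage(conns[i]));
            PQclear(res);
        }
    }

    for (int i = 0; i < NNODES; i++)
        PQfinish(conns[i]);
    return 0;
}

With the sequential PQexec() style the coordinator pays NNODES round trips per phase; with the dispatch-then-collect
style above it pays roughly one.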

I'm sorry to repeat myself, but I don't think we can compromise on 2PC performance.  Of course, we recommend that
users design a schema that co-locates the data each transaction accesses so as to avoid 2PC, but that's not always
possible (e.g., when secondary indexes are used.)

Plus, as the following quote from the TPC-C specification shows, TPC-C requires 15% of (Payment?) transactions to
do 2PC.  (I learned this from Microsoft's, CockroachDB's, or Citus Data's site.)


--------------------------------------------------
Independent of the mode of selection, the customer resident warehouse is the home warehouse 85% of the time and is
a randomly selected remote warehouse 15% of the time. This can be implemented by generating two random numbers x
and y within [1 .. 100];

. If x <= 85 a customer is selected from the selected district number (C_D_ID = D_ID) and the home warehouse
number (C_W_ID = W_ID). The customer is paying through his/her own warehouse.

. If x > 85 a customer is selected from a random district number (C_D_ID is randomly selected within [1 .. 10]),
and a random remote warehouse number (C_W_ID is randomly selected within the range of active warehouses (see
Clause 4.2.2), and C_W_ID ≠ W_ID). The customer is paying through a warehouse and a district other than
his/her own.
--------------------------------------------------
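
Transcribed directly into C, the quoted selection rule is just this (the warehouse count, the district count of 10,
and the plain rand() generator are simplifications of mine):

/*
 * Direct transcription of the quoted TPC-C Payment selection rule.
 * W (number of active warehouses) and the RNG are illustrative.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Uniform random integer in [lo, hi], per the spec's [1 .. 100] style. */
static int
urand(int lo, int hi)
{
    return lo + rand() % (hi - lo + 1);
}

int
main(void)
{
    srand((unsigned) time(NULL));

    int         W = 10;             /* assumed number of active warehouses */
    int         w_id = urand(1, W); /* home warehouse of this terminal */
    int         d_id = urand(1, 10);
    int         x = urand(1, 100);
    int         c_w_id, c_d_id;

    if (x <= 85)
    {
        /* 85%: the customer pays through his/her own warehouse. */
        c_w_id = w_id;
        c_d_id = d_id;
    }
    else
    {
        /* 15%: a remote warehouse (C_W_ID != W_ID), which is the case
         * that forces a distributed transaction, and hence 2PC, when
         * warehouses live on different nodes. */
        do
            c_w_id = urand(1, W);
        while (c_w_id == w_id && W > 1);
        c_d_id = urand(1, 10);
    }

    printf("C_W_ID=%d C_D_ID=%d (home W_ID=%d)\n", c_w_id, c_d_id, w_id);
    return 0;
}

Every x > 85 case picks C_W_ID on another warehouse, and when warehouses are sharded across servers that is exactly
the cross-node write that needs 2PC.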


Regards
Takayuki Tsunakawa


