Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers
From | Robert Haas
Subject | Re: Transactions involving multiple postgres foreign servers, take 2
Date |
Msg-id | CA+TgmoZ7=MufRiz7hmQPN=JF-QYt+H2TooGNcxsYUDa=nucbqg@mail.gmail.com
In response to | RE: Transactions involving multiple postgres foreign servers, take 2 ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses | RE: Transactions involving multiple postgres foreign servers, take 2
List | pgsql-hackers
On Thu, Jun 10, 2021 at 9:58 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> I understand that. As I cited yesterday and possibly before, that's why
> xa_commit() returns various return codes. So, I have never suggested that
> FDWs should not report an error and always report success for the commit
> request. They should be allowed to report an error.

In the text to which I was responding it seemed like you were saying
the opposite. Perhaps I misunderstood.

> The question I have been asking is how. With that said, we should only
> have two options; one is the return value of the FDW commit routine, and
> the other is via ereport(ERROR). I suggested the possibility of the
> former, because if the FDW does ereport(ERROR), Postgres core (transaction
> manager) may have difficulty in handling the rest of the participants.

I don't think that is going to work. It is very difficult to write
code that doesn't ever ERROR in PostgreSQL. It is not impossible if
the operation is trivial enough, but I think you're greatly
underestimating the complexity of committing the remote transaction.
If somebody had designed PostgreSQL so that every function returns a
return code and every time you call some other function you check
that return code and pass any error up to your own caller, then there
would be no problem here. But in fact the design was that at the
first sign of trouble you throw an ERROR. It's not easy to depart
from that programming model in just one place.

> > Also, leaving aside theoretical arguments, I think it's not
> > realistically possible for an FDW author to write code to commit a
> > prepared transaction that will be safe in the context of running late
> > in PrepareTransaction(), after we've already done
> > RecordTransactionCommit(). Such code can't avoid throwing errors
> > because it can't avoid performing operations and allocating memory.
>
> I'm not completely sure about this. I thought (and said) that the only
> thing the FDW does would be to send a commit request through an existing
> connection. So, I think it's not a severe restriction to require FDWs to
> do ereport(ERROR) during commits (of the second phase of 2PC.)

To send a commit request through an existing connection, you have to
send some bytes over the network using a send() or write() system
call. That can fail. Then you have to read the response back over
the network using recv() or read(). That can also fail. You also
need to parse the result that you get from the remote side, which can
also fail, because you could get back garbage for some reason. And
depending on the details, you might first need to construct the
message you're going to send, which might be able to fail too. Also,
the data might be encrypted using SSL, so you might have to decrypt
it, which can also fail, and you might need to encrypt data before
sending it, which can fail. In fact, if you're using OpenSSL, trying
to call SSL_read() or SSL_write() can both read and write data from
the socket, even multiple times, so you have extra opportunities to
fail.
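To make that failure surface concrete, here is a rough sketch -- not
code from any proposed patch -- of what that single "commit request"
amounts to for a libpq-based FDW such as postgres_fdw. The function
name and message text are made up, and gid escaping is skipped:

#include "postgres.h"
#include "libpq-fe.h"

static void
sketch_commit_prepared_on_remote(PGconn *conn, const char *gid)
{
	char		sql[256];
	PGresult   *res;

	/* Even building the command can fail, e.g. if the identifier is too long. */
	if (snprintf(sql, sizeof(sql), "COMMIT PREPARED '%s'", gid) >= (int) sizeof(sql))
		ereport(ERROR, (errmsg("prepared transaction identifier too long")));

	/*
	 * PQexec() hides the send()/write(), recv()/read(), possible
	 * SSL_write()/SSL_read(), and protocol parsing described above;
	 * any of those steps can fail.
	 */
	res = PQexec(conn, sql);
	if (res == NULL || PQresultStatus(res) != PGRES_COMMAND_OK)
		ereport(ERROR,			/* real code must also free res on this path */
				(errmsg("could not commit prepared transaction on remote server: %s",
						PQerrorMessage(conn))));

	PQclear(res);
}

Every step that touches the network or allocates a PGresult can fail,
and in the backend's programming model the natural way for each of
those failures to surface is ereport(ERROR).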
> (I took "abort" as the same as "rollback" here.) Once we've sent commit
> requests to some participants, we can't abort the transaction. If one FDW
> returned an error halfway, we need to send commit requests to the rest of
> participants.

I understand that it's not possible to abort the local transaction
after it's been committed, but that doesn't mean that we're going to
be able to send the commit requests to the rest of the participants.
We want to be able to do that, certainly, but there's no guarantee
that it's actually possible. Again, the remote servers may be dropped
into a volcano, or less seriously, we may not be able to access them.
Also, someone may kill off our session.

> It's a design question, as I repeatedly said, whether and how we should
> report the error of some participants to the client. For instance, how
> should we report the errors of multiple participants? Concatenate those
> error messages?

Sure, I agree that there are some questions about how to report errors.

> Anyway, we should design the interface first, giving much thought and
> respecting the ideas of predecessors (TX/XA, MS DTC, JTA/JTS). Otherwise,
> we may end up like "We implemented like this, so the interface is like
> this and it can only behave like this, although you may find it
> strange..." That might be a situation similar to what your comment "the
> design of PostgreSQL, in all circumstances, the way you recover from an
> error is to abort the transaction" suggests -- Postgres doesn't have
> statement-level rollback.

I think that's a valid concern, but we also have to have a plan that
is realistic. Some things are indeed not possible in PostgreSQL's
design. Also, some of these problems are things everyone has to
somehow confront. There's no database doing 2PC that can't have a
situation where one of the machines disappears unexpectedly due to
some natural disaster or administrator interference. It might be the
case that our inability to do certain things safely during
transaction commit puts us out of compliance with the spec, but it
can't be the case that some other system has no possible failures
during transaction commit. The problem of the network potentially
being disconnected between one packet and the next exists in every
system.

> I don't think the resolver-based approach would bring us far enough. It's
> fundamentally a bottleneck. Such a background process should only handle
> commits whose requests failed to be sent due to server down.

Why is it fundamentally a bottleneck? It seems to me in some cases it
could scale better than any other approach. If we have to commit on
100 shards in only one process we can only do those commits one at a
time. If we can use resolver processes we could do all 100 at once if
the user can afford to run that many resolvers, which should be way
faster. It is true that if the resolver does not have a connection
open and must open one, that might be slow, but presumably after that
it can keep the connection open and reuse it for subsequent
distributed transactions. I don't really see why that should be
particularly slow.

--
Robert Haas
EDB: http://www.enterprisedb.com