Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id CA+fd4k5Y0Wvezij5o=+gN1iMGXJUwOjYNyJDpZxa9ySgbs131w@mail.gmail.com
Whole thread Raw
In response to RE: Transactions involving multiple postgres foreign servers, take 2  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: Transactions involving multiple postgres foreign servers, take 2
List pgsql-hackers
On Mon, 28 Sep 2020 at 13:58, tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> > On Fri, 25 Sep 2020 at 18:21, tsunakawa.takay@fujitsu.com
> > <tsunakawa.takay@fujitsu.com> wrote:
> > > Why does the client backend have to create a new connection cache entry
> > during PREPARE or COMMIT PREPARE?  Doesn't the client backend naturally
> > continue to use connections that it has used in its current transaction?
> >
> > I think there are two cases: a process executes PREPARE TRANSACTION
> > and another process executes COMMIT PREPARED later, and if the
> > coordinator has cascaded foreign servers (i.g., a foreign server has
> > its foreign server) and temporary connection problem happens in the
> > intermediate node after PREPARE then another process on the
> > intermediate node will execute COMMIT PREPARED on its foreign server.
>
> Aren't both the cases failure cases, and thus handled by the resolver?

No. Please imagine a case where a user executes PREPARE TRANSACTION on
the transaction that modified data on foreign servers. The backend
process prepares both the local transaction and foreign transactions.
But another client can execute COMMIT PREPARED on the prepared
transaction. In this case, another backend newly connects foreign
servers and commits prepared foreign transactions. Therefore, the new
connection cache entry can be created during COMMIT PREPARED which
could lead to an error but since the local prepared transaction is
already committed the backend must not fail with an error.

In the latter case, I’m assumed that the backend continues to retry
foreign transaction resolution until the user requests cancellation.
Please imagine the case where the server-A connects a foreign server
(say, server-B) and server-B connects another foreign server (say,
server-C). The transaction initiated on server-A modified the data on
both local and server-B which further modified the data on server-C
and executed COMMIT.  The backend process on server-A (say, backend-A)
sends PREPARE TRANSACTION to server-B then the backend process  on
server-B (say, backend-B) connected by backend-A prepares the local
transaction and further sends PREPARE TRANSACTION to server-C. Let’s
suppose a temporary connection failure happens between server-A and
server-B before the backend-A sending COMMIT PREPARED (i.g, 2nd phase
of 2PC). When the backend-A attempts to sends COMMIT PREPARED to
server-B it realizes that the connection to server-B was lost but
since the user doesn’t request cancellatino yet the backend-A retries
to connect server-B and suceeds. Since now that the backend-A
established a new connection to server-B, there is another backend
process on server-B (say, backend-B’). Since the backend-B’ doen’t
have a connection to server-C yet, it creates new connection cache
entry, which could lead to an error.  IOW, on server-B different
processes performed PREPARE TRANSACTION and COMMIT PREPARED and the
later process created a connection cache entry.

>
>
> > > > In terms of performance you're concerned, I wonder if we can somewhat
> > > > eliminate the bottleneck if multiple resolvers are able to run on one
> > > > database in the future. For example, if we could launch resolver
> > > > processes as many as connections on the database, individual backend
> > > > processes could have one resolver process. Since there would be
> > > > contention and inter-process communication it still brings some
> > > > overhead but it might be negligible comparing to network round trip.
> > >
> > > Do you mean that if concurrent 200 clients each update data on two foreign
> > servers, there are 400 resolvers?  ...That's overuse of resources.
> >
> > I think we have 200 resolvers in this case since one resolver process
> > per backend process.
>
> That does not parallelize prepare or commit for a single client, as each resolver can process only one prepare or
commitsynchronously at a time.  Not to mention the resource usage is high. 

Well, I think we should discuss parallel (and/or asyncronous)
execution of prepare and commit separated from the discussion on
whether the resolver process is responsible for 2nd phase of 2PC. I've
been suggesting that the first phase and the second phase of 2PC
should be performed by different processes in terms of safety. And
having multiple resolvers on one database is my suggestion in response
to the concern you raised that one resolver process on one database
can be bottleneck. Both parallel executionand asynchronous execution
are slightly related to this topic but I think it should be discussed
separately.

Regarding parallel and asynchronous execution, I basically agree on
supporting asynchronous execution as the XA specification also has,
although I think it's better not to include it in the first version
for simplisity.

Overall, my suggestion for the first version is to support synchronous
execution of prepare, commit, and rollback, have one resolver process
per database, and have resolver take 2nd phase of 2PC. As the next
step we can add APIs for asynchronous execution, have multiple
resolvers on one database and so on.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Next
From: Ashutosh Sharma
Date:
Subject: Re: Parallel copy