Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id CA+fd4k6Pd9dbm5ngiktaNCcqskeEW-Fy+XqztYF-WeORb2w=kw@mail.gmail.com
In response to Re: Transactions involving multiple postgres foreign servers, take 2  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses RE: Transactions involving multiple postgres foreign servers, take 2
List pgsql-hackers
On Tue, 29 Sep 2020 at 15:03, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 29 Sep 2020 at 11:37, tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
> >
> > From: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> > > No. Please imagine a case where a user executes PREPARE TRANSACTION on
> > > the transaction that modified data on foreign servers. The backend
> > > process prepares both the local transaction and foreign transactions.
> > > But another client can execute COMMIT PREPARED on the prepared
> > > transaction. In this case, another backend newly connects to the
> > > foreign servers and commits the prepared foreign transactions.
> > > Therefore, a new connection cache entry can be created during COMMIT
> > > PREPARED, which could lead to an error; but since the local prepared
> > > transaction is already committed, the backend must not fail with an
> > > error.
> > >
> > > In the latter case, I'm assuming that the backend continues to retry
> > > foreign transaction resolution until the user requests cancellation.
> > > Please imagine the case where server-A connects to a foreign server
> > > (say, server-B) and server-B connects to another foreign server (say,
> > > server-C). The transaction initiated on server-A modified the data
> > > both locally and on server-B, which further modified the data on
> > > server-C, and executed COMMIT. The backend process on server-A (say,
> > > backend-A) sends PREPARE TRANSACTION to server-B, then the backend
> > > process on server-B (say, backend-B) connected by backend-A prepares
> > > the local transaction and further sends PREPARE TRANSACTION to
> > > server-C. Let's suppose a temporary connection failure happens
> > > between server-A and server-B before backend-A sends COMMIT PREPARED
> > > (i.e., the 2nd phase of 2PC). When backend-A attempts to send COMMIT
> > > PREPARED to server-B, it realizes that the connection to server-B was
> > > lost, but since the user hasn't requested cancellation yet, backend-A
> > > retries connecting to server-B and succeeds. Now that backend-A has
> > > established a new connection to server-B, there is another backend
> > > process on server-B (say, backend-B'). Since backend-B' doesn't have
> > > a connection to server-C yet, it creates a new connection cache
> > > entry, which could lead to an error. IOW, on server-B, different
> > > processes performed PREPARE TRANSACTION and COMMIT PREPARED, and the
> > > latter process created the connection cache entry.
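
[A rough sketch, not from any patch, of the "retry until the user
cancels" behaviour described above; resolve_one_foreign_xact() is a
hypothetical resolver call used only for illustration:]

#include "postgres.h"
#include "miscadmin.h"

/* Hypothetical: returns false if the foreign server is unreachable */
extern bool resolve_one_foreign_xact(void *fdwxact);

/*
 * Keep retrying resolution of a prepared foreign transaction.  The loop
 * only ends when resolution succeeds or CHECK_FOR_INTERRUPTS() raises an
 * error because the user requested cancellation.
 */
static void
retry_resolution_until_cancel(void *fdwxact)
{
	while (!resolve_one_foreign_xact(fdwxact))
	{
		CHECK_FOR_INTERRUPTS();		/* honors user cancellation */
		pg_usleep(1000000L);		/* wait one second, then retry */
	}
}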
> >
> > Thank you, I understood the situation.  I don't think it's a good design
> > to sacrifice practical performance during normal operation out of fear of
> > a rare error case.
> >
> > The transaction manager (TM) or the FDW implementor can naturally do things like the following:
> >
> > * Use palloc_extended(MCXT_ALLOC_NO_OOM) and hash_search(HASH_ENTER_NULL) to return control to the caller.
> >
> > * Use PG_TRY(), as its overhead is negligible relative to connection establishment.
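
[For illustration only, a minimal sketch of what such non-throwing calls
could look like in an FDW connection cache lookup; ConnCacheKey,
ConnCacheEntry, ConnectionHash, and get_conn_cache_entry() are
hypothetical names, not from any patch:]

#include "postgres.h"
#include "utils/hsearch.h"

/* Hypothetical connection cache entry, for illustration only */
typedef Oid ConnCacheKey;
typedef struct ConnCacheEntry
{
	ConnCacheKey key;			/* hash key, must be first */
	char	   *conn_info;		/* e.g. a connection string buffer */
} ConnCacheEntry;

static HTAB *ConnectionHash;	/* assumed to be created elsewhere */

/*
 * Look up (or create) a cache entry without throwing on out-of-memory,
 * so the caller can decide how to handle the failure.
 */
static ConnCacheEntry *
get_conn_cache_entry(ConnCacheKey key)
{
	ConnCacheEntry *entry;
	bool		found;

	/* HASH_ENTER_NULL returns NULL instead of ereport(ERROR) on OOM */
	entry = hash_search(ConnectionHash, &key, HASH_ENTER_NULL, &found);
	if (entry == NULL)
		return NULL;

	if (!found)
	{
		/* MCXT_ALLOC_NO_OOM likewise returns NULL rather than erroring */
		entry->conn_info = palloc_extended(1024, MCXT_ALLOC_NO_OOM);
		if (entry->conn_info == NULL)
		{
			hash_search(ConnectionHash, &key, HASH_REMOVE, NULL);
			return NULL;
		}
	}

	return entry;
}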
>
> I suppose you mean that the FDW implementor uses PG_TRY() to catch an
> error but does not do PG_RE_THROW(). I'm concerned about whether it's
> safe to return control to the caller and continue trying to resolve
> foreign transactions without either re-throwing the error or aborting
> the transaction.
>
> IMHO, a design along the lines of "high performance, but doesn't work
> correctly in a rare failure case" is rather a bad one, especially for a
> transaction management feature.
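
[To make the pattern under discussion concrete, a rough sketch of
catching the error without re-throwing; connect_to_foreign_server() is a
hypothetical helper, not an existing API. Whether it is safe to simply
swallow the error here and keep going, without aborting the transaction,
is exactly the question raised above:]

#include "postgres.h"
#include "foreign/foreign.h"
#include "libpq-fe.h"

/* Hypothetical helper that connects and may ereport(ERROR) on failure */
extern PGconn *connect_to_foreign_server(ForeignServer *server,
										 UserMapping *user);

/*
 * Try to connect, but report failure to the caller instead of letting
 * the error propagate.
 */
static PGconn *
connect_no_throw(ForeignServer *server, UserMapping *user)
{
	PGconn	   *volatile conn = NULL;

	PG_TRY();
	{
		conn = connect_to_foreign_server(server, user);
	}
	PG_CATCH();
	{
		/* swallow the error: no PG_RE_THROW(), no transaction abort */
		FlushErrorState();
		conn = NULL;
	}
	PG_END_TRY();

	return conn;
}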

To avoid misunderstanding, I didn't mean to disregard performance. I
mean that, especially for a transaction management feature, it's
essential to work correctly even in failure cases. So for the first
version I hope we have a safe, robust, and preferably simple design
that may not perform well yet but has the potential for performance
improvement, and we can work on improving performance later.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


