Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers

From Muhammad Usama
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id CAEJvTzVs31FUOKi2vprjdyKuEgE6ppCFzDxvhmqQh-WhKNBjOg@mail.gmail.com
Whole thread Raw
In response to Re: Transactions involving multiple postgres foreign servers, take 2  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: Transactions involving multiple postgres foreign servers, take 2
List pgsql-hackers


On Tue, May 12, 2020 at 11:45 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
On Thu, 30 Apr 2020 at 20:43, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 28 Apr 2020 at 19:37, Muhammad Usama <m.usama@gmail.com> wrote:
> >
> >
> >
> > On Wed, Apr 8, 2020 at 11:16 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
> >>
> >> On Fri, 27 Mar 2020 at 22:06, Muhammad Usama <m.usama@gmail.com> wrote:
> >> >
> >> > Hi Sawada San,
> >> >
> >> > I have been further reviewing and testing the transaction involving multiple server patches.
> >> > Overall the patches are working as expected bar a few important exceptions.
> >> > So as discussed over the call I have fixed the issues I found during the testing
> >> > and also rebased the patches with the current head of the master branch.
> >> > So can you please have a look at the attached updated patches.
> >>
> >> Thank you for reviewing and updating the patch!
> >>
> >> >
> >> > Below is the list of changes I have made on top of V18 patches.
> >> >
> >> > 1- In register_fdwxact(), As we are just storing the callback function pointers from
> >> > FdwRoutine in fdw_part structure, So I think we can avoid calling
> >> > GetFdwRoutineByServerId() in TopMemoryContext.
> >> > So I have moved the MemoryContextSwitch to TopMemoryContext after the
> >> > GetFdwRoutineByServerId() call.
> >>
> >> Agreed.
> >>
> >> >
> >> >
> >> > 2- If PrepareForeignTransaction functionality is not present in some FDW then
> >> > during the registration process we should only set the XACT_FLAGS_FDWNOPREPARE
> >> > transaction flag if the modified flag is also set for that server. As for the server that has
> >> > not done any data modification within the transaction we do not do two-phase commit anyway.
> >>
> >> Agreed.
> >>
> >> >
> >> > 3- I have moved the foreign_twophase_commit in sample file after
> >> > max_foreign_transaction_resolvers because the default value of max_foreign_transaction_resolvers
> >> > is 0 and enabling the foreign_twophase_commit produces an error with default
> >> > configuration parameter positioning in postgresql.conf
> >> > Also, foreign_twophase_commit configuration was missing the comments
> >> > about allowed values in the sample config file.
> >>
> >> Sounds good. Agreed.
> >>
> >> >
> >> > 4- Setting ForeignTwophaseCommitIsRequired in is_foreign_twophase_commit_required()
> >> > function does not seem to be the correct place. The reason being, even when
> >> > is_foreign_twophase_commit_required() returns true after setting ForeignTwophaseCommitIsRequired
> >> > to true, we could still end up not using the two-phase commit in the case when some server does
> >> > not support two-phase commit and foreign_twophase_commit is set to FOREIGN_TWOPHASE_COMMIT_PREFER
> >> > mode. So I have moved the ForeignTwophaseCommitIsRequired assignment to PreCommit_FdwXacts()
> >> > function after doing the prepare transaction.
> >>
> >> Agreed.
> >>
> >> >
> >> > 6- In prefer mode, we commit the transaction in single-phase if the server does not support
> >> > the two-phase commit. But instead of doing the single-phase commit right away,
> >> > IMHO the better way is to wait until all the two-phase transactions are successfully prepared
> >> > on servers that support the two-phase. Since an error during a "PREPARE" stage would
> >> > rollback the transaction and in that case, we would end up with committed transactions on
> >> > the server that lacks the support of the two-phase commit.
> >>
> >> When an error occurred before the local commit, a 2pc-unsupported
> >> server could be rolled back or committed depending on the error
> >> timing. On the other hand all 2pc-supported servers are always rolled
> >> back when an error occurred before the local commit. Therefore even if
> >> we change the order of COMMIT and PREPARE it is still possible that we
> >> will end up committing the part of 2pc-unsupported servers while
> >> rolling back others including 2pc-supported servers.
> >>
> >> I guess the motivation of your change is that since errors are likely
> >> to happen during executing PREPARE on foreign servers, we can minimize
> >> the possibility of rolling back 2pc-unsupported servers by deferring
> >> the commit of 2pc-unsupported server as much as possible. Is that
> >> right?
> >
> >
> > Yes, that is correct. The idea of doing the COMMIT on NON-2pc-supported servers
> > after all the PREPAREs are successful is to minimize the chances of partial commits.
> > And as you mentioned there will still be chances of getting a partial commit even with
> > this approach but the probability of that would be less than what it is with the
> > current sequence.
> >
> >
> >>
> >>
> >> > So I have modified the flow a little bit and instead of doing a one-phase commit right away
> >> > the servers that do not support a two-phase commit is added to another list and that list is
> >> > processed after once we have successfully prepared all the transactions on two-phase supported
> >> > foreign servers. Although this technique is also not bulletproof, still it is better than doing
> >> > the one-phase commits before doing the PREPAREs.
> >>
> >> Hmm the current logic seems complex. Maybe we can just reverse the
> >> order of COMMIT and PREPARE; do PREPARE on all 2pc-supported and
> >> modified servers first and then do COMMIT on others?
> >
> >
> > Agreed, seems reasonable.
> >>
> >>
> >> >
> >> > Also, I think we can improve on this one by throwing an error even in PREFER
> >> > mode if there is more than one server that had data modified within the transaction
> >> > and lacks the two-phase commit support.
> >> >
> >>
> >> IIUC the concept of PREFER mode is that the transaction uses 2pc only
> >> for 2pc-supported servers. IOW, even if the transaction modifies on a
> >> 2pc-unsupported server we can proceed with the commit if in PREFER
> >> mode, which cannot if in REQUIRED mode. What is the motivation of your
> >> above idea?
> >
> >
> > I was thinking that we could change the behavior of PREFER mode such that we only allow
> > to COMMIT the transaction if the transaction needs to do a single-phase commit on one
> > server only. That way we can ensure that we would never end up with partial commit.
> >
>
> I think it's good to avoid a partial commit by using your idea but if
> we want to avoid a partial commit we can use the 'required' mode,
> which requires all participant servers to support 2pc. We throw an
> error if participant servers include even one 2pc-unsupported server
> is modified within the transaction. Of course if the participant node
> is only one 2pc-unsupported server it can use 1pc even in the
> 'required' mode.
>
> > One Idea in this regards would be to switch the local transaction to commit using 2pc
> > if there is a total of only one foreign server that does not support the 2pc in the transaction,
> > ensuring that 1-pc commit servers should always be less than or equal to 1. and if there are more
> > than one foreign server requires 1-pc then we just throw an error.
>
> I might be missing your point but I suppose this idea is to do
> something like the following?
>
> 1. prepare the local transaction
> 2. commit the foreign transaction on 2pc-unsupported server
> 3. commit the prepared local transaction
>
> >
> > However having said that, I am not 100% sure if its a good or an acceptable Idea, and
> > I am okay with continuing with the current behavior of PREFER mode if we put it in the
> > document that this mode can cause a partial commit.
>
> There will three types of servers: (a) a server doesn't support any
> transaction API, (b) a server supports only commit and rollback API
> and (c) a server supports all APIs (commit, rollback and prepare).
> Currently postgres transaction manager manages only server-(b) and
> server-(c), adds them to FdwXactParticipants. I'm considering changing
> the code so that it adds also server-(a) to FdwXactParticipants, in
> order to track the number of server-(a) involved in the transaction.
> But it doesn't insert FdwXact entry for it, and manage transactions on
> these servers.
>
> The reason is this; if we want to have the 'required' mode strictly
> require all participant servers to support 2pc, we should use 2pc when
> (# of server-(a) + # of server-(b) + # of server-(c)) >= 2. But since
> currently we just track the modification on a server-(a) by a flag we
> cannot handle the case where two server-(a) are modified in the
> transaction. On the other hand, if we don't consider server-(a) the
> transaction could end up with a partial commit when a server-(a)
> participates in the transaction. Therefore I'm thinking of the above
> change so that the transaction manager can ensure that a partial
> commit doesn't happen in the 'required' mode. What do you think?
>
> >
> >>
> >> > 7- Added a pfree() and list_free_deep() in PreCommit_FdwXacts() to reclaim the
> >> > memory if fdw_part is removed from the list
> >>
> >> I think at the end of the transaction we free entries of
> >> FdwXactParticipants list and set FdwXactParticipants to NIL. Why do we
> >> need to do that in PreCommit_FdwXacts()?
> >
> >
> > Correct me if I am wrong, The fdw_part structures are created in TopMemoryContext
> > and if that fdw_part structure is removed from the list at pre_commit stage
> > (because we did 1-PC COMMIT on it) then it would leak memory.
>
> The fdw_part structures are created in TopTransactionContext so these
> are freed at the end of the transaction.
>
> >
> >>
> >> >
> >> > 8- The function FdwXactWaitToBeResolved() was bailing out as soon as it finds
> >> > (FdwXactParticipants == NIL). The problem with that was in the case of
> >> > "COMMIT/ROLLBACK PREPARED" we always get FdwXactParticipants = NIL and
> >> > effectively the foreign prepared transactions(if any) associated with locally
> >> > prepared transactions were never getting resolved automatically.
> >> >
> >> >
> >> > postgres=# BEGIN;
> >> > BEGIN
> >> > INSERT INTO test_local  VALUES ( 2, 'TWO');
> >> > INSERT 0 1
> >> > INSERT INTO test_foreign_s1  VALUES ( 2, 'TWO');
> >> > INSERT 0 1
> >> > INSERT INTO test_foreign_s2  VALUES ( 2, 'TWO');
> >> > INSERT 0 1
> >> > postgres=*# PREPARE TRANSACTION 'local_prepared';
> >> > PREPARE TRANSACTION
> >> >
> >> > postgres=# select * from pg_foreign_xacts ;
> >> > dbid  | xid | serverid | userid |  status  | in_doubt |         identifier
> >> > -------+-----+----------+--------+----------+----------+----------------------------
> >> >  12929 | 515 |    16389 |     10 | prepared | f        | fx_1339567411_515_16389_10
> >> >  12929 | 515 |    16391 |     10 | prepared | f        | fx_1963224020_515_16391_10
> >> > (2 rows)
> >> >
> >> > -- Now commit the prepared transaction
> >> >
> >> > postgres=# COMMIT PREPARED 'local_prepared';
> >> >
> >> > COMMIT PREPARED
> >> >
> >> > --Foreign prepared transactions associated with 'local_prepared' not resolved
> >> >
> >> > postgres=#
> >> >
> >> > postgres=# select * from pg_foreign_xacts ;
> >> > dbid  | xid | serverid | userid |  status  | in_doubt |         identifier
> >> > -------+-----+----------+--------+----------+----------+----------------------------
> >> >  12929 | 515 |    16389 |     10 | prepared | f        | fx_1339567411_515_16389_10
> >> >  12929 | 515 |    16391 |     10 | prepared | f        | fx_1963224020_515_16391_10
> >> > (2 rows)
> >> >
> >> >
> >> > So to fix this in case of the two-phase transaction, the function checks the existence
> >> > of associated foreign prepared transactions before bailing out.
> >> >
> >>
> >> Good catch. But looking at your change, we should not accept the case
> >> where FdwXactParticipants == NULL but TwoPhaseExists(wait_xid) ==
> >> false.
> >>
> >>        if (FdwXactParticipants == NIL)
> >>        {
> >>                /*
> >>                 * If we are here because of COMMIT/ROLLBACK PREPARED then the
> >>                 * FdwXactParticipants list would be empty. So we need to
> >>                 * see if there are any foreign prepared transactions exists
> >>                 * for this prepared transaction
> >>                 */
> >>                if (TwoPhaseExists(wait_xid))
> >>                {
> >>                        List *foreign_trans = NIL;
> >>
> >>                        foreign_trans = get_fdwxacts(MyDatabaseId,
> >> wait_xid, InvalidOid, InvalidOid,
> >>                                         false, false, true);
> >>
> >>                        if (foreign_trans == NIL)
> >>                                return;
> >>                        list_free(foreign_trans);
> >>                }
> >>        }
> >>
> >
> > Sorry my bad, its a mistake on my part. we should just return from the function when
> > FdwXactParticipants == NULL but TwoPhaseExists(wait_xid) == false.
> >
> >         if (TwoPhaseExists(wait_xid))
> >         {
> >             List *foreign_trans = NIL;
> >             foreign_trans = get_fdwxacts(MyDatabaseId, wait_xid, InvalidOid, InvalidOid,
> >                      false, false, true);
> >
> >             if (foreign_trans == NIL)
> >                 return;
> >             list_free(foreign_trans);
> >         }
> >         else
> >             return;
> >
> >>
> >> > 9- In function XlogReadFdwXactData() XLogBeginRead call was missing before XLogReadRecord()
> >> > that was causing the crash during recovery.
> >>
> >> Agreed.
> >>
> >> >
> >> > 10- incorporated set_ps_display() signature change.
> >>
> >> Thanks.
> >>
> >> Regarding other changes you did in v19 patch, I have some comments:
> >>
> >> 1.
> >> +       ereport(LOG,
> >> +                       (errmsg("trying to %s the foreign transaction
> >> associated with transaction %u on server %u",
> >> +                                       fdwxact->status ==
> >> FDWXACT_STATUS_COMMITTING?"COMMIT":"ABORT",
> >> +                                       fdwxact->local_xid,
> >> fdwxact->serverid)));
> >> +
> >>
> >> Why do we need to emit LOG message in pg_resolve_foreign_xact() SQL function?
> >
> >
> > That change was not intended to get into the patch file. I had done it during testing to
> > quickly get info on which way the transaction is going to be resolved.
> >
> >>
> >> 2.
> >> diff --git a/src/bin/pg_waldump/fdwxactdesc.c b/src/bin/pg_waldump/fdwxactdesc.c
> >> deleted file mode 120000
> >> index ce8c21880c..0000000000
> >> --- a/src/bin/pg_waldump/fdwxactdesc.c
> >> +++ /dev/null
> >> @@ -1 +0,0 @@
> >> -../../../src/backend/access/rmgrdesc/fdwxactdesc.c
> >> \ No newline at end of file
> >> diff --git a/src/bin/pg_waldump/fdwxactdesc.c b/src/bin/pg_waldump/fdwxactdesc.c
> >> new file mode 100644
> >> index 0000000000..ce8c21880c
> >> --- /dev/null
> >> +++ b/src/bin/pg_waldump/fdwxactdesc.c
> >> @@ -0,0 +1 @@
> >> +../../../src/backend/access/rmgrdesc/fdwxactdesc.c
> >>
> >> We need to remove src/bin/pg_waldump/fdwxactdesc.c from the patch.
> >
> >
> > Again sorry! that was an oversight on my part.
> >
> >>
> >> 3.
> >> --- a/doc/src/sgml/monitoring.sgml
> >> +++ b/doc/src/sgml/monitoring.sgml
> >> @@ -1526,14 +1526,14 @@ postgres   27093  0.0  0.0  30096  2752 ?
> >>   Ss   11:34   0:00 postgres: ser
> >>           <entry><literal>SafeSnapshot</literal></entry>
> >>           <entry>Waiting for a snapshot for a <literal>READ ONLY
> >> DEFERRABLE</literal> transaction.</entry>
> >>          </row>
> >> -        <row>
> >> -         <entry><literal>SyncRep</literal></entry>
> >> -         <entry>Waiting for confirmation from remote server during
> >> synchronous replication.</entry>
> >> -        </row>
> >>          <row>
> >>           <entry><literal>FdwXactResolution</literal></entry>
> >>           <entry>Waiting for all foreign transaction participants to
> >> be resolved during atomic commit among foreign servers.</entry>
> >>          </row>
> >> +        <row>
> >> +         <entry><literal>SyncRep</literal></entry>
> >> +         <entry>Waiting for confirmation from remote server during
> >> synchronous replication.</entry>
> >> +        </row>
> >>          <row>
> >>           <entry morerows="4"><literal>Timeout</literal></entry>
> >>           <entry><literal>BaseBackupThrottle</literal></entry>
> >>
> >> We need to move the entry of FdwXactResolution to right before
> >> Hash/Batch/Allocating for alphabetical order.
> >
> >
> > Agreed!
> >>
> >>
> >> I've incorporated your changes I agreed with to my local branch and
> >> will incorporate other changes after discussion. I'll also do more
> >> test and self-review and will submit the latest version patch.
> >>
> >
> > Meanwhile, I found a couple of more small issues, One is the break statement missing
> > i n pgstat_get_wait_ipc() and secondly fdwxact_relaunch_resolvers()
> > could return un-initialized value.
> > I am attaching a small patch for these changes that can be applied on top of existing
> > patches.
>
> Thank you for the patch!
>
> I'm updating the patches because current behavior in error case would
> not be good. For example, when an error occurs in the prepare phase,
> prepared transactions are left as in-doubt transaction. And these
> transactions are not handled by the resolver process. That means that
> a user could need to resolve these transactions manually every abort
> time, which is not good. In abort case, I think that prepared
> transactions can be resolved by the backend itself, rather than
> leaving them for the resolver. I'll submit the updated patch.
>

I've attached the latest version patch set which includes some changes
from the previous version:

* I've added regression tests that test all types of FDW
implementations. There are three types of FDW: FDW doesn't support any
transaction APIs, FDW supports only commit and rollback APIs and FDW
supports all (prepare, commit and rollback) APISs.
src/test/module/test_fdwxact contains those FDW implementations for
tests, and test some cases where a transaction reads/writes data on
various types of foreign servers.
* Also test_fdwxact has TAP tests that check failure cases. The test
FDW implementation has the ability to inject error or panic into
prepare or commit phase. Using it the TAP test checks if distributed
transactions can be committed or rolled back even in failure cases.
* When foreign_twophase_commit = 'required', the transaction commit
fails if the transaction modified data on even one server not
supporting prepare API. Previously, we used to ignore servers that
don't support any transaction API but we check them to strictly
require all involved foreign servers to support all transaction APIs.
* Transaction resolver process resolves in-doubt transactions automatically.
* Incorporated comments from Muhammad Usama.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi Sawada,

I have just done some review and testing of the patches and have
a couple of comments.

1- IMHO the PREPARE TRANSACTION should always use 2PC even
when the transaction has operated on a single foreign server regardless
of foreign_twophase_commit setting, and throw an error otherwise when
2PC is not available on any of the data-modified servers.

For example, consider the case

BEGIN;
INSERT INTO ft_2pc_1 VALUES(1);
PREPARE TRANSACTION 'global_x1';

Here since we are preparing the local transaction so we should also prepare
the transaction on the foreign server even if the transaction has modified only
one foreign table.

What do you think?

Also without this change, the above test case produces an assertion failure
with your patches.

2- when deciding if the two-phase commit is required or not in
FOREIGN_TWOPHASE_COMMIT_PREFER mode we should use
2PC when we have at least one server capable of doing that.

i.e

For FOREIGN_TWOPHASE_COMMIT_PREFER case in
checkForeignTwophaseCommitRequired() function I think
the condition should be

need_twophase_commit = (nserverstwophase >= 1);
instead of
need_twophase_commit = (nserverstwophase >= 2);

I am attaching a patch that I have generated on top of your V20
patches with these two modifications along with the related test case.



Best regards!
--
...
Muhammad Usama
Highgo Software (Canada/China/Pakistan) 
ADDR: 10318 WHALLEY BLVD, Surrey, BC 
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] Fix ouside scope t_ctid (ItemPointerData)
Next
From: Robert Haas
Date:
Subject: Re: Our naming of wait events is a disaster.