Thread: Logical Replication vs. 2PC

Logical Replication vs. 2PC

From
Amit Kapila
Date:
While reviewing/testing subscriber-side work for $SUBJECT [1], I
noticed a problem that seems to need a broader discussion, so started
this thread. We can get prepare for the same GID more than once for
the cases where we have defined multiple subscriptions for
publications on the same server and prepared transaction has
operations on tables subscribed to those subscriptions. For such
cases, one of the prepare will be successful and others will fail in
which case the server will send them again. Once the commit prepared
is done for the first one, the next prepare will be successful. Now,
this is not ideal but will work.

However, if the user has setup synchronous_standby_names for all the
subscriptions then we won't be able to proceed because the prepare on
publisher will wait for all the subscriptions to ack and the
subscriptions are waiting for the first prepare to finish. See an
example below for such a situation. I think this can also happen if we
get any key violation while applying the changes on the subscriber,
but for that, we can ask the user to remove the violating key on the
subscriber as that is what we suggest now also for commits. Similarly,
say the user has already prepared the transaction with the same GID on
subscriber-node, then also we can get into a similar situation but for
that, we can ask the user to commit such a GID.

We can think of appending some unique identifier (like subid) with GID
but that won't work for cascaded standby setup (where the prepares on
subscriber will be again sent to another subscriber) as the GID can
become too long. So that might not be a good solution, maybe we can
optimize it in some way that we append only when there is a GID clash.
The other thing we could do is to ask the user to temporarily disable
the subscription and change synchronous_standby_settings on the
publisher node. Any better ideas?

Example of the above scenario, you can see this problem after applying
the patches at [1].

Publisher
=================
CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl(somedata, text) VALUES (1, 1);
INSERT INTO mytbl(somedata, text) VALUES (1, 2);
COMMIT;

CREATE PUBLICATION mypub FOR TABLE mytbl;

CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE PUBLICATION mypub1 FOR TABLE mytbl1;

Subscriber
=============
CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

CREATE SUBSCRIPTION mysub
         CONNECTION 'host=localhost port=5432 dbname=postgres'
        PUBLICATION mypub WITH(two_phase = on);

CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text
varchar(120)); CREATE SUBSCRIPTION mysub1
         CONNECTION 'host=localhost port=5432 dbname=postgres'
        PUBLICATION mypub1 WITH(two_phase = on);

Now, set synchronous_standby_names = 'FIRST 2 (mysub, mysub1)' on the
publisher in postgresql.conf and restart both publisher and
subscriber, actually restart is not required as
synchronous_standby_names is a SIGHUP parameter.

Publisher
=============
BEGIN;
Insert into mytbl values(17,1,18);
Insert into mytbl1 values(17,1,18);
Prepare Transaction 'foo';

Now, this Prepare transaction will wait forever because on subscriber
we are getting "ERROR:  transaction identifier "foo" is already in
use" which means it is waiting for a publisher to send commit prepared
for first apply worker and publisher is waiting for both the
subscriptions to send ack. This is happening because the prepared
transaction on publisher operates on tables of both subscriptions.

In short, on the subscriber, both the apply workers (corresponding to
two subscriptions) are getting the same prepare transaction GID,
leading to an error on the subscriber and making the publisher wait
forever.

Thoughts?

[1] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
vignesh C
Date:
On Thu, Mar 18, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> While reviewing/testing subscriber-side work for $SUBJECT [1], I
> noticed a problem that seems to need a broader discussion, so started
> this thread. We can get prepare for the same GID more than once for
> the cases where we have defined multiple subscriptions for
> publications on the same server and prepared transaction has
> operations on tables subscribed to those subscriptions. For such
> cases, one of the prepare will be successful and others will fail in
> which case the server will send them again. Once the commit prepared
> is done for the first one, the next prepare will be successful. Now,
> this is not ideal but will work.
>
> However, if the user has setup synchronous_standby_names for all the
> subscriptions then we won't be able to proceed because the prepare on
> publisher will wait for all the subscriptions to ack and the
> subscriptions are waiting for the first prepare to finish. See an
> example below for such a situation. I think this can also happen if we
> get any key violation while applying the changes on the subscriber,
> but for that, we can ask the user to remove the violating key on the
> subscriber as that is what we suggest now also for commits. Similarly,
> say the user has already prepared the transaction with the same GID on
> subscriber-node, then also we can get into a similar situation but for
> that, we can ask the user to commit such a GID.
>
> We can think of appending some unique identifier (like subid) with GID
> but that won't work for cascaded standby setup (where the prepares on
> subscriber will be again sent to another subscriber) as the GID can
> become too long. So that might not be a good solution, maybe we can
> optimize it in some way that we append only when there is a GID clash.
> The other thing we could do is to ask the user to temporarily disable
> the subscription and change synchronous_standby_settings on the
> publisher node. Any better ideas?
>
> In short, on the subscriber, both the apply workers (corresponding to
> two subscriptions) are getting the same prepare transaction GID,
> leading to an error on the subscriber and making the publisher wait
> forever.
>
> Thoughts?

I see the main problem here is because the GID clashes as you have
rightly pointed out. I'm not sure if we are allowed to change the
GID's in the subscriber.
If we are allowed to change the GID's in the subscriber. Worker can do
something like: When the apply worker is applying the prepared
transaction, try to apply the prepare transaction with the GID as is.
If there is an error GID already in use, workers can try to catch that
error and change the GID to a fixed length hash key of (GID,
subscription name, node name, timestamp,etc) to generate a unique hash
key(modified GID), prepare the transaction with the generated hash
key. Store this key and the original GID for later use, this will be
required during commit prepared or in case of rollback prepared. When
applying the commit prepared or rollback prepared, change the GID with
the hash key that was used during the prepare transaction.
If we are not allowed to change the GID's in the subscriber. This
thought is in similar lines where in one of the earlier design
prepared spool files was used. Can we have some mechanism where we can
identify this scenario and store the failing prepare transaction
information, so that when the worker is restarted worker can use this
stored information to identify the failed prepare transaction, once
worker identifies that it is a failed prepare transaction then all of
this transaction can be serialized into a file and later when the
apply worker receives a commit prepared it can get the changes from
the file and apply this transaction  or discard the file in case of
rollback prepared.

Regards,
Vignesh



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Thu, Mar 18, 2021 at 5:31 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Mar 18, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > In short, on the subscriber, both the apply workers (corresponding to
> > two subscriptions) are getting the same prepare transaction GID,
> > leading to an error on the subscriber and making the publisher wait
> > forever.
> >
> > Thoughts?
>
> I see the main problem here is because the GID clashes as you have
> rightly pointed out. I'm not sure if we are allowed to change the
> GID's in the subscriber.
> If we are allowed to change the GID's in the subscriber. Worker can do
> something like: When the apply worker is applying the prepared
> transaction, try to apply the prepare transaction with the GID as is.
> If there is an error GID already in use, workers can try to catch that
> error and change the GID to a fixed length hash key of (GID,
> subscription name, node name, timestamp,etc) to generate a unique hash
> key(modified GID), prepare the transaction with the generated hash
> key. Store this key and the original GID for later use, this will be
> required during commit prepared or in case of rollback prepared. When
> applying the commit prepared or rollback prepared, change the GID with
> the hash key that was used during the prepare transaction.
>

I think it will be tricky to distinguish the clash is due to the user
has already prepared a xact with the same GID on a subscriber or it is
from one of the apply workers. For earlier cases, the user needs to
take action. You need to change both file format and WAL for this and
not sure but generating hash key for this looks a bit shaky. Now, we
might be able to make it work but how about if we always append subid
with GID for prepare and store GID and subid separately in WAL (I
think we can store additional subscriber-id information
conditionally). Then during recovery, we will use both GID and subid
for prepare but for decoding, we will only use GID. This way for
cascaded set up we can always send GID by reading WAL and the
downstream subscriber will append its subid to GID. I know this is
also not that straight-forward but I don't have any better ideas at
the moment.

> If we are not allowed to change the GID's in the subscriber. This
> thought is in similar lines where in one of the earlier design
> prepared spool files was used. Can we have some mechanism where we can
> identify this scenario and store the failing prepare transaction
> information, so that when the worker is restarted worker can use this
> stored information to identify the failed prepare transaction, once
> worker identifies that it is a failed prepare transaction then all of
> this transaction can be serialized into a file and later when the
> apply worker receives a commit prepared it can get the changes from
> the file and apply this transaction  or discard the file in case of
> rollback prepared.
>

Hmm, this idea will face similar problems as described here [1].

Note: added Petr Jelinek to see if he has any opinion on this matter.

[1] - https://www.postgresql.org/message-id/CAA4eK1LVEdPYnjdajYzu3k6KEii1%2BF0jdQ6sWnYugiHcSGZD6Q%40mail.gmail.com


-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
Ajin Cherian
Date:


On Thu, Mar 18, 2021 at 8:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:


However, if the user has setup synchronous_standby_names for all the
subscriptions then we won't be able to proceed because the prepare on
publisher will wait for all the subscriptions to ack and the
subscriptions are waiting for the first prepare to finish. 

But is it a valid use case to have two synchronous standbys which are two subscriptions that are on the same server both with 2pc enabled?
If the purpose of synchronous standby is for durability to prevent data loss, then why split your tables across 2 subscriptions which are on the same server?
Maybe it could be documented warning users from having such a setup. Do we really want to create a solution for an impractical scenario?

regards,
Ajin Cherian
Fujitsu Australia

Re: Logical Replication vs. 2PC

From
Markus Wanner
Date:
On 18.03.21 10:45, Amit Kapila wrote:
> While reviewing/testing subscriber-side work for $SUBJECT [1], I
> noticed a problem that seems to need a broader discussion, so started
> this thread. We can get prepare for the same GID more than once for
> the cases where we have defined multiple subscriptions for
> publications on the same server and prepared transaction has
> operations on tables subscribed to those subscriptions. For such
> cases, one of the prepare will be successful and others will fail in
> which case the server will send them again. Once the commit prepared
> is done for the first one, the next prepare will be successful. Now,
> this is not ideal but will work.

That's assuming you're using the same gid on the subscriber, which does 
not apply to all use cases.  It clearly depends on what you try to 
achieve by decoding in two phases, obviously.

We clearly don't have this issue in BDR, because we're using xids 
(together with a node id) to globally identify transactions and 
construct local (per-node) gids that don't clash.

(Things get even more interesting if you take into account that users 
may reuse the same gid for different transactions.  Lag between 
subscriptions could then lead to blocking between different origin 
transactions...)

Regards

Markus



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:
>
> On 18.03.21 10:45, Amit Kapila wrote:
> > While reviewing/testing subscriber-side work for $SUBJECT [1], I
> > noticed a problem that seems to need a broader discussion, so started
> > this thread. We can get prepare for the same GID more than once for
> > the cases where we have defined multiple subscriptions for
> > publications on the same server and prepared transaction has
> > operations on tables subscribed to those subscriptions. For such
> > cases, one of the prepare will be successful and others will fail in
> > which case the server will send them again. Once the commit prepared
> > is done for the first one, the next prepare will be successful. Now,
> > this is not ideal but will work.
>
> That's assuming you're using the same gid on the subscriber, which does
> not apply to all use cases.  It clearly depends on what you try to
> achieve by decoding in two phases, obviously.
>
> We clearly don't have this issue in BDR, because we're using xids
> (together with a node id) to globally identify transactions and
> construct local (per-node) gids that don't clash.
>

So, I think you are using xid of publisher and origin_id of
subscription to achieve uniqueness because both will be accessible in
prepare and commit prepared. Right? If so, I think that will work out
here as well. But if we think to use xid generated on subscriber then
we need to keep some mapping of original GID sent by publisher and GID
generated by us (origin+xid of subscription) because, at commit
prepared time, we won't know that xid.

> (Things get even more interesting if you take into account that users
> may reuse the same gid for different transactions.
>

Are you saying that users might use the same GID which we have
constructed internally (say by combining origin and xid: originid_xid)
and then there will be conflict while replaying such transactions?


>  Lag between
> subscriptions could then lead to blocking between different origin
> transactions...)
>

Right and even for one subscription that can lead to blocking
transactions. But isn't it similar to what we get for a primary key
violation while replaying transactions? In that case, we suggest users
remove conflicting rows, so in such cases, we can recommend users to
commit/rollback such prepared xacts?

-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
Markus Wanner
Date:
On 20.03.21 03:17, Amit Kapila wrote:
> Are you saying that users might use the same GID which we have
> constructed internally (say by combining origin and xid: originid_xid)
> and then there will be conflict while replaying such transactions?

No, I was pondering about a user doing (in short sequence):

..
PREPARE TRANSACTION 'foobar';
COMMIT PREPARED 'foobar';

BEGIN;
...
PREPARE TRANSACTION 'foobar';
COMMIT PREPARED 'foobar';

> Right and even for one subscription that can lead to blocking
> transactions. But isn't it similar to what we get for a primary key
> violation while replaying transactions?

Sure, it's a conflict that prevents application.  A primary key conflict 
may be different in that it does not eventually resolve, though.

> In that case, we suggest users
> remove conflicting rows, so in such cases, we can recommend users to
> commit/rollback such prepared xacts?

Right, if you use gids, you could ask the user to always provide unique 
identifiers and not reuse them on any other node.  That's putting the 
burden of coming up with unique identifiers on the user, but that's a 
perfectly fine and reasonable thing to do.  (Lots of other systems out 
there requiring a unique request id or such, which would get confused if 
you issue requests with duplicate ids.)

Regards

Markus



Re: Logical Replication vs. 2PC

From
Dilip Kumar
Date:
On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner
> <markus.wanner@enterprisedb.com> wrote:

> So, I think you are using xid of publisher and origin_id of
> subscription to achieve uniqueness because both will be accessible in
> prepare and commit prepared. Right? If so, I think that will work out
> here as well. But if we think to use xid generated on subscriber then
> we need to keep some mapping of original GID sent by publisher and GID
> generated by us (origin+xid of subscription) because, at commit
> prepared time, we won't know that xid.

I agree that if we use (publisher's xid + subscriber origin id)
instead of GID,  we can resolve this deadlock issue.  I was also
thinking that is it okay to change the prepared transaction name on
the subscriber? I mean instead of GID if we use some other name then
imagine a case where a user has prepared some transaction on the
publisher and then tries to commit that on the subscriber using the
prepared transaction name, then it will not work.  But maybe this is
not really a practical use case.  I mean why anyone would want to
prepare a transaction on the publisher and commit that prepared
transaction directly on the subscriber.  Thoughts?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Sat, Mar 20, 2021 at 2:57 PM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:
>
> On 20.03.21 03:17, Amit Kapila wrote:
> > Are you saying that users might use the same GID which we have
> > constructed internally (say by combining origin and xid: originid_xid)
> > and then there will be conflict while replaying such transactions?
>
> No, I was pondering about a user doing (in short sequence):
>
> ..
> PREPARE TRANSACTION 'foobar';
> COMMIT PREPARED 'foobar';
>
> BEGIN;
> ...
> PREPARE TRANSACTION 'foobar';
> COMMIT PREPARED 'foobar';
>
> > Right and even for one subscription that can lead to blocking
> > transactions. But isn't it similar to what we get for a primary key
> > violation while replaying transactions?
>
> Sure, it's a conflict that prevents application.  A primary key conflict
> may be different in that it does not eventually resolve, though.
>
> > In that case, we suggest users
> > remove conflicting rows, so in such cases, we can recommend users to
> > commit/rollback such prepared xacts?
>
> Right, if you use gids, you could ask the user to always provide unique
> identifiers and not reuse them on any other node.  That's putting the
> burden of coming up with unique identifiers on the user, but that's a
> perfectly fine and reasonable thing to do.  (Lots of other systems out
> there requiring a unique request id or such, which would get confused if
> you issue requests with duplicate ids.)
>

Right, but I guess in our case using user-provided GID will conflict
if we use multiple subscriptions on the same node. So, it is better to
generate a unique identifier like we are discussing here, something
like (origin_id of subscription + xid of the publisher). Do you see
any problem with that?

-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Sat, Mar 20, 2021 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner
> > <markus.wanner@enterprisedb.com> wrote:
>
> > So, I think you are using xid of publisher and origin_id of
> > subscription to achieve uniqueness because both will be accessible in
> > prepare and commit prepared. Right? If so, I think that will work out
> > here as well. But if we think to use xid generated on subscriber then
> > we need to keep some mapping of original GID sent by publisher and GID
> > generated by us (origin+xid of subscription) because, at commit
> > prepared time, we won't know that xid.
>
> I agree that if we use (publisher's xid + subscriber origin id)
> instead of GID,  we can resolve this deadlock issue.
>

Yeah, the two things to keep in mind with this solution as well are
(a) still it is possible that conflict can be generated if the user
has prepared the transaction with that name of subscriber, the chances
of which are bleak and the user can always commit/rollback the
conflicting GID; (b) the subscription has two publications at
different nodes and then there is some chance that both send the same
xid, again the chances of this are bleak.

I think even though in the above kind of cases there is a chance of
conflict but it won't be a deadlock kind of situation. So, I guess it
is better to do this solution, what do you think?

>  I was also
> thinking that is it okay to change the prepared transaction name on
> the subscriber? I mean instead of GID if we use some other name then
> imagine a case where a user has prepared some transaction on the
> publisher and then tries to commit that on the subscriber using the
> prepared transaction name, then it will not work.  But maybe this is
> not really a practical use case.  I mean why anyone would want to
> prepare a transaction on the publisher and commit that prepared
> transaction directly on the subscriber.
>

It is not clear to me either if for such a purpose we need to use the
same GID as provided by the publisher. I don't know if there is any
such use case but if there is one, maybe later we can provide an
option with a subscription to use GID provided by the publisher when
two_phase is enabled?

-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Sat, Mar 20, 2021 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner
> > > <markus.wanner@enterprisedb.com> wrote:
> >
> > > So, I think you are using xid of publisher and origin_id of
> > > subscription to achieve uniqueness because both will be accessible in
> > > prepare and commit prepared. Right? If so, I think that will work out
> > > here as well. But if we think to use xid generated on subscriber then
> > > we need to keep some mapping of original GID sent by publisher and GID
> > > generated by us (origin+xid of subscription) because, at commit
> > > prepared time, we won't know that xid.
> >
> > I agree that if we use (publisher's xid + subscriber origin id)
> > instead of GID,  we can resolve this deadlock issue.
> >
>
> Yeah, the two things to keep in mind with this solution as well are
> (a) still it is possible that conflict can be generated if the user
> has prepared the transaction with that name of subscriber, the chances
> of which are bleak and the user can always commit/rollback the
> conflicting GID; (b) the subscription has two publications at
> different nodes and then there is some chance that both send the same
> xid, again the chances of this are bleak.
>
> I think even though in the above kind of cases there is a chance of
> conflict but it won't be a deadlock kind of situation. So, I guess it
> is better to do this solution, what do you think?
>

I have enhanced the patch for 2PC implementation on the
subscriber-side as per the solution discussed here [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1KvXA34S24My1qnRhOn%2Bw30b2FdGNNzqh1pm0ENveGJJw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Logical Replication vs. 2PC

From
Markus Wanner
Date:
On 20.03.21 16:14, Amit Kapila wrote:
> Right, but I guess in our case using user-provided GID will conflict
> if we use multiple subscriptions on the same node. So, it is better to
> generate a unique identifier like we are discussing here, something
> like (origin_id of subscription + xid of the publisher). Do you see
> any problem with that?

No, quite the opposite: I'm the one advocating the use of xids to 
identify transactions.  See my patch for filter_prepare.

Regards

Markus



Re: Logical Replication vs. 2PC

From
Amit Kapila
Date:
On Sun, Mar 21, 2021 at 2:47 PM Markus Wanner
<markus.wanner@enterprisedb.com> wrote:
>
> On 20.03.21 16:14, Amit Kapila wrote:
> > Right, but I guess in our case using user-provided GID will conflict
> > if we use multiple subscriptions on the same node. So, it is better to
> > generate a unique identifier like we are discussing here, something
> > like (origin_id of subscription + xid of the publisher). Do you see
> > any problem with that?
>
> No, quite the opposite: I'm the one advocating the use of xids to
> identify transactions.
>

Okay.

>  See my patch for filter_prepare.
>

I'll think once again from this angle and respond on that thread,
probably one use case could be for the plugins which use xid to
generate GID. In such cases, xid might be required to filter the
transaction.

-- 
With Regards,
Amit Kapila.



RE: Logical Replication vs. 2PC

From
"tanghy.fnst@fujitsu.com"
Date:
On Sunday, March 21, 2021 4:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

>I have enhanced the patch for 2PC implementation on the
>subscriber-side as per the solution discussed here [1].

FYI.
I did the confirmation for the solution of unique GID problem raised at [1].
This problem in V61-patches at [2] is fixed in the latest V66-patches at [3].

B.T.W. NG log at V61-patches is attached, please take it as your reference.
       Test step is just the same as Amit said at [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAA4eK1JPEoYAkggmLqbdD%2BcF%3DkWNpLkZb_wJ8eqj0QD2AjBTBA%40mail.gmail.com

Regards,
Tang


Attachment