Thread: Logical Replication vs. 2PC
While reviewing/testing subscriber-side work for $SUBJECT [1], I noticed a problem that seems to need a broader discussion, so started this thread. We can get prepare for the same GID more than once for the cases where we have defined multiple subscriptions for publications on the same server and prepared transaction has operations on tables subscribed to those subscriptions. For such cases, one of the prepare will be successful and others will fail in which case the server will send them again. Once the commit prepared is done for the first one, the next prepare will be successful. Now, this is not ideal but will work. However, if the user has setup synchronous_standby_names for all the subscriptions then we won't be able to proceed because the prepare on publisher will wait for all the subscriptions to ack and the subscriptions are waiting for the first prepare to finish. See an example below for such a situation. I think this can also happen if we get any key violation while applying the changes on the subscriber, but for that, we can ask the user to remove the violating key on the subscriber as that is what we suggest now also for commits. Similarly, say the user has already prepared the transaction with the same GID on subscriber-node, then also we can get into a similar situation but for that, we can ask the user to commit such a GID. We can think of appending some unique identifier (like subid) with GID but that won't work for cascaded standby setup (where the prepares on subscriber will be again sent to another subscriber) as the GID can become too long. So that might not be a good solution, maybe we can optimize it in some way that we append only when there is a GID clash. The other thing we could do is to ask the user to temporarily disable the subscription and change synchronous_standby_settings on the publisher node. Any better ideas? Example of the above scenario, you can see this problem after applying the patches at [1]. Publisher ================= CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); BEGIN; INSERT INTO mytbl(somedata, text) VALUES (1, 1); INSERT INTO mytbl(somedata, text) VALUES (1, 2); COMMIT; CREATE PUBLICATION mypub FOR TABLE mytbl; CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); BEGIN; INSERT INTO mytbl1(somedata, text) VALUES (1, 1); INSERT INTO mytbl1(somedata, text) VALUES (1, 2); COMMIT; CREATE PUBLICATION mypub1 FOR TABLE mytbl1; Subscriber ============= CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION mypub WITH(two_phase = on); CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION mypub1 WITH(two_phase = on); Now, set synchronous_standby_names = 'FIRST 2 (mysub, mysub1)' on the publisher in postgresql.conf and restart both publisher and subscriber, actually restart is not required as synchronous_standby_names is a SIGHUP parameter. Publisher ============= BEGIN; Insert into mytbl values(17,1,18); Insert into mytbl1 values(17,1,18); Prepare Transaction 'foo'; Now, this Prepare transaction will wait forever because on subscriber we are getting "ERROR: transaction identifier "foo" is already in use" which means it is waiting for a publisher to send commit prepared for first apply worker and publisher is waiting for both the subscriptions to send ack. This is happening because the prepared transaction on publisher operates on tables of both subscriptions. In short, on the subscriber, both the apply workers (corresponding to two subscriptions) are getting the same prepare transaction GID, leading to an error on the subscriber and making the publisher wait forever. Thoughts? [1] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com -- With Regards, Amit Kapila.
On Thu, Mar 18, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > While reviewing/testing subscriber-side work for $SUBJECT [1], I > noticed a problem that seems to need a broader discussion, so started > this thread. We can get prepare for the same GID more than once for > the cases where we have defined multiple subscriptions for > publications on the same server and prepared transaction has > operations on tables subscribed to those subscriptions. For such > cases, one of the prepare will be successful and others will fail in > which case the server will send them again. Once the commit prepared > is done for the first one, the next prepare will be successful. Now, > this is not ideal but will work. > > However, if the user has setup synchronous_standby_names for all the > subscriptions then we won't be able to proceed because the prepare on > publisher will wait for all the subscriptions to ack and the > subscriptions are waiting for the first prepare to finish. See an > example below for such a situation. I think this can also happen if we > get any key violation while applying the changes on the subscriber, > but for that, we can ask the user to remove the violating key on the > subscriber as that is what we suggest now also for commits. Similarly, > say the user has already prepared the transaction with the same GID on > subscriber-node, then also we can get into a similar situation but for > that, we can ask the user to commit such a GID. > > We can think of appending some unique identifier (like subid) with GID > but that won't work for cascaded standby setup (where the prepares on > subscriber will be again sent to another subscriber) as the GID can > become too long. So that might not be a good solution, maybe we can > optimize it in some way that we append only when there is a GID clash. > The other thing we could do is to ask the user to temporarily disable > the subscription and change synchronous_standby_settings on the > publisher node. Any better ideas? > > In short, on the subscriber, both the apply workers (corresponding to > two subscriptions) are getting the same prepare transaction GID, > leading to an error on the subscriber and making the publisher wait > forever. > > Thoughts? I see the main problem here is because the GID clashes as you have rightly pointed out. I'm not sure if we are allowed to change the GID's in the subscriber. If we are allowed to change the GID's in the subscriber. Worker can do something like: When the apply worker is applying the prepared transaction, try to apply the prepare transaction with the GID as is. If there is an error GID already in use, workers can try to catch that error and change the GID to a fixed length hash key of (GID, subscription name, node name, timestamp,etc) to generate a unique hash key(modified GID), prepare the transaction with the generated hash key. Store this key and the original GID for later use, this will be required during commit prepared or in case of rollback prepared. When applying the commit prepared or rollback prepared, change the GID with the hash key that was used during the prepare transaction. If we are not allowed to change the GID's in the subscriber. This thought is in similar lines where in one of the earlier design prepared spool files was used. Can we have some mechanism where we can identify this scenario and store the failing prepare transaction information, so that when the worker is restarted worker can use this stored information to identify the failed prepare transaction, once worker identifies that it is a failed prepare transaction then all of this transaction can be serialized into a file and later when the apply worker receives a commit prepared it can get the changes from the file and apply this transaction or discard the file in case of rollback prepared. Regards, Vignesh
On Thu, Mar 18, 2021 at 5:31 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, Mar 18, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > In short, on the subscriber, both the apply workers (corresponding to > > two subscriptions) are getting the same prepare transaction GID, > > leading to an error on the subscriber and making the publisher wait > > forever. > > > > Thoughts? > > I see the main problem here is because the GID clashes as you have > rightly pointed out. I'm not sure if we are allowed to change the > GID's in the subscriber. > If we are allowed to change the GID's in the subscriber. Worker can do > something like: When the apply worker is applying the prepared > transaction, try to apply the prepare transaction with the GID as is. > If there is an error GID already in use, workers can try to catch that > error and change the GID to a fixed length hash key of (GID, > subscription name, node name, timestamp,etc) to generate a unique hash > key(modified GID), prepare the transaction with the generated hash > key. Store this key and the original GID for later use, this will be > required during commit prepared or in case of rollback prepared. When > applying the commit prepared or rollback prepared, change the GID with > the hash key that was used during the prepare transaction. > I think it will be tricky to distinguish the clash is due to the user has already prepared a xact with the same GID on a subscriber or it is from one of the apply workers. For earlier cases, the user needs to take action. You need to change both file format and WAL for this and not sure but generating hash key for this looks a bit shaky. Now, we might be able to make it work but how about if we always append subid with GID for prepare and store GID and subid separately in WAL (I think we can store additional subscriber-id information conditionally). Then during recovery, we will use both GID and subid for prepare but for decoding, we will only use GID. This way for cascaded set up we can always send GID by reading WAL and the downstream subscriber will append its subid to GID. I know this is also not that straight-forward but I don't have any better ideas at the moment. > If we are not allowed to change the GID's in the subscriber. This > thought is in similar lines where in one of the earlier design > prepared spool files was used. Can we have some mechanism where we can > identify this scenario and store the failing prepare transaction > information, so that when the worker is restarted worker can use this > stored information to identify the failed prepare transaction, once > worker identifies that it is a failed prepare transaction then all of > this transaction can be serialized into a file and later when the > apply worker receives a commit prepared it can get the changes from > the file and apply this transaction or discard the file in case of > rollback prepared. > Hmm, this idea will face similar problems as described here [1]. Note: added Petr Jelinek to see if he has any opinion on this matter. [1] - https://www.postgresql.org/message-id/CAA4eK1LVEdPYnjdajYzu3k6KEii1%2BF0jdQ6sWnYugiHcSGZD6Q%40mail.gmail.com -- With Regards, Amit Kapila.
On Thu, Mar 18, 2021 at 8:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
However, if the user has setup synchronous_standby_names for all the
subscriptions then we won't be able to proceed because the prepare on
publisher will wait for all the subscriptions to ack and the
subscriptions are waiting for the first prepare to finish.
But is it a valid use case to have two synchronous standbys which are two subscriptions that are on the same server both with 2pc enabled?
If the purpose of synchronous standby is for durability to prevent data loss, then why split your tables across 2 subscriptions which are on the same server?
Maybe it could be documented warning users from having such a setup. Do we really want to create a solution for an impractical scenario?
regards,
Ajin Cherian
Fujitsu Australia
On 18.03.21 10:45, Amit Kapila wrote: > While reviewing/testing subscriber-side work for $SUBJECT [1], I > noticed a problem that seems to need a broader discussion, so started > this thread. We can get prepare for the same GID more than once for > the cases where we have defined multiple subscriptions for > publications on the same server and prepared transaction has > operations on tables subscribed to those subscriptions. For such > cases, one of the prepare will be successful and others will fail in > which case the server will send them again. Once the commit prepared > is done for the first one, the next prepare will be successful. Now, > this is not ideal but will work. That's assuming you're using the same gid on the subscriber, which does not apply to all use cases. It clearly depends on what you try to achieve by decoding in two phases, obviously. We clearly don't have this issue in BDR, because we're using xids (together with a node id) to globally identify transactions and construct local (per-node) gids that don't clash. (Things get even more interesting if you take into account that users may reuse the same gid for different transactions. Lag between subscriptions could then lead to blocking between different origin transactions...) Regards Markus
On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner <markus.wanner@enterprisedb.com> wrote: > > On 18.03.21 10:45, Amit Kapila wrote: > > While reviewing/testing subscriber-side work for $SUBJECT [1], I > > noticed a problem that seems to need a broader discussion, so started > > this thread. We can get prepare for the same GID more than once for > > the cases where we have defined multiple subscriptions for > > publications on the same server and prepared transaction has > > operations on tables subscribed to those subscriptions. For such > > cases, one of the prepare will be successful and others will fail in > > which case the server will send them again. Once the commit prepared > > is done for the first one, the next prepare will be successful. Now, > > this is not ideal but will work. > > That's assuming you're using the same gid on the subscriber, which does > not apply to all use cases. It clearly depends on what you try to > achieve by decoding in two phases, obviously. > > We clearly don't have this issue in BDR, because we're using xids > (together with a node id) to globally identify transactions and > construct local (per-node) gids that don't clash. > So, I think you are using xid of publisher and origin_id of subscription to achieve uniqueness because both will be accessible in prepare and commit prepared. Right? If so, I think that will work out here as well. But if we think to use xid generated on subscriber then we need to keep some mapping of original GID sent by publisher and GID generated by us (origin+xid of subscription) because, at commit prepared time, we won't know that xid. > (Things get even more interesting if you take into account that users > may reuse the same gid for different transactions. > Are you saying that users might use the same GID which we have constructed internally (say by combining origin and xid: originid_xid) and then there will be conflict while replaying such transactions? > Lag between > subscriptions could then lead to blocking between different origin > transactions...) > Right and even for one subscription that can lead to blocking transactions. But isn't it similar to what we get for a primary key violation while replaying transactions? In that case, we suggest users remove conflicting rows, so in such cases, we can recommend users to commit/rollback such prepared xacts? -- With Regards, Amit Kapila.
On 20.03.21 03:17, Amit Kapila wrote: > Are you saying that users might use the same GID which we have > constructed internally (say by combining origin and xid: originid_xid) > and then there will be conflict while replaying such transactions? No, I was pondering about a user doing (in short sequence): .. PREPARE TRANSACTION 'foobar'; COMMIT PREPARED 'foobar'; BEGIN; ... PREPARE TRANSACTION 'foobar'; COMMIT PREPARED 'foobar'; > Right and even for one subscription that can lead to blocking > transactions. But isn't it similar to what we get for a primary key > violation while replaying transactions? Sure, it's a conflict that prevents application. A primary key conflict may be different in that it does not eventually resolve, though. > In that case, we suggest users > remove conflicting rows, so in such cases, we can recommend users to > commit/rollback such prepared xacts? Right, if you use gids, you could ask the user to always provide unique identifiers and not reuse them on any other node. That's putting the burden of coming up with unique identifiers on the user, but that's a perfectly fine and reasonable thing to do. (Lots of other systems out there requiring a unique request id or such, which would get confused if you issue requests with duplicate ids.) Regards Markus
On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner > <markus.wanner@enterprisedb.com> wrote: > So, I think you are using xid of publisher and origin_id of > subscription to achieve uniqueness because both will be accessible in > prepare and commit prepared. Right? If so, I think that will work out > here as well. But if we think to use xid generated on subscriber then > we need to keep some mapping of original GID sent by publisher and GID > generated by us (origin+xid of subscription) because, at commit > prepared time, we won't know that xid. I agree that if we use (publisher's xid + subscriber origin id) instead of GID, we can resolve this deadlock issue. I was also thinking that is it okay to change the prepared transaction name on the subscriber? I mean instead of GID if we use some other name then imagine a case where a user has prepared some transaction on the publisher and then tries to commit that on the subscriber using the prepared transaction name, then it will not work. But maybe this is not really a practical use case. I mean why anyone would want to prepare a transaction on the publisher and commit that prepared transaction directly on the subscriber. Thoughts? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 20, 2021 at 2:57 PM Markus Wanner <markus.wanner@enterprisedb.com> wrote: > > On 20.03.21 03:17, Amit Kapila wrote: > > Are you saying that users might use the same GID which we have > > constructed internally (say by combining origin and xid: originid_xid) > > and then there will be conflict while replaying such transactions? > > No, I was pondering about a user doing (in short sequence): > > .. > PREPARE TRANSACTION 'foobar'; > COMMIT PREPARED 'foobar'; > > BEGIN; > ... > PREPARE TRANSACTION 'foobar'; > COMMIT PREPARED 'foobar'; > > > Right and even for one subscription that can lead to blocking > > transactions. But isn't it similar to what we get for a primary key > > violation while replaying transactions? > > Sure, it's a conflict that prevents application. A primary key conflict > may be different in that it does not eventually resolve, though. > > > In that case, we suggest users > > remove conflicting rows, so in such cases, we can recommend users to > > commit/rollback such prepared xacts? > > Right, if you use gids, you could ask the user to always provide unique > identifiers and not reuse them on any other node. That's putting the > burden of coming up with unique identifiers on the user, but that's a > perfectly fine and reasonable thing to do. (Lots of other systems out > there requiring a unique request id or such, which would get confused if > you issue requests with duplicate ids.) > Right, but I guess in our case using user-provided GID will conflict if we use multiple subscriptions on the same node. So, it is better to generate a unique identifier like we are discussing here, something like (origin_id of subscription + xid of the publisher). Do you see any problem with that? -- With Regards, Amit Kapila.
On Sat, Mar 20, 2021 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner > > <markus.wanner@enterprisedb.com> wrote: > > > So, I think you are using xid of publisher and origin_id of > > subscription to achieve uniqueness because both will be accessible in > > prepare and commit prepared. Right? If so, I think that will work out > > here as well. But if we think to use xid generated on subscriber then > > we need to keep some mapping of original GID sent by publisher and GID > > generated by us (origin+xid of subscription) because, at commit > > prepared time, we won't know that xid. > > I agree that if we use (publisher's xid + subscriber origin id) > instead of GID, we can resolve this deadlock issue. > Yeah, the two things to keep in mind with this solution as well are (a) still it is possible that conflict can be generated if the user has prepared the transaction with that name of subscriber, the chances of which are bleak and the user can always commit/rollback the conflicting GID; (b) the subscription has two publications at different nodes and then there is some chance that both send the same xid, again the chances of this are bleak. I think even though in the above kind of cases there is a chance of conflict but it won't be a deadlock kind of situation. So, I guess it is better to do this solution, what do you think? > I was also > thinking that is it okay to change the prepared transaction name on > the subscriber? I mean instead of GID if we use some other name then > imagine a case where a user has prepared some transaction on the > publisher and then tries to commit that on the subscriber using the > prepared transaction name, then it will not work. But maybe this is > not really a practical use case. I mean why anyone would want to > prepare a transaction on the publisher and commit that prepared > transaction directly on the subscriber. > It is not clear to me either if for such a purpose we need to use the same GID as provided by the publisher. I don't know if there is any such use case but if there is one, maybe later we can provide an option with a subscription to use GID provided by the publisher when two_phase is enabled? -- With Regards, Amit Kapila.
On Sat, Mar 20, 2021 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Mar 20, 2021 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sat, Mar 20, 2021 at 7:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Mar 19, 2021 at 9:22 PM Markus Wanner > > > <markus.wanner@enterprisedb.com> wrote: > > > > > So, I think you are using xid of publisher and origin_id of > > > subscription to achieve uniqueness because both will be accessible in > > > prepare and commit prepared. Right? If so, I think that will work out > > > here as well. But if we think to use xid generated on subscriber then > > > we need to keep some mapping of original GID sent by publisher and GID > > > generated by us (origin+xid of subscription) because, at commit > > > prepared time, we won't know that xid. > > > > I agree that if we use (publisher's xid + subscriber origin id) > > instead of GID, we can resolve this deadlock issue. > > > > Yeah, the two things to keep in mind with this solution as well are > (a) still it is possible that conflict can be generated if the user > has prepared the transaction with that name of subscriber, the chances > of which are bleak and the user can always commit/rollback the > conflicting GID; (b) the subscription has two publications at > different nodes and then there is some chance that both send the same > xid, again the chances of this are bleak. > > I think even though in the above kind of cases there is a chance of > conflict but it won't be a deadlock kind of situation. So, I guess it > is better to do this solution, what do you think? > I have enhanced the patch for 2PC implementation on the subscriber-side as per the solution discussed here [1]. [1] - https://www.postgresql.org/message-id/CAA4eK1KvXA34S24My1qnRhOn%2Bw30b2FdGNNzqh1pm0ENveGJJw%40mail.gmail.com -- With Regards, Amit Kapila.
On 20.03.21 16:14, Amit Kapila wrote: > Right, but I guess in our case using user-provided GID will conflict > if we use multiple subscriptions on the same node. So, it is better to > generate a unique identifier like we are discussing here, something > like (origin_id of subscription + xid of the publisher). Do you see > any problem with that? No, quite the opposite: I'm the one advocating the use of xids to identify transactions. See my patch for filter_prepare. Regards Markus
On Sun, Mar 21, 2021 at 2:47 PM Markus Wanner <markus.wanner@enterprisedb.com> wrote: > > On 20.03.21 16:14, Amit Kapila wrote: > > Right, but I guess in our case using user-provided GID will conflict > > if we use multiple subscriptions on the same node. So, it is better to > > generate a unique identifier like we are discussing here, something > > like (origin_id of subscription + xid of the publisher). Do you see > > any problem with that? > > No, quite the opposite: I'm the one advocating the use of xids to > identify transactions. > Okay. > See my patch for filter_prepare. > I'll think once again from this angle and respond on that thread, probably one use case could be for the plugins which use xid to generate GID. In such cases, xid might be required to filter the transaction. -- With Regards, Amit Kapila.
On Sunday, March 21, 2021 4:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >I have enhanced the patch for 2PC implementation on the >subscriber-side as per the solution discussed here [1]. FYI. I did the confirmation for the solution of unique GID problem raised at [1]. This problem in V61-patches at [2] is fixed in the latest V66-patches at [3]. B.T.W. NG log at V61-patches is attached, please take it as your reference. Test step is just the same as Amit said at [1]. [1] - https://www.postgresql.org/message-id/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com [2] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com [3] - https://www.postgresql.org/message-id/CAA4eK1JPEoYAkggmLqbdD%2BcF%3DkWNpLkZb_wJ8eqj0QD2AjBTBA%40mail.gmail.com Regards, Tang