Thread: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
Commit 19890a064 changed pg_create_logical_replication_slot() to allow decoding of two-phase transactions, but did not extend the CREATE_REPLICATION_SLOT command to support it. Strangely, it does extend the CreateReplicationSlotCmd struct to add a "two_phase" field, but doesn't set it anywhere. There were patches[1] from around the time of the commit to support CREATE_REPLICATION_SLOT as well. Is there a reason to support two-phase decoding, but not with the replication protocol? If so, why change the CreateReplicationSlotCmd structure as though we will support it? Regards, Jeff Davis [1] https://www.postgresql.org/message-id/CAFPTHDZ2rigOf0oM0OBhv1yRmyMO5-SQfT9FCLYj-Jp9ShXB3A@mail.gmail.com
On Thu, Jun 3, 2021 at 4:48 AM Jeff Davis <pgsql@j-davis.com> wrote: > > Commit 19890a064 changed pg_create_logical_replication_slot() to allow > decoding of two-phase transactions, but did not extend the > CREATE_REPLICATION_SLOT command to support it. Strangely, it does > extend the CreateReplicationSlotCmd struct to add a "two_phase" field, > but doesn't set it anywhere. > > There were patches[1] from around the time of the commit to support > CREATE_REPLICATION_SLOT as well. > > Is there a reason to support two-phase decoding, but not with the > replication protocol? If so, why change the CreateReplicationSlotCmd > structure as though we will support it? > The idea is to support two_phase via protocol with a subscriber-side work where we can test it as well. The code to support it via replication protocol is present in the first patch of subscriber-side work [1] which uses that code as well. Basically, we don't have a good way to test it without subscriber-side work so decided to postpone it till the corresponding work is done. I think we can remove the change in CreateReplicationSlotCmd, that is a leftover. If we have to support it via protocol, then at the minimum, we need to enhance pg_recvlogical so that the same can be tested. [1] - https://www.postgresql.org/message-id/CAHut%2BPt7wnctZpfhaLyuPA0mXDAtuw7DsMUfw3TePJLxqTArjA%40mail.gmail.com -- With Regards, Amit Kapila.
On Thu, 2021-06-03 at 09:29 +0530, Amit Kapila wrote: > The idea is to support two_phase via protocol with a subscriber-side > work where we can test it as well. The code to support it via > replication protocol is present in the first patch of subscriber-side > work [1] which uses that code as well. Basically, we don't have a > good > way to test it without subscriber-side work so decided to postpone it > till the corresponding work is done. Thank you for clarifying. Right now, it feels a bit incomplete. If it's not much work, I recommend breaking out the CREATE_REPLICATION_SLOT changes and updating pg_recvlogical, so that it can go in v14 (and pg_create_logical_replication_slot() will match CREATE_REPLICATION_SLOT). But if that's complicated or controversial, then I'm fine waiting for the other work to complete. Regards, Jeff Davis
On Thu, Jun 3, 2021 at 10:08 PM Jeff Davis <pgsql@j-davis.com> wrote: > > On Thu, 2021-06-03 at 09:29 +0530, Amit Kapila wrote: > > The idea is to support two_phase via protocol with a subscriber-side > > work where we can test it as well. The code to support it via > > replication protocol is present in the first patch of subscriber-side > > work [1] which uses that code as well. Basically, we don't have a > > good > > way to test it without subscriber-side work so decided to postpone it > > till the corresponding work is done. > > Thank you for clarifying. > > Right now, it feels a bit incomplete. If it's not much work, I > recommend breaking out the CREATE_REPLICATION_SLOT changes and updating > pg_recvlogical, so that it can go in v14 (and > pg_create_logical_replication_slot() will match > CREATE_REPLICATION_SLOT). But if that's complicated or controversial, > then I'm fine waiting for the other work to complete. > I think we can try but not sure if we can get it by then. So, here is my suggestion: a. remove the change in CreateReplicationSlotCmd b. prepare the patches for protocol change and pg_recvlogical. This will anyway include the change we removed as part of (a). Does that sound reasonable? -- With Regards, Amit Kapila.
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Fri, Jun 4, 2021 at 1:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > I think we can try but not sure if we can get it by then. So, here is > my suggestion: > a. remove the change in CreateReplicationSlotCmd > b. prepare the patches for protocol change and pg_recvlogical. This > will anyway include the change we removed as part of (a). Attaching two patches: 1. Removes two-phase from CreateReplicationSlotCmd 2. Adds two-phase option in CREATE_REPLICATION_SLOT command. I will send a patch to update pg_recvlogical next week. regards, Ajin Cherian Fujitsu Australia
Attachment
On Fri, 2021-06-04 at 08:36 +0530, Amit Kapila wrote: > I think we can try but not sure if we can get it by then. So, here is > my suggestion: > a. remove the change in CreateReplicationSlotCmd > b. prepare the patches for protocol change and pg_recvlogical. This > will anyway include the change we removed as part of (a). Yes, sounds good. Regards, Jeff Davis
On Fri, Jun 4, 2021 at 2:29 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Fri, Jun 4, 2021 at 1:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > I think we can try but not sure if we can get it by then. So, here is > > my suggestion: > > a. remove the change in CreateReplicationSlotCmd > > b. prepare the patches for protocol change and pg_recvlogical. This > > will anyway include the change we removed as part of (a). > > > Attaching two patches: > 1. Removes two-phase from CreateReplicationSlotCmd > Pushed the above patch. -- With Regards, Amit Kapila.
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Mon, Jun 7, 2021 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Pushed the above patch. Here's an updated patchset that adds back in the option for two-phase in CREATE_REPLICATION_SLOT command and a second patch that adds support for two-phase decoding in pg_recvlogical. regards, Ajin Cherian
Attachment
On Tue, 2021-06-08 at 17:41 +1000, Ajin Cherian wrote: > Here's an updated patchset that adds back in the option for two-phase > in CREATE_REPLICATION_SLOT command and a second patch that adds > support for > two-phase decoding in pg_recvlogical. A few things: * I suggest putting the TWO_PHASE keyword after the LOGICAL keyword * Document the TWO_PHASE keyword in doc/src/sgml/protocol.sgml * Cross check that --two-phase is specified only if --create-slot is specified * Maybe an Assert(!(two_phase && is_physical)) in CreateReplicationSlot()? Other than that, it looks good, and it works as I expect it to. Regards, Jeff Davis
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Wed, Jun 9, 2021 at 6:23 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Tue, 2021-06-08 at 17:41 +1000, Ajin Cherian wrote: > > Here's an updated patchset that adds back in the option for two-phase > > in CREATE_REPLICATION_SLOT command and a second patch that adds > > support for > > two-phase decoding in pg_recvlogical. > > A few things: > > * I suggest putting the TWO_PHASE keyword after the LOGICAL keyword > * Document the TWO_PHASE keyword in doc/src/sgml/protocol.sgml > * Cross check that --two-phase is specified only if --create-slot is > specified > * Maybe an Assert(!(two_phase && is_physical)) in > CreateReplicationSlot()? > > Other than that, it looks good, and it works as I expect it to. Updated. Do have a look. thanks, Ajin Cherian Fujitsu Australia
Attachment
On Wed, Jun 9, 2021 at 1:53 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Tue, 2021-06-08 at 17:41 +1000, Ajin Cherian wrote: > > Here's an updated patchset that adds back in the option for two-phase > > in CREATE_REPLICATION_SLOT command and a second patch that adds > > support for > > two-phase decoding in pg_recvlogical. > > A few things: > > * I suggest putting the TWO_PHASE keyword after the LOGICAL keyword > Isn't it better to add it after LOGICAL IDENT? In docs [1], we expect that way. Also, see code in libpqrcv_create_slot where we expect them to be together but we can change that if you still prefer to add it after LOGICAL. BTW, can't we consider it to be part of create_slot_opt_list? Also, it might be good if we can add a test in src/bin/pg_basebackup/t/030_pg_recvlogical [1] - https://www.postgresql.org/docs/devel/logicaldecoding-walsender.html -- With Regards, Amit Kapila.
On Wed, Jun 9, 2021 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 9, 2021 at 1:53 AM Jeff Davis <pgsql@j-davis.com> wrote: > > > > On Tue, 2021-06-08 at 17:41 +1000, Ajin Cherian wrote: > > > Here's an updated patchset that adds back in the option for two-phase > > > in CREATE_REPLICATION_SLOT command and a second patch that adds > > > support for > > > two-phase decoding in pg_recvlogical. > > > > A few things: > > > > * I suggest putting the TWO_PHASE keyword after the LOGICAL keyword > > > > Isn't it better to add it after LOGICAL IDENT? In docs [1], we expect > that way. Also, see code in libpqrcv_create_slot where we expect them > to be together but we can change that if you still prefer to add it > after LOGICAL. BTW, can't we consider it to be part of > create_slot_opt_list? > > Also, it might be good if we can add a test in > src/bin/pg_basebackup/t/030_pg_recvlogical > Some more points: 1. pg_recvlogical can only send two_phase option if (PQserverVersion(conn) >= 140000), otherwise, it won't work for older versions of the server. 2. In the main patch [1], we do send two_phase option even during START_REPLICATION for the very first time when the two_phase can be enabled. There are reasons as described in the worker.c why we can't enable it during CREATE_REPLICATION_SLOT. Now, if we want to do protocol changes, I wonder why only do some changes and leave the rest for the next version? [1] - https://www.postgresql.org/message-id/CAHut%2BPt7wnctZpfhaLyuPA0mXDAtuw7DsMUfw3TePJLxqTArjA%40mail.gmail.com -- With Regards, Amit Kapila.
On Wed, 2021-06-09 at 16:50 +0530, Amit Kapila wrote: > BTW, can't we consider it to be part of > create_slot_opt_list? Yes, that would be better. It looks like the physical and logical slot options are mixed together -- should they be separated in the grammar so that using an option with the wrong kind of slot would be a parse error? Regards, Jeff Davis
On Wed, 2021-06-09 at 17:27 +0530, Amit Kapila wrote: > 2. In the main patch [1], we do send two_phase option even during > START_REPLICATION for the very first time when the two_phase can be > enabled. There are reasons as described in the worker.c why we can't > enable it during CREATE_REPLICATION_SLOT. I'll have to catch up on the thread to digest that reasoning and how it applies to decoding vs. replication. But there don't seem to be changes to START_REPLICATION for twophase, so I don't quite follow your point. Are you saying that we should not be able to create slots with twophase at all until the rest of the changes go in? > Now, if we want to do > protocol changes, I wonder why only do some changes and leave the > rest > for the next version? I started this thread because it's possible to create a slot a certain way using the SQL function create_logical_replication_slot(), but it's impossible over the replication protocol. That seems inconsistent to me. Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > On Wed, 2021-06-09 at 16:50 +0530, Amit Kapila wrote: >> BTW, can't we consider it to be part of >> create_slot_opt_list? > Yes, that would be better. It looks like the physical and logical slot > options are mixed together -- should they be separated in the grammar > so that using an option with the wrong kind of slot would be a parse > error? That sort of parse error is usually pretty unfriendly to users who may not quite remember which options are for what; all they'll get is "syntax error" which won't illuminate anything. I'd rather let the grammar accept both, and throw an appropriate error further downstream. regards, tom lane
On Thu, Jun 10, 2021 at 4:13 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Wed, 2021-06-09 at 17:27 +0530, Amit Kapila wrote: > > 2. In the main patch [1], we do send two_phase option even during > > START_REPLICATION for the very first time when the two_phase can be > > enabled. There are reasons as described in the worker.c why we can't > > enable it during CREATE_REPLICATION_SLOT. > > I'll have to catch up on the thread to digest that reasoning and how it > applies to decoding vs. replication. But there don't seem to be changes > to START_REPLICATION for twophase, so I don't quite follow your point. > I think it is because we pass there as an option as I have suggested doing in the case of CREATE_REPLICATION_SLOT. > Are you saying that we should not be able to create slots with twophase > at all until the rest of the changes go in? > No, the slots will be created but two_phase option will be enabled only after the initial tablesync is complete. > > Now, if we want to do > > protocol changes, I wonder why only do some changes and leave the > > rest > > for the next version? > > I started this thread because it's possible to create a slot a certain > way using the SQL function create_logical_replication_slot(), but it's > impossible over the replication protocol. That seems inconsistent to > me. > Right, I understand that but on the protocol side, there are few more things to be considered to allow subscribers to enable two_phase. However, maybe, for now, we can do it just for create_replication_slot and the start_replication stuff required for subscribers can be done later. I was not completely sure if that is a good idea. -- With Regards, Amit Kapila.
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Wed, Jun 9, 2021 at 9:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 9, 2021 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Jun 9, 2021 at 1:53 AM Jeff Davis <pgsql@j-davis.com> wrote: > > > > > > On Tue, 2021-06-08 at 17:41 +1000, Ajin Cherian wrote: > > > > Here's an updated patchset that adds back in the option for two-phase > > > > in CREATE_REPLICATION_SLOT command and a second patch that adds > > > > support for > > > > two-phase decoding in pg_recvlogical. > > > > > > A few things: > > > > > > * I suggest putting the TWO_PHASE keyword after the LOGICAL keyword > > > > > > > Isn't it better to add it after LOGICAL IDENT? In docs [1], we expect > > that way. Also, see code in libpqrcv_create_slot where we expect them > > to be together but we can change that if you still prefer to add it > > after LOGICAL. BTW, can't we consider it to be part of > > create_slot_opt_list? Changed accordingly. > Some more points: > 1. pg_recvlogical can only send two_phase option if > (PQserverVersion(conn) >= 140000), otherwise, it won't work for older > versions of the server. Updated accordingly. I've also modified the pg_recvlogical test case with the new option. regards, Ajin Cherian
Attachment
On Thu, Jun 10, 2021 at 2:04 PM Ajin Cherian <itsajin@gmail.com> wrote: > The new patches look mostly good apart from the below cosmetic issues. I think the question is whether we want to do these for PG-14 or postpone them till PG-15. I think these don't appear to be risky changes so we can get them in PG-14 as that might help some outside core solutions as appears to be the case for Jeff. The changes related to start_replication are too specific to the subscriber-side solution so we can postpone those along with the subscriber-side 2PC work. Jeff, Ajin, what do you think? Also, I can take care of the below cosmetic issues before committing if we decide to do this for PG-14. Few cosmetic issues: ================== 1. git diff --check shows src/bin/pg_basebackup/t/030_pg_recvlogical.pl:109: new blank line at EOF. 2. + <para> The following example shows SQL interface that can be used to decode prepared transactions. Before you use two-phase commit commands, you must set Spurious line addition. 3. /* Build query */ appendPQExpBuffer(query, "CREATE_REPLICATION_SLOT \"%s\"", slot_name); if (is_temporary) appendPQExpBufferStr(query, " TEMPORARY"); + if (is_physical) Spurious line addition. 4. appendPQExpBuffer(query, " LOGICAL \"%s\"", plugin); + if (two_phase && PQserverVersion(conn) >= 140000) + appendPQExpBufferStr(query, " TWO_PHASE"); + if (PQserverVersion(conn) >= 100000) /* pg_recvlogical doesn't use an exported snapshot, so suppress */ appendPQExpBufferStr(query, " NOEXPORT_SNAPSHOT"); I think it might be better to append TWO_PHASE after NOEXPORT_SNAPSHOT but it doesn't matter much. 5. +$node->safe_psql('postgres', + "BEGIN;INSERT INTO test_table values (11); PREPARE TRANSACTION 'test'"); There is no space after BEGIN but there is a space after INSERT. For consistency-sake, I will have space after BEGIN as well. -- With Regards, Amit Kapila.
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Fri, Jun 11, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jun 10, 2021 at 2:04 PM Ajin Cherian <itsajin@gmail.com> wrote: > > > > The new patches look mostly good apart from the below cosmetic issues. > I think the question is whether we want to do these for PG-14 or > postpone them till PG-15. I think these don't appear to be risky > changes so we can get them in PG-14 as that might help some outside > core solutions as appears to be the case for Jeff. The changes related > to start_replication are too specific to the subscriber-side solution > so we can postpone those along with the subscriber-side 2PC work. > Jeff, Ajin, what do you think? > Since we are exposing two-phase decoding using the pg_create_replication_slot API, I think it is reasonable to expose CREATE_REPLICATION_SLOT as well. We can leave the subscriber side changes for PG-15. > Also, I can take care of the below cosmetic issues before committing > if we decide to do this for PG-14. Thanks, Regards, Ajin Cherian Fujitsu Australia
On Fri, 2021-06-11 at 15:43 +0530, Amit Kapila wrote: > The new patches look mostly good apart from the below cosmetic > issues. > I think the question is whether we want to do these for PG-14 or > postpone them till PG-15. I think these don't appear to be risky > changes so we can get them in PG-14 as that might help some outside > core solutions as appears to be the case for Jeff. My main interest here is that I'm working on replication protocol support in a rust library. While doing that, I've encountered a lot of rough edges (as you may have seen in my recent posts), and this patch fixes one of them. But at the same time, several small changes to the protocol spread across several releases introduces more opportunity for confusion. If we are confident this is the right direction, then v14 makes sense for consistency. But if waiting for v15 might result in a better change, then we should probably just get it into the July CF for v15. (My apologies if my opinion has drifted a bit since this thread began.) Regards, Jeff Davis
On Sat, Jun 12, 2021 at 12:56 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Fri, 2021-06-11 at 15:43 +0530, Amit Kapila wrote: > > The new patches look mostly good apart from the below cosmetic > > issues. > > I think the question is whether we want to do these for PG-14 or > > postpone them till PG-15. I think these don't appear to be risky > > changes so we can get them in PG-14 as that might help some outside > > core solutions as appears to be the case for Jeff. > > My main interest here is that I'm working on replication protocol > support in a rust library. While doing that, I've encountered a lot of > rough edges (as you may have seen in my recent posts), and this patch > fixes one of them. > > But at the same time, several small changes to the protocol spread > across several releases introduces more opportunity for confusion. > > If we are confident this is the right direction, then v14 makes sense > for consistency. But if waiting for v15 might result in a better > change, then we should probably just get it into the July CF for v15. > If that is the case, I would prefer July CF v15. The patch is almost ready, so I'll try to get it early in the July CF. Ajin, feel free to post the patch after addressing some minor comments raised by me yesterday. -- With Regards, Amit Kapila.
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Fri, Jun 11, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Also, I can take care of the below cosmetic issues before committing > if we decide to do this for PG-14. > > Few cosmetic issues: > ================== > 1. git diff --check shows > src/bin/pg_basebackup/t/030_pg_recvlogical.pl:109: new blank line at EOF. > > 2. > + > <para> > The following example shows SQL interface that can be used to decode prepared > transactions. Before you use two-phase commit commands, you must set > > Spurious line addition. > Fixed. > 3. > /* Build query */ > appendPQExpBuffer(query, "CREATE_REPLICATION_SLOT \"%s\"", slot_name); > if (is_temporary) > appendPQExpBufferStr(query, " TEMPORARY"); > + > if (is_physical) > > Spurious line addition. > Fixed. > 4. > appendPQExpBuffer(query, " LOGICAL \"%s\"", plugin); > + if (two_phase && PQserverVersion(conn) >= 140000) > + appendPQExpBufferStr(query, " TWO_PHASE"); > + > if (PQserverVersion(conn) >= 100000) > /* pg_recvlogical doesn't use an exported snapshot, so suppress */ > appendPQExpBufferStr(query, " NOEXPORT_SNAPSHOT"); > > I think it might be better to append TWO_PHASE after NOEXPORT_SNAPSHOT > but it doesn't matter much. > I haven't changed this, I like to keep it this way. > 5. > +$node->safe_psql('postgres', > + "BEGIN;INSERT INTO test_table values (11); PREPARE TRANSACTION 'test'"); > > There is no space after BEGIN but there is a space after INSERT. For > consistency-sake, I will have space after BEGIN as well. Changed this. regards, Ajin Cherian Fujitsu Australia
Attachment
Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command
From
Ajin Cherian
Date:
On Tue, Jun 15, 2021 at 5:34 PM Ajin Cherian <itsajin@gmail.com> wrote: Since we've decided to not commit this for PG-14, I've added these patches along with the larger patch-set for subscriber side 2pc in thread [1] [1] - https://www.postgresql.org/message-id/CAHut+PuJKTNRjFre0VBufWMz9BEScC__nT+PUhbSaUNW2biPow@mail.gmail.com regards, Ajin Cherian