Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: [HACKERS] logical decoding of two-phase transactions |
Date | |
Msg-id | CAHut+PsCTC7ebm_MaNNhGFpep=Q7VxAdPSZb0ZOUCtfG-+Jy_Q@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] logical decoding of two-phase transactions (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Mon, May 31, 2021 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, May 28, 2021 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > One minor comment for 0001. > > * Special case: if when tables were specified but copy_data is > > + * false then it is safe to enable two_phase up-front because > > + * those tables are already initially READY state. Note, if > > + * the subscription has no tables then enablement cannot be > > + * done here - we must leave the twophase state as PENDING, to > > + * allow ALTER SUBSCRIPTION ... REFRESH PUBLICATION to work. > > > > Can we slightly modify this comment as: "Note that if tables were > > specified but copy_data is false then it is safe to enable two_phase > > up-front because those tables are already initially READY state. When > > the subscription has no tables, we leave the twophase state as > > PENDING, to allow ALTER SUBSCRIPTION ... REFRESH PUBLICATION to work." > > > > Also, I don't see any test after you enable this special case. Is it > > covered by existing tests, if not then let's try to add a test for > > this? > > > > I see that Ajin's latest patch has addressed the other comments except > for the above test case suggestion. Yes, this is a known pending task. > I have again reviewed the first > patch and have some comments. > > Comments on v81-0001-Add-support-for-prepared-transactions-to-built-i > ============================================================================ > 1. > <para> > The logical replication solution that builds distributed two > phase commit > using this feature can deadlock if the prepared transaction has locked > - [user] catalog tables exclusively. They need to inform users to not have > - locks on catalog tables (via explicit <command>LOCK</command> > command) in > - such transactions. > + [user] catalog tables exclusively. To avoid this users must refrain from > + having locks on catalog tables (via explicit > <command>LOCK</command> command) > + in such transactions. > </para> > > This change doesn't belong to this patch. I see the proposed text > could be considered as an improvement but still we can do this > separately. We are already trying to improve things in this regard in > the thread [1], so you can propose this change there. > OK. This change has been removed in v82, and a patch posted to other thread here [1] > 2. > +<varlistentry> > +<term>Byte1('K')</term> > +<listitem><para> > + Identifies the message as the commit of a two-phase > transaction message. > +</para></listitem> > +</varlistentry> > + > +<varlistentry> > +<term>Int8</term> > +<listitem><para> > + Flags; currently unused (must be 0). > +</para></listitem> > +</varlistentry> > + > +<varlistentry> > +<term>Int64</term> > +<listitem><para> > + The LSN of the commit. > +</para></listitem> > +</varlistentry> > + > +<varlistentry> > +<term>Int64</term> > +<listitem><para> > + The end LSN of the commit transaction. > +</para></listitem> > +</varlistentry> > > Can we change the description of LSN's as "The LSN of the commit > prepared." and "The end LSN of the commit prepared transaction." > respectively? This will make their description different from regular > commit and I think that defines them better. > > 3. > +<varlistentry> > +<term>Int64</term> > +<listitem><para> > + The end LSN of the rollback transaction. > +</para></listitem> > +</varlistentry> > > Similar to above, can we change the description here as: "The end LSN > of the rollback prepared transaction."? > > 4. > + * The exception to this restriction is when copy_data = > + * false, because when copy_data is false the tablesync will > + * start already in READY state and will exit directly without > + * doing anything which could interfere with the apply > + * worker's message handling. > + * > + * For more details see comments atop worker.c. > + */ > + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && copy_data) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("ALTER SUBSCRIPTION ... REFRESH with copy_data is not allowed > when two_phase is enabled"), > + errhint("Use ALTER SUBSCRIPTION ... REFRESH with copy_data = false" > + ", or use DROP/CREATE SUBSCRIPTION."))); > > The above comment is a bit unclear because it seems you are saying > there is some problem even when copy_data is false. Are you missing > 'not' after 'could' in the comment? > > 5. > XXX Now, this can even lead to a deadlock if the prepare > * transaction is waiting to get it logically replicated for > - * distributed 2PC. Currently, we don't have an in-core > - * implementation of prepares for distributed 2PC but some > - * out-of-core logical replication solution can have such an > - * implementation. They need to inform users to not have locks > - * on catalog tables in such transactions. > + * distributed 2PC. This can be avoided by disallowing to > + * prepare transactions that have locked [user] catalog tables > + * exclusively. > > Can we slightly modify this part of the comment as: "This can be > avoided by disallowing to prepare transactions that have locked [user] > catalog tables exclusively but as of now we ask users not to do such > operation"? > > 6. > +AllTablesyncsReady(void) > +{ > + bool found_busy = false; > + bool started_tx = false; > + bool has_subrels = false; > + > + /* We need up-to-date sync state info for subscription tables here. */ > + has_subrels = FetchTableStates(&started_tx); > + > + found_busy = list_length(table_states_not_ready) > 0; > + > + if (started_tx) > + { > + CommitTransactionCommand(); > + pgstat_report_stat(false); > + } > + > + /* > + * When there are no tables, then return false. > + * When no tablesyncs are busy, then all are READY > + */ > + return has_subrels && !found_busy; > +} > > Do we really need found_busy variable in above function. Can't we > change the return as (has_subrels) && (table_states_not_ready != NIL)? > If so, then change the comments above return. > > 7. > +/* > + * Common code to fetch the up-to-date sync state info into the static lists. > + * > + * Returns true if subscription has 1 or more tables, else false. > + */ > +static bool > +FetchTableStates(bool *started_tx) > > Can we update comments indicating that if this function starts the > transaction then the caller is responsible to commit it? > > 8. > (errmsg("logical replication apply worker for subscription \"%s\" will > restart so two_phase can be enabled", > + MySubscription->name))); > > Can we slightly change the message as: "logical replication apply > worker for subscription \"%s\" will restart so that two_phase can be > enabled"? > > 9. > +void > +UpdateTwoPhaseState(Oid suboid, char new_state) > { > .. > + /* And update/set two_phase ENABLED */ > + values[Anum_pg_subscription_subtwophasestate - 1] = CharGetDatum(new_state); > + replaces[Anum_pg_subscription_subtwophasestate - 1] = true; > .. > } > > The above comment seems wrong to me as we are updating the state as > passed by the caller. > All the above reported issues 2-9 are addressed in the latest 2PC patch set v82 ------ [1] https://www.postgresql.org/message-id/CAHut%2BPuTjTp_WERO%3D3Ybp8snTgDpiZeNaxzZhN8ky8XMo4KFVQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: