Re: Skip collecting decoded changes of already-aborted transactions - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Skip collecting decoded changes of already-aborted transactions |
Date | |
Msg-id | CALDaNm06VtK_9C1nLBMXNWQKtsX68--pzueHFXtdoUqfHGAX1Q@mail.gmail.com Whole thread Raw |
In response to | Skip collecting decoded changes of already-aborted transactions (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skip collecting decoded changes of already-aborted transactions
|
List | pgsql-hackers |
On Mon, 11 Nov 2024 at 23:30, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sun, Nov 10, 2024 at 11:24 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Sawada-San, here are some review comments for the patch v5-0001. > > > > Thank you for reviewing the patch! > > > ====== > > Commit message. > > > > 1. > > This commit introduces an additional check to determine if a > > transaction is already aborted by a CLOG lookup, so the logical > > decoding skips further change also when it doesn't touch system > > catalogs. > > > > ~ > > > > Is that wording backwards? Is it meant to say: > > > > This commit introduces an additional CLOG lookup check to determine if > > a transaction is already aborted, so the ... > > Fixed. > > > > > ====== > > contrib/test_decoding/sql/stats.sql > > > > 2 > > +SELECT slot_name, spill_txns = 0 AS spill_txn, spill_count = 0 AS > > spill_count FROM pg_stat_replication_slots WHERE slot_name = > > 'regression_slot_stats4_twophase'; > > > > Why do the SELECT "= 0" like this, instead of just having zeros in the > > "expected" results? > > Indeed. I used "=0" like other queries in the same file do, but it > makes sense to me just to have zeros in the expected file. That way, > it would make it a bit easier to investigate in case of failures. > > > > > ====== > > .../replication/logical/reorderbuffer.c > > > > 3. > > static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, > > - bool txn_prepared); > > + bool txn_prepared, bool mark_streamed); > > > > That last parameter name ('mark_streamed') does not match the same > > parameter name in this function's definition. > > Fixed. > > > > > ~~~ > > > > ReorderBufferTruncateTXN: > > > > 4. > > if (txn_streaming && (!txn_prepared) && > > (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > txn->txn_flags |= RBTXN_IS_STREAMED; > > > > if (txn_prepared) > > { > > ~ > > > > Since the following condition was already "if (txn_prepared)" would it > > be better remove the "(!txn_prepared)" here and instead just refactor > > the code like: > > > > if (txn_prepared) > > { > > ... > > } > > else if (txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > { > > ... > > } > > Good idea. > > > > > ~~~ > > > > ReorderBufferProcessTXN: > > > > 5. > > + > > + /* Remember the transaction is aborted */ > > + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); > > + curtxn->txn_flags |= RBTXN_IS_ABORTED; > > > > Missing period on comment. > > Fixed. > > > > > ~~~ > > > > ReorderBufferCheckTXNAbort: > > > > 6. > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > + * check the transaction status, so the caller always processes this > > + * transaction. This is to disable this check for regression tests. > > + */ > > +static bool > > +ReorderBufferCheckTXNAbort(ReorderBuffer *rb, ReorderBufferTXN *txn) > > +{ > > + /* > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > + * check the transaction status, so the caller always processes this > > + * transaction. > > + */ > > + if (unlikely(debug_logical_replication_streaming == > > DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)) > > + return false; > > + > > > > The wording of the sentence "This is to disable..." seemed a bit > > confusing. Maybe this area can be simplified by doing the following. > > > > 6a. > > Change the function comment to say more like below: > > > > When the GUC 'debug_logical_replication_streaming' is set to > > "immediate", we don't check the transaction status, meaning the caller > > will always process this transaction. This mode is used by regression > > tests to avoid unnecessary transaction status checking. > > > > ~ > > > > 6b. > > It is not necessary for this 2nd comment to repeat everything that was > > already said in the function comment. A simpler comment here might be > > all you need: > > > > SUGGESTION: > > Quick return for regression tests. > > Agreed with the above two comments. Fixed. > > > > > ~~~ > > > > 7. > > Is it worth mentioning about this skipping of the transaction status > > check in the docs for this GUC? [1] > > If we want to mention this optimization in the docs, we have to > explain how the optimization works too. I think it's too detailed. > > I've attached the updated patch. Few minor suggestions: 1) Can we use rbtxn_is_committed here? + /* Remember the transaction is aborted. */ + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); + curtxn->txn_flags |= RBTXN_IS_ABORTED; 2) Similarly here too: + /* + * Mark the transaction as aborted so we ignore future changes of this + * transaction. + */ + Assert((txn->txn_flags & RBTXN_IS_COMMITTED) == 0); + txn->txn_flags |= RBTXN_IS_ABORTED; 3) Can we use rbtxn_is_aborted here? + /* + * Remember the transaction is committed so that we can skip CLOG + * check next time, avoiding the pressure on CLOG lookup. + */ + Assert((txn->txn_flags & RBTXN_IS_ABORTED) == 0); Regards, Vignesh
pgsql-hackers by date: