Thread: Skip collecting decoded changes of already-aborted transactions
Hi,

In logical decoding, we don't need to collect decoded changes of
aborted transactions. While streaming changes, we can detect
concurrent abort of the (sub)transaction but there is no mechanism to
skip decoding changes of transactions that are known to already be
aborted.

With the attached WIP patch, we check CLOG when decoding the
transaction for the first time. If it's already known to be aborted,
we skip collecting decoded changes of such transactions. That way,
when the logical replication is behind or restarts, we don't need to
decode large transactions that already aborted, which helps improve
the decoding performance.

Feedback is very welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
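For readers following along, the mechanism can be sketched outside the PostgreSQL tree as a toy model. Everything here is illustrative, not the actual reorderbuffer API: the `clog_status` array stands in for a real CLOG lookup, and the struct and function names are invented. The point is simply that the status of an XID is checked once, when its first change is decoded, and changes of a known-aborted transaction are never accumulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified transaction status, standing in for a CLOG lookup. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactStatus;

#define MAX_XID 16
static XactStatus clog_status[MAX_XID];  /* stub for pg_xact */

typedef struct
{
    unsigned xid;
    bool     checked;   /* status already looked up once? */
    bool     skip;      /* known aborted: drop all its changes */
    int      nchanges;  /* changes collected so far */
} DecodedTxn;

/* Queue one decoded change; skips collection for known-aborted xacts. */
static void
queue_change(DecodedTxn *txn)
{
    if (!txn->checked)
    {
        /* First change for this xact: one status check, remembered after. */
        txn->skip = (clog_status[txn->xid] == XACT_ABORTED);
        txn->checked = true;
    }
    if (txn->skip)
        return;             /* aborted: don't accumulate anything */
    txn->nchanges++;
}
```

With this shape, a restarted decoder that replays WAL of an aborted 11TB transaction pays one status lookup instead of re-accumulating (and re-spilling) every change.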
Attachment
Hi,

On 2023-06-09 14:16:44 +0900, Masahiko Sawada wrote:
> In logical decoding, we don't need to collect decoded changes of
> aborted transactions. While streaming changes, we can detect
> concurrent abort of the (sub)transaction but there is no mechanism to
> skip decoding changes of transactions that are known to already be
> aborted. With the attached WIP patch, we check CLOG when decoding the
> transaction for the first time. If it's already known to be aborted,
> we skip collecting decoded changes of such transactions. That way,
> when the logical replication is behind or restarts, we don't need to
> decode large transactions that already aborted, which helps improve
> the decoding performance.

It's very easy to get uses of TransactionIdDidAbort() wrong. For one, it
won't return true when a transaction was implicitly aborted due to a
crash / restart. You're also supposed to use it only after a preceding
TransactionIdIsInProgress() call. I'm not sure there are issues with not
checking TransactionIdIsInProgress() first in this case, but I'm also
not sure there aren't.

A separate issue is that TransactionIdDidAbort() can end up being very
slow if a lot of transactions are in progress concurrently. As soon as
the clog buffers are extended all time is spent copying pages from the
kernel pagecache. I'd not at all be surprised if this change causes a
substantial slowdown in workloads with lots of small transactions, where
most transactions commit.

Greetings,

Andres Freund
On Sun, Jun 11, 2023 at 5:31 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-06-09 14:16:44 +0900, Masahiko Sawada wrote:
> > In logical decoding, we don't need to collect decoded changes of
> > aborted transactions. While streaming changes, we can detect
> > concurrent abort of the (sub)transaction but there is no mechanism to
> > skip decoding changes of transactions that are known to already be
> > aborted. With the attached WIP patch, we check CLOG when decoding the
> > transaction for the first time. If it's already known to be aborted,
> > we skip collecting decoded changes of such transactions. That way,
> > when the logical replication is behind or restarts, we don't need to
> > decode large transactions that already aborted, which helps improve
> > the decoding performance.

Thank you for the comment.

> It's very easy to get uses of TransactionIdDidAbort() wrong. For one, it won't
> return true when a transaction was implicitly aborted due to a crash /
> restart. You're also supposed to use it only after a preceding
> TransactionIdIsInProgress() call.
>
> I'm not sure there are issues with not checking TransactionIdIsInProgress()
> first in this case, but I'm also not sure there aren't.

Yeah, it seems to be better to use !TransactionIdDidCommit() with a
preceding TransactionIdIsInProgress() check.

>
> A separate issue is that TransactionIdDidAbort() can end up being very slow if
> a lot of transactions are in progress concurrently. As soon as the clog
> buffers are extended all time is spent copying pages from the kernel
> pagecache. I'd not at all be surprised if this change causes a substantial
> slowdown in workloads with lots of small transactions, where most transactions
> commit.
>

Indeed. So it should check the transaction status less frequently. It
doesn't benefit much even if we can skip collecting decoded changes of
small transactions. Another idea is that we check the status of only
large transactions. That is, when the size of decoded changes of an
aborted transaction exceeds logical_decoding_work_mem, we mark it as
aborted, free its changes decoded so far, and skip further collection.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
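The threshold idea above can be sketched as a simplified, self-contained model (not the actual patch; `WORK_MEM_BYTES`, the struct, and the field names are all made up for illustration): the status check — stubbed out here as a caller-supplied boolean plus a lookup counter — runs at most once per transaction, and only after the accumulated decoded changes exceed the memory budget. If the transaction turns out to be aborted, its changes are freed and everything after that is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORK_MEM_BYTES 1024     /* stand-in for logical_decoding_work_mem */

typedef struct
{
    size_t size;            /* bytes of decoded changes accumulated */
    bool   aborted;         /* remembered result of the one-off check */
    bool   checked;         /* status check already performed? */
    int    clog_lookups;    /* counts stubbed CLOG lookups */
} LargeTxn;

/*
 * Accumulate a change of 'len' bytes. Once the budget is exceeded, do a
 * single status check; if aborted, free everything and skip from then on.
 * 'clog_says_aborted' stands in for the real CLOG answer.
 */
static void
add_change(LargeTxn *txn, size_t len, bool clog_says_aborted)
{
    if (txn->aborted)
        return;                     /* already known aborted: ignore */

    txn->size += len;
    if (!txn->checked && txn->size >= WORK_MEM_BYTES)
    {
        txn->checked = true;        /* at most one status check per txn */
        txn->clog_lookups++;        /* stands in for the CLOG lookup */
        if (clog_says_aborted)
        {
            txn->aborted = true;
            txn->size = 0;          /* "free" everything decoded so far */
        }
    }
}
```

Note how this addresses Andres's concern: small transactions never reach the threshold, so the common commit-heavy workload performs no extra CLOG lookups at all.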
On Tue, Jun 13, 2023 at 2:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sun, Jun 11, 2023 at 5:31 AM Andres Freund <andres@anarazel.de> wrote: > > > > A separate issue is that TransactionIdDidAbort() can end up being very slow if > > a lot of transactions are in progress concurrently. As soon as the clog > > buffers are extended all time is spent copying pages from the kernel > > pagecache. I'd not at all be surprised if this changed causes a substantial > > slowdown in workloads with lots of small transactions, where most transactions > > commit. > > > > Indeed. So it should check the transaction status less frequently. It > doesn't benefit much even if we can skip collecting decoded changes of > small transactions. Another idea is that we check the status of only > large transactions. That is, when the size of decoded changes of an > aborted transaction exceeds logical_decoding_work_mem, we mark it as > aborted , free its changes decoded so far, and skip further > collection. > Your idea might work for large transactions but I have not come across reports where this is reported as a problem. Do you see any such reports and can we see how much is the benefit with large transactions? Because we do have the handling of concurrent aborts during sys table scans and that might help sometimes for large transactions. -- With Regards, Amit Kapila.
On Thu, Jun 15, 2023 at 7:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Jun 13, 2023 at 2:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Sun, Jun 11, 2023 at 5:31 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > A separate issue is that TransactionIdDidAbort() can end up being very slow if > > > a lot of transactions are in progress concurrently. As soon as the clog > > > buffers are extended all time is spent copying pages from the kernel > > > pagecache. I'd not at all be surprised if this changed causes a substantial > > > slowdown in workloads with lots of small transactions, where most transactions > > > commit. > > > > > > > Indeed. So it should check the transaction status less frequently. It > > doesn't benefit much even if we can skip collecting decoded changes of > > small transactions. Another idea is that we check the status of only > > large transactions. That is, when the size of decoded changes of an > > aborted transaction exceeds logical_decoding_work_mem, we mark it as > > aborted , free its changes decoded so far, and skip further > > collection. > > > > Your idea might work for large transactions but I have not come across > reports where this is reported as a problem. Do you see any such > reports and can we see how much is the benefit with large > transactions? Because we do have the handling of concurrent aborts > during sys table scans and that might help sometimes for large > transactions. I've heard there was a case where a user had 29 million deletes in a single transaction with each one wrapped in a savepoint and rolled it back, which led to 11TB of spill files. If decoding such a large transaction fails for some reasons (e.g. a disk full), it would try decoding the same transaction again and again. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Wed, Jun 21, 2023 at 8:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Jun 15, 2023 at 7:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Jun 13, 2023 at 2:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Sun, Jun 11, 2023 at 5:31 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > A separate issue is that TransactionIdDidAbort() can end up being very slow if > > > > a lot of transactions are in progress concurrently. As soon as the clog > > > > buffers are extended all time is spent copying pages from the kernel > > > > pagecache. I'd not at all be surprised if this changed causes a substantial > > > > slowdown in workloads with lots of small transactions, where most transactions > > > > commit. > > > > > > > > > > Indeed. So it should check the transaction status less frequently. It > > > doesn't benefit much even if we can skip collecting decoded changes of > > > small transactions. Another idea is that we check the status of only > > > large transactions. That is, when the size of decoded changes of an > > > aborted transaction exceeds logical_decoding_work_mem, we mark it as > > > aborted , free its changes decoded so far, and skip further > > > collection. > > > > > > > Your idea might work for large transactions but I have not come across > > reports where this is reported as a problem. Do you see any such > > reports and can we see how much is the benefit with large > > transactions? Because we do have the handling of concurrent aborts > > during sys table scans and that might help sometimes for large > > transactions. > > I've heard there was a case where a user had 29 million deletes in a > single transaction with each one wrapped in a savepoint and rolled it > back, which led to 11TB of spill files. If decoding such a large > transaction fails for some reasons (e.g. a disk full), it would try > decoding the same transaction again and again. 
> I was thinking why the existing handling of concurrent aborts doesn't handle such a case and it seems that we check that only on catalog access. However, in your case, the user probably is accessing the same relation without any concurrent DDL on the same table, so it would just be a cache look-up for catalogs. Your idea of checking aborts every logical_decoding_work_mem should work for such cases. -- With Regards, Amit Kapila.
On Fri, Jun 9, 2023 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Hi, > > In logical decoding, we don't need to collect decoded changes of > aborted transactions. While streaming changes, we can detect > concurrent abort of the (sub)transaction but there is no mechanism to > skip decoding changes of transactions that are known to already be > aborted. With the attached WIP patch, we check CLOG when decoding the > transaction for the first time. If it's already known to be aborted, > we skip collecting decoded changes of such transactions. That way, > when the logical replication is behind or restarts, we don't need to > decode large transactions that already aborted, which helps improve > the decoding performance. > +1 for the idea of checking the transaction status only when we need to flush it to the disk or send it downstream (if streaming in progress is enabled). Although this check is costly since we are planning only for large transactions then it is worth it if we can occasionally avoid disk or network I/O for the aborted transactions. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Jun 23, 2023 at 12:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Jun 9, 2023 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Hi, > > > > In logical decoding, we don't need to collect decoded changes of > > aborted transactions. While streaming changes, we can detect > > concurrent abort of the (sub)transaction but there is no mechanism to > > skip decoding changes of transactions that are known to already be > > aborted. With the attached WIP patch, we check CLOG when decoding the > > transaction for the first time. If it's already known to be aborted, > > we skip collecting decoded changes of such transactions. That way, > > when the logical replication is behind or restarts, we don't need to > > decode large transactions that already aborted, which helps improve > > the decoding performance. > > > +1 for the idea of checking the transaction status only when we need > to flush it to the disk or send it downstream (if streaming in > progress is enabled). Although this check is costly since we are > planning only for large transactions then it is worth it if we can > occasionally avoid disk or network I/O for the aborted transactions. > Thanks. I've attached the updated patch. With this patch, we check the transaction status for only large-transactions when eviction. For regression test purposes, I disable this transaction status check when logical_replication_mode is set to 'immediate'. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, 3 Jul 2023 at 07:16, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 23, 2023 at 12:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Jun 9, 2023 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > Hi, > > > > > > In logical decoding, we don't need to collect decoded changes of > > > aborted transactions. While streaming changes, we can detect > > > concurrent abort of the (sub)transaction but there is no mechanism to > > > skip decoding changes of transactions that are known to already be > > > aborted. With the attached WIP patch, we check CLOG when decoding the > > > transaction for the first time. If it's already known to be aborted, > > > we skip collecting decoded changes of such transactions. That way, > > > when the logical replication is behind or restarts, we don't need to > > > decode large transactions that already aborted, which helps improve > > > the decoding performance. > > > > > +1 for the idea of checking the transaction status only when we need > > to flush it to the disk or send it downstream (if streaming in > > progress is enabled). Although this check is costly since we are > > planning only for large transactions then it is worth it if we can > > occasionally avoid disk or network I/O for the aborted transactions. > > > > Thanks. > > I've attached the updated patch. With this patch, we check the > transaction status for only large-transactions when eviction. For > regression test purposes, I disable this transaction status check when > logical_replication_mode is set to 'immediate'. May be there is some changes that are missing in the patch, which is giving the following errors: reorderbuffer.c: In function ‘ReorderBufferCheckTXNAbort’: reorderbuffer.c:3584:22: error: ‘logical_replication_mode’ undeclared (first use in this function) 3584 | if (unlikely(logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE)) | ^~~~~~~~~~~~~~~~~~~~~~~~ Regards, Vignesh
On Tue, 3 Oct 2023 at 15:54, vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 3 Jul 2023 at 07:16, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Jun 23, 2023 at 12:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Fri, Jun 9, 2023 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > Hi, > > > > > > > > In logical decoding, we don't need to collect decoded changes of > > > > aborted transactions. While streaming changes, we can detect > > > > concurrent abort of the (sub)transaction but there is no mechanism to > > > > skip decoding changes of transactions that are known to already be > > > > aborted. With the attached WIP patch, we check CLOG when decoding the > > > > transaction for the first time. If it's already known to be aborted, > > > > we skip collecting decoded changes of such transactions. That way, > > > > when the logical replication is behind or restarts, we don't need to > > > > decode large transactions that already aborted, which helps improve > > > > the decoding performance. > > > > > > > +1 for the idea of checking the transaction status only when we need > > > to flush it to the disk or send it downstream (if streaming in > > > progress is enabled). Although this check is costly since we are > > > planning only for large transactions then it is worth it if we can > > > occasionally avoid disk or network I/O for the aborted transactions. > > > > > > > Thanks. > > > > I've attached the updated patch. With this patch, we check the > > transaction status for only large-transactions when eviction. For > > regression test purposes, I disable this transaction status check when > > logical_replication_mode is set to 'immediate'. 
> > May be there is some changes that are missing in the patch, which is > giving the following errors: > reorderbuffer.c: In function ‘ReorderBufferCheckTXNAbort’: > reorderbuffer.c:3584:22: error: ‘logical_replication_mode’ undeclared > (first use in this function) > 3584 | if (unlikely(logical_replication_mode == > LOGICAL_REP_MODE_IMMEDIATE)) > | ^~~~~~~~~~~~~~~~~~~~~~~~ With no update to the thread and the compilation still failing I'm marking this as returned with feedback. Please feel free to resubmit to the next CF when there is a new version of the patch. Regards, Vignesh
On Fri, Feb 2, 2024 at 12:48 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 3 Oct 2023 at 15:54, vignesh C <vignesh21@gmail.com> wrote: > > > > On Mon, 3 Jul 2023 at 07:16, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Fri, Jun 23, 2023 at 12:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Fri, Jun 9, 2023 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > Hi, > > > > > > > > > > In logical decoding, we don't need to collect decoded changes of > > > > > aborted transactions. While streaming changes, we can detect > > > > > concurrent abort of the (sub)transaction but there is no mechanism to > > > > > skip decoding changes of transactions that are known to already be > > > > > aborted. With the attached WIP patch, we check CLOG when decoding the > > > > > transaction for the first time. If it's already known to be aborted, > > > > > we skip collecting decoded changes of such transactions. That way, > > > > > when the logical replication is behind or restarts, we don't need to > > > > > decode large transactions that already aborted, which helps improve > > > > > the decoding performance. > > > > > > > > > +1 for the idea of checking the transaction status only when we need > > > > to flush it to the disk or send it downstream (if streaming in > > > > progress is enabled). Although this check is costly since we are > > > > planning only for large transactions then it is worth it if we can > > > > occasionally avoid disk or network I/O for the aborted transactions. > > > > > > > > > > Thanks. > > > > > > I've attached the updated patch. With this patch, we check the > > > transaction status for only large-transactions when eviction. For > > > regression test purposes, I disable this transaction status check when > > > logical_replication_mode is set to 'immediate'. 
> > > > May be there is some changes that are missing in the patch, which is > > giving the following errors: > > reorderbuffer.c: In function ‘ReorderBufferCheckTXNAbort’: > > reorderbuffer.c:3584:22: error: ‘logical_replication_mode’ undeclared > > (first use in this function) > > 3584 | if (unlikely(logical_replication_mode == > > LOGICAL_REP_MODE_IMMEDIATE)) > > | ^~~~~~~~~~~~~~~~~~~~~~~~ > > With no update to the thread and the compilation still failing I'm > marking this as returned with feedback. Please feel free to resubmit > to the next CF when there is a new version of the patch. > I resumed working on this item. I've attached the new version patch. I rebased the patch to the current HEAD and updated comments and commit messages. The patch is straightforward and I'm somewhat satisfied with it, but I'm thinking of adding some tests for it. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Mar 15, 2024 at 3:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I resumed working on this item. I've attached the new version patch.
I rebased the patch to the current HEAD and updated comments and
commit messages. The patch is straightforward and I'm somewhat
satisfied with it, but I'm thinking of adding some tests for it.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
I just had a look at the patch, the patch no longer applies because of a removal of a header in a recent commit. Overall the patch looks fine, and I didn't find any issues. Some cosmetic comments:
in ReorderBufferCheckTXNAbort()
+ /* Quick return if we've already knew the transaction status */
+ if (txn->aborted)
+ return true;
knew/know
/*
+ * If logical_replication_mode is "immediate", we don't check the
+ * transaction status so the caller always process this transaction.
+ */
+ if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)
+ return false;
/process/processes
regards,
Ajin Cherian
Fujitsu Australia
On Fri, Mar 15, 2024 at 1:21 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Fri, Mar 15, 2024 at 3:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> I resumed working on this item. I've attached the new version patch.
>> I rebased the patch to the current HEAD and updated comments and
>> commit messages. The patch is straightforward and I'm somewhat
>> satisfied with it, but I'm thinking of adding some tests for it.
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> Amazon Web Services: https://aws.amazon.com
>
> I just had a look at the patch, the patch no longer applies because of
> a removal of a header in a recent commit. Overall the patch looks fine,
> and I didn't find any issues. Some cosmetic comments:

Thank you for your review comments.

> in ReorderBufferCheckTXNAbort()
> + /* Quick return if we've already knew the transaction status */
> + if (txn->aborted)
> + return true;
>
> knew/know

Maybe it should be "known"?

> /*
> + * If logical_replication_mode is "immediate", we don't check the
> + * transaction status so the caller always process this transaction.
> + */
> + if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)
> + return false;
>
> /process/processes

Fixed.

In addition to these changes, I've made some changes to the latest
patch. Here is the summary:

- Use txn_flags field to record the transaction status instead of two
'committed' and 'aborted' flags.
- Add regression tests.
- Update commit message.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachment
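The switch from two booleans to a single `txn_flags` word can be sketched in isolation (bit values and the plain functions below are illustrative; in the patch these are macros in reorderbuffer.h, and `rbtxn_did_commit`/`rbtxn_did_abort` are the names used at this point in the thread). One flags word keeps the two mutually exclusive states in a single place, which makes the exclusivity easy to assert:

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values are illustrative; the real ones live in reorderbuffer.h. */
#define RBTXN_COMMITTED 0x0200
#define RBTXN_ABORTED   0x0400

typedef struct
{
    unsigned txn_flags;     /* one word replaces two bool fields */
} RBTxn;

static bool
rbtxn_did_commit(const RBTxn *txn)
{
    return (txn->txn_flags & RBTXN_COMMITTED) != 0;
}

static bool
rbtxn_did_abort(const RBTxn *txn)
{
    return (txn->txn_flags & RBTXN_ABORTED) != 0;
}

/* A transaction can never be both committed and aborted. */
static void
rbtxn_mark_aborted(RBTxn *txn)
{
    assert(!rbtxn_did_commit(txn));
    txn->txn_flags |= RBTXN_ABORTED;
}

static void
rbtxn_mark_committed(RBTxn *txn)
{
    assert(!rbtxn_did_abort(txn));
    txn->txn_flags |= RBTXN_COMMITTED;
}
```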
On Mon, Mar 18, 2024 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In addition to these changes, I've made some changes to the latest
patch. Here is the summary:
- Use txn_flags field to record the transaction status instead of two
'committed' and 'aborted' flags.
- Add regression tests.
- Update commit message.
Regards,
Hi Sawada-san,
Thanks for the updated patch. Some comments:
1.
+ * already aborted, we discards all changes accumulated so far and ignore
+ * future changes, and return true. Otherwise return false.
we discards/we discard
2. In function ReorderBufferCheckTXNAbort(): I haven't tested this but I wonder how prepared transactions would be considered; they are neither committed nor in progress.
regards,
Ajin Cherian
Fujitsu Australia
On Wed, Mar 27, 2024 at 8:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Mar 18, 2024 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> In addition to these changes, I've made some changes to the latest
>> patch. Here is the summary:
>>
>> - Use txn_flags field to record the transaction status instead of two
>> 'committed' and 'aborted' flags.
>> - Add regression tests.
>> - Update commit message.
>>
>> Regards,
>
> Hi Sawada-san,
>
> Thanks for the updated patch. Some comments:

Thank you for the review comments!

> 1.
> + * already aborted, we discards all changes accumulated so far and ignore
> + * future changes, and return true. Otherwise return false.
>
> we discards/we discard

Will fix it.

> 2. In function ReorderBufferCheckTXNAbort(): I haven't tested this but
> I wonder how prepared transactions would be considered, they are
> neither committed, nor in progress.

The transaction that is prepared but not resolved yet is considered as
in-progress.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi, here are some review comments for your patch v4-0001.

======
contrib/test_decoding/sql/stats.sql

1.
Huh? The test fails because the "expected results" file for these new
tests is missing from the patch.

======
.../replication/logical/reorderbuffer.c

2.
 static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
-                                     bool txn_prepared);
+                                     bool txn_prepared, bool mark_streamed);

IIUC this new 'mark_streamed' parameter is more like a prerequisite for
the other conditions to decide to mark the tx as streamed -- i.e. it is
more like 'can_mark_streamed', so I felt the name should be changed to
be like that (everywhere it is used).

~~~

3. ReorderBufferTruncateTXN

- * 'txn_prepared' indicates that we have decoded the transaction at prepare
- * time.
+ * If mark_streamed is true, we could mark the transaction as streamed.
+ *
+ * 'streaming_txn' indicates that the given transaction is a streaming transaction.
 */
 static void
-ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prepared)
+ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prepared,
+                         bool mark_streamed)

~

What's that new comment about 'streaming_txn' for? It seemed unrelated
to the patch code.

~~~

4.
 /*
  * Mark the transaction as streamed.
  *
  * The top-level transaction, is marked as streamed always, even if it
  * does not contain any changes (that is, when all the changes are in
  * subtransactions).
  *
  * For subtransactions, we only mark them as streamed when there are
  * changes in them.
  *
  * We do it this way because of aborts - we don't want to send aborts for
  * XIDs the downstream is not aware of. And of course, it always knows
  * about the toplevel xact (we send the XID in all messages), but we never
  * stream XIDs of empty subxacts.
  */
 if (mark_streamed && (!txn_prepared) &&
     (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
     txn->txn_flags |= RBTXN_IS_STREAMED;

~~

With the patch introduction of the new parameter, I felt this code might
be better if it was refactored as follows:

/* Mark the transaction as streamed, if appropriate. */
if (can_mark_streamed)
{
    /* ... large comment */
    if ((!txn_prepared) &&
        (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
        txn->txn_flags |= RBTXN_IS_STREAMED;
}

~~~

5. ReorderBufferPrepare

- if (txn->concurrent_abort && !rbtxn_is_streamed(txn))
+ if (!txn_aborted && rbtxn_did_abort(txn) && !rbtxn_is_streamed(txn))
     rb->prepare(rb, txn, txn->final_lsn);

~

Maybe I misunderstood this logic, but won't a "concurrent abort" cause
your new Assert added in ReorderBufferProcessTXN to fail?

+ /* Update transaction status */
+ Assert((curtxn->txn_flags & (RBTXN_COMMITTED | RBTXN_ABORTED)) == 0);

~~~

6. ReorderBufferCheckTXNAbort

+ /* Check the transaction status using CLOG lookup */
+ if (TransactionIdIsInProgress(txn->xid))
+     return false;
+
+ if (TransactionIdDidCommit(txn->xid))
+ {
+     /*
+      * Remember the transaction is committed so that we can skip CLOG
+      * check next time, avoiding the pressure on CLOG lookup.
+      */
+     txn->txn_flags |= RBTXN_COMMITTED;
+     return false;
+ }

IIUC the purpose of the TransactionIdDidCommit() was to avoid the
overhead of calling the TransactionIdIsInProgress(). So, shouldn't the
order of these checks be swapped? Otherwise, there might be 1 extra
unnecessary call to TransactionIdIsInProgress() next time.

======
src/include/replication/reorderbuffer.h

7.
 #define RBTXN_PREPARE 0x0040
 #define RBTXN_SKIPPED_PREPARE 0x0080
 #define RBTXN_HAS_STREAMABLE_CHANGE 0x0100
+#define RBTXN_COMMITTED 0x0200
+#define RBTXN_ABORTED 0x0400

For consistency with the existing bitmask names, I guess these should be named:
- RBTXN_COMMITTED --> RBTXN_IS_COMMITTED
- RBTXN_ABORTED --> RBTXN_IS_ABORTED

~~~

8.
Similarly, IMO the macros should have the same names as the bitmasks,
like the other nearby ones generally seem to.

rbtxn_did_commit --> rbtxn_is_committed
rbtxn_did_abort --> rbtxn_is_aborted

======

9.
Also, attached is a top-up patch for other cosmetic nitpicks:
- comment wording
- typos in comments
- excessive or missing blank lines
- etc.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment
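The caching discussed in review item 6 can be sketched as a standalone model (the `CheckTxn` struct, the single collapsed `clog_lookup()` stub, and the lookup counter are all illustrative; the `RBTXN_IS_*` names follow the review's suggestion). Cached flags short-circuit before any CLOG access, so repeated calls for a resolved transaction hit the CLOG at most once; an in-progress result is deliberately not cached, since such a transaction may still abort later:

```c
#include <assert.h>
#include <stdbool.h>

#define RBTXN_IS_COMMITTED 0x0200
#define RBTXN_IS_ABORTED   0x0400

typedef enum { CLOG_IN_PROGRESS, CLOG_COMMITTED, CLOG_ABORTED } ClogState;

typedef struct
{
    unsigned  txn_flags;
    ClogState clog;          /* stand-in for the real CLOG state */
    int       clog_lookups;  /* counts (expensive) CLOG accesses */
} CheckTxn;

/* Stub collapsing TransactionIdIsInProgress()/DidCommit() into one call. */
static ClogState
clog_lookup(CheckTxn *txn)
{
    txn->clog_lookups++;
    return txn->clog;
}

/*
 * Returns true if the transaction is known aborted. Cached flags are
 * consulted first so repeated calls don't go back to the CLOG.
 */
static bool
check_txn_abort(CheckTxn *txn)
{
    if (txn->txn_flags & RBTXN_IS_ABORTED)
        return true;
    if (txn->txn_flags & RBTXN_IS_COMMITTED)
        return false;

    switch (clog_lookup(txn))
    {
        case CLOG_IN_PROGRESS:              /* includes prepared xacts */
            return false;                   /* don't cache: may still abort */
        case CLOG_COMMITTED:
            txn->txn_flags |= RBTXN_IS_COMMITTED;
            return false;
        case CLOG_ABORTED:
        default:
            txn->txn_flags |= RBTXN_IS_ABORTED;
            return true;
    }
}
```

This also shows why the cached-committed fast path matters: without it, every call for a committed transaction would repeat the lookup that the review flags as expensive under heavy concurrency.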
On Wed, Mar 27, 2024 at 4:49 AM Ajin Cherian <itsajin@gmail.com> wrote: > > > > On Mon, Mar 18, 2024 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> >> >> In addition to these changes, I've made some changes to the latest >> patch. Here is the summary: >> >> - Use txn_flags field to record the transaction status instead of two >> 'committed' and 'aborted' flags. >> - Add regression tests. >> - Update commit message. >> >> Regards, >> > > Hi Sawada-san, > > Thanks for the updated patch. Some comments: > > 1. > + * already aborted, we discards all changes accumulated so far and ignore > + * future changes, and return true. Otherwise return false. > > we discards/we discard This comment is incorporated into the latest v5 patch I've just sent[1]. > > 2. In function ReorderBufferCheckTXNAbort(): I haven't tested this but I wonder how prepared transactions would be considered,they are neither committed, nor in progress. > IIUC prepared transactions are considered as in-progress. Regards, [1] https://www.postgresql.org/message-id/CAD21AoDJE-bLdxt9T_z1rw74RN%3DE0n0%2BesYU0eo%2B-_P32EbuVg%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi Sawada-San, here are some review comments for the patch v5-0001.

======
Commit message.

1.
This commit introduces an additional check to determine if a
transaction is already aborted by a CLOG lookup, so the logical
decoding skips further change also when it doesn't touch system
catalogs.

~

Is that wording backwards? Is it meant to say:

This commit introduces an additional CLOG lookup check to determine if
a transaction is already aborted, so the ...

======
contrib/test_decoding/sql/stats.sql

2.
+SELECT slot_name, spill_txns = 0 AS spill_txn, spill_count = 0 AS spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase';

Why do the SELECT "= 0" like this, instead of just having zeros in the
"expected" results?

======
.../replication/logical/reorderbuffer.c

3.
 static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
-                                     bool txn_prepared);
+                                     bool txn_prepared, bool mark_streamed);

That last parameter name ('mark_streamed') does not match the same
parameter name in this function's definition.

~~~

ReorderBufferTruncateTXN:

4.
 if (txn_streaming && (!txn_prepared) &&
     (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
     txn->txn_flags |= RBTXN_IS_STREAMED;

 if (txn_prepared)
 {

~

Since the following condition was already "if (txn_prepared)" would it
be better remove the "(!txn_prepared)" here and instead just refactor
the code like:

if (txn_prepared)
{
 ...
}
else if (txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
{
 ...
}

~~~

ReorderBufferProcessTXN:

5.
+
+ /* Remember the transaction is aborted */
+ Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0);
+ curtxn->txn_flags |= RBTXN_IS_ABORTED;

Missing period on comment.

~~~

ReorderBufferCheckTXNAbort:

6.
+ * If GUC 'debug_logical_replication_streaming' is "immediate", we don't
+ * check the transaction status, so the caller always processes this
+ * transaction. This is to disable this check for regression tests.
+ */
+static bool
+ReorderBufferCheckTXNAbort(ReorderBuffer *rb, ReorderBufferTXN *txn)
+{
+    /*
+     * If GUC 'debug_logical_replication_streaming' is "immediate", we don't
+     * check the transaction status, so the caller always processes this
+     * transaction.
+     */
+    if (unlikely(debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE))
+        return false;
+

The wording of the sentence "This is to disable..." seemed a bit
confusing. Maybe this area can be simplified by doing the following.

6a.
Change the function comment to say more like below:

When the GUC 'debug_logical_replication_streaming' is set to
"immediate", we don't check the transaction status, meaning the caller
will always process this transaction. This mode is used by regression
tests to avoid unnecessary transaction status checking.

~

6b.
It is not necessary for this 2nd comment to repeat everything that was
already said in the function comment. A simpler comment here might be
all you need:

SUGGESTION:
Quick return for regression tests.

~~~

7.
Is it worth mentioning about this skipping of the transaction status
check in the docs for this GUC? [1]

======
[1] https://www.postgresql.org/docs/devel/runtime-config-developer.html

Kind Regards,
Peter Smith.
Fujitsu Australia.
On Tue, Nov 12, 2024 at 5:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I've attached the updated patch. > Hi, here are some review comments for the latest v6-0001. ====== contrib/test_decoding/sql/stats.sql 1. +INSERT INTO stats_test SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000) g(i); I didn't understand the meaning of "serialize-topbig--1". My guess is it is a typo that was supposed to say "toobig". Perhaps there should also be some comment to explain that this "toobig" stuff was done deliberately like this to exceed 'logical_decoding_work_mem' because that would normally (if it was not aborted) cause a spill to disk. ~~~ 2. +-- Check stats. We should not spill anything as the transaction is already +-- aborted. +SELECT pg_stat_force_next_flush(); +SELECT slot_name, spill_txns AS spill_txn, spill_count AS spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase'; + Those aliases seem unnecessary: "spill_txns AS spill_txn" and "spill_count AS spill_count" ====== .../replication/logical/reorderbuffer.c ReorderBufferCheckTXNAbort: 3. Other static functions are also declared at the top of this module. For consistency, shouldn't this be the same? ~~~ 4. + * We don't mark the transaction as streamed since this function can be + * called for non-streamed transactions too. + */ + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false); + ReorderBufferToastReset(rb, txn); Given the comment says "since this function can be called for non-streamed transactions too", would it be easier to pass rbtxn_is_streamed(txn) here instead of 'false', and then just remove the comment? ====== Kind Regards, Peter Smith. Fujitsu Australia
On Mon, 11 Nov 2024 at 23:30, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sun, Nov 10, 2024 at 11:24 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Sawada-San, here are some review comments for the patch v5-0001. > > > > Thank you for reviewing the patch! > > > ====== > > Commit message. > > > > 1. > > This commit introduces an additional check to determine if a > > transaction is already aborted by a CLOG lookup, so the logical > > decoding skips further change also when it doesn't touch system > > catalogs. > > > > ~ > > > > Is that wording backwards? Is it meant to say: > > > > This commit introduces an additional CLOG lookup check to determine if > > a transaction is already aborted, so the ... > > Fixed. > > > > > ====== > > contrib/test_decoding/sql/stats.sql > > > > 2 > > +SELECT slot_name, spill_txns = 0 AS spill_txn, spill_count = 0 AS > > spill_count FROM pg_stat_replication_slots WHERE slot_name = > > 'regression_slot_stats4_twophase'; > > > > Why do the SELECT "= 0" like this, instead of just having zeros in the > > "expected" results? > > Indeed. I used "=0" like other queries in the same file do, but it > makes sense to me just to have zeros in the expected file. That way, > it would make it a bit easier to investigate in case of failures. > > > > > ====== > > .../replication/logical/reorderbuffer.c > > > > 3. > > static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, > > - bool txn_prepared); > > + bool txn_prepared, bool mark_streamed); > > > > That last parameter name ('mark_streamed') does not match the same > > parameter name in this function's definition. > > Fixed. > > > > > ~~~ > > > > ReorderBufferTruncateTXN: > > > > 4. 
> > if (txn_streaming && (!txn_prepared) && > > (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > txn->txn_flags |= RBTXN_IS_STREAMED; > > > > if (txn_prepared) > > { > > ~ > > > > Since the following condition was already "if (txn_prepared)" would it > > be better remove the "(!txn_prepared)" here and instead just refactor > > the code like: > > > > if (txn_prepared) > > { > > ... > > } > > else if (txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > { > > ... > > } > > Good idea. > > > > > ~~~ > > > > ReorderBufferProcessTXN: > > > > 5. > > + > > + /* Remember the transaction is aborted */ > > + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); > > + curtxn->txn_flags |= RBTXN_IS_ABORTED; > > > > Missing period on comment. > > Fixed. > > > > > ~~~ > > > > ReorderBufferCheckTXNAbort: > > > > 6. > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > + * check the transaction status, so the caller always processes this > > + * transaction. This is to disable this check for regression tests. > > + */ > > +static bool > > +ReorderBufferCheckTXNAbort(ReorderBuffer *rb, ReorderBufferTXN *txn) > > +{ > > + /* > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > + * check the transaction status, so the caller always processes this > > + * transaction. > > + */ > > + if (unlikely(debug_logical_replication_streaming == > > DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)) > > + return false; > > + > > > > The wording of the sentence "This is to disable..." seemed a bit > > confusing. Maybe this area can be simplified by doing the following. > > > > 6a. > > Change the function comment to say more like below: > > > > When the GUC 'debug_logical_replication_streaming' is set to > > "immediate", we don't check the transaction status, meaning the caller > > will always process this transaction. This mode is used by regression > > tests to avoid unnecessary transaction status checking. 
> > > > ~ > > > > 6b. > > It is not necessary for this 2nd comment to repeat everything that was > > already said in the function comment. A simpler comment here might be > > all you need: > > > > SUGGESTION: > > Quick return for regression tests. > > Agreed with the above two comments. Fixed. > > > > > ~~~ > > > > 7. > > Is it worth mentioning about this skipping of the transaction status > > check in the docs for this GUC? [1] > > If we want to mention this optimization in the docs, we have to > explain how the optimization works too. I think it's too detailed. > > I've attached the updated patch. Few minor suggestions: 1) Can we use rbtxn_is_committed here? + /* Remember the transaction is aborted. */ + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); + curtxn->txn_flags |= RBTXN_IS_ABORTED; 2) Similarly here too: + /* + * Mark the transaction as aborted so we ignore future changes of this + * transaction. + */ + Assert((txn->txn_flags & RBTXN_IS_COMMITTED) == 0); + txn->txn_flags |= RBTXN_IS_ABORTED; 3) Can we use rbtxn_is_aborted here? + /* + * Remember the transaction is committed so that we can skip CLOG + * check next time, avoiding the pressure on CLOG lookup. + */ + Assert((txn->txn_flags & RBTXN_IS_ABORTED) == 0); Regards, Vignesh
On Tue, Nov 12, 2024 at 7:29 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 11 Nov 2024 at 23:30, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Sun, Nov 10, 2024 at 11:24 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > Hi Sawada-San, here are some review comments for the patch v5-0001. > > > > > > > Thank you for reviewing the patch! > > > > > ====== > > > Commit message. > > > > > > 1. > > > This commit introduces an additional check to determine if a > > > transaction is already aborted by a CLOG lookup, so the logical > > > decoding skips further change also when it doesn't touch system > > > catalogs. > > > > > > ~ > > > > > > Is that wording backwards? Is it meant to say: > > > > > > This commit introduces an additional CLOG lookup check to determine if > > > a transaction is already aborted, so the ... > > > > Fixed. > > > > > > > > ====== > > > contrib/test_decoding/sql/stats.sql > > > > > > 2 > > > +SELECT slot_name, spill_txns = 0 AS spill_txn, spill_count = 0 AS > > > spill_count FROM pg_stat_replication_slots WHERE slot_name = > > > 'regression_slot_stats4_twophase'; > > > > > > Why do the SELECT "= 0" like this, instead of just having zeros in the > > > "expected" results? > > > > Indeed. I used "=0" like other queries in the same file do, but it > > makes sense to me just to have zeros in the expected file. That way, > > it would make it a bit easier to investigate in case of failures. > > > > > > > > ====== > > > .../replication/logical/reorderbuffer.c > > > > > > 3. > > > static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, > > > - bool txn_prepared); > > > + bool txn_prepared, bool mark_streamed); > > > > > > That last parameter name ('mark_streamed') does not match the same > > > parameter name in this function's definition. > > > > Fixed. > > > > > > > > ~~~ > > > > > > ReorderBufferTruncateTXN: > > > > > > 4. 
> > > if (txn_streaming && (!txn_prepared) && > > > (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > > txn->txn_flags |= RBTXN_IS_STREAMED; > > > > > > if (txn_prepared) > > > { > > > ~ > > > > > > Since the following condition was already "if (txn_prepared)" would it > > > be better remove the "(!txn_prepared)" here and instead just refactor > > > the code like: > > > > > > if (txn_prepared) > > > { > > > ... > > > } > > > else if (txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) > > > { > > > ... > > > } > > > > Good idea. > > > > > > > > ~~~ > > > > > > ReorderBufferProcessTXN: > > > > > > 5. > > > + > > > + /* Remember the transaction is aborted */ > > > + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); > > > + curtxn->txn_flags |= RBTXN_IS_ABORTED; > > > > > > Missing period on comment. > > > > Fixed. > > > > > > > > ~~~ > > > > > > ReorderBufferCheckTXNAbort: > > > > > > 6. > > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > > + * check the transaction status, so the caller always processes this > > > + * transaction. This is to disable this check for regression tests. > > > + */ > > > +static bool > > > +ReorderBufferCheckTXNAbort(ReorderBuffer *rb, ReorderBufferTXN *txn) > > > +{ > > > + /* > > > + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't > > > + * check the transaction status, so the caller always processes this > > > + * transaction. > > > + */ > > > + if (unlikely(debug_logical_replication_streaming == > > > DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)) > > > + return false; > > > + > > > > > > The wording of the sentence "This is to disable..." seemed a bit > > > confusing. Maybe this area can be simplified by doing the following. > > > > > > 6a. 
> > > Change the function comment to say more like below: > > > > > > When the GUC 'debug_logical_replication_streaming' is set to > > > "immediate", we don't check the transaction status, meaning the caller > > > will always process this transaction. This mode is used by regression > > > tests to avoid unnecessary transaction status checking. > > > > > > ~ > > > > > > 6b. > > > It is not necessary for this 2nd comment to repeat everything that was > > > already said in the function comment. A simpler comment here might be > > > all you need: > > > > > > SUGGESTION: > > > Quick return for regression tests. > > > > Agreed with the above two comments. Fixed. > > > > > > > > ~~~ > > > > > > 7. > > > Is it worth mentioning about this skipping of the transaction status > > > check in the docs for this GUC? [1] > > > > If we want to mention this optimization in the docs, we have to > > explain how the optimization works too. I think it's too detailed. > > > > I've attached the updated patch. > > Few minor suggestions: > 1) Can we use rbtxn_is_committed here? > + /* Remember the transaction is aborted. */ > + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0); > + curtxn->txn_flags |= RBTXN_IS_ABORTED; > > 2) Similarly here too: > + /* > + * Mark the transaction as aborted so we ignore future changes of this > + * transaction. > + */ > + Assert((txn->txn_flags & RBTXN_IS_COMMITTED) == 0); > + txn->txn_flags |= RBTXN_IS_ABORTED; > > 3) Can we use rbtxn_is_aborted here? > + /* > + * Remember the transaction is committed so that we > can skip CLOG > + * check next time, avoiding the pressure on CLOG lookup. > + */ > + Assert((txn->txn_flags & RBTXN_IS_ABORTED) == 0); > Thank you for reviewing the patch! These comments are incorporated into the latest v6 patch I just sent[1]. Regards, [1] https://www.postgresql.org/message-id/CAD21AoDtMjbc8YCQiX1K8%2BRKeahcX2MLt3gwApm5BWGfv14i5A%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi Sawada-San, Here are some more review comments for the latest (accidentally called v6 again?) v6-0001 patch. ====== contrib/test_decoding/sql/stats.sql 1. +-- Execute a transaction that is prepared and aborted. We detect that the +-- transaction is aborted before spilling changes, and then skip collecting +-- further changes. You had replied (referring to the above comment): I think we already mentioned the transaction is going to be spilled but actually not. ~ Yes, spilling was already mentioned in the current comment but I felt it assumes the reader is expected to know details of why it was going to be spilled in the first place. In other words, I thought the comment could include a bit more explanatory background info: (Also, it's not really "we detect" the abort -- it's the new postgres code of this patch that detects it.) SUGGESTION: Execute a transaction that is prepared but then aborted. The INSERT data exceeds the 'logical_decoding_work_mem' limit, which normally would result in the transaction being spilled to disk, but now when Postgres detects the abort it skips the spilling and also skips collecting further changes. ~~~ 2. +-- Check if the transaction is not spilled as it's already aborted. +SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot_stats4_twophase', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); +SELECT pg_stat_force_next_flush(); +SELECT slot_name, spill_txns, spill_count FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot_stats4_twophase'; + /Check if the transaction is not spilled/Verify that the transaction was not spilled/ ====== .../replication/logical/reorderbuffer.c ReorderBufferResetTXN: 3. /* Discard the changes that we just streamed */ - ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn)); + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), true); Looking at the calling code for ReorderBufferResetTXN it seems this function can be called for streaming OR prepared.
So is it OK here to be passing hardwired 'true' as the txn_streaming parameter, or should that be passing rbtxn_is_streamed(txn)? ~~~ ReorderBufferLargestStreamableTopTXN: 4. if ((largest == NULL || txn->total_size > largest_size) && (txn->total_size > 0) && !(rbtxn_has_partial_change(txn)) && - rbtxn_has_streamable_change(txn)) + rbtxn_has_streamable_change(txn) && !(rbtxn_is_aborted(txn))) { largest = txn; largest_size = txn->total_size; I felt that this increasingly complicated code would be a lot easier to understand if you just separate the conditions into: (a) the ones that filter out transaction you don't care about; (b) the ones that check for the largest size. For example, SUGGESTION: dlist_foreach(...) { ... /* Don't consider these kinds of transactions for eviction. */ if (rbtxn_has_partial_change(txn) || !rbtxn_has_streamable_change(txn) || rbtxn_is_aborted(txn)) continue; /* Find the largest of the eviction candidates. */ if ((largest == NULL || txn->total_size > largest_size) && (txn->total_size > 0)) { largest = txn; largest_size = txn->total_size; } } ~~~ ReorderBufferCheckMemoryLimit: 5. + /* skip the transaction if already aborted */ + if (ReorderBufferCheckTXNAbort(rb, txn)) + { + /* All changes should be truncated */ + Assert(txn->size == 0 && txn->total_size == 0); + continue; + } The "discard all changes accumulated so far" side-effect happening here is not very apparent from the function name. Maybe a better name for ReorderBufferCheckTXNAbort() would be something like 'ReorderBufferCleanupIfAbortedTXN()'. ====== Kind Regards, Peter Smith. Fujitsu Australia
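[Editorial note] The refactoring suggested in comment 4 above — filter out ineligible transactions first, then track the largest among the survivors — is easy to show on its own. A standalone sketch under assumed names (DemoTXN and its fields stand in for the ReorderBufferTXN fields the real filter inspects; this is not the actual ReorderBufferLargestStreamableTopTXN code, which walks a dlist of top-level transactions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Only the fields the eviction filter looks at. */
typedef struct DemoTXN
{
    size_t      total_size;
    bool        has_partial_change;
    bool        has_streamable_change;
    bool        is_aborted;
} DemoTXN;

/*
 * Return the largest transaction eligible for streaming, or NULL if
 * none qualifies.  Ineligibility checks come first as early 'continue's,
 * keeping the size comparison free of unrelated conditions.
 */
static DemoTXN *
demo_largest_streamable(DemoTXN *txns, size_t ntxns)
{
    DemoTXN    *largest = NULL;
    size_t      largest_size = 0;

    for (size_t i = 0; i < ntxns; i++)
    {
        DemoTXN    *txn = &txns[i];

        /* Don't consider these kinds of transactions for eviction. */
        if (txn->has_partial_change ||
            !txn->has_streamable_change ||
            txn->is_aborted)
            continue;

        /* Find the largest of the eviction candidates. */
        if ((largest == NULL || txn->total_size > largest_size) &&
            txn->total_size > 0)
        {
            largest = txn;
            largest_size = txn->total_size;
        }
    }

    return largest;
}
```

The behavior is identical to the single combined condition in the patch; the split only separates "is this transaction a candidate at all?" from "is it the biggest candidate so far?".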
Hi Sawada-San, Here are some review comments for patch v8-0001. ====== contrib/test_decoding/sql/stats.sql 1. +-- The INSERT changes are large enough to be spilled but not, because the +-- transaction is aborted. The logical decoding skips collecting further +-- changes too. The transaction is prepared to make sure the decoding processes +-- the aborted transaction. /to be spilled but not/to be spilled but will not be/ ====== .../replication/logical/reorderbuffer.c ReorderBufferTruncateTXN: 2. /* * Discard changes from a transaction (and subtransactions), either after - * streaming or decoding them at PREPARE. Keep the remaining info - - * transactions, tuplecids, invalidations and snapshots. + * streaming, decoding them at PREPARE, or detecting the transaction abort. + * Keep the remaining info - transactions, tuplecids, invalidations and + * snapshots. * * We additionally remove tuplecids after decoding the transaction at prepare * time as we only need to perform invalidation at rollback or commit prepared. * + * The given transaction is marked as streamed if appropriate and the caller + * asked it by passing 'mark_txn_streaming' being true. + * * 'txn_prepared' indicates that we have decoded the transaction at prepare * time. */ static void -ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prepared) +ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prepared, + bool mark_txn_streaming) I think the function comment should describe the parameters in the same order that they appear in the function signature. ~~~ 3. + else if (mark_txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))) + { ... + txn->txn_flags |= RBTXN_IS_STREAMED; + } I guess it doesn't matter much, but for the sake of readability, should the condition also be checking !rbtxn_is_streamed(txn) to avoid overwriting the RBTXN_IS_STREAMED bit when it was set already? ~~~ ReorderBufferTruncateTXNIfAborted: 4.
+ /* + * The transaction aborted. We discard the changes we've collected so far, + * and free all resources allocated for toast reconstruction. The full + * cleanup will happen as part of decoding ABORT record of this + * transaction. + * + * Since we don't check the transaction status while replaying the + * transaction, we don't need to reset toast reconstruction data here. + */ + ReorderBufferTruncateTXN(rb, txn, false, false); 4a. The first part of the comment says "... and free all resources allocated for toast reconstruction", but the second part says "we don't need to reset toast reconstruction data here". Is that a contradiction? ~ 4b. Shouldn't this call still be passing rbtxn_prepared(txn) as the 2nd last param, like it used to? ====== Kind Regards, Peter Smith. Fujitsu Australia
On Fri, 15 Nov 2024 at 23:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 7:07 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Sawada-Sn, > > > > Here are some review comments for patch v8-0001. > > Thank you for the comments. > > > > > ====== > > contrib/test_decoding/sql/stats.sql > > > > 1. > > +-- The INSERT changes are large enough to be spilled but not, because the > > +-- transaction is aborted. The logical decoding skips collecting further > > +-- changes too. The transaction is prepared to make sure the decoding processes > > +-- the aborted transaction. > > > > /to be spilled but not/to be spilled but will not be/ > > Fixed. > > > > > ====== > > .../replication/logical/reorderbuffer.c > > > > ReorderBufferTruncateTXN: > > > > 2. > > /* > > * Discard changes from a transaction (and subtransactions), either after > > - * streaming or decoding them at PREPARE. Keep the remaining info - > > - * transactions, tuplecids, invalidations and snapshots. > > + * streaming, decoding them at PREPARE, or detecting the transaction abort. > > + * Keep the remaining info - transactions, tuplecids, invalidations and > > + * snapshots. > > * > > * We additionally remove tuplecids after decoding the transaction at prepare > > * time as we only need to perform invalidation at rollback or commit prepared. > > * > > + * The given transaction is marked as streamed if appropriate and the caller > > + * asked it by passing 'mark_txn_streaming' being true. > > + * > > * 'txn_prepared' indicates that we have decoded the transaction at prepare > > * time. > > */ > > static void > > -ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, > > bool txn_prepared) > > +ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, > > bool txn_prepared, > > + bool mark_txn_streaming) > > > > I think the function comment should describe the parameters in the > > same order that they appear in the function signature. 
> > Not sure it should be. We sometimes describe the overall idea of the > function first while using arguments names, and then describe what > other arguments mean. > > > > > ~~~ > > > > 3. > > + else if (mark_txn_streaming && (rbtxn_is_toptxn(txn) || > > (txn->nentries_mem != 0))) > > + { > > ... > > + txn->txn_flags |= RBTXN_IS_STREAMED; > > + } > > > > I guess it doesn't matter much, but for the sake of readability, > > should the condition also be checking !rbtxn_is_streamed(txn) to avoid > > overwriting the RBTXN_IS_STREAMED bit when it was set already? > > Not sure it improves readability because it adds one more check there. > If it's important not to re-set RBTXN_IS_STREAMED, it makes sense to > have that check and describe in the comment. But in this case, I think > we don't necessarily need to do that. > > > ~~~ > > > > ReorderBufferTruncateTXNIfAborted: > > > > 4. > > + /* > > + * The transaction aborted. We discard the changes we've collected so far, > > + * and free all resources allocated for toast reconstruction. The full > > + * cleanup will happen as part of decoding ABORT record of this > > + * transaction. > > + * > > + * Since we don't check the transaction status while replaying the > > + * transaction, we don't need to reset toast reconstruction data here. > > + */ > > + ReorderBufferTruncateTXN(rb, txn, false, false); > > > > 4a. > > The first part of the comment says "... and free all resources > > allocated for toast reconstruction", but the second part says "we > > don't need to reset toast reconstruction data here". Is that a > > contradiction? > > Yes, the comment is out-of-date. Since this function is not called > while replaying the transaction, it should not have any toast > reconstruction data. > > > > > ~ > > > > 4b. > > Shouldn't this call still be passing rbtxn_prepared(txn) as the 2nd > > last param, like it used to? > > Actually it's not necessary because it should always be false. 
> But thinking more, it seems to be better to use rbtxn_prepared(txn) since > it's consistent with other places and it's not necessary to put > assumptions there. Few comments: 1) Should we have the Assert inside ReorderBufferTruncateTXNIfAborted instead of having it at multiple callers? ReorderBufferResetTXN also has the Assert inside the function after the truncate of the transaction: @@ -3672,6 +3758,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb) Assert(txn->total_size > 0); Assert(rb->size >= txn->total_size); + /* skip the transaction if aborted */ + if (ReorderBufferTruncateTXNIfAborted(rb, txn)) + { + /* All changes should be discarded */ + Assert(txn->size == 0 && txn->total_size == 0); + continue; + } + ReorderBufferStreamTXN(rb, txn); } else @@ -3687,6 +3781,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb) Assert(txn->size > 0); Assert(rb->size >= txn->size); + /* skip the transaction if aborted */ + if (ReorderBufferTruncateTXNIfAborted(rb, txn)) + { + /* All changes should be discarded */ + Assert(txn->size == 0 && txn->total_size == 0); + continue; + } 2) txn->txn_flags can be moved to the next line to keep it within 80 chars in this case: * Check the transaction status by looking CLOG and discard all changes if * the transaction is aborted. The transaction status is cached in txn->txn_flags * so we can skip future changes and avoid CLOG lookups on the next call. Return 3) Is there any scenario where the Assert can fail as the toast is not reset: + * Since we don't check the transaction status while replaying the + * transaction, we don't need to reset toast reconstruction data here. + */ + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false); + if (ReorderBufferTruncateTXNIfAborted(rb, txn)) + { + /* All changes should be discarded */ + Assert(txn->size == 0 && txn->total_size == 0); + continue; + } 4) This can be changed to a single line comment: 
+ */ + if (rbtxn_is_committed(txn)) + return false; Regards, Vignesh
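[Editorial note] Vignesh's point 1 — hoisting the "all changes discarded" Assert into the helper so callers don't repeat it — is the usual postcondition-in-callee pattern. A minimal standalone sketch under assumed names (DemoTXN and its fields are invented; the real function is ReorderBufferTruncateTXNIfAborted, whose truncation path is far more involved):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoTXN
{
    size_t      size;           /* memory used by this (sub)transaction */
    size_t      total_size;     /* memory used including subtransactions */
    bool        is_aborted;
} DemoTXN;

/*
 * Truncate the transaction's changes if it is known to be aborted and
 * return true in that case.  Asserting the postcondition here means
 * every caller gets the check without repeating it at each call site.
 */
static bool
demo_truncate_txn_if_aborted(DemoTXN *txn)
{
    if (!txn->is_aborted)
        return false;

    /* Discard accumulated changes. */
    txn->size = 0;
    txn->total_size = 0;

    /* All changes should be discarded. */
    assert(txn->size == 0 && txn->total_size == 0);
    return true;
}
```

Callers then reduce to `if (demo_truncate_txn_if_aborted(txn)) continue;`, matching the shape of the two ReorderBufferCheckMemoryLimit hunks quoted above.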