Thread: Fix memory counter update in reorderbuffer
Hi, I found a bug in the memory counter update in reorderbuffer. It was introduced by commit 5bec1d6bc5e, so pg17 and master are affected. In ReorderBufferCleanupTXN() we zero the transaction size and then free the transaction entry as follows: /* Update the memory counter */ ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); /* deallocate */ ReorderBufferReturnTXN(rb, txn); However, if the transaction entry has toast changes we free them in ReorderBufferToastReset() called from ReorderBufferToastReset(), and end up subtracting the memory usage from zero. Which results in an assertion failure: TRAP: failed Assert("(rb->size >= sz) && (txn->size >= sz)"), File: "reorderbuffer.c" This can happen for example if an error occurs while replaying transaction changes including toast changes. I've attached the patch that fixes the problem and includes a test case (the test part might not be committed as it slows the test case). Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I found a bug in the memory counter update in reorderbuffer. It was > introduced by commit 5bec1d6bc5e, so pg17 and master are affected. > > In ReorderBufferCleanupTXN() we zero the transaction size and then > free the transaction entry as follows: > > /* Update the memory counter */ > ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); > > /* deallocate */ > ReorderBufferReturnTXN(rb, txn); > Why do we need to zero the transaction size explicitly? Shouldn't it automatically become zero after freeing all the changes? > However, if the transaction entry has toast changes we free them in > ReorderBufferToastReset() called from ReorderBufferToastReset(), > Here, you mean ReorderBufferToastReset() called from ReorderBufferReturnTXN(), right? BTW, commit 5bec1d6bc5e also introduced ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which is also worth considering while fixing the reported problem. It may not have the same problem as you have reported but we can consider whether setting txn size as zero explicitly is required or not. -- With Regards, Amit Kapila.
On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I found a bug in the memory counter update in reorderbuffer. It was > > introduced by commit 5bec1d6bc5e, so pg17 and master are affected. > > > > In ReorderBufferCleanupTXN() we zero the transaction size and then > > free the transaction entry as follows: > > > > /* Update the memory counter */ > > ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); > > > > /* deallocate */ > > ReorderBufferReturnTXN(rb, txn); > > > > Why do we need to zero the transaction size explicitly? Shouldn't it > automatically become zero after freeing all the changes? It will become zero after freeing all the changes. However, since updating the max-heap when freeing each change could bring some overhead, we freed the changes without updating the memory counter, and then zerod it. > > > However, if the transaction entry has toast changes we free them in > > ReorderBufferToastReset() called from ReorderBufferToastReset(), > > > > Here, you mean ReorderBufferToastReset() called from > ReorderBufferReturnTXN(), right? Right. Thank you for pointing it out. > BTW, commit 5bec1d6bc5e also introduced > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > is also worth considering while fixing the reported problem. It may > not have the same problem as you have reported but we can consider > whether setting txn size as zero explicitly is required or not. The reason why it introduced ReorderBufferChangeMemoryUpdate() is the same as I mentioned above. And yes, as you mentioned, it doesn't have the same problem that I reported here. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > I found a bug in the memory counter update in reorderbuffer. It was > > > introduced by commit 5bec1d6bc5e, so pg17 and master are affected. > > > > > > In ReorderBufferCleanupTXN() we zero the transaction size and then > > > free the transaction entry as follows: > > > > > > /* Update the memory counter */ > > > ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); > > > > > > /* deallocate */ > > > ReorderBufferReturnTXN(rb, txn); > > > > > > > Why do we need to zero the transaction size explicitly? Shouldn't it > > automatically become zero after freeing all the changes? > > It will become zero after freeing all the changes. However, since > updating the max-heap when freeing each change could bring some > overhead, we freed the changes without updating the memory counter, > and then zerod it. > I think this should be covered in comments as it is not apparent. > > > BTW, commit 5bec1d6bc5e also introduced > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > > is also worth considering while fixing the reported problem. It may > > not have the same problem as you have reported but we can consider > > whether setting txn size as zero explicitly is required or not. > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the > same as I mentioned above. And yes, as you mentioned, it doesn't have > the same problem that I reported here. > I checked again and found that ReorderBufferResetTXN() first calls ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After that, it also tries to free spec_insert change which should have the same problem. So, what saves this path from the same problem? * + /* + * Update the memory counter of the transaction, removing it from + * the max-heap. + */ + ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); + Assert(txn->size == 0); + pfree(txn); Just before freeing the TXN, updating the size looks odd though I understand the idea is to remove TXN from max_heap. Anyway, let's first discuss whether the same problem exists in ReorderBufferResetTXN() code path, and if so, how we want to fix it. -- With Regards, Amit Kapila.
On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > I found a bug in the memory counter update in reorderbuffer. It was > > > > introduced by commit 5bec1d6bc5e, so pg17 and master are affected. > > > > > > > > In ReorderBufferCleanupTXN() we zero the transaction size and then > > > > free the transaction entry as follows: > > > > > > > > /* Update the memory counter */ > > > > ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); > > > > > > > > /* deallocate */ > > > > ReorderBufferReturnTXN(rb, txn); > > > > > > > > > > Why do we need to zero the transaction size explicitly? Shouldn't it > > > automatically become zero after freeing all the changes? > > > > It will become zero after freeing all the changes. However, since > > updating the max-heap when freeing each change could bring some > > overhead, we freed the changes without updating the memory counter, > > and then zerod it. > > > > I think this should be covered in comments as it is not apparent. Agreed. > > > > > > BTW, commit 5bec1d6bc5e also introduced > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > > > is also worth considering while fixing the reported problem. It may > > > not have the same problem as you have reported but we can consider > > > whether setting txn size as zero explicitly is required or not. > > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the > > same as I mentioned above. And yes, as you mentioned, it doesn't have > > the same problem that I reported here. > > > > I checked again and found that ReorderBufferResetTXN() first calls > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After > that, it also tries to free spec_insert change which should have the > same problem. So, what saves this path from the same problem? Good catch. I've not created a test case for that but it seems to be possible to happen. I think that subtracting txn->size to reduce the memory counter to zero seems to be a wrong idea in the first place. If we want to save updating memory counter and max-heap, we should use the exact memory size that we freed. In other words, just change the memory usage update to a batch operation. > > * > + /* > + * Update the memory counter of the transaction, removing it from > + * the max-heap. > + */ > + ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size); > + Assert(txn->size == 0); > + > pfree(txn); > > Just before freeing the TXN, updating the size looks odd though I > understand the idea is to remove TXN from max_heap. Anyway, let's > first discuss whether the same problem exists in > ReorderBufferResetTXN() code path, and if so, how we want to fix it. Agreed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > BTW, commit 5bec1d6bc5e also introduced > > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > > > > is also worth considering while fixing the reported problem. It may > > > > not have the same problem as you have reported but we can consider > > > > whether setting txn size as zero explicitly is required or not. > > > > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the > > > same as I mentioned above. And yes, as you mentioned, it doesn't have > > > the same problem that I reported here. > > > > > > > I checked again and found that ReorderBufferResetTXN() first calls > > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After > > that, it also tries to free spec_insert change which should have the > > same problem. So, what saves this path from the same problem? > > Good catch. I've not created a test case for that but it seems to be > possible to happen. > > I think that subtracting txn->size to reduce the memory counter to > zero seems to be a wrong idea in the first place. If we want to save > updating memory counter and max-heap, we should use the exact memory > size that we freed. In other words, just change the memory usage > update to a batch operation. > Sounds reasonable but how would you find the size for a batch update operation? Are you planning to track it while freeing the individual changes? -- With Regards, Amit Kapila.
On Fri, Aug 9, 2024 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > > > > > BTW, commit 5bec1d6bc5e also introduced > > > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > > > > > is also worth considering while fixing the reported problem. It may > > > > > not have the same problem as you have reported but we can consider > > > > > whether setting txn size as zero explicitly is required or not. > > > > > > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the > > > > same as I mentioned above. And yes, as you mentioned, it doesn't have > > > > the same problem that I reported here. > > > > > > > > > > I checked again and found that ReorderBufferResetTXN() first calls > > > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After > > > that, it also tries to free spec_insert change which should have the > > > same problem. So, what saves this path from the same problem? > > > > Good catch. I've not created a test case for that but it seems to be > > possible to happen. > > > > I think that subtracting txn->size to reduce the memory counter to > > zero seems to be a wrong idea in the first place. If we want to save > > updating memory counter and max-heap, we should use the exact memory > > size that we freed. In other words, just change the memory usage > > update to a batch operation. > > > > Sounds reasonable but how would you find the size for a batch update > operation? Are you planning to track it while freeing the individual > changes? Yes, one idea is to make ReorderBufferReturnChange() return the memory size that it just freed. Then the caller who wants to update the memory counter in a batch sums up the memory size. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Sat, Aug 10, 2024 at 5:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Aug 9, 2024 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > > > > > > > > > BTW, commit 5bec1d6bc5e also introduced > > > > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which > > > > > > is also worth considering while fixing the reported problem. It may > > > > > > not have the same problem as you have reported but we can consider > > > > > > whether setting txn size as zero explicitly is required or not. > > > > > > > > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the > > > > > same as I mentioned above. And yes, as you mentioned, it doesn't have > > > > > the same problem that I reported here. > > > > > > > > > > > > > I checked again and found that ReorderBufferResetTXN() first calls > > > > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After > > > > that, it also tries to free spec_insert change which should have the > > > > same problem. So, what saves this path from the same problem? > > > > > > Good catch. I've not created a test case for that but it seems to be > > > possible to happen. > > > > > > I think that subtracting txn->size to reduce the memory counter to > > > zero seems to be a wrong idea in the first place. If we want to save > > > updating memory counter and max-heap, we should use the exact memory > > > size that we freed. In other words, just change the memory usage > > > update to a batch operation. > > > > > > > Sounds reasonable but how would you find the size for a batch update > > operation? Are you planning to track it while freeing the individual > > changes? > > Yes, one idea is to make ReorderBufferReturnChange() return the memory > size that it just freed. Then the caller who wants to update the > memory counter in a batch sums up the memory size. I've drafted the patch. I didn't change ReorderBufferReturnChange() but called ReorderBufferChangeSize() for individual change instead, as it's simpler.gi Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com