Thread: Fix memory counter update in reorderbuffer

Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
Hi,

I found a bug in the memory counter update in reorderbuffer. It was
introduced by commit 5bec1d6bc5e, so pg17 and master are affected.

In ReorderBufferCleanupTXN() we zero the transaction size and then
free the transaction entry as follows:

    /* Update the memory counter */
    ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);

    /* deallocate */
    ReorderBufferReturnTXN(rb, txn);

However, if the transaction entry has toast changes we free them in
ReorderBufferToastReset() called from ReorderBufferToastReset(), and
end up subtracting the memory usage from zero. Which results in an
assertion failure:

TRAP: failed Assert("(rb->size >= sz) && (txn->size >= sz)"), File:
"reorderbuffer.c"

This can happen for example if an error occurs while replaying
transaction changes including toast changes.

I've attached the patch that fixes the problem and includes a test
case (the test part might not be committed as it slows the test case).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Fix memory counter update in reorderbuffer

From
Amit Kapila
Date:
On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I found a bug in the memory counter update in reorderbuffer. It was
> introduced by commit 5bec1d6bc5e, so pg17 and master are affected.
>
> In ReorderBufferCleanupTXN() we zero the transaction size and then
> free the transaction entry as follows:
>
>     /* Update the memory counter */
>     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
>
>     /* deallocate */
>     ReorderBufferReturnTXN(rb, txn);
>

Why do we need to zero the transaction size explicitly? Shouldn't it
automatically become zero after freeing all the changes?

> However, if the transaction entry has toast changes we free them in
> ReorderBufferToastReset() called from ReorderBufferToastReset(),
>

Here, you mean ReorderBufferToastReset() called from
ReorderBufferReturnTXN(), right?

BTW, commit 5bec1d6bc5e also introduced
ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
is also worth considering while fixing the reported problem. It may
not have the same problem as you have reported but we can consider
whether setting txn size as zero explicitly is required or not.

--
With Regards,
Amit Kapila.



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I found a bug in the memory counter update in reorderbuffer. It was
> > introduced by commit 5bec1d6bc5e, so pg17 and master are affected.
> >
> > In ReorderBufferCleanupTXN() we zero the transaction size and then
> > free the transaction entry as follows:
> >
> >     /* Update the memory counter */
> >     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
> >
> >     /* deallocate */
> >     ReorderBufferReturnTXN(rb, txn);
> >
>
> Why do we need to zero the transaction size explicitly? Shouldn't it
> automatically become zero after freeing all the changes?

It will become zero after freeing all the changes. However, since
updating the max-heap when freeing each change could bring some
overhead, we freed the changes without updating the memory counter,
and then zerod it.

>
> > However, if the transaction entry has toast changes we free them in
> > ReorderBufferToastReset() called from ReorderBufferToastReset(),
> >
>
> Here, you mean ReorderBufferToastReset() called from
> ReorderBufferReturnTXN(), right?

Right. Thank you for pointing it out.

> BTW, commit 5bec1d6bc5e also introduced
> ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> is also worth considering while fixing the reported problem. It may
> not have the same problem as you have reported but we can consider
> whether setting txn size as zero explicitly is required or not.

The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
same as I mentioned above. And yes, as you mentioned, it doesn't have
the same problem that I reported here.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Fix memory counter update in reorderbuffer

From
Amit Kapila
Date:
On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I found a bug in the memory counter update in reorderbuffer. It was
> > > introduced by commit 5bec1d6bc5e, so pg17 and master are affected.
> > >
> > > In ReorderBufferCleanupTXN() we zero the transaction size and then
> > > free the transaction entry as follows:
> > >
> > >     /* Update the memory counter */
> > >     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
> > >
> > >     /* deallocate */
> > >     ReorderBufferReturnTXN(rb, txn);
> > >
> >
> > Why do we need to zero the transaction size explicitly? Shouldn't it
> > automatically become zero after freeing all the changes?
>
> It will become zero after freeing all the changes. However, since
> updating the max-heap when freeing each change could bring some
> overhead, we freed the changes without updating the memory counter,
> and then zerod it.
>

I think this should be covered in comments as it is not apparent.

>
> > BTW, commit 5bec1d6bc5e also introduced
> > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> > is also worth considering while fixing the reported problem. It may
> > not have the same problem as you have reported but we can consider
> > whether setting txn size as zero explicitly is required or not.
>
> The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
> same as I mentioned above. And yes, as you mentioned, it doesn't have
> the same problem that I reported here.
>

I checked again and found that ReorderBufferResetTXN() first calls
ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After
that, it also tries to free spec_insert change which should have the
same problem. So, what saves this path from the same problem?

*
+ /*
+ * Update the memory counter of the transaction, removing it from
+ * the max-heap.
+ */
+ ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
+ Assert(txn->size == 0);
+
  pfree(txn);

Just before freeing the TXN, updating the size looks odd though I
understand the idea is to remove TXN from max_heap. Anyway, let's
first discuss whether the same problem exists in
ReorderBufferResetTXN() code path, and if so, how we want to fix it.

--
With Regards,
Amit Kapila.



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 6, 2024 at 1:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Aug 3, 2024 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I found a bug in the memory counter update in reorderbuffer. It was
> > > > introduced by commit 5bec1d6bc5e, so pg17 and master are affected.
> > > >
> > > > In ReorderBufferCleanupTXN() we zero the transaction size and then
> > > > free the transaction entry as follows:
> > > >
> > > >     /* Update the memory counter */
> > > >     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
> > > >
> > > >     /* deallocate */
> > > >     ReorderBufferReturnTXN(rb, txn);
> > > >
> > >
> > > Why do we need to zero the transaction size explicitly? Shouldn't it
> > > automatically become zero after freeing all the changes?
> >
> > It will become zero after freeing all the changes. However, since
> > updating the max-heap when freeing each change could bring some
> > overhead, we freed the changes without updating the memory counter,
> > and then zerod it.
> >
>
> I think this should be covered in comments as it is not apparent.

Agreed.

>
> >
> > > BTW, commit 5bec1d6bc5e also introduced
> > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> > > is also worth considering while fixing the reported problem. It may
> > > not have the same problem as you have reported but we can consider
> > > whether setting txn size as zero explicitly is required or not.
> >
> > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
> > same as I mentioned above. And yes, as you mentioned, it doesn't have
> > the same problem that I reported here.
> >
>
> I checked again and found that ReorderBufferResetTXN() first calls
> ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After
> that, it also tries to free spec_insert change which should have the
> same problem. So, what saves this path from the same problem?

Good catch. I've not created a test case for that but it seems to be
possible to happen.

I think that subtracting txn->size to reduce the memory counter to
zero seems to be a wrong idea in the first place. If we want to save
updating memory counter and max-heap, we should use the exact memory
size that we freed. In other words, just change the memory usage
update to a batch operation.

>
> *
> + /*
> + * Update the memory counter of the transaction, removing it from
> + * the max-heap.
> + */
> + ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);
> + Assert(txn->size == 0);
> +
>   pfree(txn);
>
> Just before freeing the TXN, updating the size looks odd though I
> understand the idea is to remove TXN from max_heap. Anyway, let's
> first discuss whether the same problem exists in
> ReorderBufferResetTXN() code path, and if so, how we want to fix it.

Agreed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Fix memory counter update in reorderbuffer

From
Amit Kapila
Date:
On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > >
> > > > BTW, commit 5bec1d6bc5e also introduced
> > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> > > > is also worth considering while fixing the reported problem. It may
> > > > not have the same problem as you have reported but we can consider
> > > > whether setting txn size as zero explicitly is required or not.
> > >
> > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
> > > same as I mentioned above. And yes, as you mentioned, it doesn't have
> > > the same problem that I reported here.
> > >
> >
> > I checked again and found that ReorderBufferResetTXN() first calls
> > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After
> > that, it also tries to free spec_insert change which should have the
> > same problem. So, what saves this path from the same problem?
>
> Good catch. I've not created a test case for that but it seems to be
> possible to happen.
>
> I think that subtracting txn->size to reduce the memory counter to
> zero seems to be a wrong idea in the first place. If we want to save
> updating memory counter and max-heap, we should use the exact memory
> size that we freed. In other words, just change the memory usage
> update to a batch operation.
>

Sounds reasonable but how would you find the size for a batch update
operation? Are you planning to track it while freeing the individual
changes?

--
With Regards,
Amit Kapila.



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Fri, Aug 9, 2024 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > >
> > > > > BTW, commit 5bec1d6bc5e also introduced
> > > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> > > > > is also worth considering while fixing the reported problem. It may
> > > > > not have the same problem as you have reported but we can consider
> > > > > whether setting txn size as zero explicitly is required or not.
> > > >
> > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
> > > > same as I mentioned above. And yes, as you mentioned, it doesn't have
> > > > the same problem that I reported here.
> > > >
> > >
> > > I checked again and found that ReorderBufferResetTXN() first calls
> > > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After
> > > that, it also tries to free spec_insert change which should have the
> > > same problem. So, what saves this path from the same problem?
> >
> > Good catch. I've not created a test case for that but it seems to be
> > possible to happen.
> >
> > I think that subtracting txn->size to reduce the memory counter to
> > zero seems to be a wrong idea in the first place. If we want to save
> > updating memory counter and max-heap, we should use the exact memory
> > size that we freed. In other words, just change the memory usage
> > update to a batch operation.
> >
>
> Sounds reasonable but how would you find the size for a batch update
> operation? Are you planning to track it while freeing the individual
> changes?

Yes, one idea is to make ReorderBufferReturnChange() return the memory
size that it just freed. Then the caller who wants to update the
memory counter in a batch sums up the memory size.

Regards,


--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Sat, Aug 10, 2024 at 5:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 9, 2024 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Aug 8, 2024 at 9:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Aug 7, 2024 at 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 7, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > >
> > > > >
> > > > > > BTW, commit 5bec1d6bc5e also introduced
> > > > > > ReorderBufferChangeMemoryUpdate() in ReorderBufferTruncateTXN() which
> > > > > > is also worth considering while fixing the reported problem. It may
> > > > > > not have the same problem as you have reported but we can consider
> > > > > > whether setting txn size as zero explicitly is required or not.
> > > > >
> > > > > The reason why it introduced ReorderBufferChangeMemoryUpdate() is the
> > > > > same as I mentioned above. And yes, as you mentioned, it doesn't have
> > > > > the same problem that I reported here.
> > > > >
> > > >
> > > > I checked again and found that ReorderBufferResetTXN() first calls
> > > > ReorderBufferTruncateTXN() and then ReorderBufferToastReset(). After
> > > > that, it also tries to free spec_insert change which should have the
> > > > same problem. So, what saves this path from the same problem?
> > >
> > > Good catch. I've not created a test case for that but it seems to be
> > > possible to happen.
> > >
> > > I think that subtracting txn->size to reduce the memory counter to
> > > zero seems to be a wrong idea in the first place. If we want to save
> > > updating memory counter and max-heap, we should use the exact memory
> > > size that we freed. In other words, just change the memory usage
> > > update to a batch operation.
> > >
> >
> > Sounds reasonable but how would you find the size for a batch update
> > operation? Are you planning to track it while freeing the individual
> > changes?
>
> Yes, one idea is to make ReorderBufferReturnChange() return the memory
> size that it just freed. Then the caller who wants to update the
> memory counter in a batch sums up the memory size.

I've drafted the patch. I didn't change ReorderBufferReturnChange()
but called ReorderBufferChangeSize() for individual change instead, as
it's simpler.gi

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Fix memory counter update in reorderbuffer

From
Amit Kapila
Date:
On Tue, Aug 20, 2024 at 5:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 12:22 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
>
> Thank you for testing the patch!
>
> I'm slightly hesitant to add a test under src/test/subscription since
> it's a bug in ReorderBuffer and not specific to logical replication.
> If we reasonably cannot add a test under contrib/test_decoding, I'm
> okay with adding it under src/test/subscription.
>

Sounds like a reasonable approach.

> I've attached the updated patch with the commit message (but without a
> test case for now).
>

The patch LGTM except for one minor comment.

+ /* All changes must be returned */
+ Assert(txn->size == 0);

Isn't it better to say: "All changes must be deallocated." in the above comment?

--
With Regards,
Amit Kapila.



Re: Fix memory counter update in reorderbuffer

From
Amit Kapila
Date:
On Fri, Aug 23, 2024 at 3:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Aug 20, 2024 at 9:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> I've updated the updated patch with regression tests.
>

LGTM.

--
With Regards,
Amit Kapila.



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Sun, Aug 25, 2024 at 9:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 23, 2024 at 3:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 20, 2024 at 9:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > I've updated the updated patch with regression tests.
> >
>
> LGTM.

Thank you for reviewing the patch. Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Fix memory counter update in reorderbuffer

From
Nathan Bossart
Date:
On Mon, Aug 26, 2024 at 11:25:53AM -0700, Masahiko Sawada wrote:
> Thank you for reviewing the patch. Pushed.

nitpick: I think this one needs a pgindent run.

-- 
nathan



Re: Fix memory counter update in reorderbuffer

From
Masahiko Sawada
Date:
On Mon, Aug 26, 2024 at 2:27 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Aug 26, 2024 at 11:25:53AM -0700, Masahiko Sawada wrote:
> > Thank you for reviewing the patch. Pushed.
>
> nitpick: I think this one needs a pgindent run.

Thank you for pointing it out. I forgot to check with pgindent. I've
pushed the change. Hope it
fixes the problem.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com