Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs

From Masahiko Sawada
Subject Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
Date
Msg-id CAD21AoAX6sLVFz2bzDEuHP8mRpHEEmwpt88S4=FD1k_+fynLHg@mail.gmail.com
Whole thread Raw
In response to Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-bugs
On Mon, Jun 2, 2025 at 3:22 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, 31 May 2025 at 13:27, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, 30 May 2025 at 23:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Thank you for updating the patch. Here are some review comments:
> > >
> > > @@ -3439,9 +3464,27 @@ ReorderBufferAddInvalidations(ReorderBuffer
> > > *rb, TransactionId xid,
> > >                               XLogRecPtr lsn, Size nmsgs,
> > >                               SharedInvalidationMessage *msgs)
> > >  {
> > > -   ReorderBufferTXN *txn;
> > > +   ReorderBufferAddInvalidationsExtended(rb, xid, lsn, nmsgs, msgs, false);
> > > +}
> > >
> > > If the patch is the changes for master do we need to have an extended
> > > version of ReorderBufferAddInvalidation()?
> >
> > This has been removed now and ReorderBufferAddDistributedInvalidtions
> > has been added
> >
> > > ---
> > > +           /*
> > > +            * Make sure there's no cache pollution. Unlike the PG_TRY part,
> > > +            * this must be done unconditionally because the processing might
> > > +            * fail before we reach invalidation messages.
> > > +            */
> > > +           if (rbtxn_inval_all_cache(txn))
> > > +               InvalidateSystemCaches();
> > > +           else
> > > +               ReorderBufferExecuteInvalidations(txn->ninvalidations_distr,
> > > +
> > > txn->distributed_invalidations);
> > > +
> > >
> > > If we don't need to execute the distributed inval message in an error
> > > path other than detecting concurrent abort, we should describe the
> > > reason.
> >
> > Removed it to keep it in the common error path
> >
> > > ---
> > > Given that we don't account the memory usage of both
> > > txn->invalidations and txn->distributed_invalidations, probably we can
> > > have a lower limit, say 8MB (or lower?), to avoid memory exhaustion.
> >
> > Modified
> >
> > > ---
> > > +   if ((for_inval && !AllocSizeIsValid(req_mem_size)) ||
> > > +       rbtxn_inval_all_cache(txn))
> > >     {
> > > -       txn->ninvalidations = nmsgs;
> > > -       txn->invalidations = (SharedInvalidationMessage *)
> > > -           palloc(sizeof(SharedInvalidationMessage) * nmsgs);
> > > -       memcpy(txn->invalidations, msgs,
> > > -              sizeof(SharedInvalidationMessage) * nmsgs);
> > > +       txn->txn_flags |= RBTXN_INVAL_ALL_CACHE;
> > > +
> > > +       if (*invalidations)
> > > +       {
> > > +           pfree(*invalidations);
> > > +           *invalidations = NULL;
> > > +           *ninvalidations = 0;
> > > +       }
> > >
> > > RBTXN_INVAL_ALL_CACHE seems to have an effect only on the distributed
> > > inval messages. One question is do we need to care about the overflow
> > > of txn->invalidations as well? If no, does it make sense to have a
> > > separate function like ReorderBufferAddDistributedInvalidtions()
> > > instead of having an extended version of
> > > ReorderBufferAddInvalidations()? Some common routines can also be
> > > declared as a static function if needed.
> >
> > Modified
> >
> > The attached v7 version patch has the changes for the same.
>
> Here is the patch, including updates for the back branches.
> The main difference from master is that the newly added structure
> members are appended at the end in the back branches to preserve
> compatibility. The invalidation addition logic remains consistent with
> master: a new function, ReorderBufferAddDistributedInvalidations, has
> been introduced to handle distributed invalidations. Shared logic
> between ReorderBufferAddInvalidations and
> ReorderBufferAddDistributedInvalidations has been factored out into a
> common helper, ReorderBufferAddInvalidationsCommon. This approach
> simplifies future merges and, in my assessment, does not introduce any
> backward compatibility issues.
>

Thank you for updating the patch. Here are some review comments:

+   req_mem_size = sizeof(SharedInvalidationMessage) *
(txn->ninvalidations_distr + nmsgs);
+
+   /*
+    * If the number of invalidation messages is larger than 8MB, it's more
+    * efficient to invalidate the entire cache rather than processing each
+    * message individually.
+    */
+   if (req_mem_size > (8 * 1024 * 1024) || rbtxn_inval_all_cache(txn))

It's better to define the maximum number of distributed inval messages
per transaction as a macro instead of calculating the memory size
every time.

---
+static void
+ReorderBufferAddInvalidationsCommon(ReorderBuffer *rb, TransactionId xid,
+                                   XLogRecPtr lsn, Size nmsgs,
+                                   SharedInvalidationMessage *msgs,
+                                   ReorderBufferTXN *txn,
+                                   bool for_inval)

This function is quite confusing to me. For instance,
ReorderBufferAddDistributedInvalidations() needs to call this function
with for_inval=false in spite of adding inval messages actually. Also,
the following condition seems not intuisive but there is no comment:

    if (!for_inval || (for_inval && !rbtxn_inval_all_cache(txn)))

Instead of having ReorderBufferAddInvalidationsCommon(), I think we
can have a function say ReorderBufferQueueInvalidations() where we
enqueue the given inval messages as a
REORDER_BUFFER_CHANGE_INVALIDATION change.
ReorderBufferAddInvalidations() adds inval messages to
txn->invalidations and calls that function, while
ReorderBufferQueueInvalidations() adds inval messages to
txn->distributed_ivnalidations and calls that function if the array is
not full.

BTW if we need to invalidate all accumulated caches at the end of
transaction replay anyway, we don't need to add inval messages to
txn->invalidations once txn->distributed_invalidations gets full?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-bugs by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: BUG #18920: LOAD '$libdir/plugins' no longer works in 18beta1
Next
From: Rahila Syed
Date:
Subject: Re: BUG #18920: LOAD '$libdir/plugins' no longer works in 18beta1