Home > mailing lists

Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs

From	Masahiko Sawada
Subject	Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
Date	May 30 20:29:34
Msg-id	CAD21AoCPJS3=SY28X5X_sfKzMA8PU3y0nm16ReyboFdX8=gRfg@mail.gmail.com Whole thread Raw
In response to	Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (vignesh C <vignesh21@gmail.com>)
Responses	Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
List	pgsql-bugs

Tree view

On Fri, May 30, 2025 at 5:15 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 30 May 2025 at 10:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, May 30, 2025 at 9:31 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Thu, 29 May 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > > I agree that chances are much lower than current if txn->invalidations
> > > > > doesn't contain invalidations from other transactions, but it is not
> > > > > clear what exactly you are trying to advocate by it. Are you trying to
> > > > > advocate that we should maintain a member similar to txn->invalidation
> > > > > (say txn->distributed_invals) instead of a queue?
> > > >
> > > > Yes, because I guess it's much simpler. I think it would not be a good
> > > > idea to introduce a new concept of accounting the memory usage of the
> > > > distributed inval messages too and serializing them, at least on back
> > > > branches. I think that In case where the txn->distriubted_inval is
> > > > about to overflow (not has to be 1GB) we can invalidate all caches
> > > > instread.
> > >
> >
> > I agree that it would be simpler, and to avoid invalid memory
> > allocation requests even for rare cases, we can have the backup logic
> > to invalidate all caches.
> >
> > > To identify overflow scenarios, I’m considering the following options:
> > > a) Introduce a new txn_flags value, such as RBTXN_INVAL_ALL_CACHE, to
> > > explicitly mark transactions that require full cache invalidation.
> > > b) Add a dedicated parameter to indicate an overflow scenario.
> > > c) setting the newly added nentries_distr to -1, to indicate an
> > > overflow scenario.
> > >
> > > Do you have any preference or thoughts on which of these approaches
> > > would be cleaner?
> > >
> >
> > I would prefer (a) as that is an explicit way to indicate that we need
> > to invalidate all caches. But let us see if Sawada-san has something
> > else in mind.
>
> The attached v6 version patch has the changes for the same.
> Thoughts?

Thank you for updating the patch. Here are some review comments:

@@ -3439,9 +3464,27 @@ ReorderBufferAddInvalidations(ReorderBuffer
*rb, TransactionId xid,
                              XLogRecPtr lsn, Size nmsgs,
                              SharedInvalidationMessage *msgs)
 {
-   ReorderBufferTXN *txn;
+   ReorderBufferAddInvalidationsExtended(rb, xid, lsn, nmsgs, msgs, false);
+}

If the patch is the changes for master do we need to have an extended
version of ReorderBufferAddInvalidation()?

---
+           /*
+            * Make sure there's no cache pollution. Unlike the PG_TRY part,
+            * this must be done unconditionally because the processing might
+            * fail before we reach invalidation messages.
+            */
+           if (rbtxn_inval_all_cache(txn))
+               InvalidateSystemCaches();
+           else
+               ReorderBufferExecuteInvalidations(txn->ninvalidations_distr,
+
txn->distributed_invalidations);
+

If we don't need to execute the distributed inval message in an error
path other than detecting concurrent abort, we should describe the
reason.

---
Given that we don't account the memory usage of both
txn->invalidations and txn->distributed_invalidations, probably we can
have a lower limit, say 8MB (or lower?), to avoid memory exhaustion.

---
+   if ((for_inval && !AllocSizeIsValid(req_mem_size)) ||
+       rbtxn_inval_all_cache(txn))
    {
-       txn->ninvalidations = nmsgs;
-       txn->invalidations = (SharedInvalidationMessage *)
-           palloc(sizeof(SharedInvalidationMessage) * nmsgs);
-       memcpy(txn->invalidations, msgs,
-              sizeof(SharedInvalidationMessage) * nmsgs);
+       txn->txn_flags |= RBTXN_INVAL_ALL_CACHE;
+
+       if (*invalidations)
+       {
+           pfree(*invalidations);
+           *invalidations = NULL;
+           *ninvalidations = 0;
+       }

RBTXN_INVAL_ALL_CACHE seems to have an effect only on the distributed
inval messages. One question is do we need to care about the overflow
of txn->invalidations as well? If no, does it make sense to have a
separate function like ReorderBufferAddDistributedInvalidtions()
instead of having an extended version of
ReorderBufferAddInvalidations()? Some common routines can also be
declared as a static function if needed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

pgsql-bugs by date:

From: Tom Lane
Date: 30 May, 19:59:58
Subject: Re: [EXT] Re: GSS Auth issue when user member of lots of AD groups

From: PG Bug reporting form
Date: 30 May, 22:39:43
Subject: BUG #18941: PostgreSQL planner does not select a multicolumn btree_gin index under RLS

Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs

Previous

Next