Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Date | |
Msg-id | CAD21AoCPJS3=SY28X5X_sfKzMA8PU3y0nm16ReyboFdX8=gRfg@mail.gmail.com Whole thread Raw |
In response to | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
|
List | pgsql-bugs |
On Fri, May 30, 2025 at 5:15 AM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, 30 May 2025 at 10:12, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, May 30, 2025 at 9:31 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Thu, 29 May 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > I agree that chances are much lower than current if txn->invalidations > > > > > doesn't contain invalidations from other transactions, but it is not > > > > > clear what exactly you are trying to advocate by it. Are you trying to > > > > > advocate that we should maintain a member similar to txn->invalidation > > > > > (say txn->distributed_invals) instead of a queue? > > > > > > > > Yes, because I guess it's much simpler. I think it would not be a good > > > > idea to introduce a new concept of accounting the memory usage of the > > > > distributed inval messages too and serializing them, at least on back > > > > branches. I think that In case where the txn->distriubted_inval is > > > > about to overflow (not has to be 1GB) we can invalidate all caches > > > > instread. > > > > > > > I agree that it would be simpler, and to avoid invalid memory > > allocation requests even for rare cases, we can have the backup logic > > to invalidate all caches. > > > > > To identify overflow scenarios, I’m considering the following options: > > > a) Introduce a new txn_flags value, such as RBTXN_INVAL_ALL_CACHE, to > > > explicitly mark transactions that require full cache invalidation. > > > b) Add a dedicated parameter to indicate an overflow scenario. > > > c) setting the newly added nentries_distr to -1, to indicate an > > > overflow scenario. > > > > > > Do you have any preference or thoughts on which of these approaches > > > would be cleaner? > > > > > > > I would prefer (a) as that is an explicit way to indicate that we need > > to invalidate all caches. But let us see if Sawada-san has something > > else in mind. > > The attached v6 version patch has the changes for the same. > Thoughts? Thank you for updating the patch. Here are some review comments: @@ -3439,9 +3464,27 @@ ReorderBufferAddInvalidations(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn, Size nmsgs, SharedInvalidationMessage *msgs) { - ReorderBufferTXN *txn; + ReorderBufferAddInvalidationsExtended(rb, xid, lsn, nmsgs, msgs, false); +} If the patch is the changes for master do we need to have an extended version of ReorderBufferAddInvalidation()? --- + /* + * Make sure there's no cache pollution. Unlike the PG_TRY part, + * this must be done unconditionally because the processing might + * fail before we reach invalidation messages. + */ + if (rbtxn_inval_all_cache(txn)) + InvalidateSystemCaches(); + else + ReorderBufferExecuteInvalidations(txn->ninvalidations_distr, + txn->distributed_invalidations); + If we don't need to execute the distributed inval message in an error path other than detecting concurrent abort, we should describe the reason. --- Given that we don't account the memory usage of both txn->invalidations and txn->distributed_invalidations, probably we can have a lower limit, say 8MB (or lower?), to avoid memory exhaustion. --- + if ((for_inval && !AllocSizeIsValid(req_mem_size)) || + rbtxn_inval_all_cache(txn)) { - txn->ninvalidations = nmsgs; - txn->invalidations = (SharedInvalidationMessage *) - palloc(sizeof(SharedInvalidationMessage) * nmsgs); - memcpy(txn->invalidations, msgs, - sizeof(SharedInvalidationMessage) * nmsgs); + txn->txn_flags |= RBTXN_INVAL_ALL_CACHE; + + if (*invalidations) + { + pfree(*invalidations); + *invalidations = NULL; + *ninvalidations = 0; + } RBTXN_INVAL_ALL_CACHE seems to have an effect only on the distributed inval messages. One question is do we need to care about the overflow of txn->invalidations as well? If no, does it make sense to have a separate function like ReorderBufferAddDistributedInvalidtions() instead of having an extended version of ReorderBufferAddInvalidations()? Some common routines can also be declared as a static function if needed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-bugs by date: