Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Date | |
Msg-id | CAD21AoBhDBqavSkxr+0GCxom1Q7P7guY5Ees20Wda=YZLFVfCA@mail.gmail.com Whole thread Raw |
In response to | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
|
List | pgsql-bugs |
On Tue, Jun 3, 2025 at 12:07 AM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 2 Jun 2025 at 22:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Thank you for updating the patch. Here are some review comments: > > > > + req_mem_size = sizeof(SharedInvalidationMessage) * > > (txn->ninvalidations_distr + nmsgs); > > + > > + /* > > + * If the number of invalidation messages is larger than 8MB, it's more > > + * efficient to invalidate the entire cache rather than processing each > > + * message individually. > > + */ > > + if (req_mem_size > (8 * 1024 * 1024) || rbtxn_inval_all_cache(txn)) > > > > It's better to define the maximum number of distributed inval messages > > per transaction as a macro instead of calculating the memory size > > every time. > > Modified > > > --- > > +static void > > +ReorderBufferAddInvalidationsCommon(ReorderBuffer *rb, TransactionId xid, > > + XLogRecPtr lsn, Size nmsgs, > > + SharedInvalidationMessage *msgs, > > + ReorderBufferTXN *txn, > > + bool for_inval) > > > > This function is quite confusing to me. For instance, > > ReorderBufferAddDistributedInvalidations() needs to call this function > > with for_inval=false in spite of adding inval messages actually. Also, > > the following condition seems not intuisive but there is no comment: > > > > if (!for_inval || (for_inval && !rbtxn_inval_all_cache(txn))) > > > > Instead of having ReorderBufferAddInvalidationsCommon(), I think we > > can have a function say ReorderBufferQueueInvalidations() where we > > enqueue the given inval messages as a > > REORDER_BUFFER_CHANGE_INVALIDATION change. > > ReorderBufferAddInvalidations() adds inval messages to > > txn->invalidations and calls that function, while > > ReorderBufferQueueInvalidations() adds inval messages to > > txn->distributed_ivnalidations and calls that function if the array is > > not full. > > Modified > > > BTW if we need to invalidate all accumulated caches at the end of > > transaction replay anyway, we don't need to add inval messages to > > txn->invalidations once txn->distributed_invalidations gets full? > > yes, no need to add invalidation messages to txn->invalidation once > RBTXN_INVAL_ALL_CACHE is set. This is handled now. > > The attached v9 version patch has the changes for the same. Thank you for updating the patch. Here are review comments on v9 patch: +/* + * Maximum number of distributed invalidation messages per transaction. + * Each message is ~16 bytes, this allows up to 8 MB of invalidation + * message data. + */ +#define MAX_DISTR_INVAL_MSG_PER_TXN 524288 The size of SharedInvalidationMessage could change in the future so we should calculate it at compile time. --- + /* + * If the complete cache will be invalidated, we don't need to accumulate + * the invalidations. + */ + if (!rbtxn_inval_all_cache(txn)) + ReorderBufferAccumulateInvalidations(&txn->ninvalidations, + &txn->invalidations, nmsgs, msgs); We need to explain why we don't check the number of invalidation messages for txn->invalidations and mark it as inval-all-cache, unlike ReorderBufferAddDistributedInvalidations(). --- + /* + * If the number of invalidation messages is high, performing a full cache + * invalidation is more efficient than handling each message separately. + */ + if (((nmsgs + txn->ninvalidations_distributed) > MAX_DISTR_INVAL_MSG_PER_TXN) || + rbtxn_inval_all_cache(txn)) { - txn->invalidations = (SharedInvalidationMessage *) - repalloc(txn->invalidations, sizeof(SharedInvalidationMessage) * - (txn->ninvalidations + nmsgs)); + txn->txn_flags |= RBTXN_INVAL_ALL_CACHE; I think we don't need to mark the transaction as RBTXN_INVAL_ALL_CACHE again. I'd rewrite the logic as follows: if (txn->ninvalidations_distributed + nmsgs >= MAX_DISTR_INVAL_MSG_PER_TXN) { /* mark the txn as inval-all-cache */ .... /* free the accumulated inval msgs */ .... } if (!rbtxn_inval_all_cache(txn)) ReorderBufferAccumulateInvalidations(...); --- - ReorderBufferAddInvalidations(builder->reorder, txn->xid, lsn, - ninvalidations, msgs); + ReorderBufferAddDistributedInvalidations(builder->reorder, + txn->xid, lsn, + ninvalidations, msgs); I think we need some comments here to explain why we need to distribute only inval messages coming from the current transaction. --- +/* Should the complete cache be invalidated? */ +#define rbtxn_inval_all_cache(txn) \ +( \ + ((txn)->txn_flags & RBTXN_INVAL_ALL_CACHE) != 0 \ +) I find that if we rename the flag to something like RBTXN_INVAL_OVERFLOWED, it would explain the state of the transaction clearer. --- Can we have a reasonable test case that covers the inval message overflow cases? I've attached a patch for some changes and adding more comments (note that it still has XXX comments). Please include these changes that you agreed with in the next version patch. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-bugs by date: