Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs
From | Amit Kapila |
---|---|
Subject | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Date | |
Msg-id | CAA4eK1LqTcH5vYrCUFn=NUHv_9eYrZS6piLA3UUKD4=39xB=fQ@mail.gmail.com Whole thread Raw |
In response to | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
RE: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
List | pgsql-bugs |
On Thu, May 29, 2025 at 12:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, May 26, 2025 at 4:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, May 26, 2025 at 2:52 PM Hayato Kuroda (Fujitsu) > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > > If the above hypothesis is true, we need to consider another idea so > > > > that we can execute invalidation messages in both cases. > > > > > > The straightforward fix is to check the change queue as well when the transaction > > > has invalidation messages. 0003 implemented that. One downside is that traversing > > > changes can affect performance. Currently we iterates all of changes even a > > > single REORDER_BUFFER_CHANGE_INVALIDATION. I cannot find better solutions for now. > > > > > > > It can impact the performance for large transactions with fewer > > invalidations, especially the ones which has spilled changes because > > it needs to traverse the entire list of changes again at the end. > > What if we remember all executed REORDER_BUFFER_CHANGE_INVALIDATION in > a queue while replaying the transaction so that we can execute them at > the end in a non-error path, instead of re-traversing the entire list > of changes to execute the inval messages? > The current proposed patch (v4) is also traversing only the required inval messages, as it has maintained a separate queue for that. So, what will be the advantage of forming such a queue during the processing of changes? Are you imagining a local instead of a queue at ReorderBufferTXN level? I feel we still need at ReorderBufferTXN level to ensure that we can execute those changes across streaming blocks, otherwise, the cleanup of such a local queue would be tricky and add to maintenance effort. One disadvantage of the approach you suggest is that the changes in the new queue won't be accounted for in logical_decoding_work_mem computation, which can be done in the proposed approach, although the patch hasn't implemented it as of now. A few comments on v4: =================== 1. +static void +ReorderBufferExecuteInvalidationsInQueue(ReorderBuffer *rb, ReorderBufferTXN *txn) +{ ... ... + /* Skip other changes because the transaction was aborted */ + case REORDER_BUFFER_CHANGE_INSERT: + case REORDER_BUFFER_CHANGE_UPDATE: + case REORDER_BUFFER_CHANGE_DELETE: + case REORDER_BUFFER_CHANGE_MESSAGE: + case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT: + case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID: + case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID: + case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT: + case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM: + case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT: + case REORDER_BUFFER_CHANGE_TRUNCATE: + break; This cases should never happen, so we should have either an Assert or an elog here. 2. +void +ReorderBufferAddInvalidationsExtended(ReorderBuffer *rb, TransactionId xid, + XLogRecPtr lsn, Size nmsgs, + SharedInvalidationMessage *msgs, + bool needs_distribute) { ... + * XXX: IIUC This must be done only to the toptxns, but is it right? + */ + if (!needs_distribute && !TransactionIdIsValid(txn->toplevel_xid)) + { + ReorderBufferChange *inval_change; + + /* Duplicate the inval change to queue it */ + inval_change = ReorderBufferAllocChange(rb); + inval_change->action = REORDER_BUFFER_CHANGE_INVALIDATION; The name needs_distribute seems confusing because when this function is invoked from SnapBuildDistributeSnapshotAndInval(), the parameter is passed as false; it should be true in that case, and the meaning of this parameter in the function should be reversed. -- With Regards, Amit Kapila.
pgsql-bugs by date: