Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs
From | Amit Kapila |
---|---|
Subject | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Date | |
Msg-id | CAA4eK1LMgqeT_bPZ3MH-VKvwOqpZyfJmF7knZhu1rqt2Pqsnwg@mail.gmail.com |
In response to | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
List | pgsql-bugs |
On Wed, May 21, 2025 at 11:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 19, 2025 at 8:08 PM Duncan Sands
> <duncan.sands@deepbluecap.com> wrote:
> >
> > While it is long, it doesn't seem to merit allocating anything like 1GB of
> > memory. So I'm guessing that postgres is miscalculating the required size somehow.
> >
>
> We fixed a bug in commit 4909b38af0 to distribute invalidation at the
> transaction end to avoid data loss in certain cases, which could cause
> such a problem. I am wondering that even prior to that commit, we
> would eventually end up allocating the required memory for a
> transaction for all the invalidations because of repalloc in
> ReorderBufferAddInvalidations, so why it matter with this commit? One
> possibility is that we need allocations for multiple in-progress
> transactions now.
>

I think the problem here is that when we distribute invalidations to a
concurrent transaction, in addition to queuing the invalidations as a
change, we also copy the distributed invalidations along with the
original transaction's invalidations via repalloc in
ReorderBufferAddInvalidations. So, when there are many in-progress
transactions, each would try to copy all of its accumulated
invalidations to the remaining in-progress transactions. This could
lead to such an increase in the allocation request size.

However, after queuing the change, we don't need to copy it along with
the original transaction's invalidations. This is because the copy is
only required when we don't process any changes, in cases like
ReorderBufferForget(). I have analyzed all such cases, and my analysis
is as follows:

ReorderBufferForget()
---------------------
It is okay not to perform the invalidations that we got from other
concurrent transactions during ReorderBufferForget(). This is because
ReorderBufferForget() executes invalidations when we skip the
transaction being decoded, as it is not from a database of interest.
So, we execute them only to invalidate shared catalogs (see the comment
at the caller of ReorderBufferForget()). It is sufficient to execute
such invalidations in the source transaction only because the
transaction being skipped wouldn't have loaded anything in the shared
catalog.

ReorderBufferAbort()
--------------------
ReorderBufferAbort() processes invalidations when it has already
streamed some changes. Whenever it streamed a change, it would have
processed the concurrent transactions' invalidation messages that
happened before the statement that led to streaming. That should be
sufficient for us.

Consider the following variant of the original case that required the
distribution of invalidations:

1) S1: CREATE TABLE d(data text not null);
2) S1: INSERT INTO d VALUES('d1');
3) S2: BEGIN; INSERT INTO d VALUES('d2');
4) S1: ALTER PUBLICATION pb ADD TABLE d;
5) S2: INSERT INTO unrelated_tab VALUES(1);
6) S2: ROLLBACK;
7) S2: INSERT INTO d VALUES('d3');
8) S1: INSERT INTO d VALUES('d4');

The problem with the sequence is that the insert from 3) could be
decoded *after* 4) in step 5) due to streaming, and that to decode the
insert (which happened before the ALTER) the catalog snapshot and cache
state are from *before* the ALTER PUBLICATION. Because the transaction
started in 3) doesn't modify any catalogs, no invalidations are
executed after decoding it. The result could be that the cache looks
like it did at 3), not like after 4). However, this won't create a
problem because while streaming at 5), we would execute the
invalidations from S1 due to the change added via the
REORDER_BUFFER_CHANGE_INVALIDATION message in
ReorderBufferAddInvalidations.

ReorderBufferInvalidate
-----------------------
The reasoning is the same as for ReorderBufferForget(): it executes
invalidations for the same reason, but through a different function to
avoid the cleanup of the buffer at the end.

XLOG_XACT_INVALIDATIONS
-----------------------
While processing XLOG_XACT_INVALIDATIONS, we don't need the
invalidations accumulated from other xacts because this is a special
case to execute invalidations from a particular command (DDL) in a
transaction. It won't build any cache, so it can't create any invalid
state.

-- 
With Regards,
Amit Kapila.
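
As a rough illustration of the growth described above, here is a toy model in Python. It is not PostgreSQL code: the function name, its parameters, and the two counters are made up for this sketch, and it assumes that all transactions overlap, that each starts with the same number of its own invalidation messages, and that they commit one after another.

```python
def simulate(num_txns, own_msgs, copy_distributed):
    """Toy model of distributing invalidations at commit (hypothetical).

    Returns (peak_accumulated, peak_queued):
      peak_accumulated - largest per-transaction invalidation array, i.e. the
                         array that would be grown with repalloc
      peak_queued      - largest number of messages queued as changes
    """
    accumulated = [own_msgs] * num_txns  # messages held in each txn's array
    queued = [0] * num_txns              # messages queued as changes

    for committer in range(num_txns):
        # At commit, send everything this txn has accumulated to every
        # transaction that is still in progress.
        sent = accumulated[committer]
        for other in range(committer + 1, num_txns):
            queued[other] += sent
            if copy_distributed:
                # Also append the received messages to the receiver's own
                # array, so they get redistributed when the receiver commits.
                accumulated[other] += sent

    return max(accumulated), max(queued)


if __name__ == "__main__":
    for copy in (True, False):
        peak_acc, peak_q = simulate(10, 100, copy)
        print(f"copy_distributed={copy}: array={peak_acc}, queued={peak_q}")
```

In this model, with 10 overlapping transactions of 100 messages each, the largest array reaches 51200 entries when received messages are copied (it doubles at every commit), while it stays at 100 when they are only queued as changes. The real numbers depend on how transactions actually overlap, but the doubling pattern is the kind of growth that could explain an unexpectedly large allocation request.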