Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 - Mailing list pgsql-bugs
From | Shlok Kyal |
---|---|
Subject | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Date | |
Msg-id | CANhcyEWp_T7tX-yKbdbxdUR144UAZ7oxNM_AORfCvWHZg0ja5w@mail.gmail.com |
In response to | Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (Duncan Sands <duncan.sands@deepbluecap.com>) |
List | pgsql-bugs |
On Mon, 19 May 2025 at 20:08, Duncan Sands <duncan.sands@deepbluecap.com> wrote:
>
> PostgreSQL v17.5 (Ubuntu 17.5-1.pgdg24.04+1); Ubuntu 24.04.2 LTS (kernel
> 6.8.0); x86-64
>
> Good morning from DeepBlueCapital. Soon after upgrading to 17.5 from 17.4, we
> started seeing logical replication failures with publisher errors like this:
>
> ERROR: invalid memory alloc request size 1196493216
>
> (the exact size varies). Here is a typical log extract from the publisher:
>
> 2025-05-19 10:30:14 CEST [1348336-465] remote_production_user@blue DEBUG:
> 00000: write FB03/349DEF90 flush FB03/349DEF90 apply FB03/349DEF90 reply_time
> 2025-05-19 10:30:07.467048+02
> 2025-05-19 10:30:14 CEST [1348336-466] remote_production_user@blue LOCATION:
> ProcessStandbyReplyMessage, walsender.c:2431
> 2025-05-19 10:30:14 CEST [1348336-467] remote_production_user@blue DEBUG:
> 00000: skipped replication of an empty transaction with XID: 207637565
> 2025-05-19 10:30:14 CEST [1348336-468] remote_production_user@blue CONTEXT:
> slot "jnb_production", output plugin "pgoutput", in the commit callback,
> associated LSN FB03/349FF938
> 2025-05-19 10:30:14 CEST [1348336-469] remote_production_user@blue LOCATION:
> pgoutput_commit_txn, pgoutput.c:629
> 2025-05-19 10:30:14 CEST [1348336-470] remote_production_user@blue DEBUG:
> 00000: UpdateDecodingStats: updating stats 0x5ae1616c17a8 0 0 0 0 1 0 1 191
> 2025-05-19 10:30:14 CEST [1348336-471] remote_production_user@blue LOCATION:
> UpdateDecodingStats, logical.c:1943
> 2025-05-19 10:30:14 CEST [1348336-472] remote_production_user@blue DEBUG:
> 00000: found top level transaction 207637519, with catalog changes
> 2025-05-19 10:30:14 CEST [1348336-473] remote_production_user@blue LOCATION:
> SnapBuildCommitTxn, snapbuild.c:1150
> 2025-05-19 10:30:14 CEST [1348336-474] remote_production_user@blue DEBUG:
> 00000: adding a new snapshot and invalidations to 207616976 at FB03/34A1AAE0
> 2025-05-19 10:30:14 CEST [1348336-475] remote_production_user@blue LOCATION:
> SnapBuildDistributeSnapshotAndInval, snapbuild.c:915
> 2025-05-19 10:30:14 CEST [1348336-476] remote_production_user@blue ERROR:
> XX000: invalid memory alloc request size 1196493216
>
> If I'm reading it right, things go wrong on the publisher while preparing the
> message, i.e. it's not a subscriber problem.
>
> This particular instance was triggered by a large number of catalog
> invalidations: I dumped what I think is the relevant WAL with "pg_waldump -s
> FB03/34A1AAE0 -p 17/main/ --xid=207637519" and the output was a single long line:
>
> rmgr: Transaction len (rec/tot): 10665/ 10665, tx: 207637519, lsn:
> FB03/34A1AAE0, prev FB03/34A1A8C8, desc: COMMIT 2025-05-19 08:10:12.880599 CEST;
> dropped stats: 2/17426/661557718 2/17426/661557717 2/17426/661557714
> 2/17426/661557678 2/17426/661557677 2/17426/661557674 2/17426/661557673
> 2/17426/661557672 2/17426/661557669 2/17426/661557618 2/17426/661557617
> 2/17426/661557614; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79
> catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55
> catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 32 catcache 55
> catcache 54 catcache 55 catcache 54 catcache 55 catcache 54 catcache 80 catcache
> 79 catcache 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 32 catcache 55 catcache 54 catcache 55 catcache 54 catcache
> 55 catcache 54 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63
> catcache 63 catcache 63 catcache 55 catcache 54 catcache 80 catcache 79 catcache
> 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache
> 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 32 catcache 55 catcache 54 catcache 55 catcache 54 catcache 55 catcache
> 54 catcache 80 catcache 79 catcache 80 catcache 79 catcache 55 catcache 54
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55
> catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 32 catcache 55 catcache 54
> catcache 55 catcache 54 catcache 55 catcache 54 catcache 63 catcache 63 catcache
> 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63
> catcache 63 catcache 63 catcache 55 catcache 54 catcache 32 catcache 7 catcache
> 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 55 catcache 54 catcache 80 catcache 79 catcache 80 catcache
> 79 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63
> catcache 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 7 catcache
> 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 32
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 80 catcache 79
> catcache 80 catcache 79 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 55 catcache 54 catcache 32 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 55 catcache 54 catcache 80 catcache 79 catcache 80 catcache 79 catcache
> 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63 catcache 63
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54
> catcache 32 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 catcache 80
> catcache 79 catcache 80 catcache 79 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
> catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
> catcache 6 catcache 55 catcache 54 snapshot 2608 relcache 661557614 snapshot
> 1214 relcache 661557617 relcache 661557618 relcache 661557617 snapshot 2608
> relcache 661557617 relcache 661557618 relcache 661557614 snapshot 2608 snapshot
> 2608 relcache 661557669 snapshot 1214 relcache 661557672 relcache 661557673
> relcache 661557672 snapshot 2608 relcache 661557672 relcache 661557673 relcache
> 661557669 snapshot 2608 relcache 661557669 snapshot 2608 relcache 661557674
> snapshot 1214 relcache 661557677 relcache 661557678 relcache 661557677 snapshot
> 2608 relcache 661557677 relcache 661557678 relcache 661557674 snapshot 2608
> snapshot 2608 relcache 661557714 snapshot 1214 relcache 661557717 relcache
> 661557718 relcache 661557717 snapshot 2608 relcache 661557717 relcache 661557718
> relcache 661557714 snapshot 2608 relcache 661557714 relcache 661557718 relcache
> 661557717 snapshot 2608 relcache 661557717 snapshot 2608 snapshot 2608 snapshot
> 2608 relcache 661557714 snapshot 2608 snapshot 1214 relcache 661557678 relcache
> 661557677 snapshot 2608 relcache 661557677 snapshot 2608 snapshot 2608 snapshot
> 2608 relcache 661557674 snapshot 2608 snapshot 1214 relcache 661557673 relcache
> 661557672 snapshot 2608 relcache 661557672 snapshot 2608 snapshot 2608 snapshot
> 2608 relcache 661557669 snapshot 2608 snapshot 1214 relcache 661557618 relcache
> 661557617 snapshot 2608 relcache 661557617 snapshot 2608 snapshot 2608 snapshot
> 2608 relcache 661557614 snapshot 2608 snapshot 1214
>
> While it is long, it doesn't seem to merit allocating anything like 1GB of
> memory. So I'm guessing that postgres is miscalculating the required size somehow.
>
> If I skip over this LSN, for example by dropping the subscription and recreating
> it anew, then things go fine for a while before hitting another "invalid memory
> alloc request", i.e. it wasn't just a one-off. On the other hand, after
> downgrading to 17.4, subscribers spontaneously recovered and the issue has gone
> away. Since I didn't skip over the last LSN of this kind, presumably 17.4
> successfully serialized a message for the same problematic bit of WAL that
> caused 17.5 to blow up, which suggests a regression between 17.4 and 17.5.
>

Hi Duncan,

Thanks for reporting this. I tried adding around 80000 invalidations but could not reproduce the issue.
Can you share the steps to reproduce the above scenario?

Thanks and Regards,
Shlok Kyal
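
Note on the error text: "invalid memory alloc request size" is what PostgreSQL's allocator reports when a single request exceeds MaxAllocSize (0x3fffffff bytes, i.e. 1 GB - 1, defined in src/include/utils/memutils.h). The sketch below only checks the sizes quoted in this thread against that limit. The 16-byte size of one invalidation message is an assumption, not something stated in the thread, and the function names are hypothetical helpers for illustration.

```python
# PostgreSQL's MaxAllocSize (src/include/utils/memutils.h): 1 GB - 1 bytes.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# ASSUMPTION (not stated in the thread): one invalidation message is 16 bytes.
ASSUMED_INVAL_MSG_BYTES = 16


def alloc_size_is_valid(size: int) -> bool:
    """Return True if a single allocation of `size` bytes is within the limit."""
    return 0 <= size <= MAX_ALLOC_SIZE


def implied_message_count(request_bytes: int) -> int:
    """How many assumed-size invalidation messages would fill the request."""
    return request_bytes // ASSUMED_INVAL_MSG_BYTES


if __name__ == "__main__":
    # Sizes taken from the error in the quoted log and from the subject line.
    for request in (1196493216, 1585837200):
        print(
            f"request={request} "
            f"valid={alloc_size_is_valid(request)} "
            f"over_limit_by={request - MAX_ALLOC_SIZE} "
            f"implied_messages={implied_message_count(request)}"
        )
```

If the 16-byte assumption holds, the failing requests would correspond to tens of millions of messages, which is far more than the roughly 80000 invalidations tried above. That would fit the size growing through accumulation rather than coming from one commit record, but this is only an inference and not a confirmed diagnosis.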