Logical Replica ReorderBuffer Size Accounting Issues - Mailing list pgsql-bugs

From	Alex Richman
Subject	Logical Replica ReorderBuffer Size Accounting Issues
Date	January 5, 2023 14:56:23
Msg-id	CAMnUB3oYugXCBLSkih+qNsWQPciEwos6g_AMbnz_peNoxfHwyw@mail.gmail.com
Responses	Re: Logical Replica ReorderBuffer Size Accounting Issues (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-bugs

Hi,

We've noticed an odd memory issue with walsenders for logical replication slots: they experience large spikes in memory usage, up to ~10x over the baseline (from ~500MiB to ~5GiB), exceeding the configured logical_decoding_work_mem. Since we have ~40 active subscriptions, this produces a spike of ~200GiB on the sender, which is quite worrying.

The spikes in memory always slowly ramp up to ~5GB over ~10 minutes, then quickly drop back down to the ~500MB baseline.

logical_decoding_work_mem is configured to 256MB, and streaming is configured on the subscription side, so I would expect the slots to either stream or spill bytes to disk when they reach the 256MB limit, and not get close to 5GiB. However, pg_stat_replication_slots shows 0 spilled or streamed bytes for any slot.

I used GDB to call MemoryContextStats on a walsender process with 5GB usage, which logged this large reorderbuffer context:

--- snip ---

ReorderBuffer: 65536 total in 4 blocks; 64624 free (169 chunks); 912 used

ReorderBufferByXid: 32768 total in 3 blocks; 12600 free (6 chunks); 20168 used

Tuples: 4311744512 total in 514 blocks (12858943 chunks); 6771224 free (12855411 chunks); 4304973288 used

TXN: 16944 total in 2 blocks; 13984 free (46 chunks); 2960 used

Change: 574944 total in 70 blocks; 214944 free (2239 chunks); 360000 used

--- snip ---
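For anyone wanting to total up a dump like this, here is a minimal Python sketch. It is not part of PostgreSQL: the function name parse_context_stats and the regular expression are illustrative assumptions, and it only handles the one-line-per-context format shown in the snippet above.

```python
import re

# Matches lines of the form (format assumed from the snippet above):
#   Name: <total> total in <n> blocks ...; <free> free (...); <used> used
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+(?P<total>\d+) total in (?P<blocks>\d+) blocks"
    r".*?;\s+(?P<free>\d+) free.*?;\s+(?P<used>\d+) used"
)


def parse_context_stats(text):
    """Return {context name: {"total", "free", "used"}} in bytes.

    Hypothetical helper; lines that do not match the pattern are skipped.
    """
    contexts = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            contexts[m.group("name").strip()] = {
                "total": int(m.group("total")),
                "free": int(m.group("free")),
                "used": int(m.group("used")),
            }
    return contexts


# The lines logged by MemoryContextStats in this report.
SAMPLE = """\
ReorderBuffer: 65536 total in 4 blocks; 64624 free (169 chunks); 912 used
ReorderBufferByXid: 32768 total in 3 blocks; 12600 free (6 chunks); 20168 used
Tuples: 4311744512 total in 514 blocks (12858943 chunks); 6771224 free (12855411 chunks); 4304973288 used
TXN: 16944 total in 2 blocks; 13984 free (46 chunks); 2960 used
Change: 574944 total in 70 blocks; 214944 free (2239 chunks); 360000 used
"""

stats = parse_context_stats(SAMPLE)
total_used = sum(c["used"] for c in stats.values())
largest = max(stats, key=lambda name: stats[name]["used"])
print(f"{largest} holds {stats[largest]['used']} of {total_used} bytes used")
```

On the sample above, nearly all of the used memory sits in the Tuples context.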

It's my understanding that the reorder buffer context is the thing that logical_decoding_work_mem specifically constrains, so it's surprising to see that it's holding onto ~4GB of tuples instead of spooling them. I found the code for that here: https://github.com/postgres/postgres/blob/eb5ad4ff05fd382ac98cab60b82f7fd6ce4cfeb8/src/backend/replication/logical/reorderbuffer.c#L3557 which suggests it's checking rb->size against the configured work_mem.

I then used GDB to break into a high memory walsender and grab rb->size, which was only 73944. So it looks like the tuple memory isn't being properly accounted for in the total reorderbuffer size, so nothing is getting streamed/spooled?
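To put the numbers side by side, here is a small Python sketch of the comparison. It assumes the limit check is a plain "accounted size versus limit" test; would_exceed_limit is a hypothetical stand-in for that check, not the actual PostgreSQL code.

```python
# Values reported in this message.
RB_SIZE = 73944              # rb->size read via GDB, in bytes
TUPLES_USED = 4304973288     # bytes used in the "Tuples" context
LIMIT = 256 * 1024 * 1024    # logical_decoding_work_mem = 256MB, in bytes


def would_exceed_limit(accounted_bytes, limit_bytes):
    """Hypothetical stand-in for the limit check (assumed to be a simple
    comparison): streaming/spilling is only triggered once the accounted
    size reaches the limit."""
    return accounted_bytes >= limit_bytes


# With the accounted size, the check never fires.
print(would_exceed_limit(RB_SIZE, LIMIT))

# With the memory actually held in the Tuples context, it would.
print(would_exceed_limit(TUPLES_USED, LIMIT))
```

If the tuple memory were reflected in rb->size, the limit would have been crossed many times over, which is consistent with the 0 spilled/streamed bytes reported for the slots.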

Not super familiar with this so please let me know if there's something I've missed, otherwise it seems like the reorder buffer size accounting is a bit wrong.

Thanks,

- Alex.
