Re: Add logical_decoding_spill_limit to cap spill file disk usage per slot - Mailing list pgsql-hackers
| From | Bharath Rupireddy |
|---|---|
| Subject | Re: Add logical_decoding_spill_limit to cap spill file disk usage per slot |
| Date | |
| Msg-id | CALj2ACVc3iYLOkC36VJwoXyVZmGcb0WEMKoc478q+xdRG+2BtA@mail.gmail.com |
| In response to | Add logical_decoding_spill_limit to cap spill file disk usage per slot (shawn wang <shawn.wang.pg@gmail.com>) |
| Responses | Re: Add logical_decoding_spill_limit to cap spill file disk usage per slot |
| List | pgsql-hackers |
Hi,

On Mon, Mar 23, 2026 at 6:20 AM shawn wang <shawn.wang.pg@gmail.com> wrote:
>
> Hi hackers,

Thank you for proposing this new feature.

> == Motivation ==
>
> We operate a fleet of PostgreSQL instances with logical replication. On several occasions, we have experienced production incidents where logical decoding spill files (pg_replslot/<slot>/xid-*.spill) grew uncontrollably — consuming tens of gigabytes and eventually filling up the data disk. This caused the entire instance to go read-only, impacting not just replication but all write workloads.
>
> The typical scenario is a large transaction (e.g. bulk data load or a long-running DDL) combined with a subscriber that is either slow or temporarily disconnected. The reorder buffer exceeds logical_decoding_work_mem and starts spilling, but there is no upper bound on how much can be spilled. The only backstop today is the OS returning ENOSPC, at which point the damage is already done.

Having a lot of spill files also increases crash/recovery times. However, files spilling to disk, causing no-space-left-on-disk issues and leading to downtime, applies to WAL files, historical catalog snapshot files, subtransaction overflow files, CLOG (and all the subsystems backed by the SLRU data structure), etc. - basically any Postgres subsystem writing files to disk. I'm a bit worried that we may end up solving disk space issues, which IMHO are outside of the database's scope, in the database. Others may have different opinions though.

How common is this issue? Could you please add a test case to the proposed patch that, without this feature, would otherwise hit the described issue?

Having said that, were alternatives considered, such as disabling subscriptions when they are seen occupying too much disk space?

> We looked for existing protections:
>
> max_slot_wal_keep_size: limits WAL retention, but does not affect spill files at all.
> logical_decoding_work_mem: controls *when* spilling starts, but not *how much* can be spilled.
> There is no existing GUC, patch, or commitfest entry that addresses spill file disk quota.

Interesting!

> The "Report reorder buffer size" patch (CF #6053, by Ashutosh Bapat) improves observability of reorder buffer state, which is complementary — but observability alone cannot prevent disk-full incidents.

With the proposed reorder buffer stats above, would it be possible to have a monitoring solution (an extension or a tool) to disable subscriptions and notify the admin? Would something like this work?

> == Proposed solution ==
>
> The attached patch adds a new GUC:
> logical_decoding_spill_limit (integer, unit kB, default 0)
>
> When set to a positive value, it limits the total size of on-disk spill files per replication slot. Key design points:
>
> Tracking: We add two new fields:
> - ReorderBuffer.spillBytesOnDisk — current total on-disk spill size for this slot (unlike spillBytes, which is a cumulative statistics counter, this is a live gauge).
> - ReorderBufferTXN.serialized_size — per-transaction on-disk size, so we can accurately decrement the global counter during cleanup.
>
> Increment: In ReorderBufferSerializeChange(), after a successful write(), both counters are incremented by the size written.
>
> Decrement: In ReorderBufferRestoreCleanup(), when spill files are unlinked, the global counter is decremented by the transaction's serialized_size.
>
> Enforcement: In ReorderBufferCheckMemoryLimit(), before calling ReorderBufferSerializeTXN(), we check:
>
>     if (spillBytesOnDisk + txn->size > spill_limit)
>         ereport(ERROR, ...)
>
> This is only checked on the spill-to-disk path — not on the streaming path (which involves no disk I/O).
>
> Behavior on limit exceeded: An ERROR is raised with ERRCODE_CONFIGURATION_LIMIT_EXCEEDED. The walsender exits, but the slot's restart_lsn and confirmed_flush are preserved.
> The subscriber can reconnect after the DBA:
>
> increases logical_decoding_spill_limit, or
> increases logical_decoding_work_mem (to reduce spilling), or
> switches to a streaming-capable output plugin (which avoids spilling entirely).

When logical_decoding_spill_limit is exceeded, ERRORing out in the walsender is even more problematic, right? The replication slot would be inactive, causing bloat, preventing tuple freezing, and letting WAL files grow until the system eventually hits disk-space issues - it is like "we avoided disk space issues for one subsystem, but introduced them for another". This looks a bit problematic IMHO. Others may have different opinions though.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com