Home > mailing lists

Re: Report reorder buffer size - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: Report reorder buffer size
Date	August 21 16:56:41
Msg-id	CAExHW5stjbY_jqcP=PAj6w93dX_zTMtjrtm2S7bZvcjQkL81dw@mail.gmail.com Whole thread Raw
In response to	Re: Report reorder buffer size (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses	Re: Report reorder buffer size
List	pgsql-hackers

Tree view

Hi Bertrand,
Thanks for your response. I am glad that you have found the proposal
to be generally useful.

On Thu, Aug 14, 2025 at 3:50 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> > Using these trendlines, a user may decide to keep
> > logical_decoding_work_mem to minimum and thus let all the large
> > transactions spill to disk or be streamed.
>
> Right. Also, the issue with spill files is that when they are read back from disk
> then the reorder buffer can consume a lot of memory. That's why I think tracking
> the xid, sub xid and LSNs could be useful too. That could help diagnose why
> it was using that memory.

When the replication is stuck or is slow or WAL builds up a detailed
look at the reorder buffer's contents would be useful. It will need to
be done outside of pg_stat_replication or pg_stat_replication_slots. I
think it will be useful to provide a function which will trigger a
given WAL sender to dump the information about the contents of the
reorder buffer to the server error log or send them to the requesting
client.

>
> I think pg_stat_replication is a good place to record "real time" activities.
>

Ok. Thanks for confirming.

> Maybe we could log 2 metrics: the reorder buffer size before creating the spill
> files and the reorder buffer size while reading back the spill files? That would
> help make the distinction when logical_decoding_work_mem is involved (i.e before
> creating the spill files) and when it is not (reading them back).
>
> pg_stat_replication_slots could be used to add some counters too: like the
> number of times logical_decoding_work_mem has been reached while decoding from
> that slot.

I think pg_stat_replication_slots::spill_count and
pg_stat_replication_slots::stream_count give that value.

> Also maybe record the max value the reorder buffer reached and
> the associated xid, number of sub xids, and LSN? That could be useful to diagnose
> high memory consumption and/or OOM issues due to logical decoding.

A function which dumps the information about reorder buffers can be
used along with the trendline for this purpose. For example, if a user
notices WAL accumulation and an upward trend in the metric exposed by
my patch, they can trigger a dump of reorder buffer contents at that
time and diagnose the problem and even fix it proactively.

>
> Also, should we log some of information to the server log?
>

Information in server error logs is hard to access and analyse,
especially in cloud environment. Having a SQL construct is better.

> > But before I work on those, I would like to know others' opinions,
> > usefulness and acceptance.
>
> Thanks for working on it. I like the idea and also think that is an area where
> more details/metrics should be provided.

Thanks.

--
Best Wishes,
Ashutosh Bapat

pgsql-hackers by date:

From: Nathan Bossart
Date: 21 August, 16:44:01
Subject: Re: vacuumdb --missing-stats-only and permission issue

From: Nathan Bossart
Date: 21 August, 16:56:48
Subject: Re: Don't treat virtual generated columns as missing statistics in vacuumdb --missing-stats-only

Re: Report reorder buffer size - Mailing list pgsql-hackers

Previous

Next