Re: Report reorder buffer size - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Report reorder buffer size
Date
Msg-id CAD21AoBBkPkwnDkMOPAfEwEvFsn+RodZXF79A+EjeK2sfsHmuQ@mail.gmail.com
Whole thread Raw
In response to Re: Report reorder buffer size  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Report reorder buffer size
List pgsql-hackers
On Tue, Aug 26, 2025 at 2:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Masahiko,
> Thanks for your inputs.
>
> On Tue, Aug 26, 2025 at 2:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > On Wed, Aug 13, 2025 at 5:40 AM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Thank you for starting the thread about this topic. This is one of the
> > topics that I've been eager to address. Here are some random comments:
> >
>
> Glad that more people think there is a need for improving the reorder
> buffer statistics.
>
> > >
> > > With the attached WIP patch, we can report, at a given point in time,
> > > total size of reorder buffer i.e. sum of the size of all the changes
> > > belonging to the transactions active in the reorder buffer whether
> > > those changes are in memory, on disk or sent downstream as part of an
> > > unfinished streamed transaction. By sampling this value at regular
> > > intervals, one can observe the reorder buffer trendline as shown in
> > > the plots attached.
> >
> > Alternative idea (or an additional metric) would be the high-watermark
> > of memory usage. That way, users won't miss memory usage spikes that
> > might be missed by sampling the total memory usage.
>
> I think the trendline is important to make right trade-offs in case
> setting logical_decoding_work_mem to maximum is not possible. With a
> high watermark the question is how long does a high remain high? As
> the load changes, a high that was once may become lower or higher than
> the new high and soon irrelevant. Probably we could reset the
> high-watermark every time the view is sampled, so that we provide a
> trendline for the high-water mark as well. But I want to be cautious
> about adding that complexity of tracking maxima accurately and then
> maintain it forever. If sampling is frequent enough usually it will
> capture a maxima and minima good enough for practical purposes. The
> users will need to consider the trendline as approximate anyway since
> the load will show slight variations over the time.

Agreed with the importance of trendline.

> Please share your idea of reporting high-watermark.

I was thinking of this high-watermark idea since it doesn't require
any external system/tool to get the information for tuning
logical_decoding_work_mem. It would be easy to use when users want to
figure out maximum data usage for logical decoding of the particular
workload.

>
> > >
> > > 2. Is the size of the reorder buffer enough or we want to also track
> > > the size of changes on disk and the size of changes sent downstream as
> > > part of unfinished streamed transactions separately? Also the number
> > > of transactions being tracked by the reorder buffer?
> >
> > For the purpose of tuning logical_decoding_work_mem, the additional
> > metrics and statistics we need might not be many. But in order to make
> > logical decoding more visible for users for debugging or diagnosing
> > purposes, more statistics like the number of transactions being
> > tracked by the reorder buffer might be required.
> >
>
> Ok. We could add the total number of transactions in the reorder
> buffer at a given point in time to the report easily. How about
> subtransactions? How about prepared but not committed/aborted
> transactions?

For debugging or diagnosing purposes, this information might be
useful, but I'm not sure we need this information of logical decodings
that are running.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
Next
From: Chao Li
Date:
Subject: Re: Identifying function-lookup failures due to argument name mismatches