Re: Report reorder buffer size - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Report reorder buffer size |
Date | |
Msg-id | CAD21AoDVhtDugdR2pR9yyuo8XKn0mo0gR5arqOZBKK_8sOe8sg@mail.gmail.com Whole thread Raw |
In response to | Report reorder buffer size (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
List | pgsql-hackers |
Hi, On Wed, Aug 13, 2025 at 5:40 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Hi All, > The question of how to tune logical_decoding_work_mem has come up a > few times in the context of customer cases. If it is too low then the > changes are spilled to disk, slowing down replication and consuming > disk space. In case of streamed transactions, it won't consume space > on primary but will consume resources (transactional resources/disk > space) on replica. If decoding memory is too high, it may cause memory > pressure especially when there are many wal senders and large amounts > of changes queued to reorder buffers frequently. Thank you for starting the thread about this topic. This is one of the topics that I've been eager to address. Here are some random comments: > > In order to tune logical_decoding_work_mem optimally we need a measure > of the size of changes belonging to the transactions queued in the > reorder buffer, whether spilled to the disk or in memory or sent > downstream as part of an unfinished streamed transaction. Currently > there is no way to determine that. pg_stat_replication_slots reports > cumulative statistics about spilled or streamed transactions. It helps > to know whether the logical decoding work memory is sufficient or not > to hold all the changes. But it does not provide a way to estimate an > optimal value for it since it doesn't capture the state of the reorder > buffer at a given point in time. > > With the attached WIP patch, we can report, at a given point in time, > total size of reorder buffer i.e. sum of the size of all the changes > belonging to the transactions active in the reorder buffer whether > those changes are in memory, on disk or sent downstream as part of an > unfinished streamed transaction. By sampling this value at regular > intervals, one can observe the reorder buffer trendline as shown in > the plots attached. Alternative idea (or an additional metric) would be the high-watermark of memory usage. That way, users won't miss memory usage spikes that might be missed by sampling the total memory usage. > Tracking total size of reorder buffer also helps to quantify load on a > wal sender. It is an additional metric that can be useful to debug WAL > buildup, slow replication etc. For example, the plot clearly and > directly shows when the wal sender started processing the large > transaction, when it finished etc. Agreed. > The patch is not complete. There are many things to be worked on > 1. Whether the metrics should be reported in pg_stat_replication or > pg_stat_replication_slots. Given that it's a point-in-time metric, it > fits pg_stat_replication better. But that view does not report logical > decoding activity happening in backends other than the wal senders. > But that's true with the other metrics in that view as well. > pg_stat_replication_slots, however, covers all users of logical > decoding. For now I have used pg_stat_replication, but it can be > changed based on the discussion here. > > 2. Is the size of the reorder buffer enough or we want to also track > the size of changes on disk and the size of changes sent downstream as > part of unfinished streamed transactions separately? Also the number > of transactions being tracked by the reorder buffer? For the purpose of tuning logical_decoding_work_mem, the additional metrics and statistics we need might not be many. But in order to make logical decoding more visible for users for debugging or diagnosing purposes, more statistics like the number of transactions being tracked by the reorder buffer might be required. We need to note that the actual size of changes sent downstream actually depends on logical decoding plugins. For instance, we have table, row, and column filters in logical replication cases. It might be worth considering providing such statistics too. The statistics like the number of changes filtered out by the table filters or the change-kind filter (e.g., only publishing INSERTs etc) might be helpful for users to confirm the effectiveness of the filters they set. > If we are going > to report so much statistics about the contents of the reorder buffer, > is it better to have a separate view pg_stat_reorder_buffer for the > same? Given logical decoding can be used also by regular backend processes, I guess that such dynamic metrics would fit a system view dedicated to logical decoding, say pg_stat_reorder_buffer or pg_stat_logical_decoding. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: