Re: Add memory_limit_hits to pg_stat_replication_slots - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Add memory_limit_hits to pg_stat_replication_slots |
Date | |
Msg-id | CAD21AoBPKarxZNnSy884LzHtwO6QN0XUqRzSaM-sM79oPT=-UA@mail.gmail.com Whole thread Raw |
In response to | Re: Add memory_limit_hits to pg_stat_replication_slots (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
List | pgsql-hackers |
On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Sep 23, 2025 at 11:39:22AM -0700, Masahiko Sawada wrote: > > On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > However, technically speaking, "exceeded" is not the perfect wording since > > > the code was doing (and is still doing something similar to): > > > > > > if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED && > > > - rb->size < logical_decoding_work_mem * (Size) 1024) > > > + !memory_limit_reached) > > > return; > > > > > > as the comment describes correctly using "reached": > > > > > > /* > > > * Check whether the logical_decoding_work_mem limit was reached, and if yes > > > * pick the largest (sub)transaction at-a-time to evict and spill its changes to > > > * disk or send to the output plugin until we reach under the memory limit. > > > > > > So I think that "reached" or "hit" would be better wording. However, the > > > documentation for spill_txns, stream_txns already use "exceeded" (and not "reached") > > > so I went with "exceeded" for consistency. I think that's fine, if not we may want > > > to use "reached" for those 3 stats descriptions. > > > > I find "exceeded" is fine as the documentation for logical decoding > > also uses it[1]: > > > > Similar to spill-to-disk behavior, streaming is triggered when the > > total amount of changes decoded from the WAL (for all in-progress > > transactions) exceeds the limit defined by logical_decoding_work_mem > > setting. > > Yes it also uses "exceeds" but I think it's not 100% accurate. It would be > if, in ReorderBufferCheckMemoryLimit, we were using "<=" instead of "<" in: > > if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED && > rb->size < logical_decoding_work_mem * (Size) 1024) > > I think an accurate wording would be "reaches or exceeds" in all those places, > but just using "exceeds" looks good enough. > > > One comment for the v3 patch: > > > > + <para> > > + Number of times <literal>logical_decoding_work_mem</literal> has been > > + exceeded while decoding changes from WAL for this slot. > > + </para> > > > > How about rewording it to like: > > > > Number of times the memory used by logical decoding has exceeded > > logical_decoding_work_mem. > > That sounds better, thanks! Used this wording in v4 attached (that's the only > change as compared to v3). > Thank you for updating the patch! Here are some comments: --- + bool memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024); + + if (memory_limit_reached) + rb->memExceededCount += 1; Do we want to use 'exceeded' for the variable too for better consistency? --- One thing I want to clarify is that even if the memory usage exceeds the logical_decoding_work_mem it doesn't necessarily mean we serialize or stream transactions because of ReorderBufferCheckAndTruncateAbortedTXN(). For example, in a situation where many large already-aborted transactions are truncated by transactionsReorderBufferCheckAndTruncateAbortedTXN(), users would see a high number in mem_exceeded_count column but it might not actually require any adjustment for logical_decoding_work_mem. One idea is to increment that counter if exceeding memory usage is caused to serialize or stream any transactions. On the other hand, it might make sense and be straightforward too to show a pure statistic that the memory usage exceeded the logical_decoding_work_mem. What do you think? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: