Re: Resetting spilled txn statistics in pg_stat_replication - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Resetting spilled txn statistics in pg_stat_replication
Date
Msg-id 20200623101831.it6lzwbm37xwquco@development
Whole thread Raw
In response to Re: Resetting spilled txn statistics in pg_stat_replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Resetting spilled txn statistics in pg_stat_replication
Re: Resetting spilled txn statistics in pg_stat_replication
List pgsql-hackers
On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
>On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
><masahiko.sawada@2ndquadrant.com> wrote:
>>
>> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> >
>> > >
>> > >What if the decoding has been performed by multiple backends using the
>> > >same slot?  In that case, it will be difficult to make the judgment
>> > >for the value of logical_decoding_work_mem based on stats.  It would
>> > >make sense if we provide a way to set logical_decoding_work_mem for a
>> > >slot but not sure if that is better than what we have now.
>> > >
>>
>> I thought that the stats are relevant to what
>> logical_decoding_work_mem value was but not with who performed logical
>> decoding. So even if multiple backends perform logical decoding using
>> the same slot, the user can directly use stats as long as
>> logical_decoding_work_mem value doesn’t change.
>>
>
>I think if you maintain these stats at the slot level, you probably
>need to use spinlock or atomic ops in order to update those as slots
>can be used from multiple backends whereas currently, we don't need
>that.

IMHO storing the stats in the slot itself is a bad idea. We have the
statistics collector for exactly this purpose, and it's receiving data
over UDP without any extra locking etc.
>
>> > >What problems do we see in displaying these for each process?  I think
>> > >users might want to see the stats for the exited processes or after
>> > >server restart but I think both of those are not even possible today.
>> > >I think the stats are available till the corresponding WALSender
>> > >process is active.
>>
>> I might want to see the stats for the exited processes or after server
>> restart. But I'm inclined to agree with displaying the stats per
>> process if the stats are displayed on a separate view (e.g.
>> pg_stat_replication_slots).
>>
>
>Yeah, as told previously, this makes more sense to me.
>
>Do you think we should try to write a POC patch using a per-process
>entry approach and see what difficulties we are facing and does it
>give the stats in a way we are imagining but OTOH, we can wait for
>some more to see if there is clear winner approach here?
>

I may be missing something obvious, but I still see no point in tracking
per-process stats. We don't have that for other stats, and I'm not sure
how common is the scenario when a given slot is decoded by many
backends. I'd say vast majority of cases are simply running decoding
from a walsender, which may occasionally restart, but I doubt the users
are interested in per-pid data - they probably want aggregated data.

Can someone explain a plausible scenario for which tracking per-process
stats would be needed, and simply computing deltas would not work? How
will you know which old PID is which, what will you do when a PID is
reused, and so on?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: [PATCH] Initial progress reporting for COPY command
Next
From: Michael Paquier
Date:
Subject: Re: TAP tests and symlinks on Windows