Thread: Resetting spilled txn statistics in pg_stat_replication


From: Masahiko Sawada
Hi all,

Tracking of spilled transactions was introduced in PG13. These
new statistics values, spill_txns, spill_count, and spill_bytes, are
cumulative totals, unlike the other values in
pg_stat_replication. How can we reset these values? We can reset
statistics values in other statistics views using
pg_stat_reset_shared(), pg_stat_reset(), and so on. It seems to me that
the only option to reset spilled transaction statistics is to restart
logical replication, but that is surely a high cost.

It might have been discussed during development, but isn't it worth
having a SQL function to reset these statistics?

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Kyotaro Horiguchi
At Tue, 2 Jun 2020 15:17:36 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in 
> Hi all,
> 
> Tracking of spilled transactions was introduced in PG13. These
> new statistics values, spill_txns, spill_count, and spill_bytes, are
> cumulative totals, unlike the other values in
> pg_stat_replication. How can we reset these values? We can reset
> statistics values in other statistics views using
> pg_stat_reset_shared(), pg_stat_reset(), and so on. It seems to me that
> the only option to reset spilled transaction statistics is to restart
> logical replication, but that is surely a high cost.
> 
> It might have been discussed during development, but isn't it worth
> having a SQL function to reset these statistics?

Actually, I don't see pg_stat_reset() as all that useful except for our
regression tests (it might even be rather harmful for monitoring
tools).  So I doubt the usefulness of the feature, but having it would
make things more consistent.

Anyway, I think the most significant point of implementing the feature
would be the user interface.  Adding pg_stat_replication_reset(pid int)
doesn't seem like a good thing to do..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Resetting spilled txn statistics in pg_stat_replication

From: Fujii Masao

On 2020/06/02 16:00, Kyotaro Horiguchi wrote:
> At Tue, 2 Jun 2020 15:17:36 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
>> Hi all,
>>
>> Tracking of spilled transactions was introduced in PG13. These
>> new statistics values, spill_txns, spill_count, and spill_bytes, are
>> cumulative totals, unlike the other values in
>> pg_stat_replication.

Basically, I don't think it's good design to mix dynamic and collected
stats in one view. It might be better to separate them into a
different, new stats view. It's too late to add a new stats view for
v13, though...

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Tue, Jun 2, 2020 at 11:48 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> Hi all,
>
> Tracking of spilled transactions was introduced in PG13. These
> new statistics values, spill_txns, spill_count, and spill_bytes, are
> cumulative totals, unlike the other values in
> pg_stat_replication. How can we reset these values? We can reset
> statistics values in other statistics views using
> pg_stat_reset_shared(), pg_stat_reset(), and so on. It seems to me that
> the only option to reset spilled transaction statistics is to restart
> logical replication, but that is surely a high cost.
>

I see your point, but I don't see a pressing need for such a function
for PG13.  Basically, these counters are only populated when we have
large transactions in the system, so I am not sure how much of a use
case there is for such a function. Note that we would also need to add
an additional stats_reset column to the pg_stat_replication view,
similar to what we have in pg_stat_archiver and pg_stat_bgwriter.
OTOH, I don't see any big reason for not having such a function in
PG14.
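
For comparison, both of those views already pair their cumulative
counters with a reset timestamp:

    -- stats_reset records when the cumulative counters were last zeroed.
    SELECT archived_count, failed_count, stats_reset
    FROM pg_stat_archiver;

    SELECT checkpoints_timed, buffers_alloc, stats_reset
    FROM pg_stat_bgwriter;

    -- Both are cleared via the existing shared-reset function:
    SELECT pg_stat_reset_shared('archiver');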

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Tue, 2 Jun 2020 at 17:22, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/02 16:00, Kyotaro Horiguchi wrote:
> > At Tue, 2 Jun 2020 15:17:36 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> >> Hi all,
> >>
> >> Tracking of spilled transactions was introduced in PG13. These
> >> new statistics values, spill_txns, spill_count, and spill_bytes, are
> >> cumulative totals, unlike the other values in
> >> pg_stat_replication.
>
> Basically, I don't think it's good design to mix dynamic and collected
> stats in one view. It might be better to separate them into a
> different, new stats view. It's too late to add a new stats view for
> v13, though...

Yeah, actually I had the same impression when studying this feature.
Another benefit of having a separate view for such statistics would be
that it could also cover logical decoding invoked via the SQL
interface. Currently, the reorder buffer always tracks statistics of
spilled transactions, but we can see them only when using logical
replication.
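
For instance, decoding through the SQL interface spills in exactly the
same way, yet nothing shows up in pg_stat_replication; a quick way to
see this with the in-tree test_decoding plugin:

    -- Any spilling during this SQL-interface decoding is invisible
    -- in pg_stat_replication.
    SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
    SELECT count(*) FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);
    SELECT pg_drop_replication_slot('test_slot');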

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Tue, Jun 2, 2020 at 1:52 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/06/02 16:00, Kyotaro Horiguchi wrote:
> > At Tue, 2 Jun 2020 15:17:36 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> >> Hi all,
> >>
> >> Tracking of spilled transactions was introduced in PG13. These
> >> new statistics values, spill_txns, spill_count, and spill_bytes, are
> >> cumulative totals, unlike the other values in
> >> pg_stat_replication.
>
> Basically, I don't think it's good design to mix dynamic and collected
> stats in one view. It might be better to separate them into a
> different, new stats view.
>

I think this is worth considering, but note that we already have a
similar mix in other views, like pg_stat_archiver (archived_count and
failed_count are cumulative whereas the other columns show current
state).  On the one hand, it is good to have a separate view for such
information, but OTOH users would need to consult more views for
replication information.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Tue, 2 Jun 2020 at 16:00, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> Actually, I don't see pg_stat_reset() as all that useful except for our
> regression tests (it might even be rather harmful for monitoring
> tools).  So I doubt the usefulness of the feature, but having it would
> make things more consistent.

IMO these reset functions are useful for verification. I often use
them before starting performance evaluations.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Tue, 2 Jun 2020 at 18:34, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I see your point, but I don't see a pressing need for such a function
> for PG13.  Basically, these counters are only populated when we have
> large transactions in the system, so I am not sure how much of a use
> case there is for such a function. Note that we would also need to add
> an additional stats_reset column to the pg_stat_replication view,
> similar to what we have in pg_stat_archiver and pg_stat_bgwriter.
> OTOH, I don't see any big reason for not having such a function in
> PG14.

OK. I think the reset function is mostly for evaluations or rare
cases. In either case, since it's not urgent we can postpone it to PG14
if necessary.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Magnus Hagander


On Tue, Jun 9, 2020 at 9:12 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
> > > It seems to me that
> > > the only option to reset spilled transaction statistics is to
> > > restart logical replication, but that is surely a high cost.

You just have to "bounce" the worker though, right? You don't have to actually restart logical replication, just disconnect and reconnect?


> > I see your point, but I don't see a pressing need for such a function
> > for PG13.  Basically, these counters are only populated when we have
> > large transactions in the system, so I am not sure how much of a use
> > case there is for such a function. Note that we would also need to add
> > an additional stats_reset column to the pg_stat_replication view,
> > similar to what we have in pg_stat_archiver and pg_stat_bgwriter.
> > OTOH, I don't see any big reason for not having such a function in
> > PG14.

> OK. I think the reset function is mostly for evaluations or rare
> cases. In either case, since it's not urgent we can postpone it to
> PG14 if necessary.

Reading through this thread, I agree that it's kind of weird to keep cumulative stats mixed with non-cumulative stats. (It always irks me, for example, that we have numbackends in pg_stat_database, which behaves differently from every other column in it.)

However, I don't see how they *are* cumulative. They are only cumulative while the client is connected -- as soon as it disconnects, they go away. In that regard, they're more like the pg_stat_progress_xyz views, for example.

Which makes them mostly useless for long-term tracking anyway, because no matter which way you snapshot them, you will lose data.

ISTM the logical place to keep cumulative stats would be pg_replication_slots? (Or go create a pg_stat_replication_slots?) That is, the logical grouping of these statistics for the long term is the replication slot, not the walsender?
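
To make that concrete, a rough sketch of what such a hypothetical
slot-level view could expose (nothing like this exists yet; the column
names just mirror the existing walsender ones):

    -- Hypothetical: cumulative spill stats keyed by slot rather than
    -- by walsender PID, so they would survive disconnects/reconnects.
    SELECT slot_name,
           spill_txns,   -- transactions spilled to disk
           spill_count,  -- number of spill operations
           spill_bytes,  -- total bytes spilled
           stats_reset   -- when the counters were last reset
    FROM pg_stat_replication_slots;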

--

Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Tue, Jun 9, 2020 at 1:54 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> Reading through this thread, I agree that it's kind of weird to keep
> cumulative stats mixed with non-cumulative stats. (It always irks me,
> for example, that we have numbackends in pg_stat_database, which
> behaves differently from every other column in it.)
>
> However, I don't see how they *are* cumulative. They are only
> cumulative while the client is connected -- as soon as it disconnects,
> they go away. In that regard, they're more like the pg_stat_progress_xyz
> views, for example.
>
> Which makes them mostly useless for long-term tracking anyway, because
> no matter which way you snapshot them, you will lose data.
>
> ISTM the logical place to keep cumulative stats would be
> pg_replication_slots? (Or go create a pg_stat_replication_slots?) That
> is, the logical grouping of these statistics for the long term is the
> replication slot, not the walsender?
>

I think I see one advantage of displaying these stats at the slot
level.  Currently, we can't see these stats when we use the SQL
interface APIs (like pg_logical_slot_get_changes) to decode WAL, but if
we display them at the slot level, then we should be able to see them.

I would prefer to display them in pg_replication_slots just to avoid
creating more views, but OTOH, if a new view like
pg_stat_replication_slots sounds like a better place for these stats,
then I am fine with that too.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Tue, 9 Jun 2020 at 17:24, Magnus Hagander <magnus@hagander.net> wrote:
>
> > > > It seems to me that
> > > > the only option to reset spilled transaction statistics is to
> > > > restart logical replication, but that is surely a high cost.
>
> You just have to "bounce" the worker though, right? You don't have to
> actually restart logical replication, just disconnect and reconnect?

Right.

>
> Reading through this thread, I agree that it's kind of weird to keep
> cumulative stats mixed with non-cumulative stats. (It always irks me,
> for example, that we have numbackends in pg_stat_database, which
> behaves differently from every other column in it.)
>
> However, I don't see how they *are* cumulative. They are only
> cumulative while the client is connected -- as soon as it disconnects,
> they go away. In that regard, they're more like the pg_stat_progress_xyz
> views, for example.
>
> Which makes them mostly useless for long-term tracking anyway, because
> no matter which way you snapshot them, you will lose data.
>
> ISTM the logical place to keep cumulative stats would be
> pg_replication_slots? (Or go create a pg_stat_replication_slots?) That
> is, the logical grouping of these statistics for the long term is the
> replication slot, not the walsender?

I personally prefer to display these values in pg_replication_slots.
If we create a new stats view, would it be only for logical replication
slots? Or should it display both types of slots, since physical
replication slots might have statistics in the future?

If we move these values to replication slots, I think we can change
the code so that these statistics are managed by the replication slot
(e.g., in the ReplicationSlot struct). Once ReplicationSlot has these
values, we can keep them across reconnections or multiple calls of the
SQL interface functions. Of course, these values don't need to be
persisted.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Magnus Hagander
On Wed, Jun 10, 2020 at 9:01 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
> I personally prefer to display these values in pg_replication_slots.
> If we create a new stats view, would it be only for logical replication
> slots? Or should it display both types of slots, since physical
> replication slots might have statistics in the future?

Yeah, I think it's kind of a weird situation. There are already some things in pg_replication_slots that should probably be in a stat_ view, so if we were to create one we would have to move those, and probably needlessly break things for people.

I guess we could have separate views for logical and physical slots, since there are things that only one of them will have. But that is already the case -- the database, for example, and the xmins. Splitting that apart now would be a bigger thing, and I don't think it's worth it.


> If we move these values to replication slots, I think we can change
> the code so that these statistics are managed by the replication slot
> (e.g., in the ReplicationSlot struct). Once ReplicationSlot has these
> values, we can keep them across reconnections or multiple calls of the
> SQL interface functions. Of course, these values don't need to be
> persisted.

Eh, why should they not be persisted? The comparison would be temp_files and temp_bytes in pg_stat_database, and those *are* persisted. 
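
That is, they survive across sessions and clean restarts, and carry
their own reset timestamp:

    -- Cumulative per-database spill-to-disk counters; persisted until
    -- explicitly reset.
    SELECT datname, temp_files,
           pg_size_pretty(temp_bytes) AS temp_bytes, stats_reset
    FROM pg_stat_database
    WHERE datname = current_database();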

--

Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Thu, Jun 11, 2020 at 12:35 AM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Wed, Jun 10, 2020 at 9:01 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>> Once ReplicationSlot has these values, we can keep them across
>> reconnections or multiple calls of the SQL interface functions. Of
>> course, these values don't need to be persisted.
>
> Eh, why should they not be persisted?
>

Because these stats correspond to a ReorderBuffer (logical decoding),
which is allocated fresh after a restart and has no relation to what
happened before the restart.

Now, thinking about this again, I am not sure these stats are directly
related to slots. They are stats for logical decoding, which can be
performed either via a WALSender or by a decoding plugin (via the SQL
APIs). So, why not have them displayed in a new view like
pg_stat_logical (or pg_stat_logical_decoding /
pg_stat_logical_replication)?  In the future, we will need to add
similar stats for streaming of in-progress transactions as well (see
patch 0007-Track-statistics-for-streaming at [1]), so having a separate
view for these doesn't sound illogical.

> The comparison would be temp_files and temp_bytes in pg_stat_database, and those *are* persisted.

I am not able to see a one-to-one mapping of those stats with what is
being discussed here.

[1] - https://www.postgresql.org/message-id/CAFiTN-vXQx_161WC-a9HvNaF25nwO%3DJJRpRdTtyfGQHbM3Bd1Q%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Thu, 11 Jun 2020 at 12:30, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > Eh, why should they not be persisted?
>
> Because these stats correspond to a ReorderBuffer (logical decoding),
> which is allocated fresh after a restart and has no relation to what
> happened before the restart.

I thought the same. But I now think there is no difference between
reconnecting replication and a server restart in terms of the logical
decoding context. Even if we persist these values, we might want to
reset them after crash recovery, like the stats collector does.

>
> Now, thinking about this again, I am not sure these stats are directly
> related to slots. They are stats for logical decoding, which can be
> performed either via a WALSender or by a decoding plugin (via the SQL
> APIs). So, why not have them displayed in a new view like
> pg_stat_logical (or pg_stat_logical_decoding /
> pg_stat_logical_replication)?  In the future, we will need to add
> similar stats for streaming of in-progress transactions as well (see
> patch 0007-Track-statistics-for-streaming at [1]), so having a separate
> view for these doesn't sound illogical.
>

I think we need to decide how long we want to retain these statistics
values. That is, if we were to have such a pg_stat_logical view, these
values would remain only until logical decoding finished, since I think
the view would display only running logical decoding. OTOH, if we tie
these stats to slots, the values would remain beyond multiple logical
decoding SQL API calls.

I think one of the main use cases for these statistics is tuning
logical_decoding_work_mem. This seems similar to the combination of
pg_stat_database.temp_files/temp_bytes and work_mem. From this
perspective, I guess it's useful for users if these values remain until
the slot is removed or the server crashes. Given that the kinds of
logical decoding statistics might grow, having a separate view
dedicated to replication slots makes sense to me.

As for updating these statistics, if we tie them to logical decoding or
to the replication slot, we can change the strategy for updating
statistics so that it doesn't depend on the logical decoding plugin
implementation. If updating the statistics doesn't affect performance
much, it's better to track them regardless of the plugin.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Thu, Jun 11, 2020 at 1:46 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> I think we need to decide how long we want to retain these statistics
> values. That is, if we were to have such a pg_stat_logical view, these
> values would remain only until logical decoding finished, since I think
> the view would display only running logical decoding. OTOH, if we tie
> these stats to slots, the values would remain beyond multiple logical
> decoding SQL API calls.
>

I thought of keeping these for as long as the process that performs
these operations exists.  So for a WALSender, the stats will be valid
until it is restarted for some reason; when decoding is performed via a
backend, the stats will be valid until the corresponding backend exits.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Thu, 11 Jun 2020 at 18:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I thought of keeping these for as long as the process that performs
> these operations exists.  So for a WALSender, the stats will be valid
> until it is restarted for some reason; when decoding is performed via a
> backend, the stats will be valid until the corresponding backend exits.
>

The number of rows of that view could then be up to (max_backends +
max_wal_senders). Is that right? And what if different backends used
the same replication slot one after the other?

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Thu, Jun 11, 2020 at 3:07 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> The number of rows of that view could then be up to (max_backends +
> max_wal_senders). Is that right? And what if different backends used
> the same replication slot one after the other?
>

Yeah, it would be tricky if multiple slots are used by the same
backend.  We could probably track the number of times decoding has
happened in the session, which would help us in averaging the spill
amount.  If we think the aim is to help users tune
logical_decoding_work_mem to avoid frequent spilling or streaming,
then how would maintaining it at the slot level help?  As you said
previously, we could track it only for a running logical decoding
context, but if we do that the data will be temporary and the user will
need to constantly monitor it to make sense of it -- but maybe that is
fine.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Thu, 11 Jun 2020 at 20:02, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Yeah, it would be tricky if multiple slots are used by the same
> backend.  We could probably track the number of times decoding has
> happened in the session, which would help us in averaging the spill
> amount.  If we think the aim is to help users tune
> logical_decoding_work_mem to avoid frequent spilling or streaming,
> then how would maintaining it at the slot level help?

Since the logical decoding intermediate files are written under a
per-slot directory, I thought that tying these statistics to
replication slots would also be understandable for users. I was
thinking of something like a pg_stat_logical_replication_slot view,
which shows slot_name and the statistics of logical replication slots
only. The view would always show as many rows as there are replication
slots, regardless of whether logical decoding is running. I think there
is no big difference in how users use these statistics between
maintaining them at the slot level and at the logical decoding level.

In the logical replication case, since we generally don't support
setting a different logical_decoding_work_mem per walsender, every
walsender will decode the same WAL stream with the same setting,
meaning they will spill intermediate files similarly. Maybe the same is
true for the streaming statistics. So having these statistics per
logical replication connection might not help as of now.

> As you said previously, we could track it only for a running logical
> decoding context, but if we do that the data will be temporary and the
> user will need to constantly monitor it to make sense of it -- but
> maybe that is fine.

Agreed; in general, it's better not to frequently reset cumulative
values. I personally would like these statistics to remain valid even
after the process executing logical decoding exits (or even after a
server restart), and to reset them only as needed.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Thu, Jun 11, 2020 at 7:39 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> Since the logical decoding intermediate files are written under a
> per-slot directory, I thought that tying these statistics to
> replication slots would also be understandable for users.
>

What I wanted to know is: how will it help users tune
logical_decoding_work_mem?  Different backends can process from the
same slot, so it is not clear how the user would make any meaning out
of those stats.  OTOH, it is easier to see how to make meaning of these
stats if we display them w.r.t. the process.  Basically, we have
spill_count and spill_bytes, which can be used to tune
logical_decoding_work_mem, and the activity of spilling happens at the
process level, so it sounds like a one-to-one mapping.  I am not trying
to rule out maintaining them at the slot level, but trying to see if we
can come up with a clear definition.

> I was thinking of something like a pg_stat_logical_replication_slot
> view, which shows slot_name and the statistics of logical replication
> slots only. The view would always show as many rows as there are
> replication slots, regardless of whether logical decoding is running. I
> think there is no big difference in how users use these statistics
> between maintaining them at the slot level and at the logical decoding
> level.
>
> In the logical replication case, since we generally don't support
> setting a different logical_decoding_work_mem per walsender, every
> walsender will decode the same WAL stream with the same setting,
> meaning they will spill intermediate files similarly.
>

I am not sure this will be true in every case.  We do have slot-advance
functionality, so some plugin might use that, and that will lead to
different files getting spilled.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From: Fujii Masao

On 2020/06/12 12:21, Amit Kapila wrote:
> On Thu, Jun 11, 2020 at 7:39 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
>>
>> In the logical replication case, since we generally don't support
>> setting a different logical_decoding_work_mem per walsender, every
>> walsender will decode the same WAL stream with the same setting,
>> meaning they will spill intermediate files similarly.

I was thinking we do support that. We can create multiple replication
users with different logical_decoding_work_mem settings, and each
walsender then uses the logical_decoding_work_mem configured for its
user. No?
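
Something along these lines should work (role names are illustrative):

    -- Per-user setting: each walsender inherits the value configured
    -- for the role it connects as.
    CREATE ROLE repl_small WITH REPLICATION LOGIN;
    CREATE ROLE repl_large WITH REPLICATION LOGIN;
    ALTER ROLE repl_small SET logical_decoding_work_mem = '64MB';
    ALTER ROLE repl_large SET logical_decoding_work_mem = '1GB';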

Regards,


-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Fri, 12 Jun 2020 at 12:56, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> I was thinking we do support that. We can create multiple replication
> users with different logical_decoding_work_mem settings, and each
> walsender then uses the logical_decoding_work_mem configured for its
> user. No?
>

Yes, you're right. I had missed that approach.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Masahiko Sawada
On Fri, 12 Jun 2020 at 12:21, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Since the logical decoding intermediate files are written under a
> > per-slot directory, I thought that tying these statistics to
> > replication slots would also be understandable for users.
>
> What I wanted to know is: how will it help users tune
> logical_decoding_work_mem?  Different backends can process from the
> same slot, so it is not clear how the user would make any meaning out
> of those stats.

I thought that the user would need to constantly monitor them while one
process is executing logical decoding, and watch the increments. I
might not fully understand, but I guess the same is true when
displaying them w.r.t. the process: since a process can do logical
decoding several times using the same slot with different settings, the
user would need to monitor them several times.

> OTOH, it is easier to see how to make meaning of these stats if we
> display them w.r.t. the process.  Basically, we have spill_count and
> spill_bytes, which can be used to tune logical_decoding_work_mem, and
> the activity of spilling happens at the process level, so it sounds
> like a one-to-one mapping.

Displaying them w.r.t. the process also seems a good idea, but I'm
still unclear on what to display and how long these values remain
valid. Would the view have the following columns, for example (a rough
sketch follows the list)?

* pid
* slot_name
* spill_txns
* spill_count
* spill_bytes
* exec_count
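
Purely as a hypothetical sketch of that shape (none of this exists; the
view name is one of the candidates mentioned upthread):

    -- Hypothetical per-process decoding stats: one row per (pid, slot)
    -- pair, lasting only as long as the process itself.
    SELECT pid,         -- backend or walsender doing the decoding
           slot_name,   -- slot being decoded from
           spill_txns,
           spill_count,
           spill_bytes,
           exec_count   -- number of decoding runs by this process
    FROM pg_stat_logical_decoding;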

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From: Amit Kapila
On Fri, Jun 12, 2020 at 11:20 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> I thought that the user would need to constantly monitor them while one
> process is executing logical decoding, and watch the increments. I
> might not fully understand, but I guess the same is true when
> displaying them w.r.t. the process: since a process can do logical
> decoding several times using the same slot with different settings, the
> user would need to monitor them several times.
>

Yeah, I think we might not be able to get an exact measure, but if we
divide the total size spilled by exec_count, we will get some rough idea
of what logical_decoding_work_mem should be for that particular
session.  For example, consider that logical_decoding_work_mem is 100
bytes for a particular backend and the size spilled by that backend is
100; then I think you can roughly set it to 200 bytes if you want to
avoid spilling.  Similarly, one can compute its average value over
multiple executions.  Does this make sense to you?
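
As a back-of-the-envelope query (assuming a view exposing these
counters; the view name is hypothetical):

    -- average bytes spilled per decoding run; the current setting
    -- plus this average gives a rough value to try
    SELECT pid, slot_name,
           spill_bytes / NULLIF(exec_count, 0) AS avg_spill_per_exec
    FROM pg_stat_logical_decoding;  -- hypothetical view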

> > OTOH, it is easier to see how to make
> > meaning of these stats if we display them w.r.t process.  Basically,
> > we have spill_count and spill_size which can be used to tune
> > logical_decoding_work_mem and also the activity of spilling happens at
> > process level, so it sounds like one-to-one mapping.
>
> Displaying them w.r.t process also seems a good idea but I'm still
> unclear what to display and how long these values are valid.
>

I feel they should last for the lifetime of a process if we want to
display the values at the process level, but I am open to hearing
others' (including your) views on this.

> The view
> will have the following columns for example?
>
> * pid
> * slot_name
> * spill_txns
> * spill_count
> * spill_bytes
> * exec_count
>

Yeah, these appear to be what I have in mind.  Note that we can have
multiple entries for the same pid here because of slotname.  There is
some value in displaying slotname, but I am not completely sure it is
a good idea; still, I am fine with it if you have a reason to include
slotname.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Magnus Hagander
Date:


On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jun 12, 2020 at 11:20 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Fri, 12 Jun 2020 at 12:21, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > Since the logical decoding intermediate files are written at per slots
> > > directory, I thought that corresponding these statistics to
> > > replication slots is also understandable for users.
> > >
> >
> > What I wanted to know is how will it help users to tune
> > logical_decoding_work_mem?  Different backends can process from the
> > same slot, so it is not clear how user will be able to make any
> > meaning out of those stats.
>
> I thought that the user needs to constantly monitor them during one
> process is executing logical decoding and to see the increments. I
> might not fully understand but I guess the same is true for displaying
> them w.r.t. process. Since a process can do logical decoding several
> times using the same slot with a different setting, the user will need
> to monitor them several times.
>

Yeah, I think we might not be able to get exact measure but if we
divide total_size spilled by exec_count, we will get some rough idea
of what should be the logical_decoding_work_mem for that particular
session.  For ex. consider the logical_decoding_work_mem is 100bytes
for a particular backend and the size spilled by that backend is 100
then I think you can roughly keep it to 200bytes if you want to avoid
spilling.  Similarly one can compute its average value over multiple
executions.  Does this make sense to you?

The thing that becomes really interesting is to analyze this across time, for example to identify patterns where it always spills at the same time as certain other things are happening. For that use case, having a "predictable persistence" is important. You may not be able to afford setting logical_decoding_work_mem high enough to cover every possible scenario (if you did, then we would basically not need the spilling..), so you want to track down, in relation to the rest of your application, exactly when and how this is happening.
 

> > OTOH, it is easier to see how to make
> > meaning of these stats if we display them w.r.t process.  Basically,
> > we have spill_count and spill_size which can be used to tune
> > logical_decoding_work_mem and also the activity of spilling happens at
> > process level, so it sounds like one-to-one mapping.
>
> Displaying them w.r.t process also seems a good idea but I'm still
> unclear what to display and how long these values are valid.
>

I feel till the lifetime of a process if we want to display the values
at process level but I am open to hear others (including yours) views
on this.

The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for any reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to matter a lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time, but that adds a whole different set of complexities.)


> > The view
> will have the following columns for example?
>
> * pid
> * slot_name
> * spill_txns
> * spill_count
> * spill_bytes
> * exec_count
>

Yeah, these appear to be what I have in mind.  Note that we can have
multiple entries of the same pid here because of slotname, there is
some value to display slotname but I am not completely sure if that is
a good idea but I am fine if you have a reason to include slotname?

Well, it's a general view, so you can always GROUP BY that away at reading time if you want?
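
e.g. something like (view name hypothetical):

    SELECT slot_name,
           sum(spill_txns)  AS spill_txns,
           sum(spill_count) AS spill_count,
           sum(spill_bytes) AS spill_bytes
    FROM pg_stat_logical_decoding  -- hypothetical
    GROUP BY slot_name;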

--

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>
>
> The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for any
> reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to matter a
> lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time, but that adds a
> whole different set of complexities.)
>

It is not clear to me what is a good way to display the stats for a
process that has exited or bounced for whatever reason.  OTOH, if
we just display per-slot stats, it is difficult to imagine how the
user can make any sense out of them or, in other words, how such stats
can be useful to users.

>
> > > The view
>>
>> > will have the following columns for example?
>> >
>> > * pid
>> > * slot_name
>> > * spill_txns
>> > * spill_count
>> > * spill_bytes
>> > * exec_count
>> >
>>
>> Yeah, these appear to be what I have in mind.  Note that we can have
>> multiple entries of the same pid here because of slotname, there is
>> some value to display slotname but I am not completely sure if that is
>> a good idea but I am fine if you have a reason to include slotname?
>
>
> Well, it's a general view so you can always GROUP BY that away if you want at reading point?
>

Okay, that is a valid point.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Fujii Masao
Date:

On 2020/06/13 14:23, Amit Kapila wrote:
> On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
>>
>> On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>
>>
>> The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for any
>> reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to matter a
>> lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time, but that adds a
>> whole different set of complexities.)
 
>>
> 
> It is not clear to me what is a good way to display the stats for a
> process that has exited or bounced due to whatever reason.  OTOH, if
> we just display per-slot stats, it is difficult to imagine how the
> user can make any sense out of it or in other words how such stats can
> be useful to users.

If we allow users to set logical_decoding_work_mem per slot,
maybe the users can tune it directly from the stats?
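
Purely hypothetical syntax, since no per-slot setting exists today:

    -- illustrative only; there is no such command at present
    ALTER REPLICATION SLOT myslot
        SET (logical_decoding_work_mem = '64MB');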

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Sat, Jun 13, 2020 at 5:07 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
> On 2020/06/13 14:23, Amit Kapila wrote:
> > On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
> >>
> >> On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>>
> >>
> >>
> >> The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for
> >> any reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to
> >> matter a lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time,
> >> but that adds a whole different set of complexities.)
> >>
> >
> > It is not clear to me what is a good way to display the stats for a
> > process that has exited or bounced due to whatever reason.  OTOH, if
> > we just display per-slot stats, it is difficult to imagine how the
> > user can make any sense out of it or in other words how such stats can
> > be useful to users.
>
> If we allow users to set logical_decoding_work_mem per slot,
> maybe the users can tune it directly from the stats?
>

How will it behave when the same slot is used from multiple sessions?  I
think it will be difficult to make sense of the stats for slots unless
we also somehow see which process has led to those stats, and if we do
so, then there won't be much difference w.r.t. what we can do now?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Sat, 13 Jun 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
> >
> > On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >
> >
> > The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for any
> > reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to matter
> > a lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time, but that
> > adds a whole different set of complexities.)
> >
>
> It is not clear to me what is a good way to display the stats for a
> process that has exited or bounced due to whatever reason.  OTOH, if
> we just display per-slot stats, it is difficult to imagine how the
> user can make any sense out of it or in other words how such stats can
> be useful to users.

If we have the reset function, the user can reset before doing logical
decoding so that the user can use the stats directly. Or I think we
can automatically reset the stats when logical decoding is performed
with a different logical_decoding_work_mem value than the previous one.
Either way, since the stats correspond to logical decoding using the
same slot with the same parameter value, the user can use them
directly.
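
For example, with a reset function (the name here is hypothetical at
this point), the workflow could be:

    SELECT pg_stat_reset_replication_slot('myslot');  -- hypothetical
    SELECT data FROM pg_logical_slot_get_changes('myslot', NULL, NULL);
    -- the spill stats now reflect only this decoding run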

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Jun 17, 2020 at 1:34 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Sat, 13 Jun 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
> > >
> > > On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >>
> > >
> > >
> > > The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce" for
> > > any reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it starts to
> > > matter a lot more. Especially when you don't know if it bounced. (Sure, you can look at the backend_start time,
> > > but that adds a whole different set of complexities.)
> > >
> >
> > It is not clear to me what is a good way to display the stats for a
> > process that has exited or bounced due to whatever reason.  OTOH, if
> > we just display per-slot stats, it is difficult to imagine how the
> > user can make any sense out of it or in other words how such stats can
> > be useful to users.
>
> If we have the reset function, the user can reset before doing logical
> decoding so that the user can use the stats directly. Or I think we
> can automatically reset the stats when logical decoding is performed
> with different logical_decoding_work_mem value than the previous one.
>

I had written the above in the context of persisting these stats.  I
mean to say that if the process has bounced or the server has restarted,
then the previous stats might not make much sense because we were
planning to use pid [1]; the stats from a process that has exited might
not make much sense, or do you think that is okay?  If we don't want to
persist them and the lifetime of these stats is only while the process
is alive, then we are fine.


[1] - https://www.postgresql.org/message-id/CA%2Bfd4k5nqeFdhpnCULpTh9TR%2B15rHZSbz0SDC6sZhr_v99SeKA%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 17, 2020 at 1:34 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Sat, 13 Jun 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 12, 2020 at 6:11 PM Magnus Hagander <magnus@hagander.net> wrote:
> > > >
> > > > On Fri, Jun 12, 2020 at 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >>
> > > >
> > > >
> > > > The problem with "lifetime of a process" is that it's not predictable. A replication process might "bounce"
> > > > for any reason, and it is normally not a problem. But if you suddenly lose your stats when you do that, it
> > > > starts to matter a lot more. Especially when you don't know if it bounced. (Sure, you can look at the
> > > > backend_start time, but that adds a whole different set of complexities.)
> > > >
> > >
> > > It is not clear to me what is a good way to display the stats for a
> > > process that has exited or bounced due to whatever reason.  OTOH, if
> > > we just display per-slot stats, it is difficult to imagine how the
> > > user can make any sense out of it or in other words how such stats can
> > > be useful to users.
> >
> > If we have the reset function, the user can reset before doing logical
> > decoding so that the user can use the stats directly. Or I think we
> > can automatically reset the stats when logical decoding is performed
> > with different logical_decoding_work_mem value than the previous one.
> >
>
> I had written above in the context of persisting these stats.  I mean
> to say if the process has bounced or server has restarted then the
> previous stats might not make much sense because we were planning to
> use pid [1], so the stats from process that has exited might not make
> much sense or do you think that is okay?  If we don't want to persist
> and the lifetime of these stats is till the process is alive then we
> are fine.
>

Sorry for confusing you. My idea above is about having the stats
per slot. That is, we add spill_txns, spill_count and spill_bytes to
pg_replication_slots or a new view pg_stat_logical_replication_slots
with some columns: slot_name plus these stats columns and stats_reset.
The idea is that the stats values accumulate until either the slot is
dropped, the server crashes, the user executes the reset function, or
logical decoding is performed with a different logical_decoding_work_mem
value than the previous time. In other words, the stats values are
reset in any of those cases. That way, I think the stats values always
correspond to logical decoding using the same slot with the same
logical_decoding_work_mem value.
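
Roughly, the view could be queried like this (names as proposed above;
nothing is implemented yet):

    SELECT slot_name, spill_txns, spill_count, spill_bytes, stats_reset
    FROM pg_stat_logical_replication_slots;  -- proposed view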

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Jun 18, 2020 at 8:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > I had written above in the context of persisting these stats.  I mean
> > to say if the process has bounced or server has restarted then the
> > previous stats might not make much sense because we were planning to
> > use pid [1], so the stats from process that has exited might not make
> > much sense or do you think that is okay?  If we don't want to persist
> > and the lifetime of these stats is till the process is alive then we
> > are fine.
> >
>
> Sorry for confusing you. The above my idea is about having the stats
> per slots. That is, we add spill_txns, spill_count and spill_bytes to
> pg_replication_slots or a new view pg_stat_logical_replication_slots
> with some columns: slot_name plus these stats columns and stats_reset.
> The idea is that the stats values accumulate until either the slot is
> dropped, the server crashed, the user executes the reset function, or
> logical decoding is performed with different logical_decoding_work_mem
> value than the previous time. In other words, the stats values are
> reset in either case. That way, I think the stats values always
> correspond to logical decoding using the same slot with the same
> logical_decoding_work_mem value.
>

What if the decoding has been performed by multiple backends using the
same slot?  In that case, it will be difficult to judge the value of
logical_decoding_work_mem based on the stats.  It would make sense if we
provided a way to set logical_decoding_work_mem for a slot, but I am not
sure if that is better than what we have now.

What problems do we see in displaying these for each process?  I think
users might want to see the stats for exited processes or after a
server restart, but I think both of those are not even possible today.
I think the stats are available only while the corresponding WALSender
process is active.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tomas Vondra
Date:
Hi,

Sorry for neglecting this thread for the last couple of days ...

In general, I agree it's somewhat unfortunate that the stats are reset
when the walsender exits. This was mostly fine for tuning of the
spilling (change value -> restart -> see stats), but for proper
monitoring this is somewhat problematic. I simply considered these
fields somewhat similar to the lag fields, not from the "monitoring"
POV.


On Thu, Jun 11, 2020 at 11:09:00PM +0900, Masahiko Sawada wrote:
>
> ...
>
>Since the logical decoding intermediate files are written at per slots
>directory, I thought that corresponding these statistics to
>replication slots is also understandable for users. I was thinking
>something like pg_stat_logical_replication_slot view which shows
>slot_name and statistics of only logical replication slots. The view
>always shows rows as many as existing replication slots regardless of
>logical decoding being running. I think there is no big difference in
>how users use these statistics values between maintaining at slot
>level and at logical decoding level.
>
>In the logical replication case, since we generally don't support
>setting different logical_decoding_work_mem values per wal sender, every
>wal sender will decode the same WAL stream with the same setting,
>meaning they will spill intermediate files similarly. Maybe the same is
>true for the statistics of streaming. So having these statistics per
>logical replication might not help as of now.
>

I think the idea to track these stats per replication slot (rather than
per walsender) is the right approach. We should extend the statistics
collector to keep one entry per replication slot and have a new stats
view, called e.g. pg_stat_replication_slots, which could be reset just
like other stats in the collector.
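
Something along these lines, purely as a sketch of the proposed
interface:

    SELECT slot_name, spill_txns, spill_count,
           pg_size_pretty(spill_bytes) AS spilled
    FROM pg_stat_replication_slots;  -- proposed view, not implemented yet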

I don't quite understand the discussion about different backends using
different logical_decoding_work_mem values - why would this be an issue?
Surely we have this exact issue e.g. with tracking index vs. sequential
scans and GUCs like random_page_cost. That can change over time too,
different backends may use different values, and yet we don't worry
about resetting the number of index scans for a table etc.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tomas Vondra
Date:
On Thu, Jun 18, 2020 at 12:21:17PM +0530, Amit Kapila wrote:
>On Thu, Jun 18, 2020 at 8:01 AM Masahiko Sawada
><masahiko.sawada@2ndquadrant.com> wrote:
>>
>> On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> >
>> > I had written above in the context of persisting these stats.  I mean
>> > to say if the process has bounced or server has restarted then the
>> > previous stats might not make much sense because we were planning to
>> > use pid [1], so the stats from process that has exited might not make
>> > much sense or do you think that is okay?  If we don't want to persist
>> > and the lifetime of these stats is till the process is alive then we
>> > are fine.
>> >
>>
>> Sorry for confusing you. The above my idea is about having the stats
>> per slots. That is, we add spill_txns, spill_count and spill_bytes to
>> pg_replication_slots or a new view pg_stat_logical_replication_slots
>> with some columns: slot_name plus these stats columns and stats_reset.
>> The idea is that the stats values accumulate until either the slot is
>> dropped, the server crashed, the user executes the reset function, or
>> logical decoding is performed with different logical_decoding_work_mem
>> value than the previous time. In other words, the stats values are
>> reset in either case. That way, I think the stats values always
>> correspond to logical decoding using the same slot with the same
>> logical_decoding_work_mem value.
>>
>
>What if the decoding has been performed by multiple backends using the
>same slot?  In that case, it will be difficult to make the judgment
>for the value of logical_decoding_work_mem based on stats.  It would
>make sense if we provide a way to set logical_decoding_work_mem for a
>slot but not sure if that is better than what we have now.
>
>What problems do we see in displaying these for each process?  I think
>users might want to see the stats for the exited processes or after
>server restart but I think both of those are not even possible today.
>I think the stats are available till the corresponding WALSender
>process is active.
>

I don't quite see what the problem is. We're in this exact position with
many other stats we track and various GUCs. If you decide to tune the
setting for a particular slot, you simply need to be careful which
backends decode the slot and what GUC values they used.

But I don't think this situation (multiple backends decoding the same
slot with different logical_decoding_work_mem values) is very common. In
most cases the backends/walsenders will all use the same value. If you
change that, you better remember that.

I really think we should not be inventing something that automatically
resets the stats when someone happens to change the GUC.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Sun, Jun 21, 2020 at 3:27 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Thu, Jun 18, 2020 at 12:21:17PM +0530, Amit Kapila wrote:
> >On Thu, Jun 18, 2020 at 8:01 AM Masahiko Sawada
> ><masahiko.sawada@2ndquadrant.com> wrote:
> >>
> >> On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> >
> >> >
> >> > I had written above in the context of persisting these stats.  I mean
> >> > to say if the process has bounced or server has restarted then the
> >> > previous stats might not make much sense because we were planning to
> >> > use pid [1], so the stats from process that has exited might not make
> >> > much sense or do you think that is okay?  If we don't want to persist
> >> > and the lifetime of these stats is till the process is alive then we
> >> > are fine.
> >> >
> >>
> >> Sorry for confusing you. The above my idea is about having the stats
> >> per slots. That is, we add spill_txns, spill_count and spill_bytes to
> >> pg_replication_slots or a new view pg_stat_logical_replication_slots
> >> with some columns: slot_name plus these stats columns and stats_reset.
> >> The idea is that the stats values accumulate until either the slot is
> >> dropped, the server crashed, the user executes the reset function, or
> >> logical decoding is performed with different logical_decoding_work_mem
> >> value than the previous time. In other words, the stats values are
> >> reset in either case. That way, I think the stats values always
> >> correspond to logical decoding using the same slot with the same
> >> logical_decoding_work_mem value.
> >>
> >
> >What if the decoding has been performed by multiple backends using the
> >same slot?  In that case, it will be difficult to make the judgment
> >for the value of logical_decoding_work_mem based on stats.  It would
> >make sense if we provide a way to set logical_decoding_work_mem for a
> >slot but not sure if that is better than what we have now.
> >
> >What problems do we see in displaying these for each process?  I think
> >users might want to see the stats for the exited processes or after
> >server restart but I think both of those are not even possible today.
> >I think the stats are available till the corresponding WALSender
> >process is active.
> >
>
> I don't quite see what the problem is. We're in this exact position with
> many other stats we track and various GUCs. If you decide to tune the
> setting for a particular slot, you simply need to be careful which
> backends decode the slot and what GUC values they used.
>

What problem do you see if we allow it to be displayed per-process
(WALSender or backend)?  The stats are incurred by the WALSender or by
backends, so displaying them accordingly seems more straightforward and
logical to me.

As of now, we don't allow it to be set for a slot, so it won't be
convenient for the user to tune it per slot.  I think we could allow
setting it per slot, but I am not sure if there is any benefit in that.

> I really think we should not be inventing something that automatically
> resets the stats when someone happens to change the GUC.
>

I agree with that.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Jun 22, 2020 at 8:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Jun 21, 2020 at 3:27 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> >
> > On Thu, Jun 18, 2020 at 12:21:17PM +0530, Amit Kapila wrote:
> > >On Thu, Jun 18, 2020 at 8:01 AM Masahiko Sawada
> > ><masahiko.sawada@2ndquadrant.com> wrote:
> > >>
> > >> On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >> >
> > >> >
> > >> > I had written above in the context of persisting these stats.  I mean
> > >> > to say if the process has bounced or server has restarted then the
> > >> > previous stats might not make much sense because we were planning to
> > >> > use pid [1], so the stats from process that has exited might not make
> > >> > much sense or do you think that is okay?  If we don't want to persist
> > >> > and the lifetime of these stats is till the process is alive then we
> > >> > are fine.
> > >> >
> > >>
> > >> Sorry for confusing you. The above my idea is about having the stats
> > >> per slots. That is, we add spill_txns, spill_count and spill_bytes to
> > >> pg_replication_slots or a new view pg_stat_logical_replication_slots
> > >> with some columns: slot_name plus these stats columns and stats_reset.
> > >> The idea is that the stats values accumulate until either the slot is
> > >> dropped, the server crashed, the user executes the reset function, or
> > >> logical decoding is performed with different logical_decoding_work_mem
> > >> value than the previous time. In other words, the stats values are
> > >> reset in either case. That way, I think the stats values always
> > >> correspond to logical decoding using the same slot with the same
> > >> logical_decoding_work_mem value.
> > >>
> > >
> > >What if the decoding has been performed by multiple backends using the
> > >same slot?  In that case, it will be difficult to make the judgment
> > >for the value of logical_decoding_work_mem based on stats.  It would
> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > >slot but not sure if that is better than what we have now.
> > >
> > >What problems do we see in displaying these for each process?  I think
> > >users might want to see the stats for the exited processes or after
> > >server restart but I think both of those are not even possible today.
> > >I think the stats are available till the corresponding WALSender
> > >process is active.
> > >
> >
> > I don't quite see what the problem is. We're in this exact position with
> > many other stats we track and various GUCs. If you decide to tune the
> > setting for a particular slot, you simply need to be careful which
> > backends decode the slot and what GUC values they used.
> >
>
> What problem do you if we allow it to display per-process (WALSender
> or backend)?  They are incurred by the WALSender or by backends so
> displaying them accordingly seems more straightforward and logical to
> me.
>
> As of now, we don't allow it to be set for a slot, so it won't be
> convenient for the user to tune it per slot.  I think we can allow to
> set it per-slot but not sure if there is any benefit for the same.
>

If we display the stats as discussed in email [1] (pid, slot_name,
spill_txns, spill_count, etc.), then we can even find the stats w.r.t.
each slot.


[1] - https://www.postgresql.org/message-id/CA%2Bfd4k5nqeFdhpnCULpTh9TR%2B15rHZSbz0SDC6sZhr_v99SeKA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Thu, Jun 18, 2020 at 12:21:17PM +0530, Amit Kapila wrote:
> >On Thu, Jun 18, 2020 at 8:01 AM Masahiko Sawada
> ><masahiko.sawada@2ndquadrant.com> wrote:
> >>
> >> On Wed, 17 Jun 2020 at 20:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> >
> >> >
> >> > I had written above in the context of persisting these stats.  I mean
> >> > to say if the process has bounced or server has restarted then the
> >> > previous stats might not make much sense because we were planning to
> >> > use pid [1], so the stats from process that has exited might not make
> >> > much sense or do you think that is okay?  If we don't want to persist
> >> > and the lifetime of these stats is till the process is alive then we
> >> > are fine.
> >> >
> >>
> >> Sorry for confusing you. The above my idea is about having the stats
> >> per slots. That is, we add spill_txns, spill_count and spill_bytes to
> >> pg_replication_slots or a new view pg_stat_logical_replication_slots
> >> with some columns: slot_name plus these stats columns and stats_reset.
> >> The idea is that the stats values accumulate until either the slot is
> >> dropped, the server crashed, the user executes the reset function, or
> >> logical decoding is performed with different logical_decoding_work_mem
> >> value than the previous time. In other words, the stats values are
> >> reset in either case. That way, I think the stats values always
> >> correspond to logical decoding using the same slot with the same
> >> logical_decoding_work_mem value.
> >>
> >
> >What if the decoding has been performed by multiple backends using the
> >same slot?  In that case, it will be difficult to make the judgment
> >for the value of logical_decoding_work_mem based on stats.  It would
> >make sense if we provide a way to set logical_decoding_work_mem for a
> >slot but not sure if that is better than what we have now.
> >

I thought that the stats are relevant to what the
logical_decoding_work_mem value was, but not to who performed logical
decoding. So even if multiple backends perform logical decoding using
the same slot, the user can directly use the stats as long as the
logical_decoding_work_mem value doesn't change.

> >What problems do we see in displaying these for each process?  I think
> >users might want to see the stats for the exited processes or after
> >server restart but I think both of those are not even possible today.
> >I think the stats are available till the corresponding WALSender
> >process is active.

I might want to see the stats for the exited processes or after server
restart. But I'm inclined to agree with displaying the stats per
process if the stats are displayed on a separate view (e.g.
pg_stat_replication_slots).

> >
>
> I don't quite see what the problem is. We're in this exact position with
> many other stats we track and various GUCs. If you decide to tune the
> setting for a particular slot, you simply need to be careful which
> backends decode the slot and what GUC values they used.
>
> But I don't think this situation (multiple backends decoding the same
> slot with different logical_decoding_work_mem values) is very common. In
> most cases the backends/walsenders will all use the same value. If you
> change that, you better remember that.
>
> I really think we should not be inventing something that automatically
> resets the stats when someone happens to change the GUC.

Agreed. But what I thought is simpler: storing the
logical_decoding_work_mem value along with the stats in a logical
replication slot and resetting the stats if the
logical_decoding_work_mem value is different from the stored value
when performing logical decoding.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> >
> > >
> > >What if the decoding has been performed by multiple backends using the
> > >same slot?  In that case, it will be difficult to make the judgment
> > >for the value of logical_decoding_work_mem based on stats.  It would
> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > >slot but not sure if that is better than what we have now.
> > >
>
> I thought that the stats are relevant to what
> logical_decoding_work_mem value was but not with who performed logical
> decoding. So even if multiple backends perform logical decoding using
> the same slot, the user can directly use stats as long as
> logical_decoding_work_mem value doesn’t change.
>

I think if you maintain these stats at the slot level, you probably
need to use a spinlock or atomic ops in order to update them, as slots
can be used from multiple backends, whereas currently we don't need
that.

> > >What problems do we see in displaying these for each process?  I think
> > >users might want to see the stats for the exited processes or after
> > >server restart but I think both of those are not even possible today.
> > >I think the stats are available till the corresponding WALSender
> > >process is active.
>
> I might want to see the stats for the exited processes or after server
> restart. But I'm inclined to agree with displaying the stats per
> process if the stats are displayed on a separate view (e.g.
> pg_stat_replication_slots).
>

Yeah, as told previously, this makes more sense to me.

Do you think we should try to write a POC patch using a per-process
entry approach to see what difficulties we face and whether it gives
the stats in the way we are imagining?  OTOH, we can wait some more to
see if a clear winner approach emerges here.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tomas Vondra
Date:
On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
>On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
><masahiko.sawada@2ndquadrant.com> wrote:
>>
>> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> >
>> > >
>> > >What if the decoding has been performed by multiple backends using the
>> > >same slot?  In that case, it will be difficult to make the judgment
>> > >for the value of logical_decoding_work_mem based on stats.  It would
>> > >make sense if we provide a way to set logical_decoding_work_mem for a
>> > >slot but not sure if that is better than what we have now.
>> > >
>>
>> I thought that the stats are relevant to what
>> logical_decoding_work_mem value was but not with who performed logical
>> decoding. So even if multiple backends perform logical decoding using
>> the same slot, the user can directly use stats as long as
>> logical_decoding_work_mem value doesn’t change.
>>
>
>I think if you maintain these stats at the slot level, you probably
>need to use spinlock or atomic ops in order to update those as slots
>can be used from multiple backends whereas currently, we don't need
>that.

IMHO storing the stats in the slot itself is a bad idea. We have the
statistics collector for exactly this purpose, and it's receiving data
over UDP without any extra locking etc.
>
>> > >What problems do we see in displaying these for each process?  I think
>> > >users might want to see the stats for the exited processes or after
>> > >server restart but I think both of those are not even possible today.
>> > >I think the stats are available till the corresponding WALSender
>> > >process is active.
>>
>> I might want to see the stats for the exited processes or after server
>> restart. But I'm inclined to agree with displaying the stats per
>> process if the stats are displayed on a separate view (e.g.
>> pg_stat_replication_slots).
>>
>
>Yeah, as told previously, this makes more sense to me.
>
>Do you think we should try to write a POC patch using a per-process
>entry approach and see what difficulties we are facing and does it
>give the stats in a way we are imagining but OTOH, we can wait for
>some more to see if there is clear winner approach here?
>

I may be missing something obvious, but I still see no point in tracking
per-process stats. We don't have that for other stats, and I'm not sure
how common the scenario is in which a given slot is decoded by many
backends. I'd say the vast majority of cases are simply running decoding
from a walsender, which may occasionally restart, but I doubt the users
are interested in per-pid data - they probably want aggregated data.

Can someone explain a plausible scenario for which tracking per-process
stats would be needed, and simply computing deltas would not work? How
will you know which old PID is which, what will you do when a PID is
reused, and so on?
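
With cumulative per-slot counters, computing deltas is just a matter of
sampling, e.g. (assuming the proposed pg_stat_replication_slots view):

    -- take a snapshot of the cumulative counters
    CREATE TABLE spill_sample AS
        SELECT now() AS ts, slot_name, spill_bytes
        FROM pg_stat_replication_slots;

    -- some time later, compute the per-interval delta
    SELECT v.slot_name, v.spill_bytes - s.spill_bytes AS delta_bytes
    FROM pg_stat_replication_slots v
    JOIN spill_sample s USING (slot_name);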


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
> ><masahiko.sawada@2ndquadrant.com> wrote:
> >>
> >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> >> >
> >> > >
> >> > >What if the decoding has been performed by multiple backends using the
> >> > >same slot?  In that case, it will be difficult to make the judgment
> >> > >for the value of logical_decoding_work_mem based on stats.  It would
> >> > >make sense if we provide a way to set logical_decoding_work_mem for a
> >> > >slot but not sure if that is better than what we have now.
> >> > >
> >>
> >> I thought that the stats are relevant to what
> >> logical_decoding_work_mem value was but not with who performed logical
> >> decoding. So even if multiple backends perform logical decoding using
> >> the same slot, the user can directly use stats as long as
> >> logical_decoding_work_mem value doesn’t change.
> >>
> >
> >I think if you maintain these stats at the slot level, you probably
> >need to use spinlock or atomic ops in order to update those as slots
> >can be used from multiple backends whereas currently, we don't need
> >that.
>
> IMHO storing the stats in the slot itself is a bad idea. We have the
> statistics collector for exactly this purpose, and it's receiving data
> over UDP without any extra locking etc.
> >
> >> > >What problems do we see in displaying these for each process?  I think
> >> > >users might want to see the stats for the exited processes or after
> >> > >server restart but I think both of those are not even possible today.
> >> > >I think the stats are available till the corresponding WALSender
> >> > >process is active.
> >>
> >> I might want to see the stats for the exited processes or after server
> >> restart. But I'm inclined to agree with displaying the stats per
> >> process if the stats are displayed on a separate view (e.g.
> >> pg_stat_replication_slots).
> >>
> >
> >Yeah, as told previously, this makes more sense to me.
> >
> >Do you think we should try to write a POC patch using a per-process
> >entry approach and see what difficulties we are facing and does it
> >give the stats in a way we are imagining but OTOH, we can wait for
> >some more to see if there is clear winner approach here?
> >
>
> I may be missing something obvious, but I still see no point in tracking
> per-process stats. We don't have that for other stats,
>

Won't we display per-process information in pg_stat_replication?
These stats are currently displayed in that view, and one of the
shortcomings was that it won't display these stats when we decode via
a backend.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jun 23, 2020 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> >
> > On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> > >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
> > ><masahiko.sawada@2ndquadrant.com> wrote:
> > >>
> > >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> > >> >
> > >> > >
> > >> > >What if the decoding has been performed by multiple backends using the
> > >> > >same slot?  In that case, it will be difficult to make the judgment
> > >> > >for the value of logical_decoding_work_mem based on stats.  It would
> > >> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > >> > >slot but not sure if that is better than what we have now.
> > >> > >
> > >>
> > >> I thought that the stats are relevant to what
> > >> logical_decoding_work_mem value was but not with who performed logical
> > >> decoding. So even if multiple backends perform logical decoding using
> > >> the same slot, the user can directly use stats as long as
> > >> logical_decoding_work_mem value doesn’t change.
> > >>

Today, I thought about it again, and if we consider the point that the
logical_decoding_work_mem value doesn't change much, then having the
stats at the slot level would also allow computing
logical_decoding_work_mem based on the stats.  Do you think it is a
reasonable assumption that users won't change
logical_decoding_work_mem for different processes (WALSender, etc.)?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 25 Jun 2020 at 19:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra
> > <tomas.vondra@2ndquadrant.com> wrote:
> > >
> > > On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> > > >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
> > > ><masahiko.sawada@2ndquadrant.com> wrote:
> > > >>
> > > >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> > > >> >
> > > >> > >
> > > >> > >What if the decoding has been performed by multiple backends using the
> > > >> > >same slot?  In that case, it will be difficult to make the judgment
> > > >> > >for the value of logical_decoding_work_mem based on stats.  It would
> > > >> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > > >> > >slot but not sure if that is better than what we have now.
> > > >> > >
> > > >>
> > > >> I thought that the stats are relevant to what
> > > >> logical_decoding_work_mem value was but not with who performed logical
> > > >> decoding. So even if multiple backends perform logical decoding using
> > > >> the same slot, the user can directly use stats as long as
> > > >> logical_decoding_work_mem value doesn’t change.
> > > >>
>
> Today, I thought about it again, and if we consider the point that
> logical_decoding_work_mem value doesn’t change much then having the
> stats at slot-level would also allow computing
> logical_decoding_work_mem based on stats.  Do you think it is a
> reasonable assumption that users won't change
> logical_decoding_work_mem for different processes (WALSender, etc.)?

FWIW, if we use logical_decoding_work_mem as a threshold for starting
to send changes to a subscriber, I think there might be use cases
where the user wants to set different logical_decoding_work_mem values
for different wal senders. For example, setting a lower value to
minimize the latency of synchronous logical replication to a near site,
while setting a large value to minimize the amount of data sent to a
far site.
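
One way to get different values per walsender today might be per-role
settings, assuming each subscriber connects as its own role (role names
made up):

    ALTER ROLE near_site_rep SET logical_decoding_work_mem = '1MB';
    ALTER ROLE far_site_rep  SET logical_decoding_work_mem = '1GB';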

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jun 26, 2020 at 11:31 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 25 Jun 2020 at 19:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 23, 2020 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra
> > > <tomas.vondra@2ndquadrant.com> wrote:
> > > >
> > > > On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> > > > >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
> > > > ><masahiko.sawada@2ndquadrant.com> wrote:
> > > > >>
> > > > >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> > > > >> >
> > > > >> > >
> > > > >> > >What if the decoding has been performed by multiple backends using the
> > > > >> > >same slot?  In that case, it will be difficult to make the judgment
> > > > >> > >for the value of logical_decoding_work_mem based on stats.  It would
> > > > >> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > > > >> > >slot but not sure if that is better than what we have now.
> > > > >> > >
> > > > >>
> > > > >> I thought that the stats are relevant to what
> > > > >> logical_decoding_work_mem value was but not with who performed logical
> > > > >> decoding. So even if multiple backends perform logical decoding using
> > > > >> the same slot, the user can directly use stats as long as
> > > > >> logical_decoding_work_mem value doesn’t change.
> > > > >>
> >
> > Today, I thought about it again, and if we consider the point that
> > logical_decoding_work_mem value doesn’t change much then having the
> > stats at slot-level would also allow computing
> > logical_decoding_work_mem based on stats.  Do you think it is a
> > reasonable assumption that users won't change
> > logical_decoding_work_mem for different processes (WALSender, etc.)?
>
> FWIW, if we use logical_decoding_work_mem as a threshold of starting
> of sending changes to a subscriber, I think there might be use cases
> where the user wants to set different logical_decoding_work_mem values
> to different wal senders. For example, setting a lower value to
> minimize the latency of synchronous logical replication to a near-site
> whereas setting a large value to minimize the amount of data sent to a
> far site.
>

How does setting a large value minimize the amount of data sent?
One possibility is if there are a lot of transaction aborts and the
transactions are not large enough to cross the
logical_decoding_work_mem threshold, but such cases shouldn't be many.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Magnus Hagander
Date:


On Tue, Jun 23, 2020 at 12:18 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
>On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
><masahiko.sawada@2ndquadrant.com> wrote:
>>
>> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> >
>> > >
>> > >What if the decoding has been performed by multiple backends using the
>> > >same slot?  In that case, it will be difficult to make the judgment
>> > >for the value of logical_decoding_work_mem based on stats.  It would
>> > >make sense if we provide a way to set logical_decoding_work_mem for a
>> > >slot but not sure if that is better than what we have now.
>> > >
>>
>> I thought that the stats are relevant to what
>> logical_decoding_work_mem value was but not with who performed logical
>> decoding. So even if multiple backends perform logical decoding using
>> the same slot, the user can directly use stats as long as
>> logical_decoding_work_mem value doesn’t change.
>>
>
>I think if you maintain these stats at the slot level, you probably
>need to use spinlock or atomic ops in order to update those as slots
>can be used from multiple backends whereas currently, we don't need
>that.

IMHO storing the stats in the slot itself is a bad idea. We have the
statistics collector for exactly this purpose, and it's receiving data
over UDP without any extra locking etc.

Yeah, that seems much more appropriate. Of course, where they are exposed is a different question.


>> > >What problems do we see in displaying these for each process?  I think
>> > >users might want to see the stats for the exited processes or after
>> > >server restart but I think both of those are not even possible today.
>> > >I think the stats are available till the corresponding WALSender
>> > >process is active.
>>
>> I might want to see the stats for the exited processes or after server
>> restart. But I'm inclined to agree with displaying the stats per
>> process if the stats are displayed on a separate view (e.g.
>> pg_stat_replication_slots).
>>
>
>Yeah, as told previously, this makes more sense to me.
>
>Do you think we should try to write a POC patch using a per-process
>entry approach and see what difficulties we are facing and does it
>give the stats in a way we are imagining but OTOH, we can wait for
>some more to see if there is clear winner approach here?
>

I may be missing something obvious, but I still see no point in tracking
per-process stats. We don't have that for other stats, and I'm not sure
how common is the scenario when a given slot is decoded by many
backends. I'd say vast majority of cases are simply running decoding
from a walsender, which may occasionally restart, but I doubt the users
are interested in per-pid data - they probably want aggregated data.

Well, technically we do -- we have the pg_stat_xact_* views. However, those are only viewable from *inside* the session itself (which can sometimes be quite annoying).

This does somewhat apply in that normal transactions send their stats batches at transaction end. If this is data we'd be interested in viewing inside of that, a more direct exposure would be needed -- such as the way we do with LSNs in pg_stat_replication or whatever.

For long-term monitoring, people definitely want aggregate data, I'd say. The "realtime data", if we call it that, is in my experience mostly interesting if you want to define alerts etc. ("replication standby is too far behind" is alertable through that, whereas things like "total amount of replication traffic over the past hour" are more trend-alertable, which is typically handled in a separate system pulling the aggregate stats).


Can someone explain a plausible scenario for which tracking per-process
stats would be needed, and simply computing deltas would not work? How
will you know which old PID is which, what will you do when a PID is
reused, and so on?

I fail to see that one as well, in a real-world scenario. Maybe if you want to do a one-off point-tuning of one tiny piece of a system? But you will then also need the long-term statistics to follow up on whether what you did was correct anyway...

--

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Fri, 26 Jun 2020 at 17:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 26, 2020 at 11:31 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 25 Jun 2020 at 19:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 23, 2020 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra
> > > > <tomas.vondra@2ndquadrant.com> wrote:
> > > > >
> > > > > On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> > > > > >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
> > > > > ><masahiko.sawada@2ndquadrant.com> wrote:
> > > > > >>
> > > > > >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> > > > > >> >
> > > > > >> > >
> > > > > >> > >What if the decoding has been performed by multiple backends using the
> > > > > >> > >same slot?  In that case, it will be difficult to make the judgment
> > > > > >> > >for the value of logical_decoding_work_mem based on stats.  It would
> > > > > >> > >make sense if we provide a way to set logical_decoding_work_mem for a
> > > > > >> > >slot but not sure if that is better than what we have now.
> > > > > >> > >
> > > > > >>
> > > > > >> I thought that the stats are relevant to what
> > > > > >> logical_decoding_work_mem value was but not with who performed logical
> > > > > >> decoding. So even if multiple backends perform logical decoding using
> > > > > >> the same slot, the user can directly use stats as long as
> > > > > >> logical_decoding_work_mem value doesn’t change.
> > > > > >>
> > >
> > > Today, I thought about it again, and if we consider the point that
> > > logical_decoding_work_mem value doesn’t change much then having the
> > > stats at slot-level would also allow computing
> > > logical_decoding_work_mem based on stats.  Do you think it is a
> > > reasonable assumption that users won't change
> > > logical_decoding_work_mem for different processes (WALSender, etc.)?
> >
> > FWIW, if we use logical_decoding_work_mem as a threshold of starting
> > of sending changes to a subscriber, I think there might be use cases
> > where the user wants to set different logical_decoding_work_mem values
> > to different wal senders. For example, setting a lower value to
> > minimize the latency of synchronous logical replication to a near-site
> > whereas setting a large value to minimize the amount of data sent to a
> > far site.
> >
>
> How does setting a large value can minimize the amount of data sent?
> One possibility is if there are a lot of transaction aborts and
> transactions are not large enough that they cross
> logical_decoding_work_mem threshold but such cases shouldn't be many.

Yeah, this is what I meant.

I agree that it would not be a common case that the user sets
different values for different processes. Based on that assumption, I
also think having the stats at the slot level is a good idea. But I
might want to have the reset function.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Jun 29, 2020 at 10:26 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Fri, 26 Jun 2020 at 17:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 26, 2020 at 11:31 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Thu, 25 Jun 2020 at 19:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Today, I thought about it again, and if we consider the point that
> > > > logical_decoding_work_mem value doesn’t change much then having the
> > > > stats at slot-level would also allow computing
> > > > logical_decoding_work_mem based on stats.  Do you think it is a
> > > > reasonable assumption that users won't change
> > > > logical_decoding_work_mem for different processes (WALSender, etc.)?
> > >
> > > FWIW, if we use logical_decoding_work_mem as a threshold of starting
> > > of sending changes to a subscriber, I think there might be use cases
> > > where the user wants to set different logical_decoding_work_mem values
> > > to different wal senders. For example, setting a lower value to
> > > minimize the latency of synchronous logical replication to a near-site
> > > whereas setting a large value to minimize the amount of data sent to a
> > > far site.
> > >
> >
> > How does setting a large value can minimize the amount of data sent?
> > One possibility is if there are a lot of transaction aborts and
> > transactions are not large enough that they cross
> > logical_decoding_work_mem threshold but such cases shouldn't be many.
>
> Yeah, this is what I meant.
>
> I agree that it would not be a common case that the user sets
> different values for different processes. Based on that assumption, I
> also think having the stats at slot-level is a good idea.
>

Okay.

> But I might
> want to have the reset function.
>

I don't mind, but let's first see how the patch for the basic feature
looks and what is required to implement it.  Are you interested in
writing the patch for this work?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 29 Jun 2020 at 20:37, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jun 29, 2020 at 10:26 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Fri, 26 Jun 2020 at 17:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 26, 2020 at 11:31 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Thu, 25 Jun 2020 at 19:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Today, I thought about it again, and if we consider the point that
> > > > > logical_decoding_work_mem value doesn’t change much then having the
> > > > > stats at slot-level would also allow computing
> > > > > logical_decoding_work_mem based on stats.  Do you think it is a
> > > > > reasonable assumption that users won't change
> > > > > logical_decoding_work_mem for different processes (WALSender, etc.)?
> > > >
> > > > FWIW, if we use logical_decoding_work_mem as a threshold of starting
> > > > of sending changes to a subscriber, I think there might be use cases
> > > > where the user wants to set different logical_decoding_work_mem values
> > > > to different wal senders. For example, setting a lower value to
> > > > minimize the latency of synchronous logical replication to a near-site
> > > > whereas setting a large value to minimize the amount of data sent to a
> > > > far site.
> > > >
> > >
> > > How does setting a large value can minimize the amount of data sent?
> > > One possibility is if there are a lot of transaction aborts and
> > > transactions are not large enough that they cross
> > > logical_decoding_work_mem threshold but such cases shouldn't be many.
> >
> > Yeah, this is what I meant.
> >
> > I agree that it would not be a common case that the user sets
> > different values for different processes. Based on that assumption, I
> > also think having the stats at slot-level is a good idea.
> >
>
> Okay.
>
> > But I might
> > want to have the reset function.
> >
>
> I don't mind but lets fist see how the patch for the basic feature
> looks and what is required to implement it?  Are you interested in
> writing the patch for this work?

Yes, I'll write the draft patch.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jun 30, 2020 at 6:38 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 29 Jun 2020 at 20:37, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jun 29, 2020 at 10:26 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > I agree that it would not be a common case that the user sets
> > > different values for different processes. Based on that assumption, I
> > > also think having the stats at slot-level is a good idea.
> > >
> >
> > Okay.
> >
> > > But I might
> > > want to have the reset function.
> > >
> >
> > I don't mind but lets fist see how the patch for the basic feature
> > looks and what is required to implement it?  Are you interested in
> > writing the patch for this work?
>
> Yes, I'll write the draft patch.
>

Great, thanks.  One thing we can consider is that instead of storing
the stats directly in the slot, we can send them to the stats
collector as suggested by Tomas.  Basically, that can avoid contention
around slots (see the discussion in email [1]).  I have not evaluated
either approach in detail, so you can let us know the advantage of
one over the other.  Now, you might already be considering this, but I
thought it better to share what I have in mind rather than raising it
later once you have the draft patch ready.

[1] - https://www.postgresql.org/message-id/20200623101831.it6lzwbm37xwquco%40development

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 30 Jun 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 30, 2020 at 6:38 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Mon, 29 Jun 2020 at 20:37, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Jun 29, 2020 at 10:26 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > I agree that it would not be a common case that the user sets
> > > > different values for different processes. Based on that assumption, I
> > > > also think having the stats at slot-level is a good idea.
> > > >
> > >
> > > Okay.
> > >
> > > > But I might
> > > > want to have the reset function.
> > > >
> > >
> > > I don't mind but lets fist see how the patch for the basic feature
> > > looks and what is required to implement it?  Are you interested in
> > > writing the patch for this work?
> >
> > Yes, I'll write the draft patch.
> >
>
> Great, thanks.  One thing we can consider is that instead of storing
> the stats directly in the slot we can consider sending it to stats
> collector as suggested by Tomas.  Basically that can avoid contention
> around slots (See discussion in email [1]).  I have not evaluated any
> of the approaches in detail so you can let us know the advantage of
> one over another.  Now, you might be already considering this but I
> thought it is better to share what I have in mind rather than saying
> that later once you have the draft patch ready.

Thanks! Yes, I'm working on this patch with a view to sending the
stats to the stats collector.

I've attached a PoC patch that implements a simple approach. I'd like to
discuss how we collect the replication slot statistics in the stats
collector before I bring the patch to completion.

In this PoC patch, we have an array of PgStat_ReplSlotStats structs
with max_replication_slots entries. The backend and wal sender
processes send the slot statistics to the stats collector when decoding
a commit WAL record.

typedef struct PgStat_ReplSlotStats
{
    char            slotname[NAMEDATALEN];
    PgStat_Counter  spill_txns;
    PgStat_Counter  spill_count;
    PgStat_Counter  spill_bytes;
} PgStat_ReplSlotStats;
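
The send path would look roughly like the sketch below; the message
type, the PGSTAT_MTYPE_REPLSLOT value, and the function name are
illustrative assumptions rather than exactly what the attached patch
does:

typedef struct PgStat_MsgReplSlot
{
    PgStat_MsgHdr   m_hdr;                    /* standard pgstat message header */
    char            m_slotname[NAMEDATALEN];  /* slot the counters belong to */
    PgStat_Counter  m_spill_txns;
    PgStat_Counter  m_spill_count;
    PgStat_Counter  m_spill_bytes;
} PgStat_MsgReplSlot;

/* Called from the decoding path once a commit record has been processed. */
void
pgstat_report_replslot(const char *slotname, PgStat_Counter spill_txns,
                       PgStat_Counter spill_count, PgStat_Counter spill_bytes)
{
    PgStat_MsgReplSlot msg;

    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_REPLSLOT);
    strlcpy(msg.m_slotname, slotname, NAMEDATALEN);
    msg.m_spill_txns = spill_txns;
    msg.m_spill_count = spill_count;
    msg.m_spill_bytes = spill_bytes;
    pgstat_send(&msg, sizeof(msg));
}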

There are a few points I'd like to discuss:

First, since the unique identifier of replication slots is the name, the
process sends slot statistics along with the slot name to the stats
collector. I'm concerned about the amount of data sent to the stats
collector and the cost of searching for the statistics within the
statistics array (it's O(N), where N is max_replication_slots). Since
the maximum length of a slot name is NAMEDATALEN (64 bytes by default)
and max_replication_slots is unlikely to be a large number, I might be
worrying too much, but it seems better to avoid this if we can do so
easily. An idea I came up with is to use the index of slots (i.e., the
index into ReplicationSlotCtl->replication_slots[]) as the index of the
statistics of the slot in the stats collector. But since the index of a
slot could change after a restart, we would need to somehow synchronize
the indexes between the two arrays. So I thought that we could determine
the index of the statistics of a slot at ReplicationSlotAcquire() or
ReplicationSlotCreate(), but that will in turn need to read the stats
file while holding ReplicationSlotControlLock, to prevent the same
statistics index being used by a concurrent process that is creating a
slot. I might be missing something though.

Second, as long as the unique identifier is the slot name, there is no
convenient way to distinguish between old and new replication slots
with the same name, so the backend or wal sender process sends a
message to the stats collector to drop the replication slot at
ReplicationSlotDropPtr(). This strategy differs from what we do for
table, index, and function statistics. It might not be a problem, but
I'm still thinking about a better way.
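
The drop notification itself can be a small message; roughly, with
illustrative names (again a sketch, not necessarily the attached patch):

typedef struct PgStat_MsgReplSlotDrop
{
    PgStat_MsgHdr   m_hdr;
    char            m_slotname[NAMEDATALEN];  /* slot being dropped */
} PgStat_MsgReplSlotDrop;

/* Sketch: called from ReplicationSlotDropPtr() so the collector can
 * discard the stats entry kept under this slot name. */
void
pgstat_report_replslot_drop(const char *slotname)
{
    PgStat_MsgReplSlotDrop msg;

    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_REPLSLOTDROP);
    strlcpy(msg.m_slotname, slotname, NAMEDATALEN);
    pgstat_send(&msg, sizeof(msg));
}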

The new view name is also an open question. I prefer
pg_stat_replication_slots, with stats for physical replication slots
added to the same view in the future, rather than a separate view.

Aside from the above, this patch will rework most of the changes
introduced by commit 9290ad198b1 and add a fair amount of new code. I'm
concerned about whether such changes are acceptable at the time of beta 2.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Ajin Cherian
Date:


On Thu, Jul 2, 2020 at 1:31 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:


Thanks! Yes, I'm working on this patch while considering to send the
stats to stats collector.

I've attached PoC patch that implements a simple approach. I'd like to
discuss how we collect the replication slot statistics in the stats
collector before I bring the patch to completion.


I understand the patch is only at the initial stage, but I just tried testing it. Using the patch, I enabled logical replication and created two pub/subs (sub1, sub2) for two separate tables (t1, t2). I inserted data into the second table (t2) such that it spills to disk.
Then when I checked the stats using the new function pg_stat_get_replication_slots(), I see that the same stats are updated for both slots, when ideally they should be reflected in the second slot alone.

postgres=# SELECT s.name, s.spill_txns,    s.spill_count,    s.spill_bytes   FROM pg_stat_get_replication_slots() s(name, spill_txns, spill_count, spill_bytes);
 name | spill_txns | spill_count | spill_bytes
------+------------+-------------+-------------
 sub1 |          1 |          20 |  1320000000
 sub2 |          1 |          20 |  1320000000
(2 rows)

I haven't debugged the issue yet; I can if you wish, but I just thought I'd let you know what I found.

thanks,
Ajin Cherian
Fujitsu Australia

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Sat, 4 Jul 2020 at 22:13, Ajin Cherian <itsajin@gmail.com> wrote:
>
>
>
> On Thu, Jul 2, 2020 at 1:31 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>>
>>
>>
>> Thanks! Yes, I'm working on this patch while considering to send the
>> stats to stats collector.
>>
>> I've attached PoC patch that implements a simple approach. I'd like to
>> discuss how we collect the replication slot statistics in the stats
>> collector before I bring the patch to completion.
>>
>
> I understand the patch is only in the initial stage but I just tried testing it.

Thank you for testing the patch!

> Using the patch, I enabled logical replication and created two pub/subs (sub1, sub2) for two separate tables (t1, t2).
> I inserted data into the second table (t2) such that it spills to disk.

> Then when I checked the stats using the new function pg_stat_get_replication_slots(), I see that the same stats are
> updated for both slots, when ideally they should be reflected in the second slot alone.
>
> postgres=# SELECT s.name, s.spill_txns, s.spill_count, s.spill_bytes FROM pg_stat_get_replication_slots()
> s(name, spill_txns, spill_count, spill_bytes);
>  name | spill_txns | spill_count | spill_bytes
> ------+------------+-------------+-------------
>  sub1 |          1 |          20 |  1320000000
>  sub2 |          1 |          20 |  1320000000
> (2 rows)
>

I think this is because the logical decoding behind those two logical
replications decodes all WAL records *before* filtering by the
specified tables. In logical replication, we decode the whole WAL
stream and pass it to a logical decoding output plugin such as
pgoutput, which then filters tables according to the publication.
Therefore, even if subscription sub1 is not interested in changes
related to table t2, the replication slot sub1 still needs to decode
the whole WAL stream, resulting in spilling to disk.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Jul 2, 2020 at 9:01 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> I've attached PoC patch that implements a simple approach. I'd like to
> discuss how we collect the replication slot statistics in the stats
> collector before I bring the patch to completion.
>
> In this PoC patch, we have the array of PgStat_ReplSlotStats struct
> which has max_replication_slots entries. The backend and wal sender
> send the slot statistics to the stats collector when decoding a commit
> WAL record.
>
> typedef struct PgStat_ReplSlotStats
> {
>     char            slotname[NAMEDATALEN];
>     PgStat_Counter  spill_txns;
>     PgStat_Counter  spill_count;
>     PgStat_Counter  spill_bytes;
> } PgStat_ReplSlotStats;
>
> What I'd like to discuss are:
>
> Since the unique identifier of replication slots is the name, the
> process sends slot statistics along with slot name to stats collector.
> I'm concerned about the amount of data sent to the stats collector and
> the cost of searching the statistics within the statistics array (it’s
> O(N) where N is max_replication_slots). Since the maximum length of
> slot name is NAMEDATALEN (64 bytes default) and max_replication_slots
> is unlikely a large number, I might be too worrying but it seems like
> it’s better to avoid that if we can do that easily. An idea I came up
> with is to use the index of slots (i.g., the index of
> ReplicationSlotCtl->replication_slots[]) as the index of statistics of
> slot in the stats collector. But since the index of slots could change
> after the restart we need to synchronize the index of slots on both
> array somehow. So I thought that we can determine the index of the
> statistics of slots at ReplicationSlotAcquire() or
> ReplicationSlotCreate(), but it will in turn need to read stats file
> while holding ReplicationSlotControlLock to prevent the same index of
> the statistics being used by the concurrent process who creating a
> slot. I might be missing something though.
>

I don't think we should be bothered about large values of
max_replication_slots.  The default value is 10, and I am not sure
users will select values large enough that we should worry about
searching the array by name.  If it turns out to be a problem, we can
try to invent something to mitigate it.

> Second, as long as the unique identifier is the slot name there is no
> convenient way to distinguish between the same name old and new
> replication slots, so the backend process or wal sender process sends
> a message to the stats collector to drop the replication slot at
> ReplicationSlotDropPtr(). This strategy differs from what we do for
> table, index, and function statistics. It might not be a problem but
> I’m thinking a better way.
>

Can we rely on message ordering in the transmission mechanism (UDP)
for stats?  The wiki [1] suggests we can't.  If so, then this might
not work.

> The new view name is also an open question. I prefer
> pg_stat_replication_slots and to add stats of physical replication
> slots to the same view in the future, rather than a separate view.
>

This sounds okay to me.

> Aside from the above, this patch will change the most of the changes
> introduced by commit 9290ad198b1 and introduce new code much. I’m
> concerned whether such changes are acceptable at the time of beta 2.
>

I think it depends on the final patch. My initial thought was that we
should do this for PG14, but if you are suggesting removing the changes
done by commit 9290ad198b1 then we need to think it over.  I can
think of the below options:
a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
were doing the spilling in PG13 anyway without any extra work; we just
didn't have stats for it.
b. Try to get your patch into PG13 if we can; otherwise, revert the
feature 9290ad198b1.
c. Keep whatever we have in commit 9290ad198b1 for PG13 and
additionally have what we are discussing here for PG14.  This means
that spilled stats at the slot level will be available in PG14 via
pg_stat_replication_slots, and for individual WAL senders they will be
available via pg_stat_replication in both PG13 and PG14.  Even if we
can get your patch into PG13, we can still keep those in
pg_stat_replication.
d. Keep whatever we have in commit 9290ad198b1 for PG13 and change it
for PG14.  I don't think this will be a popular approach.

[1] - https://en.wikipedia.org/wiki/User_Datagram_Protocol

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 2, 2020 at 9:01 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I've attached PoC patch that implements a simple approach. I'd like to
> > discuss how we collect the replication slot statistics in the stats
> > collector before I bring the patch to completion.
> >
> > In this PoC patch, we have the array of PgStat_ReplSlotStats struct
> > which has max_replication_slots entries. The backend and wal sender
> > send the slot statistics to the stats collector when decoding a commit
> > WAL record.
> >
> > typedef struct PgStat_ReplSlotStats
> > {
> >     char            slotname[NAMEDATALEN];
> >     PgStat_Counter  spill_txns;
> >     PgStat_Counter  spill_count;
> >     PgStat_Counter  spill_bytes;
> > } PgStat_ReplSlotStats;
> >
> > What I'd like to discuss are:
> >
> > Since the unique identifier of replication slots is the name, the
> > process sends slot statistics along with slot name to stats collector.
> > I'm concerned about the amount of data sent to the stats collector and
> > the cost of searching the statistics within the statistics array (it’s
> > O(N) where N is max_replication_slots). Since the maximum length of
> > slot name is NAMEDATALEN (64 bytes default) and max_replication_slots
> > is unlikely a large number, I might be too worrying but it seems like
> > it’s better to avoid that if we can do that easily. An idea I came up
> > with is to use the index of slots (i.g., the index of
> > ReplicationSlotCtl->replication_slots[]) as the index of statistics of
> > slot in the stats collector. But since the index of slots could change
> > after the restart we need to synchronize the index of slots on both
> > array somehow. So I thought that we can determine the index of the
> > statistics of slots at ReplicationSlotAcquire() or
> > ReplicationSlotCreate(), but it will in turn need to read stats file
> > while holding ReplicationSlotControlLock to prevent the same index of
> > the statistics being used by the concurrent process who creating a
> > slot. I might be missing something though.
> >
>
> I don't think we should be bothered about the large values of
> max_replication_slots.  The default value is 10 and I am not sure if
> users will be able to select values large enough that we should bother
> about searching them by name.  I think if it could turn out to be a
> problem then we can try to invent something to mitigate it.

Agreed.

>
> > Second, as long as the unique identifier is the slot name there is no
> > convenient way to distinguish between the same name old and new
> > replication slots, so the backend process or wal sender process sends
> > a message to the stats collector to drop the replication slot at
> > ReplicationSlotDropPtr(). This strategy differs from what we do for
> > table, index, and function statistics. It might not be a problem but
> > I’m thinking a better way.
> >
>
> Can we rely on message ordering in the transmission mechanism (UDP)
> for stats?  The wiki suggests [1] we can't.  If so, then this might
> not work.

Yeah, I'm also concerned about this. Another idea would be to have
another unique identifier to distinguish old and new replication slots
with the same name, for example a creation timestamp. We could then
reclaim the stats of unused slots later, as we do for table and
function statistics.

On the other hand, if the ordering were reversed, we would miss those
stats, but the next stats report would create a new entry. If the
problem is unlikely to happen in the common case, we can live with
that.

> > The new view name is also an open question. I prefer
> > pg_stat_replication_slots and to add stats of physical replication
> > slots to the same view in the future, rather than a separate view.
> >
>
> This sounds okay to me.
>
> > Aside from the above, this patch will change the most of the changes
> > introduced by commit 9290ad198b1 and introduce new code much. I’m
> > concerned whether such changes are acceptable at the time of beta 2.
> >
>
> I think it depends on the final patch. My initial thought was that we
> should do this for PG14 but if you are suggesting removing the changes
> done by commit 9290ad198b1 then we need to think over it.  I could
> think of below options:
> a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
> were anyway having spilling without any work in PG13 but didn’t have
> stats.
> b. Try to get your patch in PG13 if we can, otherwise, revert the
> feature 9290ad198b1.
> c. Get whatever we have in commit 9290ad198b1 for PG13 and
> additionally have what we are discussing here for PG14.  This means
> that spilled stats at slot level will be available in PG14 via
> pg_stat_replication_slots and for individual WAL senders it will be
> available via pg_stat_replication both in PG13 and PG14.  Even if we
> can get your patch in PG13, we can still keep those in
> pg_stat_replication.
> d. Get whatever we have in commit 9290ad198b1 for PG13 and change it
> for PG14.  I don't think this will be a popular approach.

I was thinking option (a) or (b). I'm inclined to option (a), since the
PoC patch added a certain amount of new code. I agree with you that
it depends on the final patch.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jul 7, 2020 at 7:07 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > Second, as long as the unique identifier is the slot name there is no
> > > convenient way to distinguish between the same name old and new
> > > replication slots, so the backend process or wal sender process sends
> > > a message to the stats collector to drop the replication slot at
> > > ReplicationSlotDropPtr(). This strategy differs from what we do for
> > > table, index, and function statistics. It might not be a problem but
> > > I’m thinking a better way.
> > >
> >
> > Can we rely on message ordering in the transmission mechanism (UDP)
> > for stats?  The wiki suggests [1] we can't.  If so, then this might
> > not work.
>
> Yeah, I'm also concerned about this. Another idea would be to have
> another unique identifier to distinguish old and new replication slots
> with the same name. For example, creation timestamp. And then we
> reclaim the stats of unused slots later like table and function
> statistics.
>

So, we would need to have the 'creation timestamp' as persistent data
for slots to achieve this?  I am not sure about adding creation_time as
an identifier for this case, because users can change the clock on
their systems, so it might not be a bullet-proof method, but I agree
that it can work in general.

> On the other hand, if the ordering were to be reversed, we would miss
> that stats but the next stat reporting would create the new entry. If
> the problem is unlikely to happen in common case we can live with
> that.
>

Yeah, that is a valid point, and in any case some UDP packets can be
lost, so maybe we don't need to worry too much about this.  I guess we
can add a comment in the code for such a case.

> >
> > > Aside from the above, this patch will change the most of the changes
> > > introduced by commit 9290ad198b1 and introduce new code much. I’m
> > > concerned whether such changes are acceptable at the time of beta 2.
> > >
> >
> > I think it depends on the final patch. My initial thought was that we
> > should do this for PG14 but if you are suggesting removing the changes
> > done by commit 9290ad198b1 then we need to think over it.  I could
> > think of below options:
> > a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
> > were anyway having spilling without any work in PG13 but didn’t have
> > stats.
> > b. Try to get your patch in PG13 if we can, otherwise, revert the
> > feature 9290ad198b1.
> > c. Get whatever we have in commit 9290ad198b1 for PG13 and
> > additionally have what we are discussing here for PG14.  This means
> > that spilled stats at slot level will be available in PG14 via
> > pg_stat_replication_slots and for individual WAL senders it will be
> > available via pg_stat_replication both in PG13 and PG14.  Even if we
> > can get your patch in PG13, we can still keep those in
> > pg_stat_replication.
> > d. Get whatever we have in commit 9290ad198b1 for PG13 and change it
> > for PG14.  I don't think this will be a popular approach.
>
> I was thinking option (a) or (b). I'm inclined to option (a) since the
> PoC patch added a certain amount of new codes. I agree with you that
> it depends on the final patch.
>

Magnus, Tomas, others, do you have any suggestions on the above
options or let us know if you have any other option in mind?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Magnus Hagander
Date:


On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 7, 2020 at 7:07 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > Second, as long as the unique identifier is the slot name there is no
> > > convenient way to distinguish between the same name old and new
> > > replication slots, so the backend process or wal sender process sends
> > > a message to the stats collector to drop the replication slot at
> > > ReplicationSlotDropPtr(). This strategy differs from what we do for
> > > table, index, and function statistics. It might not be a problem but
> > > I’m thinking a better way.
> > >
> >
> > Can we rely on message ordering in the transmission mechanism (UDP)
> > for stats?  The wiki suggests [1] we can't.  If so, then this might
> > not work.
>
> Yeah, I'm also concerned about this. Another idea would be to have
> another unique identifier to distinguish old and new replication slots
> with the same name. For example, creation timestamp. And then we
> reclaim the stats of unused slots later like table and function
> statistics.
>

So, we need to have 'creation timestamp' as persistent data for slots
to achieve this?  I am not sure of adding creation_time as a parameter
to identify for this case because users can change timings on systems
so it might not be a bullet-proof method but I agree that it can work
in general.

If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication slots? That might simplify this problem quite a bit?


> On the other hand, if the ordering were to be reversed, we would miss
> that stats but the next stat reporting would create the new entry. If
> the problem is unlikely to happen in common case we can live with
> that.
>

Yeah, that is a valid point and I think otherwise also some UDP
packets can be lost so maybe we don't need to worry too much about
this.  I guess we can add a comment in the code for such a case.

The fact that we may in theory lose some packets over UDP is the main reason we're using UDP in the first place, I believe :) But it's highly unlikely to happen in the real world (and I think on some platforms impossible).


> > > Aside from the above, this patch will change the most of the changes
> > > introduced by commit 9290ad198b1 and introduce new code much. I’m
> > > concerned whether such changes are acceptable at the time of beta 2.
> > >
> >
> > I think it depends on the final patch. My initial thought was that we
> > should do this for PG14 but if you are suggesting removing the changes
> > done by commit 9290ad198b1 then we need to think over it.  I could
> > think of below options:
> > a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
> > were anyway having spilling without any work in PG13 but didn’t have
> > stats.
> > b. Try to get your patch in PG13 if we can, otherwise, revert the
> > feature 9290ad198b1.
> > c. Get whatever we have in commit 9290ad198b1 for PG13 and
> > additionally have what we are discussing here for PG14.  This means
> > that spilled stats at slot level will be available in PG14 via
> > pg_stat_replication_slots and for individual WAL senders it will be
> > available via pg_stat_replication both in PG13 and PG14.  Even if we
> > can get your patch in PG13, we can still keep those in
> > pg_stat_replication.
> > d. Get whatever we have in commit 9290ad198b1 for PG13 and change it
> > for PG14.  I don't think this will be a popular approach.
>
> I was thinking option (a) or (b). I'm inclined to option (a) since the
> PoC patch added a certain amount of new codes. I agree with you that
> it depends on the final patch.
>

Magnus, Tomas, others, do you have any suggestions on the above
options or let us know if you have any other option in mind?


I have a feeling it's far too late for (b) at this time. Regardless of the size of the patch, it feels that this can end up being a rushed and not thought-through-all-the-way one, in which case we may end up in an even worse position.

Much as I would like to have these stats earlier, I'm also leaning towards (a).

//Magnus

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 7 Jul 2020 at 17:50, Magnus Hagander <magnus@hagander.net> wrote:
>
>
>
> On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Jul 7, 2020 at 7:07 AM Masahiko Sawada
>> <masahiko.sawada@2ndquadrant.com> wrote:
>> >
>> > On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > >
>> > > > Second, as long as the unique identifier is the slot name there is no
>> > > > convenient way to distinguish between the same name old and new
>> > > > replication slots, so the backend process or wal sender process sends
>> > > > a message to the stats collector to drop the replication slot at
>> > > > ReplicationSlotDropPtr(). This strategy differs from what we do for
>> > > > table, index, and function statistics. It might not be a problem but
>> > > > I’m thinking a better way.
>> > > >
>> > >
>> > > Can we rely on message ordering in the transmission mechanism (UDP)
>> > > for stats?  The wiki suggests [1] we can't.  If so, then this might
>> > > not work.
>> >
>> > Yeah, I'm also concerned about this. Another idea would be to have
>> > another unique identifier to distinguish old and new replication slots
>> > with the same name. For example, creation timestamp. And then we
>> > reclaim the stats of unused slots later like table and function
>> > statistics.
>> >
>>
>> So, we need to have 'creation timestamp' as persistent data for slots
>> to achieve this?  I am not sure of adding creation_time as a parameter
>> to identify for this case because users can change timings on systems
>> so it might not be a bullet-proof method but I agree that it can work
>> in general.
>
>
> If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication slots?
> That might simplify this problem quite a bit?

Yeah, I guess assigning oids to replication slots in the same way as
oids in system catalogs might not work, because a physical replication
slot can be created even during recovery. But using a
monotonically increasing integer as an id seems better and
straightforward. This id would not necessarily be displayed in the
pg_replication_slots view, because the user can already use the slot
name as a unique identifier.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Jul 8, 2020 at 11:28 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 7 Jul 2020 at 17:50, Magnus Hagander <magnus@hagander.net> wrote:
> >
> >
> >
> > On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> On Tue, Jul 7, 2020 at 7:07 AM Masahiko Sawada
> >> <masahiko.sawada@2ndquadrant.com> wrote:
> >> >
> >> > On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> > >
> >> > > > Second, as long as the unique identifier is the slot name there is no
> >> > > > convenient way to distinguish between the same name old and new
> >> > > > replication slots, so the backend process or wal sender process sends
> >> > > > a message to the stats collector to drop the replication slot at
> >> > > > ReplicationSlotDropPtr(). This strategy differs from what we do for
> >> > > > table, index, and function statistics. It might not be a problem but
> >> > > > I’m thinking a better way.
> >> > > >
> >> > >
> >> > > Can we rely on message ordering in the transmission mechanism (UDP)
> >> > > for stats?  The wiki suggests [1] we can't.  If so, then this might
> >> > > not work.
> >> >
> >> > Yeah, I'm also concerned about this. Another idea would be to have
> >> > another unique identifier to distinguish old and new replication slots
> >> > with the same name. For example, creation timestamp. And then we
> >> > reclaim the stats of unused slots later like table and function
> >> > statistics.
> >> >
> >>
> >> So, we need to have 'creation timestamp' as persistent data for slots
> >> to achieve this?  I am not sure of adding creation_time as a parameter
> >> to identify for this case because users can change timings on systems
> >> so it might not be a bullet-proof method but I agree that it can work
> >> in general.
> >
> >
> > If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication slots?
> > That might simplify this problem quite a bit?
>
> Yeah, I guess assigning oids to replication slots in the same way of
> oids in system catalogs might not work because physical replication
> slot can be created even during recovery. But using a
> monotonically-increasing integer as id seems better and straight
> forward.
>

But don't we need to make it WAL-logged as well, similar to what we do
in GetNewObjectId?  I am wondering whether we really need oids for
slots, or whether it is okay to have some approximate stats in boundary
cases.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 8 Jul 2020 at 16:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 8, 2020 at 11:28 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 7 Jul 2020 at 17:50, Magnus Hagander <magnus@hagander.net> wrote:
> > >
> > >
> > >
> > > On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >>
> > >> On Tue, Jul 7, 2020 at 7:07 AM Masahiko Sawada
> > >> <masahiko.sawada@2ndquadrant.com> wrote:
> > >> >
> > >> > On Mon, 6 Jul 2020 at 20:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >> > >
> > >> > > > Second, as long as the unique identifier is the slot name there is no
> > >> > > > convenient way to distinguish between the same name old and new
> > >> > > > replication slots, so the backend process or wal sender process sends
> > >> > > > a message to the stats collector to drop the replication slot at
> > >> > > > ReplicationSlotDropPtr(). This strategy differs from what we do for
> > >> > > > table, index, and function statistics. It might not be a problem but
> > >> > > > I’m thinking a better way.
> > >> > > >
> > >> > >
> > >> > > Can we rely on message ordering in the transmission mechanism (UDP)
> > >> > > for stats?  The wiki suggests [1] we can't.  If so, then this might
> > >> > > not work.
> > >> >
> > >> > Yeah, I'm also concerned about this. Another idea would be to have
> > >> > another unique identifier to distinguish old and new replication slots
> > >> > with the same name. For example, creation timestamp. And then we
> > >> > reclaim the stats of unused slots later like table and function
> > >> > statistics.
> > >> >
> > >>
> > >> So, we need to have 'creation timestamp' as persistent data for slots
> > >> to achieve this?  I am not sure of adding creation_time as a parameter
> > >> to identify for this case because users can change timings on systems
> > >> so it might not be a bullet-proof method but I agree that it can work
> > >> in general.
> > >
> > >
> > > If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication
> > > slots? That might simplify this problem quite a bit?
> >
> > Yeah, I guess assigning oids to replication slots in the same way of
> > oids in system catalogs might not work because physical replication
> > slot can be created even during recovery. But using a
> > monotonically-increasing integer as id seems better and straight
> > forward.
> >
>
> But don't we need to make it WAL logged as well similar to what we do
> in GetNewObjectId?

Yes. I was thinking of assigning (the maximum existing slot id + 1) to
a new slot, without WAL logging.
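
Roughly like the following sketch (the slot_id field and the function
name are illustrative only):

/* Sketch: pick the next slot id while holding ReplicationSlotControlLock.
 * slot_id is a hypothetical field in ReplicationSlotPersistentData. */
static uint64
choose_new_replslot_id(void)
{
    uint64      max_id = 0;

    for (int i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

        if (s->in_use && s->data.slot_id > max_id)
            max_id = s->data.slot_id;
    }

    /* Note: without WAL logging, an id can repeat if the slot holding
     * the current maximum is dropped and a new slot is then created. */
    return max_id + 1;
}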

>  I am thinking do we really need Oids for slots or
> is it okay to have some approximate stats in boundary cases?

I think that using oids has another benefit: we don't need to send the
slot name to the stats collector along with the stats. Since the
maximum length of a slot name is NAMEDATALEN and we don't support
pgstat messages larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), a user
who wants to increase NAMEDATALEN might not be able to build.
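
(If we do keep sending the name, a compile-time guard along the
following lines would at least make such breakage explicit;
PgStat_MsgReplSlot stands for the hypothetical message type in the PoC
patch:)

StaticAssertStmt(sizeof(PgStat_MsgReplSlot) <= PGSTAT_MAX_MSG_SIZE,
                 "replication slot stats message exceeds PGSTAT_MAX_MSG_SIZE");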

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 8 Jul 2020 at 16:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 8, 2020 at 11:28 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > >
> > > > If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication
> > > > slots? That might simplify this problem quite a bit?
> > >
> > > Yeah, I guess assigning oids to replication slots in the same way of
> > > oids in system catalogs might not work because physical replication
> > > slot can be created even during recovery. But using a
> > > monotonically-increasing integer as id seems better and straight
> > > forward.
> > >
> >
> > But don't we need to make it WAL logged as well similar to what we do
> > in GetNewObjectId?
>
> Yes. I was thinking that assigning (the maximum number of the existing
> slot id + 1) to a new slot without WAL logging.
>
> >  I am thinking do we really need Oids for slots or
> > is it okay to have some approximate stats in boundary cases?
>
> I think that using oids has another benefit that we don't need to send
> slot name to the stats collector along with the stats. Since the
> maximum size of slot name is NAMEDATALEN and we don't support the
> pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> user wants to increase NAMEDATALEN they might not be able to build.
>

I think NAMEDATALEN is used for many other objects as well and I don't
think we want to change it in the foreseeable future, so that doesn't
sound like a good reason to invent OIDs for slots.  OTOH, I do
understand it would be better to send OIDs than names for slots, but I
am just not sure it is a good idea to invent a new way of generating
OIDs (different from how we do it for other objects in the system) for
this purpose.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Jul 7, 2020 at 2:20 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> > > I think it depends on the final patch. My initial thought was that we
>> > > should do this for PG14 but if you are suggesting removing the changes
>> > > done by commit 9290ad198b1 then we need to think over it.  I could
>> > > think of below options:
>> > > a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
>> > > were anyway having spilling without any work in PG13 but didn’t have
>> > > stats.
>> > > b. Try to get your patch in PG13 if we can, otherwise, revert the
>> > > feature 9290ad198b1.
>> > > c. Get whatever we have in commit 9290ad198b1 for PG13 and
>> > > additionally have what we are discussing here for PG14.  This means
>> > > that spilled stats at slot level will be available in PG14 via
>> > > pg_stat_replication_slots and for individual WAL senders it will be
>> > > available via pg_stat_replication both in PG13 and PG14.  Even if we
>> > > can get your patch in PG13, we can still keep those in
>> > > pg_stat_replication.
>> > > d. Get whatever we have in commit 9290ad198b1 for PG13 and change it
>> > > for PG14.  I don't think this will be a popular approach.
>> >
>> > I was thinking option (a) or (b). I'm inclined to option (a) since the
>> > PoC patch added a certain amount of new codes. I agree with you that
>> > it depends on the final patch.
>> >
>>
>> Magnus, Tomas, others, do you have any suggestions on the above
>> options or let us know if you have any other option in mind?
>>
>
> I have a feeling it's far too late for (b) at this time. Regardless of the size of the patch, it feels that this can
> end up being a rushed and not thought-through-all-the-way one, in which case we may end up in an even worse position.
>
> Much as I would like to have these stats earlier, I'm also leaning towards (a).
>

Fair enough.  The attached patch reverts the commits related to these
stats.  Sawada-San, can you please check whether I have missed
anything, apart from the catversion bump which I will do before commit?



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 9 Jul 2020 at 16:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jul 7, 2020 at 2:20 PM Magnus Hagander <magnus@hagander.net> wrote:
> >
> > On Tue, Jul 7, 2020 at 5:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> > > I think it depends on the final patch. My initial thought was that we
> >> > > should do this for PG14 but if you are suggesting removing the changes
> >> > > done by commit 9290ad198b1 then we need to think over it.  I could
> >> > > think of below options:
> >> > > a. Revert 9290ad198b1 and introduce stats for spilling in PG14.  We
> >> > > were anyway having spilling without any work in PG13 but didn’t have
> >> > > stats.
> >> > > b. Try to get your patch in PG13 if we can, otherwise, revert the
> >> > > feature 9290ad198b1.
> >> > > c. Get whatever we have in commit 9290ad198b1 for PG13 and
> >> > > additionally have what we are discussing here for PG14.  This means
> >> > > that spilled stats at slot level will be available in PG14 via
> >> > > pg_stat_replication_slots and for individual WAL senders it will be
> >> > > available via pg_stat_replication both in PG13 and PG14.  Even if we
> >> > > can get your patch in PG13, we can still keep those in
> >> > > pg_stat_replication.
> >> > > d. Get whatever we have in commit 9290ad198b1 for PG13 and change it
> >> > > for PG14.  I don't think this will be a popular approach.
> >> >
> >> > I was thinking option (a) or (b). I'm inclined to option (a) since the
> >> > PoC patch added a certain amount of new codes. I agree with you that
> >> > it depends on the final patch.
> >> >
> >>
> >> Magnus, Tomas, others, do you have any suggestions on the above
> >> options or let us know if you have any other option in mind?
> >>
> >
> > I have a feeling it's far too late for (b) at this time. Regardless of the size of the patch, it feels that this
> > can end up being a rushed and not thought-through-all-the-way one, in which case we may end up in an even worse
> > position.
> >
> > Much as I would like to have these stats earlier, I'm also leaning towards (a).
> >
>
> Fair enough.  The attached patch reverts the commits related to these
> stats.  Sawada-San, can you please once see if I have missed anything
> apart from catversion bump which I will do before commit?

Thank you for the patch!

Do we remove the corresponding line in the release notes in another
commit? For the rest, the patch looks good to me.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 8 Jul 2020 at 16:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jul 8, 2020 at 11:28 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > >
> > > > > If we need them to be persistent across time like that, perhaps we simply need to assign oids to replication
> > > > > slots? That might simplify this problem quite a bit?
> > > >
> > > > Yeah, I guess assigning oids to replication slots in the same way of
> > > > oids in system catalogs might not work because physical replication
> > > > slot can be created even during recovery. But using a
> > > > monotonically-increasing integer as id seems better and straight
> > > > forward.
> > > >
> > >
> > > But don't we need to make it WAL logged as well similar to what we do
> > > in GetNewObjectId?
> >
> > Yes. I was thinking that assigning (the maximum number of the existing
> > slot id + 1) to a new slot without WAL logging.
> >
> > >  I am thinking do we really need Oids for slots or
> > > is it okay to have some approximate stats in boundary cases?
> >
> > I think that using oids has another benefit that we don't need to send
> > slot name to the stats collector along with the stats. Since the
> > maximum size of slot name is NAMEDATALEN and we don't support the
> > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > user wants to increase NAMEDATALEN they might not be able to build.
> >
>
> I think NAMEDATALEN is used for many other objects as well and I don't
> think we want to change it in foreseeable future, so that doesn't
> sound to be a good reason to invent OIDs for slots.  OTOH, I do
> understand it would be better to send OIDs than names for slots but I
> am just not sure if it is a good idea to invent a new way to generate
> OIDs (which is different from how we do it for other objects in the
> system) for this purpose.

I'm concerned that there might be users running a custom PostgreSQL
build that increases NAMEDATALEN for some reason. But indeed, I also
agree with your concerns. So perhaps we can go with the current PoC
patch approach as the first version (i.e., sending a slot drop message
to the stats collector). When we need such a unique identifier for
other purposes as well, we will be able to change this feature so that
it uses that identifier for statistics reporting.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jul 10, 2020 at 7:19 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 9 Jul 2020 at 16:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Fair enough.  The attached patch reverts the commits related to these
> > stats.  Sawada-San, can you please once see if I have missed anything
> > apart from catversion bump which I will do before commit?
>
> Thank you for the patch!
>
> Do we remove the corresponding line in the release note by another
> commit?
>

Yes, I will do that as well.

> For the rest, the looks good to me.
>

Thanks, will wait for a day or so and then push it early next week.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > >
> > > I think that using oids has another benefit that we don't need to send
> > > slot name to the stats collector along with the stats. Since the
> > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > user wants to increase NAMEDATALEN they might not be able to build.
> > >
> >
> > I think NAMEDATALEN is used for many other objects as well and I don't
> > think we want to change it in foreseeable future, so that doesn't
> > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > understand it would be better to send OIDs than names for slots but I
> > am just not sure if it is a good idea to invent a new way to generate
> > OIDs (which is different from how we do it for other objects in the
> > system) for this purpose.
>
> I'm concerned that there might be users who are using custom
> PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> also agree with your concerns. So perhaps we can go with the current
> PoC patch approach as the first version (i.g., sending slot drop
> message to stats collector). When we need such a unique identifier
> also for other purposes, we will be able to change this feature so
> that it uses that identifier for this statistics reporting purpose.
>

Okay, feel free to submit that version atop my revert patch.  I think
you might want to remove the indexing stuff you added for faster
search, as discussed above.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jul 10, 2020 at 2:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 10, 2020 at 7:19 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 9 Jul 2020 at 16:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Fair enough.  The attached patch reverts the commits related to these
> > > stats.  Sawada-San, can you please once see if I have missed anything
> > > apart from catversion bump which I will do before commit?
> >
> > Thank you for the patch!
> >
> > Do we remove the corresponding line in the release note by another
> > commit?
> >
>
> Yes, I will do that as well.
>
> > For the rest, the looks good to me.
> >
>
> Thanks, will wait for a day or so and then push it early next week.
>

Pushed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Jul 10, 2020 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > >
> > > > I think that using oids has another benefit that we don't need to send
> > > > slot name to the stats collector along with the stats. Since the
> > > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > > user wants to increase NAMEDATALEN they might not be able to build.
> > > >
> > >
> > > I think NAMEDATALEN is used for many other objects as well and I don't
> > > think we want to change it in foreseeable future, so that doesn't
> > > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > > understand it would be better to send OIDs than names for slots but I
> > > am just not sure if it is a good idea to invent a new way to generate
> > > OIDs (which is different from how we do it for other objects in the
> > > system) for this purpose.
> >
> > I'm concerned that there might be users who are using custom
> > PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> > also agree with your concerns. So perhaps we can go with the current
> > PoC patch approach as the first version (i.g., sending slot drop
> > message to stats collector). When we need such a unique identifier
> > also for other purposes, we will be able to change this feature so
> > that it uses that identifier for this statistics reporting purpose.
> >
>
> Okay, feel to submit the version atop my revert patch.
>

Attached, please find the rebased version.  I have kept prorows as 10
instead of 100 for pg_stat_get_replication_slots because I don't see
much reason to keep the value higher than the default value of
max_replication_slots.

As we are targeting this patch for PG14, I think we can now add the
functionality to reset the stats as well.  What do you think?
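
For concreteness, the SQL-callable entry point could look roughly like
the sketch below; the function name and the
pgstat_reset_replslot_counter() helper are illustrative only, nothing
is settled:

Datum
pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
{
    char       *slotname = text_to_cstring(PG_GETARG_TEXT_PP(0));

    /* Ask the stats collector to zero the counters for this slot;
     * the message-sending helper is assumed to be added by the patch. */
    pgstat_reset_replslot_counter(slotname);

    PG_RETURN_VOID();
}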

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 16 Jul 2020 at 15:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 10, 2020 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > >
> > > > >
> > > > > I think that using oids has another benefit that we don't need to send
> > > > > slot name to the stats collector along with the stats. Since the
> > > > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > > > user wants to increase NAMEDATALEN they might not be able to build.
> > > > >
> > > >
> > > > I think NAMEDATALEN is used for many other objects as well and I don't
> > > > think we want to change it in foreseeable future, so that doesn't
> > > > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > > > understand it would be better to send OIDs than names for slots but I
> > > > am just not sure if it is a good idea to invent a new way to generate
> > > > OIDs (which is different from how we do it for other objects in the
> > > > system) for this purpose.
> > >
> > > I'm concerned that there might be users who are using custom
> > > PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> > > also agree with your concerns. So perhaps we can go with the current
> > > PoC patch approach as the first version (i.g., sending slot drop
> > > message to stats collector). When we need such a unique identifier
> > > also for other purposes, we will be able to change this feature so
> > > that it uses that identifier for this statistics reporting purpose.
> > >
> >
> > Okay, feel to submit the version atop my revert patch.
> >
>
> Attached, please find the rebased version.  I have kept prorows as 10
> instead of 100 for pg_stat_get_replication_slots because I don't see
> much reason for keeping the value more than the default value of
> max_replication_slots.
>

Thank you for rebasing the patch! Agreed.

> As we are targeting this patch for PG14, so I think we can now add the
> functionality to reset the stats as well.  What do you think?
>

Yeah, I was also updating the patch while adding the reset functions.

However, I'm concerned about the following part:

+static int
+pgstat_replslot_index(const char *name)
+{
+   int     i;
+
+   Assert(nReplSlotStats <= max_replication_slots);
+   for (i = 0; i < nReplSlotStats; i++)
+   {
+       if (strcmp(replSlotStats[i].slotname, name) == 0)
+           return i; /* found */
+   }
+
+   /* not found, register new slot */
+   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
+   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
+   return nReplSlotStats++;
+}

+static void
+pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
+{
+   int idx;
+
+   idx = pgstat_replslot_index(msg->m_slotname);
+   Assert(idx >= 0 && idx < max_replication_slots);

As long as we cannot rely on message ordering, the above assertion
could fail. For example, suppose that there are no unused replication
slots and the user:

1. drops the existing slot.
2. creates a new slot.

If the stats messages arrive in the order 2 then 1, the above
assertion fails, or we corrupt memory when assertions are disabled.

A possible solution would be to add an in-use flag to
PgStat_ReplSlotStats indicating whether the stats entry for a slot is
in use. When receiving a drop message for a slot, the stats collector
just marks the corresponding entry as unused. When a stats report
arrives for a new slot but there is no unused entry left, ignore the
report. What do you think?
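
A minimal sketch of that idea, assuming an in_use flag is added to
PgStat_ReplSlotStats (the field and function names here are
illustrative, not final):

/* on receipt of a drop message: keep the entry but mark it free */
static void
pgstat_recv_replslot_drop(PgStat_MsgReplSlot *msg, int len)
{
    int     i;

    for (i = 0; i < nReplSlotStats; i++)
    {
        if (replSlotStats[i].in_use &&
            strcmp(replSlotStats[i].slotname, msg->m_slotname) == 0)
        {
            /* the array entry becomes reusable for the next new slot */
            replSlotStats[i].in_use = false;
            return;
        }
    }
}

A report for a new slot would then first look for an entry with in_use
set to false and reuse it; if none is free, the report is ignored.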

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Jul 16, 2020 at 1:45 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 16 Jul 2020 at 15:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jul 10, 2020 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > > >
> > > > > >
> > > > > > I think that using oids has another benefit that we don't need to send
> > > > > > slot name to the stats collector along with the stats. Since the
> > > > > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > > > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > > > > user wants to increase NAMEDATALEN they might not be able to build.
> > > > > >
> > > > >
> > > > > I think NAMEDATALEN is used for many other objects as well and I don't
> > > > > think we want to change it in foreseeable future, so that doesn't
> > > > > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > > > > understand it would be better to send OIDs than names for slots but I
> > > > > am just not sure if it is a good idea to invent a new way to generate
> > > > > OIDs (which is different from how we do it for other objects in the
> > > > > system) for this purpose.
> > > >
> > > > I'm concerned that there might be users who are using custom
> > > > PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> > > > also agree with your concerns. So perhaps we can go with the current
> > > > PoC patch approach as the first version (i.g., sending slot drop
> > > > message to stats collector). When we need such a unique identifier
> > > > also for other purposes, we will be able to change this feature so
> > > > that it uses that identifier for this statistics reporting purpose.
> > > >
> > >
> > > Okay, feel to submit the version atop my revert patch.
> > >
> >
> > Attached, please find the rebased version.  I have kept prorows as 10
> > instead of 100 for pg_stat_get_replication_slots because I don't see
> > much reason for keeping the value more than the default value of
> > max_replication_slots.
> >
>
> Thank you for rebasing the patch! Agreed.
>
> > As we are targeting this patch for PG14, so I think we can now add the
> > functionality to reset the stats as well.  What do you think?
> >
>
> Yeah, I was also updating the patch while adding the reset functions.
>
> However, I'm concerned about the following part:
>
> +static int
> +pgstat_replslot_index(const char *name)
> +{
> +   int     i;
> +
> +   Assert(nReplSlotStats <= max_replication_slots);
> +   for (i = 0; i < nReplSlotStats; i++)
> +   {
> +       if (strcmp(replSlotStats[i].slotname, name) == 0)
> +           return i; /* found */
> +   }
> +
> +   /* not found, register new slot */
> +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> +   return nReplSlotStats++;
> +}
>
> +static void
> +pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
> +{
> +   int idx;
> +
> +   idx = pgstat_replslot_index(msg->m_slotname);
> +   Assert(idx >= 0 && idx < max_replication_slots);
>
> As long as we cannot rely on message ordering, the above assertion
> could be false. For example, suppose that there is no unused
> replication slots and the user:
>
> 1. drops the existing slot.
> 2. creates a new slot.
>
> If the stats messages arrive in order of 2 and 1, the above assertion
> is false or leads to memory corruption when assertions are not
> enabled.
>
> A possible solution would be to add an in-use flag to
> PgStat_ReplSlotStats indicating whether the stats for slot is used or
> not. When receiving a drop message for a slot, the stats collector
> just marks the corresponding stats as unused. When receiving the stats
> report for a new slot but there is no unused stats slot, ignore it.
> What do you think?
>

As of now, you have a boolean flag msg.m_drop to distinguish the drop
message but we don't have a similar way to distinguish the 'create'
message.  What if we have a way to distinguish the 'create' message
(we can probably keep some sort of flag to indicate the type of
message (create, drop, update)) and then, if a slot with the same name
already exists, we ignore such a message?  We also need a way to
create the entry for a slot on a normal stats update message, to
accommodate a lost 'create' message.  Does that make sense?
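
For illustration, the message could carry an explicit type instead of
the bare m_drop flag; a rough sketch with assumed names:

typedef enum PgStat_ReplSlotMsgType
{
    REPLSLOT_MSG_CREATE,
    REPLSLOT_MSG_UPDATE,
    REPLSLOT_MSG_DROP
} PgStat_ReplSlotMsgType;

static void
pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
{
    switch (msg->m_type)
    {
        case REPLSLOT_MSG_CREATE:
            /* entry with this name already exists: stale message */
            break;
        case REPLSLOT_MSG_UPDATE:
            /* no entry yet: create one to cover a lost 'create' */
            break;
        case REPLSLOT_MSG_DROP:
            /* mark the entry unused */
            break;
    }
}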

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 16 Jul 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 16, 2020 at 1:45 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 16 Jul 2020 at 15:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > >
> > > > > On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > > > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > I think that using oids has another benefit that we don't need to send
> > > > > > > slot name to the stats collector along with the stats. Since the
> > > > > > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > > > > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > > > > > user wants to increase NAMEDATALEN they might not be able to build.
> > > > > > >
> > > > > >
> > > > > > I think NAMEDATALEN is used for many other objects as well and I don't
> > > > > > think we want to change it in foreseeable future, so that doesn't
> > > > > > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > > > > > understand it would be better to send OIDs than names for slots but I
> > > > > > am just not sure if it is a good idea to invent a new way to generate
> > > > > > OIDs (which is different from how we do it for other objects in the
> > > > > > system) for this purpose.
> > > > >
> > > > > I'm concerned that there might be users who are using custom
> > > > > PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> > > > > also agree with your concerns. So perhaps we can go with the current
> > > > > PoC patch approach as the first version (i.g., sending slot drop
> > > > > message to stats collector). When we need such a unique identifier
> > > > > also for other purposes, we will be able to change this feature so
> > > > > that it uses that identifier for this statistics reporting purpose.
> > > > >
> > > >
> > > > Okay, feel to submit the version atop my revert patch.
> > > >
> > >
> > > Attached, please find the rebased version.  I have kept prorows as 10
> > > instead of 100 for pg_stat_get_replication_slots because I don't see
> > > much reason for keeping the value more than the default value of
> > > max_replication_slots.
> > >
> >
> > Thank you for rebasing the patch! Agreed.
> >
> > > As we are targeting this patch for PG14, so I think we can now add the
> > > functionality to reset the stats as well.  What do you think?
> > >
> >
> > Yeah, I was also updating the patch while adding the reset functions.
> >
> > However, I'm concerned about the following part:
> >
> > +static int
> > +pgstat_replslot_index(const char *name)
> > +{
> > +   int     i;
> > +
> > +   Assert(nReplSlotStats <= max_replication_slots);
> > +   for (i = 0; i < nReplSlotStats; i++)
> > +   {
> > +       if (strcmp(replSlotStats[i].slotname, name) == 0)
> > +           return i; /* found */
> > +   }
> > +
> > +   /* not found, register new slot */
> > +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> > +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> > +   return nReplSlotStats++;
> > +}
> >
> > +static void
> > +pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
> > +{
> > +   int idx;
> > +
> > +   idx = pgstat_replslot_index(msg->m_slotname);
> > +   Assert(idx >= 0 && idx < max_replication_slots);
> >
> > As long as we cannot rely on message ordering, the above assertion
> > could be false. For example, suppose that there is no unused
> > replication slots and the user:
> >
> > 1. drops the existing slot.
> > 2. creates a new slot.
> >
> > If the stats messages arrive in order of 2 and 1, the above assertion
> > is false or leads to memory corruption when assertions are not
> > enabled.
> >
> > A possible solution would be to add an in-use flag to
> > PgStat_ReplSlotStats indicating whether the stats for slot is used or
> > not. When receiving a drop message for a slot, the stats collector
> > just marks the corresponding stats as unused. When receiving the stats
> > report for a new slot but there is no unused stats slot, ignore it.
> > What do you think?
> >
>
> As of now, you have a boolean flag msg.m_drop to distinguish the drop
> message but we don't have a similar way to distinguish the 'create'
> message.  What if have a way to distinguish 'create' message (we can
> probably keep some sort of flag to indicate the type of message
> (create, drop, update)) and then if the slot with the same name
> already exists, we ignore such a message.  Now, we also need a way to
> create the entry for a slot for a normal stats update message as well
> to accommodate for the lost 'create' message.  Does that make sense?

I might be missing your point, but even if we have a 'create' message,
can't the problem still happen when slots are full and the user drops
slot 'slot_a', creates slot 'slot_b', and the messages arrive in the
reverse order?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Jul 16, 2020 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 16 Jul 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jul 16, 2020 at 1:45 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > A possible solution would be to add an in-use flag to
> > > PgStat_ReplSlotStats indicating whether the stats for slot is used or
> > > not. When receiving a drop message for a slot, the stats collector
> > > just marks the corresponding stats as unused. When receiving the stats
> > > report for a new slot but there is no unused stats slot, ignore it.
> > > What do you think?
> > >
> >
> > As of now, you have a boolean flag msg.m_drop to distinguish the drop
> > message but we don't have a similar way to distinguish the 'create'
> > message.  What if have a way to distinguish 'create' message (we can
> > probably keep some sort of flag to indicate the type of message
> > (create, drop, update)) and then if the slot with the same name
> > already exists, we ignore such a message.  Now, we also need a way to
> > create the entry for a slot for a normal stats update message as well
> > to accommodate for the lost 'create' message.  Does that make sense?
>
> I might be missing your point, but even if we have 'create' message,
> the problem can happen if when slots are full the user drops slot
> ‘slot_a’, creates slot ‘slot_b', and messages arrive in the reverse
> order?
>

In that case too, we should drop the 'create' message for 'slot_b' as
we don't have space, but later, when an 'update' message arrives with
stats for 'slot_b', we will create the entry.  I am also wondering:
what if we send only 'update' and 'drop' messages? The message
ordering problem can still happen, but we would lose at most one
'update' message in that case?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 16 Jul 2020 at 20:01, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 16, 2020 at 4:04 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 16 Jul 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jul 16, 2020 at 1:45 PM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > A possible solution would be to add an in-use flag to
> > > > PgStat_ReplSlotStats indicating whether the stats for slot is used or
> > > > not. When receiving a drop message for a slot, the stats collector
> > > > just marks the corresponding stats as unused. When receiving the stats
> > > > report for a new slot but there is no unused stats slot, ignore it.
> > > > What do you think?
> > > >
> > >
> > > As of now, you have a boolean flag msg.m_drop to distinguish the drop
> > > message but we don't have a similar way to distinguish the 'create'
> > > message.  What if have a way to distinguish 'create' message (we can
> > > probably keep some sort of flag to indicate the type of message
> > > (create, drop, update)) and then if the slot with the same name
> > > already exists, we ignore such a message.  Now, we also need a way to
> > > create the entry for a slot for a normal stats update message as well
> > > to accommodate for the lost 'create' message.  Does that make sense?
> >
> > I might be missing your point, but even if we have 'create' message,
> > the problem can happen if when slots are full the user drops slot
> > ‘slot_a’, creates slot ‘slot_b', and messages arrive in the reverse
> > order?
> >
>
> In that case, also, we should drop the 'create' message of 'slot_b' as
> we don't have space but later when an 'update' message arrives with
> stats for the 'slot_b', we will create the entry.

Agreed.

>  I am also thinking
> what if send only 'update' and 'drop' message, the message ordering
> problem can still happen but we will lose one 'update' message in that
> case?

Yes, I think so too. We will lose at most one 'update' message.

I've updated the patch so that the stats collector ignores the
'update' message if the slot stats array is already full.
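
The resulting behaviour, roughly (this assumes pgstat_replslot_index()
takes a create_it flag and returns -1 when the array is full; the
m_spill_* field names are assumptions):

static void
pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
{
    int     idx;

    /* find the entry; create it only for non-drop messages */
    idx = pgstat_replslot_index(msg->m_slotname, !msg->m_drop);

    /* array full and slot unknown: silently ignore this update */
    if (idx < 0)
        return;

    replSlotStats[idx].spill_txns += msg->m_spill_txns;
    replSlotStats[idx].spill_count += msg->m_spill_count;
    replSlotStats[idx].spill_bytes += msg->m_spill_bytes;
}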

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Jul 23, 2020 at 11:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> I've updated the patch so that the stats collector ignores the
> 'update' message if the slot stats array is already full.
>

This patch needs a rebase. I don't see this patch in the CF app. I
hope you are still interested in working on this.

Review comments:
===============
1.
+CREATE VIEW pg_stat_replication_slots AS
+    SELECT
+            s.name,
+            s.spill_txns,
+            s.spill_count,
+            s.spill_bytes
+    FROM pg_stat_get_replication_slots() AS s;

The view pg_stat_replication_slots should have a column 'stats_reset'
(datatype: timestamp with time zone) as we provide a facility to reset
the slots. A similar column exists in pg_stat_slru as well, so is
there a reason for not providing it here?
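
Concretely, the per-slot stats entry would then carry the reset
timestamp alongside the counters; a sketch (the exact field names are
an assumption):

typedef struct PgStat_ReplSlotStats
{
    char            slotname[NAMEDATALEN];
    PgStat_Counter  spill_txns;
    PgStat_Counter  spill_count;
    PgStat_Counter  spill_bytes;
    TimestampTz     stat_reset_timestamp;   /* exposed as stats_reset */
} PgStat_ReplSlotStats;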

2.
+   </para>
+  </sect2>
+
+  <sect2 id="monitoring-pg-stat-wal-receiver-view">
   <title><structname>pg_stat_wal_receiver</structname></title>

It is better to keep one empty line between </para> and </sect2> to
keep it consistent with the documentation of other views.

3.
  <primary>pg_stat_reset_replication_slot</primary>
+        </indexterm>
+        <function>pg_stat_reset_replication_slot</function> (
<type>text</type> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+         Resets statistics to zero for a single replication slot, or for all
+         replication slots in the cluster.  If the argument is NULL,
all counters
+         shown in the
<structname>pg_stat_replication_slots</structname> view for
+         all replication slots are reset.
+       </para>

I think the information about the parameter of this function is not
completely clear. It seems to me that it should be the name of the
slot for which we want to reset the stats; if so, let's try to be
clear about that.

4.
+pgstat_reset_replslot_counter(const char *name)
+{
+ PgStat_MsgResetreplslotcounter msg;
+
+ if (pgStatSock == PGINVALID_SOCKET)
+ return;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETREPLSLOTCOUNTER);
+ if (name)
+ {
+ memcpy(&msg.m_slotname, name, NAMEDATALEN);
+ msg.clearall = false;
+ }

Don't we want to verify, here or in the caller of this function,
whether the passed slot_name is a valid slot or not? For example, see
pgstat_reset_shared_counters, where we return an error if the target
is not valid.
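
A sketch of the kind of check meant here, assuming a lookup helper
such as SearchNamedReplicationSlot() can be used for the validation
(the exact API is an assumption):

void
pgstat_reset_replslot_counter(const char *name)
{
    PgStat_MsgResetreplslotcounter msg;

    if (pgStatSock == PGINVALID_SOCKET)
        return;

    if (name)
    {
        /* error out early if no slot with this name exists */
        if (SearchNamedReplicationSlot(name) == NULL)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("replication slot \"%s\" does not exist",
                            name)));

        memcpy(&msg.m_slotname, name, NAMEDATALEN);
        msg.clearall = false;
    }
    else
        msg.clearall = true;

    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETREPLSLOTCOUNTER);
    pgstat_send(&msg, sizeof(msg));
}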

5.
+static void
+pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
+ int len)
+{
+ int i;
+ int idx = -1;
+ TimestampTz ts;
+
+ if (!msg->clearall)
+ {
+ /* Get the index of replication slot statistics to reset */
+ idx = pgstat_replslot_index(msg->m_slotname, false);
+
+ if (idx < 0)
+ return; /* not found */

Can we add a comment describing the cases in which we don't expect to
find the slot here, unless there is no way that can happen?

6.
+pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
+ int len)
{
..
+ for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
..
}

I think here we need to traverse up to nReplSlotStats, not SLRU_NUM_ELEMENTS.

7. Don't we need to change PGSTAT_FILE_FORMAT_ID for this patch? We
can probably do it at the end, but it is better to change it now so
that it doesn't slip our mind.

8.
@@ -5350,6 +5474,23 @@ pgstat_read_statsfiles(Oid onlydb, bool
permanent, bool deep)

  break;

+ /*
+ * 'R' A PgStat_ReplSlotStats struct describing a replication slot
+ * follows.
+ */
+ case 'R':
+ if (fread(&replSlotStats[nReplSlotStats], 1,
sizeof(PgStat_ReplSlotStats), fpin)
+ != sizeof(PgStat_ReplSlotStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
+ goto done;
+ }
+ nReplSlotStats++;
+ break;

Both here and in pgstat_read_db_statsfile_timestamp(), the patch
handles the 'R' message after 'D', whereas on the write side 'R' is
written before 'D'. So I think it is better to keep the order during
read the same as during write.

9. While reviewing this patch, I noticed that in
pgstat_read_db_statsfile_timestamp(), if we fail to read ArchiverStats
or SLRUStats, we return 'false' from this function but OTOH, if we
fail to read a 'D' or 'R' message, we will return 'true'. I feel the
handling of the 'D' and 'R' messages is fine because once we have read
GlobalStats, we can return the stats_timestamp; so it is the other two
cases that should be corrected. I understand that this is not directly
related to this patch, but if you agree, we can do this as a separate
patch.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 7 Sep 2020 at 15:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 23, 2020 at 11:46 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I've updated the patch so that the stats collector ignores the
> > 'update' message if the slot stats array is already full.
> >
>
> This patch needs a rebase. I don't see this patch in the CF app. I
> hope you are still interested in working on this.

Thank you for reviewing this patch!

I'm still going to work on this patch although I might be slow to
respond this month.

>
> Review comments:
> ===============
> 1.
> +CREATE VIEW pg_stat_replication_slots AS
> +    SELECT
> +            s.name,
> +            s.spill_txns,
> +            s.spill_count,
> +            s.spill_bytes
> +    FROM pg_stat_get_replication_slots() AS s;
>
> The view pg_stat_replication_slots should have a column 'stats_reset'
> (datatype: timestamp with time zone) as we provide a facitily to reset
> the slots. A similar column exists in pg_stat_slru as well, so is
> there a reason of not providing it here?

I had missed adding the column. Fixed.

>
> 2.
> +   </para>
> +  </sect2>
> +
> +  <sect2 id="monitoring-pg-stat-wal-receiver-view">
>    <title><structname>pg_stat_wal_receiver</structname></title>
>
> It is better to keep one empty line between </para> and </sect2> to
> keep it consistent with the documentation of other views.

Fixed.

>
> 3.
>   <primary>pg_stat_reset_replication_slot</primary>
> +        </indexterm>
> +        <function>pg_stat_reset_replication_slot</function> (
> <type>text</type> )
> +        <returnvalue>void</returnvalue>
> +       </para>
> +       <para>
> +         Resets statistics to zero for a single replication slot, or for all
> +         replication slots in the cluster.  If the argument is NULL,
> all counters
> +         shown in the
> <structname>pg_stat_replication_slots</structname> view for
> +         all replication slots are reset.
> +       </para>
>
> I think the information about the parameter for this function is not
> completely clear. It seems to me that it should be the name of the
> slot for which we want to reset the stats, if so, let's try to be
> clear.

Fixed.

>
> 4.
> +pgstat_reset_replslot_counter(const char *name)
> +{
> + PgStat_MsgResetreplslotcounter msg;
> +
> + if (pgStatSock == PGINVALID_SOCKET)
> + return;
> +
> + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETREPLSLOTCOUNTER);
> + if (name)
> + {
> + memcpy(&msg.m_slotname, name, NAMEDATALEN);
> + msg.clearall = false;
> + }
>
> Don't we want to verify here or in the caller of this function whether
> the passed slot_name is a valid slot or not? For ex. see
> pgstat_reset_shared_counters where we return an error if the target is
> not valid.

Agreed. Fixed.

>
> 5.
> +static void
> +pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
> + int len)
> +{
> + int i;
> + int idx = -1;
> + TimestampTz ts;
> +
> + if (!msg->clearall)
> + {
> + /* Get the index of replication slot statistics to reset */
> + idx = pgstat_replslot_index(msg->m_slotname, false);
> +
> + if (idx < 0)
> + return; /* not found */
>
> Can we add a comment to describe when we don't expect to find the slot
> here unless there is no way that can happen?

Added.

>
> 6.
> +pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
> + int len)
> {
> ..
> + for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
> ..
> }
>
> I think here we need to traverse till nReplSlotStats, not SLRU_NUM_ELEMENTS.

Fixed.

>
> 7. Don't we need to change PGSTAT_FILE_FORMAT_ID for this patch? We
> can probably do at the end but better to change it now so that it
> doesn't slip from our mind.

Yes, changed.

>
> 8.
> @@ -5350,6 +5474,23 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
>
>   break;
>
> + /*
> + * 'R' A PgStat_ReplSlotStats struct describing a replication slot
> + * follows.
> + */
> + case 'R':
> + if (fread(&replSlotStats[nReplSlotStats], 1,
> sizeof(PgStat_ReplSlotStats), fpin)
> + != sizeof(PgStat_ReplSlotStats))
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("corrupted statistics file \"%s\"",
> + statfile)));
> + memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> + goto done;
> + }
> + nReplSlotStats++;
> + break;
>
> Both here and in pgstat_read_db_statsfile_timestamp(), the patch
> handles 'R' message after 'D' whereas while writing the 'R' is written
> before 'D'. So, I think it is better if we keep the order during read
> the same as during write.

Changed the code so that it writes 'R' after 'D'.

>
> 9. While reviewing this patch, I noticed that in
> pgstat_read_db_statsfile_timestamp(), if we fail to read ArchiverStats
> or SLRUStats, we return 'false' from this function but OTOH, if we
> fail to read 'D' or 'R' message, we will return 'true'. I feel the
> handling of 'D' and 'R' message is fine because once we read
> GlobalStats, we can return the stats_timestamp. So the other two
> stands corrected. I understand that this is not directly related to
> this patch but if you agree we can do this as a separate patch.

That makes sense to me. We can set *ts first and then read both
ArchiverStats and SLRUStats, so we can return a valid timestamp even
if we fail to read them.

I've attached both patches: the 0001 patch fixes the issue you
reported, and the 0002 patch incorporates all the review comments.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 7 Sep 2020 at 15:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I'm still going to work on this patch although I might be slow
> response this month.
>

That is quite a fast response. Thanks for staying on top of it.

>
> >
> > 9. While reviewing this patch, I noticed that in
> > pgstat_read_db_statsfile_timestamp(), if we fail to read ArchiverStats
> > or SLRUStats, we return 'false' from this function but OTOH, if we
> > fail to read 'D' or 'R' message, we will return 'true'. I feel the
> > handling of 'D' and 'R' message is fine because once we read
> > GlobalStats, we can return the stats_timestamp. So the other two
> > stands corrected. I understand that this is not directly related to
> > this patch but if you agree we can do this as a separate patch.
>
> It seems to make sense to me. We can set *ts and then read both
> ArchiverStats and SLRUStats so we can return a valid timestamp even if
> we fail to read.
>

I have started a separate thread for this bug-fix [1] and will
continue reviewing this patch.

[1] - https://www.postgresql.org/message-id/CAA4eK1J3oTJKyVq6v7K4d3jD%2BvtnruG9fHRib6UuWWsrwAR6Aw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 7 Sep 2020 at 15:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > This patch needs a rebase. I don't see this patch in the CF app. I
> > hope you are still interested in working on this.
>
> Thank you for reviewing this patch!
>
> I'm still going to work on this patch although I might be slow
> response this month.
>

Comments on the latest patch:
=============================
1.
+CREATE VIEW pg_stat_replication_slots AS
+    SELECT
+            s.name,
+            s.spill_txns,
+            s.spill_count,
+            s.spill_bytes,
+     s.stats_reset
+    FROM pg_stat_get_replication_slots() AS s;

You forgot to update the docs for the new parameter.

2.
@@ -5187,6 +5305,12 @@ pgstat_read_statsfiles(Oid onlydb, bool
permanent, bool deep)
  for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
  slruStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;

+ /*
+ * Set the same reset timestamp for all replication slots too.
+ */
+ for (i = 0; i < max_replication_slots; i++)
+ replSlotStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;
+

I don't understand why you have removed the above code in the new
version of the patch.

3.
pgstat_recv_resetreplslotcounter()
{
..
+ ts = GetCurrentTimestamp();
+ for (i = 0; i < nReplSlotStats; i++)
+ {
+ /* reset entry with the given index, or all entries */
+ if (msg->clearall || idx == i)
+ {
+ /* reset only counters. Don't clear slot name */
+ replSlotStats[i].spill_txns = 0;
+ replSlotStats[i].spill_count = 0;
+ replSlotStats[i].spill_bytes = 0;
+ replSlotStats[i].stat_reset_timestamp = ts;
+ }
+ }
..

I don't like this coding pattern, as in the worst case we need to
traverse all the slots to reset a particular slot. This could be okay
for a fixed number of elements as we have in SLRU, but here it appears
quite inefficient. We can move the stats-reset part to a separate
function and then invoke it both from the place where we need to reset
a particular slot and from the above place.
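
That is, a small helper along these lines (the name is assumed),
called both with a single index and from the clearall loop:

static void
pgstat_reset_replslot(int i, TimestampTz ts)
{
    /* reset only the counters; don't clear the slot name */
    replSlotStats[i].spill_txns = 0;
    replSlotStats[i].spill_count = 0;
    replSlotStats[i].spill_bytes = 0;
    replSlotStats[i].stat_reset_timestamp = ts;
}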

4.
+pgstat_replslot_index(const char *name, bool create_it)
{
..
+ replSlotStats[nReplSlotStats].stat_reset_timestamp = GetCurrentTimestamp();
..
}

Why do we need to set the reset timestamp on creation of the slot entry?

5.
@@ -3170,6 +3175,13 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn)
  spilled++;
  }

+ /* update the statistics */
+ rb->spillCount += 1;
+ rb->spillBytes += size;
+
+ /* Don't consider already serialized transactions. */
+ rb->spillTxns += rbtxn_is_serialized(txn) ? 0 : 1;

We can't increment spillTxns in the above way because now we
sometimes serialize before streaming, and in that case we clear the
serialized flag after streaming; see ReorderBufferTruncateTXN. So the
count can go wrong. Another problem is that currently the patch calls
UpdateSpillStats only from begin_cb_wrapper, which means it won't
consider streaming transactions (streaming transactions that might
have spilled). If we consider the streamed case along with it, we can
probably keep this counter up-to-date, because in the above place we
can check whether the txn was ever serialized or streamed and, if so,
skip incrementing the counter. I think we need to merge Ajin's patch
for streaming stats [1] and fix the issue. I have not checked his
patch, so it might need a rebase and/or some changes.
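
In other words, the increment would have to look more like this
sketch, counting the transaction only the first time it is ever
serialized or streamed (rbtxn_is_streamed() is taken from the
streaming patch and is an assumption here):

/* in ReorderBufferSerializeTXN(), after spilling the changes */
rb->spillCount += 1;
rb->spillBytes += size;

/* count the txn only once, on its first spill or stream */
if (!rbtxn_is_serialized(txn) && !rbtxn_is_streamed(txn))
    rb->spillTxns += 1;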

6.
@@ -322,6 +321,9 @@ ReplicationSlotCreate(const char *name, bool db_specific,

  /* Let everybody know we've modified this slot */
  ConditionVariableBroadcast(&slot->active_cv);
+
+ /* Create statistics entry for the new slot */
+ pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);
 }
..
..
@@ -683,6 +685,18 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
  ereport(WARNING,
  (errmsg("could not remove directory \"%s\"", tmppath)));

+ /*
+ * Report to drop the replication slot to stats collector.  Since there
+ * is no guarantee the order of message arrival on an UDP connection,
+ * it's possible that a message for creating a new slot arrives before a
+ * message for removing the old slot.  We send the drop message while
+ * holding ReplicationSlotAllocationLock to reduce that possibility.
+ * If the messages arrived in reverse, we would lose one statistics update
+ * message.  But the next update message will create the statistics for
+ * the replication slot.
+ */
+ pgstat_report_replslot_drop(NameStr(slot->data.name));
+

Similar to the drop message, why don't we send the create message
while holding the ReplicationSlotAllocationLock?

[1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Comments on the latest patch:
> =============================
>

Apart from the comments I gave yesterday, another thing I was
wondering is how to write some tests for this patch. The two ideas I
could think of are as follows:

1. One idea was to keep these stats for each WALSender, as it was in
the commit that we reverted as b074813d48. If we had that, then we
could query the stats for the tests added in commit 58b5ae9d62. I am
not sure whether we want to display it in the view
pg_stat_replication, but it would be a really good way to test the
streamed and serialized transactions in a predictable manner.

2. The second way is to try doing something similar to what we do in
src/test/regress/sql/stats.sql.

I think we should do both if possible.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Sep 9, 2020 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Comments on the latest patch:
> > =============================
> >
>
> Apart from the comments I gave yesterday, another thing I was
> wondering is how to write some tests for this patch. The two ideas I
> could think of are as follows:
>
> 1. One idea was to keep these stats for each WALSender as it was in
> the commit that we reverted as b074813d48. If we had that then we can
> query the stats for tests added in commit 58b5ae9d62. I am not sure
> whether we want to display it in view pg_stat_replication but it would
> be a really good way to test the streamed and serialized transactions
> in a predictable manner.
>
> 2. Then the second way is to try doing something similar to what we do
> in src/test/regress/sql/stats.sql
>
> I think we should do both if possible.
>

I have made a few comment changes on top of your last version. If you
are fine with these, then please include them in the next version of
your patch.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:

I have fixed these review comments in the attached patch.

>
> Comments on the latest patch:
> =============================
> 1.
> +CREATE VIEW pg_stat_replication_slots AS
> +    SELECT
> +            s.name,
> +            s.spill_txns,
> +            s.spill_count,
> +            s.spill_bytes,
> +     s.stats_reset
> +    FROM pg_stat_get_replication_slots() AS s;
>
> You forgot to update the docs for the new parameter.
>

Updated the docs for this.

> 2.
> @@ -5187,6 +5305,12 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
>   for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
>   slruStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;
>
> + /*
> + * Set the same reset timestamp for all replication slots too.
> + */
> + for (i = 0; i < max_replication_slots; i++)
> + replSlotStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;
> +
>
> I don't understand why you have removed the above code from the new
> version of the patch?
>

Added back.

> 3.
> pgstat_recv_resetreplslotcounter()
> {
> ..
> + ts = GetCurrentTimestamp();
> + for (i = 0; i < nReplSlotStats; i++)
> + {
> + /* reset entry with the given index, or all entries */
> + if (msg->clearall || idx == i)
> + {
> + /* reset only counters. Don't clear slot name */
> + replSlotStats[i].spill_txns = 0;
> + replSlotStats[i].spill_count = 0;
> + replSlotStats[i].spill_bytes = 0;
> + replSlotStats[i].stat_reset_timestamp = ts;
> + }
> + }
> ..
>
> I don't like this coding pattern as in the worst case we need to
> traverse all the slots to reset a particular slot. This could be okay
> for a fixed number of elements as we have in SLRU but here it appears
> quite inefficient. We can move the reset of stats part to a separate
> function and then invoke it from the place where we need to reset a
> particular slot and the above place.
>

Changed the code as per the above idea.

> 4.
> +pgstat_replslot_index(const char *name, bool create_it)
> {
> ..
> + replSlotStats[nReplSlotStats].stat_reset_timestamp = GetCurrentTimestamp();
> ..
> }
>
> Why do we need to set the reset timestamp on the creation of slot entry?
>

I don't think we need to show any timestamp if the slot has never been
reset. Let me know if there is any reason to show it.


> 5.
> @@ -3170,6 +3175,13 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn)
>   spilled++;
>   }
>
> + /* update the statistics */
> + rb->spillCount += 1;
> + rb->spillBytes += size;
> +
> + /* Don't consider already serialized transactions. */
> + rb->spillTxns += rbtxn_is_serialized(txn) ? 0 : 1;
>
> We can't increment the spillTxns in the above way because now
> sometimes we do serialize before streaming and in that case we clear
> the serialized flag after streaming, see ReorderBufferTruncateTXN. So,
> the count can go wrong.
>

To fix this, I have added another flag which indicates whether we have
ever serialized the txn. I couldn't find a better way; do let me know
if you can think of one.
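
The flag in question, roughly (the flag name matches the patch; the
exact bit value here is an assumption):

/* reorderbuffer.h: remember the txn was serialized at some point */
#define RBTXN_IS_SERIALIZED_CLEAR      0x0200

#define rbtxn_is_serialized_clear(txn) \
    (((txn)->txn_flags & RBTXN_IS_SERIALIZED_CLEAR) != 0)

ReorderBufferTruncateTXN() can then set this flag when it clears the
plain serialized flag after streaming, so the stats code still knows
the txn has already been counted as spilled.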

> Another problem is currently the patch call
> UpdateSpillStats only from begin_cb_wrapper which means it won't
> consider streaming transactions (streaming transactions that might
> have spilled).
>

The other problem I see with updating in begin_cb_wrapper is that it
will ignore the spilling done for transactions that get aborted. To
fix both the issues, I have updated the stats in DecodeCommit and
DecodeAbort.


> 6.
> @@ -322,6 +321,9 @@ ReplicationSlotCreate(const char *name, bool db_specific,
>
>   /* Let everybody know we've modified this slot */
>   ConditionVariableBroadcast(&slot->active_cv);
> +
> + /* Create statistics entry for the new slot */
> + pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);
>  }
> ..
> ..
> @@ -683,6 +685,18 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
>   ereport(WARNING,
>   (errmsg("could not remove directory \"%s\"", tmppath)));
>
> + /*
> + * Report to drop the replication slot to stats collector.  Since there
> + * is no guarantee the order of message arrival on an UDP connection,
> + * it's possible that a message for creating a new slot arrives before a
> + * message for removing the old slot.  We send the drop message while
> + * holding ReplicationSlotAllocationLock to reduce that possibility.
> + * If the messages arrived in reverse, we would lose one statistics update
> + * message.  But the next update message will create the statistics for
> + * the replication slot.
> + */
> + pgstat_report_replslot_drop(NameStr(slot->data.name));
> +
>
> Similar to drop message, why don't we send the create message while
> holding the ReplicationSlotAllocationLock?
>

Updated the code to address this comment; basically, I moved the
create message under the lock.

Apart from the above,
(a) fixed one bug in ReorderBufferSerializeTXN() where we were
updating the stats even when we have not spilled anything.
(b) made changes in pgstat_read_db_statsfile_timestamp to return false
when the replication slot entry is corrupt.
(c) move the declaration and definitions in pgstat.c to make them
consistent with existing code
(d) made another couple of cosmetic fixes and changed a few comments
(e) Tested the patch by using a guc which allows spilling all the
changes. See v4-0001-guc-always-spill

Let me know what you think about the changes.

--
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Sat, Sep 19, 2020 at 1:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
>
> I have fixed these review comments in the attached patch.
>
>
> Apart from the above,
> (a) fixed one bug in ReorderBufferSerializeTXN() where we were
> updating the stats even when we have not spilled anything.
> (b) made changes in pgstat_read_db_statsfile_timestamp to return false
> when the replication slot entry is corrupt.
> (c) move the declaration and definitions in pgstat.c to make them
> consistent with existing code
> (d) made another couple of cosmetic fixes and changed a few comments
> (e) Tested the patch by using a guc which allows spilling all the
> changes. See v4-0001-guc-always-spill
>

I have found a way to write the test case for this patch. It is based
on the idea we used in stats.sql. As of now, I have kept the test as a
separate patch. We can decide to commit the test part separately, as
it is slightly timing-dependent, but OTOH, as it is based on existing
logic in stats.sql, there shouldn't be much of a problem. I have not
changed anything apart from the test patch in this version. Note that
the first patch is just a debugging kind of tool to test the patch.

Thoughts?

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Sep 19, 2020 at 1:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I have fixed these review comments in the attached patch.
> >
> >
> > Apart from the above,
> > (a) fixed one bug in ReorderBufferSerializeTXN() where we were
> > updating the stats even when we have not spilled anything.
> > (b) made changes in pgstat_read_db_statsfile_timestamp to return false
> > when the replication slot entry is corrupt.
> > (c) move the declaration and definitions in pgstat.c to make them
> > consistent with existing code
> > (d) made another couple of cosmetic fixes and changed a few comments
> > (e) Tested the patch by using a guc which allows spilling all the
> > changes. See v4-0001-guc-always-spill
> >
>
> I have found a way to write the test case for this patch. This is
> based on the idea we used in stats.sql. As of now, I have kept the
> test as a separate patch. We can decide to commit the test part
> separately as it is slightly timing dependent but OTOH as it is based
> on existing logic in stats.sql so there shouldn't be much problem. I
> have not changed anything apart from the test patch in this version.
> Note that the first patch is just a debugging kind of tool to test the
> patch.
>

I have done some more testing of this patch, especially for the case
where we spill before streaming the transaction, and found everything
working as expected. Additionally, I have changed a few more comments
and ran pgindent. I am still not very sure whether we want to display
physical slots in this view, as all the stats are for logical slots,
but we can anyway add stats w.r.t. physical slots in the future. I am
fine either way (don't show physical slots in this view, or show them
but keep their stats as 0). Let me know if you have any thoughts on
these points; other than that, I am happy with the current state of
the patch.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Dilip Kumar
Date:
On Fri, Sep 25, 2020 at 4:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 19, 2020 at 1:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > I have fixed these review comments in the attached patch.
> > >
> > >
> > > Apart from the above,
> > > (a) fixed one bug in ReorderBufferSerializeTXN() where we were
> > > updating the stats even when we have not spilled anything.
> > > (b) made changes in pgstat_read_db_statsfile_timestamp to return false
> > > when the replication slot entry is corrupt.
> > > (c) move the declaration and definitions in pgstat.c to make them
> > > consistent with existing code
> > > (d) made another couple of cosmetic fixes and changed a few comments
> > > (e) Tested the patch by using a guc which allows spilling all the
> > > changes. See v4-0001-guc-always-spill
> > >
> >
> > I have found a way to write the test case for this patch. This is
> > based on the idea we used in stats.sql. As of now, I have kept the
> > test as a separate patch. We can decide to commit the test part
> > separately as it is slightly timing dependent but OTOH as it is based
> > on existing logic in stats.sql so there shouldn't be much problem. I
> > have not changed anything apart from the test patch in this version.
> > Note that the first patch is just a debugging kind of tool to test the
> > patch.
> >
>
> I have done some more testing of this patch especially for the case
> where we spill before streaming the transaction and found everything
> is working as expected. Additionally, I have changed a few more
> comments and ran pgindent. I am still not very sure whether we want to
> display physical slots in this view as all the stats are for logical
> slots but anyway we can add stats w.r.t physical slots in the future.
> I am fine either way (don't show physical slots in this view or show
> them but keep stats as 0). Let me know if you have any thoughts on
> these points, other than that I am happy with the current state of the
> patch.

IMHO, it will make more sense to show only the logical replication
slots in this view.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Sep 30, 2020 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Sep 25, 2020 at 4:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I have done some more testing of this patch especially for the case
> > where we spill before streaming the transaction and found everything
> > is working as expected. Additionally, I have changed a few more
> > comments and ran pgindent. I am still not very sure whether we want to
> > display physical slots in this view as all the stats are for logical
> > slots but anyway we can add stats w.r.t physical slots in the future.
> > I am fine either way (don't show physical slots in this view or show
> > them but keep stats as 0). Let me know if you have any thoughts on
> > these points, other than that I am happy with the current state of the
> > patch.
>
> IMHO, It will make more sense to only show the logical replication
> slots in this view.
>

Okay, Sawada-San, others, do you have any opinion on this matter?

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Dilip Kumar
Date:
On Wed, Sep 30, 2020 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 30, 2020 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Sep 25, 2020 at 4:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I have done some more testing of this patch especially for the case
> > > where we spill before streaming the transaction and found everything
> > > is working as expected. Additionally, I have changed a few more
> > > comments and ran pgindent. I am still not very sure whether we want to
> > > display physical slots in this view as all the stats are for logical
> > > slots but anyway we can add stats w.r.t physical slots in the future.
> > > I am fine either way (don't show physical slots in this view or show
> > > them but keep stats as 0). Let me know if you have any thoughts on
> > > these points, other than that I am happy with the current state of the
> > > patch.
> >
> > IMHO, It will make more sense to only show the logical replication
> > slots in this view.
> >
>
> Okay, Sawada-San, others, do you have any opinion on this matter?

I have started looking into this patch; I have a few comments.

+        Number of times transactions were spilled to disk. Transactions
+        may get spilled repeatedly, and this counter gets incremented on every
+        such invocation.
+      </para></entry>
+     </row>
+
+
+  <para>
+   Tracking of spilled transactions works only for logical replication.  In

The number of spaces used after the full stop is not uniform.

+ /* update the statistics iff we have spilled anything */
+ if (spilled)
+ {
+ rb->spillCount += 1;
+ rb->spillBytes += size;
+
+ /* Don't consider already serialized transactions. */

Single-line comments are not uniform: "update the statistics" starts
with a small letter and does not end with a full stop, whereas "Don't
consider" starts with a capital and ends with a full stop.


+
+ /*
+ * We set this flag to indicate if the transaction is ever serialized.
+ * We need this to accurately update the stats.
+ */
+ txn->txn_flags |= RBTXN_IS_SERIALIZED_CLEAR;

I feel we can explain the exact scenario in the comment, i.e., if we
stream after spilling, we still need to know that the transaction
spilled in the past so that we don't count it again as a new spilled
transaction.

old slot.  We send the drop
+ * and create messages while holding ReplicationSlotAllocationLock to
+ * reduce that possibility. If the messages reached in reverse, we would
+ * lose one statistics update message.  But

Spacing after the full stop is not uniform.


+ * Statistics about transactions spilled to disk.
+ *
+ * A single transaction may be spilled repeatedly, which is why we keep
+ * two different counters. For spilling, the transaction counter includes
+ * both toplevel transactions and subtransactions.
+ */
+ int64 spillCount; /* spill-to-disk invocation counter */
+ int64 spillTxns; /* number of transactions spilled to disk  */
+ int64 spillBytes; /* amount of data spilled to disk */

Can we keep the order as spillTxns, spillCount, spillBytes, since
every other place keeps it like that? That way it will look more
uniform.

Other than that I did not see any problem.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Sep 30, 2020 at 4:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Sep 30, 2020 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Sep 30, 2020 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Fri, Sep 25, 2020 at 4:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I have done some more testing of this patch especially for the case
> > > > where we spill before streaming the transaction and found everything
> > > > is working as expected. Additionally, I have changed a few more
> > > > comments and ran pgindent. I am still not very sure whether we want to
> > > > display physical slots in this view as all the stats are for logical
> > > > slots but anyway we can add stats w.r.t physical slots in the future.
> > > > I am fine either way (don't show physical slots in this view or show
> > > > them but keep stats as 0). Let me know if you have any thoughts on
> > > > these points, other than that I am happy with the current state of the
> > > > patch.
> > >
> > > IMHO, It will make more sense to only show the logical replication
> > > slots in this view.
> > >
> >
> > Okay, Sawada-San, others, do you have any opinion on this matter?
>

I have changed it so that the view will only show logical replication
slots and adjusted the docs accordingly. I think we can easily extend
it to show physical replication slots in the future, if required.

> I have started looking into this patch, I have a few comments.
>
> +        Number of times transactions were spilled to disk. Transactions
> +        may get spilled repeatedly, and this counter gets incremented on every
> +        such invocation.
> +      </para></entry>
> +     </row>
> +
> +
> +  <para>
> +   Tracking of spilled transactions works only for logical replication.  In
>
> The number of spaces used after the full stop is not uniform.
>

I have removed this sentence, as it is no longer required: we only
want to show logical slots in the view, and I have updated the other
parts of the doc accordingly.

> + /* update the statistics iff we have spilled anything */
> + if (spilled)
> + {
> + rb->spillCount += 1;
> + rb->spillBytes += size;
> +
> + /* Don't consider already serialized transactions. */
>
> Single line comments are not uniform,  "update the statistics" is
> starting is small letter and not ending with the full stop
> whereas 'Don't consider' is starting with capital and ending with full stop.
>

Actually, it doesn't matter in this case, but I have changed it to
keep things consistent.

>
> +
> + /*
> + * We set this flag to indicate if the transaction is ever serialized.
> + * We need this to accurately update the stats.
> + */
> + txn->txn_flags |= RBTXN_IS_SERIALIZED_CLEAR;
>
> I feel we can explain the exact scenario in the comment, i.e. after
> spill if we stream then we still
> need to know that it spilled in past so that we don't count this again
> as a new spilled transaction.
>

Okay, I have expanded the comment a bit more to explain this.

> old slot.  We send the drop
> + * and create messages while holding ReplicationSlotAllocationLock to
> + * reduce that possibility. If the messages reached in reverse, we would
> + * lose one statistics update message.  But
>
> Spacing after the full stop is not uniform.
>

Changed.

>
> + * Statistics about transactions spilled to disk.
> + *
> + * A single transaction may be spilled repeatedly, which is why we keep
> + * two different counters. For spilling, the transaction counter includes
> + * both toplevel transactions and subtransactions.
> + */
> + int64 spillCount; /* spill-to-disk invocation counter */
> + int64 spillTxns; /* number of transactions spilled to disk  */
> + int64 spillBytes; /* amount of data spilled to disk */
>
> Can we keep the order as spillTxns, spillCount, spillBytes because
> every other place we kept like that
> so that way it will look more uniform.
>

Changed here and at one more place as per this suggestion.

> Other than that I did not see any problem.
>

Thanks for the review.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Oct 1, 2020 at 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Thanks for the review.
>

oops, forgot to attach the updated patches, doing now.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 30 Sep 2020 at 18:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Sep 30, 2020 at 1:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Sep 25, 2020 at 4:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Sep 24, 2020 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I have done some more testing of this patch especially for the case
> > > where we spill before streaming the transaction and found everything
> > > is working as expected. Additionally, I have changed a few more
> > > comments and ran pgindent. I am still not very sure whether we want to
> > > display physical slots in this view as all the stats are for logical
> > > slots but anyway we can add stats w.r.t physical slots in the future.
> > > I am fine either way (don't show physical slots in this view or show
> > > them but keep stats as 0). Let me know if you have any thoughts on
> > > these points, other than that I am happy with the current state of the
> > > patch.
> >
> > IMHO, It will make more sense to only show the logical replication
> > slots in this view.
> >
>

Thank you for updating the patch.

> Okay, Sawada-San, others, do you have any opinion on this matter?
>

When we discussed this before, I was thinking that we could have other
statistics for physical slots in the same statistics view in the
future. Having the view show only logical slots also makes sense to me,
but I'm a bit concerned that we could break backward compatibility:
monitoring tools etc. would be affected when the view starts to show
physical slots too. If the view shows only logical slots, it might also
be worth considering separate views for logical slots and physical
slots, renaming this one to pg_stat_logical_replication_slots as part
of this change.

Here are my review comments on the v7 patch.

+   /*
+    * Set the same reset timestamp for all replication slots too.
+    */
+   for (i = 0; i < max_replication_slots; i++)
+       replSlotStats[i].stat_reset_timestamp =
globalStats.stat_reset_timestamp;
+

You added back the above code but since we clear the timestamps on
creation of a new slot they are not shown:

+   /* Register new slot */
+   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
+   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);

Looking at other statistics views such as pg_stat_slru,
pg_stat_bgwriter, and pg_stat_archiver, they have a valid
reset_timestamp value from the beginning. That's why I removed that
code and assigned the timestamp when registering a new slot.

---
+   if (OidIsValid(slot->data.database))
+       pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);

I think we can use SlotIsLogical() for this purpose. The same is true
when dropping a slot.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Sat, Oct 3, 2020 at 9:26 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> When we discussed this before, I was thinking that we could have other
> statistics for physical slots in the same statistics view in the
> future. Having the view show only logical slots also makes sense to me
> but I’m concerned a bit that we could break backward compatibility
> that monitoring tools etc will be affected when the view starts to
> show physical slots too.
>

I think that would happen anyway as we need to add more columns in
view for the physical slots.

> If the view shows only logical slots, it also
> might be worth considering to have separate views for logical slots
> and physical slots and having pg_stat_logical_replication_slots by
> this change.
>

I am not sure at this stage but I think we will add the additional
stats for physical slots once we have any in this view itself. I would
like to avoid adding separate views if possible. The only reason to
omit physical slots at this stage is that we don't have any stats for
the same.

> Here is my review comment on the v7 patch.
>
> +   /*
> +    * Set the same reset timestamp for all replication slots too.
> +    */
> +   for (i = 0; i < max_replication_slots; i++)
> +       replSlotStats[i].stat_reset_timestamp =
> globalStats.stat_reset_timestamp;
> +
>
> You added back the above code but since we clear the timestamps on
> creation of a new slot they are not shown:
>
> +   /* Register new slot */
> +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
>
> Looking at other statistics views such as pg_stat_slru,
> pg_stat_bgwriter, and pg_stat_archiver, they have a valid
> reset_timestamp value from the beginning. That's why I removed that
> code and assigned the timestamp when registering a new slot.
>

Hmm, I don't think it is shown intentionally in those views. I think
what is happening in other views is that the value is initialized when
we read the stats, and then, while updating and/or writing, because we
don't change the stat_reset_timestamp, they display the same value as
initialized at the time of the read. Now, because
pgstat_replslot_index() always initializes the replSlotStats, it would
overwrite any previous value we set during the read and display the
stat_reset as empty for replication slots. If we stop initializing the
replSlotStats in pgstat_replslot_index(), then we will see behavior
similar to the other views. So even if we want to change this, we
should probably stop the initialization in pgstat_replslot_index(), but
I don't think that is necessarily better behavior because the
description of the parameter doesn't indicate any such thing.

> ---
> +   if (OidIsValid(slot->data.database))
> +       pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);
>
> I think we can use SlotIsLogical() for this purpose. The same is true
> when dropping a slot.
>

makes sense, so changed accordingly in the attached patch.

--
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Sat, 3 Oct 2020 at 16:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Oct 3, 2020 at 9:26 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > When we discussed this before, I was thinking that we could have other
> > statistics for physical slots in the same statistics view in the
> > future. Having the view show only logical slots also makes sense to me
> > but I’m concerned a bit that we could break backward compatibility
> > that monitoring tools etc will be affected when the view starts to
> > show physical slots too.
> >
>
> I think that would happen anyway as we need to add more columns in
> view for the physical slots.

I think it depends; adding more columns to the view might not break
tools if the query used in the tool explicitly specifies columns. OTOH,
if the view starts to show more rows, the tool will need an extra
condition to get the same result as before.

>
> > If the view shows only logical slots, it also
> > might be worth considering to have separate views for logical slots
> > and physical slots and having pg_stat_logical_replication_slots by
> > this change.
> >
>
> I am not sure at this stage but I think we will add the additional
> stats for physical slots once we have any in this view itself. I would
> like to avoid adding separate views if possible. The only reason to
> omit physical slots at this stage is that we don't have any stats for
> the same.

I also prefer not to have separate views. I'm concerned about the
compatibility issue I explained above, but at the same time I agree
that it doesn't make sense to show stats that are always empty. Given
that you and Dilip agree on that, I agree with it as well.

>
> > Here is my review comment on the v7 patch.
> >
> > +   /*
> > +    * Set the same reset timestamp for all replication slots too.
> > +    */
> > +   for (i = 0; i < max_replication_slots; i++)
> > +       replSlotStats[i].stat_reset_timestamp =
> > globalStats.stat_reset_timestamp;
> > +
> >
> > You added back the above code but since we clear the timestamps on
> > creation of a new slot they are not shown:
> >
> > +   /* Register new slot */
> > +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> > +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> >
> > Looking at other statistics views such as pg_stat_slru,
> > pg_stat_bgwriter, and pg_stat_archiver, they have a valid
> > reset_timestamp value from the beginning. That's why I removed that
> > code and assigned the timestamp when registering a new slot.
> >
>
> Hmm, I don't think it is shown intentionally in those views. I think
> what is happening in other views is that it has been initialized with
> some value when we read the stats and then while updating and or
> writing because we don't change the stat_reset_timestamp, it displays
> the same value as initialized at the time of read. Now, because in
> pgstat_replslot_index() we always initialize the replSlotStats it
> would overwrite any previous value we have set during read and display
> the stat_reset as empty for replication slots. If we stop initializing
> the replSlotStats in pgstat_replslot_index() then we will see similar
> behavior as other views have. So even if we want to change then
> probably we should stop initialization in pgstat_replslot_index but I
> don't think that is necessarily better behavior because the
> description of the parameter doesn't indicate any such thing.

Understood. I agree that a newly created slot doesn't have a
reset_timestamp. Looking at pg_stat_database, a view whose rows are
added dynamically unlike other stat views, a newly created database
doesn't have one either. But given that we clear the stats for a slot
in pgstat_replslot_index(), why do we need to initialize the
reset_timestamp with globalStats.stat_reset_timestamp when reading the
stats file? Even if we could not find any slot stats in the stats file,
the view won't show anything, and the reset_timestamp will be cleared
when slot stats are received.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Oct 5, 2020 at 1:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Sat, 3 Oct 2020 at 16:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Oct 3, 2020 at 9:26 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > When we discussed this before, I was thinking that we could have other
> > > statistics for physical slots in the same statistics view in the
> > > future. Having the view show only logical slots also makes sense to me
> > > but I’m concerned a bit that we could break backward compatibility
> > > that monitoring tools etc will be affected when the view starts to
> > > show physical slots too.
> > >
> >
> > I think that would happen anyway as we need to add more columns in
> > view for the physical slots.
>
> I think it depends; adding more columns to the view might not break
> tools if the query used in the tool explicitly specifies columns.
>

What if it uses SELECT * ...? It might not be advisable to assume how
the user will fetch the data.
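
For example (illustrative queries only): a tool that names its columns
keeps working when new columns are added,

SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;

whereas a tool doing

SELECT * FROM pg_stat_replication_slots;

sees every newly added column, and either style would see extra rows if
physical slots were ever shown, unless it filtered them out.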

> OTOH
> if the view starts to show more rows, the tool will need to have the
> condition to get the same result as before.
>
> >
> > > If the view shows only logical slots, it also
> > > might be worth considering to have separate views for logical slots
> > > and physical slots and having pg_stat_logical_replication_slots by
> > > this change.
> > >
> >
> > I am not sure at this stage but I think we will add the additional
> > stats for physical slots once we have any in this view itself. I would
> > like to avoid adding separate views if possible. The only reason to
> > omit physical slots at this stage is that we don't have any stats for
> > the same.
>
> I also prefer not to have separate views. I'm concerned about the
> compatibility I explained above but at the same time I agree that it
> doesn't make sense to show the stats always having nothing. Since
> given you and Dilip agreed on that, I also agree with that.
>

Okay.

> >
> > > Here is my review comment on the v7 patch.
> > >
> > > +   /*
> > > +    * Set the same reset timestamp for all replication slots too.
> > > +    */
> > > +   for (i = 0; i < max_replication_slots; i++)
> > > +       replSlotStats[i].stat_reset_timestamp =
> > > globalStats.stat_reset_timestamp;
> > > +
> > >
> > > You added back the above code but since we clear the timestamps on
> > > creation of a new slot they are not shown:
> > >
> > > +   /* Register new slot */
> > > +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> > > +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> > >
> > > Looking at other statistics views such as pg_stat_slru,
> > > pg_stat_bgwriter, and pg_stat_archiver, they have a valid
> > > reset_timestamp value from the beginning. That's why I removed that
> > > code and assigned the timestamp when registering a new slot.
> > >
> >
> > Hmm, I don't think it is shown intentionally in those views. I think
> > what is happening in other views is that it has been initialized with
> > some value when we read the stats and then while updating and or
> > writing because we don't change the stat_reset_timestamp, it displays
> > the same value as initialized at the time of read. Now, because in
> > pgstat_replslot_index() we always initialize the replSlotStats it
> > would overwrite any previous value we have set during read and display
> > the stat_reset as empty for replication slots. If we stop initializing
> > the replSlotStats in pgstat_replslot_index() then we will see similar
> > behavior as other views have. So even if we want to change then
> > probably we should stop initialization in pgstat_replslot_index but I
> > don't think that is necessarily better behavior because the
> > description of the parameter doesn't indicate any such thing.
>
> Understood. I agreed that the newly created slot doesn't have
> reset_timestamp. Looking at pg_stat_database, a view whose rows are
> added dynamically unlike other stat views, the newly created database
> doesn't have reset_timestamp. But given we clear the stats for a slot
> at pgstat_replslot_index(), why do we need to initialize the
> reset_timestamp with globalStats.stat_reset_timestamp when reading the
> stats file? Even if we could not find any slot stats in the stats file
> the view won’t show anything.
>

It was mainly from a code-consistency point of view. Also, we clear the
data in pgstat_replslot_index only for new slots, not for existing
slots. The value might be used when we can't load the stats file, as
per the comment in the code ("Set the current timestamp (will be kept
only in case we can't load an existing statsfile)").

--
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 5 Oct 2020 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 5, 2020 at 1:26 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Sat, 3 Oct 2020 at 16:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Oct 3, 2020 at 9:26 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > When we discussed this before, I was thinking that we could have other
> > > > statistics for physical slots in the same statistics view in the
> > > > future. Having the view show only logical slots also makes sense to me
> > > > but I’m concerned a bit that we could break backward compatibility
> > > > that monitoring tools etc will be affected when the view starts to
> > > > show physical slots too.
> > > >
> > >
> > > I think that would happen anyway as we need to add more columns in
> > > view for the physical slots.
> >
> > I think it depends; adding more columns to the view might not break
> > tools if the query used in the tool explicitly specifies columns.
> >
>
> What if it uses Select * ...? It might not be advisable to assume how
> the user might fetch data.
>
> > OTOH
> > if the view starts to show more rows, the tool will need to have the
> > condition to get the same result as before.
> >
> > >
> > > > If the view shows only logical slots, it also
> > > > might be worth considering to have separate views for logical slots
> > > > and physical slots and having pg_stat_logical_replication_slots by
> > > > this change.
> > > >
> > >
> > > I am not sure at this stage but I think we will add the additional
> > > stats for physical slots once we have any in this view itself. I would
> > > like to avoid adding separate views if possible. The only reason to
> > > omit physical slots at this stage is that we don't have any stats for
> > > the same.
> >
> > I also prefer not to have separate views. I'm concerned about the
> > compatibility I explained above but at the same time I agree that it
> > doesn't make sense to show the stats always having nothing. Since
> > given you and Dilip agreed on that, I also agree with that.
> >
>
> Okay.
>
> > >
> > > > Here is my review comment on the v7 patch.
> > > >
> > > > +   /*
> > > > +    * Set the same reset timestamp for all replication slots too.
> > > > +    */
> > > > +   for (i = 0; i < max_replication_slots; i++)
> > > > +       replSlotStats[i].stat_reset_timestamp =
> > > > globalStats.stat_reset_timestamp;
> > > > +
> > > >
> > > > You added back the above code but since we clear the timestamps on
> > > > creation of a new slot they are not shown:
> > > >
> > > > +   /* Register new slot */
> > > > +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> > > > +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> > > >
> > > > Looking at other statistics views such as pg_stat_slru,
> > > > pg_stat_bgwriter, and pg_stat_archiver, they have a valid
> > > > reset_timestamp value from the beginning. That's why I removed that
> > > > code and assigned the timestamp when registering a new slot.
> > > >
> > >
> > > Hmm, I don't think it is shown intentionally in those views. I think
> > > what is happening in other views is that it has been initialized with
> > > some value when we read the stats and then while updating and or
> > > writing because we don't change the stat_reset_timestamp, it displays
> > > the same value as initialized at the time of read. Now, because in
> > > pgstat_replslot_index() we always initialize the replSlotStats it
> > > would overwrite any previous value we have set during read and display
> > > the stat_reset as empty for replication slots. If we stop initializing
> > > the replSlotStats in pgstat_replslot_index() then we will see similar
> > > behavior as other views have. So even if we want to change then
> > > probably we should stop initialization in pgstat_replslot_index but I
> > > don't think that is necessarily better behavior because the
> > > description of the parameter doesn't indicate any such thing.
> >
> > Understood. I agreed that the newly created slot doesn't have
> > reset_timestamp. Looking at pg_stat_database, a view whose rows are
> > added dynamically unlike other stat views, the newly created database
> > doesn't have reset_timestamp. But given we clear the stats for a slot
> > at pgstat_replslot_index(), why do we need to initialize the
> > reset_timestamp with globalStats.stat_reset_timestamp when reading the
> > stats file? Even if we could not find any slot stats in the stats file
> > the view won’t show anything.
> >
>
> It was mainly for a code consistency point of view. Also, we will
> clear the data in pgstat_replslot_index only for new slots, not for
> existing slots. It might be used when we can't load the statsfile as
> per comment in code ("Set the current timestamp (will be kept only in
> case we can't load an existing statsfile)).
>

Understood.

Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
pass a physical slot name to pg_stat_reset_replication_slot(), a
PgStat_MsgResetreplslotcounter is sent to the stats collector. I'm
okay with not raising an error, but maybe we can have it not send the
message in that case.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> pass a physical slot name to pg_stat_reset_replication_slot() a
> PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> okay with not raising an error but maybe we can have it not to send
> the message in that case.
>

makes sense, so changed accordingly.
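
With that change, the user-visible behavior is as follows (a usage
sketch; the slot names are examples and the physical slot is assumed to
exist):

-- resets the decoding stats of one logical slot
SELECT pg_stat_reset_replication_slot('regression_slot');

-- for a physical slot this is now a no-op: no error is raised and no
-- reset message is sent to the stats collector
SELECT pg_stat_reset_replication_slot('some_physical_slot');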

--
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 6 Oct 2020 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> > pass a physical slot name to pg_stat_reset_replication_slot() a
> > PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> > okay with not raising an error but maybe we can have it not to send
> > the message in that case.
> >
>
> makes sense, so changed accordingly.
>

Thank you for updating the patch!

Both patches look good to me.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Oct 7, 2020 at 11:24 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 6 Oct 2020 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> > > pass a physical slot name to pg_stat_reset_replication_slot() a
> > > PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> > > okay with not raising an error but maybe we can have it not to send
> > > the message in that case.
> > >
> >
> > makes sense, so changed accordingly.
> >
>
> Thank you for updating the patch!
>

Thanks, I will push the first one tomorrow unless I see more comments,
and the test-case one later. I think after we are done with this, the
next step would be to finish the streaming stats work [1]. We probably
need to review that patch and add test cases to it. If nobody else
shows up, I will pick it up and complete it.

[1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 7, 2020 at 11:24 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 6 Oct 2020 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> > > > pass a physical slot name to pg_stat_reset_replication_slot() a
> > > > PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> > > > okay with not raising an error but maybe we can have it not to send
> > > > the message in that case.
> > > >
> > >
> > > makes sense, so changed accordingly.
> > >
> >
> > Thank you for updating the patch!
> >
>
> Thanks, I will push the first one tomorrow unless I see more comments
> and test-case one later.

I thought we could have a test case for the reset function; what do you think?

> I think after we are done with this the next
> step would be to finish the streaming stats work [1]. We probably need
> to review and add the test case in that patch. If nobody else shows up
> I will pick it up and complete it.

+1
I can review that patch.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Oct 7, 2020 at 11:24 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Tue, 6 Oct 2020 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > >
> > > > > Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> > > > > pass a physical slot name to pg_stat_reset_replication_slot() a
> > > > > PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> > > > > okay with not raising an error but maybe we can have it not to send
> > > > > the message in that case.
> > > > >
> > > >
> > > > makes sense, so changed accordingly.
> > > >
> > >
> > > Thank you for updating the patch!
> > >
> >
> > Thanks, I will push the first one tomorrow unless I see more comments
> > and test-case one later.
>
> I thought we could have a test case for the reset function, what do you think?
>

We can write one if we want, but there are a few things we need to do
for that, like adding a new function, say wait_for_spill_stats, that
checks whether the counters have become zero. Then we would probably
call a reset function, call the new wait function, and then check the
stats again to ensure they are reset to 0.
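
Something along these lines, perhaps (an untested sketch; the function
name, the slot name, and the loop bound are placeholders):

CREATE FUNCTION wait_for_spill_stats(check_reset bool) RETURNS void AS $$
DECLARE
    updated bool;
BEGIN
    FOR i IN 1 .. 300 LOOP
        -- with check_reset, wait for the counter to drop back to zero;
        -- otherwise wait for it to advance past zero
        SELECT CASE WHEN check_reset THEN spill_txns = 0
                    ELSE spill_txns > 0 END
          INTO updated
          FROM pg_stat_replication_slots
         WHERE name = 'regression_slot';

        EXIT WHEN updated;

        -- sleep briefly, then discard the cached stats snapshot so the
        -- next read fetches fresh data from the stats collector
        PERFORM pg_sleep(0.1);
        PERFORM pg_stat_clear_snapshot();
    END LOOP;

    IF updated IS NOT TRUE THEN
        RAISE WARNING 'stats did not reach the expected state in time';
    END IF;
END;
$$ LANGUAGE plpgsql;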

We can't write a more advanced test (reset the existing stats, perform
some tests, and check the stats again) because the *slot_get_changes()
functions can start from the previous WAL for which we have already
counted the stats. We might write one if we can somehow track the WAL
positions from the previous test, but I am not sure we want to go
there.

> > I think after we are done with this the next
> > step would be to finish the streaming stats work [1]. We probably need
> > to review and add the test case in that patch. If nobody else shows up
> > I will pick it up and complete it.
>
> +1
> I can review that patch.
>

Thanks!

--
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 8 Oct 2020 at 14:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Oct 7, 2020 at 11:24 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Tue, 6 Oct 2020 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Oct 6, 2020 at 9:34 AM Masahiko Sawada
> > > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > > >
> > > > > > Looking at pgstat_reset_replslot_counter() in the v8 patch, even if we
> > > > > > pass a physical slot name to pg_stat_reset_replication_slot() a
> > > > > > PgStat_MsgResetreplslotcounter is sent to the stats collector. I’m
> > > > > > okay with not raising an error but maybe we can have it not to send
> > > > > > the message in that case.
> > > > > >
> > > > >
> > > > > makes sense, so changed accordingly.
> > > > >
> > > >
> > > > Thank you for updating the patch!
> > > >
> > >
> > > Thanks, I will push the first one tomorrow unless I see more comments
> > > and test-case one later.
> >
> > I thought we could have a test case for the reset function, what do you think?
> >
>
> We can write if we want but there are few things we need to do for
> that like maybe a new function like wait_for_spill_stats which will
> check if the counters have become zero. Then probably call a reset
> function, call a new wait function, and then again check stats to
> ensure they are reset to 0.

Yes.

> We can't write any advanced test which means reset the existing stats
> perform some tests and again check stats because *slot_get_changes()
> function can start from the previous WAL for which we have covered the
> stats. We might write that if we can somehow track the WAL positions
> from the previous test. I am not sure if we want to go there.

Can we use pg_logical_slot_peek_changes() instead to decode the same
transactions multiple times?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Oct 8, 2020 at 1:55 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 8 Oct 2020 at 14:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > We can write if we want but there are few things we need to do for
> > that like maybe a new function like wait_for_spill_stats which will
> > check if the counters have become zero. Then probably call a reset
> > function, call a new wait function, and then again check stats to
> > ensure they are reset to 0.
>
> Yes.
>

I am not sure if it is worth it, but it is probably not a bad idea,
especially if we extend the existing tests based on your idea below.

> > We can't write any advanced test which means reset the existing stats
> > perform some tests and again check stats because *slot_get_changes()
> > function can start from the previous WAL for which we have covered the
> > stats. We might write that if we can somehow track the WAL positions
> > from the previous test. I am not sure if we want to go there.
>
> Can we use pg_logical_slot_peek_changes() instead to decode the same
> transactions multiple times?
>

I think this will do the trick. If we want to go there, then I suggest
we have a separate regression test file in test_decoding, named
decoding_stats, stats, or something like that. We can later add the
tests related to streaming stats to that file as well.
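
The skeleton of such a file might look roughly like this (a sketch
only; the table name, slot name, row count, and work-mem setting are
placeholders chosen to force spilling):

-- decoding_stats.sql (sketch)
SET logical_decoding_work_mem = '64kB';  -- small limit so the txn spills

SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot',
                                                      'test_decoding');
CREATE TABLE stats_test (data text);

-- a single transaction large enough to exceed the limit above
INSERT INTO stats_test
    SELECT 'spill-' || g.i FROM generate_series(1, 5000) g(i);

-- peek, rather than get, so the same changes can be decoded again later
SELECT count(*) > 0 FROM pg_logical_slot_peek_changes('regression_slot',
                                                      NULL, NULL);

-- after waiting for the collector (e.g. with a helper like the one
-- sketched earlier), check and then reset the counters
SELECT name, spill_txns > 0 AS spilled FROM pg_stat_replication_slots;
SELECT pg_stat_reset_replication_slot('regression_slot');

DROP TABLE stats_test;
SELECT pg_drop_replication_slot('regression_slot');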

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > I think after we are done with this the next
> > step would be to finish the streaming stats work [1]. We probably need
> > to review and add the test case in that patch. If nobody else shows up
> > I will pick it up and complete it.
>
> +1
> I can review that patch.
>

I have rebased the stream stats patch and made minor modifications. I
haven't done a detailed review but one thing that I think is not
correct is:
@@ -3496,10 +3499,18 @@ ReorderBufferStreamTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn)
  txn->snapshot_now = NULL;
  }

+
+ rb->streamCount += 1;
+ rb->streamBytes += txn->total_size;
+
+ /* Don't consider already streamed transaction. */
+ rb->streamTxns += (rbtxn_is_streamed(txn)) ? 0 : 1;
+
  /* Process and send the changes to output plugin. */
  ReorderBufferProcessTXN(rb, txn, InvalidXLogRecPtr, snapshot_now,
  command_id, true);

I think we should update the stream stats after
ReorderBufferProcessTXN rather than before, because any error in
ReorderBufferProcessTXN could lead to an unnecessary update of the
stats. But OTOH, the txn flags and other data can be changed by
ReorderBufferProcessTXN, so we need to save them in temporary
variables before calling the function.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 8 Oct 2020 at 17:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 8, 2020 at 1:55 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 8 Oct 2020 at 14:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > We can write if we want but there are few things we need to do for
> > > that like maybe a new function like wait_for_spill_stats which will
> > > check if the counters have become zero. Then probably call a reset
> > > function, call a new wait function, and then again check stats to
> > > ensure they are reset to 0.
> >
> > Yes.
> >
>
> I am not sure if it is worth but probably it is not a bad idea
> especially if we extend the existing tests based on your below idea?
>
> > > We can't write any advanced test which means reset the existing stats
> > > perform some tests and again check stats because *slot_get_changes()
> > > function can start from the previous WAL for which we have covered the
> > > stats. We might write that if we can somehow track the WAL positions
> > > from the previous test. I am not sure if we want to go there.
> >
> > Can we use pg_logical_slot_peek_changes() instead to decode the same
> > transactions multiple times?
> >
>
> I think this will do the trick. If we want to go there then I suggest
> we can have a separate regression test file in test_decoding with name
> as decoding_stats, stats, or something like that. We can later add the
> tests related to streaming stats in that file as well.
>

Agreed.

I've updated the patch. Please review it.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 8 Oct 2020 at 22:57, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > I think after we are done with this the next
> > > step would be to finish the streaming stats work [1]. We probably need
> > > to review and add the test case in that patch. If nobody else shows up
> > > I will pick it up and complete it.
> >
> > +1
> > I can review that patch.
> >
>
> I have rebased the stream stats patch and made minor modifications. I
> haven't done a detailed review but one thing that I think is not
> correct is:
> @@ -3496,10 +3499,18 @@ ReorderBufferStreamTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn)
>   txn->snapshot_now = NULL;
>   }
>
> +
> + rb->streamCount += 1;
> + rb->streamBytes += txn->total_size;
> +
> + /* Don't consider already streamed transaction. */
> + rb->streamTxns += (rbtxn_is_streamed(txn)) ? 0 : 1;
> +
>   /* Process and send the changes to output plugin. */
>   ReorderBufferProcessTXN(rb, txn, InvalidXLogRecPtr, snapshot_now,
>   command_id, true);
>
> I think we should update the stream stats after
> ReorderBufferProcessTXN rather than before because any error in
> ReorderBufferProcessTXN can lead to an unnecessary update of stats.
> But OTOH, the txn flags, and other data can be changed after
> ReorderBufferProcessTXN so we need to save them in a temporary
> variable before calling the function.

Thank you for updating the patch!

I've not looked at the patch in depth yet, but could RBTXN_IS_STREAMED
be cleared after ReorderBufferProcessTXN()?

BTW maybe it's better to start a new thread for this patch as the
title is no longer relevant.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: Resetting spilled txn statistics in pg_stat_replication

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
Hi, thank you for the awesome feature.

As it may have been discussed, I think the 'name' column in
pg_stat_replication_slots would be more consistent if its column name
and data type matched the pg_replication_slots catalog.

The attached patch changes the name and data type of the 'name' column
to slot_name and the 'name' type, respectively.
Also, the macro name PG_STAT_GET_..._CLOS has been changed to PG_STAT_GET_..._COLS.

Regards,
Noriyoshi Shinoda
-----Original Message-----
From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com] 
Sent: Monday, October 12, 2020 3:22 PM
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: Dilip Kumar <dilipbalaut@gmail.com>; Magnus Hagander <magnus@hagander.net>; Tomas Vondra
<tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Ajin Cherian
<itsajin@gmail.com>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

On Thu, 8 Oct 2020 at 22:57, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada 
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 7 Oct 2020 at 17:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > I think after we are done with this the next step would be to 
> > > finish the streaming stats work [1]. We probably need to review 
> > > and add the test case in that patch. If nobody else shows up I 
> > > will pick it up and complete it.
> >
> > +1
> > I can review that patch.
> >
>
> I have rebased the stream stats patch and made minor modifications. I 
> haven't done a detailed review but one thing that I think is not 
> correct is:
> @@ -3496,10 +3499,18 @@ ReorderBufferStreamTXN(ReorderBuffer *rb, 
> ReorderBufferTXN *txn)
>   txn->snapshot_now = NULL;
>   }
>
> +
> + rb->streamCount += 1;
> + rb->streamBytes += txn->total_size;
> +
> + /* Don't consider already streamed transaction. */
> + rb->streamTxns += (rbtxn_is_streamed(txn)) ? 0 : 1;
> +
>   /* Process and send the changes to output plugin. */
>   ReorderBufferProcessTXN(rb, txn, InvalidXLogRecPtr, snapshot_now,
>   command_id, true);
>
> I think we should update the stream stats after 
> ReorderBufferProcessTXN rather than before because any error in 
> ReorderBufferProcessTXN can lead to an unnecessary update of stats.
> But OTOH, the txn flags, and other data can be changed after 
> ReorderBufferProcessTXN so we need to save them in a temporary 
> variable before calling the function.

Thank you for updating the patch!

I've not looked at the patch in depth yet, but could RBTXN_IS_STREAMED be cleared after ReorderBufferProcessTXN()?

BTW maybe it's better to start a new thread for this patch as the title is no longer relevant.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Oct 12, 2020 at 10:59 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 8 Oct 2020 at 17:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 8, 2020 at 1:55 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Thu, 8 Oct 2020 at 14:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > We can write if we want but there are few things we need to do for
> > > > that like maybe a new function like wait_for_spill_stats which will
> > > > check if the counters have become zero. Then probably call a reset
> > > > function, call a new wait function, and then again check stats to
> > > > ensure they are reset to 0.
> > >
> > > Yes.
> > >
> >
> > I am not sure if it is worth but probably it is not a bad idea
> > especially if we extend the existing tests based on your below idea?
> >
> > > > We can't write any advanced test which means reset the existing stats
> > > > perform some tests and again check stats because *slot_get_changes()
> > > > function can start from the previous WAL for which we have covered the
> > > > stats. We might write that if we can somehow track the WAL positions
> > > > from the previous test. I am not sure if we want to go there.
> > >
> > > Can we use pg_logical_slot_peek_changes() instead to decode the same
> > > transactions multiple times?
> > >
> >
> > I think this will do the trick. If we want to go there then I suggest
> > we can have a separate regression test file in test_decoding with name
> > as decoding_stats, stats, or something like that. We can later add the
> > tests related to streaming stats in that file as well.
> >
>
> Agreed.
>
> I've updated the patch. Please review it.
>

Few comments:
=============
1.
+-- function to wait for counters to advance
+CREATE FUNCTION wait_for_spill_stats(check_reset bool) RETURNS void AS $$

Can we rename this function to wait_for_decode_stats? I am thinking we
can later reuse this function for streaming stats as well by passing
the additional parameter 'stream bool'.

2. let's drop the table added by this test and regression_slot at the
end of the test.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 12 Oct 2020 at 18:29, Shinoda, Noriyoshi (PN Japan A&PS
Delivery) <noriyoshi.shinoda@hpe.com> wrote:
>
> Hi, thank you for the awesome feature.
>

Thank you for reporting!

> As it may have been discussed, I think the 'name' column in pg_stat_replication_slots would be more consistent if its
> column name and data type matched the pg_replication_slots catalog.
> The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type, respectively.

It seems like a good idea to me. In other system views, we use the name
data type for object names. When I wrote the first patch, I borrowed
the code from pg_stat_slru, which uses text data for the name, but I
think it's an oversight.

> Also, the macro name PG_STAT_GET_..._CLOS has been changed to PG_STAT_GET_..._COLS.

Good catch!

Here are my comments on the patch:

--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -798,7 +798,7 @@ CREATE VIEW pg_stat_replication AS

 CREATE VIEW pg_stat_replication_slots AS
     SELECT
-            s.name,
+            s.name AS slot_name,
             s.spill_txns,
             s.spill_count,
             s.spill_bytes,

I think we should modify 'proargnames' of
pg_stat_get_replication_slots() in pg_proc.dat instead.

---
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -7094,7 +7094,7 @@ pgstat_replslot_index(const char *name, bool create_it)
    Assert(nReplSlotStats <= max_replication_slots);
    for (i = 0; i < nReplSlotStats; i++)
    {
-       if (strcmp(replSlotStats[i].slotname, name) == 0)
+       if (strcmp(replSlotStats[i].slotname.data, name) == 0)
            return i;           /* found */
    }

@@ -7107,7 +7107,7 @@ pgstat_replslot_index(const char *name, bool create_it)

    /* Register new slot */
    memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
-   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
+   memcpy(&replSlotStats[nReplSlotStats].slotname.data, name, NAMEDATALEN);

    return nReplSlotStats++;
 }

We can use NameStr() instead.

---
Perhaps we need to update the regression test as well.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Oct 12, 2020 at 11:52 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 8 Oct 2020 at 22:57, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 8, 2020 at 7:46 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I have rebased the stream stats patch and made minor modifications. I
> > haven't done a detailed review but one thing that I think is not
> > correct is:
> > @@ -3496,10 +3499,18 @@ ReorderBufferStreamTXN(ReorderBuffer *rb,
> > ReorderBufferTXN *txn)
> >   txn->snapshot_now = NULL;
> >   }
> >
> > +
> > + rb->streamCount += 1;
> > + rb->streamBytes += txn->total_size;
> > +
> > + /* Don't consider already streamed transaction. */
> > + rb->streamTxns += (rbtxn_is_streamed(txn)) ? 0 : 1;
> > +
> >   /* Process and send the changes to output plugin. */
> >   ReorderBufferProcessTXN(rb, txn, InvalidXLogRecPtr, snapshot_now,
> >   command_id, true);
> >
> > I think we should update the stream stats after
> > ReorderBufferProcessTXN rather than before because any error in
> > ReorderBufferProcessTXN can lead to an unnecessary update of stats.
> > But OTOH, the txn flags, and other data can be changed after
> > ReorderBufferProcessTXN so we need to save them in a temporary
> > variable before calling the function.
>
> Thank you for updating the patch!
>
> I've not looked at the patch in depth yet, but could RBTXN_IS_STREAMED
> be cleared after ReorderBufferProcessTXN()?
>

I think you mean to say RBTXN_IS_STREAMED could be *set* after
ReorderBufferProcessTXN(). We need to set it for the txn and its
subtxns, and currently that is done along with other things in
ReorderBufferTruncateTXN, so I am not sure it is a good idea to do it
separately.

> BTW maybe it's better to start a new thread for this patch as the
> title is no longer relevant.
>

Yeah, that makes sense. I'll do that while posting a new version of the patch.


-- 
With Regards,
Amit Kapila.



RE: Resetting spilled txn statistics in pg_stat_replication

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
Sawada-san, thank you for your comments.

The attached patch reflects the comments. I also fixed the regression
test.

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com] 
Sent: Monday, October 12, 2020 8:12 PM
To: Shinoda, Noriyoshi (PN Japan A&PS Delivery) <noriyoshi.shinoda@hpe.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>; Dilip Kumar <dilipbalaut@gmail.com>; Magnus Hagander <magnus@hagander.net>;
Tomas Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Ajin Cherian
<itsajin@gmail.com>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

On Mon, 12 Oct 2020 at 18:29, Shinoda, Noriyoshi (PN Japan A&PS
Delivery) <noriyoshi.shinoda@hpe.com> wrote:
>
> Hi, thank you for the awesome feature.
>

Thank you for reporting!

> As it may have been discussed, I think the 'name' column in pg_stat_replication_slots would be more consistent if its
> column name and data type matched the pg_replication_slots catalog.
> The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type, respectively.

It seems like a good idea to me. In other system views, we use the name data type for object names. When I wrote the
first patch, I borrowed the code from pg_stat_slru, which uses text data for the name, but I think it's an oversight.
 

> Also, the macro name PG_STAT_GET_..._CLOS has been changed to PG_STAT_GET_..._COLS.

Good catch!

Here are my comments on the patch:

--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -798,7 +798,7 @@ CREATE VIEW pg_stat_replication AS

 CREATE VIEW pg_stat_replication_slots AS
     SELECT
-            s.name,
+            s.name AS slot_name,
             s.spill_txns,
             s.spill_count,
             s.spill_bytes,

I think we should modify 'proargnames' of
pg_stat_get_replication_slots() in pg_proc.dat instead.

---
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -7094,7 +7094,7 @@ pgstat_replslot_index(const char *name, bool create_it)
    Assert(nReplSlotStats <= max_replication_slots);
    for (i = 0; i < nReplSlotStats; i++)
    {
-       if (strcmp(replSlotStats[i].slotname, name) == 0)
+       if (strcmp(replSlotStats[i].slotname.data, name) == 0)
            return i;           /* found */
    }

@@ -7107,7 +7107,7 @@ pgstat_replslot_index(const char *name, bool create_it)

    /* Register new slot */
    memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
-   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
+   memcpy(&replSlotStats[nReplSlotStats].slotname.data, name, 
+ NAMEDATALEN);

    return nReplSlotStats++;
 }

We can use NameStr() instead.

---
Perhaps we need to update the regression test as well.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 12 Oct 2020 at 19:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 12, 2020 at 10:59 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 8 Oct 2020 at 17:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 8, 2020 at 1:55 PM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Thu, 8 Oct 2020 at 14:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > We can write if we want but there are few things we need to do for
> > > > > that like maybe a new function like wait_for_spill_stats which will
> > > > > check if the counters have become zero. Then probably call a reset
> > > > > function, call a new wait function, and then again check stats to
> > > > > ensure they are reset to 0.
> > > >
> > > > Yes.
> > > >
> > >
> > > I am not sure if it is worth but probably it is not a bad idea
> > > especially if we extend the existing tests based on your below idea?
> > >
> > > > > We can't write any advanced test which means reset the existing stats
> > > > > perform some tests and again check stats because *slot_get_changes()
> > > > > function can start from the previous WAL for which we have covered the
> > > > > stats. We might write that if we can somehow track the WAL positions
> > > > > from the previous test. I am not sure if we want to go there.
> > > >
> > > > Can we use pg_logical_slot_peek_changes() instead to decode the same
> > > > transactions multiple times?
> > > >
> > >
> > > I think this will do the trick. If we want to go there then I suggest
> > > we can have a separate regression test file in test_decoding with name
> > > as decoding_stats, stats, or something like that. We can later add the
> > > tests related to streaming stats in that file as well.
> > >
> >
> > Agreed.
> >
> > I've updated the patch. Please review it.
> >
>
> Few comments:
> =============

Thank you for your review.

> 1.
> +-- function to wait for counters to advance
> +CREATE FUNCTION wait_for_spill_stats(check_reset bool) RETURNS void AS $$
>
> Can we rename this function to wait_for_decode_stats? I am thinking we
> can later reuse this function for streaming stats as well by passing
> the additional parameter 'stream bool'.

+1. Fixed.

>
> 2. let's drop the table added by this test and regression_slot at the
> end of the test.

Fixed.

I've attached the updated version of the patch. Please review it.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
(Please avoid top-posting)

On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
Delivery) <noriyoshi.shinoda@hpe.com> wrote:
>
> Sawada-san, Thank you your comments.
>
> The attached patch reflects the comment.
> I also made a fix for the regression test.
>
> Regards,
> Noriyoshi Shinoda
>
> -----Original Message-----
> From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com]
> Sent: Monday, October 12, 2020 8:12 PM
> To: Shinoda, Noriyoshi (PN Japan A&PS Delivery) <noriyoshi.shinoda@hpe.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Dilip Kumar <dilipbalaut@gmail.com>; Magnus Hagander
> <magnus@hagander.net>; Tomas Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers
> <pgsql-hackers@lists.postgresql.org>; Ajin Cherian <itsajin@gmail.com>
> Subject: Re: Resetting spilled txn statistics in pg_stat_replication
>
> On Mon, 12 Oct 2020 at 18:29, Shinoda, Noriyoshi (PN Japan A&PS
> Delivery) <noriyoshi.shinoda@hpe.com> wrote:
> >
> > Hi, thank you for the awesome feature.
> >
>
> Thank you for reporting!
>
> > As it may have been discussed, I think the 'name' column in pg_stat_replication_slots would be more consistent if
> > its column name and data type matched the pg_replication_slots catalog.
> > The attached patch changes the name and data type of the 'name' column to slot_name and the 'name' type, respectively.
>
> It seems like a good idea to me. In other system views, we use the name data type for object names. When I wrote the
> first patch, I borrowed the code from pg_stat_slru, which uses text data for the name, but I think it's an oversight.

Hmm, my above observation is wrong. All the other statistics use the
text data type and internally use char[NAMEDATALEN]. So I think
renaming the column to 'slot_name' would be a good idea, but we
probably don't need to change the internally used data type. As for the
data type of slot_name in the pg_stat_replication_slots view, given
that the docs say the following [1], I think we can keep it as text
too, since this view is not a system catalog. What do you think?

8.3. Character Types:
The name type exists only for the storage of identifiers in the
internal system catalogs

[1] https://www.postgresql.org/docs/devel/datatype-character.html
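
For what it's worth, a tool can always probe what it actually gets from
the view, e.g. (illustrative only, assuming the proposed slot_name
column):

SELECT pg_typeof(slot_name) FROM pg_stat_replication_slots LIMIT 1;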

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 4:54 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> Attached the updated version patch. Please review it.
>

I have pushed this but it failed on one of the BF animals. See
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25

The failure is shown below and I am analyzing it. See if you can
provide any insights.

@@ -58,7 +58,7 @@
 SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
       name       | spill_txns | spill_count
 -----------------+------------+-------------
- regression_slot |          1 |          12
+ regression_slot |          1 |          10
 (1 row)

 -- reset the slot stats, and wait for stats collector to reset
@@ -96,7 +96,7 @@
 SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
       name       | spill_txns | spill_count
 -----------------+------------+-------------
- regression_slot |          1 |          12
+ regression_slot |          1 |          10
 (1 row)



-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 4:54 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > Attached the updated version patch. Please review it.
> >
>
> I have pushed this but it failed in one of the BF. See
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25
>
> The failure is shown below and I am analyzing it. See, if you can
> provide any insights.
>
> @@ -58,7 +58,7 @@
>  SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
>        name       | spill_txns | spill_count
>  -----------------+------------+-------------
> - regression_slot |          1 |          12
> + regression_slot |          1 |          10
>  (1 row)
>
>  -- reset the slot stats, and wait for stats collector to reset
> @@ -96,7 +96,7 @@
>  SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
>        name       | spill_txns | spill_count
>  -----------------+------------+-------------
> - regression_slot |          1 |          12
> + regression_slot |          1 |          10
>  (1 row)
>

The reason for this problem could be that there is some transaction
(say by autovacuum) which ran interleaved with this transaction
and committed before this one. Now during DecodeCommit of this
background transaction, we will send the stats accumulated by that
time which could lead to such a problem.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 4:54 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > Attached the updated version patch. Please review it.
> > >
> >
> > I have pushed this but it failed in one of the BF. See
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25
> >
> > The failure is shown below and I am analyzing it. See if you can
> > provide any insights.
> >
> > @@ -58,7 +58,7 @@
> >  SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
> >        name       | spill_txns | spill_count
> >  -----------------+------------+-------------
> > - regression_slot |          1 |          12
> > + regression_slot |          1 |          10
> >  (1 row)
> >
> >  -- reset the slot stats, and wait for stats collector to reset
> > @@ -96,7 +96,7 @@
> >  SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;
> >        name       | spill_txns | spill_count
> >  -----------------+------------+-------------
> > - regression_slot |          1 |          12
> > + regression_slot |          1 |          10
> >  (1 row)
> >
>
> The reason for this problem could be that there is some transaction
> (say by autovacuum) which ran interleaved with this transaction
> and committed before this one. Now during DecodeCommit of this
> background transaction, we will send the stats accumulated by that
> time which could lead to such a problem.
>

If this theory is correct then I think we can't rely on the
'spill_count' value. What do you think?

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> I have pushed this but it failed in one of the BF. See
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25
> The failure is shown below and I am analyzing it. See if you can
> provide any insights.

It's not very clear what spill_count actually counts (and the
documentation sure does nothing to clarify that), but if it has anything
to do with WAL volume, the explanation might be that florican is 32-bit.
All the animals that have passed that test so far are 64-bit.

> The reason for this problem could be that there is some transaction
> (say by autovacuum) which ran interleaved with this transaction
> and committed before this one.

I can believe that idea too, but would it not have resulted in a
diff in spill_txns as well?

In short, I'm not real convinced that a stable result is possible in this
test.  Maybe you should just test for spill_txns and spill_count being
positive.

            regards, tom lane



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 9:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > I have pushed this but it failed in one of the BF. See
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2020-10-13%2003%3A07%3A25
> > The failure is shown below and I am analyzing it. See if you can
> > provide any insights.
>
> It's not very clear what spill_count actually counts (and the
> documentation sure does nothing to clarify that), but if it has anything
> to do with WAL volume, the explanation might be that florican is 32-bit.
> All the animals that have passed that test so far are 64-bit.
>

It is based on the size of the change. In this case, it is the size of
the tuples inserted. See ReorderBufferChangeSize() to see how we compute
the size of each change. Once the total_size of changes crosses
logical_decoding_work_mem (64kB in this case), we will spill. So
'spill_count' is the number of times the size of changes in that
transaction crossed the threshold and led to a spill of the
corresponding changes.
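
To make that concrete, here is a rough sketch of how one can watch these
counters move by hand (slot and table names are made up; this assumes
wal_level = logical and the contrib test_decoding plugin):

-- use the same 64kB setting as the test
SET logical_decoding_work_mem = '64kB';
SELECT 'init' FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');
CREATE TABLE spill_demo (v text);
INSERT INTO spill_demo SELECT 'row-'||g.i FROM generate_series(1, 5000) g(i);
-- decoding happens here, spilling whenever the accumulated changes
-- cross logical_decoding_work_mem
SELECT count(*) FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);
SELECT name, spill_txns, spill_count, spill_bytes FROM pg_stat_replication_slots;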


> > The reason for this problem could be that there is some transaction
> > (say by autovacuum) which ran interleaved with this transaction
> > and committed before this one.
>
> I can believe that idea too, but would it not have resulted in a
> diff in spill_txns as well?
>

We count 'spill_txns' once for a transaction that has ever been
spilled. I think 'spill_txns' wouldn't vary for this particular
test even if the autovacuum transaction happens before the main
transaction of the test, because in that case wait_for_decode_stats
won't finish until it sees the main transaction ('spill_txns' won't be
positive by that time).

> In short, I'm not real convinced that a stable result is possible in this
> test.  Maybe you should just test for spill_txns and spill_count being
> positive.
>

Yeah, that seems like the best we can do here.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Tue, Oct 13, 2020 at 9:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's not very clear what spill_count actually counts (and the
>> documentation sure does nothing to clarify that), but if it has anything
>> to do with WAL volume, the explanation might be that florican is 32-bit.
>> All the animals that have passed that test so far are 64-bit.

prairiedog just failed in not-quite-the-same way, which reinforces the
idea that this test is dependent on MAXALIGN, which determines physical
tuple size.  (I just checked the buildfarm, and the four active members
that report MAXALIGN 4 during configure are florican, lapwing, locust,
and prairiedog.  Not sure about the MSVC critters though.)  The
spill_count number is different though, so it seems that that may not
be the whole story.

> It is based on the size of the change. In this case, it is the size of
> the tuples inserted. See ReorderBufferChangeSize() to see how we compute
> the size of each change.

I know I can go read the source code, but most users will not want to.
Is the documentation in monitoring.sgml really sufficient?  If we can't
explain this with more precision, is it really a number we want to expose
at all?

            regards, tom lane



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Tue, Oct 13, 2020 at 9:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It's not very clear what spill_count actually counts (and the
> >> documentation sure does nothing to clarify that), but if it has anything
> >> to do with WAL volume, the explanation might be that florican is 32-bit.
> >> All the animals that have passed that test so far are 64-bit.
>
> prairiedog just failed in not-quite-the-same way, which reinforces the
> idea that this test is dependent on MAXALIGN, which determines physical
> tuple size.  (I just checked the buildfarm, and the four active members
> that report MAXALIGN 4 during configure are florican, lapwing, locust,
> and prairiedog.  Not sure about the MSVC critters though.)  The
> spill_count number is different though, so it seems that that may not
> be the whole story.
>

It is possible that MAXALIGN stuff is playing a role here and/or the
background transaction stuff. I think if we go with the idea of
testing spill_txns and spill_count being positive then the results
will be stable. I'll write a patch for that.

> > It is based on the size of the change. In this case, it is the size of
> > the tuples inserted. See ReorderBufferChangeSize() to see how we compute
> > the size of each change.
>
> I know I can go read the source code, but most users will not want to.
> Is the documentation in monitoring.sgml really sufficient?  If we can't
> explain this with more precision, is it really a number we want to expose
> at all?
>

This counter is important to give users an idea about the amount of
I/O we incur during decoding and to tune logical_decoding_work_mem
GUC. So, I would prefer to improve the documentation for this
variable.
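
For instance, a monitoring query along these lines could be used to
decide whether the GUC needs raising (an illustrative sketch; the 256MB
value is made up):

SELECT name, spill_txns, spill_count,
       pg_size_pretty(spill_bytes) AS spilled
FROM pg_stat_replication_slots;
-- if spill_count keeps growing across checks, consider something like:
ALTER SYSTEM SET logical_decoding_work_mem = '256MB';
SELECT pg_reload_conf();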

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tom Lane
Date:
I wrote:
> prairiedog just failed in not-quite-the-same way, which reinforces the
> idea that this test is dependent on MAXALIGN, which determines physical
> tuple size.  (I just checked the buildfarm, and the four active members
> that report MAXALIGN 4 during configure are florican, lapwing, locust,
> and prairiedog.  Not sure about the MSVC critters though.)  The
> spill_count number is different though, so it seems that that may not
> be the whole story.

Oh, and here comes lapwing:

- regression_slot |          1 |          12
+ regression_slot |          1 |          10

So if it weren't that prairiedog showed 11 not 10, we'd have a nice
neat it-depends-on-MAXALIGN theory.  As is, I'm not sure what all
is affecting it, though MAXALIGN sure seems to be a component.

(locust seems to be AWOL at the moment, so I'm not holding my breath
for that one to report in.)

            regards, tom lane



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > On Tue, Oct 13, 2020 at 9:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> It's not very clear what spill_count actually counts (and the
> > >> documentation sure does nothing to clarify that), but if it has anything
> > >> to do with WAL volume, the explanation might be that florican is 32-bit.
> > >> All the animals that have passed that test so far are 64-bit.
> >
> > prairiedog just failed in not-quite-the-same way, which reinforces the
> > idea that this test is dependent on MAXALIGN, which determines physical
> > tuple size.  (I just checked the buildfarm, and the four active members
> > that report MAXALIGN 4 during configure are florican, lapwing, locust,
> > and prairiedog.  Not sure about the MSVC critters though.)  The
> > spill_count number is different though, so it seems that that may not
> > be the whole story.
> >
>
> It is possible that MAXALIGN stuff is playing a role here and/or the
> background transaction stuff. I think if we go with the idea of
> testing spill_txns and spill_count being positive then the results
> will be stable. I'll write a patch for that.
>

Please find the attached patch for the same. Additionally, I have
skipped empty xacts during decoding as background autovacuum
transactions can impact that count as well. I have done some minimal
testing with this. I'll do some more.
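
The stabilized check in the test looks roughly like this (a sketch of
the idea; the attached patch is authoritative):

SELECT name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count
FROM pg_stat_replication_slots;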

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
>> It is possible that MAXALIGN stuff is playing a role here and/or the
>> background transaction stuff. I think if we go with the idea of
>> testing spill_txns and spill_count being positive then the results
>> will be stable. I'll write a patch for that.

Here's our first failure on a MAXALIGN-8 machine:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08

So this is just plain not stable.  It is odd though.  I can
easily think of mechanisms that would cause the WAL volume
to occasionally be *more* than the "typical" case.  What
would cause it to be *less*, if MAXALIGN is ruled out?

            regards, tom lane



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> >> It is possible that MAXALIGN stuff is playing a role here and/or the
> >> background transaction stuff. I think if we go with the idea of
> >> testing spill_txns and spill_count being positive then the results
> >> will be stable. I'll write a patch for that.
>
> Here's our first failure on a MAXALIGN-8 machine:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08
>
> So this is just plain not stable.  It is odd though.  I can
> easily think of mechanisms that would cause the WAL volume
> to occasionally be *more* than the "typical" case.  What
> would cause it to be *less*, if MAXALIGN is ruled out?
>

The original theory I have given above [1] is an interleaved
autovacuum transaction. Let me try to explain in a bit more detail.
Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
g(i);') a parallel autovacuum transaction occurs. The problem as seen
in buildfarm will happen when autovacuum transaction happens after 80%
or more of the Insert is done.

In such a situation we will start decoding 'Insert' first and need to
spill multiple times due to the amount of changes (more than threshold
logical_decoding_work_mem) and then before we encounter Commit of
transaction that performed Insert (and probably some more changes from
that transaction) we will encounter a small transaction (autovacuum
transaction).  The decode of that small transaction will send the
stats collected till now which will lead to the problem shown in
buildfarm.
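
Roughly, the interleaving looks like this (an illustrative timeline,
using the counts from the buildfarm diff above):

-- WAL order as seen by the decoder:
--   T-1: INSERT ... generate_series(1, 5000)   -- large, spills repeatedly
--   T-2: COMMIT (small autovacuum transaction) -- decoded mid-way through T-1
--        -> stats accumulated so far get sent (e.g. spill_count = 10)
--   T-1: COMMIT
--        -> the final stats get sent (spill_count = 12)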

[1] - https://www.postgresql.org/message-id/CAA4eK1Jo0U1oSJyxrdA7i-bOOTh0Hue-NQqdG-CEqwGtDZPjyw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > >> It is possible that MAXALIGN stuff is playing a role here and/or the
> > >> background transaction stuff. I think if we go with the idea of
> > >> testing spill_txns and spill_count being positive then the results
> > >> will be stable. I'll write a patch for that.
> >
> > Here's our first failure on a MAXALIGN-8 machine:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08
> >
> > So this is just plain not stable.  It is odd though.  I can
> > easily think of mechanisms that would cause the WAL volume
> > to occasionally be *more* than the "typical" case.  What
> > would cause it to be *less*, if MAXALIGN is ruled out?
> >
>
> The original theory I have given above [1] is an interleaved
> autovacuum transaction. Let me try to explain in a bit more detail.
> Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> g(i);') a parallel autovacuum transaction occurs. The problem as seen
> in buildfarm will happen when autovacuum transaction happens after 80%
> or more of the Insert is done.
>
> In such a situation we will start decoding 'Insert' first and need to
> spill multiple times due to the amount of changes (more than threshold
> logical_decoding_work_mem) and then before we encounter Commit of
> transaction that performed Insert (and probably some more changes from
> that transaction) we will encounter a small transaction (autovacuum
> transaction).  The decode of that small transaction will send the
> stats collected till now which will lead to the problem shown in
> buildfarm.

That seems a possible scenario.

I think probably this also explains the reason why spill_count
slightly varied and spill_txns was still 1. The spill_count value
depends on how much the process spilled out transactions before
encountering the commit of an autovacuum transaction. Since we have
the spill statistics per reorder buffer, not per transaction, it's
possible.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 11:49 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > >> It is possible that MAXALIGN stuff is playing a role here and/or the
> > > >> background transaction stuff. I think if we go with the idea of
> > > >> testing spill_txns and spill_count being positive then the results
> > > >> will be stable. I'll write a patch for that.
> > >
> > > Here's our first failure on a MAXALIGN-8 machine:
> > >
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08
> > >
> > > So this is just plain not stable.  It is odd though.  I can
> > > easily think of mechanisms that would cause the WAL volume
> > > to occasionally be *more* than the "typical" case.  What
> > > would cause it to be *less*, if MAXALIGN is ruled out?
> > >
> >
> > The original theory I have given above [1] is an interleaved
> > autovacuum transaction. Let me try to explain in a bit more detail.
> > Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> > SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> > g(i);') a parallel autovacuum transaction occurs. The problem as seen
> > in buildfarm will happen when autovacuum transaction happens after 80%
> > or more of the Insert is done.
> >
> > In such a situation we will start decoding 'Insert' first and need to
> > spill multiple times due to the amount of changes (more than threshold
> > logical_decoding_work_mem) and then before we encounter Commit of
> > transaction that performed Insert (and probably some more changes from
> > that transaction) we will encounter a small transaction (autovacuum
> > transaction).  The decode of that small transaction will send the
> > stats collected till now which will lead to the problem shown in
> > buildfarm.
>
> That seems a possible scenario.
>
> I think probably this also explains the reason why spill_count
> slightly varied and spill_txns was still 1. The spill_count value
> depends on how much the process spilled out transactions before
> encountering the commit of an autovacuum transaction. Since we have
> the spill statistics per reorder buffer, not per transaction, it's
> possible.
>

Okay, here is an updated version (changed some comments) of the patch
I posted some time back. What do you think? I have tested this on both
Windows and Linux environments. I think it is a bit tricky to
reproduce the exact scenario, so if you are fine we can push this and
check, or let me know if you have any better idea?

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 13 Oct 2020 at 15:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 11:49 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > >> It is possible that MAXALIGN stuff is playing a role here and/or the
> > > > >> background transaction stuff. I think if we go with the idea of
> > > > >> testing spill_txns and spill_count being positive then the results
> > > > >> will be stable. I'll write a patch for that.
> > > >
> > > > Here's our first failure on a MAXALIGN-8 machine:
> > > >
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08
> > > >
> > > > So this is just plain not stable.  It is odd though.  I can
> > > > easily think of mechanisms that would cause the WAL volume
> > > > to occasionally be *more* than the "typical" case.  What
> > > > would cause it to be *less*, if MAXALIGN is ruled out?
> > > >
> > >
> > > The original theory I have given above [1] is an interleaved
> > > autovacuum transaction. Let me try to explain in a bit more detail.
> > > Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> > > SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> > > g(i);') a parallel autovacuum transaction occurs. The problem as seen
> > > in buildfarm will happen when autovacuum transaction happens after 80%
> > > or more of the Insert is done.
> > >
> > > In such a situation we will start decoding 'Insert' first and need to
> > > spill multiple times due to the amount of changes (more than threshold
> > > logical_decoding_work_mem) and then before we encounter Commit of
> > > transaction that performed Insert (and probably some more changes from
> > > that transaction) we will encounter a small transaction (autovacuum
> > > transaction).  The decode of that small transaction will send the
> > > stats collected till now which will lead to the problem shown in
> > > buildfarm.
> >
> > That seems a possible scenario.
> >
> > I think probably this also explains the reason why spill_count
> > slightly varied and spill_txns was still 1. The spill_count value
> > depends on how much the process spilled out transactions before
> > encountering the commit of an autovacuum transaction. Since we have
> > the spill statistics per reorder buffer, not per transaction, it's
> > possible.
> >
>
> Okay, here is an updated version (changed some comments) of the patch
> I posted some time back. What do you think? I have tested this on both
> Windows and Linux environments. I think it is a bit tricky to
> reproduce the exact scenario, so if you are fine we can push this and
> check, or let me know if you have any better idea?

I agree to check if the spill_count and spill_txns are positive. I
thought we could reduce the number of tuples to insert by half. It
would help to reduce the likelihood of other transactions interfering
and speed up the test (currently, the stats.sql test takes almost 1
sec in my environment). But it might lead to another problem, like the
logical decoding not spilling the transaction in some
environments.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 12:17 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 13 Oct 2020 at 15:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 11:49 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > The original theory I have given above [1] is an interleaved
> > > > autovacuum transaction. Let me try to explain in a bit more detail.
> > > > Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> > > > SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> > > > g(i);') a parallel autovacuum transaction occurs. The problem as seen
> > > > in buildfarm will happen when autovacuum transaction happens after 80%
> > > > or more of the Insert is done.
> > > >
> > > > In such a situation we will start decoding 'Insert' first and need to
> > > > spill multiple times due to the amount of changes (more than threshold
> > > > logical_decoding_work_mem) and then before we encounter Commit of
> > > > transaction that performed Insert (and probably some more changes from
> > > > that transaction) we will encounter a small transaction (autovacuum
> > > > transaction).  The decode of that small transaction will send the
> > > > stats collected till now which will lead to the problem shown in
> > > > buildfarm.
> > >
> > > That seems a possible scenario.
> > >
> > > I think probably this also explains the reason why spill_count
> > > slightly varied and spill_txns was still 1. The spill_count value
> > > depends on how much the process spilled out transactions before
> > > encountering the commit of an autovacuum transaction. Since we have
> > > the spill statistics per reorder buffer, not per transaction, it's
> > > possible.
> > >
> >
> > Okay, here is an updated version (changed some comments) of the patch
> > I posted some time back. What do you think? I have tested this on both
> > Windows and Linux environments. I think it is a bit tricky to
> > reproduce the exact scenario, so if you are fine we can push this and
> > check, or let me know if you have any better idea?
>
> I agree to check if the spill_count and spill_txns are positive.
>

I am able to reproduce this problem via the debugger. Basically, execute
the Insert mentioned above from one of the psql sessions and in
ExecInsert() stop the execution once 'estate->es_processed > 4000' and
then from another psql terminal execute some DDL which will be ignored
but will any try to decode commit. Then perform 'continue' in the
first session. This will lead to inconsistent stats value depending
upon at what time DDL is performed. I'll push the patch as I am more
confident now.
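
For anyone who wants to replay this, the steps look roughly like the
following; the breakpoint condition comes from the description above,
while the session layout and the DDL statement are illustrative:

-- session 1: run the big insert, paused under gdb with
--   (gdb) break ExecInsert if estate->es_processed > 4000
INSERT INTO stats_test SELECT 'serialize-topbig--1:'||g.i
FROM generate_series(1, 5000) g(i);
-- session 2: any small committed transaction, e.g. some DDL
CREATE TABLE dummy_ddl_table (i int);
-- back in gdb: 'continue', let session 1 finish, then decode the slot
-- and compare the spill stats
SELECT name, spill_txns, spill_count FROM pg_stat_replication_slots;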

> I
> thought we could reduce the number of tuples to insert by half. It
> would help to reduce the likelihood of other transactions interfering
> and speed up the test (currently, the stats.sql test takes almost 1
> sec in my environment). But it might lead to another problem, like the
> logical decoding not spilling the transaction in some
> environments.
>

Yeah, and in other cases, in spill.sql we are using the same amount of
data to test spilling.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 12:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 12:17 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I agree to check if the spill_count and spill_txns are positive.
> >
>
> I am able to reproduce this problem via the debugger. Basically, execute
> the Insert mentioned above from one of the psql sessions and in
> ExecInsert() stop the execution once 'estate->es_processed > 4000' and
> then from another psql terminal execute some DDL which will be ignored
> but will any try to decode commit.

/will any try to decode commit./we will anyway try to decode commit
for this DDL transaction when decoding changes via
pg_logical_slot_peek_changes

>  Then perform 'continue' in the
> first session. This will lead to inconsistent stats value depending
> upon at what time DDL is performed. I'll push the patch as I am more
> confident now.
>

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 13 Oct 2020 at 16:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 12:17 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 13 Oct 2020 at 15:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 11:49 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > The original theory I have given above [1] is an interleaved
> > > > > autovacuum transaction. Let me try to explain in a bit more detail.
> > > > > Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> > > > > SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> > > > > g(i);') a parallel autovacuum transaction occurs. The problem as seen
> > > > > in buildfarm will happen when autovacuum transaction happens after 80%
> > > > > or more of the Insert is done.
> > > > >
> > > > > In such a situation we will start decoding 'Insert' first and need to
> > > > > spill multiple times due to the amount of changes (more than threshold
> > > > > logical_decoding_work_mem) and then before we encounter Commit of
> > > > > transaction that performed Insert (and probably some more changes from
> > > > > that transaction) we will encounter a small transaction (autovacuum
> > > > > transaction).  The decode of that small transaction will send the
> > > > > stats collected till now which will lead to the problem shown in
> > > > > buildfarm.
> > > >
> > > > That seems a possible scenario.
> > > >
> > > > I think probably this also explains the reason why spill_count
> > > > slightly varied and spill_txns was still 1. The spill_count value
> > > > depends on how much the process spilled out transactions before
> > > > encountering the commit of an autovacuum transaction. Since we have
> > > > the spill statistics per reorder buffer, not per transaction, it's
> > > > possible.
> > > >
> > >
> > > Okay, here is an updated version (changed some comments) of the patch
> > > I posted some time back. What do you think? I have tested this on both
> > > Windows and Linux environments. I think it is a bit tricky to
> > > reproduce the exact scenario, so if you are fine we can push this and
> > > check, or let me know if you have any better idea?
> >
> > I agree to check if the spill_count and spill_txns are positive.
> >
>
> I am able to reproduce this problem via the debugger. Basically, execute
> the Insert mentioned above from one of the psql sessions and in
> ExecInsert() stop the execution once 'estate->es_processed > 4000' and
> then from another psql terminal execute some DDL which will be ignored
> but will any try to decode commit. Then perform 'continue' in the
> first session. This will lead to inconsistent stats value depending
> upon at what time DDL is performed.

Thanks!
I'm also able to reproduce this in a similar way and have confirmed
the patch fixes it.

>  I'll push the patch as I am more
> confident now.

+1. Let's see how the tests go.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 12:51 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
>
> >  I'll push the patch as I am more
> > confident now.
>
> +1. Let's see how the tests go.
>

Okay, thanks. I have pushed it.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> I am able to reproduce this problem via the debugger. Basically, execute
> the Insert mentioned above from one of the psql sessions and in
> ExecInsert() stop the execution once 'estate->es_processed > 4000' and
> then from another psql terminal execute some DDL which will be ignored
> but will any try to decode commit. Then perform 'continue' in the
> first session. This will lead to inconsistent stats value depending
> upon at what time DDL is performed. I'll push the patch as I am more
> confident now.

So ... doesn't this mean that if the concurrent transaction commits very
shortly after our query starts, decoding might stop without having ever
spilled at all?  IOW, I'm afraid that the revised test can still fail,
just at a frequency circa one-twelfth of before.

I'm also somewhat suspicious of this explanation because it doesn't
seem to account for the clear experimental evidence that 32-bit machines
were more prone to failure than 64-bit.

            regards, tom lane



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 7:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > I am able to reproduce this problem via debugger. Basically, execute
> > the Insert mentioned above from one the psql sessions and in
> > ExecInsert() stop the execution once 'estate->es_processed > 4000' and
> > then from another psql terminal execute some DDL which will be ignored
> > but will any try to decode commit. Then perform 'continue' in the
> > first session. This will lead to inconsistent stats value depending
> > upon at what time DDL is performed. I'll push the patch as I am more
> > confident now.
>
> So ... doesn't this mean that if the concurrent transaction commits very
> shortly after our query starts, decoding might stop without having ever
> spilled at all?
>

I am assuming that by "our query starts" you refer to the Insert
statement used in the test. No, the decoding (and required spilling)
will still happen. It is only that we will try to send the stats
accumulated by that time, which will be zero. And later, when the
decoding of our Insert transaction is finished, it will again send the
updated stats. It should work fine in this or similar scenarios.
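
In other words, the test only needs to keep polling until the final
stats arrive. A minimal sketch of such a poll (the loop is illustrative;
the test's actual helper is wait_for_decode_stats):

DO $$
BEGIN
  LOOP
    PERFORM 1 FROM pg_stat_replication_slots
     WHERE name = 'regression_slot' AND spill_txns > 0;
    EXIT WHEN FOUND;
    PERFORM pg_sleep(0.1);
  END LOOP;
END $$;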

-- 
With Regards,
Amit Kapila.



RE: Resetting spilled txn statistics in pg_stat_replication

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
Thanks for your comment.

> 8.3. Character Types:
> The name type exists only for the storage of identifiers in the internal system catalogs

I didn't know the policy about data types. Thank you.
But I think the column names should match pg_replication_slots.
The attached patch changes only the column names and macros.

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com] 
Sent: Tuesday, October 13, 2020 9:11 AM
To: Shinoda, Noriyoshi (PN Japan A&PS Delivery) <noriyoshi.shinoda@hpe.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>; Dilip Kumar <dilipbalaut@gmail.com>; Magnus Hagander <magnus@hagander.net>;
Tomas Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; Ajin Cherian
<itsajin@gmail.com>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

(Please avoid top-posting)

On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
Delivery) <noriyoshi.shinoda@hpe.com> wrote:
>
> Sawada-san, Thank you your comments.
>
> The attached patch reflects the comment.
> I also made a fix for the regression test.
>
> Regards,
> Noriyoshi Shinoda
>
> -----Original Message-----
> From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com]
> Sent: Monday, October 12, 2020 8:12 PM
> To: Shinoda, Noriyoshi (PN Japan A&PS Delivery) 
> <noriyoshi.shinoda@hpe.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Dilip Kumar 
> <dilipbalaut@gmail.com>; Magnus Hagander <magnus@hagander.net>; Tomas 
> Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers 
> <pgsql-hackers@lists.postgresql.org>; Ajin Cherian <itsajin@gmail.com>
> Subject: Re: Resetting spilled txn statistics in pg_stat_replication
>
> On Mon, 12 Oct 2020 at 18:29, Shinoda, Noriyoshi (PN Japan A&PS
> Delivery) <noriyoshi.shinoda@hpe.com> wrote:
> >
> > Hi, thank you for the awesome feature.
> >
>
> Thank you for reporting!
>
> > As it may have been discussed, I think the 'name' column in pg_stat_replication_slots is more consistent with the
> > column name and data type matched to the pg_replication_slots catalog.
> > The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type, respectively.
>
> It seems a good idea to me. In other system views, we use the name data type for object name. When I wrote the first
> patch, I borrowed the code for pg_stat_slru which uses text data for the name but I think it's an oversight.

Hmm, my above observation is wrong. All other statistics use text data type and internally use char[NAMEDATALEN]. So I
think renaming to 'slot_name' would be a good idea but probably we don't need to change the internally used data type.
For the data type of slot_name in the pg_stat_replication_slots view, given that the doc says the following [1], I
think we can keep it as text too since this view is not a system catalog. What do you think?

8.3. Character Types:
The name type exists only for the storage of identifiers in the internal system catalogs

[1] https://www.postgresql.org/docs/devel/datatype-character.html 

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 14 Oct 2020 at 12:03, Shinoda, Noriyoshi (PN Japan A&PS
Delivery) <noriyoshi.shinoda@hpe.com> wrote:
>
> Thanks for your comment.
>
> > 8.3. Character Types:
> > The name type exists only for the storage of identifiers in the internal system catalogs
>
> I didn't know the policy about data types. Thank you.
> But I think the column names should match pg_replication_slots.
> The attached patch changes only the column names and macros.
>

Thank you for updating the patch!

The patch changes the column name from 'name' to 'slot_name' and fixes
a typo. That's reasonable to me. Amit, what do you think?

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 5:41 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
> Delivery) <noriyoshi.shinoda@hpe.com> wrote:
> >
> >
> > > As it may have been discussed, I think the 'name' column in pg_stat_replication_slots is more consistent with the
> > > column name and data type matched to the pg_replication_slots catalog.
> > > The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type,
> > > respectively.
> >
> > It seems a good idea to me. In other system views, we use the name data type for object name. When I wrote the
> > first patch, I borrowed the code for pg_stat_slru which uses text data for the name but I think it's an oversight.
>
> Hmm, my above observation is wrong. All other statistics use text data
> type and internally use char[NAMEDATALEN].
>

AFAICS, we use name data-type in many other similar stats views like
pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
pg_stat_all_tables. So, shouldn't we be consistent with those views?

-- 
With Regards,
Amit Kapila.



RE: Resetting spilled txn statistics in pg_stat_replication

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
Amit-san, Sawada-san,

Thank you for your comment.

> AFAICS, we use name data-type in many other similar stats views like pg_stat_subscription, pg_statio_all_sequences,
> pg_stat_user_functions, pg_stat_all_tables. So, shouldn't we be consistent with those views?

I checked the data type used for the statistics view identity columns. 'Name' type columns are used in many views. If
there is no problem with the PostgreSQL standard, I would like to change both the data type and the column name.

- name type 
pg_stat_activity.datname
pg_stat_replication.usename
pg_stat_subscription.subname
pg_stat_database.datname
pg_stat_database_conflicts.datname
pg_stat_all_tables.schemaname/.relname
pg_stat_all_indexes.schemaname/.relname/.indexrelname
pg_statio_all_tables.schemaname/.relname
pg_statio_all_indexes.schemaname/.relname/.indexname
pg_statio_all_sequences.schemaname/.relname
pg_stat_user_functions.schemaname/.funcname

- text type
pg_stat_replication_slots.name
pg_stat_slru.name
pg_backend_memory_contexts.name

The attached patch makes the following changes.
- column name: name to slot_name
- data type: text to name
- macro: ... CLOS to ... COLS

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Amit Kapila [mailto:amit.kapila16@gmail.com] 
Sent: Thursday, October 15, 2020 5:52 PM
To: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Cc: Shinoda, Noriyoshi (PN Japan A&PS Delivery) <noriyoshi.shinoda@hpe.com>; Dilip Kumar <dilipbalaut@gmail.com>;
Magnus Hagander <magnus@hagander.net>; Tomas Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers
<pgsql-hackers@lists.postgresql.org>; Ajin Cherian <itsajin@gmail.com>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

On Tue, Oct 13, 2020 at 5:41 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
> Delivery) <noriyoshi.shinoda@hpe.com> wrote:
> >
> >
> > > As it may have been discussed, I think the 'name' column in pg_stat_replication_slots is more consistent with the
> > > column name and data type matched to the pg_replication_slots catalog.
> > > The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type,
> > > respectively.
> >
> > It seems a good idea to me. In other system views, we use the name data type for object name. When I wrote the
> > first patch, I borrowed the code for pg_stat_slru which uses text data for the name but I think it's an oversight.
>
> Hmm, my above observation is wrong. All other statistics use text data
> type and internally use char[NAMEDATALEN].
>

AFAICS, we use name data-type in many other similar stats views like pg_stat_subscription, pg_statio_all_sequences,
pg_stat_user_functions, pg_stat_all_tables. So, shouldn't we be consistent with those views?

--
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 15 Oct 2020 at 17:51, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 5:41 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
> > Delivery) <noriyoshi.shinoda@hpe.com> wrote:
> > >
> > >
> > > > As it may have been discussed, I think the 'name' column in pg_stat_replication_slots is more consistent with
> > > > the column name and data type matched to the pg_replication_slots catalog.
> > > > The attached patch changes the name and data type of the 'name' column to slot_name and 'name' type,
> > > > respectively.
> > >
> > > It seems a good idea to me. In other system views, we use the name data type for object name. When I wrote the
> > > first patch, I borrowed the code for pg_stat_slru which uses text data for the name but I think it's an oversight.
> >
> > Hmm, my above observation is wrong. All other statistics use text data
> > type and internally use char[NAMEDATALEN].
> >
>
> AFAICS, we use name data-type in many other similar stats views like
> pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
> pg_stat_all_tables. So, shouldn't we be consistent with those views?
>

Yes, they have the name data type column but it actually comes from
system catalogs. For instance, here is the view definition of
pg_stat_subscription:

 SELECT su.oid AS subid,
    su.subname,
    st.pid,
    st.relid,
    st.received_lsn,
    st.last_msg_send_time,
    st.last_msg_receipt_time,
    st.latest_end_lsn,
    st.latest_end_time
   FROM pg_subscription su
     LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
pid, received_lsn, last_msg_send_time, last_msg_receipt_time,
latest_end_lsn, latest_end_time) ON st.subid = su.oid;

This view uses the subscription name from the pg_subscription system
catalog. AFAICS no string data managed by the stats collector uses the
name data type; it uses char[NAMEDATALEN] instead. And since I found the
description about the name data type in the doc, I thought it's better to
have it as text. Probably since pg_stat_replication_slots
(pg_stat_get_replication_slots()) is our first view that doesn't use a
system catalog to get the object name, I'm okay with changing to the
name data type, but if we do that it's better to update the description
in the doc as well.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Oct 19, 2020 at 9:04 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 15 Oct 2020 at 17:51, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > AFAICS, we use name data-type in many other similar stats views like
> > pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
> > pg_stat_all_tables. So, shouldn't we be consistent with those views?
> >
>
> > Yes, they have the name data type column but it actually comes from
> system catalogs. For instance, here is the view definition of
> pg_stat_subscription:
>
>  SELECT su.oid AS subid,
>     su.subname,
>     st.pid,
>     st.relid,
>     st.received_lsn,
>     st.last_msg_send_time,
>     st.last_msg_receipt_time,
>     st.latest_end_lsn,
>     st.latest_end_time
>    FROM pg_subscription su
>      LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
> pid, received_lsn, last_msg_send_time, last_msg_receipt_time,
> latest_end_lsn, latest_end_time) ON st.subid = su.oid;
>
> This view uses the subscription name from the pg_subscription system
> catalog. AFAICS no string data managed by the stats collector uses the
> name data type; it uses char[NAMEDATALEN] instead. And since I found the
> description about the name data type in the doc, I thought it's better to
> have it as text.
>

Okay, I see the merit of keeping slot_name as 'text' type. I was
trying to see if we can be consistent, but I guess that is not so
straightforward: if we want all (stat) view definitions to
consistently display/use the 'name' datatype for strings, then ideally
we need to change other places as well. Also, I see that
pg_stat_wal_receiver uses slot_name as 'text', so displaying the same
as 'name' in pg_stat_replication_slots might not be ideal. So, let's
stick with the 'text' data type for slot_name, which means we should
go ahead with the v3 version of the patch [1] posted by Shinoda-San,
right?

[1] -
https://www.postgresql.org/message-id/TU4PR8401MB115297EF936A7675A5644151EE050%40TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Mon, 19 Oct 2020 at 14:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 19, 2020 at 9:04 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 15 Oct 2020 at 17:51, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > AFAICS, we use name data-type in many other similar stats views like
> > > pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
> > > pg_stat_all_tables. So, shouldn't we be consistent with those views?
> > >
> >
> > Yes, they have the name data type column but it actually comes from
> > system catalogs. For instance, here is the view definition of
> > pg_stat_subscription:
> >
> >  SELECT su.oid AS subid,
> >     su.subname,
> >     st.pid,
> >     st.relid,
> >     st.received_lsn,
> >     st.last_msg_send_time,
> >     st.last_msg_receipt_time,
> >     st.latest_end_lsn,
> >     st.latest_end_time
> >    FROM pg_subscription su
> >      LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
> > pid, received_lsn, last_msg_send_time, last_msg_receipt_time,
> > latest_end_lsn, latest_end_time) ON st.subid = su.oid;
> >
> > This view uses the subscription name from the pg_subscription system
> > catalog. AFAICS no string data managed by the stats collector uses the
> > name data type; it uses char[NAMEDATALEN] instead. And since I found the
> > description about the name data type in the doc, I thought it's better to
> > have it as text.
> >
>
> Okay, I see the merit of keeping slot_name as 'text' type. I was
> trying to see if we can be consistent, but I guess that is not so
> straightforward: if we want all (stat) view definitions to
> consistently display/use the 'name' datatype for strings, then ideally
> we need to change other places as well. Also, I see that
> pg_stat_wal_receiver uses slot_name as 'text', so displaying the same
> as 'name' in pg_stat_replication_slots might not be ideal.

Oh, I didn't realize pg_stat_wal_receiver uses text data type.

> So, let's
> stick with the 'text' data type for slot_name, which means we should
> go ahead with the v3 version of the patch [1] posted by Shinoda-San,
> right?

Right. +1

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Mon, Oct 19, 2020 at 1:20 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 19 Oct 2020 at 14:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > So, let's
> > stick with the 'text' data type for slot_name, which means we should
> > go ahead with the v3 version of the patch [1] posted by Shinoda-San,
> > right?
>
> Right. +1
>

I have pushed the patch after updating the test_decoding/sql/stats test.
The corresponding changes in the test were missing.
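
With the rename in place, the test reads the column as slot_name;
roughly (a sketch, the committed test is authoritative):

SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count
FROM pg_stat_replication_slots;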

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Tue, 20 Oct 2020 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 19, 2020 at 1:20 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Mon, 19 Oct 2020 at 14:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > So, let's
> > > stick with the 'text' data type for slot_name, which means we should
> > > go ahead with the v3 version of the patch [1] posted by Shinoda-San,
> > > right?
> >
> > Right. +1
> >
>
> I have pushed the patch after updating the test_decoding/sql/stats test.
> The corresponding changes in the test were missing.

Thank you!

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: Resetting spilled txn statistics in pg_stat_replication

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
> I have pushed the patch after updating the test_decoding/sql/stats test.
> The corresponding changes in the test were missing.

Thank you very much for your help!

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Masahiko Sawada [mailto:masahiko.sawada@2ndquadrant.com] 
Sent: Tuesday, October 20, 2020 9:24 PM
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: Shinoda, Noriyoshi (PN Japan A&PS Delivery) <noriyoshi.shinoda@hpe.com>; Dilip Kumar <dilipbalaut@gmail.com>;
Magnus Hagander <magnus@hagander.net>; Tomas Vondra <tomas.vondra@2ndquadrant.com>; PostgreSQL Hackers
<pgsql-hackers@lists.postgresql.org>; Ajin Cherian <itsajin@gmail.com>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

On Tue, 20 Oct 2020 at 20:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 19, 2020 at 1:20 PM Masahiko Sawada 
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Mon, 19 Oct 2020 at 14:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > So, let's
> > > stick with the 'text' data type for slot_name, which means we should
> > > go ahead with the v3 version of the patch [1] posted by
> > > Shinoda-San, right?
> >
> > Right. +1
> >
>
> I have pushed the patch after updating the test_decoding/sql/stats test.
> The corresponding changes in the test were missing.

Thank you!

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 13, 2020 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> >
> > I know I can go read the source code, but most users will not want to.
> > Is the documentation in monitoring.sgml really sufficient?  If we can't
> > explain this with more precision, is it really a number we want to expose
> > at all?
> >
>
> This counter is important to give users an idea about the amount of
> I/O we incur during decoding and to tune logical_decoding_work_mem
> GUC. So, I would prefer to improve the documentation for this
> variable.
>

I have modified the description of spill_count and spill_txns to make
things clear. Any suggestions?

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Wed, 21 Oct 2020 at 12:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > >
> > > I know I can go read the source code, but most users will not want to.
> > > Is the documentation in monitoring.sgml really sufficient?  If we can't
> > > explain this with more precision, is it really a number we want to expose
> > > at all?
> > >
> >
> > This counter is important to give users an idea about the amount of
> > I/O we incur during decoding and to tune logical_decoding_work_mem
> > GUC. So, I would prefer to improve the documentation for this
> > variable.
> >
>
> I have modified the description of spill_count and spill_txns to make
> things clear. Any suggestions?

Thank you for the patch.

-        logical decoding exceeds
<literal>logical_decoding_work_mem</literal>. The
-        counter gets incremented both for toplevel transactions and
-        subtransactions.
+        logical decoding of changes from WAL for this exceeds
+        <literal>logical_decoding_work_mem</literal>. The counter gets
+        incremented both for toplevel transactions and subtransactions.

What is the word "this" in the above change referring to? How about
something like:

Number of transactions spilled to disk after the memory used by
logical decoding of changes from WAL exceeding
logical_decoding_work_mem. The counter gets incremented both for
toplevel transactions and subtransactions.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Thu, Oct 22, 2020 at 4:09 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 21 Oct 2020 at 12:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > >
> > > > I know I can go read the source code, but most users will not want to.
> > > > Is the documentation in monitoring.sgml really sufficient?  If we can't
> > > > explain this with more precision, is it really a number we want to expose
> > > > at all?
> > > >
> > >
> > > This counter is important to give users an idea about the amount of
> > > I/O we incur during decoding and to tune logical_decoding_work_mem
> > > GUC. So, I would prefer to improve the documentation for this
> > > variable.
> > >
> >
> > I have modified the description of spill_count and spill_txns to make
> > things clear. Any suggestions?
>
> Thank you for the patch.
>
> -        logical decoding exceeds
> <literal>logical_decoding_work_mem</literal>. The
> -        counter gets incremented both for toplevel transactions and
> -        subtransactions.
> +        logical decoding of changes from WAL for this exceeds
> +        <literal>logical_decoding_work_mem</literal>. The counter gets
> +        incremented both for toplevel transactions and subtransactions.
>
> What is the word "this" in the above change referring to?
>

'slot'. The word *slot* is missing in the sentence.

> How about
> something like:
>

> Number of transactions spilled to disk after the memory used by
> logical decoding of changes from WAL exceeding
> logical_decoding_work_mem. The counter gets incremented both for
> toplevel transactions and subtransactions.
>

/exceeding/exceeds. I am fine with your proposed text as well, but
if you like the above text with that correction, it would be better
because it is closer to the spill_count description.

-- 
With Regards,
Amit Kapila.



Re: Resetting spilled txn statistics in pg_stat_replication

From
Masahiko Sawada
Date:
On Thu, 22 Oct 2020 at 20:34, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 22, 2020 at 4:09 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 21 Oct 2020 at 12:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Oct 13, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > >
> > > > >
> > > > > I know I can go read the source code, but most users will not want to.
> > > > > Is the documentation in monitoring.sgml really sufficient?  If we can't
> > > > > explain this with more precision, is it really a number we want to expose
> > > > > at all?
> > > > >
> > > >
> > > > This counter is important to give users an idea about the amount of
> > > > I/O we incur during decoding and to tune logical_decoding_work_mem
> > > > GUC. So, I would prefer to improve the documentation for this
> > > > variable.
> > > >
> > >
> > > I have modified the description of spill_count and spill_txns to make
> > > things clear. Any suggestions?
> >
> > Thank you for the patch.
> >
> > -        logical decoding exceeds
> > <literal>logical_decoding_work_mem</literal>. The
> > -        counter gets incremented both for toplevel transactions and
> > -        subtransactions.
> > +        logical decoding of changes from WAL for this exceeds
> > +        <literal>logical_decoding_work_mem</literal>. The counter gets
> > +        incremented both for toplevel transactions and subtransactions.
> >
> > What is the word "this" in the above change referring to?
> >
>
> 'slot'. The word *slot* is missing in the sentence.
>
> > How about
> > something like:
> >
>
> > Number of transactions spilled to disk after the memory used by
> > logical decoding of changes from WAL exceeding
> > logical_decoding_work_mem. The counter gets incremented both for
> > toplevel transactions and subtransactions.
> >
>
> /exceeding/exceeds. I am fine with your proposed text as well, but
> if you like the above text with that correction, it would be better
> because it is closer to the spill_count description.

yeah, I agree with the correction.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Oct 23, 2020 at 7:42 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 22 Oct 2020 at 20:34, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > I have modified the description of spill_count and spill_txns to make
> > > > things clear. Any suggestions?
> > >
> > > Thank you for the patch.
> > >
> > > -        logical decoding exceeds
> > > <literal>logical_decoding_work_mem</literal>. The
> > > -        counter gets incremented both for toplevel transactions and
> > > -        subtransactions.
> > > +        logical decoding of changes from WAL for this exceeds
> > > +        <literal>logical_decoding_work_mem</literal>. The counter gets
> > > +        incremented both for toplevel transactions and subtransactions.
> > >
> > > What is the word "this" in the above change referring to?
> > >
> >
> > 'slot'. The word *slot* is missing in the sentence.
> >
> > > How about
> > > something like:
> > >
> >
> > > Number of transactions spilled to disk after the memory used by
> > > logical decoding of changes from WAL exceeding
> > > logical_decoding_work_mem. The counter gets incremented both for
> > > toplevel transactions and subtransactions.
> > >
> >
> > /exceeding/exceeds. I am fine with your proposed text as well, but
> > if you like the above text with that correction, it would be better
> > because it is closer to the spill_count description.
>
> yeah, I agree with the correction.
>

Okay, thanks, attached is an updated patch. I'll push this early next
week unless you or someone else has any comments/suggestions.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Fri, Oct 23, 2020 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 23, 2020 at 7:42 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 22 Oct 2020 at 20:34, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > > I have modified the description of spill_count and spill_txns to make
> > > > > things clear. Any suggestions?
> > > >
> > > > Thank you for the patch.
> > > >
> > > > -        logical decoding exceeds
> > > > <literal>logical_decoding_work_mem</literal>. The
> > > > -        counter gets incremented both for toplevel transactions and
> > > > -        subtransactions.
> > > > +        logical decoding of changes from WAL for this exceeds
> > > > +        <literal>logical_decoding_work_mem</literal>. The counter gets
> > > > +        incremented both for toplevel transactions and subtransactions.
> > > >
> > > > What is the word "this" in the above change referring to?
> > > >
> > >
> > > 'slot'. The word *slot* is missing in the sentence.
> > >
> > > > How about
> > > > something like:
> > > >
> > >
> > > > Number of transactions spilled to disk after the memory used by
> > > > logical decoding of changes from WAL exceeding
> > > > logical_decoding_work_mem. The counter gets incremented both for
> > > > toplevel transactions and subtransactions.
> > > >
> > >
> > > /exceeding/exceeds. I am fine with your proposed text as well, but
> > > if you like the above text with that correction, it would be better
> > > because it is closer to the spill_count description.
> >
> > yeah, I agree with the correction.
> >
>
> Okay, thanks, attached is an updated patch.
>

While updating the streaming stats patch, it occurred to me that we
can write a better description of spill_bytes as well. The attached
contains the update to the spill_bytes description.
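
As a side note, a derived figure such as average bytes per spill can
make these counters easier to read (a sketch, assuming the
pg_stat_replication_slots view):

    SELECT slot_name,
           spill_bytes / NULLIF(spill_count, 0) AS avg_bytes_per_spill
    FROM pg_stat_replication_slots;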

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Justin Pryzby
Date:
On Fri, Oct 23, 2020 at 10:45:34AM +0530, Amit Kapila wrote:
> On Fri, Oct 23, 2020 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Oct 23, 2020 at 7:42 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Thu, 22 Oct 2020 at 20:34, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > > > I have modified the description of spill_count and spill_txns to make
> > > > > > things clear. Any suggestions?
> > > > >
> > > > > Thank you for the patch.
> > > > >
> > > > > -        logical decoding exceeds
> > > > > <literal>logical_decoding_work_mem</literal>. The
> > > > > -        counter gets incremented both for toplevel transactions and
> > > > > -        subtransactions.
> > > > > +        logical decoding of changes from WAL for this exceeds
> > > > > +        <literal>logical_decoding_work_mem</literal>. The counter gets
> > > > > +        incremented both for toplevel transactions and subtransactions.
> > > > >
> > > > > What is the word "this" in the above change referring to?
> > > > >
> > > >
> > > > 'slot'. The word *slot* is missing in the sentence.
> > > >
> > > > > How about
> > > > > something like:
> > > > >
> > > >
> > > > > Number of transactions spilled to disk after the memory used by
> > > > > logical decoding of changes from WAL exceeding
> > > > > logical_decoding_work_mem. The counter gets incremented both for
> > > > > toplevel transactions and subtransactions.
> > > > >
> > > >
> > > > /exceeding/exceeds. I am fine with your proposed text as well, but
> > > > if you like the above text with that correction, it would be better
> > > > because it is closer to the spill_count description.
> > >
> > > yeah, I agree with the correction.
> > >
> >
> > Okay, thanks, attached is an updated patch.
> >
> 
> While updating the streaming stats patch, it occurred to me that we
> can write a better description of spill_bytes as well. The attached
> contains the update to the spill_bytes description.

+        This and other spill
+        counters can be used to gauge the I/O occurred during logical decoding
+        and accordingly can tune <literal>logical_decoding_work_mem</literal>.

"gauge the IO occurred" is wrong.
Either: I/O *which* occurred, or I/O occurring, or occurs.

"can tune" should say "allow tuning".

Like:
+        This and other spill
+        counters can be used to gauge the I/O which occurs during logical decoding
+        and accordingly allow tuning of <literal>logical_decoding_work_mem</literal>.

-        Number of times transactions were spilled to disk. Transactions
+        Number of times transactions were spilled to disk while performing
+        decoding of changes from WAL for this slot. Transactions

What about: "..while decoding changes.." (remove "performing" and "of").
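
On the tuning point itself, the workflow that sentence implies is
roughly the following (a sketch only; the 64MB value is an arbitrary
illustration, not a recommendation):

    -- If the spill counters keep climbing, give logical decoding more
    -- memory so fewer transactions spill to disk:
    ALTER SYSTEM SET logical_decoding_work_mem = '64MB';
    SELECT pg_reload_conf();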

-- 
Justin



Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Tue, Oct 27, 2020 at 8:51 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Oct 23, 2020 at 10:45:34AM +0530, Amit Kapila wrote:
> > On Fri, Oct 23, 2020 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > While updating the streaming stats patch, it occurred to me that we
> > can write a better description of spill_bytes as well. The attached
> > contains the update to the spill_bytes description.
>
> +        This and other spill
> +        counters can be used to gauge the I/O occurred during logical decoding
> +        and accordingly can tune <literal>logical_decoding_work_mem</literal>.
>
> "gauge the IO occurred" is wrong.
> Either: I/O *which* occurred, or I/O occurring, or occurs.
>
> "can tune" should say "allow tuning".
>
> Like:
> +        This and other spill
> +        counters can be used to gauge the I/O which occurs during logical decoding
> +        and accordingly allow tuning of <literal>logical_decoding_work_mem</literal>.
>
> -        Number of times transactions were spilled to disk. Transactions
> +        Number of times transactions were spilled to disk while performing
> +        decoding of changes from WAL for this slot. Transactions
>
> What about: "..while decoding changes.." (remove "performing" and "of").
>

All of your suggestions sound good to me. Please find attached a
patch that updates the docs accordingly.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Resetting spilled txn statistics in pg_stat_replication

From
Amit Kapila
Date:
On Wed, Oct 28, 2020 at 5:26 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Tue, Oct 27, 2020 at 06:19:42PM -0500, Justin Pryzby wrote:
> >On Tue, Oct 27, 2020 at 09:17:43AM +0530, Amit Kapila wrote:
> >> On Tue, Oct 27, 2020 at 8:51 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> >> >
> >> > On Fri, Oct 23, 2020 at 10:45:34AM +0530, Amit Kapila wrote:
> >> > > On Fri, Oct 23, 2020 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> > >
> >> > > While updating the streaming stats patch, it occurred to me that we
> >> > > can write a better description of spill_bytes as well. The attached
> >> > > contains the update to the spill_bytes description.
> >> >
> >> > +        This and other spill
> >> > +        counters can be used to gauge the I/O occurred during logical decoding
> >> > +        and accordingly can tune <literal>logical_decoding_work_mem</literal>.
> >> >
> >> > "gauge the IO occurred" is wrong.
> >> > Either: I/O *which* occurred, or I/O occurring, or occurs.
> >> >
> >> > "can tune" should say "allow tuning".
> >> >
> >> > Like:
> >> > +        This and other spill
> >> > +        counters can be used to gauge the I/O which occurs during logical decoding
> >> > +        and accordingly allow tuning of <literal>logical_decoding_work_mem</literal>.
> >> >
> >> > -        Number of times transactions were spilled to disk. Transactions
> >> > +        Number of times transactions were spilled to disk while performing
> >> > +        decoding of changes from WAL for this slot. Transactions
> >> >
> >> > What about: "..while decoding changes.." (remove "performing" and "of").
> >> >
> >>
> >> All of your suggestions sound good to me. Please find attached a
> >> patch that updates the docs accordingly.
> >
> >@@ -2628,8 +2627,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
> >        <para>
> >         Amount of decoded transaction data spilled to disk while performing
> >         decoding of changes from WAL for this slot. This and other spill
> >-        counters can be used to gauge the I/O occurred during logical decoding
> >-        and accordingly can tune <literal>logical_decoding_work_mem</literal>.
> >+        counters can be used to gauge the I/O which occurred during logical
> >+        decoding and accordingly allow tuning  <literal>logical_decoding_work_mem</literal>.
> >
> >Now that I look again, maybe remove "accordingly"?
> >
>
> Yeah, the 'accordingly' seems rather unnecessary here. Let's remove it.
>

Removed and pushed.
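
For completeness, the counters whose description was just pushed can
be cleared per slot with the reset function added earlier in this
thread (a usage sketch; 'test_slot' is only an example name):

    SELECT pg_stat_reset_replication_slot('test_slot');
    -- Passing NULL should reset the statistics for all slots.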

-- 
With Regards,
Amit Kapila.