Thread: Replication slot stats misgivings

Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

I started to write this as a reply to
https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
but I think it doesn't really fit under that header anymore.

On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> It does make it easier for the shared memory stats patch, because if
> there's a fixed number + location, the relevant stats reporting doesn't
> need to go through a hashtable with the associated locking.  I guess
> that may have colored my perception that it's better to just have a
> statically sized memory allocation for this.  Noteworthy that SLRU stats
> are done in a fixed size allocation as well...

As part of reviewing the replication slot stats patch I looked at
replication slot stats a fair bit, and I've a few misgivings. First,
about the pgstat.c side of things:

- If somehow slot stat drop messages got lost (remember pgstat
  communication is lossy!), we'll just stop maintaining stats for slots
  created later, because there'll eventually be no space for keeping
  stats for another slot.

- If max_replication_slots was lowered between a restart,
  pgstat_read_statfile() will happily write beyond the end of
  replSlotStats.

- pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
  think pgstat.c has absolutely no business doing things on that level.

- We do a linear search through all replication slots whenever receiving
  stats for a slot, even though there's a perfectly good index we could
  use throughout - the slot's index itself. It looks to me like slot
  stat reports can be fairly frequent in some workloads, so that
  doesn't seem great.

- PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?

- pgstat_report_replslot() already has a lot of stats parameters, it
  seems likely that we'll get more. Seems like we should just use a
  struct of stats updates.
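
Just to illustrate, something like the below (the struct and field names
are placeholders for this sketch, using NameData per the point above; not
actual patch contents):

/*
 * Hypothetical sketch: a single struct carrying all counter updates, so
 * pgstat_report_replslot() doesn't need an ever-growing parameter list.
 */
typedef struct PgStat_ReplSlotUpdate
{
    NameData    slotname;       /* NameData rather than char[NAMEDATALEN] */
    PgStat_Counter spill_txns;
    PgStat_Counter spill_count;
    PgStat_Counter spill_bytes;
    PgStat_Counter stream_txns;
    PgStat_Counter stream_count;
    PgStat_Counter stream_bytes;
} PgStat_ReplSlotUpdate;

extern void pgstat_report_replslot(const PgStat_ReplSlotUpdate *update);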


And then more generally about the feature:
- If a slot was used to stream out a large amount of changes (say an
  initial data load), but then replication is interrupted before the
  transaction is committed/aborted, stream_bytes will not reflect the
  many gigabytes of data we may have sent.
- It seems weird that we went to the trouble of inventing replication
  slot stats, but then limit them to logical slots, and even there don't
  record the obvious things like the total amount of data sent.


I think the best way to address the more fundamental "pgstat related"
complaints is to change how replication slot stats are
"addressed". Instead of using the slots name, report stats using the
index in ReplicationSlotCtl->replication_slots.

That removes the risk of running out of "replication slot stat slots":
If we lose a drop message, the index eventually will be reused and we
likely can detect that the stats were for a different slot by comparing
the slot name.
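
Roughly, the receive side could then look something like this (the message
fields, the replSlotStats layout and the helper below are illustrative
assumptions, not existing code):

/*
 * Sketch only: stats addressed by slot index; the name is kept just to
 * notice that an index has been reused by a different slot.
 */
static void
pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
{
    int         idx = msg->m_index;     /* index into
                                         * ReplicationSlotCtl->replication_slots */

    if (idx < 0 || idx >= max_replication_slots)
        return;                 /* bogus or stale message, ignore */

    /*
     * A name mismatch means the index was reused after a lost drop
     * message; start from zero rather than mixing two slots' stats.
     */
    if (namestrcmp(&replSlotStats[idx].slotname, NameStr(msg->m_slotname)) != 0)
    {
        memset(&replSlotStats[idx], 0, sizeof(replSlotStats[idx]));
        namestrcpy(&replSlotStats[idx].slotname, NameStr(msg->m_slotname));
    }

    replSlotStats[idx].spill_bytes += msg->m_spill_bytes;
    /* ... accumulate the remaining counters the same way ... */
}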

It also makes it easy to handle the issue of max_replication_slots being
lowered and there still being stats for a slot - we simply can skip
restoring that slot's data, because we know the relevant slot can't exist
anymore. And we can make the initial pgstat_report_replslot() during
slot creation use a

I'm wondering if we should just remove the slot name entirely from the
pgstat.c side of things, and have pg_stat_get_replication_slots()
inquire about slots by index as well and get the list of slots to report
stats for from slot.c infrastructure.

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
>
> And then more generally about the feature:
> - If a slot was used to stream out a large amount of changes (say an
>   initial data load), but then replication is interrupted before the
>   transaction is committed/aborted, stream_bytes will not reflect the
>   many gigabytes of data we may have sent.
>

We can probably update the stats each time we spilled or streamed the
transaction data but it was not clear at that stage whether or how
much it will be useful.

> - It seems weird that we went to the trouble of inventing replication
>   slot stats, but then limit them to logical slots, and even there don't
>   record the obvious things like the total amount of data sent.
>

Won't spill_bytes and stream_bytes give you the amount of data sent?

>
> I think the best way to address the more fundamental "pgstat related"
> complaints is to change how replication slot stats are
> "addressed". Instead of using the slots name, report stats using the
> index in ReplicationSlotCtl->replication_slots.
>
> That removes the risk of running out of "replication slot stat slots":
> If we lose a drop message, the index eventually will be reused and we
> likely can detect that the stats were for a different slot by comparing
> the slot name.
>

This idea is worth exploring to address the complaints but what do we
do when we detect that the stats are from a different slot? It has
a mix of stats from the old and new slot. We probably need to reset it
after we detect that. What if, at some frequency (say whenever we
run out of indexes), we check whether the slots we are maintaining in
pgstat.c have some stale slot entry (entry exists but the actual slot
is dropped)?

> It also makes it easy to handle the issue of max_replication_slots being
> lowered and there still being stats for a slot - we simply can skip
> restoring that slot's data, because we know the relevant slot can't exist
> anymore. And we can make the initial pgstat_report_replslot() during
> slot creation use a
>

Here, your last sentence seems to be incomplete.

> I'm wondering if we should just remove the slot name entirely from the
> pgstat.c side of things, and have pg_stat_get_replication_slots()
> inquire about slots by index as well and get the list of slots to report
> stats for from slot.c infrastructure.
>

But how will you detect in your idea that some of the stats are from the
already dropped slot?

I'll create an entry for this in the PG14 Open Items wiki.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > And then more generally about the feature:
> > - If a slot was used to stream out a large amount of changes (say an
> >   initial data load), but then replication is interrupted before the
> >   transaction is committed/aborted, stream_bytes will not reflect the
> >   many gigabytes of data we may have sent.
> >
>
> We can probably update the stats each time we spilled or streamed the
> transaction data but it was not clear at that stage whether or how
> much it will be useful.
>
> > - It seems weird that we went to the trouble of inventing replication
> >   slot stats, but then limit them to logical slots, and even there don't
> >   record the obvious things like the total amount of data sent.
> >
>
> Won't spill_bytes and stream_bytes give you the amount of data sent?
>
> >
> > I think the best way to address the more fundamental "pgstat related"
> > complaints is to change how replication slot stats are
> > "addressed". Instead of using the slots name, report stats using the
> > index in ReplicationSlotCtl->replication_slots.
> >
> > That removes the risk of running out of "replication slot stat slots":
> > If we lose a drop message, the index eventually will be reused and we
> > likely can detect that the stats were for a different slot by comparing
> > the slot name.
> >
>
> This idea is worth exploring to address the complaints but what do we
> do when we detect that the stats are from a different slot? It has
> a mix of stats from the old and new slot. We probably need to reset it
> after we detect that.
>

What if the user created a slot with the same name after dropping the
slot and it used the same index? I think the chances are low but
still a possibility, but maybe that is okay.

> What if, at some frequency (say whenever we
> run out of indexes), we check whether the slots we are maintaining in
> pgstat.c have some stale slot entry (entry exists but the actual slot
> is dropped)?
>

A similar drawback (the user created a slot with the same name after
dropping it) exists with this as well.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-20 09:25:40 +0530, Amit Kapila wrote:
> On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > And then more generally about the feature:
> > - If a slot was used to stream out a large amount of changes (say an
> >   initial data load), but then replication is interrupted before the
> >   transaction is committed/aborted, stream_bytes will not reflect the
> >   many gigabytes of data we may have sent.
> >
> 
> We can probably update the stats each time we spilled or streamed the
> transaction data but it was not clear at that stage whether or how
> much it will be useful.

It seems like the obvious answer here is to sync stats when releasing
the slot?
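
Something like this, perhaps (the helper is made up; UpdateDecodingStats()
is the existing reporting routine in logical.c, and the real call site
would need to handle error paths too):

#include "replication/logical.h"

/*
 * Sketch only: flush whatever counters are still pending in the decoding
 * context before the process lets go of the slot, so a large transaction
 * that never committed still shows up in the slot's stats.
 */
static void
FlushDecodingStatsBeforeRelease(LogicalDecodingContext *ctx)
{
    if (ctx != NULL)
        UpdateDecodingStats(ctx);   /* reports the accumulated spill/stream counters */
}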


> > - It seems weird that we went to the trouble of inventing replication
> >   slot stats, but then limit them to logical slots, and even there don't
> >   record the obvious things like the total amount of data sent.
> >
> 
> Won't spill_bytes and stream_bytes give you the amount of data sent?

I don't think either tracks changes that were neither spilled nor
streamed? And if they are, they're terribly misnamed?

> >
> > I think the best way to address the more fundamental "pgstat related"
> > complaints is to change how replication slot stats are
> > "addressed". Instead of using the slots name, report stats using the
> > index in ReplicationSlotCtl->replication_slots.
> >
> > That removes the risk of running out of "replication slot stat slots":
> > If we lose a drop message, the index eventually will be reused and we
> > likely can detect that the stats were for a different slot by comparing
> > the slot name.
> >
> 
> This idea is worth exploring to address the complaints but what do we
> do when we detect that the stats are from the different slot?

I think it's pretty easy to make that bulletproof. Add a
pgstat_report_replslot_create(), and use that in
ReplicationSlotCreate(). That is called with
ReplicationSlotAllocationLock held, so it can just safely zero out stats.

I don't think:

> It has a mix of stats from the old and new slot.

Can happen in that scenario.
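
To sketch the call site I have in mind (pgstat_report_replslot_create() is
the proposed, not-yet-existing function; the helper and its placement are
illustrative, assuming slot.h, lwlock.h and pgstat.h):

/*
 * Sketch only: report the new slot's index while still holding
 * ReplicationSlotAllocationLock, so no other backend can be handing out
 * the same index concurrently while the collector zeroes its stats.
 */
static void
ReportSlotCreateToStats(ReplicationSlot *slot)
{
    int         index = slot - ReplicationSlotCtl->replication_slots;

    Assert(LWLockHeldByMe(ReplicationSlotAllocationLock));

    pgstat_report_replslot_create(index, NameStr(slot->data.name));
}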


> > It also makes it easy to handle the issue of max_replication_slots being
> > lowered and there still being stats for a slot - we simply can skip
> > restoring that slot's data, because we know the relevant slot can't exist
> > anymore. And we can make the initial pgstat_report_replslot() during
> > slot creation use a
> >
> 
> Here, your last sentence seems to be incomplete.

Oops, I was planning to suggest adding pgstat_report_replslot_create()
that zeroes out the pre-existing stats (or a parameter to
pgstat_report_replslot(), but I don't think that's better).
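
On the collector side that could be as simple as (the message type and
replSlotStats details are assumptions for this sketch):

/*
 * Sketch only: on a "slot created" message, throw away whatever an older
 * slot may have left behind under the same index.
 */
static void
pgstat_recv_replslot_create(PgStat_MsgReplSlotCreate *msg, int len)
{
    int         idx = msg->m_index;

    if (idx < 0 || idx >= max_replication_slots)
        return;

    memset(&replSlotStats[idx], 0, sizeof(replSlotStats[idx]));
    namestrcpy(&replSlotStats[idx].slotname, NameStr(msg->m_slotname));
}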


> > I'm wondering if we should just remove the slot name entirely from the
> > pgstat.c side of things, and have pg_stat_get_replication_slots()
> > inquire about slots by index as well and get the list of slots to report
> > stats for from slot.c infrastructure.
> >
> 
> But how will you detect in your idea that some of the stats are from the
> already dropped slot?

I don't think that is possible with my sketch?

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > This idea is worth exploring to address the complaints but what do we
> > do when we detect that the stats are from a different slot? It has
> > a mix of stats from the old and new slot. We probably need to reset it
> > after we detect that.
> >
> 
> What if the user created a slot with the same name after dropping the
> slot and it used the same index? I think the chances are low but
> still a possibility, but maybe that is okay.
> 
> > What if, at some frequency (say whenever we
> > run out of indexes), we check whether the slots we are maintaining in
> > pgstat.c have some stale slot entry (entry exists but the actual slot
> > is dropped)?
> >
> 
> A similar drawback (the user created a slot with the same name after
> dropping it) exists with this as well.

pgstat_report_replslot_drop() already prevents that, no?

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sun, Mar 21, 2021 at 2:57 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> > On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > This idea is worth exploring to address the complaints but what do we
> > > do when we detect that the stats are from a different slot? It has
> > > a mix of stats from the old and new slot. We probably need to reset it
> > > after we detect that.
> > >
> >
> > What if the user created a slot with the same name after dropping the
> > slot and it used the same index? I think the chances are low but
> > still a possibility, but maybe that is okay.
> >
> > > What if, at some frequency (say whenever we
> > > run out of indexes), we check whether the slots we are maintaining in
> > > pgstat.c have some stale slot entry (entry exists but the actual slot
> > > is dropped)?
> > >
> >
> > A similar drawback (the user created a slot with the same name after
> > dropping it) exists with this as well.
>
> pgstat_report_replslot_drop() already prevents that, no?
>

Yeah, normally it would prevent that but what if a drop message is lost?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sun, Mar 21, 2021 at 2:56 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-03-20 09:25:40 +0530, Amit Kapila wrote:
> > On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > And then more generally about the feature:
> > > - If a slot was used to stream out a large amount of changes (say an
> > >   initial data load), but then replication is interrupted before the
> > >   transaction is committed/aborted, stream_bytes will not reflect the
> > >   many gigabytes of data we may have sent.
> > >
> >
> > We can probably update the stats each time we spilled or streamed the
> > transaction data but it was not clear at that stage whether or how
> > much it will be useful.
>
> It seems like the obvious answer here is to sync stats when releasing
> the slot?
>

Okay, that makes sense.

>
> > > - It seems weird that we went to the trouble of inventing replication
> > >   slot stats, but then limit them to logical slots, and even there don't
> > >   record the obvious things like the total amount of data sent.
> > >
> >
> > Won't spill_bytes and stream_bytes give you the amount of data sent?
>
> I don't think either tracks changes that were neither spilled nor
> streamed? And if they are, they're terribly misnamed?
>

Right, it won't track such changes, but we can track that as well and I
understand it would be good to track that information. I think we were
so focused on stats for the newly introduced features that we forgot
about the non-spilled and non-streamed xacts.

Note - I have now created an entry for this in PG14 Open Items [1].

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_14_Open_Items

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-21 16:08:00 +0530, Amit Kapila wrote:
> On Sun, Mar 21, 2021 at 2:57 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> > > On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > This idea is worth exploring to address the complaints but what do we
> > > > do when we detect that the stats are from a different slot? It has
> > > > a mix of stats from the old and new slot. We probably need to reset it
> > > > after we detect that.
> > > >
> > >
> > > What if the user created a slot with the same name after dropping the
> > > slot and it used the same index? I think the chances are low but
> > > still a possibility, but maybe that is okay.
> > >
> > > > What if, at some frequency (say whenever we
> > > > run out of indexes), we check whether the slots we are maintaining in
> > > > pgstat.c have some stale slot entry (entry exists but the actual slot
> > > > is dropped)?
> > > >
> > >
> > > A similar drawback (the user created a slot with the same name after
> > > dropping it) exists with this as well.
> >
> > pgstat_report_replslot_drop() already prevents that, no?
> >
> 
> Yeah, normally it would prevent that but what if a drop message is lost?

That already exists as a danger, no? pgstat_recv_replslot() uses
pgstat_replslot_index() to find the slot by name. So if a drop message
is lost we'd potentially accumulate into stats of an older slot.  It'd
probably be a lower risk with what I suggested, because the initial stat
report from slot.c would use something like pgstat_report_replslot_create(),
which the stats collector can use to reset the stats to 0?

If we do it right the lossiness will be removed via shared memory stats
patch... But architecturally the name-based lookup and unpredictable
number of stats don't fit in super well.

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Mar 22, 2021 at 3:10 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-03-21 16:08:00 +0530, Amit Kapila wrote:
> > On Sun, Mar 21, 2021 at 2:57 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> > > > On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > This idea is worth exploring to address the complaints but what do we
> > > > > do when we detect that the stats are from a different slot? It has
> > > > > a mix of stats from the old and new slot. We probably need to reset it
> > > > > after we detect that.
> > > > >
> > > >
> > > > What if the user created a slot with the same name after dropping the
> > > > slot and it used the same index? I think the chances are low but
> > > > still a possibility, but maybe that is okay.
> > > >
> > > > > What if, at some frequency (say whenever we
> > > > > run out of indexes), we check whether the slots we are maintaining in
> > > > > pgstat.c have some stale slot entry (entry exists but the actual slot
> > > > > is dropped)?
> > > > >
> > > >
> > > > A similar drawback (the user created a slot with the same name after
> > > > dropping it) exists with this as well.
> > >
> > > pgstat_report_replslot_drop() already prevents that, no?
> > >
> >
> > Yeah, normally it would prevent that but what if a drop message is lost?
>
> That already exists as a danger, no? pgstat_recv_replslot() uses
> pgstat_replslot_index() to find the slot by name. So if a drop message
> is lost we'd potentially accumulate into stats of an older slot.  It'd
> probably be a lower risk with what I suggested, because the initial stat
> report from slot.c would use something like pgstat_report_replslot_create(),
> which the stats collector can use to reset the stats to 0?
>

Okay, but I guess if we miss the create message as well then we will
have a similar danger. I think the benefit your idea will bring is to
use index-based lookup instead of name-based lookup. IIRC, we
initially used the name here because we thought there is nothing like
an OID for slots, but your suggestion of using the index in
ReplicationSlotCtl->replication_slots can address that.

> If we do it right the lossiness will be removed via shared memory stats
> patch...
>

Okay.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
>
> - If max_replication_slots was lowered between a restart,
>   pgstat_read_statfile() will happily write beyond the end of
>   replSlotStats.

I think we cannot restart the server after lowering
max_replication_slots to a value less than the number of replication
slots actually created on the server. No?

>
> - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
>   think pgstat.c has absolutely no business doing things on that level.

Agreed.

>
> - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?

That's because we followed other definitions in pgstat.h that use
char[NAMEDATALEN]. I'm okay with using NameData.

>
> - pgstat_report_replslot() already has a lot of stats parameters, it
>   seems likely that we'll get more. Seems like we should just use a
>   struct of stats updates.

Agreed.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > - If max_replication_slots was lowered between a restart,
> >   pgstat_read_statfile() will happily write beyond the end of
> >   replSlotStats.
>
> I think we cannot restart the server after lowering
> max_replication_slots to a value less than the number of replication
> slots actually created on the server. No?

This problem happens in the case where max_replication_slots is
lowered and there still are stats for a slot.

I understood the risk of running out of replSlotStats. If we use the
index in replSlotStats instead, IIUC we need to somehow synchronize
the indexes in between replSlotStats and
ReplicationSlotCtl->replication_slots. The order of replSlotStats is
preserved across restarting whereas the order of
ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
StartupReplicationSlots() doesn’t guarantee the order of the returned
entries in the directory). Maybe we can compare the slot name in the
received message to the name in the element of replSlotStats. If they
don’t match, we swap entries in replSlotStats to synchronize the index
of the replication slot in ReplicationSlotCtl->replication_slots and
replSlotStats. If we cannot find the entry in replSlotStats that has
the name in the received message, it probably means either it's a new
slot or the previous create message is dropped, we can create the new
stats for the slot. Is that what you mean, Andres?
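
In code, the loose synchronization I have in mind would be roughly the
following (the helper and the field layout are illustrative only, and
assume the NameData change discussed earlier):

/*
 * Sketch only: make replSlotStats[idx] correspond to the slot named in the
 * incoming message, swapping or (re)creating the entry as needed.
 */
static int
pgstat_replslot_sync_index(int idx, const char *slotname)
{
    int         i;

    /* fast path: the entry at the reported index already matches */
    if (namestrcmp(&replSlotStats[idx].slotname, slotname) == 0)
        return idx;

    /* the name lives at another index: swap it into place */
    for (i = 0; i < nReplSlotStats; i++)
    {
        if (namestrcmp(&replSlotStats[i].slotname, slotname) == 0)
        {
            PgStat_ReplSlotStats tmp = replSlotStats[idx];

            replSlotStats[idx] = replSlotStats[i];
            replSlotStats[i] = tmp;
            return idx;
        }
    }

    /* not found: a new slot, or its create message was lost */
    memset(&replSlotStats[idx], 0, sizeof(replSlotStats[idx]));
    namestrcpy(&replSlotStats[idx].slotname, slotname);
    return idx;
}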


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > - If max_replication_slots was lowered between a restart,
> > >   pgstat_read_statfile() will happily write beyond the end of
> > >   replSlotStats.
> >
> > I think we cannot restart the server after lowering
> > max_replication_slots to a value less than the number of replication
> > slots actually created on the server. No?
>
> This problem happens in the case where max_replication_slots is
> lowered and there still are stats for a slot.
>

I think this can happen only if the drop message is lost, right?

> I understood the risk of running out of replSlotStats. If we use the
> index in replSlotStats instead, IIUC we need to somehow synchronize
> the indexes in between replSlotStats and
> ReplicationSlotCtl->replication_slots. The order of replSlotStats is
> preserved across restarting whereas the order of
> ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
> StartupReplicationSlots() doesn’t guarantee the order of the returned
> entries in the directory). Maybe we can compare the slot name in the
> received message to the name in the element of replSlotStats. If they
> don’t match, we swap entries in replSlotStats to synchronize the index
> of the replication slot in ReplicationSlotCtl->replication_slots and
> replSlotStats. If we cannot find the entry in replSlotStats that has
> the name in the received message, it probably means either it's a new
> slot or the previous create message is dropped, we can create the new
> stats for the slot. Is that what you mean, Andres?
>

I wonder how in this scheme, we will remove the risk of running out of
'replSlotStats' and still restore correct stats assuming the drop
message is lost? Do we want to check after restoring each slot info
whether the slot with that name exists?


--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Mar 23, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > - If max_replication_slots was lowered between a restart,
> > > >   pgstat_read_statfile() will happily write beyond the end of
> > > >   replSlotStats.
> > >
> > > I think we cannot restart the server after lowering
> > > max_replication_slots to a value less than the number of replication
> > > slots actually created on the server. No?
> >
> > This problem happens in the case where max_replication_slots is
> > lowered and there still are stats for a slot.
> >
>
> I think this can happen only if the drop message is lost, right?

Yes, I think you're right. In that case, the stats file could have
more slots statistics than the lowered max_replication_slots.

>
> > I understood the risk of running out of replSlotStats. If we use the
> > index in replSlotStats instead, IIUC we need to somehow synchronize
> > the indexes in between replSlotStats and
> > ReplicationSlotCtl->replication_slots. The order of replSlotStats is
> > preserved across restarting whereas the order of
> > ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
> > StartupReplicationSlots() doesn’t guarantee the order of the returned
> > entries in the directory). Maybe we can compare the slot name in the
> > received message to the name in the element of replSlotStats. If they
> > don’t match, we swap entries in replSlotStats to synchronize the index
> > of the replication slot in ReplicationSlotCtl->replication_slots and
> > replSlotStats. If we cannot find the entry in replSlotStats that has
> > the name in the received message, it probably means either it's a new
> > slot or the previous create message is dropped, we can create the new
> > stats for the slot. Is that what you mean, Andres?
> >
>
> I wonder how in this scheme, we will remove the risk of running out of
> 'replSlotStats' and still restore correct stats assuming the drop
> message is lost? Do we want to check after restoring each slot info
> whether the slot with that name exists?

Yeah, I think we need such a check at least if the number of slot
stats in the stats file is larger than max_replication_slots. Or we
can do that at every startup to remove orphaned slot stats.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-23 23:37:14 +0900, Masahiko Sawada wrote:
> On Tue, Mar 23, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> > > > >
> > > > > - If max_replication_slots was lowered between a restart,
> > > > >   pgstat_read_statfile() will happily write beyond the end of
> > > > >   replSlotStats.
> > > >
> > > > I think we cannot restart the server after lowering
> > > > max_replication_slots to a value less than the number of replication
> > > > slots actually created on the server. No?
> > >
> > > This problem happens in the case where max_replication_slots is
> > > lowered and there still are stats for a slot.
> > >
> >
> > I think this can happen only if the drop message is lost, right?
> 
> Yes, I think you're right. In that case, the stats file could have
> more slots statistics than the lowered max_replication_slots.

Or if slots are deleted on the file-system while the cluster is
shut down. Which obviously is at best a semi-supported thing, but it
normally does work.


> > > I understood the risk of running out of replSlotStats. If we use the
> > > index in replSlotStats instead, IIUC we need to somehow synchronize
> > > the indexes in between replSlotStats and
> > > ReplicationSlotCtl->replication_slots. The order of replSlotStats is
> > > preserved across restarting whereas the order of
> > > ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
> > > StartupReplicationSlots() doesn’t guarantee the order of the returned
> > > entries in the directory).

Very good point. Even if readdir() order were fixed, we'd still have the
problem because there can be "gaps" in the indexes for slots
(e.g. create slot_a, create slot_b, create slot_c, drop slot_b, leaving
you with index 0 and 2 used, and 1 unused).


> > > Maybe we can compare the slot name in the
> > > received message to the name in the element of replSlotStats. If they
> > > don’t match, we swap entries in replSlotStats to synchronize the index
> > > of the replication slot in ReplicationSlotCtl->replication_slots and
> > > replSlotStats. If we cannot find the entry in replSlotStats that has
> > > the name in the received message, it probably means either it's a new
> > > slot or the previous create message is dropped, we can create the new
> > > stats for the slot. Is that what you mean, Andres?

That doesn't seem great. Slot names are imo a poor identifier for
something happening asynchronously. The stats collector regularly
doesn't process incoming messages for periods of time because it is busy
writing out the stats file. That's also when messages to it are most
likely to be dropped (likely because the incoming buffer is full).

Perhaps we could have RestoreSlotFromDisk() send something to the stats
collector ensuring the mapping makes sense?

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Mar 23, 2021 at 10:54 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-03-23 23:37:14 +0900, Masahiko Sawada wrote:
>
> > > > Maybe we can compare the slot name in the
> > > > received message to the name in the element of replSlotStats. If they
> > > > don’t match, we swap entries in replSlotStats to synchronize the index
> > > > of the replication slot in ReplicationSlotCtl->replication_slots and
> > > > replSlotStats. If we cannot find the entry in replSlotStats that has
> > > > the name in the received message, it probably means either it's a new
> > > > slot or the previous create message is dropped, we can create the new
> > > > stats for the slot. Is that what you mean, Andres?
>
> That doesn't seem great. Slot names are imo a poor identifier for
> something happening asynchronously. The stats collector regularly
> doesn't process incoming messages for periods of time because it is busy
> writing out the stats file. That's also when messages to it are most
> likely to be dropped (likely because the incoming buffer is full).
>

Leaving aside restart case, without some sort of such sanity checking,
if both drop (of old slot) and create (of new slot) messages are lost
then we will start accumulating stats in old slots. However, if only
one of them is lost then there won't be any such problem.

> Perhaps we could have RestoreSlotFromDisk() send something to the stats
> collector ensuring the mapping makes sense?
>

Say if we send just the index location of each slot then probably we
can setup replSlotStats. Now say before the restart if one of the drop
messages was missed (by stats collector) and that happens to be at
some middle location, then we would end up restoring some already
dropped slot, leaving some of the still required ones. However, if
there is some sanity identifier like name along with the index, then I
think that would have worked for such a case.

I think it would have been easier if we had some OID type of
identifier for each slot. But, without that, maybe the combination of the
index location in ReplicationSlotCtl->replication_slots and the slot name
can reduce the chances of slot stats going wrong, even if not to zero.
If not name, do we have anything else in a slot that can be used for
some sort of sanity checking?
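
For instance, the restore message could carry both pieces (the type and
fields below are hypothetical, just to illustrate what RestoreSlotFromDisk()
might send):

typedef struct PgStat_MsgReplSlotRestore
{
    PgStat_MsgHdr m_hdr;
    int         m_index;        /* index in ReplicationSlotCtl->replication_slots */
    NameData    m_slotname;     /* lets the collector sanity-check its entry */
} PgStat_MsgReplSlotRestore;

/* sent once per restored slot from RestoreSlotFromDisk() during startup */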

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Mar 23, 2021 at 10:54 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2021-03-23 23:37:14 +0900, Masahiko Sawada wrote:
> >
> > > > > Maybe we can compare the slot name in the
> > > > > received message to the name in the element of replSlotStats. If they
> > > > > don’t match, we swap entries in replSlotStats to synchronize the index
> > > > > of the replication slot in ReplicationSlotCtl->replication_slots and
> > > > > replSlotStats. If we cannot find the entry in replSlotStats that has
> > > > > the name in the received message, it probably means either it's a new
> > > > > slot or the previous create message is dropped, we can create the new
> > > > > stats for the slot. Is that what you mean, Andres?
> >
> > That doesn't seem great. Slot names are imo a poor identifier for
> > something happening asynchronously. The stats collector regularly
> > doesn't process incoming messages for periods of time because it is busy
> > writing out the stats file. That's also when messages to it are most
> > likely to be dropped (likely because the incoming buffer is full).
> >
>
> Leaving aside restart case, without some sort of such sanity checking,
> if both drop (of old slot) and create (of new slot) messages are lost
> then we will start accumulating stats in old slots. However, if only
> one of them is lost then there won't be any such problem.
>
> > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > collector ensuring the mapping makes sense?
> >
>
> Say if we send just the index location of each slot then probably we
> can setup replSlotStats. Now say before the restart if one of the drop
> messages was missed (by stats collector) and that happens to be at
> some middle location, then we would end up restoring some already
> dropped slot, leaving some of the still required ones. However, if
> there is some sanity identifier like name along with the index, then I
> think that would have worked for such a case.

Even such messages could also be lost? Given that any message could be
lost under a UDP connection, I think we cannot rely on a single
message. Instead, I think we need to loosely synchronize the indexes
while assuming the indexes in replSlotStats and
ReplicationSlotCtl->replication_slots are not synchronized.

>
> I think it would have been easier if we had some OID type of
> identifier for each slot. But, without that, maybe the combination of the
> index location in ReplicationSlotCtl->replication_slots and the slot name
> can reduce the chances of slot stats going wrong, even if not to zero.
> If not name, do we have anything else in a slot that can be used for
> some sort of sanity checking?

I don't see any useful information in a slot for sanity checking.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Mar 25, 2021 at 11:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Leaving aside restart case, without some sort of such sanity checking,
> > if both drop (of old slot) and create (of new slot) messages are lost
> > then we will start accumulating stats in old slots. However, if only
> > one of them is lost then there won't be any such problem.
> >
> > > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > > collector ensuring the mapping makes sense?
> > >
> >
> > Say if we send just the index location of each slot then probably we
> > can setup replSlotStats. Now say before the restart if one of the drop
> > messages was missed (by stats collector) and that happens to be at
> > some middle location, then we would end up restoring some already
> > dropped slot, leaving some of the still required ones. However, if
> > there is some sanity identifier like name along with the index, then I
> > think that would have worked for such a case.
>
> Even such messages could also be lost? Given that any message could be
> lost under a UDP connection, I think we cannot rely on a single
> message. Instead, I think we need to loosely synchronize the indexes
> while assuming the indexes in replSlotStats and
> ReplicationSlotCtl->replication_slots are not synchronized.
>
> >
> > I think it would have been easier if we had some OID type of
> > identifier for each slot. But, without that, maybe the combination of the
> > index location in ReplicationSlotCtl->replication_slots and the slot name
> > can reduce the chances of slot stats going wrong, even if not to zero.
> > If not name, do we have anything else in a slot that can be used for
> > some sort of sanity checking?
>
> I don't see any useful information in a slot for sanity checking.
>

In that case, can we do a hard check for which slots exist if
replSlotStats runs out of space (that can probably happen only after
restart and when we lost some drop messages)?


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-25 17:12:31 +0530, Amit Kapila wrote:
> On Thu, Mar 25, 2021 at 11:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Leaving aside restart case, without some sort of such sanity checking,
> > > if both drop (of old slot) and create (of new slot) messages are lost
> > > then we will start accumulating stats in old slots. However, if only
> > > one of them is lost then there won't be any such problem.
> > >
> > > > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > > > collector ensuring the mapping makes sense?
> > > >
> > >
> > > Say if we send just the index location of each slot then probably we
> > > can setup replSlotStats. Now say before the restart if one of the drop
> > > messages was missed (by stats collector) and that happens to be at
> > > some middle location, then we would end up restoring some already
> > > dropped slot, leaving some of the still required ones. However, if
> > > there is some sanity identifier like name along with the index, then I
> > > think that would have worked for such a case.
> >
> > Even such messages could also be lost? Given that any message could be
> > lost under a UDP connection, I think we cannot rely on a single
> > message. Instead, I think we need to loosely synchronize the indexes
> > while assuming the indexes in replSlotStats and
> > ReplicationSlotCtl->replication_slots are not synchronized.
> >
> > >
> > > I think it would have been easier if we had some OID type of
> > > identifier for each slot. But, without that, maybe the combination of the
> > > index location in ReplicationSlotCtl->replication_slots and the slot name
> > > can reduce the chances of slot stats going wrong, even if not to zero.
> > > If not name, do we have anything else in a slot that can be used for
> > > some sort of sanity checking?
> >
> > I don't see any useful information in a slot for sanity checking.
> >
> 
> In that case, can we do a hard check for which slots exist if
> replSlotStats runs out of space (that can probably happen only after
> restart and when we lost some drop messages)?

I suggest we wait doing anything about this until we know if the shared
stats patch gets in or not (I'd give it 50% maybe). If it does get in
things get a good bit easier, because we don't have to deal with the
message loss issues anymore.

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Mar 26, 2021 at 1:17 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-03-25 17:12:31 +0530, Amit Kapila wrote:
> > On Thu, Mar 25, 2021 at 11:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Leaving aside restart case, without some sort of such sanity checking,
> > > > if both drop (of old slot) and create (of new slot) messages are lost
> > > > then we will start accumulating stats in old slots. However, if only
> > > > one of them is lost then there won't be any such problem.
> > > >
> > > > > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > > > > collector ensuring the mapping makes sense?
> > > > >
> > > >
> > > > Say if we send just the index location of each slot then probably we
> > > > can setup replSlotStats. Now say before the restart if one of the drop
> > > > messages was missed (by stats collector) and that happens to be at
> > > > some middle location, then we would end up restoring some already
> > > > dropped slot, leaving some of the still required ones. However, if
> > > > there is some sanity identifier like name along with the index, then I
> > > > think that would have worked for such a case.
> > >
> > > Even such messages could also be lost? Given that any message could be
> > > lost under a UDP connection, I think we cannot rely on a single
> > > message. Instead, I think we need to loosely synchronize the indexes
> > > while assuming the indexes in replSlotStats and
> > > ReplicationSlotCtl->replication_slots are not synchronized.
> > >
> > > >
> > > > I think it would have been easier if we had some OID type of
> > > > identifier for each slot. But, without that, maybe the combination of the
> > > > index location in ReplicationSlotCtl->replication_slots and the slot name
> > > > can reduce the chances of slot stats going wrong, even if not to zero.
> > > > If not name, do we have anything else in a slot that can be used for
> > > > some sort of sanity checking?
> > >
> > > I don't see any useful information in a slot for sanity checking.
> > >
> >
> > In that case, can we do a hard check for which slots exist if
> > replSlotStats runs out of space (that can probably happen only after
> > restart and when we lost some drop messages)?
>
> I suggest we wait doing anything about this until we know if the shared
> stats patch gets in or not (I'd give it 50% maybe). If it does get in
> things get a good bit easier, because we don't have to deal with the
> message loss issues anymore.
>

Okay, that makes sense.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-26 07:58:58 +0530, Amit Kapila wrote:
> On Fri, Mar 26, 2021 at 1:17 AM Andres Freund <andres@anarazel.de> wrote:
> > I suggest we wait doing anything about this until we know if the shared
> > stats patch gets in or not (I'd give it 50% maybe). If it does get in
> > things get a good bit easier, because we don't have to deal with the
> > message loss issues anymore.
> >
> 
> Okay, that makes sense.

Any chance you could write a tap test exercising a few of these cases?
E.g. things like:

- create a few slots, drop one of them, shut down, start up, verify
  stats are still sane
- create a few slots, shut down, manually remove a slot, lower
  max_replication_slots, start up

IMO, independent of the shutdown / startup issue, it'd be worth writing
a patch tracking the bytes sent independently of the slot stats storage
issues. That would also make the testing for the above cheaper...

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-03-26 07:58:58 +0530, Amit Kapila wrote:
> > On Fri, Mar 26, 2021 at 1:17 AM Andres Freund <andres@anarazel.de> wrote:
> > > I suggest we wait doing anything about this until we know if the shared
> > > stats patch gets in or not (I'd give it 50% maybe). If it does get in
> > > things get a good bit easier, because we don't have to deal with the
> > > message loss issues anymore.
> > >
> >
> > Okay, that makes sense.
>
> Any chance you could write a tap test exercising a few of these cases?

I can try to write a patch for this if nobody objects.

> E.g. things like:
>
> - create a few slots, drop one of them, shut down, start up, verify
>   stats are still sane
> - create a few slots, shut down, manually remove a slot, lower
>   max_replication_slots, start up

Here by "manually remove a slot", do you mean to remove the slot
manually from the pg_replslot folder?

> IMO, independent of the shutdown / startup issue, it'd be worth writing
> a patch tracking the bytes sent independently of the slot stats storage
> issues. That would also make the testing for the above cheaper...

I can try to write a patch for this if nobody objects.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Andres Freund
Date:
Hi,

On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > Any chance you could write a tap test exercising a few of these cases?
> 
> I can try to write a patch for this if nobody objects.

Cool!

> > E.g. things like:
> >
> > - create a few slots, drop one of them, shut down, start up, verify
> >   stats are still sane
> > - create a few slots, shut down, manually remove a slot, lower
> >   max_replication_slots, start up
> 
> Here by "manually remove a slot", do you mean to remove the slot
> manually from the pg_replslot folder?

Yep - thereby allowing max_replication_slots after the shutdown/start to
be lower than the number of slots-stats objects.

Greetings,

Andres Freund



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > Any chance you could write a tap test exercising a few of these cases?
> >
> > I can try to write a patch for this if nobody objects.
>
> Cool!
>

Attached a patch which has the test for the first scenario.

> > > E.g. things like:
> > >
> > > - create a few slots, drop one of them, shut down, start up, verify
> > >   stats are still sane
> > > - create a few slots, shut down, manually remove a slot, lower
> > >   max_replication_slots, start up
> >
> > Here by "manually remove a slot", do you mean to remove the slot
> > manually from the pg_replslot folder?
>
> Yep - thereby allowing max_replication_slots after the shutdown/start to
> be lower than the number of slots-stats objects.

I have not included the 2nd test in the patch as the test fails with
following warnings and also displays the statistics of the removed
slot:
WARNING:  problem in alloc set Statistics snapshot: detected write
past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
WARNING:  problem in alloc set Statistics snapshot: detected write
past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438

This happens because the statistics file has an additional slot
present even though the replication slot was removed.  I felt this
issue should be fixed. I will try to fix this issue and send the
second test along with the fix.
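
The guard I have in mind when the stats file is read back in would be
roughly the following (a sketch only; the helper name, and treating the
surplus entries as discardable, are my assumptions):

/*
 * Sketch only: skip slot-stats entries beyond max_replication_slots instead
 * of writing past the end of replSlotStats.
 */
static void
pgstat_restore_replslot_entry(FILE *fpin)
{
    PgStat_ReplSlotStats slotstats;

    if (fread(&slotstats, 1, sizeof(slotstats), fpin) != sizeof(slotstats))
        return;                 /* truncated file; the caller reports corruption */

    /*
     * More entries than max_replication_slots means at least one is stale,
     * e.g. a slot removed from pg_replslot while shut down. Drop the
     * surplus, accepting that a live slot's stats could be among them.
     */
    if (nReplSlotStats >= max_replication_slots)
        return;

    replSlotStats[nReplSlotStats++] = slotstats;
}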

Regards,
Vignesh


Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > Any chance you could write a tap test exercising a few of these cases?
> > >
> > > I can try to write a patch for this if nobody objects.
> >
> > Cool!
> >
>
> Attached a patch which has the test for the first scenario.
>
> > > > E.g. things like:
> > > >
> > > > - create a few slots, drop one of them, shut down, start up, verify
> > > >   stats are still sane
> > > > - create a few slots, shut down, manually remove a slot, lower
> > > >   max_replication_slots, start up
> > >
> > > Here by "manually remove a slot", do you mean to remove the slot
> > > manually from the pg_replslot folder?
> >
> > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > be lower than the number of slots-stats objects.
>
> I have not included the 2nd test in the patch as the test fails with
> following warnings and also displays the statistics of the removed
> slot:
> WARNING:  problem in alloc set Statistics snapshot: detected write
> past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> WARNING:  problem in alloc set Statistics snapshot: detected write
> past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
>
> This happens because the statistics file has an additional slot
> present even though the replication slot was removed.  I felt this
> issue should be fixed. I will try to fix this issue and send the
> second test along with the fix.

I felt from the statistics collector process, there is no way in which
we can identify if the replication slot is present or not because the
statistic collector process does not have access to shared memory.
Anything that the statistic collector process does independently by
traversing and removing the statistics of the replication slot
exceeding max_replication_slots has its drawback of removing some
valid replication slot's statistics data.
Any thoughts on how we can identify the replication slot which has been dropped?
Can someone point me to the shared stats patch link with which message
loss can be avoided? I wanted to see whether a scenario where the slot
is dropped but the statistics are not updated (because of an immediate
shutdown or the server going down abruptly) can occur with the shared
stats patch.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > Hi,
> > >
> > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > Any chance you could write a tap test exercising a few of these cases?
> > > >
> > > > I can try to write a patch for this if nobody objects.
> > >
> > > Cool!
> > >
> >
> > Attached a patch which has the test for the first scenario.
> >
> > > > > E.g. things like:
> > > > >
> > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > >   stats are still sane
> > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > >   max_replication_slots, start up
> > > >
> > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > manually from the pg_replslot folder?
> > >
> > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > be lower than the number of slots-stats objects.
> >
> > I have not included the 2nd test in the patch as the test fails with
> > following warnings and also displays the statistics of the removed
> > slot:
> > WARNING:  problem in alloc set Statistics snapshot: detected write
> > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > WARNING:  problem in alloc set Statistics snapshot: detected write
> > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> >
> > This happens because the statistics file has an additional slot
> > present even though the replication slot was removed.  I felt this
> > issue should be fixed. I will try to fix this issue and send the
> > second test along with the fix.
>
> I felt from the statistics collector process, there is no way in which
> we can identify if the replication slot is present or not because the
> statistic collector process does not have access to shared memory.
> Anything that the statistic collector process does independently by
> traversing and removing the statistics of the replication slot
> exceeding max_replication_slots has its drawback of removing some
> valid replication slot's statistics data.
> Any thoughts on how we can identify the replication slot which has been dropped?
> Can someone point me to the shared stats patch link with which message
> loss can be avoided? I wanted to see whether a scenario where the slot
> is dropped but the statistics are not updated (because of an immediate
> shutdown or the server going down abruptly) can occur with the shared
> stats patch.
>

I don't think it is easy to simulate a scenario where the 'drop'
message is dropped and I think that is why the test contains the step
to manually remove the slot. At this stage, you can probably provide a
test patch and a code-fix patch where it just drops the extra slots
from the stats file. That will allow us to test it with a shared
memory stats patch on which Andres and Horiguchi-San are working. If
we still continue to pursue the current approach then, as Andres
suggested we might send additional information from
RestoreSlotFromDisk to keep it in sync.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Mar 30, 2021 at 9:58 AM Andres Freund <andres@anarazel.de> wrote:
>
> IMO, independent of the shutdown / startup issue, it'd be worth writing
> a patch tracking the bytes sent independently of the slot stats storage
> issues. That would also make the testing for the above cheaper...

Agreed.

I think the bytes sent should be recorded by the decoding plugin, not
by the core side. Given that table filtering and row filtering,
tracking the bytes passed to the decoding plugin would not help gauge
the actual network I/O. In that sense, the description of stream_bytes
in the doc seems not accurate:

---
This and other streaming counters for this slot can be used to gauge
the network I/O which occurred during logical decoding and allow
tuning logical_decoding_work_mem.
---

It can surely be used to allow tuning logical_decoding_work_mem, but it
is not necessarily accurate for gauging the network I/O which occurred
during logical decoding.
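
As a rough illustration of the distinction, counting at the point where the
plugin's output is handed to the write callback would be closer to the
actual traffic (the total_bytes field and the helper below are hypothetical;
ctx->out is the existing StringInfo the plugin writes into):

/*
 * Sketch only: accumulate the bytes the output plugin actually emitted,
 * i.e. after any table/row filtering it did, rather than the bytes of the
 * changes passed into it.
 */
static void
CountEmittedBytes(LogicalDecodingContext *ctx)
{
    ctx->total_bytes += ctx->out->len;  /* hypothetical counter on the context */
}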

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, Apr 1, 2021 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > Any chance you could write a tap test exercising a few of these cases?
> > > > >
> > > > > I can try to write a patch for this if nobody objects.
> > > >
> > > > Cool!
> > > >
> > >
> > > Attached a patch which has the test for the first scenario.
> > >
> > > > > > E.g. things like:
> > > > > >
> > > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > > >   stats are still sane
> > > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > > >   max_replication_slots, start up
> > > > >
> > > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > > manually from the pg_replslot folder?
> > > >
> > > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > > be lower than the number of slots-stats objects.
> > >
> > > I have not included the 2nd test in the patch as the test fails with
> > > following warnings and also displays the statistics of the removed
> > > slot:
> > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > >
> > > This happens because the statistics file has an additional slot
> > > present even though the replication slot was removed.  I felt this
> > > issue should be fixed. I will try to fix this issue and send the
> > > second test along with the fix.
> >
> > I felt from the statistics collector process, there is no way in which
> > we can identify if the replication slot is present or not because the
> > statistic collector process does not have access to shared memory.
> > Anything that the statistic collector process does independently by
> > traversing and removing the statistics of the replication slot
> > exceeding the max_replication_slot has its drawback of removing some
> > valid replication slot's statistics data.
> > Any thoughts on how we can identify the replication slot which has been dropped?
> > Can someone point me to the shared stats patch link with which message
> > loss can be avoided. I wanted to see a scenario where something like
> > the slot is dropped but the statistics are not updated because of an
> > immediate shutdown or server going down abruptly can occur or not with
> > the shared stats patch.
> >
>
> I don't think it is easy to simulate a scenario where the 'drop'
> message is dropped and I think that is why the test contains the step
> to manually remove the slot. At this stage, you can probably provide a
> test patch and a code-fix patch where it just drops the extra slots
> from the stats file. That will allow us to test it with a shared
> memory stats patch on which Andres and Horiguchi-San are working. If
> we still continue to pursue with current approach then as Andres
> suggested we might send additional information from
> RestoreSlotFromDisk to keep it in sync.

Thanks for your comments. The attached patch has the fix for the same.
Also attached are a couple more patches which address the comments
Andres had listed, i.e. changing char to the NameData type and also
displaying the unspilled/unstreamed transaction information in the
replication statistics.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Fri, Apr 2, 2021 at 1:55 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Apr 1, 2021 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > > Any chance you could write a tap test exercising a few of these cases?
> > > > > >
> > > > > > I can try to write a patch for this if nobody objects.
> > > > >
> > > > > Cool!
> > > > >
> > > >
> > > > Attached a patch which has the test for the first scenario.
> > > >
> > > > > > > E.g. things like:
> > > > > > >
> > > > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > > > >   stats are still sane
> > > > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > > > >   max_replication_slots, start up
> > > > > >
> > > > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > > > manually from the pg_replslot folder?
> > > > >
> > > > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > > > be lower than the number of slots-stats objects.
> > > >
> > > > I have not included the 2nd test in the patch as the test fails with
> > > > following warnings and also displays the statistics of the removed
> > > > slot:
> > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > >
> > > > This happens because the statistics file has an additional slot
> > > > present even though the replication slot was removed.  I felt this
> > > > issue should be fixed. I will try to fix this issue and send the
> > > > second test along with the fix.
> > >
> > > I felt from the statistics collector process, there is no way in which
> > > we can identify if the replication slot is present or not because the
> > > statistic collector process does not have access to shared memory.
> > > Anything that the statistic collector process does independently by
> > > traversing and removing the statistics of the replication slot
> > > exceeding the max_replication_slot has its drawback of removing some
> > > valid replication slot's statistics data.
> > > Any thoughts on how we can identify the replication slot which has been dropped?
> > > Can someone point me to the shared stats patch link with which message
> > > loss can be avoided. I wanted to see a scenario where something like
> > > the slot is dropped but the statistics are not updated because of an
> > > immediate shutdown or server going down abruptly can occur or not with
> > > the shared stats patch.
> > >
> >
> > I don't think it is easy to simulate a scenario where the 'drop'
> > message is dropped and I think that is why the test contains the step
> > to manually remove the slot. At this stage, you can probably provide a
> > test patch and a code-fix patch where it just drops the extra slots
> > from the stats file. That will allow us to test it with a shared
> > memory stats patch on which Andres and Horiguchi-San are working. If
> > we still continue to pursue with current approach then as Andres
> > suggested we might send additional information from
> > RestoreSlotFromDisk to keep it in sync.
>
> Thanks for your comments, Attached patch has the fix for the same.
> Also attached a couple of more patches which addresses the comments
> which Andres had listed i.e changing char to NameData type and also to
> display the unspilled/unstreamed transaction information in the
> replication statistics.
> Thoughts?

Thank you for the patches!

I've looked at those patches and here are some comments on 0001, 0002,
and 0003 patch:

0001 patch:

-       values[0] = PointerGetDatum(cstring_to_text(s->slotname));
+       values[0] = PointerGetDatum(cstring_to_text(s->slotname.data));

We can use NameGetDatum() instead.
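
For example, just as a sketch (assuming the result column is declared
with type name, so the Datum can point at the NameData directly):

values[0] = NameGetDatum(&s->slotname);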

---
0002 patch:

The patch uses logical replication to test replication slot
statistics, but I don't think that is necessary. It would be simpler
to use logical decoding. Maybe we can add TAP tests to
contrib/test_decoding.

---
0003 patch:

 void
 pgstat_report_replslot(const char *slotname, int spilltxns, int spillcount,
-                      int spillbytes, int streamtxns, int
streamcount, int streambytes)
+                      int spillbytes, int streamtxns, int streamcount,
+                      int streambytes, int totaltxns, int totalbytes)
 {

As Andres pointed out, we should use a struct of stats updates rather
than adding more arguments to pgstat_report_replslot().
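
For illustration, a rough sketch of what such an interface could look
like (the type and field names here are hypothetical, not taken from
the patch):

typedef struct PgStat_ReplSlotStatUpdate
{
    PgStat_Counter spill_txns;
    PgStat_Counter spill_count;
    PgStat_Counter spill_bytes;
    PgStat_Counter stream_txns;
    PgStat_Counter stream_count;
    PgStat_Counter stream_bytes;
    PgStat_Counter total_txns;
    PgStat_Counter total_bytes;
} PgStat_ReplSlotStatUpdate;

extern void pgstat_report_replslot(const char *slotname,
                                   const PgStat_ReplSlotStatUpdate *update);

That way adding a counter later only touches the struct and the places
that fill it in, not every call site's argument list.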

---
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+        <structfield>total_bytes</structfield><type>bigint</type>
+       </para>
+       <para>
+        Amount of decoded in-progress transaction data replicated to
the decoding
+        output plugin while decoding changes from WAL for this slot.
This and other
+        counters for this slot can be used to gauge the network I/O
which occurred
+        during logical decoding and allow tuning
<literal>logical_decoding_work_mem</literal>.
+       </para>
+      </entry>
+     </row>

As I mentioned in another reply, I think users should not gauge the
network I/O which occurred during logical decoding using those
counters, since the actual amount of network I/O is affected by the
table filtering and row filtering discussed on another thread[1].
Also, since this is total bytes, I'm not sure how users can use this
value to tune logical_decoding_work_mem. I agree with tracking both
the total bytes and the total number of transactions passed to the
decoding plugin, but I think the description needs to be updated. How
about the following description, for example?

Amount of decoded transaction data sent to the decoding output plugin
while decoding changes from WAL for this slot. This and total_txn for
this slot can be used to gauge the total amount of data during logical
decoding.

---
I think we can merge 0001 and 0003 patches.

[1] https://www.postgresql.org/message-id/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 2, 2021 at 9:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 2, 2021 at 1:55 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Apr 1, 2021 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > > > Any chance you could write a tap test exercising a few of these cases?
> > > > > > >
> > > > > > > I can try to write a patch for this if nobody objects.
> > > > > >
> > > > > > Cool!
> > > > > >
> > > > >
> > > > > Attached a patch which has the test for the first scenario.
> > > > >
> > > > > > > > E.g. things like:
> > > > > > > >
> > > > > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > > > > >   stats are still sane
> > > > > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > > > > >   max_replication_slots, start up
> > > > > > >
> > > > > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > > > > manually from the pg_replslot folder?
> > > > > >
> > > > > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > > > > be lower than the number of slots-stats objects.
> > > > >
> > > > > I have not included the 2nd test in the patch as the test fails with
> > > > > following warnings and also displays the statistics of the removed
> > > > > slot:
> > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > >
> > > > > This happens because the statistics file has an additional slot
> > > > > present even though the replication slot was removed.  I felt this
> > > > > issue should be fixed. I will try to fix this issue and send the
> > > > > second test along with the fix.
> > > >
> > > > I felt from the statistics collector process, there is no way in which
> > > > we can identify if the replication slot is present or not because the
> > > > statistic collector process does not have access to shared memory.
> > > > Anything that the statistic collector process does independently by
> > > > traversing and removing the statistics of the replication slot
> > > > exceeding the max_replication_slot has its drawback of removing some
> > > > valid replication slot's statistics data.
> > > > Any thoughts on how we can identify the replication slot which has been dropped?
> > > > Can someone point me to the shared stats patch link with which message
> > > > loss can be avoided. I wanted to see a scenario where something like
> > > > the slot is dropped but the statistics are not updated because of an
> > > > immediate shutdown or server going down abruptly can occur or not with
> > > > the shared stats patch.
> > > >
> > >
> > > I don't think it is easy to simulate a scenario where the 'drop'
> > > message is dropped and I think that is why the test contains the step
> > > to manually remove the slot. At this stage, you can probably provide a
> > > test patch and a code-fix patch where it just drops the extra slots
> > > from the stats file. That will allow us to test it with a shared
> > > memory stats patch on which Andres and Horiguchi-San are working. If
> > > we still continue to pursue with current approach then as Andres
> > > suggested we might send additional information from
> > > RestoreSlotFromDisk to keep it in sync.
> >
> > Thanks for your comments, Attached patch has the fix for the same.
> > Also attached a couple of more patches which addresses the comments
> > which Andres had listed i.e changing char to NameData type and also to
> > display the unspilled/unstreamed transaction information in the
> > replication statistics.
> > Thoughts?
>
> Thank you for the patches!
>
> I've looked at those patches and here are some comments on 0001, 0002,
> and 0003 patch:
>
> 0001 patch:
>
> -       values[0] = PointerGetDatum(cstring_to_text(s->slotname));
> +       values[0] = PointerGetDatum(cstring_to_text(s->slotname.data));
>
> We can use NameGetDatum() instead.
>
> ---
> 0002 patch:
>
> The patch uses logical replication to test replication slots
> statistics but I think it's necessarily necessary. It would be more
> simple to use logical decoding. Maybe we can add TAP tests to
> contrib/test_decoding.
>
> ---
> 0003 patch:
>
>  void
>  pgstat_report_replslot(const char *slotname, int spilltxns, int spillcount,
> -                      int spillbytes, int streamtxns, int
> streamcount, int streambytes)
> +                      int spillbytes, int streamtxns, int streamcount,
> +                      int streambytes, int totaltxns, int totalbytes)
>  {
>
> As Andreas pointed out, we should use a struct of stats updates rather
> than adding more arguments to pgstat_report_replslot().
>
> ---
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +        <structfield>total_bytes</structfield><type>bigint</type>
> +       </para>
> +       <para>
> +        Amount of decoded in-progress transaction data replicated to
> the decoding
> +        output plugin while decoding changes from WAL for this slot.
> This and other
> +        counters for this slot can be used to gauge the network I/O
> which occurred
> +        during logical decoding and allow tuning
> <literal>logical_decoding_work_mem</literal>.
> +       </para>
> +      </entry>
> +     </row>
>
> As I mentioned in another reply, I think users should not gauge the
> network I/O which occurred during logical decoding using by those
> counters since the actual amount of network I/O is affected by table
> filtering and row filtering discussed on another thread[1]. Also,
> since this is total bytes I'm not sure how users can use this value to
> tune logical_decoding_work_mem. I agree to track both the total bytes
> and the total number of transactions passed to the decoding plugin but
> I think the description needs to be updated. How about the following
> description for example?
>
> Amount of decoded transaction data sent to the decoding output plugin
> while decoding changes from WAL for this slot. This and total_txn for
> this slot can be used to gauge the total amount of data during logical
> decoding.
>
> ---
> I think we can merge 0001 and 0003 patches.
>

Thanks for the comments; I will address them and provide a patch
for this soon.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 1, 2021 at 6:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Mar 30, 2021 at 9:58 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > IMO, independent of the shutdown / startup issue, it'd be worth writing
> > a patch tracking the bytes sent independently of the slot stats storage
> > issues. That would also make the testing for the above cheaper...
>
> Agreed.
>
> I think the bytes sent should be recorded by the decoding plugin, not
> by the core side. Given that table filtering and row filtering,
> tracking the bytes passed to the decoding plugin would not help gauge
> the actual network I/O. In that sense, the description of stream_bytes
> in the doc seems not accurate:
>
> ---
> This and other streaming counters for this slot can be used to gauge
> the network I/O which occurred during logical decoding and allow
> tuning logical_decoding_work_mem.
> ---
>
> It can surely be used to allow tuning logical_decoding_work_mem but it
> could not be true for gauging the network I/O which occurred during
> logical decoding.
>

Agreed. I think we can adjust the wording accordingly.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Bharath Rupireddy
Date:
On Fri, Apr 2, 2021 at 9:57 AM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the comments, I will fix the comments and provide a patch
> for this soon.

Here are some comments:
1) How about something like below
+                            (errmsg("skipping \"%s\" replication slot
statistics as the statistic collector process does not have enough
statistic slots",
instead of
+                            (errmsg("skipping \"%s\" replication
slot's statistic as the statistic collector process does not have
enough statistic slots",

2) Does it mean "pg_statistic slots" when we say "statistic slots" in
the above warning? If yes, why can't we use "pg_statistic slots"
instead of "statistic slots" as with another existing message
"insufficient pg_statistic slots for array stats"?

3) Should we change the if condition to max_replication_slots <=
nReplSlotStats instead of max_replication_slots == nReplSlotStats? In
the scenario, it is mentioned that "one of the replication slots is
dropped", will this issue occur when multiple replication slots are
dropped?

4) Let's end the statement after this and start a new one, something like below
+                 * this. To avoid writing beyond the max_replication_slots
instead of
+                 * this, to avoid writing beyond the max_replication_slots

5) How about something like below
+                 * this. To avoid writing beyond the max_replication_slots,
+                 * this replication slot statistics information will
be skipped.
+                 */
instead of
+                 * this, to avoid writing beyond the max_replication_slots
+                 * these replication slot statistic information will
be skipped.
+                 */

6) Any specific reason to use a new local variable replSlotStat and
later memcpy into replSlotStats[nReplSlotStats]? Instead we could
directly fread into &replSlotStats[nReplSlotStats] and do
memset(&replSlotStats[nReplSlotStats], 0,
sizeof(PgStat_ReplSlotStats)); before the warnings. As warning
scenarios seem to be less frequent, we could avoid doing memcpy
always.
-                if (fread(&replSlotStats[nReplSlotStats], 1,
sizeof(PgStat_ReplSlotStats), fpin)
+                if (fread(&replSlotStat, 1, sizeof(PgStat_ReplSlotStats), fpin)

+                memcpy(&replSlotStats[nReplSlotStats], &replSlotStat,
sizeof(PgStat_ReplSlotStats));
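
For illustration, roughly the shape of the common path being suggested
(a sketch only, assuming the bounds check on nReplSlotStats has already
happened before this point):

if (fread(&replSlotStats[nReplSlotStats], 1,
          sizeof(PgStat_ReplSlotStats), fpin) != sizeof(PgStat_ReplSlotStats))
{
    /* rare path: reset the partially filled entry here instead of
     * paying for a memcpy on every successful read */
    memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
    goto done;
}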

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 2, 2021 at 9:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 2, 2021 at 1:55 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, Apr 1, 2021 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > > > Any chance you could write a tap test exercising a few of these cases?
> > > > > > >
> > > > > > > I can try to write a patch for this if nobody objects.
> > > > > >
> > > > > > Cool!
> > > > > >
> > > > >
> > > > > Attached a patch which has the test for the first scenario.
> > > > >
> > > > > > > > E.g. things like:
> > > > > > > >
> > > > > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > > > > >   stats are still sane
> > > > > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > > > > >   max_replication_slots, start up
> > > > > > >
> > > > > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > > > > manually from the pg_replslot folder?
> > > > > >
> > > > > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > > > > be lower than the number of slots-stats objects.
> > > > >
> > > > > I have not included the 2nd test in the patch as the test fails with
> > > > > following warnings and also displays the statistics of the removed
> > > > > slot:
> > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > >
> > > > > This happens because the statistics file has an additional slot
> > > > > present even though the replication slot was removed.  I felt this
> > > > > issue should be fixed. I will try to fix this issue and send the
> > > > > second test along with the fix.
> > > >
> > > > I felt from the statistics collector process, there is no way in which
> > > > we can identify if the replication slot is present or not because the
> > > > statistic collector process does not have access to shared memory.
> > > > Anything that the statistic collector process does independently by
> > > > traversing and removing the statistics of the replication slot
> > > > exceeding the max_replication_slot has its drawback of removing some
> > > > valid replication slot's statistics data.
> > > > Any thoughts on how we can identify the replication slot which has been dropped?
> > > > Can someone point me to the shared stats patch link with which message
> > > > loss can be avoided. I wanted to see a scenario where something like
> > > > the slot is dropped but the statistics are not updated because of an
> > > > immediate shutdown or server going down abruptly can occur or not with
> > > > the shared stats patch.
> > > >
> > >
> > > I don't think it is easy to simulate a scenario where the 'drop'
> > > message is dropped and I think that is why the test contains the step
> > > to manually remove the slot. At this stage, you can probably provide a
> > > test patch and a code-fix patch where it just drops the extra slots
> > > from the stats file. That will allow us to test it with a shared
> > > memory stats patch on which Andres and Horiguchi-San are working. If
> > > we still continue to pursue with current approach then as Andres
> > > suggested we might send additional information from
> > > RestoreSlotFromDisk to keep it in sync.
> >
> > Thanks for your comments, Attached patch has the fix for the same.
> > Also attached a couple of more patches which addresses the comments
> > which Andres had listed i.e changing char to NameData type and also to
> > display the unspilled/unstreamed transaction information in the
> > replication statistics.
> > Thoughts?
>
> Thank you for the patches!
>
> I've looked at those patches and here are some comments on 0001, 0002,
> and 0003 patch:

Thanks for the comments.

> 0001 patch:
>
> -       values[0] = PointerGetDatum(cstring_to_text(s->slotname));
> +       values[0] = PointerGetDatum(cstring_to_text(s->slotname.data));
>
> We can use NameGetDatum() instead.

I felt we will not be able to use NameGetDatum because this function
does not have access to the value throughout the loop, and NameGetDatum
must ensure the pointed-to value has adequate lifetime.

> ---
> 0002 patch:
>
> The patch uses logical replication to test replication slots
> statistics but I think it's necessarily necessary. It would be more
> simple to use logical decoding. Maybe we can add TAP tests to
> contrib/test_decoding.
>

I will try to change it to test_decoding if feasible and post it in
the next version.

> ---
> 0003 patch:
>
>  void
>  pgstat_report_replslot(const char *slotname, int spilltxns, int spillcount,
> -                      int spillbytes, int streamtxns, int
> streamcount, int streambytes)
> +                      int spillbytes, int streamtxns, int streamcount,
> +                      int streambytes, int totaltxns, int totalbytes)
>  {
>
> As Andreas pointed out, we should use a struct of stats updates rather
> than adding more arguments to pgstat_report_replslot().
>

Modified as suggested.

> ---
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +        <structfield>total_bytes</structfield><type>bigint</type>
> +       </para>
> +       <para>
> +        Amount of decoded in-progress transaction data replicated to
> the decoding
> +        output plugin while decoding changes from WAL for this slot.
> This and other
> +        counters for this slot can be used to gauge the network I/O
> which occurred
> +        during logical decoding and allow tuning
> <literal>logical_decoding_work_mem</literal>.
> +       </para>
> +      </entry>
> +     </row>
>
> As I mentioned in another reply, I think users should not gauge the
> network I/O which occurred during logical decoding using by those
> counters since the actual amount of network I/O is affected by table
> filtering and row filtering discussed on another thread[1]. Also,
> since this is total bytes I'm not sure how users can use this value to
> tune logical_decoding_work_mem. I agree to track both the total bytes
> and the total number of transactions passed to the decoding plugin but
> I think the description needs to be updated. How about the following
> description for example?
>
> Amount of decoded transaction data sent to the decoding output plugin
> while decoding changes from WAL for this slot. This and total_txn for
> this slot can be used to gauge the total amount of data during logical
> decoding.
>

Modified as suggested.

> ---
> I think we can merge 0001 and 0003 patches.

I have merged them.
The attached v2 patch has the fixes for the same.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 2, 2021 at 11:28 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Apr 2, 2021 at 9:57 AM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, I will fix the comments and provide a patch
> > for this soon.
>

Thanks for the comments.

> Here are some comments:
> 1) How about something like below
> +                            (errmsg("skipping \"%s\" replication slot
> statistics as the statistic collector process does not have enough
> statistic slots",
> instead of
> +                            (errmsg("skipping \"%s\" replication
> slot's statistic as the statistic collector process does not have
> enough statistic slots",
>

Modified.

> 2) Does it mean "pg_statistic slots" when we say "statistic slots" in
> the above warning? If yes, why can't we use "pg_statistic slots"
> instead of "statistic slots" as with another existing message
> "insufficient pg_statistic slots for array stats"?
>

Here pg_stat_replication_slots will not have enough slots. I changed
it to the below:
errmsg("skipping \"%s\" replication slot statistics as
pg_stat_replication_slots does not have enough slots"
Thoughts?

> 3) Should we change the if condition to max_replication_slots <=
> nReplSlotStats instead of max_replication_slots == nReplSlotStats? In
> the scenario, it is mentioned that "one of the replication slots is
> dropped", will this issue occur when multiple replication slots are
> dropped?
>

I felt it should be max_replication_slots == nReplSlotStats: if
max_replication_slots = 5, we will be able to store statistics for 5
replication slots at indexes 0..4; there is no space for a sixth. I
think this need not be changed.

> 4) Let's end the statement after this and start a new one, something like below
> +                 * this. To avoid writing beyond the max_replication_slots
> instead of
> +                 * this, to avoid writing beyond the max_replication_slots
>

Changed it.

> 5) How about something like below
> +                 * this. To avoid writing beyond the max_replication_slots,
> +                 * this replication slot statistics information will
> be skipped.
> +                 */
> instead of
> +                 * this, to avoid writing beyond the max_replication_slots
> +                 * these replication slot statistic information will
> be skipped.
> +                 */
>

Changed it.

> 6) Any specific reason to use a new local variable replSlotStat and
> later memcpy into replSlotStats[nReplSlotStats]? Instead we could
> directly fread into &replSlotStats[nReplSlotStats] and do
> memset(&replSlotStats[nReplSlotStats], 0,
> sizeof(PgStat_ReplSlotStats)); before the warnings. As warning
> scenarios seem to be less frequent, we could avoid doing memcpy
> always.
> -                if (fread(&replSlotStats[nReplSlotStats], 1,
> sizeof(PgStat_ReplSlotStats), fpin)
> +                if (fread(&replSlotStat, 1, sizeof(PgStat_ReplSlotStats), fpin)
>
> +                memcpy(&replSlotStats[nReplSlotStats], &replSlotStat,
> sizeof(PgStat_ReplSlotStats));
>

I wanted to avoid executing the memcpy multiple times, but your
explanation makes sense: keeping the memcpy in the failure path lets
the common path be faster. Changed it.
These comments are fixed in the v2 patch posted in my previous mail.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Bharath Rupireddy
Date:
On Sat, Apr 3, 2021 at 11:12 PM vignesh C <vignesh21@gmail.com> wrote:
> Here pg_stat_replication_slots will not have enought slots. I changed
> it to below:
> errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots"
> Thoughts?

WFM.

> > 3) Should we change the if condition to max_replication_slots <=
> > nReplSlotStats instead of max_replication_slots == nReplSlotStats? In
> > the scenario, it is mentioned that "one of the replication slots is
> > dropped", will this issue occur when multiple replication slots are
> > dropped?
> >
>
> I felt it should be max_replication_slots == nReplSlotStats, if
> max_replication_slots = 5, we will be able to store 5 replication slot
> statistics from 0,1..4, from 5th we will not have space. I think this
> need not be changed.

I'm not sure whether we can have a situation where
max_replication_slots < nReplSlotStats, i.e. max_replication_slots
getting set to less than nReplSlotStats. I don't think I understood
the above-mentioned scenario, i.e. max_replication_slots ==
nReplSlotStats, correctly. It would be great if you could throw some
light on that scenario and confirm that it is not possible to reach a
situation where max_replication_slots < nReplSlotStats.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 5, 2021 at 12:44 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sat, Apr 3, 2021 at 11:12 PM vignesh C <vignesh21@gmail.com> wrote:
> > Here pg_stat_replication_slots will not have enought slots. I changed
> > it to below:
> > errmsg("skipping \"%s\" replication slot statistics as
> > pg_stat_replication_slots does not have enough slots"
> > Thoughts?
>
> WFM.
>
> > > 3) Should we change the if condition to max_replication_slots <=
> > > nReplSlotStats instead of max_replication_slots == nReplSlotStats? In
> > > the scenario, it is mentioned that "one of the replication slots is
> > > dropped", will this issue occur when multiple replication slots are
> > > dropped?
> > >
> >
> > I felt it should be max_replication_slots == nReplSlotStats, if
> > max_replication_slots = 5, we will be able to store 5 replication slot
> > statistics from 0,1..4, from 5th we will not have space. I think this
> > need not be changed.
>
> I'm not sure whether we can have a situation where
> max_replication_slots < nReplSlotStats i.e. max_replication_slots
> getting set to lesser than nReplSlotStats. I think I didn't get the
> above mentioned scenario i.e.  max_replication_slots == nReplSlotStats
> correctly. It will be great if you could throw some light on that
> scenario and ensure that it's not possible to reach a situation where
> max_replication_slots < nReplSlotStats.

Usually this will not happen, but there is a remote chance of it in
the below scenario:
When a replication slot is created, the statistics collector also
creates an entry for it in pg_stat_replication_slots; the number of
entries cannot exceed max_replication_slots. Whenever a slot is
dropped, the corresponding entry is deleted from
pg_stat_replication_slots. The statistics collector uses the UDP
protocol for communication, so there is no guarantee that a message is
received by the statistics collector process. Suppose that after the
user has dropped a replication slot, the server is stopped before the
statistics collector has received the drop-slot statistics message.
The user then reduces max_replication_slots and starts the server. In
this scenario the statistics collector process will have more
replication slot statistics entries (as the drop was never received)
than max_replication_slots.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sat, Apr 3, 2021 at 11:07 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Apr 2, 2021 at 9:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Apr 2, 2021 at 1:55 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Thu, Apr 1, 2021 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 1, 2021 at 3:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > On Wed, Mar 31, 2021 at 11:32 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Mar 30, 2021 at 11:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On 2021-03-30 10:13:29 +0530, vignesh C wrote:
> > > > > > > > On Tue, Mar 30, 2021 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > > > > > Any chance you could write a tap test exercising a few of these cases?
> > > > > > > >
> > > > > > > > I can try to write a patch for this if nobody objects.
> > > > > > >
> > > > > > > Cool!
> > > > > > >
> > > > > >
> > > > > > Attached a patch which has the test for the first scenario.
> > > > > >
> > > > > > > > > E.g. things like:
> > > > > > > > >
> > > > > > > > > - create a few slots, drop one of them, shut down, start up, verify
> > > > > > > > >   stats are still sane
> > > > > > > > > - create a few slots, shut down, manually remove a slot, lower
> > > > > > > > >   max_replication_slots, start up
> > > > > > > >
> > > > > > > > Here by "manually remove a slot", do you mean to remove the slot
> > > > > > > > manually from the pg_replslot folder?
> > > > > > >
> > > > > > > Yep - thereby allowing max_replication_slots after the shutdown/start to
> > > > > > > be lower than the number of slots-stats objects.
> > > > > >
> > > > > > I have not included the 2nd test in the patch as the test fails with
> > > > > > following warnings and also displays the statistics of the removed
> > > > > > slot:
> > > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > > > WARNING:  problem in alloc set Statistics snapshot: detected write
> > > > > > past chunk end in block 0x55d038b8e410, chunk 0x55d038b8e438
> > > > > >
> > > > > > This happens because the statistics file has an additional slot
> > > > > > present even though the replication slot was removed.  I felt this
> > > > > > issue should be fixed. I will try to fix this issue and send the
> > > > > > second test along with the fix.
> > > > >
> > > > > I felt from the statistics collector process, there is no way in which
> > > > > we can identify if the replication slot is present or not because the
> > > > > statistic collector process does not have access to shared memory.
> > > > > Anything that the statistic collector process does independently by
> > > > > traversing and removing the statistics of the replication slot
> > > > > exceeding the max_replication_slot has its drawback of removing some
> > > > > valid replication slot's statistics data.
> > > > > Any thoughts on how we can identify the replication slot which has been dropped?
> > > > > Can someone point me to the shared stats patch link with which message
> > > > > loss can be avoided. I wanted to see a scenario where something like
> > > > > the slot is dropped but the statistics are not updated because of an
> > > > > immediate shutdown or server going down abruptly can occur or not with
> > > > > the shared stats patch.
> > > > >
> > > >
> > > > I don't think it is easy to simulate a scenario where the 'drop'
> > > > message is dropped and I think that is why the test contains the step
> > > > to manually remove the slot. At this stage, you can probably provide a
> > > > test patch and a code-fix patch where it just drops the extra slots
> > > > from the stats file. That will allow us to test it with a shared
> > > > memory stats patch on which Andres and Horiguchi-San are working. If
> > > > we still continue to pursue with current approach then as Andres
> > > > suggested we might send additional information from
> > > > RestoreSlotFromDisk to keep it in sync.
> > >
> > > Thanks for your comments, Attached patch has the fix for the same.
> > > Also attached a couple of more patches which addresses the comments
> > > which Andres had listed i.e changing char to NameData type and also to
> > > display the unspilled/unstreamed transaction information in the
> > > replication statistics.
> > > Thoughts?
> >
> > Thank you for the patches!
> >
> > I've looked at those patches and here are some comments on 0001, 0002,
> > and 0003 patch:
>
> Thanks for the comments.
>
> > 0001 patch:
> >
> > -       values[0] = PointerGetDatum(cstring_to_text(s->slotname));
> > +       values[0] = PointerGetDatum(cstring_to_text(s->slotname.data));
> >
> > We can use NameGetDatum() instead.
>
> I felt we will not be able to use NameGetDatum because this function
> will not have access to the value throughout the loop and NameGetDatum
> must ensure the pointed-to value has adequate lifetime.
>
> > ---
> > 0002 patch:
> >
> > The patch uses logical replication to test replication slots
> > statistics but I think it's necessarily necessary. It would be more
> > simple to use logical decoding. Maybe we can add TAP tests to
> > contrib/test_decoding.
> >
>
> I will try to change it to test_decoding if feasible and post in the
> next version.
>

I have modified the patch to include TAP tests in contrib/test_decoding.
The attached v3 patch has the changes for the same.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Mar 22, 2021 at 9:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> >
> >
> > - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?
>
> That's because we followed other definitions in pgstat.h that use
> char[NAMEDATALEN]. I'm okay with using NameData.
>

I see that in many places in the code we use char[NAMEDATALEN] for
names. However, for the slot name, we use NameData, see:
typedef struct ReplicationSlotPersistentData
{
/* The slot's identifier */
NameData name;

So, it will be better to use the same for pgstat purposes as well. In
other words, I also agree with this decision and I see that Vignesh
has already used NameData for slot_name in his recent patch.
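
For instance, roughly like this in pgstat.h (just a sketch of the idea;
the exact set of counters is whatever the struct already carries):

typedef struct PgStat_ReplSlotStats
{
    NameData       slotname;    /* instead of char slotname[NAMEDATALEN] */
    PgStat_Counter spill_txns;
    PgStat_Counter spill_count;
    PgStat_Counter spill_bytes;
    PgStat_Counter stream_txns;
    PgStat_Counter stream_count;
    PgStat_Counter stream_bytes;
    TimestampTz    stat_reset_timestamp;
} PgStat_ReplSlotStats;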

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 5, 2021 at 8:51 PM vignesh C <vignesh21@gmail.com> wrote:
>

Few comments on the latest patches:
Comments on 0001
--------------------------------
1.
@@ -659,6 +661,8 @@ ReorderBufferTXNByXid(ReorderBuffer *rb,
TransactionId xid, bool create,
  dlist_push_tail(&rb->toplevel_by_lsn, &txn->node);
  AssertTXNLsnOrder(rb);
  }
+
+ rb->totalTxns++;
  }
  else
  txn = NULL; /* not found and not asked to create */
@@ -3078,6 +3082,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
  {
  txn->size += sz;
  rb->size += sz;
+ rb->totalBytes += sz;

I think this will include the txns that are aborted and for which we
don't send anything. It might be better to update these stats in
ReorderBufferProcessTXN or ReorderBufferReplay, where we are sure we
have sent the data. We can probably use size/total_size in txn. We
need to be careful not to double-count totalTxns or totalBytes for
streaming xacts, as we might process the same txn multiple times.
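
As a very rough sketch of that direction (the placement and the
condition here are assumptions on my part, not a worked-out fix):

/* in ReorderBufferProcessTXN(), once the transaction's changes have
 * actually been passed to the output plugin */
if (!rbtxn_is_streamed(txn))
{
    rb->totalTxns++;
    rb->totalBytes += txn->total_size;
}
/* streamed transactions would instead be accounted for where each
 * stream is sent, so the same txn is not counted repeatedly */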

2.
+        Amount of decoded transactions data sent to the decoding output plugin
+        while decoding the changes from WAL for this slot. This and total_txns
+        for this slot can be used to gauge the total amount of data during
+        logical decoding.

I think we can slightly modify the second line here: "This can be used
to gauge the total amount of data sent during logical decoding." Why do
we need to include total_txns along with it?

0002
----------
3.
+  -- we don't want to wait forever; loop will exit after 30 seconds
+  FOR i IN 1 .. 5 LOOP
+
...
...
+
+    -- wait a little
+    perform pg_sleep_for('100 milliseconds');

I think this loop needs to be executed 300 times instead of 5 times,
if the above comment and code are to do what is expected here.


4.
+# Test to drop one of the subscribers and verify replication statistics data is
+# fine after publisher is restarted.
+$node->safe_psql('postgres', "SELECT
pg_drop_replication_slot('regression_slot4')");
+
+$node->stop;
+$node->start;
+
+# Verify statistics data present in pg_stat_replication_slots are sane after
+# publisher is restarted
+$result = $node->safe_psql('postgres',
+ "SELECT slot_name, total_txns > 0 AS total_txn, total_bytes > 0 AS total_bytes
+ FROM pg_stat_replication_slots ORDER BY slot_name"

Various comments in 0002 refer to publisher/subscriber, which is not
what we are using here.

5.
+# Create table.
+$node->safe_psql('postgres',
+        "CREATE TABLE test_repl_stat(col1 int)");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");

I think we can save the above calls to pg_logical_slot_get_changes if
we create the table before creating the slots in this test.

0003
---------
6. In the tests/code, "publisher" is used in multiple places. I think
that is not required because this can happen via a plugin as well.
7.
+ if (max_replication_slots == nReplSlotStats)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("skipping \"%s\" replication slot statistics as
pg_stat_replication_slots does not have enough slots",
+ NameStr(replSlotStats[nReplSlotStats].slotname))));
+ memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));

Do we need the memset here? Isn't this location past the max location?


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 6, 2021 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 5, 2021 at 8:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on the latest patches:
> Comments on 0001
> --------------------------------
> 1.
> @@ -659,6 +661,8 @@ ReorderBufferTXNByXid(ReorderBuffer *rb,
> TransactionId xid, bool create,
>   dlist_push_tail(&rb->toplevel_by_lsn, &txn->node);
>   AssertTXNLsnOrder(rb);
>   }
> +
> + rb->totalTxns++;
>   }
>   else
>   txn = NULL; /* not found and not asked to create */
> @@ -3078,6 +3082,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
>   {
>   txn->size += sz;
>   rb->size += sz;
> + rb->totalBytes += sz;
>
> I think this will include the txns that are aborted and for which we
> don't send anything. It might be better to update these stats in
> ReorderBufferProcessTXN or ReorderBufferReplay where we are sure we
> have sent the data. We can probably use size/total_size in txn. We
> need to be careful to not double include the totaltxn or totalBytes
> for streaming xacts as we might process the same txn multiple times.
>
> 2.
> +        Amount of decoded transactions data sent to the decoding output plugin
> +        while decoding the changes from WAL for this slot. This and total_txns
> +        for this slot can be used to gauge the total amount of data during
> +        logical decoding.
>
> I think we can slightly modify the second line here: "This can be used
> to gauge the total amount of data sent during logical decoding.". Why
> we need to include total_txns along with it.
>
> 0002
> ----------
> 3.
> +  -- we don't want to wait forever; loop will exit after 30 seconds
> +  FOR i IN 1 .. 5 LOOP
> +
> ...
> ...
> +
> +    -- wait a little
> +    perform pg_sleep_for('100 milliseconds');
>
> I think this loop needs to be executed 300 times instead of 5 times,
> if the above comments and code needs to do what is expected here?
>
>
> 4.
> +# Test to drop one of the subscribers and verify replication statistics data is
> +# fine after publisher is restarted.
> +$node->safe_psql('postgres', "SELECT
> pg_drop_replication_slot('regression_slot4')");
> +
> +$node->stop;
> +$node->start;
> +
> +# Verify statistics data present in pg_stat_replication_slots are sane after
> +# publisher is restarted
> +$result = $node->safe_psql('postgres',
> + "SELECT slot_name, total_txns > 0 AS total_txn, total_bytes > 0 AS total_bytes
> + FROM pg_stat_replication_slots ORDER BY slot_name"
>
> Various comments in the 0002 refer to publisher/subscriber which is
> not what we are using here.
>
> 5.
> +# Create table.
> +$node->safe_psql('postgres',
> +        "CREATE TABLE test_repl_stat(col1 int)");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
>
> I think we can save the above calls to pg_logical_slot_get_changes if
> we create table before creating the slots in this test.
>
> 0003
> ---------
> 6. In the tests/code, publisher is used at multiple places. I think
> that is not required because this can happen via plugin as well.
> 7.
> + if (max_replication_slots == nReplSlotStats)
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots",
> + NameStr(replSlotStats[nReplSlotStats].slotname))));
> + memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
>
> Do we need memset here? Isn't this location is past the max location?

Thanks for the comments; I will fix them and post a patch for this soon.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 6, 2021 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 5, 2021 at 8:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on the latest patches:
> Comments on 0001
> --------------------------------
> 1.
> @@ -659,6 +661,8 @@ ReorderBufferTXNByXid(ReorderBuffer *rb,
> TransactionId xid, bool create,
>   dlist_push_tail(&rb->toplevel_by_lsn, &txn->node);
>   AssertTXNLsnOrder(rb);
>   }
> +
> + rb->totalTxns++;
>   }
>   else
>   txn = NULL; /* not found and not asked to create */
> @@ -3078,6 +3082,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
>   {
>   txn->size += sz;
>   rb->size += sz;
> + rb->totalBytes += sz;
>
> I think this will include the txns that are aborted and for which we
> don't send anything. It might be better to update these stats in
> ReorderBufferProcessTXN or ReorderBufferReplay where we are sure we
> have sent the data. We can probably use size/total_size in txn. We
> need to be careful to not double include the totaltxn or totalBytes
> for streaming xacts as we might process the same txn multiple times.

Modified it to update total_bytes for spilled and streamed
transactions where spill_bytes and stream_bytes are updated. For
non-spilled/non-streamed transactions, total_bytes is updated in
ReorderBufferProcessTXN.

> 2.
> +        Amount of decoded transactions data sent to the decoding output plugin
> +        while decoding the changes from WAL for this slot. This and total_txns
> +        for this slot can be used to gauge the total amount of data during
> +        logical decoding.
>
> I think we can slightly modify the second line here: "This can be used
> to gauge the total amount of data sent during logical decoding.". Why
> we need to include total_txns along with it.

Modified it.

> 0002
> ----------
> 3.
> +  -- we don't want to wait forever; loop will exit after 30 seconds
> +  FOR i IN 1 .. 5 LOOP
> +
> ...
> ...
> +
> +    -- wait a little
> +    perform pg_sleep_for('100 milliseconds');
>
> I think this loop needs to be executed 300 times instead of 5 times,
> if the above comments and code needs to do what is expected here?
>

Modified it.

> 4.
> +# Test to drop one of the subscribers and verify replication statistics data is
> +# fine after publisher is restarted.
> +$node->safe_psql('postgres', "SELECT
> pg_drop_replication_slot('regression_slot4')");
> +
> +$node->stop;
> +$node->start;
> +
> +# Verify statistics data present in pg_stat_replication_slots are sane after
> +# publisher is restarted
> +$result = $node->safe_psql('postgres',
> + "SELECT slot_name, total_txns > 0 AS total_txn, total_bytes > 0 AS total_bytes
> + FROM pg_stat_replication_slots ORDER BY slot_name"
>
> Various comments in the 0002 refer to publisher/subscriber which is
> not what we are using here.

Removed references to publisher/subscriber.

> 5.
> +# Create table.
> +$node->safe_psql('postgres',
> +        "CREATE TABLE test_repl_stat(col1 int)");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
>
> I think we can save the above calls to pg_logical_slot_get_changes if
> we create table before creating the slots in this test.
>

Modified it.

> 0003
> ---------
> 6. In the tests/code, publisher is used at multiple places. I think
> that is not required because this can happen via plugin as well.

Removed references to publisher.

> 7.
> + if (max_replication_slots == nReplSlotStats)
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots",
> + NameStr(replSlotStats[nReplSlotStats].slotname))));
> + memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
>
> Do we need memset here? Isn't this location is past the max location?

That is not required; I have modified it.
The attached v4 patch has the fixes for the same.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
>

@@ -4069,6 +4069,24 @@ pgstat_read_statsfiles(Oid onlydb, bool
permanent, bool deep)
  * slot follows.
  */
  case 'R':
+ /*
+ * There is a remote scenario where one of the replication slots
+ * is dropped and the drop slot statistics message is not
+ * received by the statistic collector process, now if the
+ * max_replication_slots is reduced to the actual number of
+ * replication slots that are in use and the server is
+ * re-started then the statistics process will not be aware of
+ * this. To avoid writing beyond the max_replication_slots
+ * this replication slot statistic information will be skipped.
+ */
+ if (max_replication_slots == nReplSlotStats)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("skipping \"%s\" replication slot statistics as
pg_stat_replication_slots does not have enough slots",
+ NameStr(replSlotStats[nReplSlotStats].slotname))));
+ goto done;
+ }

I think we might truncate some valid slots here. I have another idea
to fix this case: while writing, we first write 'nReplSlotStats' and
then write each slot's info. Then, while reading, we can allocate
memory based on the required number of slots. Later, when the startup
process sends the slots, we can remove the already-dropped slots from
this array. What do you think?
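
Very roughly, the idea is something like the following (a sketch only;
error handling and memory-context details are omitted, and the exact
record layout is an assumption):

/* write side: store the count first, then the entries */
fwrite(&nReplSlotStats, sizeof(int), 1, fpout);
fwrite(replSlotStats, sizeof(PgStat_ReplSlotStats), nReplSlotStats, fpout);

/* read side: size the array from the stored count */
int     n;
fread(&n, sizeof(int), 1, fpin);
replSlotStats = palloc0(n * sizeof(PgStat_ReplSlotStats));
fread(replSlotStats, sizeof(PgStat_ReplSlotStats), n, fpin);
nReplSlotStats = n;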

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, Apr 8, 2021 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> @@ -4069,6 +4069,24 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
>   * slot follows.
>   */
>   case 'R':
> + /*
> + * There is a remote scenario where one of the replication slots
> + * is dropped and the drop slot statistics message is not
> + * received by the statistic collector process, now if the
> + * max_replication_slots is reduced to the actual number of
> + * replication slots that are in use and the server is
> + * re-started then the statistics process will not be aware of
> + * this. To avoid writing beyond the max_replication_slots
> + * this replication slot statistic information will be skipped.
> + */
> + if (max_replication_slots == nReplSlotStats)
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots",
> + NameStr(replSlotStats[nReplSlotStats].slotname))));
> + goto done;
> + }
>
> I think we might truncate some valid slots here. I have another idea
> to fix this case which is that while writing, we first write the
> 'nReplSlotStats' and then write each slot info. Then while reading we
> can allocate memory based on the required number of slots. Later when
> startup process sends the slots, we can remove the already dropped
> slots from this array. What do you think?

I felt this idea is better; with the earlier approach we might end up
deleting some valid replication slot statistics, and that slot's
statistics would never be available to the user.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
>
> That is not required, I have modified it.
> Attached v4 patch has the fixes for the same.
>

Few comments:

0001
------
1. The first patch includes changing the char datatype to NameData for
slotname. I feel this can be a separate patch from the one adding new
stats to the view. I think we can also move the change that passes the
stats as a structure, rather than sending them individually, into the
same patch.

2.
@@ -2051,6 +2054,17 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn,
  rb->begin(rb, txn);
  }

+ /*
+ * Update total transaction count and total transaction bytes, if
+ * transaction is streamed or spilled it will be updated while the
+ * transaction gets spilled or streamed.
+ */
+ if (!rb->streamBytes && !rb->spillBytes)
+ {
+ rb->totalTxns++;
+ rb->totalBytes += rb->size;
+ }

I think this will skip a transaction if it is interleaved with a
streaming transaction. Assume two transactions, t1 and t2: t1 sends
changes in multiple streams and t2 sends all changes in one go at
commit time. Now, if t2 is interleaved between multiple streams of t1,
then I think the above won't count t2.

3.
@@ -3524,9 +3538,11 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn)
  {
  rb->spillCount += 1;
  rb->spillBytes += size;
+ rb->totalBytes += size;

  /* don't consider already serialized transactions */
  rb->spillTxns += (rbtxn_is_serialized(txn) ||
rbtxn_is_serialized_clear(txn)) ? 0 : 1;
+ rb->totalTxns += (rbtxn_is_serialized(txn) ||
rbtxn_is_serialized_clear(txn)) ? 0 : 1;
  }

We do serialize each subtransaction separately. So totalTxns will
include subtransaction count as well when serialized, otherwise not.
The description of totalTxns also says that it doesn't include
subtransactions. So, I think updating rb->totalTxns here is wrong.

0002
-----
1.
+$node->safe_psql('postgres',
+ "SELECT data FROM pg_logical_slot_get_changes('regression_slot2',
NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");

The indentation of the second SELECT seems to be a bit off.


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 8, 2021 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> @@ -4069,6 +4069,24 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
>   * slot follows.
>   */
>   case 'R':
> + /*
> + * There is a remote scenario where one of the replication slots
> + * is dropped and the drop slot statistics message is not
> + * received by the statistic collector process, now if the
> + * max_replication_slots is reduced to the actual number of
> + * replication slots that are in use and the server is
> + * re-started then the statistics process will not be aware of
> + * this. To avoid writing beyond the max_replication_slots
> + * this replication slot statistic information will be skipped.
> + */
> + if (max_replication_slots == nReplSlotStats)
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots",
> + NameStr(replSlotStats[nReplSlotStats].slotname))));
> + goto done;
> + }
>
> I think we might truncate some valid slots here. I have another idea
> to fix this case which is that while writing, we first write the
> 'nReplSlotStats' and then write each slot info. Then while reading we
> can allocate memory based on the required number of slots. Later when
> startup process sends the slots, we can remove the already dropped
> slots from this array. What do you think?

IIUC there are two problems in the case where the drop message is lost:

1. Writing beyond the end of replSlotStats.
This can happen if, after a restart, the number of slots whose stats
are stored in the stats file exceeds max_replication_slots. Vignesh's
patch addresses this problem.

2. The stats for the new slot are not recorded.
If the stats for already-dropped slots remain in replSlotStats, the
stats for a new slot cannot be registered because replSlotStats is
full. This can happen even when, after a restart, the number of slots
whose stats are stored in the stats file does NOT exceed
max_replication_slots, and it can also happen while the server is
running. The patch doesn't address this problem. (If this happens, we
will have to reset all slot stats, since
pg_stat_reset_replication_slot() cannot remove the slot stats for a
name that no longer exists.)

I think we can use HTAB to store slot stats and have
pg_stat_get_replication_slot() inquire about stats by the slot name,
resolving both problems. By using HTAB we're no longer concerned about
the problem of writing stats beyond the end of the replSlotStats
array. Instead, we have to consider how and when to clean up the stats
for already-dropped slots. We can have the startup process send slot
names at startup time, which borrows the idea proposed by Amit. But
maybe we need to consider the case again where the message from the
startup process is lost? Another idea would be to have
pgstat_vacuum_stat() check the existing slots and call
pgstat_report_replslot_drop() if the slot in the stats file doesn't
exist. That way, we can continuously check the stats for
already-dropped slots.
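
A rough sketch of how that could look (not the PoC patch itself: the
entry layout and the slot_still_exists() helper are assumptions, while
hash_create()/hash_search()/hash_seq_search() are the existing dynahash
routines and pgstat_report_replslot_drop() is the function mentioned
above):

#include "postgres.h"
#include "pgstat.h"
#include "utils/hsearch.h"

/* Illustrative stats entry keyed by slot name (not the PoC's actual struct). */
typedef struct ReplSlotStatEntry
{
    NameData    slotname;       /* hash key */
    PgStat_Counter total_txns;
    PgStat_Counter total_bytes;
    /* ... remaining counters ... */
} ReplSlotStatEntry;

/* Assumed helper: does a slot with this name currently exist? */
extern bool slot_still_exists(const char *name);

static HTAB *replSlotStatHash = NULL;

/* Collector side: look up (or create) the stats entry for a slot by name. */
static ReplSlotStatEntry *
get_replslot_entry(const char *name)
{
    NameData    key;
    bool        found;

    if (replSlotStatHash == NULL)
    {
        HASHCTL     ctl;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(NameData);
        ctl.entrysize = sizeof(ReplSlotStatEntry);
        replSlotStatHash = hash_create("replication slot statistics",
                                       32, &ctl, HASH_ELEM | HASH_BLOBS);
    }

    namestrcpy(&key, name);
    return (ReplSlotStatEntry *) hash_search(replSlotStatHash, &key,
                                             HASH_ENTER, &found);
}

/*
 * Cleanup in the style of pgstat_vacuum_stat(): for every slot that has a
 * stats entry but no longer exists, ask the collector to drop it.  In the
 * real design this would walk the backend's snapshot of the stats, since
 * pgstat_vacuum_stat() does not run inside the collector;
 * slot_still_exists() stands in for a lookup against
 * ReplicationSlotCtl->replication_slots.
 */
static void
vacuum_replslot_stats(void)
{
    HASH_SEQ_STATUS seq;
    ReplSlotStatEntry *entry;

    hash_seq_init(&seq, replSlotStatHash);
    while ((entry = (ReplSlotStatEntry *) hash_seq_search(&seq)) != NULL)
    {
        if (!slot_still_exists(NameStr(entry->slotname)))
            pgstat_report_replslot_drop(NameStr(entry->slotname));
    }
}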

I've written a PoC patch for the above idea: using HTAB and cleaning
up slot stats at pgstat_vacuum_stat(). The patch can be applied on top
of the 0001 patch Vignesh proposed before[1].

Please note that this cannot resolve the problem of accumulating the
stats into the old slot's entry if a slot is re-created with the same
name and the drop message is lost. To deal with this problem I think we
would need to use some unique identifier for each slot instead of the
slot name.

[1] https://www.postgresql.org/message-id/CALDaNm195xL1bZq4VHKt%3D-wmXJ5kC4jxKh7LXK%2BpN7ESFjHO%2Bw%40mail.gmail.com

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Apr 9, 2021 at 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 2.
> @@ -2051,6 +2054,17 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn,
>   rb->begin(rb, txn);
>   }
>
> + /*
> + * Update total transaction count and total transaction bytes, if
> + * transaction is streamed or spilled it will be updated while the
> + * transaction gets spilled or streamed.
> + */
> + if (!rb->streamBytes && !rb->spillBytes)
> + {
> + rb->totalTxns++;
> + rb->totalBytes += rb->size;
> + }
>
> I think this will skip a transaction if it is interleaved between a
> streaming transaction. Assume, two transactions t1 and t2. t1 sends
> changes in multiple streams and t2 sends all changes in one go at
> commit time. So, now, if t2 is interleaved between multiple streams
> then I think the above won't count t2.
>
> 3.
> @@ -3524,9 +3538,11 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn)
>   {
>   rb->spillCount += 1;
>   rb->spillBytes += size;
> + rb->totalBytes += size;
>
>   /* don't consider already serialized transactions */
>   rb->spillTxns += (rbtxn_is_serialized(txn) ||
> rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> + rb->totalTxns += (rbtxn_is_serialized(txn) ||
> rbtxn_is_serialized_clear(txn)) ? 0 : 1;
>   }
>
> We do serialize each subtransaction separately. So totalTxns will
> include subtransaction count as well when serialized, otherwise not.
> The description of totalTxns also says that it doesn't include
> subtransactions. So, I think updating rb->totalTxns here is wrong.
>

The attached patch should fix the above two comments. I think it
should be sufficient if we just update the stats after processing the
TXN. We need to ensure that we don't count streamed transactions
multiple times. I have not tested the attached patch; can you please
review/test it and include it in the next set of patches if you agree
with this change?
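
For reference, the shape of the change is roughly the following (a
sketch of the idea only, not the attached patch; the helper name is
invented, while rbtxn_is_streamed() is the existing marker for a
transaction that has already been streamed at least once):

#include "postgres.h"
#include "replication/reorderbuffer.h"

/*
 * Called once after ReorderBufferProcessTXN() has replayed the transaction
 * (or one stream of it), instead of bumping the counters in the spill or
 * stream paths.
 */
static void
ReorderBufferUpdateTxnStats(ReorderBuffer *rb, ReorderBufferTXN *txn,
                            Size size, bool streaming)
{
    /*
     * Count the transaction itself only once: a streamed transaction is
     * counted only on its first stream, i.e. before it has been marked as
     * streamed.  Serialized subtransactions are not counted at all, which
     * matches the documented meaning of total_txns.
     */
    if (!streaming || !rbtxn_is_streamed(txn))
        rb->totalTxns++;

    /* The decoded bytes are accumulated on every pass. */
    rb->totalBytes += size;
}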

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Apr 10, 2021 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 9, 2021 at 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 2.
> > @@ -2051,6 +2054,17 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
> > ReorderBufferTXN *txn,
> >   rb->begin(rb, txn);
> >   }
> >
> > + /*
> > + * Update total transaction count and total transaction bytes, if
> > + * transaction is streamed or spilled it will be updated while the
> > + * transaction gets spilled or streamed.
> > + */
> > + if (!rb->streamBytes && !rb->spillBytes)
> > + {
> > + rb->totalTxns++;
> > + rb->totalBytes += rb->size;
> > + }
> >
> > I think this will skip a transaction if it is interleaved between a
> > streaming transaction. Assume, two transactions t1 and t2. t1 sends
> > changes in multiple streams and t2 sends all changes in one go at
> > commit time. So, now, if t2 is interleaved between multiple streams
> > then I think the above won't count t2.
> >
> > 3.
> > @@ -3524,9 +3538,11 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> > ReorderBufferTXN *txn)
> >   {
> >   rb->spillCount += 1;
> >   rb->spillBytes += size;
> > + rb->totalBytes += size;
> >
> >   /* don't consider already serialized transactions */
> >   rb->spillTxns += (rbtxn_is_serialized(txn) ||
> > rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> > + rb->totalTxns += (rbtxn_is_serialized(txn) ||
> > rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> >   }
> >
> > We do serialize each subtransaction separately. So totalTxns will
> > include subtransaction count as well when serialized, otherwise not.
> > The description of totalTxns also says that it doesn't include
> > subtransactions. So, I think updating rb->totalTxns here is wrong.
> >
>
> The attached patch should fix the above two comments. I think it
> should be sufficient if we just update the stats after processing the
> TXN. We need to ensure that don't count streamed transactions multiple
> times. I have not tested the attached, can you please review/test it
> and include it in the next set of patches if you agree with this
> change.
>

oops, forgot to attach. Attaching now.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sat, Apr 10, 2021 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 9, 2021 at 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 2.
> > @@ -2051,6 +2054,17 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
> > ReorderBufferTXN *txn,
> >   rb->begin(rb, txn);
> >   }
> >
> > + /*
> > + * Update total transaction count and total transaction bytes, if
> > + * transaction is streamed or spilled it will be updated while the
> > + * transaction gets spilled or streamed.
> > + */
> > + if (!rb->streamBytes && !rb->spillBytes)
> > + {
> > + rb->totalTxns++;
> > + rb->totalBytes += rb->size;
> > + }
> >
> > I think this will skip a transaction if it is interleaved between a
> > streaming transaction. Assume, two transactions t1 and t2. t1 sends
> > changes in multiple streams and t2 sends all changes in one go at
> > commit time. So, now, if t2 is interleaved between multiple streams
> > then I think the above won't count t2.
> >
> > 3.
> > @@ -3524,9 +3538,11 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> > ReorderBufferTXN *txn)
> >   {
> >   rb->spillCount += 1;
> >   rb->spillBytes += size;
> > + rb->totalBytes += size;
> >
> >   /* don't consider already serialized transactions */
> >   rb->spillTxns += (rbtxn_is_serialized(txn) ||
> > rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> > + rb->totalTxns += (rbtxn_is_serialized(txn) ||
> > rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> >   }
> >
> > We do serialize each subtransaction separately. So totalTxns will
> > include subtransaction count as well when serialized, otherwise not.
> > The description of totalTxns also says that it doesn't include
> > subtransactions. So, I think updating rb->totalTxns here is wrong.
> >
>
> The attached patch should fix the above two comments. I think it
> should be sufficient if we just update the stats after processing the
> TXN. We need to ensure that don't count streamed transactions multiple
> times. I have not tested the attached, can you please review/test it
> and include it in the next set of patches if you agree with this
> change.

Thanks Amit for your Patch. I have merged your changes into my
patchset. I did not find any issues in my testing.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Apr 10, 2021 at 7:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> IIUC there are two problems in the case where the drop message is lost:
>
> 1. Writing beyond the end of replSlotStats.
> This can happen if after restarting the number of slots whose stats
> are stored in the stats file exceeds max_replication_slots. Vignesh's
> patch addresses this problem.
>
> 2. The stats for the new slot are not recorded.
> If the stats for already-dropped slots remain in replSlotStats, the
> stats for the new slot cannot be registered due to the full of
> replSlotStats. This can happen even when after restarting the number
> of slots whose stats are stored in the stat file does NOT exceed
> max_replication_slots as well as even during the server running. The
> patch doesn’t address this problem. (If this happens, we will have to
> reset all slot stats since pg_stat_reset_replication_slot() cannot
> remove the slot stats with the non-existing name).
>
> I think we can use HTAB to store slot stats and have
> pg_stat_get_replication_slot() inquire about stats by the slot name,
> resolving both problems. By using HTAB we're no longer concerned about
> the problem of writing stats beyond the end of the replSlotStats
> array. Instead, we have to consider how and when to clean up the stats
> for already-dropped slots. We can have the startup process send slot
> names at startup time, which borrows the idea proposed by Amit. But
> maybe we need to consider the case again where the message from the
> startup process is lost? Another idea would be to have
> pgstat_vacuum_stat() check the existing slots and call
> pgstat_report_replslot_drop() if the slot in the stats file doesn't
> exist. That way, we can continuously check the stats for
> already-dropped slots.
>

Agreed, I think checking periodically via pgstat_vacuum_stat is a
better idea than sending once at startup time. I also think using
slot_name is better than using 'idx' (the index in
ReplicationSlotCtl->replication_slots) in this scheme, because even if
'idx' changes after a restart we will still be able to drop the dead
slot.

> I've written a PoC patch for the above idea; using HTAB and cleaning
> up slot stats at pgstat_vacuum_stat(). The patch can be applied on top
> of 0001 patch Vignesh proposed before[1].
>

It seems Vignesh has changed patches based on the latest set of
comments so you might want to rebase.

> Please note that this cannot resolve the problem of ending up
> accumulating the stats to the old slot if the slot is re-created with
> the same name and the drop message is lost. To deal with this problem
> I think we would need to use something unique identifier for each slot
> instead of slot name.
>

Right, we can probably mention this caveat in the comments and/or
docs, and the user can use pg_stat_reset_replication_slot for such
slots.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Apr 10, 2021 at 1:06 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks Amit for your Patch. I have merged your changes into my
> patchset. I did not find any issues in my testing.
> Thoughts?
>

0001
------
  PgStat_Counter m_stream_bytes;
+ PgStat_Counter m_total_txns;
+ PgStat_Counter m_total_bytes;
 } PgStat_MsgReplSlot;

..
..

+ PgStat_Counter total_txns;
+ PgStat_Counter total_bytes;
  TimestampTz stat_reset_timestamp;
 } PgStat_ReplSlotStats;

Doesn't this change belong to the second patch?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sat, Apr 10, 2021 at 6:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Apr 10, 2021 at 1:06 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks Amit for your Patch. I have merged your changes into my
> > patchset. I did not find any issues in my testing.
> > Thoughts?
> >
>
> 0001
> ------
>   PgStat_Counter m_stream_bytes;
> + PgStat_Counter m_total_txns;
> + PgStat_Counter m_total_bytes;
>  } PgStat_MsgReplSlot;
>
> ..
> ..
>
> + PgStat_Counter total_txns;
> + PgStat_Counter total_bytes;
>   TimestampTz stat_reset_timestamp;
>  } PgStat_ReplSlotStats;
>
> Doesn't this change belong to the second patch?

I missed it while splitting the patches; it is fixed in the attached patch.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:

Thanks for the comments.

On Fri, Apr 9, 2021 at 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > That is not required, I have modified it.
> > Attached v4 patch has the fixes for the same.
> >
>
> Few comments:
>
> 0001
> ------
> 1. The first patch includes changing char datatype to NameData
> datatype for slotname. I feel this can be a separate patch from adding
> new stats in the view. I think we can also move the change related to
> moving stats to a structure rather than sending them individually in
> the same patch.

I have split the patch as suggested.

> 2.
> @@ -2051,6 +2054,17 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn,
>   rb->begin(rb, txn);
>   }
>
> + /*
> + * Update total transaction count and total transaction bytes, if
> + * transaction is streamed or spilled it will be updated while the
> + * transaction gets spilled or streamed.
> + */
> + if (!rb->streamBytes && !rb->spillBytes)
> + {
> + rb->totalTxns++;
> + rb->totalBytes += rb->size;
> + }
>
> I think this will skip a transaction if it is interleaved between a
> streaming transaction. Assume, two transactions t1 and t2. t1 sends
> changes in multiple streams and t2 sends all changes in one go at
> commit time. So, now, if t2 is interleaved between multiple streams
> then I think the above won't count t2.
>

Modified it.

> 3.
> @@ -3524,9 +3538,11 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn)
>   {
>   rb->spillCount += 1;
>   rb->spillBytes += size;
> + rb->totalBytes += size;
>
>   /* don't consider already serialized transactions */
>   rb->spillTxns += (rbtxn_is_serialized(txn) ||
> rbtxn_is_serialized_clear(txn)) ? 0 : 1;
> + rb->totalTxns += (rbtxn_is_serialized(txn) ||
> rbtxn_is_serialized_clear(txn)) ? 0 : 1;
>   }
>
> We do serialize each subtransaction separately. So totalTxns will
> include subtransaction count as well when serialized, otherwise not.
> The description of totalTxns also says that it doesn't include
> subtransactions. So, I think updating rb->totalTxns here is wrong.
>

Modified it.

> 0002
> -----
> 1.
> +$node->safe_psql('postgres',
> + "SELECT data FROM pg_logical_slot_get_changes('regression_slot2',
> NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
>
> The indentation of the second SELECT seems to bit off.

Modified it.
These comments are fixed in the patch available at [1].

[1] - https://www.postgresql.org/message-id/CALDaNm1A%3DbjSrQjBNwNsOtTig%2B6pZpunmAj_P7Au0H0XjtvCyA%40mail.gmail.com

Regards,
Vignesh

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Apr 10, 2021 at 7:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > IIUC there are two problems in the case where the drop message is lost:
> >
> > 1. Writing beyond the end of replSlotStats.
> > This can happen if after restarting the number of slots whose stats
> > are stored in the stats file exceeds max_replication_slots. Vignesh's
> > patch addresses this problem.
> >
> > 2. The stats for the new slot are not recorded.
> > If the stats for already-dropped slots remain in replSlotStats, the
> > stats for the new slot cannot be registered due to the full of
> > replSlotStats. This can happen even when after restarting the number
> > of slots whose stats are stored in the stat file does NOT exceed
> > max_replication_slots as well as even during the server running. The
> > patch doesn’t address this problem. (If this happens, we will have to
> > reset all slot stats since pg_stat_reset_replication_slot() cannot
> > remove the slot stats with the non-existing name).
> >
> > I think we can use HTAB to store slot stats and have
> > pg_stat_get_replication_slot() inquire about stats by the slot name,
> > resolving both problems. By using HTAB we're no longer concerned about
> > the problem of writing stats beyond the end of the replSlotStats
> > array. Instead, we have to consider how and when to clean up the stats
> > for already-dropped slots. We can have the startup process send slot
> > names at startup time, which borrows the idea proposed by Amit. But
> > maybe we need to consider the case again where the message from the
> > startup process is lost? Another idea would be to have
> > pgstat_vacuum_stat() check the existing slots and call
> > pgstat_report_replslot_drop() if the slot in the stats file doesn't
> > exist. That way, we can continuously check the stats for
> > already-dropped slots.
> >

Thanks for your comments.

>
> Agreed, I think checking periodically via pgstat_vacuum_stat is a
> better idea then sending once at start up time. I also think using
> slot_name is better than using 'idx' (index in
> ReplicationSlotCtl->replication_slots) in this scheme because even
> after startup 'idx' changes we will be able to drop the dead slot.
>
> > I've written a PoC patch for the above idea; using HTAB and cleaning
> > up slot stats at pgstat_vacuum_stat(). The patch can be applied on top
> > of 0001 patch Vignesh proposed before[1].
> >
>
> It seems Vignesh has changed patches based on the latest set of
> comments so you might want to rebase.

I've merged my patch into the v6 patch set Vignesh submitted.

I've attached the updated version of the patches. I didn't change
anything in the patch that changes char[NAMEDATALEN] to NameData (the
0001 patch) or in the patches that add tests. In the 0003 patch I
reordered the output parameters of pg_stat_replication_slots; showing
the total number of transactions and total bytes followed by the
statistics for spilled and streamed transactions seems appropriate to
me. Since my patch resolves the issue of writing stats beyond the end
of the array, I've removed the patch that writes the number of stats
into the stats file
(v6-0004-Handle-overwriting-of-replication-slot-statistic-.patch).

Apart from the above updates, the
contrib/test_decoding/001_repl_stats.pl test adds a
wait_for_decode_stats() function, but I think we can use
poll_query_until() instead. Also, I think we can merge the 0004 and
0005 patches.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > It seems Vignesh has changed patches based on the latest set of
> > comments so you might want to rebase.
>
> I've merged my patch into the v6 patch set Vignesh submitted.
>
> I've attached the updated version of the patches. I didn't change
> anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> patch) and patches that add tests.
>

I think we can push 0001. What do you think?

> In 0003 patch I reordered the
> output parameters of pg_stat_replication_slots; showing total number
> of transactions and total bytes followed by statistics for spilled and
> streamed transactions seems appropriate to me.
>

I am not sure about this, because I think we might want to add some
info about stream/spill bytes to the total_bytes description (something
like: stream/spill bytes are not in addition to total_bytes). So
probably keeping these new counters at the end makes more sense to me.

> Since my patch resolved
> the issue of writing stats beyond the end of the array, I've removed
> the patch that writes the number of stats into the stats file
> (v6-0004-Handle-overwriting-of-replication-slot-statistic-.patch).
>

Okay, but I think it might be better to keep 0001, 0002, 0003 as
Vignesh had them, because those are agreed-upon changes and are
straightforward. We can push those and then further review the HTAB
implementation and also see if Andres has any suggestions on the same.

> Apart from the above updates, the
> contrib/test_decoding/001_repl_stats.pl add wait_for_decode_stats()
> function during testing but I think we can use poll_query_until()
> instead.

+1. Can you please change it in the next version?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sat, Mar 20, 2021 at 9:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > And then more generally about the feature:
> > - If a slot was used to stream out a large amount of changes (say an
> >   initial data load), but then replication is interrupted before the
> >   transaction is committed/aborted, stream_bytes will not reflect the
> >   many gigabytes of data we may have sent.
> >
>
> We can probably update the stats each time we spilled or streamed the
> transaction data but it was not clear at that stage whether or how
> much it will be useful.
>

I feel we can update the replication slot statistics each time we
spill/stream the transaction data, instead of accumulating the
statistics and updating them at the end. I have tried this in the
attached patch and the statistics are updated as expected.
Thoughts?
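
Roughly, the change reports the counters at the point where the data
leaves memory rather than only at commit/abort, along these lines (a
sketch only; the function name is invented and the
pgstat_report_replslot() call shape follows the existing code, so the
attached patch's exact reporting call may differ):

#include "postgres.h"
#include "pgstat.h"
#include "replication/logical.h"
#include "replication/slot.h"

/*
 * Sketch: flush the reorder buffer counters to the stats collector right
 * after a transaction has been spilled or streamed.  Reporting here means
 * that a large transaction whose replication is interrupted before commit
 * (e.g. an initial data load) is still visible in
 * pg_stat_replication_slots.
 */
static void
ReportStatsAfterSpillOrStream(LogicalDecodingContext *ctx)
{
    ReorderBuffer *rb = ctx->reorder;

    pgstat_report_replslot(NameStr(ctx->slot->data.name),
                           rb->spillTxns, rb->spillCount, rb->spillBytes,
                           rb->streamTxns, rb->streamCount, rb->streamBytes);

    /* The counters are deltas, so reset them once they have been reported. */
    rb->spillTxns = rb->spillCount = rb->spillBytes = 0;
    rb->streamTxns = rb->streamCount = rb->streamBytes = 0;
}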

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > It seems Vignesh has changed patches based on the latest set of
> > > comments so you might want to rebase.
> >
> > I've merged my patch into the v6 patch set Vignesh submitted.
> >
> > I've attached the updated version of the patches. I didn't change
> > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > patch) and patches that add tests.
> >
>
> I think we can push 0001. What do you think?

+1

>
> > In 0003 patch I reordered the
> > output parameters of pg_stat_replication_slots; showing total number
> > of transactions and total bytes followed by statistics for spilled and
> > streamed transactions seems appropriate to me.
> >
>
> I am not sure about this because I think we might want to add some
> info of stream/spill bytes in total_bytes description (something like
> stream/spill bytes are not in addition to total_bytes).

Okay.

> So probably
> keeping these new counters at the end makes more sense to me.

But I think all of those counters are new for users, since the
pg_stat_replication_slots view will be introduced in PG14, no?

>
> > Since my patch resolved
> > the issue of writing stats beyond the end of the array, I've removed
> > the patch that writes the number of stats into the stats file
> > (v6-0004-Handle-overwriting-of-replication-slot-statistic-.patch).
> >
>
> Okay, but I think it might be better to keep 0001, 0002, 0003 as
> Vignesh had because those are agreed upon changes and are
> straightforward. We can push those and then further review HTAB
> implementation and also see if Andres has any suggestions on the same.

Makes sense. Maybe I should have written my patch as 0004 (i.e.,
applied on top of the patch that adds total_txns and total_bytes).

>
> > Apart from the above updates, the
> > contrib/test_decoding/001_repl_stats.pl add wait_for_decode_stats()
> > function during testing but I think we can use poll_query_until()
> > instead.
>
> +1. Can you please change it in the next version?

Sure, I'll update the patches.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > It seems Vignesh has changed patches based on the latest set of
> > > > comments so you might want to rebase.
> > >
> > > I've merged my patch into the v6 patch set Vignesh submitted.
> > >
> > > I've attached the updated version of the patches. I didn't change
> > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > patch) and patches that add tests.
> > >
> >
> > I think we can push 0001. What do you think?
>
> +1
>
> >
> > > In 0003 patch I reordered the
> > > output parameters of pg_stat_replication_slots; showing total number
> > > of transactions and total bytes followed by statistics for spilled and
> > > streamed transactions seems appropriate to me.
> > >
> >
> > I am not sure about this because I think we might want to add some
> > info of stream/spill bytes in total_bytes description (something like
> > stream/spill bytes are not in addition to total_bytes).
>
> Okay.
>
> > So probably
> > keeping these new counters at the end makes more sense to me.
>
> But I think all of those counters are new for users since
> pg_stat_replication_slots view will be introduced to PG14, no?
>

Right, I was referring to the total_txns and total_bytes attributes. I
think keeping them at the end, after the spill and stream counters,
should be okay.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > It seems Vignesh has changed patches based on the latest set of
> > > > comments so you might want to rebase.
> > >
> > > I've merged my patch into the v6 patch set Vignesh submitted.
> > >
> > > I've attached the updated version of the patches. I didn't change
> > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > patch) and patches that add tests.
> > >
> >
> > I think we can push 0001. What do you think?
>
> +1
>
> >
> > > In 0003 patch I reordered the
> > > output parameters of pg_stat_replication_slots; showing total number
> > > of transactions and total bytes followed by statistics for spilled and
> > > streamed transactions seems appropriate to me.
> > >
> >
> > I am not sure about this because I think we might want to add some
> > info of stream/spill bytes in total_bytes description (something like
> > stream/spill bytes are not in addition to total_bytes).
>
> Okay.
>
> > So probably
> > keeping these new counters at the end makes more sense to me.
>
> But I think all of those counters are new for users since
> pg_stat_replication_slots view will be introduced to PG14, no?
>
> >
> > > Since my patch resolved
> > > the issue of writing stats beyond the end of the array, I've removed
> > > the patch that writes the number of stats into the stats file
> > > (v6-0004-Handle-overwriting-of-replication-slot-statistic-.patch).
> > >
> >
> > Okay, but I think it might be better to keep 0001, 0002, 0003 as
> > Vignesh had because those are agreed upon changes and are
> > straightforward. We can push those and then further review HTAB
> > implementation and also see if Andres has any suggestions on the same.
>
> Makes sense. Maybe it should have written my patch as 0004 (i.g.,
> applied on top of the patch that adds total_txn and tota_bytes).
>
> >
> > > Apart from the above updates, the
> > > contrib/test_decoding/001_repl_stats.pl add wait_for_decode_stats()
> > > function during testing but I think we can use poll_query_until()
> > > instead.
> >
> > +1. Can you please change it in the next version?
>
> Sure, I'll update the patches.

I had started working on the poll_query_until comment; I will test and
post a patch for it shortly.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
>

Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.

<entry role="catalog_table_entry"><para role="column_definition">
+        <structfield>total_bytes</structfield><type>bigint</type>
+       </para>
+       <para>
+        Amount of decoded transactions data sent to the decoding output plugin
+        while decoding the changes from WAL for this slot. This can be used to
+        gauge the total amount of data sent during logical decoding.

Can we slightly extend it to say something like: Note that this
includes the bytes streamed and/or spilled. Similarly, we can extend it
for total_txns.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > comments so you might want to rebase.
> > > >
> > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > >
> > > > I've attached the updated version of the patches. I didn't change
> > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > patch) and patches that add tests.
> > > >
> > >
> > > I think we can push 0001. What do you think?
> >
> > +1
> >
> > >
> > > > In 0003 patch I reordered the
> > > > output parameters of pg_stat_replication_slots; showing total number
> > > > of transactions and total bytes followed by statistics for spilled and
> > > > streamed transactions seems appropriate to me.
> > > >
> > >
> > > I am not sure about this because I think we might want to add some
> > > info of stream/spill bytes in total_bytes description (something like
> > > stream/spill bytes are not in addition to total_bytes).

BTW, doesn't it confuse users that stream/spill bytes are not in
addition to total_bytes? Users will need to do "total_bytes +
spill/stream_bytes" to know the actual total amount of data sent to
the decoding output plugin, is that right? The doc says "Amount of
decoded transactions data sent to the decoding output plugin while
decoding the changes from WAL for this slot", but I think we also send
decoded data that had been spilled to the decoding output plugin.

> >
> > Okay.
> >
> > > So probably
> > > keeping these new counters at the end makes more sense to me.
> >
> > But I think all of those counters are new for users since
> > pg_stat_replication_slots view will be introduced to PG14, no?
> >
>
> Right, I was referring to total_txns and total_bytes attributes. I
> think keeping them at end after spill and stream counters should be
> okay.

Okay, understood.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
>
> <entry role="catalog_table_entry"><para role="column_definition">
> +        <structfield>total_bytes</structfield><type>bigint</type>
> +       </para>
> +       <para>
> +        Amount of decoded transactions data sent to the decoding output plugin
> +        while decoding the changes from WAL for this slot. This can be used to
> +        gauge the total amount of data sent during logical decoding.
>
> Can we slightly extend it to say something like: Note that this
> includes the bytes streamed and or spilled. Similarly, we can extend
> it for total_txns.
>

Thanks for the comments; they are fixed in the attached v8 patch.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > comments so you might want to rebase.
> > > > >
> > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > >
> > > > > I've attached the updated version of the patches. I didn't change
> > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > patch) and patches that add tests.
> > > > >
> > > >
> > > > I think we can push 0001. What do you think?
> > >
> > > +1
> > >
> > > >
> > > > > In 0003 patch I reordered the
> > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > streamed transactions seems appropriate to me.
> > > > >
> > > >
> > > > I am not sure about this because I think we might want to add some
> > > > info of stream/spill bytes in total_bytes description (something like
> > > > stream/spill bytes are not in addition to total_bytes).
>
> BTW doesn't it confuse users that stream/spill bytes are not in
> addition to total_bytes? User will need to do "total_bytes +
> spill/stream_bytes" to know the actual total amount of data sent to
> the decoding output plugin, is that right?
>

No, total_bytes includes the spill/stream bytes. So, the user doesn't
need to do any calculation to compute the total_bytes sent to the
output plugin.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > comments so you might want to rebase.
> > > > > >
> > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > >
> > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > patch) and patches that add tests.
> > > > > >
> > > > >
> > > > > I think we can push 0001. What do you think?
> > > >
> > > > +1
> > > >
> > > > >
> > > > > > In 0003 patch I reordered the
> > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > streamed transactions seems appropriate to me.
> > > > > >
> > > > >
> > > > > I am not sure about this because I think we might want to add some
> > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > stream/spill bytes are not in addition to total_bytes).
> >
> > BTW doesn't it confuse users that stream/spill bytes are not in
> > addition to total_bytes? User will need to do "total_bytes +
> > spill/stream_bytes" to know the actual total amount of data sent to
> > the decoding output plugin, is that right?
> >
>
> No, total_bytes includes the spill/stream bytes. So, the user doesn't
> need to do any calculation to compute totel_bytes sent to output
> plugin.

The following test with the latest v8 patch seems to show otherwise:
total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
something?

postgres(1:85969)=# select pg_create_logical_replication_slot('s',
'test_decoding');
 pg_create_logical_replication_slot
------------------------------------
 (s,0/1884468)
(1 row)

postgres(1:85969)=# create table a (i int);
CREATE TABLE
postgres(1:85969)=# insert into a select generate_series(1, 100000);
INSERT 0 100000
postgres(1:85969)=# set logical_decoding_work_mem to 64;
SET
postgres(1:85969)=# select * from pg_stat_replication_slots ;
 slot_name | total_txns | total_bytes | spill_txns | spill_count |
spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset

-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
 s         |          0 |           0 |          0 |           0 |
      0 |           0 |            0 |            0 |
(1 row)

postgres(1:85969)=# select count(*) from
pg_logical_slot_peek_changes('s', NULL, NULL);
 count
--------
 100004
(1 row)

postgres(1:85969)=# select * from pg_stat_replication_slots ;
 slot_name | total_txns | total_bytes | spill_txns | spill_count |
spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset

-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
 s         |          2 |        1808 |          1 |         202 |
13200000 |           0 |            0 |            0 |
(1 row)

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > > comments so you might want to rebase.
> > > > > > >
> > > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > > >
> > > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > > patch) and patches that add tests.
> > > > > > >
> > > > > >
> > > > > > I think we can push 0001. What do you think?
> > > > >
> > > > > +1
> > > > >
> > > > > >
> > > > > > > In 0003 patch I reordered the
> > > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > > streamed transactions seems appropriate to me.
> > > > > > >
> > > > > >
> > > > > > I am not sure about this because I think we might want to add some
> > > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > > stream/spill bytes are not in addition to total_bytes).
> > >
> > > BTW doesn't it confuse users that stream/spill bytes are not in
> > > addition to total_bytes? User will need to do "total_bytes +
> > > spill/stream_bytes" to know the actual total amount of data sent to
> > > the decoding output plugin, is that right?
> > >
> >
> > No, total_bytes includes the spill/stream bytes. So, the user doesn't
> > need to do any calculation to compute totel_bytes sent to output
> > plugin.
>
> The following test for the latest v8 patch seems to show different.
> total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> something?

I will check this issue and post my analysis.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
> >
> > <entry role="catalog_table_entry"><para role="column_definition">
> > +        <structfield>total_bytes</structfield><type>bigint</type>
> > +       </para>
> > +       <para>
> > +        Amount of decoded transactions data sent to the decoding output plugin
> > +        while decoding the changes from WAL for this slot. This can be used to
> > +        gauge the total amount of data sent during logical decoding.
> >
> > Can we slightly extend it to say something like: Note that this
> > includes the bytes streamed and or spilled. Similarly, we can extend
> > it for total_txns.
> >
>
> Thanks for the comments, the comments are fixed in the v8 patch attached.
> Thoughts?

Here are review comments on new TAP tests:

+# Create replication slots.
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot1',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot2',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot3',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot4',
'test_decoding')");

and

+
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");

I think we can run those similar queries in a single psql connection,
as follows:

 # Create replication slots.
 $node->safe_psql('postgres',
   qq[
SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding');
]);

and

$node->safe_psql('postgres',
   qq[
SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
      ]);

---
+# Wait for the statistics to be updated.
+my $slot1_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;";
+my $slot2_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;";
+my $slot3_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;";
+my $slot4_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;";
+
+# Verify that the statistics have been updated.
+$node->poll_query_until('postgres', $slot1_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot2_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot3_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot4_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";

We can simplify the above code to something like:

$node->poll_query_until(
   'postgres', qq[
SELECT count(slot_name) >= 4
FROM pg_stat_replication_slots
WHERE slot_name ~ 'regression_slot'
    AND total_txns > 0
    AND total_bytes > 0;
]) or die "Timed out while waiting for statistics to be updated";

---
+# Test to remove one of the replication slots and adjust max_replication_slots
+# accordingly to the number of slots and verify replication statistics data is
+# fine after restart.

I think it's better if we explain in detail what cases we're trying to
test. How about the following description?

Test removing one of the replication slots and adjusting
max_replication_slots accordingly to the number of slots. This leads to
a mismatch between the number of slots in the stats file and in shared
memory, simulating the case where the message for dropping a slot got
lost. We verify that the replication statistics data is fine after a
restart.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > > comments so you might want to rebase.
> > > > > > >
> > > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > > >
> > > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > > patch) and patches that add tests.
> > > > > > >
> > > > > >
> > > > > > I think we can push 0001. What do you think?
> > > > >
> > > > > +1
> > > > >
> > > > > >
> > > > > > > In 0003 patch I reordered the
> > > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > > streamed transactions seems appropriate to me.
> > > > > > >
> > > > > >
> > > > > > I am not sure about this because I think we might want to add some
> > > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > > stream/spill bytes are not in addition to total_bytes).
> > >
> > > BTW doesn't it confuse users that stream/spill bytes are not in
> > > addition to total_bytes? User will need to do "total_bytes +
> > > spill/stream_bytes" to know the actual total amount of data sent to
> > > the decoding output plugin, is that right?
> > >
> >
> > No, total_bytes includes the spill/stream bytes. So, the user doesn't
> > need to do any calculation to compute totel_bytes sent to output
> > plugin.
>
> The following test for the latest v8 patch seems to show different.
> total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> something?
>
> postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> 'test_decoding');
>  pg_create_logical_replication_slot
> ------------------------------------
>  (s,0/1884468)
> (1 row)
>
> postgres(1:85969)=# create table a (i int);
> CREATE TABLE
> postgres(1:85969)=# insert into a select generate_series(1, 100000);
> INSERT 0 100000
> postgres(1:85969)=# set logical_decoding_work_mem to 64;
> SET
> postgres(1:85969)=# select * from pg_stat_replication_slots ;
>  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
>
> -----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
>  s         |          0 |           0 |          0 |           0 |
>       0 |           0 |            0 |            0 |
> (1 row)
>
> postgres(1:85969)=# select count(*) from
> pg_logical_slot_peek_changes('s', NULL, NULL);
>  count
> --------
>  100004
> (1 row)
>
> postgres(1:85969)=# select * from pg_stat_replication_slots ;
>  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
>
> -----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
>  s         |          2 |        1808 |          1 |         202 |
> 13200000 |           0 |            0 |            0 |
> (1 row)
>

Thanks for identifying this issue. While spilling a transaction, the
reorder buffer changes get released, so we cannot get the total size of
spilled transactions from the reorder buffer size. I have fixed it by
adding the spilled bytes to the total bytes for spilled transactions.
The attached patch has the fix for this.
Thoughts?
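
In code terms, the fix amounts to something like this (a sketch only;
the exact accounting in the attached patch may differ):

#include "postgres.h"
#include "replication/reorderbuffer.h"

/*
 * Sketch: once a transaction's changes have been spilled, they are released
 * from memory and rb->size no longer covers them, so fold the spilled bytes
 * back into the slot's total when accounting for the transaction.
 */
static void
ReorderBufferAccumulateTotalBytes(ReorderBuffer *rb, ReorderBufferTXN *txn)
{
    Size        txn_bytes = rb->size;

    if (rbtxn_is_serialized(txn) || rbtxn_is_serialized_clear(txn))
        txn_bytes += rb->spillBytes;    /* data already written out to disk */

    rb->totalBytes += txn_bytes;
}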

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 13, 2021 at 10:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
> > >
> > > <entry role="catalog_table_entry"><para role="column_definition">
> > > +        <structfield>total_bytes</structfield><type>bigint</type>
> > > +       </para>
> > > +       <para>
> > > +        Amount of decoded transactions data sent to the decoding output plugin
> > > +        while decoding the changes from WAL for this slot. This can be used to
> > > +        gauge the total amount of data sent during logical decoding.
> > >
> > > Can we slightly extend it to say something like: Note that this
> > > includes the bytes streamed and or spilled. Similarly, we can extend
> > > it for total_txns.
> > >
> >
> > Thanks for the comments, the comments are fixed in the v8 patch attached.
> > Thoughts?
>
> Here are review comments on new TAP tests:

Thanks for the comments.

> +# Create replication slots.
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot1',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot2',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot3',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot4',
> 'test_decoding')");
>
> and
>
> +
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
>
> I think we can do those similar queries in a single psql connection
> like follows:
>
>  # Create replication slots.
>  $node->safe_psql('postgres',
>    qq[
> SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding');
> ]);
>
> and
>
> $node->safe_psql('postgres',
>    qq[
> SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>       ]);
>

Modified.

> ---
> +# Wait for the statistics to be updated.
> +my $slot1_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot2_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot3_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot4_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;";
> +
> +# Verify that the statistics have been updated.
> +$node->poll_query_until('postgres', $slot1_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot2_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot3_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot4_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
>
> We can simplify the above code to something like:
>
> $node->poll_query_until(
>    'postgres', qq[
> SELECT count(slot_name) >= 4
> FROM pg_stat_replication_slots
> WHERE slot_name ~ 'regression_slot'
>     AND total_txns > 0
>     AND total_bytes > 0;
> ]) or die "Timed out while waiting for statistics to be updated";
>

Modified.

> ---
> +# Test to remove one of the replication slots and adjust max_replication_slots
> +# accordingly to the number of slots and verify replication statistics data is
> +# fine after restart.
>
> I think it's better if we explain in detail what cases we're trying to
> test. How about the following description?
>
> Test to remove one of the replication slots and adjust
> max_replication_slots accordingly to the number of slots. This leads
> to a mismatch of the number of slots between in the stats file and on
> shared memory, simulating the message for dropping a slot got lost. We
> verify replication statistics data is fine after restart.

Slightly reworded and modified it.

These comments are fixed as part of the v9 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3CtPUYkFjPhzX0AcuRiK2MzdCR%2B_w8ok1kCcykveuL2Q%40mail.gmail.com

Regards,
Vignesh

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 13, 2021 at 5:07 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > > > comments so you might want to rebase.
> > > > > > > >
> > > > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > > > >
> > > > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > > > patch) and patches that add tests.
> > > > > > > >
> > > > > > >
> > > > > > > I think we can push 0001. What do you think?
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > >
> > > > > > > > In 0003 patch I reordered the
> > > > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > > > streamed transactions seems appropriate to me.
> > > > > > > >
> > > > > > >
> > > > > > > I am not sure about this because I think we might want to add some
> > > > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > > > stream/spill bytes are not in addition to total_bytes).
> > > >
> > > > BTW doesn't it confuse users that stream/spill bytes are not in
> > > > addition to total_bytes? User will need to do "total_bytes +
> > > > spill/stream_bytes" to know the actual total amount of data sent to
> > > > the decoding output plugin, is that right?
> > > >
> > >
> > > No, total_bytes includes the spill/stream bytes. So, the user doesn't
> > > need to do any calculation to compute totel_bytes sent to output
> > > plugin.
> >
> > The following test for the latest v8 patch seems to show different.
> > total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> > something?
> >
> > postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> > 'test_decoding');
> >  pg_create_logical_replication_slot
> > ------------------------------------
> >  (s,0/1884468)
> > (1 row)
> >
> > postgres(1:85969)=# create table a (i int);
> > CREATE TABLE
> > postgres(1:85969)=# insert into a select generate_series(1, 100000);
> > INSERT 0 100000
> > postgres(1:85969)=# set logical_decoding_work_mem to 64;
> > SET
> > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> >  s         |          0 |           0 |          0 |           0 |
> >       0 |           0 |            0 |            0 |
> > (1 row)
> >
> > postgres(1:85969)=# select count(*) from
> > pg_logical_slot_peek_changes('s', NULL, NULL);
> >  count
> > --------
> >  100004
> > (1 row)
> >
> > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> >  s         |          2 |        1808 |          1 |         202 |
> > 13200000 |           0 |            0 |            0 |
> > (1 row)
> >
>
> Thanks for identifying this issue, while spilling the transactions
> reorder buffer changes gets released, we will not be able to get the
> total size for spilled transactions from reorderbuffer size. I have
> fixed it by including spilledbytes to totalbytes in case of spilled
> transactions. Attached patch has the fix for this.
> Thoughts?

I've not looked at the patches yet, but as Amit mentioned before[1],
it's better to move the 0002 patch to after 0004. That is, the 0001
patch changes the data type to NameData, the 0002 patch adds
total_txns and total_bytes, and the 0003 patch adds regression tests.
The 0004 patch will be the patch using HTAB (previously 0002) and will
be reviewed after pushing the 0001, 0002, and 0003 patches. The 0005
patch adds more regression tests for the problem the 0004 patch
addresses.

Regards,

[1] https://www.postgresql.org/message-id/CAA4eK1Kd4ag6Vc6jO%2BntYmTMiR70x3t_%2BYQRMDP%3D9T5a2uzUHg%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 14, 2021 at 7:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 13, 2021 at 5:07 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > > > > comments so you might want to rebase.
> > > > > > > > >
> > > > > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > > > > >
> > > > > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > > > > patch) and patches that add tests.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think we can push 0001. What do you think?
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > >
> > > > > > > > > In 0003 patch I reordered the
> > > > > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > > > > streamed transactions seems appropriate to me.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I am not sure about this because I think we might want to add some
> > > > > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > > > > stream/spill bytes are not in addition to total_bytes).
> > > > >
> > > > > BTW doesn't it confuse users that stream/spill bytes are not in
> > > > > addition to total_bytes? User will need to do "total_bytes +
> > > > > spill/stream_bytes" to know the actual total amount of data sent to
> > > > > the decoding output plugin, is that right?
> > > > >
> > > >
> > > > No, total_bytes includes the spill/stream bytes. So, the user doesn't
> > > > need to do any calculation to compute totel_bytes sent to output
> > > > plugin.
> > >
> > > The following test for the latest v8 patch seems to show different.
> > > total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> > > something?
> > >
> > > postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> > > 'test_decoding');
> > >  pg_create_logical_replication_slot
> > > ------------------------------------
> > >  (s,0/1884468)
> > > (1 row)
> > >
> > > postgres(1:85969)=# create table a (i int);
> > > CREATE TABLE
> > > postgres(1:85969)=# insert into a select generate_series(1, 100000);
> > > INSERT 0 100000
> > > postgres(1:85969)=# set logical_decoding_work_mem to 64;
> > > SET
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > >  s         |          0 |           0 |          0 |           0 |
> > >       0 |           0 |            0 |            0 |
> > > (1 row)
> > >
> > > postgres(1:85969)=# select count(*) from
> > > pg_logical_slot_peek_changes('s', NULL, NULL);
> > >  count
> > > --------
> > >  100004
> > > (1 row)
> > >
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > >  s         |          2 |        1808 |          1 |         202 |
> > > 13200000 |           0 |            0 |            0 |
> > > (1 row)
> > >
> >
> > Thanks for identifying this issue, while spilling the transactions
> > reorder buffer changes gets released, we will not be able to get the
> > total size for spilled transactions from reorderbuffer size. I have
> > fixed it by including spilledbytes to totalbytes in case of spilled
> > transactions. Attached patch has the fix for this.
> > Thoughts?
>
> I've not looked at the patches yet but as Amit mentioned before[1],
> it's better to move 0002 patch to after 0004. That is, 0001 patch
> changes data type to NameData, 0002 patch adds total_txn and
> total_bytes, and 0003 patch adds regression tests. 0004 patch will be
> the patch using HTAB (was 0002 patch) and get reviewed after pushing
> 0001, 0002, and 0003 patches. 0005 patch adds more regression tests
> for the problem 0004 patch addresses.

I will make that change and post a patch for it.
Currently we have kept total_txns and total_bytes at the beginning of
pg_stat_replication_slots; I did not see any conclusion on this. I
prefer them at the beginning.
Thoughts?

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 14, 2021 at 8:04 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Apr 14, 2021 at 7:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've not looked at the patches yet but as Amit mentioned before[1],
> > it's better to move 0002 patch to after 0004. That is, 0001 patch
> > changes data type to NameData, 0002 patch adds total_txn and
> > total_bytes, and 0003 patch adds regression tests. 0004 patch will be
> > the patch using HTAB (was 0002 patch) and get reviewed after pushing
> > 0001, 0002, and 0003 patches. 0005 patch adds more regression tests
> > for the problem 0004 patch addresses.
>
> I will make the change for this and post a patch for this.
> Currently we have kept total_txns and total_bytes at the beginning of
> pg_stat_replication_slots, I did not see any conclusion on this. I
> preferred it to be at the beginning.
> Thoughts?
>

I prefer those two fields after the spill and stream fields. I have
mentioned the same in one of the emails above.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 13, 2021 at 1:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > The following test for the latest v8 patch seems to show different.
> > total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> > something?
> >
> > postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> > 'test_decoding');
> >  pg_create_logical_replication_slot
> > ------------------------------------
> >  (s,0/1884468)
> > (1 row)
> >
> > postgres(1:85969)=# create table a (i int);
> > CREATE TABLE
> > postgres(1:85969)=# insert into a select generate_series(1, 100000);
> > INSERT 0 100000
> > postgres(1:85969)=# set logical_decoding_work_mem to 64;
> > SET
> > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> >  s         |          0 |           0 |          0 |           0 |
> >       0 |           0 |            0 |            0 |
> > (1 row)
> >
> > postgres(1:85969)=# select count(*) from
> > pg_logical_slot_peek_changes('s', NULL, NULL);
> >  count
> > --------
> >  100004
> > (1 row)
> >
> > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> >  s         |          2 |        1808 |          1 |         202 |
> > 13200000 |           0 |            0 |            0 |
> > (1 row)
> >
>
> Thanks for identifying this issue, while spilling the transactions
> reorder buffer changes gets released, we will not be able to get the
> total size for spilled transactions from reorderbuffer size. I have
> fixed it by including spilledbytes to totalbytes in case of spilled
> transactions. Attached patch has the fix for this.
> Thoughts?
>

I am not sure that is the best way to fix it, because sometimes we
clear the serialized flag, in which case it might not give the correct
answer. Another way to fix it could be to update the totalBytes
counter before we try to restore a new set of changes. See the
attached patch atop your v6-0002-* patch.
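For illustration, a minimal sketch of that alternative follows; the
accumulator field and helper name are assumptions, not the attached
patch:

/*
 * Sketch: just before the current in-memory set of changes is released
 * to make room for the next set restored from disk, fold its size into
 * the running total so spilled data still counts towards total_bytes.
 * A real implementation would also have to avoid counting the same
 * bytes again after the restore.
 */
static void
ReorderBufferUpdateTotalBytes(ReorderBuffer *rb)
{
    rb->totalBytes += rb->size;   /* assumed accumulator reported via slot stats */
}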

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 14, 2021 at 12:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 13, 2021 at 1:37 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > The following test for the latest v8 patch seems to show different.
> > > total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> > > something?
> > >
> > > postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> > > 'test_decoding');
> > >  pg_create_logical_replication_slot
> > > ------------------------------------
> > >  (s,0/1884468)
> > > (1 row)
> > >
> > > postgres(1:85969)=# create table a (i int);
> > > CREATE TABLE
> > > postgres(1:85969)=# insert into a select generate_series(1, 100000);
> > > INSERT 0 100000
> > > postgres(1:85969)=# set logical_decoding_work_mem to 64;
> > > SET
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > >  s         |          0 |           0 |          0 |           0 |
> > >       0 |           0 |            0 |            0 |
> > > (1 row)
> > >
> > > postgres(1:85969)=# select count(*) from
> > > pg_logical_slot_peek_changes('s', NULL, NULL);
> > >  count
> > > --------
> > >  100004
> > > (1 row)
> > >
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > >  slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > >
-----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > >  s         |          2 |        1808 |          1 |         202 |
> > > 13200000 |           0 |            0 |            0 |
> > > (1 row)
> > >
> >
> > Thanks for identifying this issue, while spilling the transactions
> > reorder buffer changes gets released, we will not be able to get the
> > total size for spilled transactions from reorderbuffer size. I have
> > fixed it by including spilledbytes to totalbytes in case of spilled
> > transactions. Attached patch has the fix for this.
> > Thoughts?
> >
>
> I am not sure if that is the best way to fix it because sometimes we
> clear the serialized flag in which case it might not give the correct
> answer. Another way to fix it could be that before we try to restore a
> new set of changes, we update totalBytes counter. See, the attached
> patch atop your v6-0002-* patch.

I felt calculating totalBytes this way is better than depending on
spill_bytes. I have taken your changes; the attached patch includes
the suggested changes.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 14, 2021 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
>

I have made minor changes to the 0001 and 0002 patches. Attached is
the combined patch for them, I think we can push them as one patch.
Changes made are (a) minor editing in comments, (b) changed the
condition when to report stats such that unless we have processed any
bytes, we shouldn't send those, (c) removed some unrelated changes
from 0002, (d) ran pgindent.

Let me know what you think of the attached?

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, Apr 15, 2021 at 11:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 14, 2021 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> I have made minor changes to the 0001 and 0002 patches. Attached is
> the combined patch for them, I think we can push them as one patch.
> Changes made are (a) minor editing in comments, (b) changed the
> condition when to report stats such that unless we have processed any
> bytes, we shouldn't send those, (c) removed some unrelated changes
> from 0002, (d) ran pgindent.
>
> Let me know what you think of the attached?

The changes look fine to me; the patch applies neatly and make check-world passes.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 15, 2021 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 11:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 14, 2021 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > I have made minor changes to the 0001 and 0002 patches. Attached is
> > the combined patch for them, I think we can push them as one patch.
> > Changes made are (a) minor editing in comments, (b) changed the
> > condition when to report stats such that unless we have processed any
> > bytes, we shouldn't send those, (c) removed some unrelated changes
> > from 0002, (d) ran pgindent.
> >
> > Let me know what you think of the attached?
>
> Changes look fine to me, the patch applies neatly and make check-world passes.
>

Thanks! Sawada-San, others, unless you have any suggestions, I am
planning to push
v11-0001-Add-information-of-total-data-processed-to-repli.patch
tomorrow.


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 15, 2021 at 3:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 14, 2021 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> I have made minor changes to the 0001 and 0002 patches. Attached is
> the combined patch for them, I think we can push them as one patch.
> Changes made are (a) minor editing in comments, (b) changed the
> condition when to report stats such that unless we have processed any
> bytes, we shouldn't send those, (c) removed some unrelated changes
> from 0002, (d) ran pgindent.
>
> Let me know what you think of the attached?

Thank you for updating the patch.

I have one question on the doc change:

+        so the counter is not incremented for subtransactions. Note that this
+        includes the transactions streamed and or spilled.
+       </para></entry>

The patch uses the phrase "streamed and or spilled" in two places.
Did you mean “streamed and spilled”? Even if it really means “and or”,
is connecting “and” to “or” with just a space common usage? I could
not find it anywhere else in the docs, but I found that we use
"and/or" instead.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 15, 2021 at 1:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 3:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> Thank you for updating the patch.
>
> I have one question on the doc change:
>
> +        so the counter is not incremented for subtransactions. Note that this
> +        includes the transactions streamed and or spilled.
> +       </para></entry>
>
> The patch uses the sentence "streamed and or spilled" in two places.
> You meant “streamed and spilled”? Even if it actually means “and or”,
> using "and or” (i.g., connecting “and” to “or” by a space) is general?
> I could not find we use it other places in the doc but found we're
> using "and/or" instead.
>

I changed it to 'and/or' and made another minor change.


--
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 15, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 1:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 15, 2021 at 3:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > Thank you for updating the patch.
> >
> > I have one question on the doc change:
> >
> > +        so the counter is not incremented for subtransactions. Note that this
> > +        includes the transactions streamed and or spilled.
> > +       </para></entry>
> >
> > The patch uses the sentence "streamed and or spilled" in two places.
> > You meant “streamed and spilled”? Even if it actually means “and or”,
> > using "and or” (i.g., connecting “and” to “or” by a space) is general?
> > I could not find we use it other places in the doc but found we're
> > using "and/or" instead.
> >
>
> I changed it to 'and/or' and made another minor change.


Thank you for the update! The patch looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Apr 16, 2021 at 8:22 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Thu, Apr 15, 2021 at 02:46:35PM +0530, Amit Kapila wrote:
> > On Thu, Apr 15, 2021 at 1:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Apr 15, 2021 at 3:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > Thank you for updating the patch.
> > >
> > > I have one question on the doc change:
> > >
> > > +        so the counter is not incremented for subtransactions. Note that this
> > > +        includes the transactions streamed and or spilled.
> > > +       </para></entry>
> > >
> > > The patch uses the sentence "streamed and or spilled" in two places.
> > > You meant “streamed and spilled”? Even if it actually means “and or”,
> > > using "and or” (i.g., connecting “and” to “or” by a space) is general?
> > > I could not find we use it other places in the doc but found we're
> > > using "and/or" instead.
> > >
> >
> > I changed it to 'and/or' and made another minor change.
>
> I'm suggesting some doc changes.  If these are fine, I'll include in my next
> round of doc review, in case you don't want to make another commit just for
> that.
>

I am fine with your proposed changes. There are one or two more
patches in this area; I can include your suggestions along with those,
if you don't mind.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, Apr 15, 2021 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 1:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 15, 2021 at 3:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > Thank you for updating the patch.
> >
> > I have one question on the doc change:
> >
> > +        so the counter is not incremented for subtransactions. Note that this
> > +        includes the transactions streamed and or spilled.
> > +       </para></entry>
> >
> > The patch uses the sentence "streamed and or spilled" in two places.
> > You meant “streamed and spilled”? Even if it actually means “and or”,
> > using "and or” (i.g., connecting “and” to “or” by a space) is general?
> > I could not find we use it other places in the doc but found we're
> > using "and/or" instead.
> >
>
> I changed it to 'and/or' and made another minor change.

I have rebased the remaining patches on top of HEAD. The rebased
patches are attached.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Thank you for the update! The patch looks good to me.
>

I have pushed the first patch. Comments on the next patch
v13-0001-Use-HTAB-for-replication-slot-statistics:
1.
+ /*
+ * Check for all replication slots in stats hash table. We do this check
+ * when replSlotStats has more than max_replication_slots entries, i.e,
+ * when there are stats for the already-dropped slot, to avoid frequent
+ * call SearchNamedReplicationSlot() which acquires LWLock.
+ */
+ if (replSlotStats && hash_get_num_entries(replSlotStats) >
max_replication_slots)
+ {
+ PgStat_ReplSlotEntry *slotentry;
+
+ hash_seq_init(&hstat, replSlotStats);
+ while ((slotentry = (PgStat_ReplSlotEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ if (SearchNamedReplicationSlot(NameStr(slotentry->slotname), true) == NULL)
+ pgstat_report_replslot_drop(NameStr(slotentry->slotname));
+ }
+ }

Is SearchNamedReplicationSlot() called so frequently that we need to do
this only when the hash table has more entries than
max_replication_slots? I think it would be better if we could do it
without such a condition, to reduce the chances of missing slot stats.
We don't have any such restriction for the other cases in this
function.

Also, I think it is better to add CHECK_FOR_INTERRUPTS() in the above
while loop (see the sketch after these comments).

2.
/*
  * Replication slot statistics kept in the stats collector
  */
-typedef struct PgStat_ReplSlotStats
+typedef struct PgStat_ReplSlotEntry

I think the comment above this structure can be changed to "The
collector's data per slot" or something like that. Also, if we have to
follow the table/function/db style, then this structure should
probably be named PgStat_StatReplSlotEntry.

3.
- * create the statistics for the replication slot.
+ * create the statistics for the replication slot. In case where the
+ * message for dropping the old slot gets lost and a slot with the same is

/the same is/the same name is/.

Can we mention something similar to what you have added here in docs as well?

4.
+CREATE VIEW pg_stat_replication_slots AS
+    SELECT
+            s.slot_name,
+            s.spill_txns,
+            s.spill_count,
+            s.spill_bytes,
+            s.stream_txns,
+            s.stream_count,
+            s.stream_bytes,
+            s.total_txns,
+            s.total_bytes,
+            s.stats_reset
+    FROM pg_replication_slots as r,
+        LATERAL pg_stat_get_replication_slot(slot_name) as s
+    WHERE r.datoid IS NOT NULL; -- excluding physical slots
..
..

-/* Get the statistics for the replication slots */
+/* Get the statistics for the replication slot */
 Datum
-pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
+pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 {
 #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
- ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ text *slotname_text = PG_GETARG_TEXT_P(0);
+ NameData slotname;

I think that with the above changes, getting all the slot stats has
become much costlier. Is there any reason why we can't get all the
stats from the new hash table in one shot and return them to the user?
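Regarding comment #1, for illustration, the quoted loop with the
suggested cancellation check added could look roughly like this (just a
sketch, reusing the names from the quoted hunk):

    hash_seq_init(&hstat, replSlotStats);
    while ((slotentry = (PgStat_ReplSlotEntry *) hash_seq_search(&hstat)) != NULL)
    {
        /* The scan can cover many entries, so allow cancellation. */
        CHECK_FOR_INTERRUPTS();

        /* Drop the stats entry if the slot no longer exists. */
        if (SearchNamedReplicationSlot(NameStr(slotentry->slotname), true) == NULL)
            pgstat_report_replslot_drop(NameStr(slotentry->slotname));
    }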

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 12, 2021 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 9:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > And then more generally about the feature:
> > > - If a slot was used to stream out a large amount of changes (say an
> > >   initial data load), but then replication is interrupted before the
> > >   transaction is committed/aborted, stream_bytes will not reflect the
> > >   many gigabytes of data we may have sent.
> > >
> >
> > We can probably update the stats each time we spilled or streamed the
> > transaction data but it was not clear at that stage whether or how
> > much it will be useful.
> >
>
> I felt we can update the replication slot statistics data each time we
> spill/stream the transaction data instead of accumulating the
> statistics and updating at the end. I have tried this in the attached
> patch and the statistics data were getting updated.
> Thoughts?
>

Did you check if we can update the stats when we release the slot, as
discussed above? I am not sure it is easy to do at the time of slot
release, because this information might not be accessible there and,
in some cases, we might have already released the decoding
context/reorderbuffer where this information is stored. It might be
okay to update this when we stream or spill, but let's see if we can
do it easily at the time of slot release.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 16, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 9:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > And then more generally about the feature:
> > > > - If a slot was used to stream out a large amount of changes (say an
> > > >   initial data load), but then replication is interrupted before the
> > > >   transaction is committed/aborted, stream_bytes will not reflect the
> > > >   many gigabytes of data we may have sent.
> > > >
> > >
> > > We can probably update the stats each time we spilled or streamed the
> > > transaction data but it was not clear at that stage whether or how
> > > much it will be useful.
> > >
> >
> > I felt we can update the replication slot statistics data each time we
> > spill/stream the transaction data instead of accumulating the
> > statistics and updating at the end. I have tried this in the attached
> > patch and the statistics data were getting updated.
> > Thoughts?
> >
>
> Did you check if we can update the stats when we release the slot as
> discussed above? I am not sure if it is easy to do at the time of slot
> release because this information might not be accessible there and in
> some cases, we might have already released the decoding
> context/reorderbuffer where this information is stored. It might be
> okay to update this when we stream or spill but let's see if we can do
> it easily at the time of slot release.
>

I'm not sure we will be able to update the stats from here, as we will
not have access to the decoding context/reorderbuffer at this point,
and, as you pointed out, I noticed that the decoding context gets
released earlier anyway.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 16, 2021 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for the update! The patch looks good to me.
> >
>
> I have pushed the first patch. Comments on the next patch
> v13-0001-Use-HTAB-for-replication-slot-statistics:

Also, should we change PGSTAT_FILE_FORMAT_ID, since we have modified
the replication slot statistics?

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> I have pushed the first patch.

The buildfarm suggests that this isn't entirely stable:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15

Each of those animals has also passed at least once since this went in,
so I'm betting on a timing-dependent issue.

            regards, tom lane



Re: Replication slot stats misgivings

From
Tom Lane
Date:
I wrote:
> The buildfarm suggests that this isn't entirely stable:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15

Oh, I missed that hyrax is showing the identical symptom:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44

So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
that way.

            regards, tom lane



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sun, Apr 18, 2021 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > The buildfarm suggests that this isn't entirely stable:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15
>
> Oh, I missed that hyrax is showing the identical symptom:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44
>
> So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
> that way.
>

I will try to check and identify why it is failing.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sun, Apr 18, 2021 at 7:36 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sun, Apr 18, 2021 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > I wrote:
> > > The buildfarm suggests that this isn't entirely stable:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15
> >
> > Oh, I missed that hyrax is showing the identical symptom:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44
> >
> > So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
> > that way.
> >
>
> I will try to check and identify why it is failing.
>

I think the failure is because, in the new tests, after the reset we
are not waiting for the stats message to be delivered as we do in the
other cases. Also, for the new test (the non-spilled case), we need to
decode changes as we do for the other tests; otherwise, it will show
the old stats.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sun, Apr 18, 2021 at 8:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Apr 18, 2021 at 7:36 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sun, Apr 18, 2021 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > I wrote:
> > > > The buildfarm suggests that this isn't entirely stable:
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15
> > >
> > > Oh, I missed that hyrax is showing the identical symptom:
> > >
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44
> > >
> > > So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
> > > that way.
> > >
> >
> > I will try to check and identify why it is failing.
> >
>
> I think the failure is due to the reason that in the new tests after
> reset, we are not waiting for the stats message to be delivered as we
> were doing in other cases. Also, for the new test (non-spilled case),
> we need to decode changes as we are doing for other tests, otherwise,
> it will show the old stats.

I also felt that is the reason for the failure; I will fix it and post
a patch.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Sun, Apr 18, 2021 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Apr 18, 2021 at 7:36 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sun, Apr 18, 2021 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > I wrote:
> > > > The buildfarm suggests that this isn't entirely stable:
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15
> > >
> > > Oh, I missed that hyrax is showing the identical symptom:
> > >
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44
> > >
> > > So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
> > > that way.
> > >
> >
> > I will try to check and identify why it is failing.
> >
>
> I think the failure is due to the reason that in the new tests after
> reset, we are not waiting for the stats message to be delivered as we
> were doing in other cases. Also, for the new test (non-spilled case),
> we need to decode changes as we are doing for other tests, otherwise,
> it will show the old stats.
>

Yes. Also, the following expectation in expected/stats.out is wrong:

SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS
spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS
total_bytes FROM pg_stat_replication_slots;
    slot_name    | spill_txns | spill_count | total_txns | total_bytes
-----------------+------------+-------------+------------+-------------
 regression_slot | f          | f           | t          | t
(1 row)

We should expect all values to be 0. Please find the patch attached.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Sun, Apr 18, 2021 at 9:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sun, Apr 18, 2021 at 8:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Apr 18, 2021 at 7:36 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Sun, Apr 18, 2021 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > I wrote:
> > > > > The buildfarm suggests that this isn't entirely stable:
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-17%2011%3A14%3A49
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bichir&dt=2021-04-17%2016%3A30%3A15
> > > >
> > > > Oh, I missed that hyrax is showing the identical symptom:
> > > >
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-04-16%2007%3A05%3A44
> > > >
> > > > So you might try CLOBBER_CACHE_ALWAYS to see if you can reproduce it
> > > > that way.
> > > >
> > >
> > > I will try to check and identify why it is failing.
> > >
> >
> > I think the failure is due to the reason that in the new tests after
> > reset, we are not waiting for the stats message to be delivered as we
> > were doing in other cases. Also, for the new test (non-spilled case),
> > we need to decode changes as we are doing for other tests, otherwise,
> > it will show the old stats.
>
> I also felt that is the reason for the failure, I will fix and post a
> patch for this.

Attached is a patch that includes the changes for the fix. I have
moved the non-spilled transaction test to reduce the steps, which
reduces the number of pg_logical_slot_get_changes calls made before
this test.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for the update! The patch looks good to me.
> >
>
> I have pushed the first patch. Comments on the next patch
> v13-0001-Use-HTAB-for-replication-slot-statistics:
> 1.
> + /*
> + * Check for all replication slots in stats hash table. We do this check
> + * when replSlotStats has more than max_replication_slots entries, i.e,
> + * when there are stats for the already-dropped slot, to avoid frequent
> + * call SearchNamedReplicationSlot() which acquires LWLock.
> + */
> + if (replSlotStats && hash_get_num_entries(replSlotStats) >
> max_replication_slots)
> + {
> + PgStat_ReplSlotEntry *slotentry;
> +
> + hash_seq_init(&hstat, replSlotStats);
> + while ((slotentry = (PgStat_ReplSlotEntry *) hash_seq_search(&hstat)) != NULL)
> + {
> + if (SearchNamedReplicationSlot(NameStr(slotentry->slotname), true) == NULL)
> + pgstat_report_replslot_drop(NameStr(slotentry->slotname));
> + }
> + }
>
> Is SearchNamedReplicationSlot() so frequently used that we need to do
> this only when the hash table has entries more than
> max_replication_slots? I think it would be better if we can do it
> without such a condition to reduce the chances of missing the slot
> stats. We don't have any such restrictions for any other cases in this
> function.

Please see my comment on #4 below.

>
> I think it is better to add CHECK_FOR_INTERRUPTS in the above while loop?

Agreed.

>
> 2.
> /*
>   * Replication slot statistics kept in the stats collector
>   */
> -typedef struct PgStat_ReplSlotStats
> +typedef struct PgStat_ReplSlotEntry
>
> I think the comment above this structure can be changed to "The
> collector's data per slot" or something like that. Also, if we have to
> follow table/function/db style, then probably this structure should be
> named as PgStat_StatReplSlotEntry.

Agreed.

>
> 3.
> - * create the statistics for the replication slot.
> + * create the statistics for the replication slot. In case where the
> + * message for dropping the old slot gets lost and a slot with the same is
>
> /the same is/the same name is/.
>
> Can we mention something similar to what you have added here in docs as well?

Agreed.

>
> 4.
> +CREATE VIEW pg_stat_replication_slots AS
> +    SELECT
> +            s.slot_name,
> +            s.spill_txns,
> +            s.spill_count,
> +            s.spill_bytes,
> +            s.stream_txns,
> +            s.stream_count,
> +            s.stream_bytes,
> +            s.total_txns,
> +            s.total_bytes,
> +            s.stats_reset
> +    FROM pg_replication_slots as r,
> +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> ..
> ..
>
> -/* Get the statistics for the replication slots */
> +/* Get the statistics for the replication slot */
>  Datum
> -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
>  {
>  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> + text *slotname_text = PG_GETARG_TEXT_P(0);
> + NameData slotname;
>
> I think with the above changes getting all the slot stats has become
> much costlier. Is there any reason why can't we get all the stats from
> the new hash_table in one shot and return them to the user?

I think the advantage of this approach is that it can avoid showing
the stats for already-dropped slots. Like other statistics views such
as pg_stat_all_tables and pg_stat_all_functions, looking up the stats
by the name obtained from pg_replication_slots shows only the stats of
existing slots, even if the hash table contains garbage slot stats.
Given that pg_stat_replication_slots doesn't show garbage slot stats
even if it has them, I thought we could avoid checking for those
garbage stats frequently. It should not essentially be a problem for
the hash table to have entries up to max_replication_slots, regardless
of whether they are live or already dropped.

As another design, we can get all the stats from the hash table in one
shot, as you suggested. If we do that, it's better to check for
garbage slot stats every time pgstat_vacuum_stat() is called so the
view doesn't show them, though that cannot be avoided completely.

I'll change the code pointed out in #1 and #4 according to this design
discussion.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Sun, Apr 18, 2021 at 6:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Yes, also the following expectation in expected/stats.out is wrong:
>
> SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS
> spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS
> total_bytes FROM pg_stat_replication_slots;
>     slot_name    | spill_txns | spill_count | total_txns | total_bytes
> -----------------+------------+-------------+------------+-------------
>  regression_slot | f          | f           | t          | t
> (1 row)
>
> We should expect all values are 0. Please find attached the patch.
>

Right. Both your patch and Vignesh's will fix the problem, but I
mildly prefer Vignesh's as it seems a bit simpler. So, I went ahead
and pushed his patch with some other minor changes. Thanks to both of
you.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Apr 16, 2021 at 8:50 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Apr 16, 2021 at 08:48:29AM +0530, Amit Kapila wrote:
> > I am fine with your proposed changes. There are one or two more
> > patches in this area. I can include your suggestions along with those
> > if you don't mind?
>
> However's convenient is fine
>

Thanks for your suggestions. I have pushed your changes as part of the
commit c64dcc7fee.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > 4.
> > +CREATE VIEW pg_stat_replication_slots AS
> > +    SELECT
> > +            s.slot_name,
> > +            s.spill_txns,
> > +            s.spill_count,
> > +            s.spill_bytes,
> > +            s.stream_txns,
> > +            s.stream_count,
> > +            s.stream_bytes,
> > +            s.total_txns,
> > +            s.total_bytes,
> > +            s.stats_reset
> > +    FROM pg_replication_slots as r,
> > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > ..
> > ..
> >
> > -/* Get the statistics for the replication slots */
> > +/* Get the statistics for the replication slot */
> >  Datum
> > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> >  {
> >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > + NameData slotname;
> >
> > I think with the above changes getting all the slot stats has become
> > much costlier. Is there any reason why can't we get all the stats from
> > the new hash_table in one shot and return them to the user?
>
> I think the advantage of this approach would be that it can avoid
> showing the stats for already-dropped slots. Like other statistics
> views such as pg_stat_all_tables and pg_stat_all_functions, searching
> the stats by the name got from pg_replication_slots can show only
> available slot stats even if the hash table has garbage slot stats.
>

Sounds reasonable. However, if the create_slot message is missed, it
will show an empty row for that slot. See below:

postgres=# select slot_name, total_txns from pg_stat_replication_slots;
 slot_name | total_txns
-----------+------------
 s1        |          0
 s2        |          0
           |
(3 rows)

Here, I have manually (via the debugger) skipped sending the
create_slot message for the third slot, and we show an empty row for
it. This won't happen for pg_stat_all_tables, as it will set 0 or
other initial values in such a case. I think we need to address this
case.

> Given that pg_stat_replication_slots doesn’t show garbage slot stats
> even if it has, I thought we can avoid checking those garbage stats
> frequently. It should not essentially be a problem for the hash table
> to have entries up to max_replication_slots regardless of live or
> already-dropped.
>

Yeah, but I guess we still might not save much by not doing it,
especially because for the other cases like tables/functions, we are
doing it without any threshold limit.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > 4.
> > > +CREATE VIEW pg_stat_replication_slots AS
> > > +    SELECT
> > > +            s.slot_name,
> > > +            s.spill_txns,
> > > +            s.spill_count,
> > > +            s.spill_bytes,
> > > +            s.stream_txns,
> > > +            s.stream_count,
> > > +            s.stream_bytes,
> > > +            s.total_txns,
> > > +            s.total_bytes,
> > > +            s.stats_reset
> > > +    FROM pg_replication_slots as r,
> > > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > ..
> > > ..
> > >
> > > -/* Get the statistics for the replication slots */
> > > +/* Get the statistics for the replication slot */
> > >  Datum
> > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > >  {
> > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > + NameData slotname;
> > >
> > > I think with the above changes getting all the slot stats has become
> > > much costlier. Is there any reason why can't we get all the stats from
> > > the new hash_table in one shot and return them to the user?
> >
> > I think the advantage of this approach would be that it can avoid
> > showing the stats for already-dropped slots. Like other statistics
> > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > the stats by the name got from pg_replication_slots can show only
> > available slot stats even if the hash table has garbage slot stats.
> >
>
> Sounds reasonable. However, if the create_slot message is missed, it
> will show an empty row for it. See below:
>
> postgres=# select slot_name, total_txns from pg_stat_replication_slots;
>  slot_name | total_txns
> -----------+------------
>  s1        |          0
>  s2        |          0
>            |
> (3 rows)
>
> Here, I have manually via debugger skipped sending the create_slot
> message for the third slot and we are showing an empty for it. This
> won't happen for pg_stat_all_tables, as it will set 0 or other initial
> values in such a case. I think we need to address this case.

Good catch. I think it's better to set 0 to all counters and NULL to
reset_stats.
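
Something along these lines in pg_stat_get_replication_slot() should do
(an untested sketch; tupdesc/values/nulls are the function's existing
locals, and "slotent" stands for the result of whatever stats lookup we
end up with):

	/*
	 * Sketch: when the collector has no entry for this slot -- e.g. the
	 * create/update messages were lost -- return a row with zero counters
	 * and NULL stats_reset instead of an empty row.
	 */
	if (slotent == NULL)
	{
		int			i;

		MemSet(nulls, 0, sizeof(nulls));
		values[0] = CStringGetTextDatum(NameStr(slotname));
		for (i = 1; i < PG_STAT_GET_REPLICATION_SLOT_COLS - 1; i++)
			values[i] = Int64GetDatum(0);	/* all counters = 0 */
		nulls[PG_STAT_GET_REPLICATION_SLOT_COLS - 1] = true;	/* stats_reset */

		PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
	}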

>
> > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > even if it has, I thought we can avoid checking those garbage stats
> > frequently. It should not essentially be a problem for the hash table
> > to have entries up to max_replication_slots regardless of live or
> > already-dropped.
> >
>
> Yeah, but I guess we still might not save much by not doing it,
> especially because for the other cases like tables/functions, we are
> doing it without any threshold limit.

Agreed.

I've attached the updated patch, please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 16, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 9:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 12:22 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > And then more generally about the feature:
> > > > - If a slot was used to stream out a large amount of changes (say an
> > > >   initial data load), but then replication is interrupted before the
> > > >   transaction is committed/aborted, stream_bytes will not reflect the
> > > >   many gigabytes of data we may have sent.
> > > >
> > >
> > > We can probably update the stats each time we spilled or streamed the
> > > transaction data but it was not clear at that stage whether or how
> > > much it will be useful.
> > >
> >
> > I felt we can update the replication slot statistics data each time we
> > spill/stream the transaction data instead of accumulating the
> > statistics and updating at the end. I have tried this in the attached
> > patch and the statistics data were getting updated.
> > Thoughts?
> >
>
> Did you check if we can update the stats when we release the slot as
> discussed above? I am not sure if it is easy to do at the time of slot
> release because this information might not be accessible there and in
> some cases, we might have already released the decoding
> context/reorderbuffer where this information is stored. It might be
> okay to update this when we stream or spill but let's see if we can do
> it easily at the time of slot release.
>

I have made the changes to update the replication statistics at
replication slot release. Please find the patch attached for the same.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > 4.
> > > > +CREATE VIEW pg_stat_replication_slots AS
> > > > +    SELECT
> > > > +            s.slot_name,
> > > > +            s.spill_txns,
> > > > +            s.spill_count,
> > > > +            s.spill_bytes,
> > > > +            s.stream_txns,
> > > > +            s.stream_count,
> > > > +            s.stream_bytes,
> > > > +            s.total_txns,
> > > > +            s.total_bytes,
> > > > +            s.stats_reset
> > > > +    FROM pg_replication_slots as r,
> > > > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > > ..
> > > > ..
> > > >
> > > > -/* Get the statistics for the replication slots */
> > > > +/* Get the statistics for the replication slot */
> > > >  Datum
> > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > > >  {
> > > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > > + NameData slotname;
> > > >
> > > > I think with the above changes getting all the slot stats has become
> > > > much costlier. Is there any reason why can't we get all the stats from
> > > > the new hash_table in one shot and return them to the user?
> > >
> > > I think the advantage of this approach would be that it can avoid
> > > showing the stats for already-dropped slots. Like other statistics
> > > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > > the stats by the name got from pg_replication_slots can show only
> > > available slot stats even if the hash table has garbage slot stats.
> > >
> >
> > Sounds reasonable. However, if the create_slot message is missed, it
> > will show an empty row for it. See below:
> >
> > postgres=# select slot_name, total_txns from pg_stat_replication_slots;
> >  slot_name | total_txns
> > -----------+------------
> >  s1        |          0
> >  s2        |          0
> >            |
> > (3 rows)
> >
> > Here, I have manually via debugger skipped sending the create_slot
> > message for the third slot and we are showing an empty for it. This
> > won't happen for pg_stat_all_tables, as it will set 0 or other initial
> > values in such a case. I think we need to address this case.
>
> Good catch. I think it's better to set 0 to all counters and NULL to
> reset_stats.
>
> >
> > > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > > even if it has, I thought we can avoid checking those garbage stats
> > > frequently. It should not essentially be a problem for the hash table
> > > to have entries up to max_replication_slots regardless of live or
> > > already-dropped.
> > >
> >
> > Yeah, but I guess we still might not save much by not doing it,
> > especially because for the other cases like tables/functions, we are
> > doing it without any threshold limit.
>
> Agreed.
>
> I've attached the updated patch, please review it.

I've attached the new version patch that fixed the compilation error
reported off-line by Amit.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the new version patch that fixed the compilation error
> reported off-line by Amit.
>

I was thinking about whether we can somehow avoid the below risk:
In case where the
+ * message for dropping the old slot gets lost and a slot with the same
+ * name is created, the stats will be accumulated into the old slots since
+ * we use the slot name as the key. In that case, user can reset the
+ * particular stats by pg_stat_reset_replication_slot().

What if we send a separate message at slot creation so that the stats
collector initializes the entry even if the previous drop message was
lost or arrives later? If we do that, then when the drop message is
lost, creating a slot with the same name won't accumulate the old
stats; and if the drop message arrives later, it will remove the newly
created stats, but subsequent stats from the same slot will simply
recreate the entry in the hash table.
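
To illustrate the idea, the sender side could look roughly like this
(just a sketch; it assumes we add an m_create flag to PgStat_MsgReplSlot,
and the function name is only for illustration):

void
pgstat_report_replslot_create(const char *slotname)
{
	PgStat_MsgReplSlot msg;

	/* called from slot creation (and possibly slot restore at startup) */
	MemSet(&msg, 0, sizeof(msg));
	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_REPLSLOT);
	namestrcpy(&msg.m_slotname, slotname);
	msg.m_create = true;		/* assumed new flag: collector (re)initializes the entry */
	msg.m_drop = false;
	pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}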

Also, I think we can include the test case prepared by Vignesh in the email [1].

Apart from the above, I have made a few minor modifications in the attached patch.
(a) + if (slotent->stat_reset_timestamp == 0 || !slotent)
I don't understand why the second part of the check is required. By this
time slotent will anyway have a valid value.

(b) + slotent = (PgStat_StatReplSlotEntry *) hash_search(replSlotStats,
+    (void *) &name,
+    create_it ? HASH_ENTER : HASH_FIND,
+    &found);

It is better to use NameStr here.

(c) made various changes in comments and some other cosmetic changes.

[1] - https://www.postgresql.org/message-id/CALDaNm3yBctNFE6X2FV_haRF4uue9okm1_DVE6ZANWvOV_CvYw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > 4.
> > > > > +CREATE VIEW pg_stat_replication_slots AS
> > > > > +    SELECT
> > > > > +            s.slot_name,
> > > > > +            s.spill_txns,
> > > > > +            s.spill_count,
> > > > > +            s.spill_bytes,
> > > > > +            s.stream_txns,
> > > > > +            s.stream_count,
> > > > > +            s.stream_bytes,
> > > > > +            s.total_txns,
> > > > > +            s.total_bytes,
> > > > > +            s.stats_reset
> > > > > +    FROM pg_replication_slots as r,
> > > > > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > > > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > > > ..
> > > > > ..
> > > > >
> > > > > -/* Get the statistics for the replication slots */
> > > > > +/* Get the statistics for the replication slot */
> > > > >  Datum
> > > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > > > >  {
> > > > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > > > + NameData slotname;
> > > > >
> > > > > I think with the above changes getting all the slot stats has become
> > > > > much costlier. Is there any reason why can't we get all the stats from
> > > > > the new hash_table in one shot and return them to the user?
> > > >
> > > > I think the advantage of this approach would be that it can avoid
> > > > showing the stats for already-dropped slots. Like other statistics
> > > > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > > > the stats by the name got from pg_replication_slots can show only
> > > > available slot stats even if the hash table has garbage slot stats.
> > > >
> > >
> > > Sounds reasonable. However, if the create_slot message is missed, it
> > > will show an empty row for it. See below:
> > >
> > > postgres=# select slot_name, total_txns from pg_stat_replication_slots;
> > >  slot_name | total_txns
> > > -----------+------------
> > >  s1        |          0
> > >  s2        |          0
> > >            |
> > > (3 rows)
> > >
> > > Here, I have manually via debugger skipped sending the create_slot
> > > message for the third slot and we are showing an empty for it. This
> > > won't happen for pg_stat_all_tables, as it will set 0 or other initial
> > > values in such a case. I think we need to address this case.
> >
> > Good catch. I think it's better to set 0 to all counters and NULL to
> > reset_stats.
> >
> > >
> > > > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > > > even if it has, I thought we can avoid checking those garbage stats
> > > > frequently. It should not essentially be a problem for the hash table
> > > > to have entries up to max_replication_slots regardless of live or
> > > > already-dropped.
> > > >
> > >
> > > Yeah, but I guess we still might not save much by not doing it,
> > > especially because for the other cases like tables/functions, we are
> > > doing it without any threshold limit.
> >
> > Agreed.
> >
> > I've attached the updated patch, please review it.
>
> I've attached the new version patch that fixed the compilation error
> reported off-line by Amit.

Thanks for the updated patch, few comments:
1) We can change "slotent = pgstat_get_replslot_entry(slotname,
false);" to "return pgstat_get_replslot_entry(slotname, false);" and
remove the slotent variable.

+       PgStat_StatReplSlotEntry *slotent = NULL;
+
        backend_read_statsfile();

-       *nslots_p = nReplSlotStats;
-       return replSlotStats;
+       slotent = pgstat_get_replslot_entry(slotname, false);
+
+       return slotent;

2) Should we change PGSTAT_FILE_FORMAT_ID as the statistic file format
has changed for replication statistics?

3) We can include PgStat_StatReplSlotEntry in typedefs.lst and remove
PgStat_ReplSlotStats from typedefs.lst

4) Few indentation issues are there, we can run pgindent on pgstat.c changes:
                        case 'R':
-                               if
(fread(&replSlotStats[nReplSlotStats], 1,
sizeof(PgStat_ReplSlotStats), fpin)
-                                       != sizeof(PgStat_ReplSlotStats))
+                       {
+                               PgStat_StatReplSlotEntry slotbuf;
+                               PgStat_StatReplSlotEntry *slotent;
+
+                               if (fread(&slotbuf, 1,
sizeof(PgStat_StatReplSlotEntry), fpin)
+                                       != sizeof(PgStat_StatReplSlotEntry))

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 20, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the new version patch that fixed the compilation error
> > reported off-line by Amit.
> >
>
> I was thinking about whether we can someway avoid the below risk:
> In case where the
> + * message for dropping the old slot gets lost and a slot with the same
> + * name is created, the stats will be accumulated into the old slots since
> + * we use the slot name as the key. In that case, user can reset the
> + * particular stats by pg_stat_reset_replication_slot().
>
> What if we send a separate message for create slot such that the stats
> collector will initialize the entries even if the previous drop
> message is lost or came later? If we do that then if the drop message
> is lost, the create with same name won't accumulate the stats and if
> the drop came later, it will remove the newly created stats but
> anyway, later stats from the same slot will again create the slot
> entry in the hash table.

Sounds good to me. There is still a small chance of this happening if
the messages for both creating and dropping slots with the same name
get lost, but that's unlikely in practice.

>
> Also, I think we can include the test case prepared by Vignesh in the email [1].
>
> Apart from the above, I have made few minor modifications in the attached patch.
> (a) + if (slotent->stat_reset_timestamp == 0 || !slotent)
> I don't understand why second part of check is required? By this time
> slotent will anyway have some valid value.
>
> (b) + slotent = (PgStat_StatReplSlotEntry *) hash_search(replSlotStats,
> +    (void *) &name,
> +    create_it ? HASH_ENTER : HASH_FIND,
> +    &found);
>
> It is better to use NameStr here.
>
> (c) made various changes in comments and some other cosmetic changes.

All the above changes make sense to me.

I'll submit the updated patch soon.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 20, 2021 at 7:22 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > 4.
> > > > > > +CREATE VIEW pg_stat_replication_slots AS
> > > > > > +    SELECT
> > > > > > +            s.slot_name,
> > > > > > +            s.spill_txns,
> > > > > > +            s.spill_count,
> > > > > > +            s.spill_bytes,
> > > > > > +            s.stream_txns,
> > > > > > +            s.stream_count,
> > > > > > +            s.stream_bytes,
> > > > > > +            s.total_txns,
> > > > > > +            s.total_bytes,
> > > > > > +            s.stats_reset
> > > > > > +    FROM pg_replication_slots as r,
> > > > > > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > > > > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > > > > ..
> > > > > > ..
> > > > > >
> > > > > > -/* Get the statistics for the replication slots */
> > > > > > +/* Get the statistics for the replication slot */
> > > > > >  Datum
> > > > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > > > > >  {
> > > > > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > > > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > > > > + NameData slotname;
> > > > > >
> > > > > > I think with the above changes getting all the slot stats has become
> > > > > > much costlier. Is there any reason why can't we get all the stats from
> > > > > > the new hash_table in one shot and return them to the user?
> > > > >
> > > > > I think the advantage of this approach would be that it can avoid
> > > > > showing the stats for already-dropped slots. Like other statistics
> > > > > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > > > > the stats by the name got from pg_replication_slots can show only
> > > > > available slot stats even if the hash table has garbage slot stats.
> > > > >
> > > >
> > > > Sounds reasonable. However, if the create_slot message is missed, it
> > > > will show an empty row for it. See below:
> > > >
> > > > postgres=# select slot_name, total_txns from pg_stat_replication_slots;
> > > >  slot_name | total_txns
> > > > -----------+------------
> > > >  s1        |          0
> > > >  s2        |          0
> > > >            |
> > > > (3 rows)
> > > >
> > > > Here, I have manually via debugger skipped sending the create_slot
> > > > message for the third slot and we are showing an empty for it. This
> > > > won't happen for pg_stat_all_tables, as it will set 0 or other initial
> > > > values in such a case. I think we need to address this case.
> > >
> > > Good catch. I think it's better to set 0 to all counters and NULL to
> > > reset_stats.
> > >
> > > >
> > > > > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > > > > even if it has, I thought we can avoid checking those garbage stats
> > > > > frequently. It should not essentially be a problem for the hash table
> > > > > to have entries up to max_replication_slots regardless of live or
> > > > > already-dropped.
> > > > >
> > > >
> > > > Yeah, but I guess we still might not save much by not doing it,
> > > > especially because for the other cases like tables/functions, we are
> > > > doing it without any threshold limit.
> > >
> > > Agreed.
> > >
> > > I've attached the updated patch, please review it.
> >
> > I've attached the new version patch that fixed the compilation error
> > reported off-line by Amit.
>
> Thanks for the updated patch, few comments:

Thank you for the review comments.

> 1) We can change "slotent = pgstat_get_replslot_entry(slotname,
> false);" to "return pgstat_get_replslot_entry(slotname, false);" and
> remove the slotent variable.
>
> +       PgStat_StatReplSlotEntry *slotent = NULL;
> +
>         backend_read_statsfile();
>
> -       *nslots_p = nReplSlotStats;
> -       return replSlotStats;
> +       slotent = pgstat_get_replslot_entry(slotname, false);
> +
> +       return slotent;

Fixed.

>
> 2) Should we change PGSTAT_FILE_FORMAT_ID as the statistic file format
> has changed for replication statistics?

The struct name has changed, but I think the statistics file format
itself has not been changed by this patch. No?

>
> 3) We can include PgStat_StatReplSlotEntry in typedefs.lst and remove
> PgStat_ReplSlotStats from typedefs.lst

Fixed.

>
> 4) Few indentation issues are there, we can run pgindent on pgstat.c changes:
>                         case 'R':
> -                               if
> (fread(&replSlotStats[nReplSlotStats], 1,
> sizeof(PgStat_ReplSlotStats), fpin)
> -                                       != sizeof(PgStat_ReplSlotStats))
> +                       {
> +                               PgStat_StatReplSlotEntry slotbuf;
> +                               PgStat_StatReplSlotEntry *slotent;
> +
> +                               if (fread(&slotbuf, 1,
> sizeof(PgStat_StatReplSlotEntry), fpin)
> +                                       != sizeof(PgStat_StatReplSlotEntry))

Fixed.

I've attached the patch. In addition to the test Vignesh prepared, I
added one test for the message for creating a slot that checks if the
statistics are initialized after re-creating the same name slot.
Please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>

I have one question:

+ /*
+ * Create the replication slot stats hash table if we don't have
+ * it already.
+ */
+ if (replSlotStats == NULL)
  {
- if (namestrcmp(&replSlotStats[i].slotname, name) == 0)
- return i; /* found */
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(NameData);
+ hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+
+ replSlotStats = hash_create("Replication slots hash",
+ PGSTAT_REPLSLOT_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
  }

It seems to me that the patch is always creating the hash table in
pgStatLocalContext. AFAIU, we need to create it in pgStatLocalContext
when we read stats via backend_read_statsfile so that we can clear it
at the end of the transaction. The db/function stats seem to do the
same. Is there a reason why we need to always create it in
pgStatLocalContext here?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 7:22 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > 4.
> > > > > > > +CREATE VIEW pg_stat_replication_slots AS
> > > > > > > +    SELECT
> > > > > > > +            s.slot_name,
> > > > > > > +            s.spill_txns,
> > > > > > > +            s.spill_count,
> > > > > > > +            s.spill_bytes,
> > > > > > > +            s.stream_txns,
> > > > > > > +            s.stream_count,
> > > > > > > +            s.stream_bytes,
> > > > > > > +            s.total_txns,
> > > > > > > +            s.total_bytes,
> > > > > > > +            s.stats_reset
> > > > > > > +    FROM pg_replication_slots as r,
> > > > > > > +        LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > > > > > +    WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > > > > > ..
> > > > > > > ..
> > > > > > >
> > > > > > > -/* Get the statistics for the replication slots */
> > > > > > > +/* Get the statistics for the replication slot */
> > > > > > >  Datum
> > > > > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > > > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > > > > > >  {
> > > > > > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > > > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > > > > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > > > > > + NameData slotname;
> > > > > > >
> > > > > > > I think with the above changes getting all the slot stats has become
> > > > > > > much costlier. Is there any reason why can't we get all the stats from
> > > > > > > the new hash_table in one shot and return them to the user?
> > > > > >
> > > > > > I think the advantage of this approach would be that it can avoid
> > > > > > showing the stats for already-dropped slots. Like other statistics
> > > > > > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > > > > > the stats by the name got from pg_replication_slots can show only
> > > > > > available slot stats even if the hash table has garbage slot stats.
> > > > > >
> > > > >
> > > > > Sounds reasonable. However, if the create_slot message is missed, it
> > > > > will show an empty row for it. See below:
> > > > >
> > > > > postgres=# select slot_name, total_txns from pg_stat_replication_slots;
> > > > >  slot_name | total_txns
> > > > > -----------+------------
> > > > >  s1        |          0
> > > > >  s2        |          0
> > > > >            |
> > > > > (3 rows)
> > > > >
> > > > > Here, I have manually via debugger skipped sending the create_slot
> > > > > message for the third slot and we are showing an empty for it. This
> > > > > won't happen for pg_stat_all_tables, as it will set 0 or other initial
> > > > > values in such a case. I think we need to address this case.
> > > >
> > > > Good catch. I think it's better to set 0 to all counters and NULL to
> > > > reset_stats.
> > > >
> > > > >
> > > > > > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > > > > > even if it has, I thought we can avoid checking those garbage stats
> > > > > > frequently. It should not essentially be a problem for the hash table
> > > > > > to have entries up to max_replication_slots regardless of live or
> > > > > > already-dropped.
> > > > > >
> > > > >
> > > > > Yeah, but I guess we still might not save much by not doing it,
> > > > > especially because for the other cases like tables/functions, we are
> > > > > doing it without any threshold limit.
> > > >
> > > > Agreed.
> > > >
> > > > I've attached the updated patch, please review it.
> > >
> > > I've attached the new version patch that fixed the compilation error
> > > reported off-line by Amit.
> >
> > Thanks for the updated patch, few comments:
>
> Thank you for the review comments.
>
> > 1) We can change "slotent = pgstat_get_replslot_entry(slotname,
> > false);" to "return pgstat_get_replslot_entry(slotname, false);" and
> > remove the slotent variable.
> >
> > +       PgStat_StatReplSlotEntry *slotent = NULL;
> > +
> >         backend_read_statsfile();
> >
> > -       *nslots_p = nReplSlotStats;
> > -       return replSlotStats;
> > +       slotent = pgstat_get_replslot_entry(slotname, false);
> > +
> > +       return slotent;
>
> Fixed.
>
> >
> > 2) Should we change PGSTAT_FILE_FORMAT_ID as the statistic file format
> > has changed for replication statistics?
>
> The struct name is changed but I think the statistics file format has
> not changed by this patch. No?

I tried creating stats on HEAD, then applied this patch and tried
reading the stats; it could not get the values. The backtrace for this
is:
(gdb) bt
#0  0x000055fe12f8a93d in pg_detoast_datum (datum=0x7f7f7f7f7f7f7f7f)
at fmgr.c:1727
#1  0x000055fe12ec2a03 in pg_stat_get_replication_slot
(fcinfo=0x55fe1357e150) at pgstatfuncs.c:2316
#2  0x000055fe12b6af23 in ExecMakeTableFunctionResult
(setexpr=0x55fe13563c28, econtext=0x55fe13563b90,
argContext=0x55fe1357e030, expectedDesc=0x55fe13564968,
    randomAccess=false) at execSRF.c:234
#3  0x000055fe12b87ba3 in FunctionNext (node=0x55fe13563a78) at
nodeFunctionscan.c:95
#4  0x000055fe12b6c929 in ExecScanFetch (node=0x55fe13563a78,
accessMtd=0x55fe12b87aee <FunctionNext>, recheckMtd=0x55fe12b87eea
<FunctionRecheck>) at execScan.c:133
#5  0x000055fe12b6c9a2 in ExecScan (node=0x55fe13563a78,
accessMtd=0x55fe12b87aee <FunctionNext>, recheckMtd=0x55fe12b87eea
<FunctionRecheck>) at execScan.c:182
#6  0x000055fe12b87f40 in ExecFunctionScan (pstate=0x55fe13563a78) at
nodeFunctionscan.c:270
#7  0x000055fe12b687eb in ExecProcNodeFirst (node=0x55fe13563a78) at
execProcnode.c:462
#8  0x000055fe12b5c713 in ExecProcNode (node=0x55fe13563a78) at
../../../src/include/executor/executor.h:257
#9  0x000055fe12b5f147 in ExecutePlan (estate=0x55fe135635f0,
planstate=0x55fe13563a78, use_parallel_mode=false,
operation=CMD_SELECT, sendTuples=true, numberTuples=0,
    direction=ForwardScanDirection, dest=0x55fe13579558,
execute_once=true) at execMain.c:1551
#10 0x000055fe12b5cded in standard_ExecutorRun
(queryDesc=0x55fe1349acd0, direction=ForwardScanDirection, count=0,
execute_once=true) at execMain.c:361
#11 0x000055fe12b5cbfc in ExecutorRun (queryDesc=0x55fe1349acd0,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:305
#12 0x000055fe12dca9ce in PortalRunSelect (portal=0x55fe134ed2f0,
forward=true, count=0, dest=0x55fe13579558) at pquery.c:912
#13 0x000055fe12dca607 in PortalRun (portal=0x55fe134ed2f0,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x55fe13579558, altdest=0x55fe13579558,
    qc=0x7ffefa53cd30) at pquery.c:756
#14 0x000055fe12dc3915 in exec_simple_query
(query_string=0x55fe134796e0 "select * from pg_stat_replication_slots
;") at postgres.c:1196

I feel we can change CATALOG_VERSION_NO so that we will get this error
"The database cluster was initialized with CATALOG_VERSION_NO
2021XXXXX, but the server was compiled with CATALOG_VERSION_NO
2021XXXXX." which will prevent the above issue.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 21, 2021 at 9:39 AM vignesh C <vignesh21@gmail.com> wrote:
>
> I feel we can change CATALOG_VERSION_NO so that we will get this error
> "The database cluster was initialized with CATALOG_VERSION_NO
> 2021XXXXX, but the server was compiled with CATALOG_VERSION_NO
> 2021XXXXX." which will prevent the above issue.
>

Right, but we normally do that just before commit. We might want to
mention it in the commit message just as a Note so that we don't
forget to bump it before commit but otherwise, we don't need to change
it in the patch.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 21, 2021 at 9:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 9:39 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I feel we can change CATALOG_VERSION_NO so that we will get this error
> > "The database cluster was initialized with CATALOG_VERSION_NO
> > 2021XXXXX, but the server was compiled with CATALOG_VERSION_NO
> > 2021XXXXX." which will prevent the above issue.
> >
>
> Right, but we normally do that just before commit. We might want to
> mention it in the commit message just as a Note so that we don't
> forget to bump it before commit but otherwise, we don't need to change
> it in the patch.
>

Yes, that is fine with me.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I've attached the patch. In addition to the test Vignesh prepared, I
> added one test for the message for creating a slot that checks if the
> statistics are initialized after re-creating the same name slot.
>

I am not sure how useful your new test is, because you are testing it
for a slot name for which we have removed the slot file. It is not
related to the stat messages this patch is sending. I think we can
leave that for now. One other minor comment:

- * create the statistics for the replication slot.
+ * create the statistics for the replication slot.  In the cases where the
+ * message for dropping the old slot gets lost and a slot with the same
+ * name is created, since the stats will be initialized by the message
+ * for creating the slot the statistics are not accumulated into the
+ * old slot unless the messages for both creating and dropping slots with
+ * the same name got lost.  Just in case it happens, the user can reset
+ * the particular stats by pg_stat_reset_replication_slot().

I think we can change it to something like: " XXX In case, the
messages for creation and drop slot of the same name get lost and
create happens before (auto)vacuum cleans up the dead slot, the stats
will be accumulated into the old slot. One can imagine having OIDs for
each slot to avoid the accumulation of stats but that doesn't seem
worth doing as in practice this won't happen frequently.". Also, I am
not sure after your recent change whether it is a good idea to mention
something in docs. What do you think?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Dilip Kumar
Date:
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>
> I've attached the patch. In addition to the test Vignesh prepared, I
> added one test for the message for creating a slot that checks if the
> statistics are initialized after re-creating the same name slot.
> Please review it.

Overall the patch looks good to me.  However, I have one question: I
did not understand the reason behind moving the below code from
"pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot".

+        /*
+         * Check if the slot exists with the given name. It is possible that by
+         * the time this message is executed the slot is dropped but at least
+         * this check will ensure that the given name is for a valid slot.
+         */
+        slot = SearchNamedReplicationSlot(target, true);
+
+        if (!slot)
+            ereport(ERROR,
+                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                     errmsg("replication slot \"%s\" does not exist",
+                            target)));
+
+        /*
+         * Nothing to do for physical slots as we collect stats only for
+         * logical slots.
+         */
+        if (SlotIsPhysical(slot))
+            PG_RETURN_VOID();
+    }
+
     pgstat_reset_replslot_counter(target);

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> I have one question:
>
> + /*
> + * Create the replication slot stats hash table if we don't have
> + * it already.
> + */
> + if (replSlotStats == NULL)
>   {
> - if (namestrcmp(&replSlotStats[i].slotname, name) == 0)
> - return i; /* found */
> + HASHCTL hash_ctl;
> +
> + hash_ctl.keysize = sizeof(NameData);
> + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
> + hash_ctl.hcxt = pgStatLocalContext;
> +
> + replSlotStats = hash_create("Replication slots hash",
> + PGSTAT_REPLSLOT_HASH_SIZE,
> + &hash_ctl,
> + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
>   }
>
> It seems to me that the patch is always creating a hash table in
> pgStatLocalContext?  AFAIU, we need to create it in pgStatLocalContext
> when we read stats via backend_read_statsfile so that we can clear it
> at the end of the transaction. The db/function stats seems to be doing
> the same. Is there a reason why here we need to always create it in
> pgStatLocalContext?

I wanted to avoid creating the hash table if there is no replication
slot. But as you pointed out, we create the hash table even on a plain
lookup (i.e., when create_it is false), which is bad. So I think we can
have pgstat_get_replslot_entry() return NULL, without creating the hash
table, when the hash table is NULL and create_it is false, so that
backend processes (which only read stats via backend_read_statsfile())
never create it. Or another idea would be to always create the hash
table in pgstat_read_statsfiles(). That would simplify the code but
could waste memory if there is no replication slot. I slightly prefer
the former; what do you think?
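
For reference, the former would look roughly like this (untested
sketch, leaving the memory-context question aside):

static PgStat_StatReplSlotEntry *
pgstat_get_replslot_entry(NameData name, bool create_it)
{
	PgStat_StatReplSlotEntry *slotent;
	bool		found;

	if (replSlotStats == NULL)
	{
		HASHCTL		hash_ctl;

		/* a plain lookup with no hash table can't find anything */
		if (!create_it)
			return NULL;

		hash_ctl.keysize = sizeof(NameData);
		hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
		/* memory-context choice left out of this sketch */
		replSlotStats = hash_create("Replication slots hash",
									PGSTAT_REPLSLOT_HASH_SIZE,
									&hash_ctl,
									HASH_ELEM | HASH_BLOBS);
	}

	slotent = (PgStat_StatReplSlotEntry *)
		hash_search(replSlotStats, (void *) &name,
					create_it ? HASH_ENTER : HASH_FIND, &found);

	/* when HASH_ENTER adds a new entry, initialize the counters as today */

	return slotent;				/* NULL when a HASH_FIND misses */
}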

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > I've attached the patch. In addition to the test Vignesh prepared, I
> > added one test for the message for creating a slot that checks if the
> > statistics are initialized after re-creating the same name slot.
> >
>
> I am not sure how much useful your new test is because you are testing
> it for slot name for which we have removed the slot file. It is not
> related to stat messages this patch is sending. I think we can leave
> that for now.

I might be missing something, but I think the test is related to the
message for creating a slot, which initializes all counters. No?
Without that message, we would end up getting the old stats if a
message for dropping the slot gets lost (simulated by removing the slot
file) and a slot with the same name is created.

> One other minor comment:
>
> - * create the statistics for the replication slot.
> + * create the statistics for the replication slot.  In the cases where the
> + * message for dropping the old slot gets lost and a slot with the same
> + * name is created, since the stats will be initialized by the message
> + * for creating the slot the statistics are not accumulated into the
> + * old slot unless the messages for both creating and dropping slots with
> + * the same name got lost.  Just in case it happens, the user can reset
> + * the particular stats by pg_stat_reset_replication_slot().
>
> I think we can change it to something like: " XXX In case, the
> messages for creation and drop slot of the same name get lost and
> create happens before (auto)vacuum cleans up the dead slot, the stats
> will be accumulated into the old slot. One can imagine having OIDs for
> each slot to avoid the accumulation of stats but that doesn't seem
> worth doing as in practice this won't happen frequently.". Also, I am
> not sure after your recent change whether it is a good idea to mention
> something in docs. What do you think?

Both points make sense to me. I'll update the comment and remove the
mention in the doc in the next version patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 21, 2021 at 2:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > I have one question:
> >
> > + /*
> > + * Create the replication slot stats hash table if we don't have
> > + * it already.
> > + */
> > + if (replSlotStats == NULL)
> >   {
> > - if (namestrcmp(&replSlotStats[i].slotname, name) == 0)
> > - return i; /* found */
> > + HASHCTL hash_ctl;
> > +
> > + hash_ctl.keysize = sizeof(NameData);
> > + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
> > + hash_ctl.hcxt = pgStatLocalContext;
> > +
> > + replSlotStats = hash_create("Replication slots hash",
> > + PGSTAT_REPLSLOT_HASH_SIZE,
> > + &hash_ctl,
> > + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
> >   }
> >
> > It seems to me that the patch is always creating a hash table in
> > pgStatLocalContext?  AFAIU, we need to create it in pgStatLocalContext
> > when we read stats via backend_read_statsfile so that we can clear it
> > at the end of the transaction. The db/function stats seems to be doing
> > the same. Is there a reason why here we need to always create it in
> > pgStatLocalContext?
>
> I wanted to avoid creating the hash table if there is no replication
> slot. But as you pointed out, we create the hash table even when
> lookup (i.g., create_it is false), which is bad. So I think we can
> have pgstat_get_replslot_entry() return NULL without creating the hash
> table if the hash table is NULL and create_it is false so that backend
> processes don’t create the hash table, not via
> backend_read_statsfile(). Or another idea would be to always create
> the hash table in pgstat_read_statsfiles(). That way, it would
> simplify the code but could waste the memory if there is no
> replication slot.
>

If you create it after reading the 'R' message, as we do in the case
of the 'D' message, then it won't waste any memory. So probably
creating it in pgstat_read_statsfiles() would be better, unless you see
some other problem with that.
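
I.e. roughly like the following in the 'R' arm of
pgstat_read_statsfiles() (only a rough sketch, not the exact code):

			case 'R':
				{
					PgStat_StatReplSlotEntry slotbuf;
					PgStat_StatReplSlotEntry *slotent;

					if (fread(&slotbuf, 1, sizeof(slotbuf), fpin) != sizeof(slotbuf))
					{
						ereport(pgStatRunningInCollector ? LOG : WARNING,
								(errmsg("corrupted statistics file \"%s\"",
										statfile)));
						goto done;
					}

					/* build the hash only when we actually have slot stats */
					if (replSlotStats == NULL)
					{
						HASHCTL		hash_ctl;

						hash_ctl.keysize = sizeof(NameData);
						hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
						hash_ctl.hcxt = pgStatLocalContext;
						replSlotStats = hash_create("Replication slots hash",
													PGSTAT_REPLSLOT_HASH_SIZE,
													&hash_ctl,
													HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
					}

					slotent = (PgStat_StatReplSlotEntry *)
						hash_search(replSlotStats,
									(void *) &slotbuf.slotname,
									HASH_ENTER, NULL);
					memcpy(slotent, &slotbuf, sizeof(slotbuf));
				}
				break;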


--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 21, 2021 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > I've attached the patch. In addition to the test Vignesh prepared, I
> > > added one test for the message for creating a slot that checks if the
> > > statistics are initialized after re-creating the same name slot.
> > >
> >
> > I am not sure how much useful your new test is because you are testing
> > it for slot name for which we have removed the slot file. It is not
> > related to stat messages this patch is sending. I think we can leave
> > that for now.
>
> I might be missing something but I think the test is related to the
> message for creating a slot that initializes all counters. No? If
> there is no that message, we will end up getting old stats if a
> message for dropping slot gets lost (simulated by dropping slot file)
> and the same name slot is created.
>

The test is not waiting for the new slot creation message to reach the
stats collector. So, if the old slot data still exists in the file when
we read the stats via the backend, isn't there a chance that the old
slot stats data will still be there?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 6:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 2:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > I have one question:
> > >
> > > + /*
> > > + * Create the replication slot stats hash table if we don't have
> > > + * it already.
> > > + */
> > > + if (replSlotStats == NULL)
> > >   {
> > > - if (namestrcmp(&replSlotStats[i].slotname, name) == 0)
> > > - return i; /* found */
> > > + HASHCTL hash_ctl;
> > > +
> > > + hash_ctl.keysize = sizeof(NameData);
> > > + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry);
> > > + hash_ctl.hcxt = pgStatLocalContext;
> > > +
> > > + replSlotStats = hash_create("Replication slots hash",
> > > + PGSTAT_REPLSLOT_HASH_SIZE,
> > > + &hash_ctl,
> > > + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
> > >   }
> > >
> > > It seems to me that the patch is always creating a hash table in
> > > pgStatLocalContext?  AFAIU, we need to create it in pgStatLocalContext
> > > when we read stats via backend_read_statsfile so that we can clear it
> > > at the end of the transaction. The db/function stats seems to be doing
> > > the same. Is there a reason why here we need to always create it in
> > > pgStatLocalContext?
> >
> > I wanted to avoid creating the hash table if there is no replication
> > slot. But as you pointed out, we create the hash table even when
> > lookup (i.g., create_it is false), which is bad. So I think we can
> > have pgstat_get_replslot_entry() return NULL without creating the hash
> > table if the hash table is NULL and create_it is false so that backend
> > processes don’t create the hash table, not via
> > backend_read_statsfile(). Or another idea would be to always create
> > the hash table in pgstat_read_statsfiles(). That way, it would
> > simplify the code but could waste the memory if there is no
> > replication slot.
> >
>
> If you create it after reading 'R' message as we do in the case of 'D'
> message then it won't waste any memory. So probably creating in
> pgstat_read_statsfiles() would be better unless you see some other
> problem with that.

Yeah, I think that's the former approach I mentioned. I'll change it in
the next version of the patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 6:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > I've attached the patch. In addition to the test Vignesh prepared, I
> > > > added one test for the message for creating a slot that checks if the
> > > > statistics are initialized after re-creating the same name slot.
> > > >
> > >
> > > I am not sure how much useful your new test is because you are testing
> > > it for slot name for which we have removed the slot file. It is not
> > > related to stat messages this patch is sending. I think we can leave
> > > that for now.
> >
> > I might be missing something but I think the test is related to the
> > message for creating a slot that initializes all counters. No? If
> > there is no that message, we will end up getting old stats if a
> > message for dropping slot gets lost (simulated by dropping slot file)
> > and the same name slot is created.
> >
>
> The test is not waiting for a new slot creation message to reach the
> stats collector. So, if the old slot data still exists in the file and
> now when we read stats via backend, then won't there exists a chance
> that old slot stats data still exists?

You're right. We should wait for the message to reach the collector.
Or should we remove that test case?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 21, 2021 at 3:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > The test is not waiting for a new slot creation message to reach the
> > stats collector. So, if the old slot data still exists in the file and
> > now when we read stats via backend, then won't there exists a chance
> > that old slot stats data still exists?
>
> You're right. We should wait for the message to reach the collector.
> Or should we remove that test case?
>

I feel we can remove it. I am not sure how much value this additional
test case is adding.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 4:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > I've attached the patch. In addition to the test Vignesh prepared, I
> > added one test for the message for creating a slot that checks if the
> > statistics are initialized after re-creating the same name slot.
> > Please review it.
>
> Overall the patch looks good to me.  However, I have one question, I
> did not understand the reason behind moving the below code from
> "pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot"?

Andres pointed out that pgstat_reset_replslot_counter() acquires an lwlock [1]:

---
- pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
think pgstat.c has absolutely no business doing things on that level.
---

I changed the code so that pgstat_reset_replslot_counter() doesn't
acquire the lwlock directly, and I think it's more appropriate to do the
existence check for slots in pgstatfuncs.c rather than in pgstat.c.
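
With that, the pgstat.c side boils down to just building and sending
the reset message, roughly (sketch, not the exact patch text):

void
pgstat_reset_replslot_counter(const char *name)
{
	PgStat_MsgResetreplslotcounter msg;

	if (pgStatSock == PGINVALID_SOCKET)
		return;

	if (name)
	{
		namestrcpy(&msg.m_slotname, name);
		msg.clearall = false;
	}
	else
		msg.clearall = true;

	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETREPLSLOTCOUNTER);
	pgstat_send(&msg, sizeof(msg));
}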

Regards,

[1] https://www.postgresql.org/message-id/20210319185247.ldebgpdaxsowiflw%40alap3.anarazel.de



--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 21, 2021 at 7:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 3:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > The test is not waiting for a new slot creation message to reach the
> > > stats collector. So, if the old slot data still exists in the file and
> > > now when we read stats via backend, then won't there exists a chance
> > > that old slot stats data still exists?
> >
> > You're right. We should wait for the message to reach the collector.
> > Or should we remove that test case?
> >
>
> I feel we can remove it. I am not sure how much value this additional
> test case is adding.

Okay, removed.

I’ve attached the updated patch. Please review it.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Dilip Kumar
Date:
On Thu, Apr 22, 2021 at 7:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 21, 2021 at 4:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > I've attached the patch. In addition to the test Vignesh prepared, I
> > > added one test for the message for creating a slot that checks if the
> > > statistics are initialized after re-creating the same name slot.
> > > Please review it.
> >
> > Overall the patch looks good to me.  However, I have one question, I
> > did not understand the reason behind moving the below code from
> > "pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot"?
>
> Andres pointed out that pgstat_reset_replslot_counter() acquires lwlock[1]:
>
> ---
> - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
> think pgstat.c has absolutely no business doing things on that level.
> ---
>
> I changed the code so that pgstat_reset_replslot_counter() doesn't
> acquire directly lwlock but I think that it's appropriate to do the
> existence check for slots in pgstatfunc.c rather than pgstat.c.

Thanks for pointing that out.  It makes sense to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>

Few comments:
1.
I think we want stats collector to not use pgStatLocalContext unless
it has read the stats file similar to other cases. So probably, we
should allocate it in pgStatLocalContext when we read 'R' message in
pgstat_read_statsfiles. Also, the function pgstat_get_replslot_entry
should not use pgStatLocalContext to allocate the hash table.
2.
+ if (replSlotStatHash != NULL)
+ (void) hash_search(replSlotStatHash,
+    (void *) &(msg->m_slotname),
+    HASH_REMOVE,
+    NULL);

Why have you changed this part from using NameStr?
3.
+# Check that replicatoin slot stats are expected.

Typo. replicatoin/replication


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> Few comments:
> 1.
> I think we want stats collector to not use pgStatLocalContext unless
> it has read the stats file similar to other cases. So probably, we
> should allocate it in pgStatLocalContext when we read 'R' message in
> pgstat_read_statsfiles. Also, the function pgstat_get_replslot_entry
> should not use pgStatLocalContext to allocate the hash table.

Agreed.

> 2.
> + if (replSlotStatHash != NULL)
> + (void) hash_search(replSlotStatHash,
> +    (void *) &(msg->m_slotname),
> +    HASH_REMOVE,
> +    NULL);
>
> Why have you changed this part from using NameStr?

I thought that since the hash table is created with the key size
sizeof(NameData) it's better to use NameData for searching as well.

> 3.
> +# Check that replicatoin slot stats are expected.
>
> Typo. replicatoin/replication

Will fix in the next version.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 22, 2021 at 10:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
>
> > 2.
> > + if (replSlotStatHash != NULL)
> > + (void) hash_search(replSlotStatHash,
> > +    (void *) &(msg->m_slotname),
> > +    HASH_REMOVE,
> > +    NULL);
> >
> > Why have you changed this part from using NameStr?
>
> I thought that since the hash table is created with the key size
> sizeof(NameData) it's better to use NameData for searching as well.
>

Fair enough. I think this will give the same result either way.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 22, 2021 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 22, 2021 at 10:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> >
> > > 2.
> > > + if (replSlotStatHash != NULL)
> > > + (void) hash_search(replSlotStatHash,
> > > +    (void *) &(msg->m_slotname),
> > > +    HASH_REMOVE,
> > > +    NULL);
> > >
> > > Why have you changed this part from using NameStr?
> >
> > I thought that since the hash table is created with the key size
> > sizeof(NameData) it's better to use NameData for searching as well.
> >
>
> Fair enough. I think this will give the same result either way.

I've attached the updated version patch.

Besides addressing Amit's review comments, I changed
pgstat_read_statsfiles() so that it doesn't use
pgstat_get_replslot_entry(). That's because it was slightly unclear
why we would create the hash table beforehand and then also call
pgstat_get_replslot_entry() with 'create' = true. With this change, the
only caller of pgstat_get_replslot_entry() with 'create' = true is
pgstat_recv_replslot(), which makes the code clearer and safer since
pgstat_recv_replslot() is used only by the collector. Also, I ran
pgindent on the modified files.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 22, 2021 at 1:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>

Thanks, it looks good to me now. I'll review/test some more before
committing, but at this stage I would like to know from Andres or
others whether they see any problem with this approach to fixing a few
of the problems reported in this thread. Basically, it fixes the case
where the drop message is lost and we were then unable to record stats
for new slots, and the case of writing beyond the end of the array
when, after a restart, the number of slots whose stats are stored in
the stats file exceeds max_replication_slots.

It uses an HTAB instead of an array to record slot stats, and also
teaches pgstat_vacuum_stat() to search the stats hashtable for all the
dead replication slots and tell the collector to remove them. It still
uses slot_name as the key because we were not able to find a better way
to use the slot's index.
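
Conceptually, the pgstat_vacuum_stat() part is just something like this
(a simplified sketch, not the exact patch text):

	/* in pgstat_vacuum_stat(), after reading the stats file */
	if (replSlotStatHash)
	{
		HASH_SEQ_STATUS hstat;
		PgStat_StatReplSlotEntry *slotentry;

		hash_seq_init(&hstat, replSlotStatHash);
		while ((slotentry = (PgStat_StatReplSlotEntry *) hash_seq_search(&hstat)) != NULL)
		{
			CHECK_FOR_INTERRUPTS();

			/* if no live slot has this name, ask the collector to drop the entry */
			if (SearchNamedReplicationSlot(NameStr(slotentry->slotname), true) == NULL)
				pgstat_report_replslot_drop(NameStr(slotentry->slotname));
		}
	}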

Andres, unless you see any problems with this approach, I would like to
move forward with it early next week.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have made the changes to update the replication statistics at
> replication slot release. Please find the patch attached for the same.
> Thoughts?
>

Thanks, the changes look mostly good to me. The slot stats need to be
initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
StartupDecodingContext. Apart from that, I have moved the declaration
of UpdateDecodingStats from slot.h back to logical.h. I have also
added/edited a few comments. Please check and let me know what you
think of the attached.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have made the changes to update the replication statistics at
> > replication slot release. Please find the patch attached for the same.
> > Thoughts?
> >
>
> Thanks, the changes look mostly good to me. The slot stats need to be
> initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> StartupDecodingContext. Apart from that, I have moved the declaration
> of UpdateDecodingStats from slot.h back to logical.h. I have also
> added/edited a few comments. Please check and let me know what do you
> think of the attached?

The patch moves the slot stats into the ReplicationSlot data that is in
shared memory. If we have space to store the statistics in shared
memory, can we simply accumulate the stats there and make them
persistent without using the stats collector? I also think there is a
risk of increasing shared memory usage when we want to add other
statistics in the future.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > I have made the changes to update the replication statistics at
> > > replication slot release. Please find the patch attached for the same.
> > > Thoughts?
> > >
> >
> > Thanks, the changes look mostly good to me. The slot stats need to be
> > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > StartupDecodingContext. Apart from that, I have moved the declaration
> > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > added/edited a few comments. Please check and let me know what do you
> > think of the attached?
>
> The patch moves slot stats to the ReplicationSlot data that is on the
> shared memory. If we have a space to store the statistics in the
> shared memory can we simply accumulate the stats there and make them
> persistent without using the stats collector?
>

But for that, we need to write to the file at every commit/abort/prepare
(decode of commit), which I think will incur significant overhead.
Also, if we try to write only after a few commits, then there is a
danger of losing them, and there could still be a visible overhead for
small transactions.

> And I think there is
> also a risk to increase shared memory when we want to add other
> statistics in the future.
>

Yeah, so do you think it is not a good idea to store stats in
ReplicationSlot? Actually, storing them in a slot makes it easier to
send them during ReplicationSlotRelease, which is quite helpful if the
replication is interrupted for some reason. The other idea was that we
send stats every time we stream or spill changes.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > I have made the changes to update the replication statistics at
> > > > replication slot release. Please find the patch attached for the same.
> > > > Thoughts?
> > > >
> > >
> > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > added/edited a few comments. Please check and let me know what do you
> > > think of the attached?
> >
> > The patch moves slot stats to the ReplicationSlot data that is on the
> > shared memory. If we have a space to store the statistics in the
> > shared memory can we simply accumulate the stats there and make them
> > persistent without using the stats collector?
> >
>
> But for that, we need to write to file at every commit/abort/prepare
> (decode of commit) which I think will incur significant overhead.
> Also, we try to write after few commits then there is a danger of
> losing them and still there could be a visible overhead for small
> transactions.
>

I would prefer not to persist this information to a file; let's have
the stats collector handle persisting the stats.

> > And I think there is
> > also a risk to increase shared memory when we want to add other
> > statistics in the future.
> >
>
> Yeah, so do you think it is not a good idea to store stats in
> ReplicationSlot? Actually storing them in a slot makes it easier to
> send them during ReplicationSlotRelease which is quite helpful if the
> replication is interrupted due to some reason. Or the other idea was
> that we send stats every time we stream or spill changes.

We use around 64 bytes of shared memory to store the statistics
information per slot; I'm not sure if this is a lot of memory. If that
amount of memory is fine, then the approach of storing the stats there
seems fine to me. If it is too much, then we could use the other
approach of updating the stats when we stream or spill the changes, as
suggested by Amit.
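For reference, the ~64 bytes is simply the eight per-slot counters at 8 bytes
each; an illustrative (hypothetical) layout would be:

/* hypothetical per-slot counters kept in ReplicationSlot; 8 * 8 = 64 bytes */
typedef struct ReplicationSlotStatCounters
{
    int64       spill_txns;
    int64       spill_count;
    int64       spill_bytes;
    int64       stream_txns;
    int64       stream_count;
    int64       stream_bytes;
    int64       total_txns;
    int64       total_bytes;
} ReplicationSlotStatCounters;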

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 8:01 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > I have made the changes to update the replication statistics at
> > > > > replication slot release. Please find the patch attached for the same.
> > > > > Thoughts?
> > > > >
> > > >
> > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > added/edited a few comments. Please check and let me know what do you
> > > > think of the attached?
> > >
> > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > shared memory. If we have a space to store the statistics in the
> > > shared memory can we simply accumulate the stats there and make them
> > > persistent without using the stats collector?
> > >
> >
> > But for that, we need to write to file at every commit/abort/prepare
> > (decode of commit) which I think will incur significant overhead.
> > Also, we try to write after few commits then there is a danger of
> > losing them and still there could be a visible overhead for small
> > transactions.
> >
>
> I preferred not to persist this information to file, let's have stats
> collector handle the stats persisting.
>

Sawada-San, I would like to go ahead with your
"Use-HTAB-for-replication-slot-statistics" patch unless you think otherwise.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 8:01 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > I have made the changes to update the replication statistics at
> > > > > > replication slot release. Please find the patch attached for the same.
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > > added/edited a few comments. Please check and let me know what do you
> > > > > think of the attached?
> > > >
> > > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > > shared memory. If we have a space to store the statistics in the
> > > > shared memory can we simply accumulate the stats there and make them
> > > > persistent without using the stats collector?
> > > >
> > >
> > > But for that, we need to write to file at every commit/abort/prepare
> > > (decode of commit) which I think will incur significant overhead.
> > > Also, we try to write after few commits then there is a danger of
> > > losing them and still there could be a visible overhead for small
> > > transactions.
> > >
> >
> > I preferred not to persist this information to file, let's have stats
> > collector handle the stats persisting.
> >
>
> Sawada-San, I would like to go ahead with your
> "Use-HTAB-for-replication-slot-statistics" unless you think otherwise?

I agree that it's better to use the stats collector. So please go ahead.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 27, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > I have made the changes to update the replication statistics at
> > > > > replication slot release. Please find the patch attached for the same.
> > > > > Thoughts?
> > > > >
> > > >
> > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > added/edited a few comments. Please check and let me know what do you
> > > > think of the attached?
> > >
> > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > shared memory. If we have a space to store the statistics in the
> > > shared memory can we simply accumulate the stats there and make them
> > > persistent without using the stats collector?
> > >
> >
> > But for that, we need to write to file at every commit/abort/prepare
> > (decode of commit) which I think will incur significant overhead.
> > Also, we try to write after few commits then there is a danger of
> > losing them and still there could be a visible overhead for small
> > transactions.
> >
>
> I preferred not to persist this information to file, let's have stats
> collector handle the stats persisting.
>
> > > And I think there is
> > > also a risk to increase shared memory when we want to add other
> > > statistics in the future.
> > >
> >
> > Yeah, so do you think it is not a good idea to store stats in
> > ReplicationSlot? Actually storing them in a slot makes it easier to
> > send them during ReplicationSlotRelease which is quite helpful if the
> > replication is interrupted due to some reason. Or the other idea was
> > that we send stats every time we stream or spill changes.
>
> We use around 64 bytes of shared memory to store the statistics
> information per slot, I'm not sure if this is a lot of memory. If this
> memory is fine, then I felt the approach to store stats seems fine. If
> that memory is too much then we could use the other approach to update
> stats when we stream or spill the changes as suggested by Amit.

I agree that it makes it easier to send slot stats during
ReplicationSlotRelease(), but I'd prefer to avoid putting data that
doesn't need to be shared into shared memory if possible. And those
counters are not used by physical slots at all. If sending slot stats
every time we stream or spill changes doesn't affect the system much,
I think it's better than keeping slot stats in shared memory.

Also, I'm not sure it's better, but another idea would be to make the
slot stats a global variable like pgBufferUsage and use it during
decoding. Or we can set a proc-exit callback? To be honest, I'm not
sure which approach we should go with. Those approaches have pros and
cons.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > > And I think there is
> > > > also a risk to increase shared memory when we want to add other
> > > > statistics in the future.
> > > >
> > >
> > > Yeah, so do you think it is not a good idea to store stats in
> > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > send them during ReplicationSlotRelease which is quite helpful if the
> > > replication is interrupted due to some reason. Or the other idea was
> > > that we send stats every time we stream or spill changes.
> >
> > We use around 64 bytes of shared memory to store the statistics
> > information per slot, I'm not sure if this is a lot of memory. If this
> > memory is fine, then I felt the approach to store stats seems fine. If
> > that memory is too much then we could use the other approach to update
> > stats when we stream or spill the changes as suggested by Amit.
>
> I agree that makes it easier to send slot stats during
> ReplicationSlotRelease() but I'd prefer to avoid storing data that
> doesn't need to be shared in the shared buffer if possible.
>

Sounds reasonable, and we might add more stats in the future, which
would further increase the shared memory usage.

> And those
> counters are not used by physical slots at all. If sending slot stats
> every time we stream or spill changes doesn't affect the system much,
> I think it's better than having slot stats in the shared memory.
>

As the minimum size of logical_decoding_work_mem is 64KB, in the worst
case we will send stats after decoding that much data. I don't think it
would have much impact considering that we need to spill or stream that
many changes anyway. If it concerns any users, they can always increase
logical_decoding_work_mem. The default value is 64MB, at which point I
don't think sending the stats will matter.

> Also, not sure it’s better but another idea would be to make the slot
> stats a global variable like pgBufferUsage and use it during decoding.
>

Hmm, I think it is better to avoid global variables if possible.

> Or we can set a proc-exit callback? But to be honest, I'm not sure
> which approach we should go with. Those approaches have proc and cons.
>

I think we can try the first approach listed here which is to send
stats each time we spill or stream.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > > > And I think there is
> > > > > also a risk to increase shared memory when we want to add other
> > > > > statistics in the future.
> > > > >
> > > >
> > > > Yeah, so do you think it is not a good idea to store stats in
> > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > replication is interrupted due to some reason. Or the other idea was
> > > > that we send stats every time we stream or spill changes.
> > >
> > > We use around 64 bytes of shared memory to store the statistics
> > > information per slot, I'm not sure if this is a lot of memory. If this
> > > memory is fine, then I felt the approach to store stats seems fine. If
> > > that memory is too much then we could use the other approach to update
> > > stats when we stream or spill the changes as suggested by Amit.
> >
> > I agree that makes it easier to send slot stats during
> > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > doesn't need to be shared in the shared buffer if possible.
> >
>
> Sounds reasonable and we might add some stats in the future so that
> will further increase the usage of shared memory.
>
> > And those
> > counters are not used by physical slots at all. If sending slot stats
> > every time we stream or spill changes doesn't affect the system much,
> > I think it's better than having slot stats in the shared memory.
> >
>
> As the minimum size of logical_decoding_work_mem is 64KB, so in the
> worst case, we will send stats after decoding that many changes. I
> don't think it would impact too much considering that we need to spill
> or stream those many changes.  If it concerns any users they can
> always increase logical_decoding_work_mem. The default value is 64MB
> at which point, I don't think it will matter sending the stats.

Sounds good to me; I will rebase my previous patch and send an updated version for this.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 27, 2021 at 1:18 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > > And I think there is
> > > > > > also a risk to increase shared memory when we want to add other
> > > > > > statistics in the future.
> > > > > >
> > > > >
> > > > > Yeah, so do you think it is not a good idea to store stats in
> > > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > > replication is interrupted due to some reason. Or the other idea was
> > > > > that we send stats every time we stream or spill changes.
> > > >
> > > > We use around 64 bytes of shared memory to store the statistics
> > > > information per slot, I'm not sure if this is a lot of memory. If this
> > > > memory is fine, then I felt the approach to store stats seems fine. If
> > > > that memory is too much then we could use the other approach to update
> > > > stats when we stream or spill the changes as suggested by Amit.
> > >
> > > I agree that makes it easier to send slot stats during
> > > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > > doesn't need to be shared in the shared buffer if possible.
> > >
> >
> > Sounds reasonable and we might add some stats in the future so that
> > will further increase the usage of shared memory.
> >
> > > And those
> > > counters are not used by physical slots at all. If sending slot stats
> > > every time we stream or spill changes doesn't affect the system much,
> > > I think it's better than having slot stats in the shared memory.
> > >
> >
> > As the minimum size of logical_decoding_work_mem is 64KB, so in the
> > worst case, we will send stats after decoding that many changes. I
> > don't think it would impact too much considering that we need to spill
> > or stream those many changes.  If it concerns any users they can
> > always increase logical_decoding_work_mem. The default value is 64MB
> > at which point, I don't think it will matter sending the stats.
>
> Sounds good to me, I will rebase my previous patch and send a patch for this.

+1. Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > > > And I think there is
> > > > > > also a risk to increase shared memory when we want to add other
> > > > > > statistics in the future.
> > > > > >
> > > > >
> > > > > Yeah, so do you think it is not a good idea to store stats in
> > > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > > replication is interrupted due to some reason. Or the other idea was
> > > > > that we send stats every time we stream or spill changes.
> > > >
> > > > We use around 64 bytes of shared memory to store the statistics
> > > > information per slot, I'm not sure if this is a lot of memory. If this
> > > > memory is fine, then I felt the approach to store stats seems fine. If
> > > > that memory is too much then we could use the other approach to update
> > > > stats when we stream or spill the changes as suggested by Amit.
> > >
> > > I agree that makes it easier to send slot stats during
> > > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > > doesn't need to be shared in the shared buffer if possible.
> > >
> >
> > Sounds reasonable and we might add some stats in the future so that
> > will further increase the usage of shared memory.
> >
> > > And those
> > > counters are not used by physical slots at all. If sending slot stats
> > > every time we stream or spill changes doesn't affect the system much,
> > > I think it's better than having slot stats in the shared memory.
> > >
> >
> > As the minimum size of logical_decoding_work_mem is 64KB, so in the
> > worst case, we will send stats after decoding that many changes. I
> > don't think it would impact too much considering that we need to spill
> > or stream those many changes.  If it concerns any users they can
> > always increase logical_decoding_work_mem. The default value is 64MB
> > at which point, I don't think it will matter sending the stats.
>
> Sounds good to me, I will rebase my previous patch and send a patch for this.
>

The attached patch has the changes to update the statistics during
spill/stream, which prevents the statistics from being lost if the
decoding is interrupted.
Thoughts?

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Sawada-San, I would like to go ahead with your
> > "Use-HTAB-for-replication-slot-statistics" unless you think otherwise?
>
> I agree that it's better to use the stats collector. So please go ahead.
>

I have pushed this patch and seeing one buildfarm failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14

  starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2
s2_alter_tbl1_char s1_commit s2_get_changes
+ isolationtester: canceling step s1_init after 314 seconds
  step s1_init: SELECT 'init' FROM
pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
  ?column?

I am analyzing this. Do let me know if you have any thoughts on the same?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I have pushed this patch and seeing one buildfarm failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14
>
>   starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2
> s2_alter_tbl1_char s1_commit s2_get_changes
> + isolationtester: canceling step s1_init after 314 seconds
>   step s1_init: SELECT 'init' FROM
> pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
>   ?column?
>
> I am analyzing this.
>

After checking the logs below corresponding to this test, it seems the
test was executed and create_slot was successful:
2021-04-27 11:06:43.770 UTC [17694956:52] isolation/concurrent_ddl_dml
STATEMENT:  SELECT 'init' FROM
pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
2021-04-27 11:07:11.748 UTC [5243096:9] LOG:  checkpoint starting: time
2021-04-27 11:09:24.332 UTC [5243096:10] LOG:  checkpoint complete:
wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.716 s, sync=0.001 s, total=132.584 s; sync files=0,
longest=0.000 s, average=0.000 s; distance=198 kB, estimate=406 kB
2021-04-27 11:09:40.116 UTC [6226046:1] [unknown] LOG:  connection
received: host=[local]
2021-04-27 11:09:40.117 UTC [17694956:53] isolation/concurrent_ddl_dml
LOG:  statement: BEGIN;
2021-04-27 11:09:40.117 UTC [17694956:54] isolation/concurrent_ddl_dml
LOG:  statement: INSERT INTO tbl1 (val1, val2) VALUES (1, 1);
2021-04-27 11:09:40.118 UTC [17694956:55] isolation/concurrent_ddl_dml
LOG:  statement: INSERT INTO tbl2 (val1, val2) VALUES (1, 1);
2021-04-27 11:09:40.119 UTC [10944636:49] isolation/concurrent_ddl_dml
LOG:  statement: ALTER TABLE tbl1 ALTER COLUMN val2 TYPE character
varying;

I am not sure, but there is some possibility that even though the
create slot was successful, the isolation tester succeeded in canceling
it, maybe because create_slot finished at just about the same time. As
we can see from the logs, a checkpoint also happened during this test,
which could also contribute to the slowness of this particular command.

Also, I see a lot of messages like the one below, which indicate the
stats collector is also quite slow:
2021-04-27 10:57:59.385 UTC [18743536:1] LOG:  using stale statistics
instead of current ones because stats collector is not responding

I am not sure whether the timeout happened because the machine is slow
or whether it is in any way related to the code. I am seeing some
previous failures due to timeouts on this machine [1][2]. In those
failures, I see the "using stale stats...." message. Also, I am not
able to see how this patch could cause such a failure.

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-02-23%2004%3A23%3A56
[2] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2020-12-24%2005%3A31%3A43

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I have pushed this patch and seeing one buildfarm failure:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14
> >
> >   starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2
> > s2_alter_tbl1_char s1_commit s2_get_changes
> > + isolationtester: canceling step s1_init after 314 seconds
> >   step s1_init: SELECT 'init' FROM
> > pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
> >   ?column?
> >
> > I am analyzing this.
> >
>
> After checking below logs corresponding to this test, it seems test
> has been executed and create_slot was successful:

The pg_create_logical_replication_slot() was executed at 11:04:25:

2021-04-27 11:04:25.494 UTC [17694956:49] isolation/concurrent_ddl_dml
LOG:  statement: SELECT 'init' FROM
pg_create_logical_replication_slot('isolation_slot', 'test_decoding');

Therefore this command took 314 seconds, which matches the number the
isolation test reported. And the following logs come after it:

2021-04-27 11:06:43.770 UTC [17694956:50] isolation/concurrent_ddl_dml
LOG:  logical decoding found consistent point at 0/17F9078
2021-04-27 11:06:43.770 UTC [17694956:51] isolation/concurrent_ddl_dml
DETAIL:  There are no running transactions.

> 2021-04-27 11:06:43.770 UTC [17694956:52] isolation/concurrent_ddl_dml
> STATEMENT:  SELECT 'init' FROM
> pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
> 2021-04-27 11:07:11.748 UTC [5243096:9] LOG:  checkpoint starting: time
> 2021-04-27 11:09:24.332 UTC [5243096:10] LOG:  checkpoint complete:
> wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled;
> write=0.716 s, sync=0.001 s, total=132.584 s; sync files=0,
> longest=0.000 s, average=0.000 s; distance=198 kB, estimate=406 kB
> 2021-04-27 11:09:40.116 UTC [6226046:1] [unknown] LOG:  connection
> received: host=[local]
> 2021-04-27 11:09:40.117 UTC [17694956:53] isolation/concurrent_ddl_dml
> LOG:  statement: BEGIN;
> 2021-04-27 11:09:40.117 UTC [17694956:54] isolation/concurrent_ddl_dml
> LOG:  statement: INSERT INTO tbl1 (val1, val2) VALUES (1, 1);
> 2021-04-27 11:09:40.118 UTC [17694956:55] isolation/concurrent_ddl_dml
> LOG:  statement: INSERT INTO tbl2 (val1, val2) VALUES (1, 1);
> 2021-04-27 11:09:40.119 UTC [10944636:49] isolation/concurrent_ddl_dml
> LOG:  statement: ALTER TABLE tbl1 ALTER COLUMN val2 TYPE character
> varying;
>
> I am not sure but there is some possibility that even though create
> slot is successful, the isolation tester got successful in canceling
> it, maybe because create_slot is just finished at the same time.

Yeah, we see the test log "canceling step s1_init after 314 seconds"
but don't see any server log indicating that the query was canceled.

>  As we
> can see from logs, during this test checkpoint also happened which
> could also lead to the slowness of this particular command.

Yes. I also think the checkpoint could have contributed somewhat to the
slowness. And since create_slot() took two minutes to find a consistent
snapshot, the system might have already been busy.

>
> Also, I see a lot of messages like below which indicate stats
> collector is also quite slow:
> 2021-04-27 10:57:59.385 UTC [18743536:1] LOG:  using stale statistics
> instead of current ones because stats collector is not responding
>
> I am not sure if the timeout happened because the machine is slow or
> is it in any way related to code. I am seeing some previous failures
> due to timeout on this machine [1][2]. In those failures, I see the
> "using stale stats...." message.

It seems like a time-dependent issue but I'm wondering why the logical
decoding test failed at this time.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 8:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > I am not sure if the timeout happened because the machine is slow or
> > is it in any way related to code. I am seeing some previous failures
> > due to timeout on this machine [1][2]. In those failures, I see the
> > "using stale stats...." message.
>
> It seems like a time-dependent issue but I'm wondering why the logical
> decoding test failed at this time.
>

As per the analysis done so far, it appears to be because the machine
is slow, which leads to a timeout, and there appear to be some prior
failures related to timeouts as well. I think it is better to wait for
another run (or a few runs) to see if this occurs again.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 28, 2021 at 8:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 8:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > I am not sure if the timeout happened because the machine is slow or
> > > is it in any way related to code. I am seeing some previous failures
> > > due to timeout on this machine [1][2]. In those failures, I see the
> > > "using stale stats...." message.
> >
> > It seems like a time-dependent issue but I'm wondering why the logical
> > decoding test failed at this time.
> >
>
> As per the analysis done till now, it appears to be due to the reason
> that the machine is slow which leads to timeout and there appear to be
> some prior failures related to timeout as well. I think it is better
> to wait for another run (or few runs) to see if this occurs again.
>

Yes, the checkpoint seems to take a lot of time, which could be because
the machine is slow. Let's wait for the next run and see.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Attached patch has the changes to update statistics during
> spill/stream which prevents the statistics from being lost during
> interrupt.
>

 void
-UpdateDecodingStats(LogicalDecodingContext *ctx)
+UpdateDecodingStats(ReorderBuffer *rb)

I don't think you need to change this interface because
reorderbuffer->private_data points to LogicalDecodingContext. See
StartupDecodingContext. Other than that there is a comment in the code
"Update the decoding stats at transaction prepare/commit/abort...".
This patch should extend that comment by saying something like
"Additionally we send the stats when we spill or stream the changes to
avoid losing them in case the decoding is interrupted."
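So at the spill/stream call sites something like this should be enough (sketch):

/* rb->private_data points back to the owning LogicalDecodingContext */
LogicalDecodingContext *ctx = (LogicalDecodingContext *) rb->private_data;

/*
 * Send the accumulated stats now, so that they are not lost if the
 * decoding is interrupted before the transaction completes.
 */
UpdateDecodingStats(ctx);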


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 28, 2021 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Attached patch has the changes to update statistics during
> > spill/stream which prevents the statistics from being lost during
> > interrupt.
> >
>
>  void
> -UpdateDecodingStats(LogicalDecodingContext *ctx)
> +UpdateDecodingStats(ReorderBuffer *rb)
>
> I don't think you need to change this interface because
> reorderbuffer->private_data points to LogicalDecodingContext. See
> StartupDecodingContext. Other than that there is a comment in the code
> "Update the decoding stats at transaction prepare/commit/abort...".
> This patch should extend that comment by saying something like
> "Additionally we send the stats when we spill or stream the changes to
> avoid losing them in case the decoding is interrupted."

Thanks for the comments, Please find the attached v4 patch having the
fixes for the same.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Attached patch has the changes to update statistics during
> > spill/stream which prevents the statistics from being lost during
> > interrupt.
> >
>
>  void
> -UpdateDecodingStats(LogicalDecodingContext *ctx)
> +UpdateDecodingStats(ReorderBuffer *rb)
>
> I don't think you need to change this interface because
> reorderbuffer->private_data points to LogicalDecodingContext. See
> StartupDecodingContext.

+1

With this approach, we could still miss the totalTxns and totalBytes
updates if the decoding of a transaction that is large, but smaller than
logical_decoding_work_mem, is interrupted, right?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > Attached patch has the changes to update statistics during
> > > spill/stream which prevents the statistics from being lost during
> > > interrupt.
> > >
> >
> >  void
> > -UpdateDecodingStats(LogicalDecodingContext *ctx)
> > +UpdateDecodingStats(ReorderBuffer *rb)
> >
> > I don't think you need to change this interface because
> > reorderbuffer->private_data points to LogicalDecodingContext. See
> > StartupDecodingContext.
>
> +1
>
> With this approach, we could still miss the totalTxns and totalBytes
> updates if the decoding a large but less than
> logical_decoding_work_mem is interrupted, right?

Yes, you are right. I felt that is reasonable, and that way it reduces
the frequency of calls to the stats collector to update the stats.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > Attached patch has the changes to update statistics during
> > > spill/stream which prevents the statistics from being lost during
> > > interrupt.
> > >
> >
> >  void
> > -UpdateDecodingStats(LogicalDecodingContext *ctx)
> > +UpdateDecodingStats(ReorderBuffer *rb)
> >
> > I don't think you need to change this interface because
> > reorderbuffer->private_data points to LogicalDecodingContext. See
> > StartupDecodingContext.
>
> +1
>
> With this approach, we could still miss the totalTxns and totalBytes
> updates if the decoding a large but less than
> logical_decoding_work_mem is interrupted, right?
>

Right, but is there some simple way to avoid that? I see two
possibilities: (a) store the stats in ReplicationSlot and then send them
at ReplicationSlotRelease, but that will lead to an increase in shared
memory usage and, as per the discussion above, we don't want that;
(b) send intermediate stats after decoding, say, N changes, but for
that we need to additionally compute the size of each change, which
might add some overhead.

I am not sure if any of these alternatives are a good idea. What do
you think? Do you have any other ideas for this?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for the update! The patch looks good to me.
> >

BTW, regarding the commit f5fc2f5b23 that added total_txns and
total_bytes: we add the reorder buffer size (i.e., rb->size) to
rb->totalBytes, but I think we should use the transaction size (i.e.,
txn->size) instead:

@@ -1363,6 +1365,11 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
ReorderBufferIterTXNState *state)
        dlist_delete(&change->node);
        dlist_push_tail(&state->old_change, &change->node);

+       /*
+        * Update the total bytes processed before releasing the current set
+        * of changes and restoring the new set of changes.
+        */
+       rb->totalBytes += rb->size;
        if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
                                        &state->entries[off].segno))
        {
@@ -2363,6 +2370,20 @@ ReorderBufferProcessTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn,
        ReorderBufferIterTXNFinish(rb, iterstate);
        iterstate = NULL;

+       /*
+        * Update total transaction count and total transaction bytes
+        * processed. Ensure to not count the streamed transaction multiple
+        * times.
+        *
+        * Note that the statistics computation has to be done after
+        * ReorderBufferIterTXNFinish as it releases the serialized change
+        * which we have already accounted in ReorderBufferIterTXNNext.
+        */
+       if (!rbtxn_is_streamed(txn))
+           rb->totalTxns++;
+
+       rb->totalBytes += rb->size;
+

IIUC rb->size could include multiple decoded transactions, so it's not
appropriate to add that value to the counter as the transaction size
passed to the logical decoding plugin. If the reorder buffer processes a
transaction while a large transaction is still being decoded, we could
end up increasing txn_bytes more than necessary.

Please review the attached patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 12:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> BTW regarding the commit f5fc2f5b23 that added total_txns and
> total_bytes, we add the reorder buffer size (i.g., rb->size) to
> rb->totalBytes but I think we should use the transaction size (i.g.,
> txn->size) instead:
>

You are right about the problem, but I think your proposed fix also
won't work, because txn->size always has only the current transaction's
size, which will be the top transaction's own size when a transaction
has multiple subtransactions; it won't include the subtxn->size. For
example, you can try to decode the below kind of transaction:
Begin;
insert into t1 values(1);
savepoint s1;
insert into t1 values(2);
savepoint s2;
insert into t1 values(3);
commit;

I think we can fix it by keeping track of total_size in toptxn as we
are doing for the streaming case in ReorderBufferChangeMemoryUpdate.
We can probably do it for non-streaming cases as well.
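To be concrete, the idea is roughly the following inside
ReorderBufferChangeMemoryUpdate() (an untested sketch; total_size is the
field such a change would maintain):

/* accumulate into the top-level transaction as well */
ReorderBufferTXN *toptxn = (txn->toptxn != NULL) ? txn->toptxn : txn;

if (addition)
{
    txn->size += sz;
    rb->size += sz;
    toptxn->total_size += sz;
}
else
{
    Assert((rb->size >= sz) && (txn->size >= sz));
    txn->size -= sz;
    rb->size -= sz;
    toptxn->total_size -= sz;
}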

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > BTW regarding the commit f5fc2f5b23 that added total_txns and
> > total_bytes, we add the reorder buffer size (i.g., rb->size) to
> > rb->totalBytes but I think we should use the transaction size (i.g.,
> > txn->size) instead:
> >
>
> You are right about the problem but I think your proposed fix also
> won't work because txn->size always has current transaction size which
> will be top-transaction in the case when a transaction has multiple
> subtransactions. It won't include the subtxn->size.

Right. I missed the point that ReorderBufferProcessTXN() also processes
subtransactions.

> I think we can fix it by keeping track of total_size in toptxn as we
> are doing for the streaming case in ReorderBufferChangeMemoryUpdate.
> We can probably do it for non-streaming cases as well.

Agreed.

I've updated the patch. What do you think?

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > I think we can fix it by keeping track of total_size in toptxn as we
> > are doing for the streaming case in ReorderBufferChangeMemoryUpdate.
> > We can probably do it for non-streaming cases as well.
>
> Agreed.
>
> I've updated the patch. What do you think?
>

@@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
ReorderBufferIterTXNState *state)
  * Update the total bytes processed before releasing the current set
  * of changes and restoring the new set of changes.
  */
- rb->totalBytes += rb->size;
+ rb->totalBytes += entry->txn->total_size;
  if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
  &state->entries[off].segno))

I have not tested this, but in the above change won't you need to check
txn->toptxn for subtxns?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > >
> > > > Attached patch has the changes to update statistics during
> > > > spill/stream which prevents the statistics from being lost during
> > > > interrupt.
> > > >
> > >
> > >  void
> > > -UpdateDecodingStats(LogicalDecodingContext *ctx)
> > > +UpdateDecodingStats(ReorderBuffer *rb)
> > >
> > > I don't think you need to change this interface because
> > > reorderbuffer->private_data points to LogicalDecodingContext. See
> > > StartupDecodingContext.
> >
> > +1
> >
> > With this approach, we could still miss the totalTxns and totalBytes
> > updates if the decoding a large but less than
> > logical_decoding_work_mem is interrupted, right?
> >
>
> Right, but is there some simple way to avoid that? I see two
> possibilities (a) store stats in ReplicationSlot and then send them at
> ReplicationSlotRelease but that will lead to an increase in shared
> memory usage and as per the discussion above, we don't want that, (b)
> send intermediate stats after decoding say N changes but for that, we
> need to additionally compute the size of each change which might add
> some overhead.

Right.

> I am not sure if any of these alternatives are a good idea. What do
> you think? Do you have any other ideas for this?

I've been considering some ideas but haven't come up with a good one
yet. It's just an idea and not tested, but how about having
CreateDecodingContext() register a before_shmem_exit() callback with the
decoding context to ensure that we send slot stats even on
interruption, and having FreeDecodingContext() cancel the callback?
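Roughly (an untested sketch; the callback name is made up):

/* hypothetical callback: flush the slot stats before shared memory goes away */
static void
send_slot_stats_on_exit(int code, Datum arg)
{
    LogicalDecodingContext *ctx = (LogicalDecodingContext *) DatumGetPointer(arg);

    UpdateDecodingStats(ctx);
}

/* in CreateDecodingContext() */
before_shmem_exit(send_slot_stats_on_exit, PointerGetDatum(ctx));

/* in FreeDecodingContext() */
cancel_before_shmem_exit(send_slot_stats_on_exit, PointerGetDatum(ctx));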

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Tom Lane
Date:
It seems that the test case added by f5fc2f5b2 is still a bit
unstable, even after c64dcc7fe:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-04-23%2006%3A20%3A12

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-04-24%2018%3A20%3A10

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2021-04-28%2017%3A53%3A14

(The snapper run fails to show regression.diffs, so it's not certain
that it's the same failure as peripatus, but ...)

            regards, tom lane



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 29, 2021 at 5:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> It seems that the test case added by f5fc2f5b2 is still a bit
> unstable, even after c64dcc7fe:

Hmm, I don't see the exact cause yet, but there are two possibilities:
some transactions were really spilled, or it showed the old stats due
to losing the drop (and create) slot messages. For the former case, it
seems better to create the slot just before the insertion and to set
logical_decoding_work_mem to the default (64MB). For the latter case,
maybe we can use a different slot name than the one used in other
tests?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 5:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > It seems that the test case added by f5fc2f5b2 is still a bit
> > unstable, even after c64dcc7fe:
>
> Hmm, I don't see the exact cause yet but there are two possibilities:
> some transactions were really spilled,
>

This is the first test and it inserts just one small record, so how can
it lead to a spill of data? Do you mean to say that maybe some
background process has written some transaction which leads to a spill
of data?

> and it showed the old stats due
> to losing the drop (and create) slot messages.
>

Yeah, something like this could happen. Another possibility here could
be that before the stats collector has processed the drop and create
messages, we have inquired about the stats, which leads to it giving us
the old stats. Note that we don't wait for the 'drop' or 'create'
message to be delivered, so there is a possibility of that. What do you
think?

> For the former case, it
> seems to better to create the slot just before the insertion and
> setting logical_decoding_work_mem to the default (64MB). For the
> latter case, maybe we can use a different name slot than the name used
> in other tests?
>

How about doing both of the above suggestions? Alternatively, we could
wait for both the 'drop' and 'create' messages to be delivered, but that
might be overkill.


-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> This is the first test and inserts just one small record, so how it
> can lead to spill of data. Do you mean to say that may be some
> background process has written some transaction which leads to a spill
> of data?

autovacuum, say?

> Yeah, something like this could happen. Another possibility here could
> be that before the stats collector has processed drop and create
> messages, we have enquired about the stats which lead to it giving us
> the old stats. Note, that we don't wait for 'drop' or 'create' message
> to be delivered. So, there is a possibility of the same. What do you
> think?

You should take a close look at the stats test in the main regression
tests.  We had to jump through *high* hoops to get that to be stable,
and yet it still fails semi-regularly.  This looks like pretty much the
same thing, and so I'm pessimistically inclined to guess that it will
never be entirely stable.

(At least not before the fabled stats collector rewrite, which may well
introduce some entirely new set of failure modes.)

Do we really need this test in this form?  Perhaps it could be converted
to a TAP test that's a bit more forgiving.

            regards, tom lane



Re: Replication slot stats misgivings

From
Andres Freund
Date:
On 2021-04-28 23:20:00 -0400, Tom Lane wrote:
> (At least not before the fabled stats collector rewrite, which may well
> introduce some entirely new set of failure modes.)

FWIW, I added a function that forces a flush there. That can be done
synchronously and the underlying functionality needs to exist anyway to
deal with backend exit. Makes it a *lot* easier to write tests for stats
related things...



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 29, 2021 at 8:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > This is the first test and inserts just one small record, so how it
> > can lead to spill of data. Do you mean to say that may be some
> > background process has written some transaction which leads to a spill
> > of data?
>
> autovacuum, say?
>
> > Yeah, something like this could happen. Another possibility here could
> > be that before the stats collector has processed drop and create
> > messages, we have enquired about the stats which lead to it giving us
> > the old stats. Note, that we don't wait for 'drop' or 'create' message
> > to be delivered. So, there is a possibility of the same. What do you
> > think?
>
> You should take a close look at the stats test in the main regression
> tests.  We had to jump through *high* hoops to get that to be stable,
> and yet it still fails semi-regularly.  This looks like pretty much the
> same thing, and so I'm pessimistically inclined to guess that it will
> never be entirely stable.
>

True, it is possible that we can't make it entirely stable, but I would
like to try some more before giving up on this. Otherwise, I guess the
other possibility is to remove some of the recently added tests, or
perhaps change them to be more forgiving. For example, we can change
the currently failing test to not check the 'spill*' counts and rely on
just the 'total*' counts, which will work even in the scenarios we
discussed for this failure, but it will reduce the
efficiency/completeness of the test case.

> (At least not before the fabled stats collector rewrite, which may well
> introduce some entirely new set of failure modes.)
>
> Do we really need this test in this form?  Perhaps it could be converted
> to a TAP test that's a bit more forgiving.
>

We have a TAP test for slot stats, but there we are checking some
scenarios across a restart. We can surely move these tests there as
well, but it is not apparent to me how that would make a difference.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > I am not sure if any of these alternatives are a good idea. What do
> > you think? Do you have any other ideas for this?
>
> I've been considering some ideas but don't come up with a good one
> yet. It’s just an idea and not tested but how about having
> CreateDecodingContext() register before_shmem_exit() callback with the
> decoding context to ensure that we send slot stats even on
> interruption. And FreeDecodingContext() cancels the callback.
>

Is it a good idea to send stats while exiting and rely on that? I think
before_shmem_exit is mostly used for cleanup purposes, so I'm not sure
we can rely on it for this. I think we can't be sure that in all cases
we will send all the stats, so maybe Vignesh's patch is sufficient to
cover the cases where we would otherwise lose the stats after having
processed a large amount of data.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 29, 2021 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > It seems that the test case added by f5fc2f5b2 is still a bit
> > > unstable, even after c64dcc7fe:
> >
> > Hmm, I don't see the exact cause yet but there are two possibilities:
> > some transactions were really spilled,
> >
>
> This is the first test and inserts just one small record, so how it
> can lead to spill of data. Do you mean to say that may be some
> background process has written some transaction which leads to a spill
> of data?

Not sure, but I thought that the logical decoding started to decode
from a relatively old point for some reason and decoded incomplete
transactions that weren't shown in the result.

>
> > and it showed the old stats due
> > to losing the drop (and create) slot messages.
> >
>
> Yeah, something like this could happen. Another possibility here could
> be that before the stats collector has processed drop and create
> messages, we have enquired about the stats which lead to it giving us
> the old stats. Note, that we don't wait for 'drop' or 'create' message
> to be delivered. So, there is a possibility of the same. What do you
> think?

Yeah, that could happen even if no message got dropped.

>
> > For the former case, it
> > seems to better to create the slot just before the insertion and
> > setting logical_decoding_work_mem to the default (64MB). For the
> > latter case, maybe we can use a different name slot than the name used
> > in other tests?
> >
>
> How about doing both of the above suggestions? Alternatively, we can
> wait for both 'drop' and 'create' message to be delivered but that
> might be overkill.

Agreed. I've attached a patch doing both things.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > How about doing both of the above suggestions? Alternatively, we can
> > wait for both 'drop' and 'create' message to be delivered but that
> > might be overkill.
>
> Agreed. Attached the patch doing both things.
>

Thanks, the patch LGTM. I'll wait for a day before committing to see
if anyone has better ideas.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > It seems that the test case added by f5fc2f5b2 is still a bit
> > > > unstable, even after c64dcc7fe:
> > >
> > > Hmm, I don't see the exact cause yet but there are two possibilities:
> > > some transactions were really spilled,
> > >
> >
> > This is the first test and inserts just one small record, so how it
> > can lead to spill of data. Do you mean to say that may be some
> > background process has written some transaction which leads to a spill
> > of data?
>
> Not sure but I thought that the logical decoding started to decodes
> from a relatively old point for some reason and decoded incomplete
> transactions that weren’t shown in the result.
>
> >
> > > and it showed the old stats due
> > > to losing the drop (and create) slot messages.
> > >
> >
> > Yeah, something like this could happen. Another possibility here could
> > be that before the stats collector has processed drop and create
> > messages, we have enquired about the stats which lead to it giving us
> > the old stats. Note, that we don't wait for 'drop' or 'create' message
> > to be delivered. So, there is a possibility of the same. What do you
> > think?
>
> Yeah, that could happen even if any message didn't get dropped.
>
> >
> > > For the former case, it
> > > seems to better to create the slot just before the insertion and
> > > setting logical_decoding_work_mem to the default (64MB). For the
> > > latter case, maybe we can use a different name slot than the name used
> > > in other tests?
> > >
> >
> > How about doing both of the above suggestions? Alternatively, we can
> > wait for both 'drop' and 'create' message to be delivered but that
> > might be overkill.
>
> Agreed. Attached the patch doing both things.

Having a different slot name should solve the problem. The patch looks
good to me.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
> ReorderBufferIterTXNState *state)
>   * Update the total bytes processed before releasing the current set
>   * of changes and restoring the new set of changes.
>   */
> - rb->totalBytes += rb->size;
> + rb->totalBytes += entry->txn->total_size;
>   if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
>   &state->entries[off].segno))
>
> I have not tested this but won't in the above change you need to check
> txn->toptxn for subtxns?
>

Now, I am able to reproduce this issue:
Create table t1(c1 int);
select pg_create_logical_replication_slot('s1', 'test_decoding');
Begin;
insert into t1 values(1);
savepoint s1;
insert into t1 select generate_series(1, 100000);
commit;

postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
 count
--------
 100005
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 s1        |          0 |           0 |           0 |           0 |            0 |            0 |          2 |    13200672 | 2021-04-29 14:33:55.156566+05:30
(1 row)

select * from pg_stat_reset_replication_slot('s1');

Now reduce the logical decoding work mem to allow spilling.
postgres=# set logical_decoding_work_mem='64kB';
SET
postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
 count
--------
 100005
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 s1        |          1 |         202 |    13200000 |           0 |            0 |            0 |          2 |         672 | 2021-04-29 14:35:21.836613+05:30
(1 row)

You can notice that after we have allowed spilling, the 'total_bytes'
stat shows a different value. The attached patch fixes the issue for
me. Let me know what you think about this.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:


On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
> > ReorderBufferIterTXNState *state)
> >   * Update the total bytes processed before releasing the current set
> >   * of changes and restoring the new set of changes.
> >   */
> > - rb->totalBytes += rb->size;
> > + rb->totalBytes += entry->txn->total_size;
> >   if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
> >   &state->entries[off].segno))
> >
> > I have not tested this but won't in the above change you need to check
> > txn->toptxn for subtxns?
> >
>
> Now, I am able to reproduce this issue:
> Create table t1(c1 int);
> select pg_create_logical_replication_slot('s', 'test_decoding');
> Begin;
> insert into t1 values(1);
> savepoint s1;
> insert into t1 select generate_series(1, 100000);
> commit;
>
> postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
>  count
> --------
>  100005
> (1 row)
>
> postgres=# select * from pg_stat_replication_slots;
>  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> stream_count | stream_bytes | total_txns | total_bytes |
> stats_reset
> -----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
>  s1        |          0 |           0 |           0 |           0 |
>         0 |            0 |          2 |    13200672 | 2021-04-29
> 14:33:55.156566+05:30
> (1 row)
>
> select * from pg_stat_reset_replication_slot('s1');
>
> Now reduce the logical decoding work mem to allow spilling.
> postgres=# set logical_decoding_work_mem='64kB';
> SET
> postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
>  count
> --------
>  100005
> (1 row)
>
> postgres=# select * from pg_stat_replication_slots;
>  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> stream_count | stream_bytes | total_txns | total_bytes |
> stats_reset
> -----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
>  s1        |          1 |         202 |    13200000 |           0 |
>         0 |            0 |          2 |         672 | 2021-04-29
> 14:35:21.836613+05:30
> (1 row)
>
> You can notice that after we have allowed spilling the 'total_bytes'
> stats is showing a different value. The attached patch fixes the issue
> for me. Let me know what do you think about this?

I found one issue with the following scenario when testing with logical_decoding_work_mem as 64kB:

BEGIN;
INSERT INTO t1 values(generate_series(1,10000));
SAVEPOINT s1;
INSERT INTO t1 values(generate_series(1,10000));
COMMIT;
SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL,
        NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
select * from pg_stat_replication_slots;
    slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset            
------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1 |     4262016 | 2021-04-29 17:50:00.080663+05:30
(1 row)

Same thing works fine with logical_decoding_work_mem as 64MB:
select * from pg_stat_replication_slots;
   slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset            
------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1 |     2640000 | 2021-04-29 17:50:00.080663+05:30
(1 row)

The patch required one change:
- rb->totalBytes += rb->size;
+ if (entry->txn->toptxn)
+     rb->totalBytes += entry->txn->toptxn->total_size;
+ else
+     rb->totalBytes += entry->txn->total_size;

The above should be changed to:
- rb->totalBytes += rb->size;
+ if (entry->txn->toptxn)
+     rb->totalBytes += entry->txn->toptxn->total_size;
+ else
+     rb->totalBytes += entry->txn->size;

Attached patch fixes the issue.
Thoughts?

Regards,
Vignesh
Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, Apr 29, 2021 at 9:44 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
>
> On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
> > > ReorderBufferIterTXNState *state)
> > >   * Update the total bytes processed before releasing the current set
> > >   * of changes and restoring the new set of changes.
> > >   */
> > > - rb->totalBytes += rb->size;
> > > + rb->totalBytes += entry->txn->total_size;
> > >   if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
> > >   &state->entries[off].segno))
> > >
> > > I have not tested this but won't in the above change you need to check
> > > txn->toptxn for subtxns?
> > >
> >
> > Now, I am able to reproduce this issue:
> > Create table t1(c1 int);
> > select pg_create_logical_replication_slot('s', 'test_decoding');
> > Begin;
> > insert into t1 values(1);
> > savepoint s1;
> > insert into t1 select generate_series(1, 100000);
> > commit;
> >
> > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
> >  count
> > --------
> >  100005
> > (1 row)
> >
> > postgres=# select * from pg_stat_replication_slots;
> >  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> > stream_count | stream_bytes | total_txns | total_bytes |
> > stats_reset
> >
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> >  s1        |          0 |           0 |           0 |           0 |
> >         0 |            0 |          2 |    13200672 | 2021-04-29
> > 14:33:55.156566+05:30
> > (1 row)
> >
> > select * from pg_stat_reset_replication_slot('s1');
> >
> > Now reduce the logical decoding work mem to allow spilling.
> > postgres=# set logical_decoding_work_mem='64kB';
> > SET
> > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
> >  count
> > --------
> >  100005
> > (1 row)
> >
> > postgres=# select * from pg_stat_replication_slots;
> >  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> > stream_count | stream_bytes | total_txns | total_bytes |
> > stats_reset
> >
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> >  s1        |          1 |         202 |    13200000 |           0 |
> >         0 |            0 |          2 |         672 | 2021-04-29
> > 14:35:21.836613+05:30
> > (1 row)
> >
> > You can notice that after we have allowed spilling the 'total_bytes'
> > stats is showing a different value. The attached patch fixes the issue
> > for me. Let me know what do you think about this?
>
> I found one issue with the following scenario when testing with logical_decoding_work_mem as 64kB:
>
> BEGIN;
> INSERT INTO t1 values(generate_series(1,10000));
> SAVEPOINT s1;
> INSERT INTO t1 values(generate_series(1,10000));
> COMMIT;
> SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL,
>         NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> select * from pg_stat_replication_slots;
>     slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
> ------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
>  regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1 |     4262016 | 2021-04-29 17:50:00.080663+05:30
> (1 row)
>
> Same thing works fine with logical_decoding_work_mem as 64MB:
> select * from pg_stat_replication_slots;
>    slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
> ------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
>  regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1 |     2640000 | 2021-04-29 17:50:00.080663+05:30
> (1 row)
>
> The patch required one change:
> - rb->totalBytes += rb->size;
> + if (entry->txn->toptxn)
> + rb->totalBytes += entry->txn->toptxn->total_size;
> + else
> + rb->totalBytes += entry->txn->total_size;
>
> The above should be changed to:
> - rb->totalBytes += rb->size;
> + if (entry->txn->toptxn)
> + rb->totalBytes += entry->txn->toptxn->total_size;
> + else
> + rb->totalBytes += entry->txn->size;
>
> Attached patch fixes the issue.
> Thoughts?

After more thought, it seems to me that we should use txn->size here
regardless of whether it is the top transaction or a subtransaction,
since we're iterating over changes associated with a transaction that
can be either. Otherwise, if some subtransactions are not serialized, I
think we will end up adding bytes for those subtransactions while
iterating other, serialized subtransactions. In
ReorderBufferProcessTXN(), on the other hand, we should use
txn->total_size since txn is always the top transaction. I've attached
another patch to do this.
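
To make the distinction concrete, the two update sites would end up
looking roughly like this (a sketch based on the reasoning above, not
the exact hunks of the attached patch):

/* ReorderBufferIterTXNNext(): entry->txn may be a top transaction or a
 * subtransaction, so count only its own changes here. */
rb->totalBytes += entry->txn->size;

/* ReorderBufferProcessTXN(): txn is always the top transaction, so use
 * the accumulated size that includes its subtransactions. */
rb->totalBytes += txn->total_size;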

BTW, to check how many bytes of changes are passed to the decoder
plugin, I wrote and attached a simple decoder plugin that calculates
the total amount of bytes for each change on the plugin side. What we
expect is that the amounts of change bytes shown on both sides match.
You can build it in the same way as other third-party modules; you
need to create the decoder_stats extension.

The basic usage is to execute pg_logical_slot_get/peek_changes() and
mystats('slot_name') in the same process. While decoding the changes,
the decoder_stats plugin accumulates the change bytes in local memory,
and the mystats() SQL function, defined in the decoder_stats
extension, shows those stats.

I've done some tests with the v4 patch. For instance, with the
following workload the output is as expected:

BEGIN;
INSERT INTO t1 values(generate_series(1,10000));
SAVEPOINT s1;
INSERT INTO t1 values(generate_series(1,10000));
COMMIT;

The mystats() function shows:

=# select pg_logical_slot_get_changes('test_slot', null, null);
=# select change_type, change_bytes, total_bytes from mystats('test_slot');
 change_type | change_bytes | total_bytes
-------------+--------------+-------------
 INSERT      | 2578 kB      | 2578 kB
(1 row)

'change_bytes' and 'total_bytes' are the total amount of changes
calculated on the plugin side and the core side, respectively. Those
match, which is expected. On the other hand, with the following
workload they do not match:

BEGIN;
INSERT INTO t1 values(generate_series(1,10000));
SAVEPOINT s1;
INSERT INTO t1 values(generate_series(1,10000));
SAVEPOINT s2;
INSERT INTO t1 values(generate_series(1,10000));
COMMIT;

=# select pg_logical_slot_get_changes('test_slot', null, null);
=# select change_type, change_bytes, total_bytes from mystats('test_slot');
 change_type | change_bytes | total_bytes
-------------+--------------+-------------
 INSERT      | 3867 kB      | 5451 kB
(1 row)

This is fixed by the attached v5 patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 29, 2021 at 12:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > How about doing both of the above suggestions? Alternatively, we can
> > > wait for both 'drop' and 'create' message to be delivered but that
> > > might be overkill.
> >
> > Agreed. Attached the patch doing both things.
> >
>
> Thanks, the patch LGTM. I'll wait for a day before committing to see
> if anyone has better ideas.
>

Pushed.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Fri, Apr 30, 2021 at 5:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 9:44 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> >
> > On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb,
> > > > ReorderBufferIterTXNState *state)
> > > >   * Update the total bytes processed before releasing the current set
> > > >   * of changes and restoring the new set of changes.
> > > >   */
> > > > - rb->totalBytes += rb->size;
> > > > + rb->totalBytes += entry->txn->total_size;
> > > >   if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
> > > >   &state->entries[off].segno))
> > > >
> > > > I have not tested this but won't in the above change you need to check
> > > > txn->toptxn for subtxns?
> > > >
> > >
> > > Now, I am able to reproduce this issue:
> > > Create table t1(c1 int);
> > > select pg_create_logical_replication_slot('s', 'test_decoding');
> > > Begin;
> > > insert into t1 values(1);
> > > savepoint s1;
> > > insert into t1 select generate_series(1, 100000);
> > > commit;
> > >
> > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
> > >  count
> > > --------
> > >  100005
> > > (1 row)
> > >
> > > postgres=# select * from pg_stat_replication_slots;
> > >  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> > > stream_count | stream_bytes | total_txns | total_bytes |
> > > stats_reset
> > >
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> > >  s1        |          0 |           0 |           0 |           0 |
> > >         0 |            0 |          2 |    13200672 | 2021-04-29
> > > 14:33:55.156566+05:30
> > > (1 row)
> > >
> > > select * from pg_stat_reset_replication_slot('s1');
> > >
> > > Now reduce the logical decoding work mem to allow spilling.
> > > postgres=# set logical_decoding_work_mem='64kB';
> > > SET
> > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
> > >  count
> > > --------
> > >  100005
> > > (1 row)
> > >
> > > postgres=# select * from pg_stat_replication_slots;
> > >  slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
> > > stream_count | stream_bytes | total_txns | total_bytes |
> > > stats_reset
> > >
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> > >  s1        |          1 |         202 |    13200000 |           0 |
> > >         0 |            0 |          2 |         672 | 2021-04-29
> > > 14:35:21.836613+05:30
> > > (1 row)
> > >
> > > You can notice that after we have allowed spilling the 'total_bytes'
> > > stats is showing a different value. The attached patch fixes the issue
> > > for me. Let me know what do you think about this?
> >
> > I found one issue with the following scenario when testing with logical_decoding_work_mem as 64kB:
> >
> > BEGIN;
> > INSERT INTO t1 values(generate_series(1,10000));
> > SAVEPOINT s1;
> > INSERT INTO t1 values(generate_series(1,10000));
> > COMMIT;
> > SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL,
> >         NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> > select * from pg_stat_replication_slots;
> >     slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns
|total_bytes |           stats_reset
 
> >
------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> >  regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1
|    4262016 | 2021-04-29 17:50:00.080663+05:30
 
> > (1 row)
> >
> > Same thing works fine with logical_decoding_work_mem as 64MB:
> > select * from pg_stat_replication_slots;
> >    slot_name     | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns
|total_bytes |           stats_reset
 
> >
------------------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
> >  regression_slot1 |          6 |         154 |     9130176 |           0 |            0 |            0 |          1
|    2640000 | 2021-04-29 17:50:00.080663+05:30
 
> > (1 row)
> >
> > The patch required one change:
> > - rb->totalBytes += rb->size;
> > + if (entry->txn->toptxn)
> > + rb->totalBytes += entry->txn->toptxn->total_size;
> > + else
> > + rb->totalBytes += entry->txn->total_size;
> >
> > The above should be changed to:
> > - rb->totalBytes += rb->size;
> > + if (entry->txn->toptxn)
> > + rb->totalBytes += entry->txn->toptxn->total_size;
> > + else
> > + rb->totalBytes += entry->txn->size;
> >
> > Attached patch fixes the issue.
> > Thoughts?
>
> After more thought, it seems to me that we should use txn->size here
> regardless of the top transaction or subtransaction since we're
> iterating changes associated with a transaction that is either the top
> transaction or a subtransaction. Otherwise, I think if some
> subtransactions are not serialized, we will end up adding bytes
> including those subtransactions during iterating other serialized
> subtransactions. Whereas in ReorderBufferProcessTXN() we should use
> txn->total_txn since txn is always the top transaction. I've attached
> another patch to do this.
>
> BTW, to check how many bytes of changes are passed to the decoder
> plugin I wrote and attached a simple decoder plugin that calculates
> the total amount of bytes for each change on the decoding plugin side.
> I think what we expect is that the amounts of change bytes shown on
> both sides are matched. You can build it in the same way as other
> third-party modules and need to create decoder_stats extension.
>
> The basic usage is to execute pg_logical_slot_get/peek_changes() and
> mystats('slot_name') in the same process. During decoding the changes,
> decoder_stats plugin accumulates the change bytes in the local memory
> and mystats() SQL function, defined in decoder_stats extension, shows
> those stats.
>
> I've done some test with v4 patch. For instance, with the following
> workload the output is expected:
>
> BEGIN;
> INSERT INTO t1 values(generate_series(1,10000));
> SAVEPOINT s1;
> INSERT INTO t1 values(generate_series(1,10000));
> COMMIT;
>
> mystats() functions shows:
>
> =# select pg_logical_slot_get_changes('test_slot', null, null);
> =# select change_type, change_bytes, total_bytes from mystats('test_slot');
>  change_type | change_bytes | total_bytes
> -------------+--------------+-------------
>  INSERT      | 2578 kB      | 2578 kB
> (1 row)
>
> 'change_bytes' and 'total_bytes' are the total amount of changes
> calculated on the plugin side and core side, respectively. Those are
> matched, which is expected. On the other hand, with the following
> workload those are not matched:
>
> BEGIN;
> INSERT INTO t1 values(generate_series(1,10000));
> SAVEPOINT s1;
> INSERT INTO t1 values(generate_series(1,10000));
> SAVEPOINT s2;
> INSERT INTO t1 values(generate_series(1,10000));
> COMMIT;
>
> =# select pg_logical_slot_get_changes('test_slot', null, null);
> =# select change_type, change_bytes, total_bytes from mystats('test_slot');
>  change_type | change_bytes | total_bytes
> -------------+--------------+-------------
>  INSERT      | 3867 kB      | 5451 kB
> (1 row)
>
> This is fixed by the attached v5 patch.

The changes look good to me; I don't have any comments.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Apr 30, 2021 at 5:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> After more thought, it seems to me that we should use txn->size here
> regardless of the top transaction or subtransaction since we're
> iterating changes associated with a transaction that is either the top
> transaction or a subtransaction. Otherwise, I think if some
> subtransactions are not serialized, we will end up adding bytes
> including those subtransactions during iterating other serialized
> subtransactions. Whereas in ReorderBufferProcessTXN() we should use
> txn->total_txn since txn is always the top transaction. I've attached
> another patch to do this.
>

LGTM. I have slightly edited the comments in the attached. I'll push
this early next week unless there are more comments.

-- 
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > I am not sure if any of these alternatives are a good idea. What do
> > > you think? Do you have any other ideas for this?
> >
> > I've been considering some ideas but don't come up with a good one
> > yet. It’s just an idea and not tested but how about having
> > CreateDecodingContext() register before_shmem_exit() callback with the
> > decoding context to ensure that we send slot stats even on
> > interruption. And FreeDecodingContext() cancels the callback.
> >
>
> Is it a good idea to send stats while exiting and rely on the same? I
> think before_shmem_exit is mostly used for the cleanup purpose so not
> sure if we can rely on it for this purpose. I think we can't be sure
> that in all cases we will send all stats, so maybe Vignesh's patch is
> sufficient to cover the cases where we avoid losing it in cases where
> we would have sent a large amount of data.
>

Sawada-San, any thoughts on this point? Apart from this, I think you
suggested somewhere in this thread slightly updating the description
of stream_bytes. I would like to update the descriptions of
stream_bytes and total_bytes as below:

stream_bytes
Amount of transaction data decoded for streaming in-progress
transactions to the decoding output plugin while decoding changes from
WAL for this slot. This and other streaming counters for this slot can
be used to tune logical_decoding_work_mem.

total_bytes
Amount of transaction data decoded for sending transactions to the
decoding output plugin while decoding changes from WAL for this slot.
Note that this includes data that is streamed and/or spilled.

This update considers two points:
a. we don't send this data across the network, because the plugin
might decide to filter it, e.g. based on publications.
b. not all of the decoded changes are sent to the plugin; consider
REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc.
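
For reference, these counters are exposed through the
pg_stat_replication_slots view, so an example query like the following
can be used to watch them while tuning logical_decoding_work_mem:

select slot_name, total_txns, total_bytes, stream_bytes, spill_bytes
from pg_stat_replication_slots;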

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, Apr 30, 2021 at 1:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> LGTM. I have slightly edited the comments in the attached. I'll push
> this early next week unless there are more comments.
>

Pushed.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > > I am not sure if any of these alternatives are a good idea. What do
> > > > you think? Do you have any other ideas for this?
> > >
> > > I've been considering some ideas but don't come up with a good one
> > > yet. It’s just an idea and not tested but how about having
> > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > decoding context to ensure that we send slot stats even on
> > > interruption. And FreeDecodingContext() cancels the callback.
> > >
> >
> > Is it a good idea to send stats while exiting and rely on the same? I
> > think before_shmem_exit is mostly used for the cleanup purpose so not
> > sure if we can rely on it for this purpose. I think we can't be sure
> > that in all cases we will send all stats, so maybe Vignesh's patch is
> > sufficient to cover the cases where we avoid losing it in cases where
> > we would have sent a large amount of data.
> >
>
> Sawada-San, any thoughts on this point?

before_shmem_exit is mostly used for cleanup purposes, but how about
on_shmem_exit()? pgstat relies on that to send stats on interruption;
see pgstat_shutdown_hook().

That being said, I agree Vignesh's patch would cover most cases. If we
don't find any better solution, I think we can go with Vignesh's
patch.

> Apart from this, I think you
> have suggested somewhere in this thread to slightly update the
> description of stream_bytes. I would like to update the description of
> stream_bytes and total_bytes as below:
>
> stream_bytes
> Amount of transaction data decoded for streaming in-progress
> transactions to the decoding output plugin while decoding changes from
> WAL for this slot. This and other streaming counters for this slot can
> be used to tune logical_decoding_work_mem.
>
> total_bytes
> Amount of transaction data decoded for sending transactions to the
> decoding output plugin while decoding changes from WAL for this slot.
> Note that this includes data that is streamed and/or spilled.
>
> This update considers two points:
> a. we don't send this data across the network because plugin might
> decide to filter this data, ex. based on publications.
> b. not all of the decoded changes are sent to plugin, consider
> REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
> REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc.

Looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, May 3, 2021 at 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 30, 2021 at 1:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > LGTM. I have slightly edited the comments in the attached. I'll push
> > this early next week unless there are more comments.
> >
>
> Pushed.

Thank you!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > >
> > > > > I am not sure if any of these alternatives are a good idea. What do
> > > > > you think? Do you have any other ideas for this?
> > > >
> > > > I've been considering some ideas but don't come up with a good one
> > > > yet. It’s just an idea and not tested but how about having
> > > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > > decoding context to ensure that we send slot stats even on
> > > > interruption. And FreeDecodingContext() cancels the callback.
> > > >
> > >
> > > Is it a good idea to send stats while exiting and rely on the same? I
> > > think before_shmem_exit is mostly used for the cleanup purpose so not
> > > sure if we can rely on it for this purpose. I think we can't be sure
> > > that in all cases we will send all stats, so maybe Vignesh's patch is
> > > sufficient to cover the cases where we avoid losing it in cases where
> > > we would have sent a large amount of data.
> > >
> >
> > Sawada-San, any thoughts on this point?
>
> before_shmem_exit is mostly used to the cleanup purpose but how about
> on_shmem_exit()? pgstats relies on that to send stats at the
> interruption. See pgstat_shutdown_hook().
>

Yeah, that is worth trying. Would you like to give it a try? I think
it still might not cover the cases where we error out in the backend
while decoding via the APIs, because at that time we won't exit; maybe
for those cases we can consider Vignesh's patch.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, May 3, 2021 at 10:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > >
> > > > > > I am not sure if any of these alternatives are a good idea. What do
> > > > > > you think? Do you have any other ideas for this?
> > > > >
> > > > > I've been considering some ideas but don't come up with a good one
> > > > > yet. It’s just an idea and not tested but how about having
> > > > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > > > decoding context to ensure that we send slot stats even on
> > > > > interruption. And FreeDecodingContext() cancels the callback.
> > > > >
> > > >
> > > > Is it a good idea to send stats while exiting and rely on the same? I
> > > > think before_shmem_exit is mostly used for the cleanup purpose so not
> > > > sure if we can rely on it for this purpose. I think we can't be sure
> > > > that in all cases we will send all stats, so maybe Vignesh's patch is
> > > > sufficient to cover the cases where we avoid losing it in cases where
> > > > we would have sent a large amount of data.
> > > >
> > >
> > > Sawada-San, any thoughts on this point?
> >
> > before_shmem_exit is mostly used to the cleanup purpose but how about
> > on_shmem_exit()? pgstats relies on that to send stats at the
> > interruption. See pgstat_shutdown_hook().
> >
>
> Yeah, that is worth trying. Would you like to give it a try?

Yes.

In this approach, I think we will need to have a static pointer in
logical.c pointing to the LogicalDecodingContext we're using. At
StartupDecodingContext(), we set the pointer to the just-created
LogicalDecodingContext and register the callback so that we can refer
to the LogicalDecodingContext in that callback. And at
FreeDecodingContext(), we reset the pointer to NULL (however, since
FreeDecodingContext() is not called when an error happens, we would
need to ensure resetting it somehow). But, after more thought, if we
have the static pointer in logical.c, it would be better to have a
global function that sends slot stats based on the
LogicalDecodingContext pointed to by the static pointer and that can
be called by ReplicationSlotRelease(). That way, we don't need to
worry about erroring-out cases as well as interruption cases, although
we need to have a new static pointer.

I've attached a quick-hacked patch. I also incorporated a change that
calls UpdateDecodingStats() in FreeDecodingContext() so that we can
send slot stats also in the case where we spilled/streamed changes but
finished without a commit/abort/prepare record.
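
In code, the shape I have in mind is roughly the following (only a
sketch; the function name is a placeholder and not necessarily what
the attached patch uses):

/* in logical.c */
static LogicalDecodingContext *MyLogicalDecodingContext = NULL;

/* set in StartupDecodingContext(), reset to NULL in FreeDecodingContext() */

/* global function that ReplicationSlotRelease() could call */
void
LogicalDecodingSendSlotStats(void)
{
	if (MyLogicalDecodingContext != NULL)
		UpdateDecodingStats(MyLogicalDecodingContext);
}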

>  I think
> it still might not cover the cases where we error out in the backend
> while decoding via APIs because at that time we won't exit, maybe for
> that we can consider Vignesh's patch.

Agreed. It seems to me that the approach of the attached patch is
better than the approach using on_shmem_exit(). So if we want to avoid
having the new static pointer and function for this purpose we can
consider Vignesh’s patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
vignesh C
Date:
On Tue, May 4, 2021 at 9:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 10:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > >
> > > > > > > I am not sure if any of these alternatives are a good idea. What do
> > > > > > > you think? Do you have any other ideas for this?
> > > > > >
> > > > > > I've been considering some ideas but don't come up with a good one
> > > > > > yet. It’s just an idea and not tested but how about having
> > > > > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > > > > decoding context to ensure that we send slot stats even on
> > > > > > interruption. And FreeDecodingContext() cancels the callback.
> > > > > >
> > > > >
> > > > > Is it a good idea to send stats while exiting and rely on the same? I
> > > > > think before_shmem_exit is mostly used for the cleanup purpose so not
> > > > > sure if we can rely on it for this purpose. I think we can't be sure
> > > > > that in all cases we will send all stats, so maybe Vignesh's patch is
> > > > > sufficient to cover the cases where we avoid losing it in cases where
> > > > > we would have sent a large amount of data.
> > > > >
> > > >
> > > > Sawada-San, any thoughts on this point?
> > >
> > > before_shmem_exit is mostly used to the cleanup purpose but how about
> > > on_shmem_exit()? pgstats relies on that to send stats at the
> > > interruption. See pgstat_shutdown_hook().
> > >
> >
> > Yeah, that is worth trying. Would you like to give it a try?
>
> Yes.
>
> In this approach, I think we will need to have a static pointer in
> logical.c pointing to LogicalDecodingContext that we’re using. At
> StartupDecodingContext(), we set the pointer to the just created
> LogicalDecodingContext and register the callback so that we can refer
> to the LogicalDecodingContext on that callback. And at
> FreeDecodingContext(), we reset the pointer to NULL (however, since
> FreeDecodingContext() is not called when an error happens we would
> need to ensure resetting it somehow). But, after more thought, if we
> have the static pointer in logical.c it would rather be better to have
> a global function that sends slot stats based on the
> LogicalDecodingContext pointed by the static pointer and can be called
> by ReplicationSlotRelease(). That way, we don’t need to worry about
> erroring out cases as well as interruption cases, although we need to
> have a new static pointer.
>
> I've attached a quick-hacked patch. I also incorporated the change
> that calls UpdateDecodingStats() at FreeDecodingContext() so that we
> can send slot stats also in the case where we spilled/streamed changes
> but finished without commit/abort/prepare record.
>
> >  I think
> > it still might not cover the cases where we error out in the backend
> > while decoding via APIs because at that time we won't exit, maybe for
> > that we can consider Vignesh's patch.
>
> Agreed. It seems to me that the approach of the attached patch is
> better than the approach using on_shmem_exit(). So if we want to avoid
> having the new static pointer and function for this purpose we can
> consider Vignesh’s patch.
>

I'm OK with using either my patch or Sawada-san's patch. I had the
same doubt about whether we should have a static variable, as pointed
out by Sawada-san. Apart from that, I had one minor comment: this
comment needs the typo "andu sed to sent" corrected:
+/*
+ * Pointing to the currently-used logical decoding context andu sed to sent
+ * slot statistics on releasing slots.
+ */
+static LogicalDecodingContext *MyLogicalDecodingContext = NULL;
+

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Tue, May 4, 2021 at 2:34 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, May 4, 2021 at 9:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 10:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > >
> > > > > > > > I am not sure if any of these alternatives are a good idea. What do
> > > > > > > > you think? Do you have any other ideas for this?
> > > > > > >
> > > > > > > I've been considering some ideas but don't come up with a good one
> > > > > > > yet. It’s just an idea and not tested but how about having
> > > > > > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > > > > > decoding context to ensure that we send slot stats even on
> > > > > > > interruption. And FreeDecodingContext() cancels the callback.
> > > > > > >
> > > > > >
> > > > > > Is it a good idea to send stats while exiting and rely on the same? I
> > > > > > think before_shmem_exit is mostly used for the cleanup purpose so not
> > > > > > sure if we can rely on it for this purpose. I think we can't be sure
> > > > > > that in all cases we will send all stats, so maybe Vignesh's patch is
> > > > > > sufficient to cover the cases where we avoid losing it in cases where
> > > > > > we would have sent a large amount of data.
> > > > > >
> > > > >
> > > > > Sawada-San, any thoughts on this point?
> > > >
> > > > before_shmem_exit is mostly used to the cleanup purpose but how about
> > > > on_shmem_exit()? pgstats relies on that to send stats at the
> > > > interruption. See pgstat_shutdown_hook().
> > > >
> > >
> > > Yeah, that is worth trying. Would you like to give it a try?
> >
> > Yes.
> >
> > In this approach, I think we will need to have a static pointer in
> > logical.c pointing to LogicalDecodingContext that we’re using. At
> > StartupDecodingContext(), we set the pointer to the just created
> > LogicalDecodingContext and register the callback so that we can refer
> > to the LogicalDecodingContext on that callback. And at
> > FreeDecodingContext(), we reset the pointer to NULL (however, since
> > FreeDecodingContext() is not called when an error happens we would
> > need to ensure resetting it somehow). But, after more thought, if we
> > have the static pointer in logical.c it would rather be better to have
> > a global function that sends slot stats based on the
> > LogicalDecodingContext pointed by the static pointer and can be called
> > by ReplicationSlotRelease(). That way, we don’t need to worry about
> > erroring out cases as well as interruption cases, although we need to
> > have a new static pointer.
> >
> > I've attached a quick-hacked patch. I also incorporated the change
> > that calls UpdateDecodingStats() at FreeDecodingContext() so that we
> > can send slot stats also in the case where we spilled/streamed changes
> > but finished without commit/abort/prepare record.
> >
> > >  I think
> > > it still might not cover the cases where we error out in the backend
> > > while decoding via APIs because at that time we won't exit, maybe for
> > > that we can consider Vignesh's patch.
> >
> > Agreed. It seems to me that the approach of the attached patch is
> > better than the approach using on_shmem_exit(). So if we want to avoid
> > having the new static pointer and function for this purpose we can
> > consider Vignesh’s patch.
> >
>
> I'm ok with using either my patch or Sawada san's patch, Even I had
> the same thought of whether we should have a static variable thought
> pointed out by Sawada san. Apart from that I had one minor comment:
> This comment needs to be corrected "andu sed to sent"
> +/*
> + * Pointing to the currently-used logical decoding context andu sed to sent
> + * slot statistics on releasing slots.
> + */
> +static LogicalDecodingContext *MyLogicalDecodingContext = NULL;
> +

Right, that needs to be fixed.

After more thought, I'm concerned that my patch's approach might be
too invasive for PG14. Given that Vignesh's patch would cover most
cases, I think we can live with the small downside of possibly missing
some slot stats. If we want to ensure that slot stats are sent when
releasing a slot, we can develop that as an improvement. It would be
better for my patch, including its design, to be reviewed by more
people during PG15 development. Thoughts?

Regards,

---
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Mon, May 3, 2021 at 9:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Apart from this, I think you
> > have suggested somewhere in this thread to slightly update the
> > description of stream_bytes. I would like to update the description of
> > stream_bytes and total_bytes as below:
> >
> > stream_bytes
> > Amount of transaction data decoded for streaming in-progress
> > transactions to the decoding output plugin while decoding changes from
> > WAL for this slot. This and other streaming counters for this slot can
> > be used to tune logical_decoding_work_mem.
> >
> > total_bytes
> > Amount of transaction data decoded for sending transactions to the
> > decoding output plugin while decoding changes from WAL for this slot.
> > Note that this includes data that is streamed and/or spilled.
> >
> > This update considers two points:
> > a. we don't send this data across the network because plugin might
> > decide to filter this data, ex. based on publications.
> > b. not all of the decoded changes are sent to plugin, consider
> > REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
> > REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc.
>
> Looks good to me.

Attached the doc update patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> After more thought, I'm concerned that my patch's approach might be
> invasive for PG14. Given that Vignesh’s patch would cover most cases,
>

I am not sure if your patch is too invasive, but OTOH I am also
convinced that Vignesh's patch covers most cases and is much simpler,
so we can go ahead with that. In the attached, I have combined
Vignesh's patch and your doc fix patch. Additionally, I have changed
some comments and some other cosmetic stuff. Let me know what you
think of the attached.

--
With Regards,
Amit Kapila.

Attachment

Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, May 6, 2021 at 1:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > After more thought, I'm concerned that my patch's approach might be
> > invasive for PG14. Given that Vignesh’s patch would cover most cases,
> >
>
> I am not sure if your patch is too invasive but OTOH I am also
> convinced that Vignesh's patch covers most cases and is much simpler
> so we can go ahead with that.

I think that my patch also affects other code, including logical
decoding and the decoding context. We would need to write code while
worrying about MyLogicalDecodingContext.

>  In the attached, I have combined
> Vignesh's patch and your doc fix patch. Additionally, I have changed
> some comments and some other cosmetic stuff. Let me know what you
> think of the attached?

Thank you for updating the patch. The patch looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, May 6, 2021 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > After more thought, I'm concerned that my patch's approach might be
> > invasive for PG14. Given that Vignesh’s patch would cover most cases,
> >
>
> I am not sure if your patch is too invasive but OTOH I am also
> convinced that Vignesh's patch covers most cases and is much simpler
> so we can go ahead with that. In the attached, I have combined
> Vignesh's patch and your doc fix patch. Additionally, I have changed
> some comments and some other cosmetic stuff. Let me know what you
> think of the attached?

The updated patch looks good to me.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 1:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> >  In the attached, I have combined
> > Vignesh's patch and your doc fix patch. Additionally, I have changed
> > some comments and some other cosmetic stuff. Let me know what you
> > think of the attached?
>
> Thank you for updating the patch. The patch looks good to me.
>

Pushed!

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, May 6, 2021 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, May 6, 2021 at 1:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > >  In the attached, I have combined
> > > Vignesh's patch and your doc fix patch. Additionally, I have changed
> > > some comments and some other cosmetic stuff. Let me know what you
> > > think of the attached?
> >
> > Thank you for updating the patch. The patch looks good to me.
> >
>
> Pushed!

Thanks!

All issues pointed out in this thread are resolved, so can we remove
this item from the open items?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> All issues pointed out in this thread are resolved and we can remove
> this item from the open items?
>

I think so. Do you think we should reply to Andres's original email
stating the commits that fixed the individual review comments to avoid
any confusion later?

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, May 6, 2021 at 1:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > >  In the attached, I have combined
> > > > Vignesh's patch and your doc fix patch. Additionally, I have changed
> > > > some comments and some other cosmetic stuff. Let me know what you
> > > > think of the attached?
> > >
> > > Thank you for updating the patch. The patch looks good to me.
> > >
> >
> > Pushed!

Thanks for committing.

>
> All issues pointed out in this thread are resolved and we can remove
> this item from the open items?

I felt all the comments listed have been addressed.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Thu, May 6, 2021 at 5:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > All issues pointed out in this thread are resolved and we can remove
> > this item from the open items?
> >
>
> I think so. Do you think we should reply to Andres's original email
> stating the commits that fixed the individual review comments to avoid
> any confusion later?

Good idea. That's also helpful for confirming that all comments have
been addressed. Would you like to gather those commits, or shall I?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, May 6, 2021 at 2:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 5:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > All issues pointed out in this thread are resolved and we can remove
> > > this item from the open items?
> > >
> >
> > I think so. Do you think we should reply to Andres's original email
> > stating the commits that fixed the individual review comments to avoid
> > any confusion later?
>
> Good idea. That's also helpful for confirming that all comments are
> addressed. Would you like to gather those commits? or shall I?
>

I am fine either way. I will do it tomorrow unless you have responded
before that.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> I started to write this as a reply to
> https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
> but I think it doesn't really fit under that header anymore.
>
> On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> > It does make it easier for the shared memory stats patch, because if
> > there's a fixed number + location, the relevant stats reporting doesn't
> > need to go through a hashtable with the associated locking.  I guess
> > that may have colored my perception that it's better to just have a
> > statically sized memory allocation for this.  Noteworthy that SLRU stats
> > are done in a fixed size allocation as well...

Through a long discussion, all of the review comments pointed out here
have been addressed. To avoid any confusion later, I have summarized
below which commit fixed each individual review comment.

>
> As part of reviewing the replication slot stats patch I looked at
> replication slot stats a fair bit, and I've a few misgivings. First,
> about the pgstat.c side of things:
>
> - If somehow slot stat drop messages got lost (remember pgstat
>   communication is lossy!), we'll just stop maintaining stats for slots
>   created later, because there'll eventually be no space for keeping
>   stats for another slot.
>
> - If max_replication_slots was lowered between a restart,
>   pgstat_read_statfile() will happily write beyond the end of
>   replSlotStats.
>
> - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
>   think pgstat.c has absolutely no business doing things on that level.
>
> - We do a linear search through all replication slots whenever receiving
>   stats for a slot. Even though there'd be a perfectly good index to
>   just use all throughout - the slots index itself. It looks to me like
>   slots stat reports can be fairly frequent in some workloads, so that
>   doesn't seem great.

Fixed by 3fa17d37716.

>
> - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?
>
> - pgstat_report_replslot() already has a lot of stats parameters, it
>   seems likely that we'll get more. Seems like we should just use a
>   struct of stats updates.

Fixed by cca57c1d9bf.

>
> And then more generally about the feature:
> - If a slot was used to stream out a large amount of changes (say an
>   initial data load), but then replication is interrupted before the
>   transaction is committed/aborted, stream_bytes will not reflect the
>   many gigabytes of data we may have sent.

Fixed by 592f00f8d.

> - I seems weird that we went to the trouble of inventing replication
>   slot stats, but then limit them to logical slots, and even there don't
>   record the obvious things like the total amount of data sent.

Fixed by f5fc2f5b23d.

>
> I think the best way to address the more fundamental "pgstat related"
> complaints is to change how replication slot stats are
> "addressed". Instead of using the slots name, report stats using the
> index in ReplicationSlotCtl->replication_slots.
>
> That removes the risk of running out of "replication slot stat slots":
> If we loose a drop message, the index eventually will be reused and we
> likely can detect that the stats were for a different slot by comparing
> the slot name.
>
> It also makes it easy to handle the issue of max_replication_slots being
> lowered and there still being stats for a slot - we simply can skip
> restoring that slots data, because we know the relevant slot can't exist
> anymore. And we can make the initial pgstat_report_replslot() during
> slot creation use a
>
> I'm wondering if we should just remove the slot name entirely from the
> pgstat.c side of things, and have pg_stat_get_replication_slots()
> inquire about slots by index as well and get the list of slots to report
> stats for from slot.c infrastructure.

We fixed the problem of "running out of replication slot stat slots"
by using an HTAB to store slot stats, see 3fa17d37716. Slot stats can
still be orphaned if a slot drop message gets lost, but such orphaned
entries are regularly checked for and removed by pgstat_vacuum_stat().
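
To make the user-visible side of this concrete, here is a small psql
sketch (the slot name is made up for illustration; it assumes
wal_level = logical and that contrib/test_decoding is installed):

    -- Creating a logical slot makes a per-slot row appear in the stats
    -- view, with all counters starting at zero.
    SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
    SELECT slot_name, spill_txns, spill_bytes, total_txns, total_bytes
      FROM pg_stat_replication_slots
     WHERE slot_name = 'demo_slot';

    -- Dropping the slot normally sends a drop message that removes the
    -- row; if that message were lost, the orphaned entry would later be
    -- removed by pgstat_vacuum_stat() during (auto)vacuum.
    SELECT pg_drop_replication_slot('demo_slot');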

For the record, there are two known issues that are unlikely to happen
in practice or don't affect users much: (1) if both the drop message
for a slot and the create message for a new slot of the same name get
lost, and the create happens before (auto)vacuum cleans up the dead
entry, the new slot's stats will be accumulated into the old entry, and
(2) we could miss total_txns and total_bytes updates if logical
decoding is interrupted.

For (1), there is an idea of assigning an OID to each slot to avoid the
accumulation of stats, but that doesn't seem worth doing since in
practice this won't happen frequently. For (2), there were some ideas
of reporting slot stats when releasing the slot (by keeping the stats
in ReplicationSlot or by using a callback), but we decided to go with
reporting slot stats after every stream/spill, because that covers most
cases in practice and is much simpler than the other approaches.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, May 7, 2021 at 6:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > I started to write this as a reply to
> > https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
> > but I think it doesn't really fit under that header anymore.
> >
> > On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> > > It does make it easier for the shared memory stats patch, because if
> > > there's a fixed number + location, the relevant stats reporting doesn't
> > > need to go through a hashtable with the associated locking.  I guess
> > > that may have colored my perception that it's better to just have a
> > > statically sized memory allocation for this.  Noteworthy that SLRU stats
> > > are done in a fixed size allocation as well...
>
> Through a long discussion, all review comments pointed out here have
> been addressed. I summarized that individual review comments are fixed
> by which commit to avoid any confusion later.
>
> >
> > As part of reviewing the replication slot stats patch I looked at
> > replication slot stats a fair bit, and I've a few misgivings. First,
> > about the pgstat.c side of things:
> >
> > - If somehow slot stat drop messages got lost (remember pgstat
> >   communication is lossy!), we'll just stop maintaining stats for slots
> >   created later, because there'll eventually be no space for keeping
> >   stats for another slot.
> >
> > - If max_replication_slots was lowered between a restart,
> >   pgstat_read_statfile() will happily write beyond the end of
> >   replSlotStats.
> >
> > - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
> >   think pgstat.c has absolutely no business doing things on that level.
> >
> > - We do a linear search through all replication slots whenever receiving
> >   stats for a slot. Even though there'd be a perfectly good index to
> >   just use all throughout - the slots index itself. It looks to me like
> >   slots stat reports can be fairly frequent in some workloads, so that
> >   doesn't seem great.
>
> Fixed by 3fa17d37716.
>
> >
> > - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?
> >
> > - pgstat_report_replslot() already has a lot of stats parameters, it
> >   seems likely that we'll get more. Seems like we should just use a
> >   struct of stats updates.
>
> Fixed by cca57c1d9bf.
>
> >
> > And then more generally about the feature:
> > - If a slot was used to stream out a large amount of changes (say an
> >   initial data load), but then replication is interrupted before the
> >   transaction is committed/aborted, stream_bytes will not reflect the
> >   many gigabytes of data we may have sent.
>
> Fixed by 592f00f8d.
>
> > - I seems weird that we went to the trouble of inventing replication
> >   slot stats, but then limit them to logical slots, and even there don't
> >   record the obvious things like the total amount of data sent.
>
> Fixed by f5fc2f5b23d.
>
> >
> > I think the best way to address the more fundamental "pgstat related"
> > complaints is to change how replication slot stats are
> > "addressed". Instead of using the slots name, report stats using the
> > index in ReplicationSlotCtl->replication_slots.
> >
> > That removes the risk of running out of "replication slot stat slots":
> > If we loose a drop message, the index eventually will be reused and we
> > likely can detect that the stats were for a different slot by comparing
> > the slot name.
> >
> > It also makes it easy to handle the issue of max_replication_slots being
> > lowered and there still being stats for a slot - we simply can skip
> > restoring that slots data, because we know the relevant slot can't exist
> > anymore. And we can make the initial pgstat_report_replslot() during
> > slot creation use a
> >
> > I'm wondering if we should just remove the slot name entirely from the
> > pgstat.c side of things, and have pg_stat_get_replication_slots()
> > inquire about slots by index as well and get the list of slots to report
> > stats for from slot.c infrastructure.
>
> We fixed the problem of "running out of replication slot stat slots"
> by using HTAB to store slot stats, see 3fa17d37716. The slot stats
> could be orphaned if a slot drop message gets lost. But we constantly
> check and remove them in pgstat_vacuum_stat().
>

Thanks for the summarization. I don't find anything that is left
unaddressed. I think we can wait for a day or two to see whether Andres
or anyone else spots something that still needs attention, and then we
can close the open item.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Fri, May 7, 2021 at 8:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Thanks for the summarization. I don't find anything that is left
> unaddressed. I think we can wait for a day or two to see if Andres or
> anyone else sees anything that is left unaddressed and then we can
> close the open item.
>

I have closed this open item.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> I have closed this open item.

That seems a little premature, considering that the
contrib/test_decoding/sql/stats.sql test case is still failing regularly.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2021-05-11%2019%3A14%3A53

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-05-07%2010%3A20%3A21

            regards, tom lane



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, May 12, 2021 at 6:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > I have closed this open item.
>
> That seems a little premature, considering that the
> contrib/test_decoding/sql/stats.sql test case is still failing regularly.

Thank you for reporting.

Ugh, since commit 592f00f8de we send slot stats after every
spill/stream, so it’s possible that we report slot stats that have
non-zero counters for spill_bytes/txns and zeroes for total_bytes/txns.
It seems to me it’s legitimate that the slot stats view shows non-zero
values for spill_bytes/txns and zero values for total_bytes/txns while
decoding a large transaction. So I think we can fix the test script so
that it checks only spill_bytes/txns when checking spilled
transactions.

For the record, during streaming transactions, IIUC this kind of thing
doesn’t happen since we update both total_bytes/txns and
stream_bytes/txns before reporting slot stats.
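
To illustrate the direction of the fix (this is only a sketch, not the
exact change in the attached patch; 'regression_slot' stands in for the
slot the test uses), checking just the spill counters stays stable even
while a large transaction is still being decoded:

    -- total_txns/total_bytes remain zero until decoding of the
    -- transaction finishes, so assert only on the spill counters.
    SELECT slot_name,
           spill_txns  > 0 AS spilled_txns,
           spill_bytes > 0 AS spilled_bytes
      FROM pg_stat_replication_slots
     WHERE slot_name = 'regression_slot';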

I've attached a patch to fix it.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 6:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > I have closed this open item.
> >
> > That seems a little premature, considering that the
> > contrib/test_decoding/sql/stats.sql test case is still failing regularly.
>
> Thank you for reporting.
>
> Ugh, since by commit 592f00f8de we send slot stats every after
> spil/stream it’s possible that we report slot stats that have non-zero
> counters for spill_bytes/txns and zeroes for total_bytes/txns. It
> seems to me it’s legitimate that the slot stats view shows non-zero
> values for spill_bytes/txns and zero values for total_bytes/txns
> during decoding a large transaction. So I think we can fix the test
> script so that it checks only spill_bytes/txns when checking spilled
> transactions.
>

Your analysis and fix look correct to me. I'll test and push your
patch if I don't see any problem with it.

> For the record, during streaming transactions, IIUC this kind of thing
> doesn’t happen since we update both total_bytes/txns and
> stream_bytes/txns before reporting slot stats.
>

Right, because during streaming, we send the data to the decoding plugin.

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, May 12, 2021 at 7:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Ugh, since by commit 592f00f8de we send slot stats every after
> > spil/stream it’s possible that we report slot stats that have non-zero
> > counters for spill_bytes/txns and zeroes for total_bytes/txns. It
> > seems to me it’s legitimate that the slot stats view shows non-zero
> > values for spill_bytes/txns and zero values for total_bytes/txns
> > during decoding a large transaction. So I think we can fix the test
> > script so that it checks only spill_bytes/txns when checking spilled
> > transactions.
> >
>
> Your analysis and fix look correct to me.
>

I think the part of the test that checks the stats after resetting
them might also give unstable results. This can happen because in the
previous test we spill multiple times (spill_count is 12 in my
testing), and it is possible that some of the spill stats messages are
received by the stats collector after the reset message. If this
theory is correct, then it is better that we remove the test for reset
stats and the test after it, "decode and check stats again". This is
not directly related to your patch or the buildfarm failure, but I
guess this can happen and we might see such a failure in the future.
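
Roughly, the problematic sequence is the following (just a sketch;
'regression_slot' stands in for the slot the test uses):

    -- The reset message races with the in-flight spill-stats messages:
    -- some of the ~12 spill reports queued by the decoding backend can
    -- reach the stats collector only after the reset has been applied,
    -- so the counters read here are not guaranteed to have any
    -- particular value.
    SELECT pg_stat_reset_replication_slot('regression_slot');
    SELECT spill_txns, spill_count, spill_bytes
      FROM pg_stat_replication_slots
     WHERE slot_name = 'regression_slot';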

--
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, May 12, 2021 at 9:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 7:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Ugh, since by commit 592f00f8de we send slot stats every after
> > > spil/stream it’s possible that we report slot stats that have non-zero
> > > counters for spill_bytes/txns and zeroes for total_bytes/txns. It
> > > seems to me it’s legitimate that the slot stats view shows non-zero
> > > values for spill_bytes/txns and zero values for total_bytes/txns
> > > during decoding a large transaction. So I think we can fix the test
> > > script so that it checks only spill_bytes/txns when checking spilled
> > > transactions.
> > >
> >
> > Your analysis and fix look correct to me.
> >
>
> I think the part of the test that tests the stats after resetting it
> might give different results. This can happen because in the previous
> test we spill multiple times (spill_count is 12 in my testing) and it
> is possible that some of the spill stats messages is received by stats
> collector after the reset message. If this theory is correct then it
> better that we remove the test for reset stats and the test after it
> "decode and check stats again.". This is not directly related to your
> patch or buildfarm failure but I guess this can happen and we might
> see such a failure in future.

I agree with your analysis to remove that test. Attached patch has the
changes for the same.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Tom Lane
Date:
vignesh C <vignesh21@gmail.com> writes:
> I agree with your analysis to remove that test. Attached patch has the
> changes for the same.

Is there any value in converting the test case into a TAP test that
could be more flexible about the expected output?  I'm mainly wondering
whether there are any code paths that this test forces the server through,
which would otherwise lack coverage.

            regards, tom lane



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Wed, May 12, 2021 at 9:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> vignesh C <vignesh21@gmail.com> writes:
> > I agree with your analysis to remove that test. Attached patch has the
> > changes for the same.
>
> Is there any value in converting the test case into a TAP test that
> could be more flexible about the expected output?  I'm mainly wondering
> whether there are any code paths that this test forces the server through,
> which would otherwise lack coverage.

Removing this test does not reduce code coverage. The test basically
just decodes and checks the stats again, so it is somewhat repetitive.
The problem with this test is that when a transaction is spilled, the
statistics for the spilled transaction are sent to the statistics
collector each time the transaction is spilled; this test sends spill
stats around 12 times. The test expects to reset the stats and then
check that the stats get updated when we fetch the changes. We cannot
validate the reset slot stats results here, because on some machines
the stats collector may receive the spilled transaction stats after
the reset message. The same problem would exist with a TAP test too,
so I feel it is better to remove this test.
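
For reference, the shape of the workload that forces the spills is
roughly the following (the table, row count, and slot name are
illustrative, not the exact test script):

    -- With a small logical_decoding_work_mem, a single large transaction
    -- is serialized to disk several times, and each spill sends a
    -- slot-stats message to the collector.
    SET logical_decoding_work_mem = '64kB';
    SELECT pg_create_logical_replication_slot('regression_slot', 'test_decoding');
    CREATE TABLE stats_test (data text);
    INSERT INTO stats_test
        SELECT 'spill-me-' || g.i FROM generate_series(1, 5000) g(i);
    -- Decoding the transaction triggers the spills and the stats reports.
    SELECT count(*) > 0
      FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);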

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Tom Lane
Date:
vignesh C <vignesh21@gmail.com> writes:
> On Wed, May 12, 2021 at 9:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Is there any value in converting the test case into a TAP test that
>> could be more flexible about the expected output?  I'm mainly wondering
>> whether there are any code paths that this test forces the server through,
>> which would otherwise lack coverage.

> Removing this test does not reduce code coverage. This test is
> basically to decode and check the stats again, it is kind of a
> repetitive test. The problem with this test here is that when a
> transaction is spilled, the statistics for the spill transaction will
> be sent to the statistics collector as and when the transaction is
> spilled. This test sends spill stats around 12 times. The test expects
> to reset the stats and check the stats gets updated when we get the
> changes. We cannot validate reset slot stats results here, as it could
> be possible that in some machines the stats collector receives the
> spilled transaction stats after getting reset slots. This same problem
> will exist with tap tests too. So I feel it is better to remove this
> test.

OK, I'm satisfied as long as we've considered the code-coverage angle.

            regards, tom lane



Re: Replication slot stats misgivings

From
Masahiko Sawada
Date:
On Wed, May 12, 2021 at 1:19 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 9:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, May 12, 2021 at 7:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Ugh, since by commit 592f00f8de we send slot stats every after
> > > > spil/stream it’s possible that we report slot stats that have non-zero
> > > > counters for spill_bytes/txns and zeroes for total_bytes/txns. It
> > > > seems to me it’s legitimate that the slot stats view shows non-zero
> > > > values for spill_bytes/txns and zero values for total_bytes/txns
> > > > during decoding a large transaction. So I think we can fix the test
> > > > script so that it checks only spill_bytes/txns when checking spilled
> > > > transactions.
> > > >
> > >
> > > Your analysis and fix look correct to me.
> > >
> >
> > I think the part of the test that tests the stats after resetting it
> > might give different results. This can happen because in the previous
> > test we spill multiple times (spill_count is 12 in my testing) and it
> > is possible that some of the spill stats messages is received by stats
> > collector after the reset message. If this theory is correct then it
> > better that we remove the test for reset stats and the test after it
> > "decode and check stats again.". This is not directly related to your
> > patch or buildfarm failure but I guess this can happen and we might
> > see such a failure in future.

Good point. I agree to remove this test.

>
> I agree with your analysis to remove that test. Attached patch has the
> changes for the same.

Thank you for the patch. The patch looks good to me. I also agree that
removing the test doesn't reduce the test coverage.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Wed, May 12, 2021 at 4:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 1:19 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > I think the part of the test that tests the stats after resetting it
> > > might give different results. This can happen because in the previous
> > > test we spill multiple times (spill_count is 12 in my testing) and it
> > > is possible that some of the spill stats messages is received by stats
> > > collector after the reset message. If this theory is correct then it
> > > better that we remove the test for reset stats and the test after it
> > > "decode and check stats again.". This is not directly related to your
> > > patch or buildfarm failure but I guess this can happen and we might
> > > see such a failure in future.
>
> Good point. I agree to remove this test.
>
> >
> > I agree with your analysis to remove that test. Attached patch has the
> > changes for the same.
>
> Thank you for the patch. The patch looks good to me. I also agree that
> removing the test doesn't reduce the test coverage.
>

Thanks, I have pushed the patch.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Thu, May 13, 2021 at 11:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 4:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, May 12, 2021 at 1:19 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > > I think the part of the test that tests the stats after resetting it
> > > > might give different results. This can happen because in the previous
> > > > test we spill multiple times (spill_count is 12 in my testing) and it
> > > > is possible that some of the spill stats messages is received by stats
> > > > collector after the reset message. If this theory is correct then it
> > > > better that we remove the test for reset stats and the test after it
> > > > "decode and check stats again.". This is not directly related to your
> > > > patch or buildfarm failure but I guess this can happen and we might
> > > > see such a failure in future.
> >
> > Good point. I agree to remove this test.
> >
> > >
> > > I agree with your analysis to remove that test. Attached patch has the
> > > changes for the same.
> >
> > Thank you for the patch. The patch looks good to me. I also agree that
> > removing the test doesn't reduce the test coverage.
> >
>
> Thanks, I have pushed the patch.
>

Thanks for pushing the patch.

Regards,
Vignesh



Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Thu, May 13, 2021 at 11:30 AM vignesh C <vignesh21@gmail.com> wrote:
>

Do we want to update the information about pg_stat_replication_slots
at the following place in the docs:
https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?

If so, feel free to submit a patch for it.

-- 
With Regards,
Amit Kapila.



Re: Replication slot stats misgivings

From
vignesh C
Date:
On Mon, May 24, 2021 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 13, 2021 at 11:30 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Do we want to update the information about pg_stat_replication_slots
> at the following place in docs
> https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?
>
> If so, feel free to submit the patch for it?

Adding it will be useful; the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Replication slot stats misgivings

From
Amit Kapila
Date:
On Mon, May 24, 2021 at 10:09 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, May 24, 2021 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 13, 2021 at 11:30 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Do we want to update the information about pg_stat_replication_slots
> > at the following place in docs
> > https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?
> >
> > If so, feel free to submit the patch for it?
>
> Adding it will be useful, the attached patch has the changes for the same.
>

Thanks for the patch, pushed.

-- 
With Regards,
Amit Kapila.