On Thu, Apr 8, 2021 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 7, 2021 at 2:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> @@ -4069,6 +4069,24 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
> * slot follows.
> */
> case 'R':
> + /*
> + * There is a remote scenario where one of the replication slots
> + * is dropped and the drop slot statistics message is not
> + * received by the statistic collector process, now if the
> + * max_replication_slots is reduced to the actual number of
> + * replication slots that are in use and the server is
> + * re-started then the statistics process will not be aware of
> + * this. To avoid writing beyond the max_replication_slots
> + * this replication slot statistic information will be skipped.
> + */
> + if (max_replication_slots == nReplSlotStats)
> + {
> + ereport(pgStatRunningInCollector ? LOG : WARNING,
> + (errmsg("skipping \"%s\" replication slot statistics as
> pg_stat_replication_slots does not have enough slots",
> + NameStr(replSlotStats[nReplSlotStats].slotname))));
> + goto done;
> + }
>
> I think we might truncate some valid slots here. I have another idea
> to fix this case which is that while writing, we first write the
> 'nReplSlotStats' and then write each slot info. Then while reading we
> can allocate memory based on the required number of slots. Later when
> startup process sends the slots, we can remove the already dropped
> slots from this array. What do you think?
I felt this idea is better, the reason being in the earlier idea we
might end up deleting some valid replication slot statistics and that
slot's statistics will never be available to the user.
Regards,
Vignesh