Re: Replication slot stats misgivings - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Replication slot stats misgivings
Date
Msg-id CAA4eK1+GW98sKkooQT1en4EU=Gaugg-5Yi57FNDmvsY60McAFw@mail.gmail.com
In response to Re: Replication slot stats misgivings  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Replication slot stats misgivings  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > - If max_replication_slots was lowered between a restart,
> > >   pgstat_read_statfile() will happily write beyond the end of
> > >   replSlotStats.
> >
> > I think we cannot restart the server after lowering
> > max_replication_slots to a value less than the number of replication
> > slots actually created on the server. No?
>
> This problem happens in the case where max_replication_slots is
> lowered and there still are stats for a slot.
>

I think this can happen only if the drop message is lost, right?

> I understood the risk of running out of replSlotStats. If we use the
> index in replSlotStats instead, IIUC we need to somehow synchronize
> the indexes in between replSlotStats and
> ReplicationSlotCtl->replication_slots. The order of replSlotStats is
> preserved across restarting whereas the order of
> ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
> StartupReplicationSlots() doesn’t guarantee the order of the returned
> entries in the directory). Maybe we can compare the slot name in the
> received message to the name in the element of replSlotStats. If they
> don’t match, we swap entries in replSlotStats to synchronize the index
> of the replication slot in ReplicationSlotCtl->replication_slots and
> replSlotStats. If we cannot find an entry in replSlotStats that has
> the name in the received message, it probably means that either it's a
> new slot or the previous create message was dropped, so we can create
> new stats for the slot. Is that what you mean, Andres?
>
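
To make the swap scheme quoted above concrete, here is a minimal sketch of it. It is not code from any patch: SlotStats, MAX_SLOTS, find_entry(), swap_entries() and sync_slot_stats() are made-up names, the struct is heavily simplified, and slot names are assumed to be non-empty so that an empty name can mark an unused entry.

```c
#include <string.h>

#define NAMEDATALEN 64
#define MAX_SLOTS   4           /* stands in for max_replication_slots */

/* Hypothetical, simplified per-slot stats entry. */
typedef struct SlotStats
{
    char name[NAMEDATALEN];     /* empty string means "unused" */
    long spill_txns;
} SlotStats;

static SlotStats replSlotStats[MAX_SLOTS];

/* Return the index of the entry with this name, or -1 if there is none. */
static int
find_entry(const char *name)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        if (strcmp(replSlotStats[i].name, name) == 0)
            return i;
    return -1;
}

static void
swap_entries(int a, int b)
{
    SlotStats tmp = replSlotStats[a];

    replSlotStats[a] = replSlotStats[b];
    replSlotStats[b] = tmp;
}

/*
 * Make replSlotStats[idx] the entry for 'name', where idx is the slot's
 * index in the slot array as carried by the message.  Returns NULL if idx
 * is out of range.
 */
static SlotStats *
sync_slot_stats(int idx, const char *name)
{
    int found;

    if (idx < 0 || idx >= MAX_SLOTS)
        return NULL;

    found = find_entry(name);
    if (found < 0)
    {
        /*
         * New slot, or its create message was lost.  Park the current
         * occupant in an unused entry if one exists; otherwise it is
         * overwritten, on the assumption that it is stale.
         */
        int unused = find_entry("");

        if (unused >= 0)
            swap_entries(idx, unused);
        memset(&replSlotStats[idx], 0, sizeof(SlotStats));
        strncpy(replSlotStats[idx].name, name, NAMEDATALEN - 1);
    }
    else if (found != idx)
        swap_entries(idx, found);   /* same slot, different position */

    return &replSlotStats[idx];
}
```

Note that the sketch can still fill up with entries for slots that no longer exist, which is the case asked about below.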

I wonder how, in this scheme, we would both remove the risk of running
out of 'replSlotStats' entries and still restore the correct stats if a
drop message is lost. Do we want to check, after restoring each slot's
info, whether a slot with that name still exists?
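
To illustrate the kind of check I mean, here is a rough sketch under simplifying assumptions. The saved stats and the list of live slot names are passed in as plain arrays, and SlotStats, slot_is_live() and restore_slot_stats() are hypothetical names, not existing functions. How the code reading the stats file would actually learn which slots exist is the open part of the question.

```c
#include <stdbool.h>
#include <string.h>

#define NAMEDATALEN 64

/* Hypothetical, simplified per-slot stats entry. */
typedef struct SlotStats
{
    char name[NAMEDATALEN];
    long spill_txns;
} SlotStats;

/* Does a slot with this name appear in the list of live slot names? */
static bool
slot_is_live(const char *name, const char *const *live, int nlive)
{
    for (int i = 0; i < nlive; i++)
        if (strcmp(live[i], name) == 0)
            return true;
    return false;
}

/*
 * Copy saved entries into dest, skipping entries whose slot no longer
 * exists and never writing past 'capacity' entries.  Returns the number
 * of entries kept.
 */
static int
restore_slot_stats(const SlotStats *saved, int nsaved,
                   SlotStats *dest, int capacity,
                   const char *const *live, int nlive)
{
    int nkept = 0;

    for (int i = 0; i < nsaved; i++)
    {
        /* Stats for a slot that is gone: its drop message was likely lost. */
        if (!slot_is_live(saved[i].name, live, nlive))
            continue;

        /* Guard against a lowered limit: stop instead of overrunning dest. */
        if (nkept >= capacity)
            break;

        dest[nkept++] = saved[i];
    }
    return nkept;
}
```

With a check like this, stale entries are discarded at restore time, and the bound on nkept keeps the restore from writing beyond the end of the array even if the limit was lowered.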


--
With Regards,
Amit Kapila.


