Re: Replication slot stats misgivings - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Replication slot stats misgivings
Msg-id CAA4eK1L8VE9YLE1tJzXW1nAaNQXOax6bsOuD1oOa7Xfn2Wykcg@mail.gmail.com
In response to Re: Replication slot stats misgivings  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Replication slot stats misgivings  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Mar 25, 2021 at 11:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Leaving aside the restart case, without some such sanity checking,
> > if both the drop (of the old slot) and create (of the new slot) messages
> > are lost, then we will start accumulating stats in old slots. However,
> > if only one of them is lost, then there won't be any such problem.
> >
> > > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > > collector ensuring the mapping makes sense?
> > >
> >
> > Say we send just the index location of each slot; then we can
> > probably set up replSlotStats. Now say that before the restart one of
> > the drop messages was missed (by the stats collector) and it happens
> > to be at some middle location; then we would end up restoring some
> > already dropped slot, leaving out some of the still-required ones.
> > However, if there is some sanity identifier like the name along with
> > the index, then I think that would work for such a case.
>
> Couldn't even such messages be lost? Given that any message could be
> lost under a UDP connection, I think we cannot rely on a single
> message. Instead, I think we need to loosely synchronize the indexes
> while assuming the indexes in replSlotStats and
> ReplicationSlotCtl->replication_slots are not synchronized.
>
> >
> > I think it would have been easier if we had some OID-type identifier
> > for each slot. But without that, maybe the combination of index
> > location in ReplicationSlotCtl->replication_slots and slot name can
> > greatly reduce the chances of slot stats going wrong, even if not to zero.
> > If not the name, do we have anything else in a slot that can be used
> > for some sort of sanity checking?
>
> I don't see any useful information in a slot for sanity checking.
>

In that case, can we do a hard check of which slots exist if
replSlotStats runs out of space (which can probably happen only after a
restart and when we have lost some drop messages)?
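
To make the idea concrete, here is a minimal sketch of such a hard check.
It is only an illustration under stated assumptions: SlotStatsEntry,
slot_exists() and prune_dropped_slot_stats() are hypothetical names, the
entry layout is a simplified stand-in for what replSlotStats holds, and
the list of live slot names is assumed to be obtained by some reliable
means rather than via the lossy stats messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOT_NAME_LEN 64		/* hypothetical; stands in for NAMEDATALEN */

/* Hypothetical, simplified stand-in for one entry of replSlotStats. */
typedef struct SlotStatsEntry
{
	char		name[SLOT_NAME_LEN];
	long		spill_txns;
} SlotStatsEntry;

/* Return true if 'name' appears in the list of currently existing slots. */
static bool
slot_exists(const char *name, const char *const *live, int nlive)
{
	for (int i = 0; i < nlive; i++)
	{
		if (strcmp(name, live[i]) == 0)
			return true;
	}
	return false;
}

/*
 * Hard check: keep only the stats entries whose slot still exists,
 * compacting the array in place.  Returns the new number of entries.
 * Entries for slots whose drop message was lost are discarded here.
 */
static int
prune_dropped_slot_stats(SlotStatsEntry *stats, int nstats,
						 const char *const *live, int nlive)
{
	int			kept = 0;

	assert(nstats >= 0 && nlive >= 0);

	for (int i = 0; i < nstats; i++)
	{
		if (slot_exists(stats[i].name, live, nlive))
		{
			if (kept != i)
				stats[kept] = stats[i];
			kept++;
		}
	}
	return kept;
}
```

In this sketch the pruning would run only when a new slot cannot be given
an entry because the array is full, so the cost of comparing names is paid
rarely and not on every stats message.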


-- 
With Regards,
Amit Kapila.


