On Tue, Sep 5, 2023 at 10:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Sep 5, 2023 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Sep 4, 2023 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > That said, there is a possibility that some of the slots which got
> > > invalidated even on the previous checkpoint might get the same LSN as
> > > the slot which got invalidated later if there is no activity between
> > > these two checkpoints. So if we go with this approach then there is
> > > some risk of migrating some of the slots which were already
> > > invalidated even before the shutdown checkpoint.
> > >
> >
> > I think even during the shutdown checkpoint, after writing shutdown
> > checkpoint WAL, we can invalidate some slots that in theory are safe
> > to migrate/copy because all the WAL for those slots would also have
> > been sent. So, those would be similar to what we invalidate during the
> > upgrade, no?
>
> That's correct.
>
> > If so, I think it is better to have the same behavior for
> > invalidated slots irrespective of the time it gets invalidated. We can
> > either give an error for such slots during the upgrade (which means
> > disallow the upgrade) or simply ignore such slots during the upgrade.
> > I would prefer ERROR but if we want to ignore such slots, we can
> > probably inform the user in some way about ignored slots, so that she
> > can later drop the corresponding subscriptions or recreate such slots and
> > do the required sync-up to continue the replication.
>
> Earlier I was thinking that ERRORing out is better so that the user
> can take necessary action for the invalidated slots and then retry
> upgrade. But thinking again, I could not find any advantage in
> this: if we error out, users still need to restart the old cluster
> and drop the corresponding subscriptions. OTOH, if we allow the
> upgrade by ignoring the slots, the user has to take similar actions
> on the new cluster. So what's the advantage of erroring out over
> upgrading?
>
The advantage is that we spare users an inconvenience: if such slots
are ignored, DROP SUBSCRIPTION will fail because the corresponding
slots are not present. So users first need to disassociate the slot
from the subscription and only then drop the subscription. Also, I am
not sure that leaving behind some slots has no other impact; otherwise,
why don't we drop such slots from time to time after they are marked
invalidated during normal operation? If users really want to leave
behind such invalidated slots after the upgrade, we can even think of
providing some option like "exclude_invalid_logical_slots".
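
To illustrate the extra steps, the disassociate-then-drop sequence on
the subscriber would look roughly like this. The subscription name
"mysub" is hypothetical; this is only a sketch of the usual way to drop
a subscription whose remote slot no longer exists.

```sql
-- "mysub" is a hypothetical subscription name.
-- The subscription must be disabled before its slot can be detached.
ALTER SUBSCRIPTION mysub DISABLE;
-- Disassociate the (missing) remote slot from the subscription.
ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
-- The drop no longer tries to remove the slot on the publisher.
DROP SUBSCRIPTION mysub;
```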
--
With Regards,
Amit Kapila.