On Wed, Jul 6, 2022 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Jul 6, 2022 at 9:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > How would you choose the slot name for the table sync? Right now it
> > contains the relid of the table for which it needs to perform sync.
> > Say, if we fail to include the appropriate identifier in the slot
> > name, we won't be able to reuse/drop the slot after restart of table
> > sync worker due to an error.
>
> I had a quick look into the patch and it seems it is using the worker
> array index instead of relid while forming the slot name, and I think
> that makes sense, because now whichever worker is using that worker
> index can reuse the slot created w.r.t. that index.
>
I think that won't work because the slot a worker gets is not fixed
across restarts. If the state of the rel being copied is
SUBREL_STATE_DATASYNC, we may drop the wrong slot. Slot creation can
also fail because a slot with the same name already exists, created by
some other worker that has since been restarted. What about
origin_name, won't that have similar problems? And if the state is
already SUBREL_STATE_FINISHEDCOPY and the slot is not the one used in
the previous run of that worker, it may start WAL streaming from a
different point, based on the slot's confirmed_flush_location.
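
To make the naming concern concrete, here is a rough standalone sketch,
not code from the patch. The format strings, the helper names, and the
SLOT_NAME_LEN constant are all made up for illustration. It only shows
that a name keyed by relid is the same before and after a restart,
while a name keyed by worker array index changes whenever the worker
lands on a different index.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;	/* stand-in for PostgreSQL's Oid type */

#define SLOT_NAME_LEN 64	/* assumed buffer size, for illustration only */

/* Hypothetical name keyed by the table: depends only on (suboid, relid). */
void
slot_name_by_relid(Oid suboid, Oid relid, char *buf, size_t len)
{
	assert(len >= SLOT_NAME_LEN);
	snprintf(buf, len, "pg_%u_sync_%u", suboid, relid);
}

/* Hypothetical name keyed by the worker's array index. */
void
slot_name_by_worker_index(Oid suboid, int worker_index, char *buf, size_t len)
{
	assert(len >= SLOT_NAME_LEN);
	snprintf(buf, len, "pg_%u_sync_w%d", suboid, worker_index);
}

/* Returns 1 if the name computed after a restart matches the one before. */
int
same_slot_after_restart(const char *before, const char *after)
{
	return strcmp(before, after) == 0;
}
```

With the relid-based helper, computing the name twice for the same
table gives the same string, so the restarted worker finds its old
slot. With the index-based helper, a worker that held index 2 and comes
back at index 5 computes a different name, and whoever now holds index
2 computes the name of a slot it did not create.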
--
With Regards,
Amit Kapila.