On Mon, Apr 15, 2024 at 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 12, 2024 at 5:25 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Please find v3 addressing race-condition and one other comment.
> >
> > Up-thread it was suggested that, probably, checking
> > SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On
> > re-thinking, it might not be. Slot sync worker sets and resets
> > 'syncing' with each sync-cycle, and thus we need to rely on worker's
> > pid in ShutDownSlotSync(), as there could be a window where promotion
> > is triggered and 'syncing' is not set for worker, while the worker is
> > still running. This implementation of setting and resetting syncing
> > with each sync-cycle looks better as compared to setting syncing
> > during the entire life-cycle of the worker. So, I did not change it.
> >
>
> To retain this we need to have different handling for 'syncing' for
> workers and function which seems like more maintenance burden than the
> value it provides. Moreover, in SyncReplicationSlots(), we are calling
> a function after acquiring spinlock which is not our usual coding
> practice.
Okay. Changed it to consistent handling. Now both worker and SQL
function set 'syncing' when they start and reset it when they exit.
> One minor comment:
> * All the fields except 'syncing' are used only by slotsync worker.
> * 'syncing' is used both by worker and SQL function pg_sync_replication_slots.
> */
> typedef struct SlotSyncCtxStruct
> {
> pid_t pid;
> bool stopSignaled;
> bool syncing;
> time_t last_start_time;
> slock_t mutex;
> } SlotSyncCtxStruct;
>
> I feel the above comment is no longer valid after this patch. We can
> probably remove this altogether.
Yes, changed.
Please find v4 addressing the above comments.
thanks
Shveta