On Wed, 1 Oct 2025 at 08:26, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shlok,
>
> > Thanks for updating the patch. Here are my comments.
>
> I found one more comment.
>
> ```
> + /*
> + * If found_consistent_snapshot is not NULL and a consistent snapshot is
> + * found set the slot sync skip reason to none. Else, if consistent
> + * snapshot is not found the stats will be updated in the function
> + * update_and_persist_local_synced_slot
> + */
> + if (!found_consistent_snapshot || *found_consistent_snapshot)
> + update_slot_sync_skip_stats(slot, SS_SKIP_NONE);
> ```
>
> I think the condition is confusing; in code level there is a path that
> found_consistent_snapshot is NULL but synchronization happened. (Not sure it is
> possible though).
>
> I think it is better to put update_slot_sync_skip_stats() near the sync part.
> If the snapshot exists from the beginning, it can be done unconditionally,
> otherwise we can check again. Attached .diffs file implements it.
>
I agree that the condition can be confusing. I checked your patch and
have a concern.
For the change:
```
LogicalSlotAdvanceAndCheckSnapState(remote_slot->confirmed_lsn,
found_consistent_snapshot);
+ /* Update the slot sync skip reason if snapshot could be created */
+ if (SnapBuildSnapshotExists(remote_slot->restart_lsn))
+ {
+ Assert(!found_consistent_snapshot ||
+ *found_consistent_snapshot);
+ update_slot_sync_skip_stats(slot, SS_SKIP_NONE);
+ }
+
```
I debugged and noticed that, on a successful sync slot run, in some
cases we do not enter the if condition here which can lead to
misleading slot_sync_skip_reason.
I noticed that even if the synced slot is advanced and
'found_consistent_snapshot' set as true,
SnapBuildSnapshotExists(remote_slot->restart_lsn) can return false.
As far as I understand, LogicalSlotAdvanceAndCheckSnapState will
advance the slot but will not serialize the snapshot and hence
SnapBuildSnapshotExists can return false.
I have also added a test case in 049_slot_skip_stats named
"slot_sync_skip_reason is reset after successful sync" which can
reproduce it.
I think we can update the stats when "slot->data.confirmed_flush ==
remote_slot->confirmed_lsn" instead of checking
'SnapBuildSnapshotExists'. Thoughts?
Thanks,
Shlok Kyal