On Monday, September 1, 2025 12:45 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> On Friday, August 29, 2025 6:28 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Aug 29, 2025 at 11:49 AM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > Here is the new version patch set which also addressed Shveta's
> > comments[1].
> > >
> >
> > Thanks for the patch.
> >
> > On 001 alone, I’m observing a behavior where, if sub1 has stopped
> > retention, and I then create a new subscription sub2, the worker for
> > sub2 fails to start successfully. It repeatedly starts and exits,
> > logging the following message:
> >
> > LOG: logical replication worker for subscription "sub2" will restart
> > because the option retain_dead_tuples was enabled during startup
> >
> > The same thing happens when I disable and re-enable 'retain_dead_tuples' of
> > any sub once the slot has invalid xmin.
>
> I think this behavior is because slot.xmin is set to an invalid number, and 0001
> patch has no slot recovery logic, so even if retentionactive is true, newly created
> subscriptions cannot have a valid oldest_nonremovable_xid.
>
> After thinking more, I decided to add slot recovery functionality to 0001 as well,
> thus avoiding the need for additional checks here. I also adjusted the
> documents accordingly.
>
> Here is the V69 patch set which addressed above comments and the latest
> comment from Nisha[1].

I reviewed the patch internally and tweaked one small detail in the apply worker:
when max_retention_duration is defined, the wait time in the main loop is now
capped (wait_time = min(wait_time, max_retention_duration)). I also added a
simple test to 035_conflicts.pl in 0001 to verify the new subscription option.
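
For anyone following along, the wait-time adjustment amounts to roughly the
following. This is only a minimal sketch, not the code in the patch: the
function name clamp_wait_time and the *_ms parameter names are hypothetical,
and it assumes that both values are in milliseconds and that 0 means
max_retention_duration is not defined.

```c
/*
 * Sketch only: cap the apply worker's main-loop wait time by
 * max_retention_duration. Names and units are assumptions, not
 * taken from the patch.
 */
static long
clamp_wait_time(long wait_time_ms, long max_retention_duration_ms)
{
	/* Assumption: 0 (or less) means the option is not defined. */
	if (max_retention_duration_ms <= 0)
		return wait_time_ms;

	/* wait_time = min(wait_time, max_retention_duration) */
	if (max_retention_duration_ms < wait_time_ms)
		return max_retention_duration_ms;

	return wait_time_ms;
}
```

With a cap like this, the worker wakes up no later than
max_retention_duration after it went to sleep, instead of sleeping for the
full default wait time.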

Here is the V70 patch set.

Best Regards,
Hou zj