On Thu, Apr 17, 2025 at 12:19 PM shveta malik wrote:
>
> On Wed, Apr 16, 2025 at 10:30 AM shveta malik <shveta.malik@gmail.com>
> wrote:
> >
> > On Wed, Mar 26, 2025 at 4:17 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Here's a rebased version of the patch series.
> > >
> >
>
> Thanks Hou-San for the patches. I am going through this long thread and
> patches. One doubt I have is whenever there is a chance of conflict-slot update
> (either xmin or possibility of its invalidation), apply-worker gives a wake-up call
> to the launcher (ApplyLauncherWakeup). Shouldn't that suffice to wake-up
> launcher irrespective of its nap-time? Do we actually need to introduce
> MIN/MAX_NAPTIME_PER_SLOT_UPDATE in the launcher and the logic
> around it?
Thanks for reviewing. After rethinking, I agree that the wakeup is
sufficient, so I removed the nap-time logic in this version.
> Few comments for patch004:
> Config.sgml:
> 1)
> + <para>
> + Maximum duration (in milliseconds) for which conflict
> + information can be retained for conflict detection by the apply worker.
> + The default value is <literal>0</literal>, indicating that conflict
> + information is retained until it is no longer needed for detection
> + purposes.
> + </para>
>
> IIUC, the above is not entirely accurate. Suppose the subscriber manages to
> catch up and sets oldest_nonremovable_xid to 100, which is then updated in
> slot. After this, the apply worker takes a nap and begins a new xid update cycle.
> Now, let’s say the next candidate_xid is 200, but this time the subscriber fails
> to keep up and exceeds max_conflict_retention_duration. As a result, it sets
> oldest_nonremovable_xid to InvalidTransactionId, and the launcher skips
> updating the slot’s xmin.
If the time exceeds the max_conflict_retention_duration, the launcher would
Invalidate the slot, instead of skipping updating it. So the conflict info(e.g.,
dead tuples) would not be retained anymore.
> However, the previous xmin value (100) is still there
> in the slot, causing its data to be retained beyond the
> max_conflict_retention_duration. The xid 200 which actually honors
> max_conflict_retention_duration was never marked for retention. If my
> understanding is correct, then the documentation doesn’t fully capture this
> scenario.
As mentioned above, the strategy here is to invalidate the slot.
>
> 2)
> + The replication slot
> + <quote><literal>pg_conflict_detection</literal></quote> that
> used to
> + retain conflict information will be invalidated if all apply workers
> + associated with the subscription, where
>
> Subscription --> subscriptions
>
> 3)
> Name stop_conflict_retention in MyLogicalRepWorker is confusing. Shall it be
> stop_conflict_info_retention?
Changed.
Here is V30 patch set includes the following changes:
* Addressed above comments.
* Added the retention timeout check in wait_for_local_flush(), as suggested by Nisha[1].
* Improved upgrade codes and added a test for upgrade of retain_conflict_info option,
as suggested by Kuroda-san[2].
[1] https://www.postgresql.org/message-id/CABdArM4Ft8q3dZv4Bqw%3DrbS5_LFMXDJMRr3vC8a_KMCX1qatpg%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/OSCPR01MB14966269726272F2F2B2BD3B0F5B22%40OSCPR01MB14966.jpnprd01.prod.outlook.com
Best Regards,
Hou zj