Re: Question about InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Question about InvalidatePossiblyObsoleteSlot() |
Date | |
Msg-id | CAD21AoD6zM1E8E1rrJe96vc9x+Kuwd8AdF4zh-76dZWTsip3DQ@mail.gmail.com Whole thread Raw |
In response to | Re: Question about InvalidatePossiblyObsoleteSlot() (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
List | pgsql-hackers |
On Fri, Oct 17, 2025 at 12:18 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Thu, Oct 16, 2025 at 11:32:03AM -0700, Masahiko Sawada wrote: > > On Thu, Oct 16, 2025 at 1:23 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > I'm not sure. The existing logging shows when we invalidate a slot, so the > > > absence of that message indicates we skipped it. Users can also verify the > > > slot status in the view if needed. > > > > > > > I'm somewhat confused. As the commit message of 818fefd8fd4 says, the > > invalidation of an active slot is done in two steps: > > > > - Termination of the backend holding it, if any. > > - Report that the slot is obsolete, with a conflict cause depending on > > the slot's data. > > > > Before commit 818fefd8fd4, it was possible that if slot's > > effective_xmin, catalog_effective_xmin, or restart_lsn gets advanced > > between two steps, we ended up skipping the actual invalidation for > > the slot. > > If advancing far enough to reach RS_INVAL_NONE, right. > > > What the patch proposed on this thread does is that we allow such pre- > > 818fefd8fd4 behavior only for restart_lsn. That is, we skip the actual > > slot invalidation if the slot's restart_lsn gets advanced and exceeds > > the oldestLSN between two steps. > > Exactly. > > > I thought the reason why we moved away from pre-818fefd8fd4 was that > > we wanted to avoid reporting only "terminated" and to report both > > things consistently. But if we can accept pre-818fefd8fd4 for > > restart_lsn, why can we not do the same for effective_xmin and > > catalog_effective_xmin? > > I guess that the same is true also for > > effective_xmin and catalog_effective_xmin that we can skip the actual > > slot invalidation if those values get advanced and exceed > > snapshotConflictHorizon. > > I think this is safe to do for the restart_lsn because that's the same process > (checkpointer/startup) that is executing InvalidatePossiblyObsoleteSlot() > and will remove the WALs. It can not remove the WALs as long as it is busy in > InvalidatePossiblyObsoleteSlot() so that I think this is safe for the restart_lsn > case. > > I'm not that confident for the xmin cases, do you think that's also safe to not > invalidate the slots for effective_xmin and catalog_effective_xmin if they > advance far enough? I find the same in xmin cases. ResolveRecoveryConflictWithSnapshot() is called only during the recovery by the startup process, and it also tries to invalidate possibly obsolete slots. Catalog tuples are not removed as long as the startup calls ResolveRecoveryConflictWithSnapshot() before actually removing the tuples and it's busy in InvalidatePossiblyObsoleteSlot(). I might be missing something but this is the reason why I'm confused with the 818fefd8fd4 fix and the proposed change. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: