Re: Fix race condition in InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
Date
Msg-id Zc30i8fB7VYrTVw+@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: Fix race condition in InvalidatePossiblyObsoleteSlot()  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
List pgsql-hackers
Hi,

On Thu, Feb 15, 2024 at 02:09:45PM +0900, Michael Paquier wrote:
> On Thu, Jan 18, 2024 at 02:20:28PM +0000, Bertrand Drouvot wrote:
> > On Thu, Jan 18, 2024 at 04:59:39PM +0530, Bharath Rupireddy wrote:
> >> I'm not sure if it
> >> can happen at all, but I think we can rely on previous conflict reason
> >> instead of previous effective_xmin/effective_catalog_xmin/restart_lsn.
> > 
> > I'm not sure what you mean here. I think we should still keep the "initial" LSN
> > so that the next check done with it still makes sense. The previous conflict
> > reason as you're proposing also makes sense to me but for another reason: PANIC
> > in case the issue still happen (for cases we did not think about, means not
> > covered by what the added previous LSNs are covering).
> 
> Using a PANIC seems like an overreaction to me for this path.  Sure, 
> we should not record twice a conflict reason, but wouldn't an
> assertion be more adapted enough to check that a termination is not
> logged twice for this code path run in the checkpointer?

Thanks for looking at it!

Agree, using an assertion instead in v3 attached.

> 
> +            if (!terminated)
> +            {
> +                initial_restart_lsn = s->data.restart_lsn;
> +                initial_effective_xmin = s->effective_xmin;
> +                initial_catalog_effective_xmin = s->effective_catalog_xmin;
> +            }
> 
> It seems important to document why we are saving this data here; while
> we hold the LWLock for repslots, before we perform any termination,
> and we  want to protect conflict reports with incorrect data if the
> slot data got changed while the lwlock is temporarily released while
> there's a termination.

Yeah, comments added in v3.

> >> 3. Is there a way to reproduce this racy issue, may be by adding some
> >> sleeps or something? If yes, can you share it here, just for the
> >> records and verify the whatever fix provided actually works?
> > 
> > Alexander was able to reproduce it on a slow machine and the issue was not there
> > anymore with v1 in place. I think it's tricky to reproduce as it would need the
> > slot to advance between the 2 checks.
> 
> I'd really want a test for that in the long term.  And an injection
> point stuck between the point where ReplicationSlotControlLock is
> released then re-acquired when there is an active PID should be
> enough, isn't it? 

Yeah, that should be enough.

> For example just after the kill()?  It seems to me
> that we should stuck the checkpointer, perform a forced upgrade of the
> xmins, release the checkpointer and see the effects of the change in
> the second loop.  Now, modules/injection_points/ does not allow that,
> yet, but I've posted a patch among these lines that can stop a process
> and release it using a condition variable (should be 0006 on [1]).  I
> was planning to start a new thread with a patch posted for the next CF
> to add this kind of facility with a regression test for an old bug,
> the patch needs a few hours of love, first.  I should be able to post
> that next week.

Great, that looks like a good idea!

Are we going to fix this on master and 16 stable first and then later on add a
test on master with the injection points?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: speed up a logical replica setup
Next
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum