Fix race condition in InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Fix race condition in InvalidatePossiblyObsoleteSlot()
Date
Msg-id ZaTjW2Xh+TQUCOH0@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
Responses Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
List pgsql-hackers
Hi hackers,

While working on [1], we discovered (thanks Alexander for the testing) that an
conflicting active logical slot on a standby could be "terminated" without
leading to an "obsolete" message (see [2]).

Indeed, in case of an active slot we proceed in 2 steps in
InvalidatePossiblyObsoleteSlot():

- terminate the backend holding the slot
- report the slot as obsolete

This is racy because between the two we release the mutex on the slot, which
means that the slot's effective_xmin and effective_catalog_xmin could advance
during that time (leading to exit the loop).

I think that holding the mutex longer is not an option (given what we'd to do
while holding it) so the attached proposal is to record the effective_xmin and
effective_catalog_xmin instead that was used during the backend termination.

[1]: https://www.postgresql.org/message-id/flat/bf67e076-b163-9ba3-4ade-b9fc51a3a8f6%40gmail.com
[2]: https://www.postgresql.org/message-id/7c025095-5763-17a6-8c80-899b35bd0459%40gmail.com

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Next
From: Tatsuo Ishii
Date:
Subject: Re: pgbnech: allow to cancel queries during benchmark