Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Date
Msg-id ZaTxhjnPygOdosJ4@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

On Mon, Jan 15, 2024 at 01:11:26PM +0900, Michael Paquier wrote:
> On Sun, Jan 14, 2024 at 11:08:39PM -0500, Tom Lane wrote:
> > Michael Paquier <michael@paquier.xyz> writes:
> >> While thinking about that, a second idea came into my mind: a
> >> superuser-settable developer GUC to disable such WAL records to be
> >> generated within certain areas of the test.  This requires a small
> >> implementation, but nothing really huge, while being portable
> >> everywhere.  And it is not the first time I've been annoyed with these
> >> records when wanting a predictible set of WAL records for some test
> >> case.
> > 
> > Hmm ... I see what you are after, but to what extent would this mean
> > that what we are testing is not our real-world behavior?
> 
> Don't think so.  We don't care much about these records when it comes
> to checking slot invalidation scenarios with a predictible XID
> horizon, AFAIK.

Yeah, we want to test slot invalidation behavior, so we first need to ensure that
such an invalidation occurs (which is not the case if we get an xl_running_xacts
in the middle).

OTOH I also see Tom's point: for example I think we'd not have discovered [1]
(other than from the field) with such a developer GUC in place.

We did a few things in this thread, so to sum up what we've discovered:

- a race condition in InvalidatePossiblyObsoleteSlot() (see [1])
- we need to launch the vacuum(s) only if we are sure we got a newer XID horizon
(proposal in v6 attached)
- we need a way to control how frequently xl_running_xacts are emitted (to ensure
they are not triggered in the middle of an active slot invalidation test).
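
To illustrate the second point, here is a minimal sketch of the "vacuum only
once the horizon has moved" idea. It is not the code from the v6 patch: the three
callables are hypothetical stand-ins for whatever the test uses to query the
server, and XID wraparound is ignored for simplicity.

```python
import time
from typing import Callable


def vacuum_once_horizon_advances(
    get_xid_horizon: Callable[[], int],
    make_dead_rows: Callable[[], None],
    run_vacuum: Callable[[], None],
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.01,
) -> bool:
    """Launch the vacuum only after the XID horizon is seen to move forward.

    All three callables are hypothetical (assumptions for illustration).
    Returns True if the vacuum was launched, False if the wait timed out.
    """
    horizon_before = get_xid_horizon()  # horizon before creating dead rows
    make_dead_rows()                    # the statements that leave dead tuples

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # Plain integer comparison: wraparound is deliberately not handled.
        if get_xid_horizon() > horizon_before:
            run_vacuum()                # horizon moved, so launch the vacuum
            return True
        time.sleep(poll_interval_s)
    return False                        # horizon never advanced: no vacuum
```

On timeout the caller gets False and no vacuum is launched, so a test could
fail with a clear message rather than checking for an invalidation that the
vacuum could not have triggered.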

I'm not sure it's possible to address Tom's concern and keep the test "predictable".

So, I think I'd vote for Michael's proposal to implement a superuser-settable
developer GUC (as sending a SIGSTOP to the bgwriter (and bypassing $windows_os)
would still not address Tom's concern anyway).

Another option would be to "sacrifice" the full predictability of the test (in
favor of real-world behavior testing)?

[1]: https://www.postgresql.org/message-id/ZaTjW2Xh%2BTQUCOH0%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

