On Wed, Mar 04, 2026 at 02:31:29PM +0900, Michael Paquier wrote:
> As a whole, it looks like we should just switch the teardown() call to
> a stop() call in the first test with xact_009_10, backpatch it, and
> call it a day. No need for injection points and no need for GUC
> tweaks.

With a little more patience, I have reproduced the same failure as
Alexander using the bgwriter trick, -DWAL_DEBUG and his reproducer
script with parallel runs of the 009 recovery test.  The attached
patch also proves to work.  Without the fix, the failure shows up at
the second or third iteration; with the fix, the tests survive more
than 50 iterations.

As far as I can see by scanning the history of the test, this is a
copy-pasto dating back to 30820982b295, where the tests were initially
introduced and teardown_node() was copied across the test sequences.
As all we want to check here is that a promoted standby is able to
commit the 2PC transactions issued on the primary, a plain stop()
works equally well.
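
For illustration only, here is a minimal sketch of the pattern under
discussion. It is not the attached patch and not the code of the 009
test. The node names, the table t_sketch and the GID xact_sketch are
made up for the example, and it assumes it is run as a TAP test inside
a PostgreSQL source tree. teardown_node() stops the node in immediate
mode, whereas stop() without arguments uses a fast shutdown.

```perl
use strict;
use warnings FATAL => 'all';
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

# Hypothetical primary, with 2PC enabled before taking the base backup
# so that the standby inherits the same setting.
my $primary = PostgreSQL::Test::Cluster->new('primary');
$primary->init(allows_streaming => 1);
$primary->append_conf('postgresql.conf',
	"max_prepared_transactions = 10\n");
$primary->start;
$primary->backup('bkp');

# Hypothetical streaming standby built from that backup.
my $standby = PostgreSQL::Test::Cluster->new('standby');
$standby->init_from_backup($primary, 'bkp', has_streaming => 1);
$standby->start;

# Make sure the standby is streaming and caught up before the test.
$primary->safe_psql('postgres', 'CREATE TABLE t_sketch (id int)');
$primary->wait_for_catchup($standby);

# Prepare a transaction on the primary and leave it pending.
$primary->psql(
	'postgres', "
	BEGIN;
	INSERT INTO t_sketch VALUES (1);
	PREPARE TRANSACTION 'xact_sketch';");

# Fast shutdown via stop(), instead of teardown_node(), which would
# stop the node in immediate mode.
$primary->stop;
$standby->promote;

# The promoted standby should be able to finish the 2PC transaction.
$standby->safe_psql('postgres', "COMMIT PREPARED 'xact_sketch'");
is( $standby->safe_psql('postgres', 'SELECT count(*) FROM t_sketch'),
	'1',
	'prepared transaction committed on promoted standby');

done_testing();
```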

I'll push this fix shortly, which takes care of one instability.  Nice
investigation on this one, Alexander, by the way.
--
Michael