Re: A failure in prepared_xacts test - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: A failure in prepared_xacts test
Msg-id Zi8pC2WkIL56cHmj@paquier.xyz
In response to A failure in prepared_xacts test  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: A failure in prepared_xacts test
List pgsql-hackers
On Mon, Apr 29, 2024 at 09:12:48AM +0800, Richard Guo wrote:
> Does anyone have any clue to this failure?
>
> FWIW, after another run of this test, the failure just disappears.  Does
> it suggest that the test case is flaky?

If you grep the source tree, you'd notice that a prepared transaction
named gxid only exists in the 2PC tests of ECPG, in
src/interfaces/ecpg/test/sql/twophase.pgc.  So the failure comes from
a race condition due to test parallelization: pg_prepared_xacts shows
prepared transactions from all databases, and in your case the scan of
pg_prepared_xacts was running in parallel with the ECPG tests under
installcheck.

The only location in the whole tree where we want to do predictable
scans of pg_prepared_xacts is prepared_xacts.sql, so rather than
playing with 2PC transactions across a bunch of tests, I think that we
should do two things, both touching prepared_xacts.sql:
- The 2PC transactions run in the main regression test suite should
use names that are unlikely to be used elsewhere.
- Limit the scans of pg_prepared_xacts to these name patterns to avoid
interference.
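
As a rough sketch of these two points (not the attached patch itself),
the test could look something like the following.  The regress_ prefix,
the table name and the GID used here are hypothetical and only serve as
an illustration; it assumes max_prepared_transactions is set to a
non-zero value.

```sql
-- Hypothetical names: regress_2pc_demo and the regress_ prefix are
-- illustrative only, not taken from the attached patch.
-- Requires max_prepared_transactions > 0.
CREATE TABLE regress_2pc_demo (a int);

BEGIN;
INSERT INTO regress_2pc_demo VALUES (1);
-- Use a GID that other test suites are unlikely to pick.
PREPARE TRANSACTION 'regress_2pc_demo_1';

-- Restrict the scan to this test's own name pattern, so that a
-- transaction such as gxid, prepared concurrently by another suite,
-- does not show up in the output.
SELECT gid FROM pg_prepared_xacts WHERE gid ~ '^regress_' ORDER BY gid;

-- Clean up.
ROLLBACK PREPARED 'regress_2pc_demo_1';
DROP TABLE regress_2pc_demo;
```

With a filter like this, the result of the scan depends only on the
transactions prepared by the test itself, whatever else is running in
the cluster at the same time.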

See for example the attached with both expected outputs updated
depending on the value set for max_prepared_transactions in the
backend.  There may be an argument for back-patching that, but I don't
recall seeing this failure in the CI, so perhaps that's not worth
bothering with.  What do you think?
--
Michael

