Hello Michael and Bertrand,
15.01.2024 06:59, Michael Paquier wrote:
> The WAL records related to standby snapshots are playing a lot with
> the randomness of the failures we are seeing. Alexander has mentioned
> offlist something else: using SIGSTOP on the bgwriter to avoid these
> records and make the test more stable. That would not be workable for
> Windows, but I could live with that knowing that logical decoding for
> standbys has no platform-specific tweak for the code paths we're
> testing here, and that would put as limitation to skip the test for
> $windows_os.
I've found a way to implement pause/resume for Windows processes, and it
looks acceptable to me if we can afford "use Win32::API;" on Windows
(maybe the test could be skipped only if this Perl module is absent).
Please look at the PoC patch for the test 035_standby_logical_decoding.
(The patched test passes for me.)
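For reference, on POSIX systems the pause/resume idea comes down to
SIGSTOP/SIGCONT. Below is a minimal Python sketch of that half only. It is
not the PoC patch itself (the test is Perl), it does not cover the Windows
side, and the helper names pause_process/resume_process as well as the
sleeping child that stands in for the bgwriter are made up for illustration.

```python
import os
import signal
import subprocess
import sys


def pause_process(pid: int) -> None:
    """Suspend a process (POSIX only); it stays stopped until resumed."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let a previously stopped process continue running (POSIX only)."""
    os.kill(pid, signal.SIGCONT)


def demo() -> list:
    """Pause and resume a child process, recording what was observed."""
    # Hypothetical stand-in for the bgwriter: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    events = []
    try:
        pause_process(child.pid)
        # WUNTRACED makes waitpid() report a stopped (not only exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        events.append("stopped" if os.WIFSTOPPED(status) else "unexpected")

        resume_process(child.pid)
        # poll() returns None as long as the child has not exited.
        events.append("running" if child.poll() is None else "exited")
    finally:
        # Always clean up the stand-in process.
        child.kill()
        child.wait()
    return events


if __name__ == "__main__":
    print(demo())
```

In a test, the work that must not be disturbed by the paused process would
go between the pause and resume calls.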
If this approach looks promising to you, maybe we could add a submodule to
perl/PostgreSQL/Test/ and use this functionality in other tests (e.g., in
019_replslot_limit) as well.
Personally, I think that having such functionality available in tests
might be useful not only to avoid some "problematic" behaviour but also to
test the opposite cases.
> While thinking about that, a second idea came into my mind: a
> superuser-settable developer GUC to disable such WAL records to be
> generated within certain areas of the test. This requires a small
> implementation, but nothing really huge, while being portable
> everywhere. And it is not the first time I've been annoyed with these
> records when wanting a predictable set of WAL records for some test
> case.
I see that the test in question exists in REL_16_STABLE. Does that mean that a
new GUC would not help there?
Best regards,
Alexander