Alexander, thanks for pushing this! This is a small but long-awaited feature.
> On 16 Feb 2024, at 02:08, Andres Freund <andres@anarazel.de> wrote:
>
> Isn't this test going to be very fragile on busy / slow machines? What if the
> pg_sleep() takes one second, because there were other tasks to schedule? I'd
> be surprised if this didn't fail under valgrind, for example.
Even the more robust tests, which were bullet-proof in CI, have since shown failures on the buildfarm. There have
been 5 failures so far through this weekend.
The failing tests cover the interaction of idle_in_transaction_session_timeout vs transaction_timeout (session 5),
and rescheduling of transaction_timeout (session 6).
Symptoms:
[0] transaction timeout occurs while it is being scheduled. It seems the SET was running too long.
step s6_begin: BEGIN ISOLATION LEVEL READ COMMITTED;
step s6_tt: SET statement_timeout = '1s'; SET transaction_timeout = '10ms';
+s6: FATAL: terminating connection due to transaction timeout
step checker_sleep: SELECT pg_sleep(0.1);
[1] transaction timeout 10ms is not detected after 1s
step s6_check: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/s6';
count
-----
- 0
+ 1
[2] transaction timeout is detected in neither session 5 nor session 6.
So far no single animal has reported failures twice, so it's hard to say anything about frequency. But it seems to
be a significant source of failures.
So far I have these ideas:
1. Remove test sessions 5 and 6. But it seems a little strange that session 3 did not fail at all (it is testing the
interaction of statement_timeout and transaction_timeout). That test is very similar to test session 5...
2. Increase wait times.
step checker_sleep { SELECT pg_sleep(0.1); }
This seems not to be enough to observe in pg_stat_activity that the backend has timed out. But increasing it won't help with [0].
3. Reuse the waiting INJECTION_POINT from [3] to make the timeout tests deterministic and safe from race conditions.
With waiting injection points we can wait as long as needed in the current environment.
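As a rough sketch of idea 3, and assuming the injection_points test module provides a 'wait' action plus a wakeup call along the lines of [3], the test could look something like this. The point name 'transaction-timeout' is hypothetical, and the backend would need a matching injection point in its timeout handling path.

```sql
-- Checker session: make any backend that reaches the (hypothetical) point wait.
SELECT injection_points_attach('transaction-timeout', 'wait');

-- Tested session: the timeout fires, but the backend parks at the point
-- instead of terminating at some moment the test cannot predict.
BEGIN ISOLATION LEVEL READ COMMITTED;
SET transaction_timeout = '10ms';

-- Checker session: once the tested backend is seen waiting, release it
-- and only then check that it was terminated.
SELECT injection_points_wakeup('transaction-timeout');
SELECT injection_points_detach('transaction-timeout');
```

The test would then no longer depend on pg_sleep() durations, only on the order of these steps.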
Any advice is welcome.
Best regards, Andrey Borodin.
[0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-16%2020%3A06%3A51
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-02-16%2001%3A45%3A10
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2024-02-17%2001%3A55%3A45
[3] https://www.postgresql.org/message-id/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru