Thread: Avoid race condition for event_triggers regress test
Hi hackers,
After the on-login trigger was committed (e83d1b0c40ccda8955f1245087f0697652c4df86), the event_trigger regression test became sensitive to any other test running in parallel, not only to tests that run DDL. It should therefore be placed in a separate parallel schedule group.
The concrete problem is a race condition that can occur on some systems when the oidjoins test starts a moment later than usual and so affects the login count in the on-login trigger test. The problem is rare; I have only hit it once. But rare or not, it is a real problem and should be addressed.
The race can be simulated by adding "select pg_sleep(2);" and "\c" at the very beginning of oidjoins.sql, and adding "select pg_sleep(5);" right after the creation of the login trigger in event_trigger.sql.
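Spelled out as script fragments, the reproduction steps might look like the sketch below. The statements are the ones quoted above; the comments and the exact placement within each file are only my reading of the description, not text taken from the patch.

```sql
-- oidjoins.sql: add at the very beginning of the file.
-- The sleep delays the test; \c then reconnects, which counts as a new
-- login while event_trigger's login trigger is still installed.
select pg_sleep(2);
\c

-- event_trigger.sql: add right after the login trigger is created.
-- The sleep keeps the trigger in place long enough for the
-- reconnection above to land inside the window.
select pg_sleep(5);
```

With both delays in place, the reconnect in oidjoins should fall inside the window where the login trigger exists, which is the interleaving that otherwise happens only rarely.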
The resulting symptoms are easy to recognize: regression.diffs will contain an unexpected welcome message for the oidjoins test and an unexpectedly increased result of "SELECT COUNT(*) FROM user_logins;" for the event_trigger test. (These are accompanied by the expected responses to the newly added commands, of course.)
To get rid of the unexpected results, the oidjoins and event_trigger tests should be split into separate parallel schedule groups. This is exactly what the attached patch does.
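For readers without the attachment at hand, the change to src/test/regress/parallel_schedule could look roughly like the fragment below. This is only an illustration under two assumptions: that the two tests currently share a single "test:" line, and that oidjoins is listed first. The attached patch is authoritative for the actual order and comments.

```text
# before (assumed): both tests run concurrently in one group
test: oidjoins event_trigger

# after: each test gets its own group, so they never overlap
test: oidjoins
test: event_trigger
```

Each "test:" line in the schedule is one parallel group, and a group starts only after the previous one has finished, so putting the tests on separate lines is enough to serialize them.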
What do you think?
Hi,
> The current problem is that a race condition may occur on some systems, when oidjoins test starts a moment later than normally and affects logins count for on-login trigger test. The problem is quite a rare one and I only faced it once. But rare or not - the problem is a problem and it should be addressed.
Thanks for the patch and the steps to reproduce.
I tested the patch and it does what is claimed. Including the steps to reproduce as a separate patch with .txt extension so cfbot will ignore it.
I think it's a good find and a good fix.
--
Best regards,
Aleksander Alekseev