installcheck-world concurrency issues - Mailing list pgsql-hackers

From Andres Freund
Subject installcheck-world concurrency issues
Date
Msg-id 20221003234111.4ob7yph6r4g4ywhu@awork3.anarazel.de
Responses Re: installcheck-world concurrency issues  (Michael Paquier <michael@paquier.xyz>)
Re: installcheck-world concurrency issues  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
Hi,

While working on installcheck support with meson, I noticed that running
installcheck-world currently fails regularly with meson and occasionally with
make.

A way to quite reliably reproduce this with make is

make -s -j48 -C contrib/ USE_MODULE_DB=1 installcheck-adminpack-recurse installcheck-passwordcheck-recurse

that will fail with diffs like:

diff -du10 /home/andres/src/postgresql/contrib/passwordcheck/expected/passwordcheck.out /home/andres/build/postgres/dev-assert/vpath/contrib/passwordcheck/res>
--- /home/andres/src/postgresql/contrib/passwordcheck/expected/passwordcheck.out        2022-10-03 15:56:57.900326662 -0700
+++ /home/andres/build/postgres/dev-assert/vpath/contrib/passwordcheck/results/passwordcheck.out        2022-10-03 15:56:59.930329973 -0700
@@ -1,19 +1,22 @@
 LOAD 'passwordcheck';
 CREATE USER regress_user1;
 -- ok
 ALTER USER regress_user1 PASSWORD 'a_nice_long_password';
+ERROR:  tuple concurrently deleted
 -- error: too short
 ALTER USER regress_user1 PASSWORD 'tooshrt';
-ERROR:  password is too short
+ERROR:  role "regress_user1" does not exist
 -- error: contains user name
 ALTER USER regress_user1 PASSWORD 'xyzregress_user1';
-ERROR:  password must not contain user name
+ERROR:  role "regress_user1" does not exist
 -- error: contains only letters


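That's not surprising: both tests use the same role name, "regress_user1", and
roles are shared across all databases in a cluster, so USE_MODULE_DB=1 does
not isolate them from each other.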

The attached patch fixes a number of instances of this issue. With it I got
through ~5 iterations of installcheck-world with autoconf, and >30 iterations
with meson.

There are a few further roles that seem to pose some danger going forward:

./contrib/file_fdw/sql/file_fdw.sql:CREATE ROLE regress_no_priv_user LOGIN;                 -- has priv but no user mapping
./contrib/postgres_fdw/sql/postgres_fdw.sql:CREATE ROLE regress_view_owner SUPERUSER;
./contrib/postgres_fdw/sql/postgres_fdw.sql:CREATE ROLE regress_nosuper NOSUPERUSER;
./contrib/passwordcheck/sql/passwordcheck.sql:CREATE USER regress_passwordcheck_user1;
./contrib/citext/sql/create_index_acl.sql:CREATE ROLE regress_minimal;
./src/test/modules/test_rls_hooks/sql/test_rls_hooks.sql:CREATE ROLE regress_r1;
./src/test/modules/test_rls_hooks/sql/test_rls_hooks.sql:CREATE ROLE regress_s1;
./src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql:CREATE ROLE regress_role_joe;
./src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql:CREATE USER regress_test_user;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_testrol0 SUPERUSER LOGIN;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_testrolx SUPERUSER LOGIN;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_testrol2 SUPERUSER;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_testrol1 SUPERUSER LOGIN IN ROLE regress_testrol2;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_role_haspriv;
./src/test/modules/unsafe_tests/sql/rolenames.sql:CREATE ROLE regress_role_nopriv;
./src/test/modules/unsafe_tests/sql/guc_privs.sql:CREATE ROLE regress_admin SUPERUSER;
./src/test/modules/test_ddl_deparse/sql/alter_function.sql:CREATE ROLE regress_alter_function_role;
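
A list like the one above could also be checked mechanically. The following is
only a minimal sketch, not part of the attached patch: it scans test SQL text
for CREATE ROLE / CREATE USER statements and reports role names created by more
than one file. The function names and the "sql/*.sql" layout are assumptions
made for illustration, and the regex only handles unquoted role names.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches "CREATE ROLE name" / "CREATE USER name" at the start of a line.
# "CREATE USER MAPPING ..." is excluded, since it does not create a role.
# Assumption: test roles are unquoted identifiers, as in the list above.
CREATE_ROLE_RE = re.compile(
    r"^\s*CREATE\s+(?:ROLE|USER)\s+(?!MAPPING\b)([A-Za-z_][A-Za-z0-9_$]*)",
    re.IGNORECASE | re.MULTILINE,
)


def roles_by_file(sql_texts):
    """Map each created role name to the set of files that create it.

    sql_texts is a dict of {file name: SQL text}.
    """
    owners = defaultdict(set)
    for name, text in sql_texts.items():
        for role in CREATE_ROLE_RE.findall(text):
            owners[role.lower()].add(name)
    return owners


def colliding_roles(sql_texts):
    """Return {role: sorted file names} for roles created by several files."""
    return {
        role: sorted(files)
        for role, files in roles_by_file(sql_texts).items()
        if len(files) > 1
    }


def scan_tree(root):
    """Read every sql/*.sql file below root and report colliding roles."""
    texts = {
        str(path): path.read_text(encoding="utf-8", errors="replace")
        for path in Path(root).rglob("sql/*.sql")
    }
    return colliding_roles(texts)
```

Something like scan_tree("contrib") would then list the role names that two or
more test files create. Roles that are unique today but generically named
would still need a manual look.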


BTW, shouldn't src/test/modules/unsafe_tests use the PG_TEST_EXTRA mechanism
somehow?  Seems not great to run it as part of installcheck-world, if we don't
want to run it as part of installcheck.


A second issue I noticed is that advisory_lock.sql often fails, because the
pg_locks queries don't restrict to the current database. Patch attached.
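
To illustrate why the missing restriction matters, here is a small simulation.
It is a sketch under stated assumptions, not the attached patch: sqlite3 stands
in for PostgreSQL so it runs without a server, the table only mimics a few
pg_locks columns, and the OIDs and rows are made up. In PostgreSQL itself the
filter would presumably compare pg_locks.database against the OID of
current_database().

```python
import sqlite3

# sqlite3 stands in for PostgreSQL here so the sketch runs without a server.
# The column names follow pg_locks; the OIDs and rows are made up.
THIS_DB_OID = 16384   # hypothetical OID of the database running the test
OTHER_DB_OID = 16385  # hypothetical OID of a database used by a concurrent test

conn = sqlite3.connect(":memory:")
# "database" is quoted because it is a keyword in SQLite.
conn.execute(
    'CREATE TABLE pg_locks (locktype TEXT, "database" INTEGER, objid INTEGER, mode TEXT)'
)
conn.executemany(
    "INSERT INTO pg_locks VALUES (?, ?, ?, ?)",
    [
        ("advisory", THIS_DB_OID, 1, "ExclusiveLock"),
        ("advisory", OTHER_DB_OID, 1, "ShareLock"),
        ("relation", THIS_DB_OID, 2, "AccessShareLock"),
    ],
)


def count_advisory_locks(conn, database_oid=None):
    """Count advisory locks, optionally restricted to one database."""
    sql = "SELECT count(*) FROM pg_locks WHERE locktype = ?"
    params = ["advisory"]
    if database_oid is not None:
        sql += ' AND "database" = ?'
        params.append(database_oid)
    return conn.execute(sql, params).fetchone()[0]


# Without the filter, the lock held in the other database is counted too.
unfiltered = count_advisory_locks(conn)
# With the filter, only locks in the test's own database are counted.
filtered = count_advisory_locks(conn, THIS_DB_OID)
```

The unfiltered count depends on what other tests happen to be doing at that
moment, which is why the expected output only matches when nothing else runs
concurrently.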

I haven't seen that with autoconf installcheck-world, presumably because of
this:

# There are too many interdependencies between the subdirectories, so
# don't attempt parallel make here.
.NOTPARALLEL:


With those two patches applied, I got through 10 iterations of running all
regress / isolation tests concurrently with meson without failures.

I attached the meson patch as well, but only because I used it to get to
these patches.

Greetings,

Andres Freund
