Thread: Commit 5a2fed911a broke parallel query
This is a script to reproduce the problem: CREATE ROLE fluff; CREATE ROLE duff LOGIN IN ROLE fluff; CREATE DATABASE scratch OWNER duff; REVOKE CONNECT, TEMP ON DATABASE scratch FROM PUBLIC; \c scratch duff CREATE TABLE large AS SELECT id FROM generate_series(1, 500000) AS id; GRANT SELECT ON large TO fluff; SET ROLE fluff; SELECT count(*) FROM large; Since commit 5a2fed911a, this results in: ERROR: permission denied for database "scratch" DETAIL: User does not have CONNECT privilege. CONTEXT: parallel worker The following patch makes the problem disappear, but I am far from certain that using the session user is always correct there: diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index a024b1151d0..150ec3f52f8 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect * and save a few cycles.) */ if (!am_superuser && - object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(), + object_aclcheck(DatabaseRelationId, MyDatabaseId, GetSessionUserId(), ACL_CONNECT) != ACLCHECK_OK) ereport(FATAL, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), Yours, Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes: > This is a script to reproduce the problem: As commented in 73c9f91a1, This all seems very accidental and probably not the behavior we really want, but a security patch is no time to be redesigning things. Perhaps this thread is a good place to start the discussion. In the security-team discussion that led up to 73c9f91a1, I'd been wondering why parallel worker start enforces any connection restrictions at all. If we'd like the use of parallelism to be more-or-less transparent, we shouldn't apply these checks, and not the !am_superuser ones in InitPostgres either. If we do want to apply permissions checks in parallel worker start, why should we think that the failure in this example is wrong? regards, tom lane
I wrote: > In the security-team discussion that led up to 73c9f91a1, I'd been > wondering why parallel worker start enforces any connection > restrictions at all. If we'd like the use of parallelism to be > more-or-less transparent, we shouldn't apply these checks, > and not the !am_superuser ones in InitPostgres either. That is, the way I'd prefer to attack this is something along the lines of the attached, which just disables all those checks in parallel workers (and reverts 73c9f91a1's hackery on am_superuser). One big question this raises is whether any other sorts of background processes should be treated similarly. Another loose end is that I don't think the enforcement of datconnlimit is very sane. CountDBBackends skips processes that say AmBackgroundWorkerProcess, so logically those should be precisely the ones that escape enforcement of datconnlimit, but it doesn't quite work like that today. This patch at least puts parallel workers on the right side of the line; but I think we have some cleanup to do for other sorts of auxiliary processes. I'm also wondering if we could revisit the need for CheckMyDatabase's override_allow_connections parameter and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS. So all in all, this is just at the POC stage. regards, tom lane diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index 770ab6906e..bd2b91bdde 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -342,9 +342,17 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect * a way to recover from disabling all access to all databases, for * example "UPDATE pg_database SET datallowconn = false;". * - * We do not enforce them for autovacuum worker processes either. + * We do not enforce them for autovacuum worker processes either, so that + * autovacuuming can still happen in all databases. + * + * We do not enforce them for parallel worker processes either: it's + * assumed that the checks applied when the leader started are sufficient. + * (To do otherwise would make the use of parallel query less transparent + * than we want.) */ - if (IsUnderPostmaster && !AmAutoVacuumWorkerProcess()) + if (IsUnderPostmaster && + !AmAutoVacuumWorkerProcess() && + !InitializingParallelWorker) { /* * Check that the database is currently allowing connections. @@ -865,23 +873,7 @@ InitPostgres(const char *in_dbname, Oid dboid, { InitializeSessionUserId(username, useroid, (flags & INIT_PG_OVERRIDE_ROLE_LOGIN) != 0); - - /* - * In a parallel worker, set am_superuser based on the - * authenticated user ID, not the current role. This is pretty - * dubious but it matches our historical behavior. Note that this - * value of am_superuser is used only for connection-privilege - * checks here and in CheckMyDatabase (we won't reach - * process_startup_options in a background worker). - * - * In other cases, there's been no opportunity for the current - * role to diverge from the authenticated user ID yet, so we can - * just rely on superuser() and avoid an extra catalog lookup. - */ - if (InitializingParallelWorker) - am_superuser = superuser_arg(GetAuthenticatedUserId()); - else - am_superuser = superuser(); + am_superuser = superuser(); } } else @@ -909,16 +901,17 @@ InitPostgres(const char *in_dbname, Oid dboid, /* * The last few connection slots are reserved for superusers and roles - * with privileges of pg_use_reserved_connections. Replication - * connections are drawn from slots reserved with max_wal_senders and are - * not limited by max_connections, superuser_reserved_connections, or - * reserved_connections. + * with privileges of pg_use_reserved_connections. + * + * This does not apply to walsenders (which are instead limited by + * max_wal_senders) nor parallel workers (which are instead limited by + * max_parallel_workers). * * Note: At this point, the new backend has already claimed a proc struct, * so we must check whether the number of free slots is strictly less than * the reserved connection limits. */ - if (!am_superuser && !am_walsender && + if (!am_superuser && !am_walsender && !InitializingParallelWorker && (SuperuserReservedConnections + ReservedConnections) > 0 && !HaveNFreeProcs(SuperuserReservedConnections + ReservedConnections, &nfree)) {
On Mon, 2024-12-23 at 12:12 -0500, Tom Lane wrote: > I wrote: > > In the security-team discussion that led up to 73c9f91a1, I'd been > > wondering why parallel worker start enforces any connection > > restrictions at all. If we'd like the use of parallelism to be > > more-or-less transparent, we shouldn't apply these checks, > > and not the !am_superuser ones in InitPostgres either. > > That is, the way I'd prefer to attack this is something along the > lines of the attached, which just disables all those checks in > parallel workers (and reverts 73c9f91a1's hackery on am_superuser). > > One big question this raises is whether any other sorts of background > processes should be treated similarly. > > Another loose end is that I don't think the enforcement of > datconnlimit is very sane. CountDBBackends skips processes that say > AmBackgroundWorkerProcess, so logically those should be precisely the > ones that escape enforcement of datconnlimit, but it doesn't quite > work like that today. This patch at least puts parallel workers on > the right side of the line; but I think we have some cleanup to do > for other sorts of auxiliary processes. > > I'm also wondering if we could revisit the need for > CheckMyDatabase's override_allow_connections parameter > and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS. > So all in all, this is just at the POC stage. Thanks for looking at this problem. I think that disabling the connect privilege check for parallel workers is the right thing to do. Getting permission problems just because PostgreSQL decided to use parallel query doesn't make any sense. I'm not sure about other background workers, but my guts say that they shouldn't check for the connect privilege either. I agree that background workers shouldn't count against a connection limit, but against "max_worker_processes". Yours, Laurenz Albe
Hello, just a notice that this sounds like the symptom already described in https://www.postgresql.org/message-id/c7896c19adb646e889d5b2e40fdd17c1@ldbv.bayern.de Regards, Christian -----Ursprüngliche Nachricht----- Von: Laurenz Albe <laurenz.albe@cybertec.at> Gesendet: Montag, 23. Dezember 2024 11:55 An: pgsql-bugs@lists.postgresql.org Betreff: Commit 5a2fed911a broke parallel query This is a script to reproduce the problem: CREATE ROLE fluff; CREATE ROLE duff LOGIN IN ROLE fluff; CREATE DATABASE scratch OWNER duff; REVOKE CONNECT, TEMP ON DATABASE scratch FROM PUBLIC; \c scratch duff CREATE TABLE large AS SELECT id FROM generate_series(1, 500000) AS id; GRANT SELECT ON large TO fluff; SET ROLE fluff; SELECT count(*) FROM large; Since commit 5a2fed911a, this results in: ERROR: permission denied for database "scratch" DETAIL: User does not have CONNECT privilege. CONTEXT: parallel worker The following patch makes the problem disappear, but I am far from certain that using the session user is always correct there: diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index a024b1151d0..150ec3f52f8 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect * and save a few cycles.) */ if (!am_superuser && - object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(), + object_aclcheck(DatabaseRelationId, MyDatabaseId, GetSessionUserId(), ACL_CONNECT) != ACLCHECK_OK) ereport(FATAL, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), Yours, Laurenz Albe
"Rank, Christian (LfL)" <Christian.Rank@lfl.bayern.de> writes: > just a notice that this sounds like the symptom already described in > https://www.postgresql.org/message-id/c7896c19adb646e889d5b2e40fdd17c1@ldbv.bayern.de Indeed. My apologies for having not gotten back to you in that thread. There is a committed fix now: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=41a252c2ca68ba7239e73abd99fdbf28871f5f07 (caution: the fix is nontrivially different in each branch; that one is for v16) regards, tom lane