Thread: Commit 5a2fed911a broke parallel query

Commit 5a2fed911a broke parallel query

From
Laurenz Albe
Date:
This is a script to reproduce the problem:

  CREATE ROLE fluff;
  CREATE ROLE duff LOGIN IN ROLE fluff;
  CREATE DATABASE scratch OWNER duff;
  REVOKE CONNECT, TEMP ON DATABASE scratch FROM PUBLIC;

  \c scratch duff
  CREATE TABLE large AS SELECT id FROM generate_series(1, 500000) AS id;
  GRANT SELECT ON large TO fluff;

  SET ROLE fluff;
  SELECT count(*) FROM large;

Since commit 5a2fed911a, this results in:

  ERROR:  permission denied for database "scratch"
  DETAIL:  User does not have CONNECT privilege.
  CONTEXT:  parallel worker

The following patch makes the problem disappear, but I am far
from certain that using the session user is always correct there:

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d0..150ec3f52f8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
          * and save a few cycles.)
          */
         if (!am_superuser &&
-            object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(),
+            object_aclcheck(DatabaseRelationId, MyDatabaseId, GetSessionUserId(),
                             ACL_CONNECT) != ACLCHECK_OK)
             ereport(FATAL,
                     (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),

Yours,
Laurenz Albe



Re: Commit 5a2fed911a broke parallel query

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> This is a script to reproduce the problem:

As commented in 73c9f91a1,

    This all seems very accidental and probably not the behavior we really
    want, but a security patch is no time to be redesigning things.

Perhaps this thread is a good place to start the discussion.

In the security-team discussion that led up to 73c9f91a1, I'd been
wondering why parallel worker start enforces any connection
restrictions at all.  If we'd like the use of parallelism to be
more-or-less transparent, we shouldn't apply these checks,
and not the !am_superuser ones in InitPostgres either.  If we do
want to apply permissions checks in parallel worker start, why
should we think that the failure in this example is wrong?

            regards, tom lane



Re: Commit 5a2fed911a broke parallel query

From
Tom Lane
Date:
I wrote:
> In the security-team discussion that led up to 73c9f91a1, I'd been
> wondering why parallel worker start enforces any connection
> restrictions at all.  If we'd like the use of parallelism to be
> more-or-less transparent, we shouldn't apply these checks,
> and not the !am_superuser ones in InitPostgres either.

That is, the way I'd prefer to attack this is something along the
lines of the attached, which just disables all those checks in
parallel workers (and reverts 73c9f91a1's hackery on am_superuser).

One big question this raises is whether any other sorts of background
processes should be treated similarly.

Another loose end is that I don't think the enforcement of
datconnlimit is very sane.  CountDBBackends skips processes that say
AmBackgroundWorkerProcess, so logically those should be precisely the
ones that escape enforcement of datconnlimit, but it doesn't quite
work like that today.  This patch at least puts parallel workers on
the right side of the line; but I think we have some cleanup to do
for other sorts of auxiliary processes.

I'm also wondering if we could revisit the need for
CheckMyDatabase's override_allow_connections parameter
and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS.

So all in all, this is just at the POC stage.

            regards, tom lane

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 770ab6906e..bd2b91bdde 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -342,9 +342,17 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
      * a way to recover from disabling all access to all databases, for
      * example "UPDATE pg_database SET datallowconn = false;".
      *
-     * We do not enforce them for autovacuum worker processes either.
+     * We do not enforce them for autovacuum worker processes either, so that
+     * autovacuuming can still happen in all databases.
+     *
+     * We do not enforce them for parallel worker processes either: it's
+     * assumed that the checks applied when the leader started are sufficient.
+     * (To do otherwise would make the use of parallel query less transparent
+     * than we want.)
      */
-    if (IsUnderPostmaster && !AmAutoVacuumWorkerProcess())
+    if (IsUnderPostmaster &&
+        !AmAutoVacuumWorkerProcess() &&
+        !InitializingParallelWorker)
     {
         /*
          * Check that the database is currently allowing connections.
@@ -865,23 +873,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
         {
             InitializeSessionUserId(username, useroid,
                                     (flags & INIT_PG_OVERRIDE_ROLE_LOGIN) != 0);
-
-            /*
-             * In a parallel worker, set am_superuser based on the
-             * authenticated user ID, not the current role.  This is pretty
-             * dubious but it matches our historical behavior.  Note that this
-             * value of am_superuser is used only for connection-privilege
-             * checks here and in CheckMyDatabase (we won't reach
-             * process_startup_options in a background worker).
-             *
-             * In other cases, there's been no opportunity for the current
-             * role to diverge from the authenticated user ID yet, so we can
-             * just rely on superuser() and avoid an extra catalog lookup.
-             */
-            if (InitializingParallelWorker)
-                am_superuser = superuser_arg(GetAuthenticatedUserId());
-            else
-                am_superuser = superuser();
+            am_superuser = superuser();
         }
     }
     else
@@ -909,16 +901,17 @@ InitPostgres(const char *in_dbname, Oid dboid,

     /*
      * The last few connection slots are reserved for superusers and roles
-     * with privileges of pg_use_reserved_connections.  Replication
-     * connections are drawn from slots reserved with max_wal_senders and are
-     * not limited by max_connections, superuser_reserved_connections, or
-     * reserved_connections.
+     * with privileges of pg_use_reserved_connections.
+     *
+     * This does not apply to walsenders (which are instead limited by
+     * max_wal_senders) nor parallel workers (which are instead limited by
+     * max_parallel_workers).
      *
      * Note: At this point, the new backend has already claimed a proc struct,
      * so we must check whether the number of free slots is strictly less than
      * the reserved connection limits.
      */
-    if (!am_superuser && !am_walsender &&
+    if (!am_superuser && !am_walsender && !InitializingParallelWorker &&
         (SuperuserReservedConnections + ReservedConnections) > 0 &&
         !HaveNFreeProcs(SuperuserReservedConnections + ReservedConnections, &nfree))
     {

Re: Commit 5a2fed911a broke parallel query

From
Laurenz Albe
Date:
On Mon, 2024-12-23 at 12:12 -0500, Tom Lane wrote:
> I wrote:
> > In the security-team discussion that led up to 73c9f91a1, I'd been
> > wondering why parallel worker start enforces any connection
> > restrictions at all.  If we'd like the use of parallelism to be
> > more-or-less transparent, we shouldn't apply these checks,
> > and not the !am_superuser ones in InitPostgres either.
>
> That is, the way I'd prefer to attack this is something along the
> lines of the attached, which just disables all those checks in
> parallel workers (and reverts 73c9f91a1's hackery on am_superuser).
>
> One big question this raises is whether any other sorts of background
> processes should be treated similarly.
>
> Another loose end is that I don't think the enforcement of
> datconnlimit is very sane.  CountDBBackends skips processes that say
> AmBackgroundWorkerProcess, so logically those should be precisely the
> ones that escape enforcement of datconnlimit, but it doesn't quite
> work like that today.  This patch at least puts parallel workers on
> the right side of the line; but I think we have some cleanup to do
> for other sorts of auxiliary processes.
>
> I'm also wondering if we could revisit the need for
> CheckMyDatabase's override_allow_connections parameter
> and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS.
> So all in all, this is just at the POC stage.

Thanks for looking at this problem.

I think that disabling the connect privilege check for parallel
workers is the right thing to do.  Getting permission problems just
because PostgreSQL decided to use parallel query doesn't make any sense.

I'm not sure about other background workers, but my guts say that
they shouldn't check for the connect privilege either.

I agree that background workers shouldn't count against a connection
limit, but against "max_worker_processes".

Yours,
Laurenz Albe