Re: Connection limits/permissions, slotsync workers, etc - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Connection limits/permissions, slotsync workers, etc
Date
Msg-id 2546145.1735332710@sss.pgh.pa.us
Whole thread Raw
In response to Re: Connection limits/permissions, slotsync workers, etc  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Also, here's a patch for the rest of what I was talking about.

We'll need to back-patch this given that the CVE-2024-10978 changes
caused these sorts of problems in all branches, but I've not yet
attempted to back-patch.  It looks like it might be a bit painful
thanks to past code churn in these areas.

I didn't do anything about the idea of making rolconnlimit applicable
to superusers.  If we do that at all, it should only be in HEAD.
Also, I got a shade less enthusiastic about it after noting that this
logic is parallel to that for datconnlimit, and it does seems sensible
to allow superusers to ignore datconnlimit.  Maybe it's fine for the
two limits to operate differently, but I'm unsure.

Also, it probably would make sense to rename PGPROC.isBackgroundWorker
to isRegularBackend (inverting the sense of the boolean), but that
doesn't seem like back-patch material either, so I didn't include it
here.  I think we can get away with a subtle adjustment of which
processes that flag is set for in the back branches, but not with
renaming it.

            regards, tom lane

From 84017bf4da912cad44416b729860baf2edfe66fd Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 27 Dec 2024 15:36:31 -0500
Subject: [PATCH v1] Exclude parallel workers from connection privilege/limit
 checks.

Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges.  The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role).  Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized.  We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.

Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached.  Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well.  The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends.  Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.

While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make these checks
using the leader's current role instead of its AuthenticatedUserId,
thus allowing parallel queries to fail after SET ROLE.  That older
behavior was fairly accidental and I have no desire to return to it.

This patch allows reverting 73c9f91a1 which was an emergency hack
to suppress these same checks in some cases.  It wasn't complete,
as shown by a recent bug report from Laurenz Albe.  We can also
revert fd4d93d26 and 492217301, which hacked around the problems in
one regression test.

Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
---
 src/backend/access/transam/parallel.c    | 10 ++++--
 src/backend/access/transam/twophase.c    |  2 +-
 src/backend/postmaster/bgworker.c        |  4 +--
 src/backend/storage/ipc/procarray.c      |  4 +--
 src/backend/storage/lmgr/proc.c          |  4 +--
 src/backend/utils/init/miscinit.c        |  8 ++++-
 src/backend/utils/init/postinit.c        | 40 ++++++++----------------
 src/include/miscadmin.h                  |  1 +
 src/include/storage/proc.h               |  2 +-
 src/test/modules/worker_spi/worker_spi.c | 10 ------
 10 files changed, 37 insertions(+), 48 deletions(-)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 0a1e089ec1..60d95037b3 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1419,10 +1419,16 @@ ParallelWorkerMain(Datum main_arg)
                             fps->session_user_is_superuser);
     SetCurrentRoleId(fps->outer_user_id, fps->role_is_superuser);

-    /* Restore database connection. */
+    /*
+     * Restore database connection.  We skip connection authorization checks,
+     * reasoning that (a) the leader checked these things when it started, and
+     * (b) we do not want parallel mode to cause these failures, because that
+     * would make use of parallel query plans not transparent to applications.
+     */
     BackgroundWorkerInitializeConnectionByOid(fps->database_id,
                                               fps->authenticated_user_id,
-                                              0);
+                                              BGWORKER_BYPASS_ALLOWCONN |
+                                              BGWORKER_BYPASS_ROLELOGINCHECK);

     /*
      * Set the client encoding to the database encoding, since that is what
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 49be1df91c..edb8e06081 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -466,7 +466,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
     proc->databaseId = databaseid;
     proc->roleId = owner;
     proc->tempNamespaceId = InvalidOid;
-    proc->isBackgroundWorker = false;
+    proc->isBackgroundWorker = true;
     proc->lwWaiting = LW_WS_NOT_WAITING;
     proc->lwWaitMode = 0;
     proc->waitLock = NULL;
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..7afe56885c 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -854,7 +854,7 @@ BackgroundWorkerInitializeConnection(const char *dbname, const char *username, u
     BackgroundWorker *worker = MyBgworkerEntry;
     bits32        init_flags = 0; /* never honor session_preload_libraries */

-    /* ignore datallowconn? */
+    /* ignore datallowconn and ACL_CONNECT? */
     if (flags & BGWORKER_BYPASS_ALLOWCONN)
         init_flags |= INIT_PG_OVERRIDE_ALLOW_CONNS;
     /* ignore rolcanlogin? */
@@ -888,7 +888,7 @@ BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid, uint32 flags)
     BackgroundWorker *worker = MyBgworkerEntry;
     bits32        init_flags = 0; /* never honor session_preload_libraries */

-    /* ignore datallowconn? */
+    /* ignore datallowconn and ACL_CONNECT? */
     if (flags & BGWORKER_BYPASS_ALLOWCONN)
         init_flags |= INIT_PG_OVERRIDE_ALLOW_CONNS;
     /* ignore rolcanlogin? */
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index c769b1aa3e..a640b6f5f7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3622,8 +3622,7 @@ CountDBBackends(Oid databaseid)
 }

 /*
- * CountDBConnections --- counts database backends ignoring any background
- *        worker processes
+ * CountDBConnections --- counts database backends (only regular backends)
  */
 int
 CountDBConnections(Oid databaseid)
@@ -3695,6 +3694,7 @@ CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conflictPending)

 /*
  * CountUserBackends --- count backends that are used by specified user
+ * (only regular backends, not any type of background worker)
  */
 int
 CountUserBackends(Oid roleid)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 720ef99ee8..9bce100256 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -427,7 +427,7 @@ InitProcess(void)
     MyProc->databaseId = InvalidOid;
     MyProc->roleId = InvalidOid;
     MyProc->tempNamespaceId = InvalidOid;
-    MyProc->isBackgroundWorker = AmBackgroundWorkerProcess();
+    MyProc->isBackgroundWorker = !AmRegularBackendProcess();
     MyProc->delayChkptFlags = 0;
     MyProc->statusFlags = 0;
     /* NB -- autovac launcher intentionally does not set IS_AUTOVACUUM */
@@ -626,7 +626,7 @@ InitAuxiliaryProcess(void)
     MyProc->databaseId = InvalidOid;
     MyProc->roleId = InvalidOid;
     MyProc->tempNamespaceId = InvalidOid;
-    MyProc->isBackgroundWorker = AmBackgroundWorkerProcess();
+    MyProc->isBackgroundWorker = true;
     MyProc->delayChkptFlags = 0;
     MyProc->statusFlags = 0;
     MyProc->lwWaiting = LW_WS_NOT_WAITING;
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index d24ac133fb..68a917c8fb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -844,6 +844,9 @@ InitializeSessionUserId(const char *rolename, Oid roleid, bool bypass_login_chec
      * These next checks are not enforced when in standalone mode, so that
      * there is a way to recover from sillinesses like "UPDATE pg_authid SET
      * rolcanlogin = false;".
+     *
+     * We also do not enforce rolcanlogin in background workers that set
+     * bypass_login_check.
      */
     if (IsUnderPostmaster)
     {
@@ -857,7 +860,9 @@ InitializeSessionUserId(const char *rolename, Oid roleid, bool bypass_login_chec
                             rname)));

         /*
-         * Check connection limit for this role.
+         * Check connection limit for this role.  We enforce the limit only
+         * for regular backends, since other process types have their own
+         * PGPROC pools.
          *
          * There is a race condition here --- we create our PGPROC before
          * checking for other PGPROCs.  If two backends did this at about the
@@ -867,6 +872,7 @@ InitializeSessionUserId(const char *rolename, Oid roleid, bool bypass_login_chec
          * just document that the connection limit is approximate.
          */
         if (rform->rolconnlimit >= 0 &&
+            AmRegularBackendProcess() &&
             !is_superuser &&
             CountUserBackends(roleid) > rform->rolconnlimit)
             ereport(FATAL,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 770ab6906e..af3b030741 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -22,7 +22,6 @@
 #include "access/genam.h"
 #include "access/heapam.h"
 #include "access/htup_details.h"
-#include "access/parallel.h"
 #include "access/session.h"
 #include "access/tableam.h"
 #include "access/xact.h"
@@ -342,7 +341,8 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
      * a way to recover from disabling all access to all databases, for
      * example "UPDATE pg_database SET datallowconn = false;".
      *
-     * We do not enforce them for autovacuum worker processes either.
+     * We do not enforce them for autovacuum worker processes either, nor for
+     * background workers that set override_allow_connections.
      */
     if (IsUnderPostmaster && !AmAutoVacuumWorkerProcess())
     {
@@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
          * is redundant, but since we have the flag, might as well check it
          * and save a few cycles.)
          */
-        if (!am_superuser &&
+        if (!am_superuser && !override_allow_connections &&
             object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(),
                             ACL_CONNECT) != ACLCHECK_OK)
             ereport(FATAL,
@@ -369,7 +369,9 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
                      errdetail("User does not have CONNECT privilege.")));

         /*
-         * Check connection limit for this database.
+         * Check connection limit for this database.  We enforce the limit
+         * only for regular backends, since other process types have their own
+         * PGPROC pools.
          *
          * There is a race condition here --- we create our PGPROC before
          * checking for other PGPROCs.  If two backends did this at about the
@@ -379,6 +381,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
          * just document that the connection limit is approximate.
          */
         if (dbform->datconnlimit >= 0 &&
+            AmRegularBackendProcess() &&
             !am_superuser &&
             CountDBConnections(MyDatabaseId) > dbform->datconnlimit)
             ereport(FATAL,
@@ -865,23 +868,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
         {
             InitializeSessionUserId(username, useroid,
                                     (flags & INIT_PG_OVERRIDE_ROLE_LOGIN) != 0);
-
-            /*
-             * In a parallel worker, set am_superuser based on the
-             * authenticated user ID, not the current role.  This is pretty
-             * dubious but it matches our historical behavior.  Note that this
-             * value of am_superuser is used only for connection-privilege
-             * checks here and in CheckMyDatabase (we won't reach
-             * process_startup_options in a background worker).
-             *
-             * In other cases, there's been no opportunity for the current
-             * role to diverge from the authenticated user ID yet, so we can
-             * just rely on superuser() and avoid an extra catalog lookup.
-             */
-            if (InitializingParallelWorker)
-                am_superuser = superuser_arg(GetAuthenticatedUserId());
-            else
-                am_superuser = superuser();
+            am_superuser = superuser();
         }
     }
     else
@@ -908,17 +895,16 @@ InitPostgres(const char *in_dbname, Oid dboid,
     }

     /*
-     * The last few connection slots are reserved for superusers and roles
-     * with privileges of pg_use_reserved_connections.  Replication
-     * connections are drawn from slots reserved with max_wal_senders and are
-     * not limited by max_connections, superuser_reserved_connections, or
-     * reserved_connections.
+     * The last few regular connection slots are reserved for superusers and
+     * roles with privileges of pg_use_reserved_connections.  We do not apply
+     * these limits to background processes, since they all have their own
+     * pools of PGPROC slots.
      *
      * Note: At this point, the new backend has already claimed a proc struct,
      * so we must check whether the number of free slots is strictly less than
      * the reserved connection limits.
      */
-    if (!am_superuser && !am_walsender &&
+    if (AmRegularBackendProcess() && !am_superuser &&
         (SuperuserReservedConnections + ReservedConnections) > 0 &&
         !HaveNFreeProcs(SuperuserReservedConnections + ReservedConnections, &nfree))
     {
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 3f97fcef80..4e3f5fbbca 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -375,6 +375,7 @@ typedef enum BackendType

 extern PGDLLIMPORT BackendType MyBackendType;

+#define AmRegularBackendProcess()    (MyBackendType == B_BACKEND)
 #define AmAutoVacuumLauncherProcess() (MyBackendType == B_AUTOVAC_LAUNCHER)
 #define AmAutoVacuumWorkerProcess()    (MyBackendType == B_AUTOVAC_WORKER)
 #define AmBackgroundWorkerProcess() (MyBackendType == B_BG_WORKER)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 5a3dd5d2d4..9dba393da3 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -216,7 +216,7 @@ struct PGPROC
     Oid            tempNamespaceId;    /* OID of temp schema this backend is
                                      * using */

-    bool        isBackgroundWorker; /* true if background worker. */
+    bool        isBackgroundWorker; /* true if not a regular backend. */

     /*
      * While in hot standby mode, shows that a conflict signal has been sent
diff --git a/src/test/modules/worker_spi/worker_spi.c b/src/test/modules/worker_spi/worker_spi.c
index d4403b24d9..cf5b7505ec 100644
--- a/src/test/modules/worker_spi/worker_spi.c
+++ b/src/test/modules/worker_spi/worker_spi.c
@@ -169,16 +169,6 @@ worker_spi_main(Datum main_arg)
         BackgroundWorkerInitializeConnection(worker_spi_database,
                                              worker_spi_role, flags);

-    /*
-     * Disable parallel query for workers started with
-     * BGWORKER_BYPASS_ALLOWCONN or BGWORKER_BYPASS_ROLELOGINCHECK so as these
-     * don't attempt connections using a database or a role that may not allow
-     * that.
-     */
-    if ((flags & (BGWORKER_BYPASS_ALLOWCONN | BGWORKER_BYPASS_ROLELOGINCHECK)))
-        SetConfigOption("max_parallel_workers_per_gather", "0",
-                        PGC_USERSET, PGC_S_OVERRIDE);
-
     elog(LOG, "%s initialized with %s.%s",
          MyBgworkerEntry->bgw_name, table->schema, table->name);
     initialize_worker_spi(table);
--
2.43.5


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [PATCHES] Post-special page storage TDE support
Next
From: David Christensen
Date:
Subject: Re: [PATCHES] Post-special page storage TDE support