Thread: Show dropped users' backends in pg_stat_activity
I was looking into some issues we recently had when dropping db users and was surprised to see that dropped users' sessions and transactions continue to work after the role is dropped. Since dropping a role requires dropping all grants it has (using DROP OWNED BY ...) the dropped role can't start new transactions that do a whole lot unless there are objects with access granted to PUBLIC, but any running transactions remain running and can write to the database. They can also hold locks which interfere with other backends without showing up in most activity or lock monitoring tools as they won't appear in pg_stat_activity. IMO any open sessions for a dropped user should be automatically terminated when the role is dropped, but that would probably be a bigger change so attached a proposed patch for using left joins in pg_stat_activity and pg_stat_replication to show activity by dropped roles. / Oskari
Attachment
<div dir="ltr">On Tue, Mar 15, 2016 at 5:21 PM, Oskari Saarenmaa <<a href="mailto:os@ohmu.fi">os@ohmu.fi</a>> wrote:<br/>> I was looking into some issues we recently had when dropping db users and<br />> was surprised to seethat dropped users' sessions and transactions continue<br />> to work after the role is dropped.<br />><br />>Since dropping a role requires dropping all grants it has (using DROP OWNED<br />> BY ...) the dropped role can'tstart new transactions that do a whole lot<br />> unless there are objects with access granted to PUBLIC, but anyrunning<br />> transactions remain running and can write to the database. They can also<br />> hold locks whichinterfere with other backends without showing up in most<br />> activity or lock monitoring tools as they won't appearin pg_stat_activity.<br />><br />> IMO any open sessions for a dropped user should be automatically terminated<br/>> when the role is dropped, but that would probably be a bigger change so<br />> attached a proposedpatch for using left joins in pg_stat_activity and<br />> pg_stat_replication to show activity by dropped roles.<br/><br />Gee, I would have expected the DROP to be blocked until the user disconnected, like we do for DROP DATABASE.<br/><br />-- <br />Robert Haas<br />EnterpriseDB: <a href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br/>The Enterprise PostgreSQL Company<br /></div>
Robert Haas <robertmhaas@gmail.com> writes: > Gee, I would have expected the DROP to be blocked until the user > disconnected, like we do for DROP DATABASE. Making that race-condition-free would require some notion of a lock on roles, I think. Seems pretty messy compared to the amount of actual value obtained. There are good reasons why you can't have a backend running in a nonexistent database; but a backend with a nonexistent user OID is not really going to be a problem for anything except monitoring queries that fail to use left joins where appropriate. Even if we maintained some interlock for a backend's login role identity, I hardly think it would be practical to e.g. lock during transient SET ROLE or security-definer-function-call operations. So it's not like we can let the permissions system assume that a role OID being inquired about always matches a live entry in pg_authid. regards, tom lane
16.03.2016, 17:48, Tom Lane kirjoitti: > Robert Haas <robertmhaas@gmail.com> writes: >> Gee, I would have expected the DROP to be blocked until the user >> disconnected, like we do for DROP DATABASE. > > Making that race-condition-free would require some notion of a lock on > roles, I think. Seems pretty messy compared to the amount of actual > value obtained. There are good reasons why you can't have a backend > running in a nonexistent database; but a backend with a nonexistent > user OID is not really going to be a problem for anything except > monitoring queries that fail to use left joins where appropriate. I don't think most people expect dropped users to be able to keep using the database. If it's not feasible to block DROP ROLE until the user has disconnected or to kill dropped users' sessions immediately after they're dropped we should at least show their sessions in pg_stat_activity and add a note about it in DROP ROLE docs. / Oskari
I had the same problem and thought similar thing. At Wed, 16 Mar 2016 11:48:10 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <16068.1458143290@sss.pgh.pa.us> > Robert Haas <robertmhaas@gmail.com> writes: > > Gee, I would have expected the DROP to be blocked until the user > > disconnected, like we do for DROP DATABASE. FWTW, I agree with Robert. > Making that race-condition-free would require some notion of a lock on > roles, I think. Seems pretty messy compared to the amount of actual > value obtained. There are good reasons why you can't have a backend > running in a nonexistent database; but a backend with a nonexistent > user OID is not really going to be a problem for anything except > monitoring queries that fail to use left joins where appropriate. > > Even if we maintained some interlock for a backend's login role identity, > I hardly think it would be practical to e.g. lock during transient SET > ROLE or security-definer-function-call operations. So it's not like we > can let the permissions system assume that a role OID being inquired about > always matches a live entry in pg_authid. Even if blocking DROPs is not perfect for all cases, unconditionally allowing to DROP a role still doesn't seem proper behavior, especially for replication roles. And session logins seem to me to have enough reason to be treated differently than disguising as another role using SET ROLE or sec-definer. The attached patch blocks DROP ROLE for roles that own active sessions, and on the other hand prevents a session from being activated if the login role is concurrently dropped. Oskari's LEFT-Join patch is still desirable. Is this still pointless? regards, diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c index 4baeaa2..52ac271 100644 --- a/src/backend/commands/user.c +++ b/src/backend/commands/user.c @@ -31,6 +31,7 @@#include "libpq/md5.h"#include "miscadmin.h"#include "storage/lmgr.h" +#include "storage/procarray.h"#include "utils/acl.h"#include "utils/builtins.h"#include "utils/fmgroids.h" @@ -1026,6 +1027,12 @@ DropRole(DropRoleStmt *stmt) errdetail_internal("%s", detail), errdetail_log("%s", detail_log))); + /* If this role is currently connecting, refuse to drop it. */ + if (BackendRoleProcExists(roleid)) + ereport(ERROR, + (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST), + errmsg("role \"%s\" is currently logged in", role))); + /* * Remove the role from the pg_authid table */ diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 740beb6..ad208d7 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -2358,6 +2358,54 @@ BackendPidGetProcWithLock(int pid)}/* + * BackendRoleProcExists -- check if a backend with given its role exists + * + * Returns true if found. + */ +bool +BackendRoleProcExists(Oid roleoid) +{ + bool result; + + if (roleoid == 0) /* No match with role id 0 */ + return false; + + LWLockAcquire(ProcArrayLock, LW_SHARED); + + result = BackendRoleProcExistsWithLock(roleoid); + + LWLockRelease(ProcArrayLock); + + return result; +} + +/* + * BackendRoleProcExistsWithLock -- check if a backend with given its role + * exists + * + * Same as above, except caller must be holding ProcArrayLock. + */ +bool +BackendRoleProcExistsWithLock(Oid roleoid) +{ + ProcArrayStruct *arrayP = procArray; + int index; + + if (roleoid == 0) /* No match with role id 0 */ + return false; + + for (index = 0; index < arrayP->numProcs; index++) + { + PGPROC *proc = &allProcs[arrayP->pgprocnos[index]]; + + if (proc->roleId == roleoid) + return true; + } + + return false; +} + +/* * BackendXidGetPid -- get a backend's pid given its XID * * Returns 0 if not found or it's a prepared transaction. Notethat diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c index d13355b..f115c43 100644 --- a/src/backend/utils/init/miscinit.c +++ b/src/backend/utils/init/miscinit.c @@ -41,6 +41,7 @@#include "storage/fd.h"#include "storage/ipc.h"#include "storage/latch.h" +#include "storage/lmgr.h"#include "storage/pg_shmem.h"#include "storage/proc.h"#include "storage/procarray.h" @@ -500,20 +501,24 @@ InitializeSessionUserId(const char *rolename, Oid roleid) ereport(FATAL, (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), errmsg("role \"%s\" does not exist", rolename))); - } - else - { - roleTup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid)); - if (!HeapTupleIsValid(roleTup)) - ereport(FATAL, - (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), - errmsg("role with OID %u does not exist", roleid))); + roleid = HeapTupleGetOid(roleTup); + ReleaseSysCache(roleTup); } + /* + * Inhibiting this session from being activated for concurrently dropped + * roles + */ + LockSharedObject(AuthIdRelationId, roleid, 0, AccessShareLock); + + roleTup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid)); + if (!HeapTupleIsValid(roleTup)) + ereport(FATAL, + (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("role with OID %u does not exist", roleid))); + rform = (Form_pg_authid) GETSTRUCT(roleTup); - roleid = HeapTupleGetOid(roleTup); rname = NameStr(rform->rolname); - AuthenticatedUserId = roleid; AuthenticatedUserIsSuperuser = rform->rolsuper; @@ -524,6 +529,8 @@ InitializeSessionUserId(const char *rolename, Oid roleid) /* (We assume this is an atomic store sono lock is needed) */ MyProc->roleId = roleid; + UnlockSharedObject(AuthIdRelationId, roleid, 0, AccessShareLock); + /* * These next checks are not enforced when in standalone mode, so that * there is a way to recover from sillinesseslike "UPDATE pg_authid SET diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index dd37c0c..8c69d6f 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -62,6 +62,8 @@ extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxidsextern PGPROC *BackendPidGetProc(intpid);extern PGPROC *BackendPidGetProcWithLock(int pid); +extern bool BackendRoleProcExists(Oid roleoid); +extern bool BackendRoleProcExistsWithLock(Oid roleoid);extern int BackendXidGetPid(TransactionId xid);extern boolIsBackendPid(int pid);
On 3/22/16 10:35 PM, Kyotaro HORIGUCHI wrote: >> Even if we maintained some interlock for a backend's login role identity, >> >I hardly think it would be practical to e.g. lock during transient SET >> >ROLE or security-definer-function-call operations. So it's not like we >> >can let the permissions system assume that a role OID being inquired about >> >always matches a live entry in pg_authid. > Even if blocking DROPs is not perfect for all cases, > unconditionally allowing to DROP a role still doesn't seem proper > behavior, especially for replication roles. And session logins > seem to me to have enough reason to be treated differently than > disguising as another role using SET ROLE or sec-definer. There's probably a way this could be handled, since DROP ROLE is presumably a very uncommon operation. Perhaps something as simple as keeping a single OID in shared memory for the role about to be dropped. That would serialize role drops, but I doubt that matters. > The attached patch blocks DROP ROLE for roles that own active > sessions, and on the other hand prevents a session from being > activated if the login role is concurrently dropped. I think this is fine for now, but... what happens if you drop a role that's in use on a streaming replica? Does replay stall or do we just ignore it? There should probably be some doc changes to go with the patch too, no? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Hi, At Tue, 22 Mar 2016 22:47:16 -0500, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote in <56F211C4.6010103@BlueTreble.com> > On 3/22/16 10:35 PM, Kyotaro HORIGUCHI wrote: > >> Even if we maintained some interlock for a backend's login role > >> identity, > >> >I hardly think it would be practical to e.g. lock during transient SET > >> >ROLE or security-definer-function-call operations. So it's not like > >> >we > >> >can let the permissions system assume that a role OID being inquired > >> >about > >> >always matches a live entry in pg_authid. > > Even if blocking DROPs is not perfect for all cases, > > unconditionally allowing to DROP a role still doesn't seem proper > > behavior, especially for replication roles. And session logins > > seem to me to have enough reason to be treated differently than > > disguising as another role using SET ROLE or sec-definer. > > There's probably a way this could be handled, since DROP ROLE is > presumably a very uncommon operation. Perhaps something as simple as > keeping a single OID in shared memory for the role about to be > dropped. That would serialize role drops, but I doubt that matters. The OID in shared memory has the same role with a tuple with the OID in pg_authid in this patch. So it seems need a lock or a retry mechanism, or we see a message something like this:p | DROP ROLE: Another role is concurrently being dropped. > > The attached patch blocks DROP ROLE for roles that own active > > sessions, and on the other hand prevents a session from being > > activated if the login role is concurrently dropped. > > I think this is fine for now, but... what happens if you drop a role > that's in use on a streaming replica? Does replay stall or do we just > ignore it? It behaves as the same to the ordinary backends. DROP ROLE fails for any active walsender's session(?) role, or a new walsender rejects login attempts by the role under being dropped. > There should probably be some doc changes to go with the patch too, > no? Yes, this is a PoC. I'll provide documentation if this is acceptable, and necessary. "20.4 Dropping Roles" would be appropriate? http://www.postgresql.org/docs/9.5/static/role-removal.html Treating a session as an object dependent on the role could be cleaner but may be too complex and fragile.. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Mar 22, 2016 at 11:35 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > Even if blocking DROPs is not perfect for all cases, > unconditionally allowing to DROP a role still doesn't seem proper > behavior, especially for replication roles. And session logins > seem to me to have enough reason to be treated differently than > disguising as another role using SET ROLE or sec-definer. > > The attached patch blocks DROP ROLE for roles that own active > sessions, and on the other hand prevents a session from being > activated if the login role is concurrently dropped. > > Oskari's LEFT-Join patch is still desirable. > > Is this still pointless? I am not really in favor of half-fixing this. If we can't conveniently wait until a dropped role is completely out of the system, then I don't see a lot of point in trying to do it in the limited cases where we can. If LEFT JOIN is the way to go, then, blech, but, so be it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I am not really in favor of half-fixing this. If we can't > conveniently wait until a dropped role is completely out of the > system, then I don't see a lot of point in trying to do it in the > limited cases where we can. If LEFT JOIN is the way to go, then, > blech, but, so be it. I concur. Let's put the left join(s) into those views and call it good. BTW, I think we would need the left joins even if we had interlocking in DROP, just to protect ourselves against race conditions. Remember that what pg_stat_activity shows is a snapshot, which might be more or less out of date compared to the catalog contents. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > I am not really in favor of half-fixing this. If we can't > > conveniently wait until a dropped role is completely out of the > > system, then I don't see a lot of point in trying to do it in the > > limited cases where we can. If LEFT JOIN is the way to go, then, > > blech, but, so be it. > > I concur. Let's put the left join(s) into those views and call it > good. I'd suggest we also add some notes to the documentation that the correct approach to dropping users is to disallow access first, then kill any existing backends, and then drop the user. That, plus the left joins, seems like it's good enough. > BTW, I think we would need the left joins even if we had interlocking > in DROP, just to protect ourselves against race conditions. Remember > that what pg_stat_activity shows is a snapshot, which might be more or > less out of date compared to the catalog contents. True, though that would likely be a much smaller set of cases that might also be short lived. Might be good to also note in the docs how to kill off sessions which are regular users but which no longer have a username, for folks who end up in this situation that they managed to drop a role which still had connections to the system. Thanks! Stephen
24.03.2016, 18:03, Tom Lane kirjoitti: > Robert Haas <robertmhaas@gmail.com> writes: >> I am not really in favor of half-fixing this. If we can't >> conveniently wait until a dropped role is completely out of the >> system, then I don't see a lot of point in trying to do it in the >> limited cases where we can. If LEFT JOIN is the way to go, then, >> blech, but, so be it. > > I concur. Let's put the left join(s) into those views and call it > good. > > BTW, I think we would need the left joins even if we had interlocking > in DROP, just to protect ourselves against race conditions. Remember > that what pg_stat_activity shows is a snapshot, which might be more or > less out of date compared to the catalog contents. Added my patch to the 2016-09 commitfest (https://commitfest.postgresql.org/10/601/) as a bug fix as I thought not showing all backends in pg_stat_activity is a bug. Any chance to get it in 9.6? -- Oskari Saarenmaa Aiven: managed cloud databases https://aiven.io
Oskari Saarenmaa <os@aiven.io> writes: > 24.03.2016, 18:03, Tom Lane kirjoitti: >> I concur. Let's put the left join(s) into those views and call it >> good. > Added my patch to the 2016-09 commitfest > (https://commitfest.postgresql.org/10/601/) as a bug fix as I thought > not showing all backends in pg_stat_activity is a bug. Any chance to > get it in 9.6? I do not think this is a sufficiently high-priority bug to justify putting it into 9.6 at this point. It's been like this for years and there's a noticeable lack of any field complaints. Fixing it in the next release cycle seems fine to me. regards, tom lane
On Thu, Apr 7, 2016 at 11:22:08PM +0300, Oskari Saarenmaa wrote: > 24.03.2016, 18:03, Tom Lane kirjoitti: > >Robert Haas <robertmhaas@gmail.com> writes: > >>I am not really in favor of half-fixing this. If we can't > >>conveniently wait until a dropped role is completely out of the > >>system, then I don't see a lot of point in trying to do it in the > >>limited cases where we can. If LEFT JOIN is the way to go, then, > >>blech, but, so be it. > > > >I concur. Let's put the left join(s) into those views and call it > >good. > > > >BTW, I think we would need the left joins even if we had interlocking > >in DROP, just to protect ourselves against race conditions. Remember > >that what pg_stat_activity shows is a snapshot, which might be more or > >less out of date compared to the catalog contents. > > Added my patch to the 2016-09 commitfest > (https://commitfest.postgresql.org/10/601/) as a bug fix as I thought not > showing all backends in pg_stat_activity is a bug. Any chance to get it in > 9.6? Do we need a comment in the query explaining why a left join is needed, e.g. "Use LEFT JOIN in case the role has been dropped"? That wouldn't be obvious to me. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> I concur. Let's put the left join(s) into those views and call it >> good. > I'd suggest we also add some notes to the documentation that the correct > approach to dropping users is to disallow access first, then kill any > existing backends, and then drop the user. That, plus the left joins, > seems like it's good enough. I've pushed Oskari's original patch; a look around in system_views.sql shows that it's justified on the basis of consistency with other views even aside from the functional problem he ran into. The documentation question seems like material for a separate patch. regards, tom lane