Thread: Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED
Stephen Frost <sfrost@snowman.net> writes: > Here's a proof-of-concept pretty much untested (it compiles) patch > against HEAD for review of the general approach I'm taking to > merging pg_shadow and pg_group. This is in order to support group > ownership and eventually roles. This patch includes my grammar and > get_grosysid move patches, and so conflicts with them. One point is that you can't simply whack pg_shadow around and eliminate pg_group, because that will break lord-knows-how-much client software that looks at these tables. What I'm envisioning is to create a new system catalog (say pg_role) that holds the New Truth, and then make pg_shadow and pg_group be predefined views on this catalog that provide as much backwards compatibility as we can manage. I believe this was done once before already --- I think that the pg_user view exists to emulate a prior incarnation of pg_shadow. A related point is that I hope soon to get rid of type AclId and usesysid/grosysid/rolesysid and start identifying roles by Oids. This is connected to Alvaro's work to create proper dependencies for object owners and privilege entries: once that exists and you can't drop a referenced role, there will be no need to allow explicit setting of the SYSID for a new user. Not sure if you want to do any of the associated changes in your patch, but if int4 is bugging you then feel free to change it. regards, tom lane
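A minimal sketch of the compatibility-view idea described above, not code from the thread. It uses SQLite (through Python's standard `sqlite3` module) as a stand-in for the real system catalogs. Only the names `pg_role`, `pg_shadow`, `pg_group`, `usesysid` and `grosysid` come from the discussion; the reduced column set and the hypothetical `rolcanlogin` flag used to split users from groups are assumptions made for illustration.

```python
import sqlite3

# Stand-in database; a real implementation would live in the system catalogs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pg_role (
    oid         INTEGER PRIMARY KEY,
    rolname     TEXT NOT NULL UNIQUE,
    rolsuper    INTEGER NOT NULL DEFAULT 0,
    rolcreatedb INTEGER NOT NULL DEFAULT 0,
    rolcanlogin INTEGER NOT NULL DEFAULT 0   -- assumed flag, see lead-in
);

-- Old-style user catalog, emulated as a view: roles that can log in.
CREATE VIEW pg_shadow AS
    SELECT rolname AS usename, oid AS usesysid,
           rolcreatedb AS usecreatedb, rolsuper AS usesuper
    FROM pg_role WHERE rolcanlogin = 1;

-- Old-style group catalog, emulated as a view: roles that cannot log in.
CREATE VIEW pg_group AS
    SELECT rolname AS groname, oid AS grosysid
    FROM pg_role WHERE rolcanlogin = 0;
""")
conn.executemany(
    "INSERT INTO pg_role (rolname, rolsuper, rolcreatedb, rolcanlogin) "
    "VALUES (?, ?, ?, ?)",
    [("alice", 0, 1, 1), ("staff", 0, 0, 0)],
)


def legacy_users(conn):
    """Names a client would see when it queries the old pg_shadow."""
    return [r[0] for r in conn.execute("SELECT usename FROM pg_shadow ORDER BY usename")]


def legacy_groups(conn):
    """Names a client would see when it queries the old pg_group."""
    return [r[0] for r in conn.execute("SELECT groname FROM pg_group ORDER BY groname")]
```

Client software that only reads `pg_shadow` or `pg_group` keeps working against the views, while all writes go to the single underlying table.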
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Here's a proof-of-concept pretty much untested (it compiles) patch > > against HEAD for review of the general approach I'm taking to > > merging pg_shadow and pg_group. This is in order to support group > > ownership and eventually roles. This patch includes my grammar and > > get_grosysid move patches, and so conflicts with them. > > One point is that you can't simply whack pg_shadow around and eliminate > pg_group, because that will break lord-knows-how-much client software > that looks at these tables. What I'm envisioning is to create a new > system catalog (say pg_role) that holds the New Truth, and then make > pg_shadow and pg_group be predefined views on this catalog that provide > as much backwards compatibility as we can manage. Ok. Can I get some help defining what the New Truth will look like then? I understand users and groups pretty well but I'm not 100% sure about roles. Is it as simple as what my changed pg_shadow looks like? What's the difference between a role, a user and a group? Can a role log in/have a password? Can a role own an object? If a role owns an object, can any users who have that role {drop, create index, etc} it? Once we get the layout of pg_role defined I think I'll be able to make much better progress towards what you're looking for. :) > A related point is that I hope soon to get rid of type AclId and > usesysid/grosysid/rolesysid and start identifying roles by Oids. Alright. That doesn't sound too bad. > This is connected to Alvaro's work to create proper dependencies > for object owners and privilege entries: once that exists and you > can't drop a referenced role, there will be no need to allow explicit > setting of the SYSID for a new user. Not sure if you want to do any > of the associated changes in your patch, but if int4 is bugging you > then feel free to change it. Ok, I probably will. 
Should I be concerned with trying to make 'smallish' patches that build upon each other (ie: change to pg_role first, then change AclId to Oid, or whatever) or will one larger patch that takes care of it all be ok? Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > Ok. Can I get some help defining what the New Truth will look like > then? I understand users and groups pretty well but I'm not 100% sure > about roles. I looked through SQL99 a bit (see 4.31 "Basic security model") and think I now have some handle on this. According to the spec a "role" is more or less exactly what we think of as a "group", with the extension that roles can have other roles as members (barring circularity). In particular the spec draws a distinction between "user identifiers" and "role identifiers", although this distinction seems very nearly 100% useless because the two sorts of identifiers can be used almost interchangeably (an "authorization identifier" means either one, and in most places "authorization identifier" is what is relevant). AFAICT the only really solid reason for the distinction is that you have to log in initially as a user and not as a role. That strikes me as a security policy --- it's analogous to saying you can't log in directly as root but have to su to root from your personal login --- which may be a good thing for a given site to enforce but IMHO it should not be hard-wired into the security mechanism. The implementation reason for not having a hard distinction is mainly that we want to have a single unique-identifier space for both users and roles. This simplifies representation of ACLs (which will no longer need extra bits to identify whether an entry references a user or a group) and allows us to have groups as members of other groups without messy complication there. It's not entirely clear to me whether the spec allows roles to be directly owners of objects, but I think we should allow it. 
So I'm envisioning something like

    CREATE TABLE pg_role (
        rolname        name,         -- name of role
        rolsuper       boolean,      -- superuser?
        rolcreateuser  boolean,      -- can create more users?
        rolcreatedb    boolean,      -- can create databases?
        rolcatupdate   boolean,      -- can hack system catalogs?
        rolcanlogin    boolean,      -- can log in as this role?
        rolvaliduntil  timestamptz,  -- password expiration time
        rolpassword    text,         -- password
        rolmembers     oid[],        -- OIDs of members, if any
        roladmin       boolean[],    -- do members have ADMIN OPTION
        rolconfig      text[]        -- ALTER USER SET guc = value
    ) WITH OIDS;

Some notes:

It might be better to call this by some name other than "pg_role", since what it defines is not exactly roles in the sense that SQL99 uses; but I don't have a good idea what to use instead. "pg_authorization" would work but it's unwieldy.

OIDs of rows in this table replace AclIds.

I'm supposing that we should separate "superuserness" from "can create users" (presumably a non-superuser with rolcreateuser would only be allowed to create non-super users). The lack of distinction on this point has been a conceptual problem for newbies for a long time, and an admin issue too. As long as we are hacking this table we should fix it.

If you want to enforce a hard distinction between users and roles (groups) then you'd prohibit rolcanlogin from being true when rolmembers is nonempty, but as said above I'm not sure the system should enforce that.

rolpassword, rolvaliduntil, and rolconfig are irrelevant if not rolcanlogin.

The roladmin[] bool array indicates whether members were granted admission WITH ADMIN OPTION, which means they can grant membership to others (analogous to WITH GRANT OPTION for individual privileges). I'm not sure this is sufficient ... we may need to record who granted membership to each member as well, in order to process revocation.
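As a rough illustration of the "superuserness" versus "can create users" split in the notes above, here is a sketch (not code from this thread). The field names follow the proposed columns; the `Role` class and the `may_create` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Role:
    # Subset of the proposed pg_role columns, enough for the rule below.
    rolname: str
    rolsuper: bool = False
    rolcreateuser: bool = False


def may_create(creator: Role, new_role: Role) -> bool:
    """Superusers may create any role; a non-superuser holding
    rolcreateuser may create only non-superuser roles."""
    if creator.rolsuper:
        return True
    return creator.rolcreateuser and not new_role.rolsuper
```

Under this rule, handing someone rolcreateuser no longer implies handing them the ability to mint superusers.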
It might be better to lose the rolmembers/roladmin columns and instead represent membership in a separate table, roughly

    CREATE TABLE pg_role_members (
        role          oid,
        member        oid,
        grantor       oid,
        admin_option  bool,
        primary key (role, member, grantor)
    );

This is cleaner from a relational theory point of view but is probably harder for the system to process. One advantage is that it is easier to find out "which roles does user X belong to?" ... but I'm not sure we care about making that fast.

One thing that needs to be thought about before going too far is exactly how ACL rights testing will work, particularly in the face of roles being members of other roles. That is the one performance-critical operation that uses these data structures, so we ought to design around making it fast.

> Ok, I probably will. Should I be concerned with trying to make
> 'smallish' patches that build upon each other (ie: change to pg_role
> first, then change AclId to Oid, or whatever) or will one larger patch
> that takes care of it all be ok?

Smaller patches are easier to review, for sure. Also, you'll need to coordinate with Alvaro's work on dependencies for global objects.

regards, tom lane
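To make the role of the grantor column concrete, the following sketch (an illustration, not code from the thread) models rows of the proposed pg_role_members table as tuples. The `Membership` type and the helpers `direct_roles` and `revoke` are hypothetical; the OIDs in the usage are made up.

```python
from typing import NamedTuple


class Membership(NamedTuple):
    # One row of the proposed pg_role_members table.
    role: int
    member: int
    grantor: int
    admin_option: bool


def direct_roles(rows, member):
    """Answer 'which roles does X belong to?' (direct memberships only)."""
    return {r.role for r in rows if r.member == member}


def revoke(rows, role, member, grantor):
    """Remove only the grant made by this grantor; the same membership
    granted by someone else survives, which is why grantor is in the key."""
    return [r for r in rows
            if (r.role, r.member, r.grantor) != (role, member, grantor)]
```

Because (role, member, grantor) is the key, a member who received the same role from two grantors keeps it until both grants are revoked.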
Stephen, On Sun, Jan 23, 2005 at 03:14:04PM -0500, Tom Lane wrote: > Smaller patches are easier to review, for sure. Also, you'll need to > coordinate with Alvaro's work on dependencies for global objects. If you want, I can send you the current patch so you can see what has changed in it, maybe merge it with some work of yours for separate submittal. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "Siempre hay que alimentar a los dioses, aunque la tierra esté seca" (Orual)
* Alvaro Herrera (alvherre@dcc.uchile.cl) wrote: > On Sun, Jan 23, 2005 at 03:14:04PM -0500, Tom Lane wrote: > > Smaller patches are easier to review, for sure. Also, you'll need to > > coordinate with Alvaro's work on dependencies for global objects. > > If you want, I can send you the current patch so you can see what has > changed in it, maybe merge it with some work of yours for separate > submittal. Yeah, I'd appreciate seeing it and seeing if/how much it conflicts with what I'm working on. I'll probably be going back to scratch since there isn't really anything worthwhile in my patch to build on working towards what Tom's laid out. Not a problem though, it didn't take me very long to code. :) Thanks, Stephen
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Ok. Can I get some help defining what the New Truth will look like > > then? I understand users and groups pretty well but I'm not 100% sure > > about roles. > > I looked through SQL99 a bit (see 4.31 "Basic security model") and think Ah, I was looking through SQL2003 recently, I don't think much has changed in that area though. > I now have some handle on this. According to the spec a "role" is > more or less exactly what we think of as a "group", with the extension > that roles can have other roles as members (barring circularity). Right, ok. > In particular the spec draws a distinction between "user identifiers" > and "role identifiers", although this distinction seems very nearly 100% > useless because the two sorts of identifiers can be used almost > interchangeably (an "authorization identifier" means either one, and in > most places "authorization identifier" is what is relevant). AFAICT the > only really solid reason for the distinction is that you have to log in > initially as a user and not as a role. That strikes me as a security > policy --- it's analogous to saying you can't log in directly as root > but have to su to root from your personal login --- which may be a good > thing for a given site to enforce but IMHO it should not be hard-wired > into the security mechanism. Ok, I agree, though personally I don't like the idea of permitting role-logins, but no need to have the security system force it. > The implementation reason for not having a hard distinction is mainly > that we want to have a single unique-identifier space for both users and > roles. This simplifies representation of ACLs (which will no longer > need extra bits to identify whether an entry references a user or a > group) and allows us to have groups as members of other groups without > messy complication there. 
The other difference would seem to be that "user identifiers" can't be granted to users whereas "role identifiers" can be. Following this, "rolmembers" must be NULL if rolcanlogin is true, no? That breaks if roles can log in though. Or should we just allow granting of "user identifiers" to other users, but if we do, should the user be permitted to do that? > It's not entirely clear to me whether the spec allows roles to be > directly owners of objects, but I think we should allow it. I agree, and in fact group/role ownership is what I'm specifically interested in, though I'd like role support too and so I'm happy to implement it along the way. :) > So I'm envisioning something like > > CREATE TABLE pg_role ( > rolname name, -- name of role > rolsuper boolean, -- superuser? > rolcreateuser boolean, -- can create more users? > rolcreatedb boolean, -- can create databases? > rolcatupdate boolean, -- can hack system catalogs? > rolcanlogin boolean, -- can log in as this role? > rolvaliduntil timestamptz, -- password expiration time > rolpassword text, -- password > rolmembers oid[], -- OIDs of members, if any > roladmin boolean[], -- do members have ADMIN OPTION > rolconfig text[] -- ALTER USER SET guc = value > ) WITH OIDS; > > It might be better to call this by some name other than "pg_role", > since what it defines is not exactly roles in the sense that SQL99 > uses; but I don't have a good idea what to use instead. > "pg_authorization" would work but it's unwieldy. Hmmm, I agree pg_role isn't quite right. pg_auth would be shorter than pg_authorization, but it isn't intuitive what it is. How about pg_ident? It's not users or roles, but it's identifiers. Perhaps pg_authid? > OIDs of rows in this table replace AclIds. ok. > I'm supposing that we should separate "superuserness" from "can create > users" (presumably a non-superuser with rolcreateuser would only be > allowed to create non-super users). 
The lack of distinction on this > point has been a conceptual problem for newbies for a long time, and an > admin issue too. As long as we are hacking this table we should fix it. Agreed. > If you want to enforce a hard distinction between users and roles (groups) > then you'd prohibit rolcanlogin from being true when rolmembers is > nonempty, but as said above I'm not sure the system should enforce that. Right, but there's still the issue of granting "users" to users. > rolpassword, rolvaliduntil, and rolconfig are irrelevant if not rolcanlogin. Right. > The roladmin[] bool array indicates whether members were granted > admission WITH ADMIN OPTION, which means they can grant membership to > others (analogous to WITH GRANT OPTION for individual privileges). > I'm not sure this is sufficient ... we may need to record who granted > membership to each member as well, in order to process revocation. I think we'll probably need to record who granted membership too, to prevent circularity as well as for revocation processing. > It might be better to lose the rolmembers/roladmin columns and instead > represent membership in a separate table, roughly > > CREATE TABLE pg_role_members ( > role oid, > member oid, > grantor oid, > admin_option bool, > primary key (role, member, grantor) > ); > > This is cleaner from a relational theory point of view but is probably > harder for the system to process. One advantage is that it is easier to > find out "which roles does user X belong to?" ... but I'm not sure we > care about making that fast. I like this approach more. > One thing that needs to be thought about before going too far is exactly > how ACL rights testing will work, particularly in the face of roles > being members of other roles. That is the one performance-critical > operation that uses these data structures, so we ought to design around > making it fast. 
I agree that it should be fast, but I think it should be possible to implement it in such a way that if you don't make roles members of other roles then you won't pay a performance penalty for the fact that we support that ability. If you use it, it'll be a bit more expensive to check permissions where you do. > > Ok, I probably will. Should I be concerned with trying to make > > 'smallish' patches that build upon each other (ie: change to pg_role > > first, then change AclId to Oid, or whatever) or will one larger patch > > that takes care of it all be ok? > > Smaller patches are easier to review, for sure. Also, you'll need to > coordinate with Alvaro's work on dependencies for global objects. Right, ok. I'll look at what Alvaro's got and think about the approach and milestones/patches to get from where we're at now to what we want. Stephen
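The "barring circularity" requirement discussed in this message could be enforced at grant time. The sketch below is only an illustration under assumed data structures: `members` is a hypothetical in-memory map from a role OID to the set of its direct member OIDs, and `is_member_of` and `grant` are made-up helper names, not functions from any patch in this thread.

```python
def is_member_of(members, role, candidate):
    """True if candidate is a direct or indirect member of role.
    members maps a role OID to the set of its direct member OIDs."""
    seen = set()
    stack = [role]
    while stack:
        current = stack.pop()
        for m in members.get(current, ()):
            if m == candidate:
                return True
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return False


def grant(members, role, new_member):
    """Add new_member to role, refusing any grant that would make the
    membership graph circular."""
    if new_member == role or is_member_of(members, new_member, role):
        raise ValueError("grant would make role membership circular")
    members.setdefault(role, set()).add(new_member)
```

When no role has other roles as members, the walk in `is_member_of` ends after one level, so flat setups pay essentially nothing for nesting support.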
Stephen Frost wrote: > The other difference would seem to be that "user identifiers" can't > be granted to users whereas "role identifiers" can be. Following > this, "rolmembers" must be NULL if rolcanlogin is true, no? That > breaks if roles can log in though. Or should we just allow granting > of "user identifiers" to other users- but if we do should the user be > permitted to do that? If he has admin option on his own role, sure. But I suppose by default we wouldn't. One use case I see is if someone goes on vacation he can temporarily grant the privileges held by his user account to others without actually giving out the login data. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Sun, Jan 23, 2005 at 15:14:04 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > It's not entirely clear to me whether the spec allows roles to be > directly owners of objects, but I think we should allow it. I agree with this. This can simplify maintenance as members of a group come and go.
* Peter Eisentraut (peter_e@gmx.net) wrote: > If he has admin option on his own role, sure. But I suppose by default > we wouldn't. > > One use case I see is if someone goes on vacation he can temporarily > grant the privileges held by his user account to others without > actually giving out the login data. Alright. I've thought about this some more and I think I agree with it. A user doesn't implicitly have all rights on his own oid, but I guess that wasn't ever really the case anyway (can't give himself superuser rights, etc). I'll begin working on this soon (possibly as soon as Thursday evening) unless someone else has comments on it. Thanks, Stephen
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Ok. Can I get some help defining what the New Truth will look like
> > then? I understand users and groups pretty well but I'm not 100% sure
> > about roles.
>
> So I'm envisioning something like
[...]
> It might be better to call this by some name other than "pg_role",
[...]
> It might be better to lose the rolmembers/roladmin columns and instead
> represent membership in a separate table, roughly
[...]
> One thing that needs to be thought about before going too far is exactly
> how ACL rights testing will work, particularly in the face of roles
> being members of other roles. That is the one performance-critical
> operation that uses these data structures, so we ought to design around
> making it fast.

Alright, here's a patch against head which adds in the tables pg_authid and pg_auth_members as described in your previous mail. I've gotten a bit farther than this in terms of implementation but here's something for people to comment on, if they'd like to.

I've been thinking about the performance issues some and have to admit that I haven't really come to much of a solution. It seems to me that there are two ways to come at the issue:

a) start from the user:
   Search for useroid in pg_auth_members.member
   For each returned role, search for that role in member column
   Repeat until all roles the useroid is in have been found
   [Note: This could possibly be done and stored per-user on connection, but it would mean we'd have to have a mechanism to update it when necessary, possibly instigated by the user, or just force them to reconnect ala unix group membership]
   Look through ACL list to see if the useroid has permission or if any of the roles found do.

b) start from the ACL list:
   Search for each roleoid in pg_auth_members.role
   For each returned member, search for that member in role column
   Upon member == useroid match is found check for permission, if granted then stop, otherwise continue processing
   Has the advantage that the search stops once it's been determined that permission is there and doesn't require updating.

If we do the user-part-of-which-roles search upon connection I expect 'a' would be quite fast, obviously it has its drawbacks though. If we feel that's not acceptable then I'm thinking 'b' would probably be faster given that the ACL list is probably generally small and we can short-circuit. I'm afraid 'b' might still be too slow though, comments? thoughts? better ideas?

Thanks, Stephen
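The two approaches above can be compared side by side in a small sketch. This is an illustration under assumed data structures, not the attached patch: `acl` is a hypothetical map from grantee OID to a set of privilege names, the two membership maps are the same data indexed in opposite directions, and all function names are made up. Unlike the wording of (b), this version filters ACL entries by privilege before walking the membership graph.

```python
def roles_of(member_to_roles, user):
    """Approach (a): collect every role the user is in, directly or
    indirectly. member_to_roles maps a member OID to its direct roles."""
    found = set()
    stack = [user]
    while stack:
        current = stack.pop()
        for role in member_to_roles.get(current, ()):
            if role not in found:
                found.add(role)
                stack.append(role)
    return found


def check_from_user(acl, member_to_roles, user, privilege):
    """Approach (a): expand the user's roles first, then scan the ACL."""
    ids = {user} | roles_of(member_to_roles, user)
    return any(privilege in acl.get(i, ()) for i in ids)


def in_role(role_to_members, role, user):
    """True if user is a direct or indirect member of role."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        for member in role_to_members.get(current, ()):
            if member == user:
                return True
            if member not in seen:
                seen.add(member)
                stack.append(member)
    return False


def check_from_acl(acl, role_to_members, user, privilege):
    """Approach (b): walk the ACL entries and stop at the first grantee
    that is the user or a role containing the user."""
    for grantee, privileges in acl.items():
        if privilege not in privileges:
            continue
        if grantee == user or in_role(role_to_members, grantee, user):
            return True
    return False
```

Both functions return the same answer; they differ in where the graph walk happens, which is what makes (a) attractive if the role set can be computed once and reused.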
Stephen Frost <sfrost@snowman.net> writes: > I've been thinking about the performance issues some and have to admit > that I haven't really come to much of a solution. It seems to me that > there are two ways to come at the issue: > a) start from the user: > ... > b) start from the ACL list: > ... The current ACL-checking code does (b), relying on a function in_group() that tests whether the target userid is a member of a given group. I would suggest preserving this basic structure if only to avoid breaking things unintentionally. However, you could think about caching the set of groups that a user belongs to, thus combining the best features of both (a) and (b). It's always bothered me that in_group() seemed like a fairly expensive operation. > [Note: This could possibly be done and stored per-user on connection, > but it would mean we'd have to have a mechanism to update it when > necessary, possibly instigated by the user, or just force them to > reconnect ala unix group membership] No, we'd drive it off syscache invalidation. Any change in pg_auth_members would cause us to just discard the whole membership cache. This sort of mechanism is already in use in a couple places (schema search list maintenance is one example IIRC --- look at namespace.c). regards, tom lane
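The caching idea in this reply can be sketched as follows. This is a simplified stand-in, not the backend's syscache machinery: `MembershipCache`, `load_roles` and `invalidate` are hypothetical names, and in the real system the invalidation call would be triggered by the catalog-change notification rather than by hand.

```python
class MembershipCache:
    """Caches the set of roles each user belongs to and discards the
    whole cache whenever the membership catalog changes."""

    def __init__(self, load_roles):
        # load_roles: callable taking a user OID and returning its role
        # OIDs; assumed to be the expensive catalog walk.
        self._load_roles = load_roles
        self._cache = {}

    def roles_of(self, user):
        if user not in self._cache:
            self._cache[user] = frozenset(self._load_roles(user))
        return self._cache[user]

    def invalidate(self):
        """Invoked on any change to the membership catalog."""
        self._cache.clear()
```

Discarding everything on any change is coarse, but membership changes are rare compared with permission checks, so the simple policy keeps the common path to a dictionary lookup.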