Thread: Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
>   Here's a proof-of-concept pretty much untested (it compiles) patch
>   against HEAD for review of the general approach I'm taking to 
>   merging pg_shadow and pg_group.  This is in order to support group 
>   ownership and eventually roles.  This patch includes my grammar and 
>   get_grosysid move patches, and so conflicts with them.

One point is that you can't simply whack pg_shadow around and eliminate
pg_group, because that will break lord-knows-how-much client software
that looks at these tables.  What I'm envisioning is to create a new
system catalog (say pg_role) that holds the New Truth, and then make
pg_shadow and pg_group be predefined views on this catalog that provide
as much backwards compatibility as we can manage.

I believe this was done once before already --- I think that the pg_user
view exists to emulate a prior incarnation of pg_shadow.
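
A minimal sketch of the views-over-a-new-catalog idea, using Python's sqlite3 module as a stand-in for the backend. The column set is illustrative only: role_oid is a hypothetical name, and splitting users from groups on rolcanlogin is an assumption, not something settled in this thread.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for the proposed catalog.
# role_oid and the rolcanlogin-based user/group split are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pg_role (
    role_oid    INTEGER PRIMARY KEY,
    rolname     TEXT NOT NULL UNIQUE,
    rolcanlogin INTEGER NOT NULL
);

-- Legacy names become read-only views over the new catalog.
CREATE VIEW pg_shadow AS
    SELECT rolname AS usename, role_oid AS usesysid
    FROM pg_role WHERE rolcanlogin = 1;

CREATE VIEW pg_group AS
    SELECT rolname AS groname, role_oid AS grosysid
    FROM pg_role WHERE rolcanlogin = 0;
""")

conn.executemany(
    "INSERT INTO pg_role (role_oid, rolname, rolcanlogin) VALUES (?, ?, ?)",
    [(1, "alice", 1), (10, "staff", 0)],
)


def legacy_user_names(connection):
    """Query the old-style view exactly as existing client software would."""
    cursor = connection.execute("SELECT usename FROM pg_shadow ORDER BY usename")
    return [row[0] for row in cursor]
```

Client code that only reads usename or groname keeps working against the views, while all writes go to the single underlying table.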

A related point is that I hope soon to get rid of type AclId and
usesysid/grosysid/rolesysid and start identifying roles by Oids.
This is connected to Alvaro's work to create proper dependencies
for object owners and privilege entries: once that exists and you
can't drop a referenced role, there will be no need to allow explicit
setting of the SYSID for a new user.  Not sure if you want to do any
of the associated changes in your patch, but if int4 is bugging you
then feel free to change it.
        regards, tom lane


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> >   Here's a proof-of-concept pretty much untested (it compiles) patch
> >   against HEAD for review of the general approach I'm taking to
> >   merging pg_shadow and pg_group.  This is in order to support group
> >   ownership and eventually roles.  This patch includes my grammar and
> >   get_grosysid move patches, and so conflicts with them.
>
> One point is that you can't simply whack pg_shadow around and eliminate
> pg_group, because that will break lord-knows-how-much client software
> that looks at these tables.  What I'm envisioning is to create a new
> system catalog (say pg_role) that holds the New Truth, and then make
> pg_shadow and pg_group be predefined views on this catalog that provide
> as much backwards compatibility as we can manage.

Ok.  Can I get some help defining what the New Truth will look like
then?  I understand users and groups pretty well but I'm not 100% sure
about roles.  Is it as simple as what my changed pg_shadow looks like?
What's the difference between a role, a user and a group?  Can a role
log in/have a password?  Can a role own an object?  If a role owns an
object, can any users who have that role {drop, create index, etc} it?

Once we get the layout of pg_role defined I think I'll be able to make
much better progress towards what you're looking for. :)

> A related point is that I hope soon to get rid of type AclId and
> usesysid/grosysid/rolesysid and start identifying roles by Oids.

Alright.  That doesn't sound too bad.

> This is connected to Alvaro's work to create proper dependencies
> for object owners and privilege entries: once that exists and you
> can't drop a referenced role, there will be no need to allow explicit
> setting of the SYSID for a new user.  Not sure if you want to do any
> of the associated changes in your patch, but if int4 is bugging you
> then feel free to change it.

Ok, I probably will.  Should I be concerned with trying to make
'smallish' patches that build upon each other (ie: change to pg_role
first, then change AclId to Oid, or whatever) or will one larger patch
that takes care of it all be ok?
Thanks,
    Stephen

Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> Ok.  Can I get some help defining what the New Truth will look like
> then?  I understand users and groups pretty well but I'm not 100% sure
> about roles.

I looked through SQL99 a bit (see 4.31 "Basic security model") and think
I now have some handle on this.  According to the spec a "role" is
more or less exactly what we think of as a "group", with the extension
that roles can have other roles as members (barring circularity).
In particular the spec draws a distinction between "user identifiers"
and "role identifiers", although this distinction seems very nearly 100%
useless because the two sorts of identifiers can be used almost
interchangeably (an "authorization identifier" means either one, and in
most places "authorization identifier" is what is relevant).  AFAICT the
only really solid reason for the distinction is that you have to log in
initially as a user and not as a role.  That strikes me as a security
policy --- it's analogous to saying you can't log in directly as root
but have to su to root from your personal login --- which may be a good
thing for a given site to enforce but IMHO it should not be hard-wired
into the security mechanism.

The implementation reason for not having a hard distinction is mainly
that we want to have a single unique-identifier space for both users and
roles.  This simplifies representation of ACLs (which will no longer
need extra bits to identify whether an entry references a user or a
group) and allows us to have groups as members of other groups without
messy complication there.
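
To make the single-identifier-space point concrete, here is a rough sketch (not backend code) in which users and roles are plain integer ids in one dictionary, so a role can be a member of another role, and a grant is refused if it would create a cycle. The names members, is_member_of and grant_membership are hypothetical.

```python
from typing import Dict, Set

# Hypothetical in-memory stand-in: role id -> ids of its direct members.
# Users and roles share one id space, so a member may itself be a role.
members: Dict[int, Set[int]] = {}


def is_member_of(candidate: int, role: int) -> bool:
    """Return True if candidate is a direct or indirect member of role."""
    seen: Set[int] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for member in members.get(current, set()):
            if member == candidate:
                return True
            stack.append(member)
    return False


def grant_membership(role: int, member: int) -> None:
    """Add member to role, refusing grants that would be circular."""
    # If role is already (indirectly) inside member, adding member to role
    # would close a loop.
    if member == role or is_member_of(role, member):
        raise ValueError("grant would make role membership circular")
    members.setdefault(role, set()).add(member)
```

Because every entry is just an id, nothing in the structure needs a flag saying whether a member is a user or a group.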

It's not entirely clear to me whether the spec allows roles to be
directly owners of objects, but I think we should allow it.

So I'm envisioning something like

CREATE TABLE pg_role (
    rolname        name,        -- name of role
    rolsuper       boolean,     -- superuser?
    rolcreateuser  boolean,     -- can create more users?
    rolcreatedb    boolean,     -- can create databases?
    rolcatupdate   boolean,     -- can hack system catalogs?
    rolcanlogin    boolean,     -- can log in as this role?
    rolvaliduntil  timestamptz, -- password expiration time
    rolpassword    text,        -- password
    rolmembers     oid[],       -- OIDs of members, if any
    roladmin       boolean[],   -- do members have ADMIN OPTION
    rolconfig      text[]       -- ALTER USER SET guc = value
) WITH OIDS;

Some notes:

It might be better to call this by some name other than "pg_role",
since what it defines is not exactly roles in the sense that SQL99
uses; but I don't have a good idea what to use instead.
"pg_authorization" would work but it's unwieldy.

OIDs of rows in this table replace AclIds.

I'm supposing that we should separate "superuserness" from "can create
users" (presumably a non-superuser with rolcreateuser would only be
allowed to create non-super users).  The lack of distinction on this
point has been a conceptual problem for newbies for a long time, and an
admin issue too. As long as we are hacking this table we should fix it.

If you want to enforce a hard distinction between users and roles (groups)
then you'd prohibit rolcanlogin from being true when rolmembers is
nonempty, but as said above I'm not sure the system should enforce that.

rolpassword, rolvaliduntil, and rolconfig are irrelevant if not rolcanlogin.

The roladmin[] bool array indicates whether members were granted
admission WITH ADMIN OPTION, which means they can grant membership to
others (analogous to WITH GRANT OPTION for individual privileges).
I'm not sure this is sufficient ... we may need to record who granted
membership to each member as well, in order to process revocation.


It might be better to lose the rolmembers/roladmin columns and instead
represent membership in a separate table, roughly

CREATE TABLE pg_role_members (
    role            oid,
    member          oid,
    grantor         oid,
    admin_option    bool,
    primary key (role, member, grantor)
);

This is cleaner from a relational theory point of view but is probably
harder for the system to process.  One advantage is that it is easier to
find out "which roles does user X belong to?" ... but I'm not sure we
care about making that fast.
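
A small sketch of why keying membership on (role, member, grantor) helps with revocation: each grantor's grant is a separate row, so revoking one grantor's grant leaves the others intact. This is a simplified model with hypothetical helper names; it does not attempt the cascading behaviour a real REVOKE may need.

```python
from typing import Dict, Set, Tuple

# (role, member, grantor) -> admin_option, mirroring the proposed primary key.
Key = Tuple[int, int, int]
MemberTable = Dict[Key, bool]


def grant(table: MemberTable, role: int, member: int, grantor: int,
          admin_option: bool = False) -> None:
    """Record that grantor gave member membership in role."""
    table[(role, member, grantor)] = admin_option


def revoke(table: MemberTable, role: int, member: int, grantor: int) -> None:
    """Remove only the row created by this particular grantor."""
    table.pop((role, member, grantor), None)


def direct_roles_of(table: MemberTable, member: int) -> Set[int]:
    """Answer 'which roles does X belong to?' by scanning the member column."""
    return {role for (role, row_member, _grantor) in table if row_member == member}
```

With the array representation the same question would require scanning every role's rolmembers array, which is the trade-off mentioned above.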

One thing that needs to be thought about before going too far is exactly
how ACL rights testing will work, particularly in the face of roles
being members of other roles.  That is the one performance-critical
operation that uses these data structures, so we ought to design around
making it fast.

> Ok, I probably will.  Should I be concerned with trying to make
> 'smallish' patches that build upon each other (ie: change to pg_role
> first, then change AclId to Oid, or whatever) or will one larger patch
> that takes care of it all be ok?

Smaller patches are easier to review, for sure.  Also, you'll need to
coordinate with Alvaro's work on dependencies for global objects.
        regards, tom lane


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Alvaro Herrera
Date:
Stephen,

On Sun, Jan 23, 2005 at 03:14:04PM -0500, Tom Lane wrote:

> Smaller patches are easier to review, for sure.  Also, you'll need to
> coordinate with Alvaro's work on dependencies for global objects.

If you want, I can send you the current patch so you can see what has
changed in it, maybe merge it with some work of yours for separate
submittal.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Siempre hay que alimentar a los dioses, aunque la tierra esté seca" (Orual)


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@dcc.uchile.cl) wrote:
> On Sun, Jan 23, 2005 at 03:14:04PM -0500, Tom Lane wrote:
> > Smaller patches are easier to review, for sure.  Also, you'll need to
> > coordinate with Alvaro's work on dependencies for global objects.
>
> If you want, I can send you the current patch so you can see what has
> changed in it, maybe merge it with some work of yours for separate
> submittal.

Yeah, I'd appreciate seeing it and seeing if/how much it conflicts with
what I'm working on.  I'll probably be going back to scratch since there
isn't really anything worthwhile in my patch to build on working towards
what Tom's laid out.  Not a problem though, it didn't take me very long
to code. :)
Thanks,
    Stephen

Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Ok.  Can I get some help defining what the New Truth will look like
> > then?  I understand users and groups pretty well but I'm not 100% sure
> > about roles.
>
> I looked through SQL99 a bit (see 4.31 "Basic security model") and think

Ah, I was looking through SQL2003 recently, I don't think much has
changed in that area though.

> I now have some handle on this.  According to the spec a "role" is
> more or less exactly what we think of as a "group", with the extension
> that roles can have other roles as members (barring circularity).

Right, ok.

> In particular the spec draws a distinction between "user identifiers"
> and "role identifiers", although this distinction seems very nearly 100%
> useless because the two sorts of identifiers can be used almost
> interchangeably (an "authorization identifier" means either one, and in
> most places "authorization identifier" is what is relevant).  AFAICT the
> only really solid reason for the distinction is that you have to log in
> initially as a user and not as a role.  That strikes me as a security
> policy --- it's analogous to saying you can't log in directly as root
> but have to su to root from your personal login --- which may be a good
> thing for a given site to enforce but IMHO it should not be hard-wired
> into the security mechanism.

Ok, I agree, though personally I don't like the idea of permitting
role-logins, but no need to have the security system force it.

> The implementation reason for not having a hard distinction is mainly
> that we want to have a single unique-identifier space for both users and
> roles.  This simplifies representation of ACLs (which will no longer
> need extra bits to identify whether an entry references a user or a
> group) and allows us to have groups as members of other groups without
> messy complication there.

The other difference would seem to be that "user identifiers" can't be
granted to users whereas "role identifiers" can be.  Following this,
"rolmembers" must be NULL if rolcanlogin is true, no?  That breaks if
roles can log in though.  Or should we just allow granting of "user
identifiers" to other users- but if we do should the user be permitted
to do that?

> It's not entirely clear to me whether the spec allows roles to be
> directly owners of objects, but I think we should allow it.

I agree, and in fact group/role ownership is what I'm specifically
interested in, though I'd like role support too and so I'm happy to
implement it along the way. :)

> So I'm envisioning something like
>
> CREATE TABLE pg_role (
>     rolname        name,        -- name of role
>     rolsuper    boolean,    -- superuser?
>     rolcreateuser    boolean,    -- can create more users?
>     rolcreatedb    boolean,    -- can create databases?
>     rolcatupdate    boolean,    -- can hack system catalogs?
>     rolcanlogin    boolean,    -- can log in as this role?
>     rolvaliduntil    timestamptz,    -- password expiration time
>     rolpassword    text,        -- password
>     rolmembers    oid[],        -- OIDs of members, if any
>     roladmin    boolean[],    -- do members have ADMIN OPTION
>     rolconfig    text[]        -- ALTER USER SET guc = value
> ) WITH OIDS;
>
> It might be better to call this by some name other than "pg_role",
> since what it defines is not exactly roles in the sense that SQL99
> uses; but I don't have a good idea what to use instead.
> "pg_authorization" would work but it's unwieldy.

Hmmm, I agree pg_role isn't quite right.  pg_auth would be shorter than
pg_authorization, but it isn't intuitive what it is.  How about
pg_ident?  It's not users or roles, but it's identifiers.  Perhaps
pg_authid?

> OIDs of rows in this table replace AclIds.

ok.

> I'm supposing that we should separate "superuserness" from "can create
> users" (presumably a non-superuser with rolcreateuser would only be
> allowed to create non-super users).  The lack of distinction on this
> point has been a conceptual problem for newbies for a long time, and an
> admin issue too. As long as we are hacking this table we should fix it.

Agreed.

> If you want to enforce a hard distinction between users and roles (groups)
> then you'd prohibit rolcanlogin from being true when rolmembers is
> nonempty, but as said above I'm not sure the system should enforce that.

Right, but there's still the issue of granting "users" to users.

> rolpassword, rolvaliduntil, and rolconfig are irrelevant if not rolcanlogin.

Right.

> The roladmin[] bool array indicates whether members were granted
> admission WITH ADMIN OPTION, which means they can grant membership to
> others (analogous to WITH GRANT OPTION for individual privileges).
> I'm not sure this is sufficient ... we may need to record who granted
> membership to each member as well, in order to process revocation.

I think we'll probably need to record who granted membership too, to
prevent circularity as well as for revocation processing.

> It might be better to lose the rolmembers/roladmin columns and instead
> represent membership in a separate table, roughly
>
> CREATE TABLE pg_role_members (
>     role        oid,
>     member        oid,
>     grantor        oid,
>     admin_option    bool,
>     primary key (role, member, grantor)
> );
>
> This is cleaner from a relational theory point of view but is probably
> harder for the system to process.  One advantage is that it is easier to
> find out "which roles does user X belong to?" ... but I'm not sure we
> care about making that fast.

I like this approach more.

> One thing that needs to be thought about before going too far is exactly
> how ACL rights testing will work, particularly in the face of roles
> being members of other roles.  That is the one performance-critical
> operation that uses these data structures, so we ought to design around
> making it fast.

I agree that it should be fast- but I think it should be possible to
implement it in such a way that if you don't make roles members of other
roles then you won't pay a performance penalty for the fact that we
support that ability.  If you use it, it'll be a bit more expensive to
check permissions where you do.

> > Ok, I probably will.  Should I be concerned with trying to make
> > 'smallish' patches that build upon each other (ie: change to pg_role
> > first, then change AclId to Oid, or whatever) or will one larger patch
> > that takes care of it all be ok?
>
> Smaller patches are easier to review, for sure.  Also, you'll need to
> coordinate with Alvaro's work on dependencies for global objects.

Right, ok.  I'll look at what Alvaro's got and think about the approach
and milestones/patches to get from where we're at now to what we want.
Stephen

Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Peter Eisentraut
Date:
Stephen Frost wrote:
> The other difference would seem to be that "user identifiers" can't
> be granted to users whereas "role identifiers" can be.  Following
> this, "rolmembers" must be NULL if rolcanlogin is true, no?  That
> breaks if roles can log in though.  Or should we just allow granting
> of "user identifiers" to other users- but if we do should the user be
> permitted to do that?

If he has admin option on his own role, sure.  But I suppose by default 
we wouldn't.

One use case I see is if someone goes on vacation he can temporarily 
grant the privileges held by his user account to others without 
actually giving out the login data.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Bruno Wolff III
Date:
On Sun, Jan 23, 2005 at 15:14:04 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> It's not entirely clear to me whether the spec allows roles to be
> directly owners of objects, but I think we should allow it.

I agree with this. This can simplify maintenance as members of a group
come and go.


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> If he has admin option on his own role, sure.  But I suppose by default
> we wouldn't.
>
> One use case I see is if someone goes on vacation he can temporarily
> grant the privileges held by his user account to others without
> actually giving out the login data.

Alright.  I've thought about this some more and I think I agree with it.
A user doesn't implicitly have all rights on his own oid, but I guess
that wasn't ever really the case anyway (can't give himself superuser
rights, etc).  I'll begin working on this soon (possibly as soon as
Thursday evening) unless someone else has comments on it.
Thanks,
    Stephen

Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Ok.  Can I get some help defining what the New Truth will look like
> > then?  I understand users and groups pretty well but I'm not 100% sure
> > about roles.
>
> So I'm envisioning something like
[...]
> It might be better to call this by some name other than "pg_role",
[...]
> It might be better to lose the rolmembers/roladmin columns and instead
> represent membership in a separate table, roughly
[...]
> One thing that needs to be thought about before going too far is exactly
> how ACL rights testing will work, particularly in the face of roles
> being members of other roles.  That is the one performance-critical
> operation that uses these data structures, so we ought to design around
> making it fast.

Alright, here's a patch against head which adds in the tables pg_authid
and pg_auth_members as described in your previous mail.  I've gotten a
bit farther than this in terms of implementation but here's something
for people to comment on, if they'd like to.

I've been thinking about the performance issues some and have to admit
that I haven't really come to much of a solution.  It seems to me that
there are two ways to come at the issue:

a) start from the user:
   Search for useroid in pg_auth_members.member
   For each returned role, search for that role in member column
   Repeat until all roles the useroid is in have been found
   [Note: This could possibly be done and stored per-user on connection,
   but it would mean we'd have to have a mechanism to update it when
   necessary, possibly instigated by the user, or just force them to
   reconnect ala unix group membership]
   Look through ACL list to see if the useroid has permission or if any
   of the roles found do.

b) start from the ACL list:
   Search for each roleoid in pg_auth_members.role
   For each returned member, search for that member in role column
   Upon member == useroid match is found check for permission, if
   granted then stop, otherwise continue processing
   Has the advantage that the search stops once it's been determined
   that permission is there and doesn't require updating.

If we do the user-part-of-which-roles search upon connection I expect
'a' would be quite fast, though obviously it has its drawbacks.  If we
feel that's not acceptable then I'm thinking 'b' would probably be
faster given that the ACL list is probably generally small and we can
short-circuit.  I'm afraid 'b' might still be too slow though, comments?
thoughts?  better ideas?
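
The two approaches can be sketched side by side as follows. This is an illustrative model only: pg_auth_members is represented as a hypothetical dictionary from role id to direct member ids, and the ACL is reduced to the list of ids that hold the privilege being tested.

```python
from typing import Dict, Iterable, Set

# Hypothetical stand-in for pg_auth_members: role id -> direct member ids.
AuthMembers = Dict[int, Set[int]]


def roles_containing(user: int, auth_members: AuthMembers) -> Set[int]:
    """Approach (a): collect every role the user is in, directly or not."""
    found: Set[int] = set()
    frontier = [user]
    while frontier:
        current = frontier.pop()
        for role, direct in auth_members.items():
            if current in direct and role not in found:
                found.add(role)
                frontier.append(role)
    return found


def check_from_user(user: int, acl_grantees: Iterable[int],
                    auth_members: AuthMembers) -> bool:
    """Approach (a): expand the user's roles first, then scan the ACL."""
    allowed = roles_containing(user, auth_members) | {user}
    return any(grantee in allowed for grantee in acl_grantees)


def role_has_member(role: int, user: int, auth_members: AuthMembers) -> bool:
    """Walk down from a role through nested roles looking for the user."""
    seen: Set[int] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        direct = auth_members.get(current, set())
        if user in direct:
            return True
        stack.extend(direct)
    return False


def check_from_acl(user: int, acl_grantees: Iterable[int],
                   auth_members: AuthMembers) -> bool:
    """Approach (b): test each ACL entry in turn, stopping at the first hit."""
    for grantee in acl_grantees:
        if grantee == user or role_has_member(grantee, user, auth_members):
            return True
    return False
```

Both functions give the same answer; they differ in where the work happens. Approach (a) pays once per user and is cheap per check if the result is kept, while approach (b) pays per ACL entry but can return as soon as one entry matches.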

    Thanks,

        Stephen


Re: [PATCHES] Merge pg_shadow && pg_group -- UNTESTED

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> I've been thinking about the performance issues some and have to admit
> that I haven't really come to much of a solution.  It seems to me that
> there are two ways to come at the issue:

> a) start from the user:
>    ...
> b) start from the ACL list:
>    ...

The current ACL-checking code does (b), relying on a function in_group()
that tests whether the target userid is a member of a given group.
I would suggest preserving this basic structure if only to avoid breaking
things unintentionally.  However, you could think about caching the set
of groups that a user belongs to, thus combining the best features of
both (a) and (b).  It's always bothered me that in_group() seemed like a
fairly expensive operation.

>    [Note: This could possibly be done and stored per-user on connection,
>    but it would mean we'd have to have a mechanism to update it when
>    necessary, possibly instigated by the user, or just force them to
>    reconnect ala unix group membership]

No, we'd drive it off syscache invalidation.  Any change in pg_auth_members
would cause us to just discard the whole membership cache.  This sort of
mechanism is already in use in a couple places (schema search list
maintenance is one example IIRC --- look at namespace.c).
        regards, tom lane
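
A rough sketch of the caching idea described above, assuming a hypothetical load_members callable that returns the current role-to-members mapping. The class and method names are illustrative; in_group here is only a stand-in for the backend function of the same name, and invalidate models the callback a syscache invalidation would trigger.

```python
from typing import Callable, Dict, FrozenSet, Set


class MembershipCache:
    """Per-session cache of the roles each user belongs to."""

    def __init__(self, load_members: Callable[[], Dict[int, Set[int]]]):
        # load_members is a hypothetical reader of the membership catalog.
        self._load_members = load_members
        self._roles_by_user: Dict[int, FrozenSet[int]] = {}

    def invalidate(self) -> None:
        """On any membership change, discard the whole cache."""
        self._roles_by_user.clear()

    def roles_of(self, user: int) -> FrozenSet[int]:
        """Return the cached role set, computing it on first use."""
        cached = self._roles_by_user.get(user)
        if cached is None:
            cached = self._compute(user)
            self._roles_by_user[user] = cached
        return cached

    def _compute(self, user: int) -> FrozenSet[int]:
        auth_members = self._load_members()
        found: Set[int] = set()
        frontier = [user]
        while frontier:
            current = frontier.pop()
            for role, direct in auth_members.items():
                if current in direct and role not in found:
                    found.add(role)
                    frontier.append(role)
        return frozenset(found)

    def in_group(self, user: int, group: int) -> bool:
        """ACL-driven check as in (b), answered from the cached set of (a)."""
        return group in self.roles_of(user)
```

Throwing away the entire cache on any change is coarse, but membership changes are presumably rare compared with permission checks, so the occasional recomputation should cost little.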