Proposal for enhancements of privilege system - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Proposal for enhancements of privilege system
Date
Msg-id Pine.LNX.4.21.0005221915300.392-100000@localhost.localdomain
Whole thread Raw
Responses Re: Proposal for enhancements of privilege system
List pgsql-hackers
Some of you may remember some mumblings about some work on access
privileges, so this is the idea. Comments welcome.


* Goals

The goal of the first stage is to provide a fully SQL92 compliant
solution. That comprises particularly column level granularity, the
REFERENCES[*] privilege for use by constraints, and the USAGE
privilege. We currently don't have any of the things that USAGE
pertains to -- domains, character sets, collations, translations --
but at least we shouldn't have to start all over when we do. Also
GRANT OPTION needs to be supported.

[*] -- now with that RI snafu unveiled that goal seems optimistic

The second stage would be adopting all specifications made by SQL3 if
they are applicable. This includes particularly the privilege types
TRIGGER and UNDER (for table inheritance, which should probably work
well first). Also we could think about EXECUTE for functions and some
`setuid'-like features.

Stage two isn't necessarily anticipated for 7.1 but I'd like to have a
framework which adapts well.

* User management

One thing I'd like to see resolved first is the issue of
pg_shadow.usesysid. This field is fully functionally dependent on
pg_shadow.oid so there's little theoretical need to keep it around.
Secondly, the system happily reassigns previously used sysids, which
is a pretty dangerous thing to do as we all know, since there might
still be old database objects hanging around that the new users
shouldn't necessarily have access to. (And connecting to all databases
in turn to remove any dangling objects when a user is dropped isn't
really an option.) So the answer is to not recycle sysids. But then
why not use the oid?

Some arguments for user sysids I have heard in the past were that some
people want to keep them the same as the Unix uid. While I'm at a loss
as to how this would matter in practice (aren't names enough) I grant
that that's an argument (albeit one that doesn't scale well because
not every database user is a Unix user and two identically numbered
Unix users from different machines would presumably map to different
database users). But if you look closer then this thinking is
primarily caused by the fact that there is a usesysid field at all --
if there wasn't, you wouldn't have to keep it in sync.

Another reason why an oid based arrangement would be nicer is that if
we did the same thing for groups why could refer to both users and
groups through one attribute. See `Implementation' below.

* Implementation

The central idea in this proposal is a new system table to store
permissions:

pg_privilege (   priobj oid,   prigrantor oid,   prigrantee oid,   priaction char,   priisgrantable boolean,
   primary key (priobj, prigrantee, priaction)
)                                                                                 

"priobj" would store the oid of the object being described, for
example a table or function or type. "prigrantor" is the user who
granted the privilege. (It is necessary to store this to get grant
options to work correctly.) "prigrantee" is obviously the user to
which you grant the privilege or a group. We could put 0 for "public".
"priaction" would be the encoding of the privilege type, such as
's'=select, 'u'=update, perhaps. And "prigrantable" is whether the
privilege is grantable.

The key advantages to this method over the old one are:
- Extensible beyond tables, in fact to any kind of object
- Easier to query, e.g., for what-if inquiries
- The old method would make grant options pretty tough without a major rework
- A pg_privilege row would be almost exactly what SQL calls a "privilege descriptor". So the implementation will be
mucheasier and verifyable because you can read the program code out of the standard text. (in theory anyway)
 

Those that follow will see how simple-minded grant, revoke, and
privilege lookup will be in their core: simply insert, delete, or look
for a row. (Of course the devil is in the details.)

* Column privileges

There are two approaches I see to managing column privileges, one is a
little cleaner, the other faster. Note that granting access to a
table is different than granting access to all of its columns; the
difference is what happens when you add a new column.

The straightforward choice would be to store a single reference to
pg_class when the privilege describes the whole table, and
pg_attribute references when only specific columns are named. That
would mean the lookup routine will first look for a pg_class.oid entry
and, failing that, then for possible pg_attribute.oid entries for the
columns that it's interested in. This is of course suboptimal when no
privilege exists in the first place but that is not necessarily the case
we're optimizing for.

The second choice would be to always have an entry for the table, even
if it only says "I'm not the real privilege, but there are column
privileges, so you better keep looking." That would probably mean
another column in the pg_privilege. This way you have to maintain
redundant information but there is enough precedent for this sort of
thing in the other system catalogs.

* Groups

Handling groups efficiently is a bit tricky because it's essentially
equivalent to a join: scan all the privileges and all the groups and
look for matches between them and with the current user id. I suppose
one could simply run this query by hand once and see what the
optimizer thinks would be a good way to run it, but that isn't
facilitated by the way group information is stored right now.

I would do it like this: Looking up privileges granted to groups would
be done if the lookup based on the user id fails. Then you have to
scan pg_group anyway, so you might as well just scan it once
completely and record all the groups the user is in. Then you do a
privilege lookup for each group in a manner identical to individual
users.

This is different from the current implementation which looks through
all existing privileges on a table and if one is owned by a group then
scan pg_group to see if the user is in the group. That might be
suboptimal.

* Performance concerns

The fastest privilege system if of course one that does no checking at
all. Features always cost a price. I have no concern, however, that
this new implementation would cause any noticeable penalty at all. If
you consider how much reading the parser, planner, optimizer, and
rewriter do just to make sense of a query, this is really a minor
item.

If you're the table owner then no access checking is done at all. If
you don't use groups or column privileges then one syscache lookup
will tell you yes or no. If you do use groups then the new system
would potentially even be faster. If you want to use column privileges
you'd currently wait forever. :)

* Possibilities for extensions

One thing that has been thrown around is a LOCK privilege. Currently
everyone with write access can lock the table completely. It would
make sense to me to restrict locks of Share mode and higher to the
owner and owners of this privilege.

There is also demand for various CREATE privileges (one for each thing
you can create, one supposes). Once we have schemas we can easily fit
this into the above design. Since this is not covered by the standard
("implementation-defined"), a good round of discussion ought to take
place first.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: libpq++ tracing considered harmfu[Fwd: libpq++ tracing considered harmful (was Re: [INTERFACES] libpq++ memory problems)]
Next
From: Marten Feldtmann
Date:
Subject: Re: OO Patch