Thread: API change advice: Passing plan invalidation info from the rewriter into the planner?
API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
Hi all

One of the remaining issues with row security is how to pass plan invalidation information generated in the rewriter back into the planner.

With row security, it's necessary to set a field in PlannerGlobal, tracking the user ID of the user the query was planned for if row security was applied. It is also necessary to add a PlanInvalItem for the user ID.

Currently the rewriter has no way to pass this information to the planner. QueryRewrite returns just a Query*.

We use Query structs throughout the rewriter and planner; it doesn't make sense to add a List* field for PlanInvalItem nodes and an Oid field for the user ID to the Query node when it's only ever going to get used for the top level Query node returned by the rewriter, and only for long enough to copy the data into PlannerGlobal.

The alternative seems to be changing the return type of QueryRewrite, introducing a new node type, say:

    struct RewriteResult {
        Query *productQuery;
        Oid planUserId;
        List* planInvalItems;
    }

This seems cleaner, and more extensible, but it means changing a fair bit of API, including:

    pg_plan_query
    planner
    standard_planner
    planner_hook_type
    QueryRewrite

and probably the plan cache infrastructure too. So it'd be fairly invasive, and I know that creates concerns about backpatching and extensions.

I can't just polymorphically subclass Query as some kind of "TopQuery" - no true polymorphism in C, would need a new NodeType for it, and then need to teach everything that knows about T_Query about T_TopQuery too. So that won't work.

So, I'm looking for advice before I embark on this change. I need _some_ way to pass invalidation information from the rewriter into the planner when it's collected by row security code during rewriting.

Any advice/comments?

I'm inclined to bite the bullet and make the API change. It'll be a pain, but I can see future uses for passing global info out of the rewriter rather than shoving it into per-Query structures. I'd define a RewriteResult and pass that down into all the rewriter internal functions, then return the outer query wrapped in it.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
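For readers following along, here is a minimal sketch of the wrapper idea described in this message. It is only an illustration under stated assumptions: `Query`, `List`, `Oid` and `PlannerGlobalSketch` below are simplified stand-ins rather than PostgreSQL's real definitions, and `rewrite_sketch` and `copy_rewrite_info` are hypothetical helper names that do not exist in the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL types; NOT the real definitions. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef struct Query { int commandType; } Query;   /* heavily simplified */
typedef struct List List;                          /* opaque in this sketch */

/* The wrapper proposed in the message: one per rewrite, not one per Query. */
typedef struct RewriteResult
{
    Query *productQuery;    /* the rewritten top-level query */
    Oid    planUserId;      /* user the query was rewritten for, or InvalidOid */
    List  *planInvalItems;  /* invalidation items collected during rewriting */
} RewriteResult;

/* Hypothetical planner-side global state (stand-in for PlannerGlobal). */
typedef struct PlannerGlobalSketch
{
    Oid   planUserId;
    List *invalItems;
} PlannerGlobalSketch;

/* Hypothetical rewriter entry point that returns the wrapper, not a bare Query*. */
RewriteResult
rewrite_sketch(Query *parsetree, int row_security_applied, Oid current_user)
{
    RewriteResult result;

    assert(parsetree != NULL);
    result.productQuery = parsetree;
    /* Record the user only when row security actually affected the query. */
    result.planUserId = row_security_applied ? current_user : InvalidOid;
    result.planInvalItems = NULL;   /* item collection omitted in this sketch */
    return result;
}

/* The planner copies the per-rewrite data once into its global state. */
void
copy_rewrite_info(PlannerGlobalSketch *glob, const RewriteResult *rr)
{
    glob->planUserId = rr->planUserId;
    glob->invalItems = rr->planInvalItems;
}
```

The point of the design is that the extra fields live on a single per-rewrite object, so nested Query nodes carry no unused fields; the cost is that every function between the rewriter and the planner has to accept the new type.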
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Alvaro Herrera
Date:
Craig Ringer wrote:

> One of the remaining issues with row security is how to pass plan
> invalidation information generated in the rewriter back into the planner.

I think I already asked this, but would it work to extract this info by walking the rewritten list of queries instead; and in case it would, would that be any easier than the API change you're proposing?

> We use Query structs throughout the rewriter and planner; it doesn't
> make sense to add a List* field for PlanInvalItem nodes and an Oid field
> for the user ID to the Query node when it's only ever going to get used
> for the top level Query node returned by the rewriter, and only for long
> enough to copy the data into PlannerGlobal.

So there is an assumption that you can't have a subquery that uses a different role ID than the main query. That sounds fine, and anyway I don't think we're prepared to deal with differing userids for subqueries, so the proposal that it belongs only on the top-level node is acceptable. And from there, it seems that not putting the info in Query (which would be a waste everywhere else than the toplevel query node) is sensible.

> The alternative seems to be changing the return type of QueryRewrite,
> introducing a new node type, say:
>
> struct RewriteResult {
>     Query *productQuery;
>     Oid planUserId;
>     List* planInvalItems;
> }
>
> This seems cleaner, and more extensible, but it means changing a fair
> bit of API, including:
>
> pg_plan_query
> planner
> standard_planner
> planner_hook_type
> QueryRewrite

I think we should just bite the bullet and do the change (a new struct, I assume, not a new node). It will cause an incompatibility to anyone that has written planner hooks, but probably the number of such hooks is not very large anyway.

I don't think we should base decisions on the amount of backpatching pain we cause, for patches that involve new functionality such as this one. We commit patches that will cause future merge conflicts all the time.

> I'm inclined to bite the bullet and make the API change. It'll be a
> pain, but I can see future uses for passing global info out of the
> rewriter rather than shoving it into per-Query structures. I'd define a
> RewriteResult and pass that down into all the rewriter internal
> functions, then return the outer query wrapped in it.

Is there already something in Query that could be a toplevel struct member only?

-- 
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Craig Ringer <craig@hobby.2ndquadrant.com> writes:
> One of the remaining issues with row security is how to pass plan
> invalidation information generated in the rewriter back into the planner.

> With row security, it's necessary to set a field in PlannerGlobal,
> tracking the user ID of the user the query was planned for if row
> security was applied. It is also necessary to add a PlanInvalItem for
> the user ID.

TBH I'd just add a user OID field in struct Query and not hack up a bunch of existing function APIs. It's not much worse than the existing constraintDeps field.

The PlanInvalItem could perfectly well be generated by the planner, no, if it has the user OID? But I'm not real sure why you need it. I don't see the reason for an invalidation triggered by user ID. What exactly about the *user*, and not something else, would trigger plan invalidation?

What we do need is a notion that a plan cache entry might only be valid for a specific calling user ID; but that's a matter for cache entry lookup not for subsequent invalidation.

regards, tom lane
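As an illustration of the simpler alternative suggested in this message, the sketch below puts a user OID field directly on the query struct and lets the planner derive everything else from it. All names are assumptions made for the example: `Query` is a simplified stand-in, `planUserId` is a hypothetical field name, and `InvalItemSketch`, `PlannerGlobalSketch` and `planner_record_user` are not PostgreSQL APIs. The invalidation-item step is shown only to illustrate that it can live in the planner; the message itself questions whether such an item is needed at all.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

#define MAX_INVAL_ITEMS 4

/* Simplified stand-in for Query; planUserId is the hypothetical new field. */
typedef struct Query
{
    int commandType;
    Oid planUserId;     /* InvalidOid unless row security was applied */
} Query;

/* Simplified stand-in for a plan invalidation item. */
typedef struct InvalItemSketch
{
    Oid userId;
} InvalItemSketch;

/* Simplified stand-in for the planner's global state. */
typedef struct PlannerGlobalSketch
{
    Oid             planUserId;
    InvalItemSketch invalItems[MAX_INVAL_ITEMS];
    int             numInvalItems;
} PlannerGlobalSketch;

/*
 * The planner reads the user OID from the top-level query and, if one is
 * set, builds the dependency itself. No function signature between the
 * rewriter and the planner has to change.
 */
void
planner_record_user(PlannerGlobalSketch *glob, const Query *topQuery)
{
    assert(glob != NULL && topQuery != NULL);

    glob->planUserId = topQuery->planUserId;
    if (topQuery->planUserId != InvalidOid &&
        glob->numInvalItems < MAX_INVAL_ITEMS)
    {
        glob->invalItems[glob->numInvalItems].userId = topQuery->planUserId;
        glob->numInvalItems++;
    }
}
```

Compared with a wrapper struct, this trades one mostly unused field on every Query node for leaving the existing function APIs untouched.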
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
On 03/06/2014 02:58 AM, Tom Lane wrote:
> Craig Ringer <craig@hobby.2ndquadrant.com> writes:
>> One of the remaining issues with row security is how to pass plan
>> invalidation information generated in the rewriter back into the planner.
>
>> With row security, it's necessary to set a field in PlannerGlobal,
>> tracking the user ID of the user the query was planned for if row
>> security was applied. It is also necessary to add a PlanInvalItem for
>> the user ID.
>
> TBH I'd just add a user OID field in struct Query and not hack up a bunch
> of existing function APIs. It's not much worse than the existing
> constraintDeps field.

If you're happy with that, I certainly won't complain. It's much simpler and less intrusive.

I should be able to post an update using this later today.

> The PlanInvalItem could perfectly well be generated by the planner,
> no, if it has the user OID? But I'm not real sure why you need it.
> I don't see the reason for an invalidation triggered by user ID.
> What exactly about the *user*, and not something else, would trigger
> plan invalidation?

It's only that the plan depends on the user ID. There's no point keeping the plan around if the user no longer exists.

You're quite right that this can be done in the planner when a dependency on the user ID is found, though. So there's no need to pass a PlanInvalItem down, which is a lot nicer.

> What we do need is a notion that a plan cache entry might only be
> valid for a specific calling user ID; but that's a matter for cache
> entry lookup not for subsequent invalidation.

Yes, that would be good, but is IMO more of a separate optimization. I'm currently using KaiGai's code to invalidate and re-plan when a user ID change is detected.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 03/06/2014 02:58 AM, Tom Lane wrote:
>> The PlanInvalItem could perfectly well be generated by the planner,
>> no, if it has the user OID? But I'm not real sure why you need it.
>> I don't see the reason for an invalidation triggered by user ID.
>> What exactly about the *user*, and not something else, would trigger
>> plan invalidation?

> It's only that the plan depends on the user ID. There's no point keeping
> the plan around if the user no longer exists.

[ shrug... ] Leaving such a plan cached would be harmless, though. Furthermore, the user ID we'd be talking about is either the owner of the current session, or the owner of some view or security-definer function that the plan is already dependent on, so it's fairly hard to credit that the plan would survive long enough for the issue to arise.

Even if there is a scenario where invalidating by user ID is actually useful, I think adding infrastructure to cause invalidation in such a case is optimizing for the wrong thing. You're adding cycles to every query to benefit a case that is going to be quite infrequent in practice.

>> What we do need is a notion that a plan cache entry might only be
>> valid for a specific calling user ID; but that's a matter for cache
>> entry lookup not for subsequent invalidation.

> Yes, that would be good, but is IMO more of a separate optimization. I'm
> currently using KaiGai's code to invalidate and re-plan when a user ID
> change is detected.

I'm unlikely to accept a patch that does that; wouldn't it be catastrophic for performance in the presence of security-definer functions? You can't just trash the whole plan cache when a user ID switch occurs.

regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Craig, Tom, all,

I've been through the RLS code over the past couple of days which I pulled from Craig's repo and have a bunch of minor updates. In general, the patch seems pretty reasonable- except for the issues discussed below. Quite a bit of this patch is tied up in plan invalidation and tracking if the security quals depend on the current user, all of which seems pretty grotty and the wrong way around to me.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
> > It's only that the plan depends on the user ID. There's no point keeping
> > the plan around if the user no longer exists.
>
> [ shrug... ] Leaving such a plan cached would be harmless, though.

Agreed.

> Furthermore, the user ID we'd be talking about is either the owner
> of the current session, or the owner of some view or security-definer
> function that the plan is already dependent on, so it's fairly hard
> to credit that the plan would survive long enough for the issue to
> arise.

I don't entirely follow which 'issue' is being referred to here, but we need to consider that 'set role' changes should also cause a new plan.

> Even if there is a scenario where invalidating by user ID is actually
> useful, I think adding infrastructure to cause invalidation in such a case
> is optimizing for the wrong thing. You're adding cycles to every query to
> benefit a case that is going to be quite infrequent in practice.

Yeah, I have a hard time seeing that there's an issue w/ keeping the cached plans around even if the session never goes back to being under the user ID for which those older plans were built. Also, wouldn't a 'RESET ALL' clear any of them anyway?

> > Yes, that would be good, but is IMO more of a separate optimization. I'm
> > currently using KaiGai's code to invalidate and re-plan when a user ID
> > change is detected.
>
> I'm unlikely to accept a patch that does that; wouldn't it be catastrophic
> for performance in the presence of security-definer functions? You can't
> just trash the whole plan cache when a user ID switch occurs.

Yeah, this doesn't seem like the right approach. Adding the user ID to the cache key definitely strikes me as the right way to fix this.

I've uploaded the latest patch, rebased against master, with my changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't believe it'd clear the mailing list (it's 29k).

I'll take a look at changing the cache key to include user ID and ripping out the plan invalidation logic from the current patch tomorrow but I seriously doubt I'll be able to get all of that done in the next day or two. If anyone else is able to help out, it'd certainly be appreciated; I really think that's the main hurdle to address at this point with this patch- without the plan invalidation complexity, the patch is really just building out the catalog, the SQL-level statements for managing it, and the bit of code required to add the conditional to statements involving RLS-enabled tables.

Thanks,

Stephen
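To make the "user ID in the cache key" idea concrete, here is a small sketch under stated assumptions. It is not PostgreSQL's plancache: `PlanCacheSketch`, `CachedPlanSketch`, `plan_cache_init` and `get_plan` are hypothetical names, the cache is a fixed-size array, and "planning" is reduced to bumping a counter. What it shows is that a user switch only causes a lookup miss for that user, while plans built for other users stay in the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;

#define CACHE_SLOTS 8

/* One cached plan; the lookup key is (stmt_id, plan_user). */
typedef struct CachedPlanSketch
{
    int in_use;
    int stmt_id;        /* which prepared statement this plan belongs to */
    Oid plan_user;      /* user the plan was built for */
    int plan_payload;   /* stand-in for the actual plan tree */
} CachedPlanSketch;

typedef struct PlanCacheSketch
{
    CachedPlanSketch slots[CACHE_SLOTS];
    int              plans_built;   /* counts how often we had to plan */
} PlanCacheSketch;

void
plan_cache_init(PlanCacheSketch *cache)
{
    memset(cache, 0, sizeof(*cache));
}

/*
 * Return the plan for (stmt_id, user), building one on a miss.
 * Entries for other users are never discarded by a lookup.
 * Returns NULL if the cache is full (eviction is omitted in this sketch).
 */
CachedPlanSketch *
get_plan(PlanCacheSketch *cache, int stmt_id, Oid user)
{
    CachedPlanSketch *free_slot = NULL;
    int i;

    assert(cache != NULL);
    for (i = 0; i < CACHE_SLOTS; i++)
    {
        CachedPlanSketch *slot = &cache->slots[i];

        if (slot->in_use && slot->stmt_id == stmt_id && slot->plan_user == user)
            return slot;                /* hit: same statement, same user */
        if (!slot->in_use && free_slot == NULL)
            free_slot = slot;           /* remember the first free slot */
    }

    if (free_slot == NULL)
        return NULL;

    free_slot->in_use = 1;
    free_slot->stmt_id = stmt_id;
    free_slot->plan_user = user;
    free_slot->plan_payload = ++cache->plans_built;   /* "plan" the query */
    return free_slot;
}
```

With this shape, a session that alternates between two users (for example through a security-definer function) plans each statement at most once per user instead of re-planning on every switch.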
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> I've uploaded the latest patch, rebased against master, with my changes
> to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't
> believe it'd clear the mailing list (it's 29k).

Please actually post it, for the archives' sake. 29k is far below the list limit. (Which I don't know exactly what it is ... but certainly in the hundreds of KB.)

> I'll take a look at changing the cache key to include user ID and
> ripping out the plan invalidation logic from the current patch tomorrow
> but I seriously doubt I'll be able to get all of that done in the next
> day or two.

TBH I think we are up against the deadline. April 15 was the agreed-to drop dead date for pushing new features into 9.4.

regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I've uploaded the latest patch, rebased against master, with my changes
> > to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't
> > believe it'd clear the mailing list (it's 29k).
>
> Please actually post it, for the archives' sake. 29k is far below the
> list limit. (Which I don't know exactly what it is ... but certainly
> in the hundreds of KB.)

Huh, thought it was more like 25k. Well, here goes then...

> > I'll take a look at changing the cache key to include user ID and
> > ripping out the plan invalidation logic from the current patch tomorrow
> > but I seriously doubt I'll be able to get all of that done in the next
> > day or two.
>
> TBH I think we are up against the deadline. April 15 was the agreed-to
> drop dead date for pushing new features into 9.4.

Yeah. :/ May be for the best anyway, this should be able to go in early in the 9.5 cycle and get more testing and refinement. Still stinks though as I feel like this patch didn't get the attention it should have due to a simple misunderstanding, but we do need to stop at some point to get a release together.

Thanks,

Stephen
Attachment
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
On 04/15/2014 10:06 AM, Stephen Frost wrote:
> I've uploaded the latest patch, rebased against master, with my
> changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz
> as I don't believe it'd clear the mailing list (it's 29k).

Does this exist in the form of an accessible git branch, too?

I was trying to maintain the patch as a series of distinct changes to make it easier to see what each part is doing, and it'd be nice to preserve that if possible. It also makes seeing what's changed a lot easier.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Craig Ringer (craig@2ndquadrant.com) wrote:
> On 04/15/2014 10:06 AM, Stephen Frost wrote:
> > I've uploaded the latest patch, rebased against master, with my
> > changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz
> > as I don't believe it'd clear the mailing list (it's 29k).
>
> Does this exist in the form of an accessible git branch, too?

Eh, no.

> I was trying to maintain the patch as a series of distinct changes to
> make it easier to see what each part is doing, and it'd be nice to
> preserve that if possible. It also makes seeing what's changed a lot
> easier.

Yeah, I almost just posted a patch against your tree. I'll look at doing that tomorrow.

Thanks,

Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Adam Brightwell
Date:
Hi all,
This is my first post to the mailing list and I am looking forward to working with everyone in the community.
With that said...
> I'll take a look at changing the cache key to include user ID and
> ripping out the plan invalidation logic from the current patch tomorrow
> but I seriously doubt I'll be able to get all of that done in the next
> day or two. If anyone else is able to help out, it'd certainly be
> appreciated; I really think that's the main hurdle to address at this
> point with this patch- without the plan invalidation complexity, the
> patch is really just building out the catalog, the SQL-level
> statements for managing it, and the bit of code required to add the
> conditional to statements involving RLS-enabled tables.
I have been collaborating with Stephen on addressing this particular item with RLS.
As a basis, I have been working with Craig's 'rls-9.4-upd-sb-views' branch rebased against master around 9.4beta1.
Through this effort, we have concluded that for RLS the case of invalidating a plan is only necessary when switching between a superuser and a non-superuser. Obviously, re-planning on every role change would be too costly, but this approach should help minimize that cost. As well, there were not any cases outside of this one that were immediately apparent with respect to RLS that would require re-planning on a per userid basis.
I have tested this approach with the following patch.
Does this sound like a sane approach? Thoughts? Recommendations?
Thanks,
Adam
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Adam Brightwell <adam.brightwell@crunchydatasolutions.com> writes:
> Through this effort, we have concluded that for RLS the case of
> invalidating a plan is only necessary when switching between a superuser
> and a non-superuser. Obviously, re-planning on every role change would be
> too costly, but this approach should help minimize that cost. As well,
> there were not any cases outside of this one that were immediately apparent
> with respect to RLS that would require re-planning on a per userid basis.

Hm ... I'm not following why we'd need a special case for superusers and not anyone else? Seems like any useful RLS scheme is going to require more privilege levels than just superuser and not-superuser.

Could we put the "if superuser then ok" test into the RLS condition test and thereby not need more than one plan at all?

regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
"Brightwell, Adam"
Date:
Hey Tom,
> Hm ... I'm not following why we'd need a special case for superusers and
> not anyone else? Seems like any useful RLS scheme is going to require
> more privilege levels than just superuser and not-superuser.
As it stands right now, superuser is the only case where RLS policies should not be applied, i.e. should be completely ignored. I suppose it is possible to create RLS policies that are related to other privilege levels, but those would still need to be applied regardless of user ID, except for superuser. I'll defer to Stephen or Craig on the usefulness of this scheme.
> Could we put the "if superuser then ok" test into the RLS condition test
> and thereby not need more than one plan at all?
As I understand it, the application of RLS policies occurs in the rewriter. Therefore, when switching back and forth between superuser and not-superuser the query must be rewritten, which would ultimately result in the need for a new plan, correct? If that is the case, then I am not sure how one plan is possible. However, again, I'll have to defer to Stephen or Craig on this one.
Thanks,
Adam
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
On 06/11/2014 02:19 AM, Tom Lane wrote:
> Hm ... I'm not following why we'd need a special case for superusers and
> not anyone else? Seems like any useful RLS scheme is going to require
> more privilege levels than just superuser and not-superuser.

What it really needs is to invalidate plans when switching between RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS exempt" right or mode sooner rather than later, so I'm against tying this explicitly to superuser as such.

I wouldn't be surprised to see

    SET ROW SECURITY ON|OFF

down the track, with a right controlling whether you can or not. Or at least, a right that directly exempts a user from row security.

> Could we put the "if superuser then ok" test into the RLS condition test
> and thereby not need more than one plan at all?

Only if we put it in another level of security barrier subquery, because otherwise the planner might execute the other quals (including possible user defined functions) before the superuser test. Which was the whole reason for the superuser test in the first place.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 06/11/2014 02:19 AM, Tom Lane wrote:
>> Could we put the "if superuser then ok" test into the RLS condition test
>> and thereby not need more than one plan at all?

> Only if we put it in another level of security barrier subquery, because
> otherwise the planner might execute the other quals (including possible
> user defined functions) before the superuser test. Which was the whole
> reason for the superuser test in the first place.

Is the point of that that the table owner might have put trojan-horse functions into the RLS qual? If so, why are we only concerned about defending the superuser and not other users? Seems like the right fix would be to insist that functions in the RLS qual run as the table owner. Granted, that might be painful to do. But it still seems like "we only need to do this for superusers" is designing with blinkers on.

regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
On 06/11/2014 07:24 AM, Tom Lane wrote:
> Is the point of that that the table owner might have put trojan-horse
> functions into the RLS qual? If so, why are we only concerned about
> defending the superuser and not other users? Seems like the right fix
> would be to insist that functions in the RLS qual run as the table owner.
> Granted, that might be painful to do. But it still seems like "we only
> need to do this for superusers" is designing with blinkers on.

I agree, and now that the urgency of trying to deliver this for 9.4 is over it's worth seeing if we can just run as table owner.

Failing that, we could take the approach a certain other RDBMS does and make the ability to define row security quals a GRANTable right initially held only by the superuser.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 06/11/2014 07:24 AM, Tom Lane wrote:
>> Is the point of that that the table owner might have put trojan-horse
>> functions into the RLS qual? If so, why are we only concerned about
>> defending the superuser and not other users? Seems like the right fix
>> would be to insist that functions in the RLS qual run as the table owner.
>> Granted, that might be painful to do. But it still seems like "we only
>> need to do this for superusers" is designing with blinkers on.

> I agree, and now that the urgency of trying to deliver this for 9.4 is
> over it's worth seeing if we can just run as table owner.

> Failing that, we could take the approach a certain other RDBMS does and
> make the ability to define row security quals a GRANTable right
> initially held only by the superuser.

Hmm ... that might be a workable compromise.

I think the main issue here is whether we expect that RLS quals will be something that the planner could optimize to any meaningful extent. If they're always (in effect) wrapped in SECURITY DEFINER functions, I think that largely blocks any optimizations; but maybe that wouldn't matter in practice.

regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 10, 2014 at 7:18 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 06/11/2014 02:19 AM, Tom Lane wrote:
>> Hm ... I'm not following why we'd need a special case for superusers and
>> not anyone else? Seems like any useful RLS scheme is going to require
>> more privilege levels than just superuser and not-superuser.
>
> What it really needs is to invalidate plans when switching between
> RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS
> exempt" right or mode sooner rather than later, so I'm against tying
> this explicitly to superuser as such.
>
> I wouldn't be surprised to see
>
> SET ROW SECURITY ON|OFF
>
> down the track, with a right controlling whether you can or not. Or at
> least, a right that directly exempts a user from row security.

I'm really concerned about the security implications of this patch. I think we're setting ourselves up for a whole lot of hurt for somewhat unclear gain.

In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically *is* row-level security: instead of applying a row-level security policy to a table, just create a security-barrier view over the table and grant access to the view. Forget that the table ever existed. Done.

With this approach, there's a lot of stuff that we don't have to reinvent. We've talked a lot about whether row-level security should only be concerned with the rows it scans, or whether it should also restrict the new rows that can be created. You can get either behavior by choosing whether or not to use WITH CHECK OPTION. And then there's this question of who should be RLS-exempt; that's basically a question of to whom you grant privileges on the underlying table. Note that this can be very fine-grained: for example, you can allow someone to exempt themselves for selects but not for updates by granting them SELECT privileges but not UPDATE privileges on the underlying table. And potentially-exempt users can choose whether they want a particular access to actually be exempt by targeting the view when they don't want to be exempt and the table when they do. That's mighty useful for debugging, at least IMHO. And, if you want to have several row-level security policies for different classes of users, just create more than one view and grant different privileges on each.

By contrast, it seems to me that every design so far proposed for something that is actually called row-level security - as opposed to commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* row-level security - is extremely limited. Look back at all the things listed in the previous paragraph; can you do those things easily with the designs that have been proposed? As far as I can see, not really. Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough equivalent of WITH CHECK OPTION, probably because we've talked a lot about that specific issue, but it doesn't line up exactly to what WITH CHECK OPTION actually does. There's no independently-grantable RLS-exemption privilege - and even when we talk about that, it's usually some kind of global bit that applies to all tables and all operations equally - whereas with the above approach it can be per-table and per-operation and doesn't require superuser intervention to flip the bit. There's no way for users who are RLS exempt to turn off their exemption for testing purposes, let alone on a per-table basis. There's no way to have multiple RLS policies on a single table. All of those are things that we get "for free" in the view-over-table model, and implementing formal RLS basically requires us to either invent a new RLS-specific way of doing each of those things, or suffer along with a subset of the functionality. Yuck.

But what's really awful about this whole design is that it breaks the invariant that reading from a table doesn't run anybody else's code. It's already the case that users need to be awfully careful about modifying tables, because that might fire triggers that do bad things. But at least you can SELECT from a table and it will either work, or it will fail with a permission denied error. What it will not do is unexpectedly run some code that you weren't expecting it to run. You can't be so blithe about selecting from views, but reading a plain table is always OK. Now, as soon as we introduce the concept that selecting from a table might not really mean "read from the table" but "read from the table after applying this owner-specified qual", we're opening up a whole new set of attack surfaces. Every pg_dump is an opportunity to hack somebody else's account, or at least audit their activity. Protecting the superuser against everybody else is nice, but I think it's just as important to protect non-superusers against each other, and I think that's going to be hard -- because in the RLS world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's reading from the table, and maybe it's really clandestinely reading from a view over the table, and the user has no way of being really clear about which behavior they want. From a security point of view, that seems very bad.

To recap:

1. Reinventing RLS-specific ways to do all of the things that can already be done in the view-over-table model is a lot of work.

2. There's a danger that the functionality available in the two models will diverge, so that certain things can only be done in one world or the other.

3. On the whole, it seems likely that the RLS-specific world will remain impoverished compared to the view-over-table model.

4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Craig Ringer (craig@2ndquadrant.com) wrote: > On 06/11/2014 02:19 AM, Tom Lane wrote: > > Hm ... I'm not following why we'd need a special case for superusers and > > not anyone else? Seems like any useful RLS scheme is going to require > > more privilege levels than just superuser and not-superuser. > > What it really needs is to invalidate plans when switching between > RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS > exempt" right or mode sooner rather than later, so I'm against tying > this explicitly to superuser as such. That certainly sounds reasonable to me, but the point is we're just looking to see if the current role executing the plan should or should not have RLS applied and, if it's changing, we need to re-plan. We don't need to actually track an independent plan for each and every user executing the plan, which means that the plan cache can be largely left alone. > I wouldn't be surprised to see > > SET ROW SECURITY ON|OFF > > down the track, with a right controlling whether you can or not. Or at > least, a right that directly exempts a user from row security. Agreed, but doing a re-planning in that case seems reasonable to me. I find it pretty unlikely that there will be a lot of critical path cases of the same plan flipping back and forth between a role for which RLS is applied and a role where it shouldn't be. > > Could we put the "if superuser then ok" test into the RLS condition test > > and thereby not need more than one plan at all? > > Only if we put it in another level of security barrier subquery, because > otherwise the planner might execute the other quals (including possible > user defined functions) before the superuser test. Which was the whole > reason for the superuser test in the first place. Yeah, I'm not a big fan of this and it certainly seems a simpler approach to just force a re-plan. 
We're talking about a query which has been prepared and then is being executed by different roles, some of which are RLS-enabled and some of which are RLS-exempt. That just strikes me as pretty unlikely to happen and, if it does become an issue, a user could work around it by having two different plans prepared and making sure that they are called from the appropriate roles to avoid the replanning. Thanks, Stephen
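The re-plan rule discussed in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL's plan cache code: `CachedPlanSketch`, `RoleSketch`, `plan_needs_replan` and `replan_for_role` are hypothetical names, and the `rls_exempt` flag stands in for whatever test (superuser, or a future grantable right) ends up deciding exemption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cached plan; not a PostgreSQL struct. */
typedef struct CachedPlanSketch
{
    bool is_valid;          /* false until the query has been planned */
    bool planned_with_rls;  /* were RLS quals applied at plan time? */
} CachedPlanSketch;

/* Hypothetical role descriptor; how exemption is decided is an assumption. */
typedef struct RoleSketch
{
    unsigned int role_id;
    bool rls_exempt;        /* e.g. superuser, or a future "RLS exempt" right */
} RoleSketch;

/*
 * A re-plan is needed only when the RLS-applied state differs from what the
 * plan was built for. The role's identity itself is never compared, so one
 * plan serves every non-exempt role.
 */
bool
plan_needs_replan(const CachedPlanSketch *plan, const RoleSketch *role)
{
    bool want_rls = !role->rls_exempt;

    if (!plan->is_valid)
        return true;
    return plan->planned_with_rls != want_rls;
}

/* Record which RLS state the (re)built plan corresponds to. */
void
replan_for_role(CachedPlanSketch *plan, const RoleSketch *role)
{
    plan->planned_with_rls = !role->rls_exempt;
    plan->is_valid = true;
}
```

With this shape, switching between two ordinary roles never forces a re-plan; only crossing the exempt/non-exempt boundary does, which is the case argued above to be rare.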
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Craig Ringer (craig@2ndquadrant.com) wrote: > On 06/11/2014 07:24 AM, Tom Lane wrote: > > Is the point of that that the table owner might have put trojan-horse > > functions into the RLS qual? If so, why are we only concerned about > > defending the superuser and not other users? Seems like the right fix > > would be to insist that functions in the RLS qual run as the table owner. > > Granted, that might be painful to do. But it still seems like "we only > > need to do this for superusers" is designing with blinkers on. > > I agree, and now that the urgency of trying to deliver this for 9.4 is > over it's worth seeing if we can just run as table owner. We'll need to work out how to ensure that things like current_user() still returns the calling user in that case, otherwise it won't make any sense. In general, I agree that having the RLS quals run as the table owner is a good approach and would love to hear suggestions about how we can make that happen. > Failing that, we could take the approach a certain other RDBMS does and > make the ability to define row security quals a GRANTable right > initially held only by the superuser. I don't particularly like this idea- it's akin, to me anyway, to making the ability to control other permissions on a table (SELECT, INSERT, etc) something which a user would have to be granted- and it doesn't really address the issue. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote: > I'm really concerned about the security implications of this patch. I > think we're setting ourselves up for a whole lot of hurt for somewhat > unclear gain. I'm certainly of a different opinion and, for the most part, I feel that if there are security concerns then they need to be addressed- and better by us than by asking users to use some other mechanism to implement RLS. > In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically > *is* row-level security: instead of applying a row-level security > policy to a table, just create a security-barrier view over the table > and grant access to the view. Forget that the table ever existed. > Done. This argument could have been made for column-level privileges also, no? Yet I don't hear any calls for that to be ripped out now that you could implement it through updatable security-barrier views. That commit was the ground-work to allow us to finally get proper RLS and I'm very disappointed to hear that the mechanical pieces around making RLS easy for users to use (and getting that check-box taken care of in a wide variety of fields that we are being exposed to now, see the PGConf.NYC keynote speakers...) is receiving such push-back. > With this approach, there's a lot of stuff that we don't have to > reinvent. We've talked a lot about whether row-level security should > only be concerned with the rows it scans, or whether it should also > restrict the new rows that can be created. You can get either > behavior by choosing whether or not to use WITH CHECK OPTION. And > then there's this question of who should be RLS-exempt; that's > basically a question of to whom you grant privileges on the underlying > table. Note that this can be very fine-grained: for example, you can > allow someone to exempt themselves for selects but not for updates by > granting them SELECT privileges but not UPDATE privileges on the > underlying table. 
And potentially-exempt users can choose whether > they want a particular access to actually be exempt by targeting the > view when they don't want to be exempt and the table when they do. I agree that views, or even security-definer functions, offer a great deal of flexibility, and that may be necessary in some use-cases, but I fail to see why that means we should avoid providing the mechanics to achieve simple and usable RLS akin to what other major RDBMS's have. > That's mighty useful for debugging, at least IMHO. And, if you want > to have several row-level security policies for different classes of > users, just create more than one view and grant different privileges > on each. I'm really not impressed with the idea that RLS should be done with multiple different views of the same underlying table. > By contrast, it seems to me that every design so far proposed for > something that is actually called row-level security - as opposed to > commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* > row-level security, is extremely limited. Look back at all the things > listed in the previous paragraph; can you do those things easily with > the designs that have been proposed? As far as I can see, not really. I don't feel that RLS will, or even *should*, have the same level of flexibility that you can achieve with views and/or security definer functions. I expect that, over time, we will add more capabilities to it, but it's never going to be able to redefine the contents of a column as a view can, nor will it be able to add columns to a table as views can. I don't see those as reasons against having support for RLS. > Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough > equivalent of WITH CHECK OPTION, probably because we've talked a lot > about that specific issue, but it doesn't line up exactly to what WITH > CHECK OPTION actually does. 
There's no independently-grantable > RLS-exemption privilege - and even when we talk about that, it's > usually some kind of global bit that applies to all tables and all > operations equally - whereas with the above approach it can be > per-table and per-operation and doesn't require superuser intervention > to flip the bit. I'm glad to hear your thoughts on the level of granularity which might be nice to have with RLS. What would be great is to spend a bit more time reviewing what other systems provide in this area and considering what makes sense for us. This will also be a feature and an area which we will be improving for a long time to come, but we do need this capability and we have to start somewhere. > There's no way for users who are RLS exempt to turn > off their exemption for testing purposes, let alone on a per-table > basis. I don't follow this argument entirely- users can't turn off the existing permissions system for testing either, unless an authorized user with the correct permissions makes the change to allow it- or the user bumps themselves up to superuser, or to a role which has broader permissions, both of which would also be possible to do with RLS. > There's no way to have multiple RLS policies on a single > table. All of those are things that we get "for free" in the > view-over-table model, and implementing formal RLS basically requires > us to either invent a new RLS-specific way of doing each of those > things, or suffer along with a subset of the functionality. Yuck. What would probably be good is to review the use-cases which the current patch already addresses- and we've had good responses from actual users who are already playing with the patch and are hearing that it is addressing their requirements. > But what's really awful about this whole design is that it breaks the > invariant that reading from a table doesn't run anybody else's code. You're suggesting that we use views instead, which clearly could run someone else's code. 
Perhaps the user will notice that they're selecting from a view instead of a table, but I've never seen a security design around making sure that what is being select'd from is a table vs. a view. Have you seen applications which implement such a check prior to running a query? > It's already the case that users need to be awfully careful about > modifying tables, because that might fire triggers that do bad things. > But at least you can SELECT from a table and it will either work, or > it will fail with a permission denied error. What it will not do is > unexpectedly run some code that you weren't expecting it to run. You > can't be so blithe about selecting from views, but reading a plain > table is always OK. Now, as soon as we introduce the concept that > selecting from a table might not really mean "read from the table" but > "read from the table after applying this owner-specified qual", we're > opening up a whole new set of attack surfaces. With this, I agree, there is risk associated with the implementation we're looking at for RLS. We could narrow the case by reducing the capabilities of RLS in PG by only allowing certain functions to be used in the definition of a RLS policy (eg: btree operators of known data types, or something similar to our "leak-proof" attribute), but I don't see that it really buys us much. There are a *lot* of ways in which an individual who has the ability to create objects inside the database can cause problems, but that comes with the flexibility we provide users with. That will always be a balance but, I believe, we wouldn't have the same level of success or have such an awesome system without that flexibility. > Every pg_dump is an > opportunity to hack somebody else's account, or at least audit their > activity. 
Protecting the superuser against everybody else is nice, > but I think it's just as important to protect non-superusers against > each other, and I think that's going to be hard -- because in the RLS > world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's > reading from the table, and maybe it's really clandestinely reading > from a view over the table, and the user has no way of being really > clear about which behavior they want. From a security point of view, > that seems very bad. I don't see this as being an insurmountable issue. I agree that having a way for pg_dump to run safely is important and the superuser check does address that, given that we don't have a "read-only (and everything)" capability today. Once we do (and I surely hope that will come sooner rather than later), such a role should also have the 'no RLS' bit, as it wouldn't make any sense for such a role anyway. The lack of that is not a strike against RLS though. > To recap: > > 1. Reinventing RLS-specific ways to do all of the things that can > already be done in the view-over-table model is a lot of work. I agree that there's a fair bit of work involved, but I do not see reimplementing views as RLS as the goal. > 2. There's a danger that the functionality available in the two models > will diverge, so that certain things can only be done in one world or > the other. They will always be distinct, intentionally so. > 3. On the whole, it seems likely that the RLS-specific world will > remain impoverished compared to the view-over-table model. Agreed. As is the case with views vs. security definer functions. > 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield. While I agree that we need to consider this, I don't think it will be a "minefield", but rather something we need to document and educate our users about. If you'd like a "disable-all-RLS" GUC, I'm all for it. 
Tossing out any hope of having RLS in PG is tossing the baby out with the bathwater though, imv. Thanks, Stephen
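For readers following the view-over-table argument quoted in this message, the per-operation exemption it describes can be modeled with a small C sketch. Everything here is hypothetical and purely illustrative (`GrantsSketch`, `resolve_access`, and the two privilege bits are made-up names); it models only the idea that a grant on the underlying table gives unfiltered access for that operation, while a grant on the view gives filtered access.

```c
#include <stdbool.h>

/* Hypothetical privilege bits; real PostgreSQL ACLs are richer than this. */
enum { PRIV_SELECT = 1, PRIV_UPDATE = 2 };

typedef enum AccessSketch
{
    ACCESS_DENIED,      /* permission denied */
    ACCESS_FILTERED,    /* rows pass through the view's qual */
    ACCESS_UNFILTERED   /* direct access to the underlying table */
} AccessSketch;

/* Privileges one role holds on the base table and on the view over it. */
typedef struct GrantsSketch
{
    unsigned int on_table;
    unsigned int on_view;
} GrantsSketch;

/*
 * The caller chooses the target explicitly: naming the view means "apply the
 * policy", naming the table means "bypass it", and each choice is checked
 * against the matching set of grants.
 */
AccessSketch
resolve_access(const GrantsSketch *grants, unsigned int op, bool target_is_view)
{
    if (target_is_view)
        return (grants->on_view & op) ? ACCESS_FILTERED : ACCESS_DENIED;
    return (grants->on_table & op) ? ACCESS_UNFILTERED : ACCESS_DENIED;
}
```

A role granted SELECT on the table but only UPDATE on the view would, in this model, read unfiltered rows and update only through the filter, which is the "exempt for selects but not for updates" case from the quoted text.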
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes: > * Robert Haas (robertmhaas@gmail.com) wrote: >> I'm really concerned about the security implications of this patch. I >> think we're setting ourselves up for a whole lot of hurt for somewhat >> unclear gain. > I'm certainly of a different opinion and, for the most part, I feel that > if there are security concerns then they need to be addressed- and > better by us than by asking users to use some other mechanism to > implement RLS. TBH, I found Robert's argument pretty persuasive. The idea that "SELECT * FROM table" might invoke arbitrary processing ought to scare anyone who's concerned about security, because that's going to completely break any assumptions about pg_dump being safe for instance, as well as force top-to-bottom rethinking of many other security assumptions. > ... That commit was > the ground-work to allow us to finally get proper RLS and I'm very > disappointed to hear that the mechanical pieces around making RLS easy > for users to use (and getting that check-box taken care of in a wide > variety of fields that we are being exposed to now, see the PGConf.NYC > keynote speakers...) is receiving such push-back. If this is being sold as merely "ease of use", then it is probably going to get rejected. In order to get some extra ease of use for the minority of users who need RLS, you are going to significantly complicate the lives of all Postgres users. That's not a net win in any sane calculation of ease of use. Maybe the right thing to think about is how we can make it easier to set up table + view combinations according to the pattern Robert described. I wouldn't have a problem with some more-or-less-automated support for doing that. (Consider SERIAL as a possible precedent here: it's basically a table creation macro.) > You're suggesting that we use views instead, which clearly could run > someone else's code. 
Perhaps the user will notice that they're > selecting from a view instead of a table, but I've never seen a security > design around making sure that what is being select'd from is a table > vs. a view. pg_dump is a sufficient counterexample to that statement. regards, tom lane
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost <sfrost@snowman.net> wrote: >> In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically >> *is* row-level security: instead of applying a row-level security >> policy to a table, just create a security-barrier view over the table >> and grant access to the view. Forget that the table ever existed. >> Done. > > This argument could have been made for column-level privileges also, no? Not really. First of all, we didn't have security_barrier views at that time, let alone security barrier views that are auto-updateable. That's a really important piece of technology which makes filtering access via views feasible in ways that really were not feasible in the past. Secondly, column-level permissions - like every other currently-existing type of permissions - are declarative. They are an additional opportunity for the system to say "no" to something it otherwise would have allowed, but no user-defined code is executed. Row-level security is not a chance for the system to deny access; it's a chance for user-defined code to take control and perform arbitrary operations. So the scope of what we're contemplating for row-level security is really far, far more invasive than what we did for column-level privileges. > I agree that views, or even security-definer functions, offer a great > deal of flexibility, and that may be necessary in some use-cases, but I > fail to see why that means we should avoid providing the mechanics to > achieve simple and usable RLS akin to what other major RDBMS's have. Because we don't have a good design. I'm not categorically opposed to adding more RLS features to PostgreSQL and never have been; in fact, I was deeply involved in the original design of security barrier views and committed the original patch to add that functionality to PostgreSQL, without which none of what we're talking about here would be possible. 
But the currently-proposed design is very unappealing to me, for the reasons that I've explained. The right answer to "this feature doesn't provide anything that we don't already have and will introduce major new security exposures that haven't been given adequate thought" is debatable, but "well other people have this so we should too" is definitely not it. Craig's patch really hasn't grappled with any of these thorny definition and security issues; it's just about making the basic functionality work. That's fine for a POC, but it's not enough for a feature that the project would be committing to maintain for the indefinite future. >> That's mighty useful for debugging, at least IMHO. And, if you want >> to have several row-level security policies for different classes of >> users, just create more than one view and grant different privileges >> on each. > > I'm really not impressed with the idea that RLS should be done with > multiple different views of the same underlying table. Are you equally unimpressed with the idea that RLS as proposed can't support more than one security policy right now *at all*? Because it seems to me that either you think multiple RLS policies on a single table is important (in which case the current patch is inadequate) or you think it's not important (in which case we need not argue about whether doing it with multiple views over the same underlying table is awkward). >> By contrast, it seems to me that every design so far proposed for >> something that is actually called row-level security - as opposed to >> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* >> row-level security, is extremely limited. Look back at all the things >> listed in the previous paragraph; can you do those things easily with >> the designs that have been proposed? As far as I can see, not really. 
> > I don't feel that RLS will, or even *should*, have the same level of > flexibility that you can achieve with views and/or security definer > functions. I expect that, over time, we will add more capabilities to > it, but it's never going to be able to redefine the contents of a column > as a view can, nor will it be able to add columns to a table as views > can. I don't see those as reasons against having support for RLS. What this patch is doing is basically allowing a table to really be a view over itself. If we choose to support that, I think it is absolutely inevitable that people are going to want all the same options that they would have if they really made a separate view - separate permissions, WITH CHECK OPTION, all of it. I find the contrary argument - that people will only want X amount and no more - simply not plausible. If it's valuable to have some of those capabilities in an RLS framework, somebody's going to want all of them. There's no bright line to divide the things that are valuable in that context from those that aren't. > I'm glad to hear your thoughts on the level of granularity which might > be nice to have with RLS. What would be great is to spend a bit more > time reviewing what other systems provide in this area and considering > what makes sense for us. This will also be a feature and an area which > we will be improving for a long time to come, but we do need this > capability and we have to start somewhere. I think this is definitely important. I also think that we should be careful to study the deficiencies in those other systems and to clearly call out what value the capabilities we're thinking of adding to PostgreSQL 9.5 have over the status quo in PostgreSQL 9.4. I'm not so much arguing that we shouldn't have row-level security as that, in every way that's really meaningful, we already do. >> There's no way for users who are RLS exempt to turn >> off their exemption for testing purposes, let alone on a per-table >> basis. 
> > I don't follow this argument entirely- users can't turn off the existing > permissions system for testing either, unless an authorized user with > the correct permissions makes the change to allow it- or the user bumps > themselves up to superuser, or to a role which has broader permissions, > both of which would also be possible to do with RLS. Sure, but in the existing system, the query either returns the same results for everybody, or it fails outright with an error. It's certainly possible to screw up the existing permissions, but this new thing that's being proposed is much more complicated, because it's not just whether it works that's at issue, but what results you actually get. >> There's no way to have multiple RLS policies on a single >> table. All of those are things that we get "for free" in the >> view-over-table model, and implementing formal RLS basically requires >> us to either invent a new RLS-specific way of doing each of those >> things, or suffer along with a subset of the functionality. Yuck. > > What would probably be good is to review the use-cases which the current > patch already addresses- and we've had good responses from actual users > who are already playing with the patch and are hearing that it is > addressing their requirements. Yes. And in particular, I think we should have a much clearer statement than we currently do about the use cases in which it falls short. >> But what's really awful about this whole design is that it breaks the >> invariant that reading from a table doesn't run anybody else's code. > > You're suggesting that we use views instead, which clearly could run > someone else's code. Perhaps the user will notice that they're > selecting from a view instead of a table, but I've never seen a security > design around making sure that what is being select'd from is a table > vs. a view. Have you seen applications which implement such a check > prior to running a query? Yes. pg_dump, to name one really important one. 
I wouldn't be surprised if graphical clients did something similar - display the table data for a table, or the view definition for a view. But I admit to not having checked that. More than that, if I were a DBA, I'd certainly be darn careful about selecting from untrusted views, but I expect to be able to read a table, or run pg_dump, without getting my account hacked. > With this, I agree, there is risk associated with the implementation > we're looking at for RLS. We could narrow the case by reducing the > capabilities of RLS in PG by only allowing certain functions to be used > in the definition of a RLS policy (eg: btree operators of known data > types, or something similar to our "leak-proof" attribute), but I don't > see that it really buys us much. There are a *lot* of ways in which an > individual who has the ability to create objects inside the database can > cause problems, but that comes with the flexibility we provide users > with. That will always be a balance but, I believe, we wouldn't have > the same level of success or have such an awesome system without that > flexibility. I don't think restricting what can go into an RLS policy is the right answer; that to me misses the point. What needs to be restricted is the possibility that a user will inadvertently run code they didn't mean to run. >> Every pg_dump is an >> opportunity to hack somebody else's account, or at least audit their >> activity. Protecting the superuser against everybody else is nice, >> but I think it's just as important to protect non-superusers against >> each other, and I think that's going to be hard -- because in the RLS >> world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's >> reading from the table, and maybe it's really clandestinely reading >> from a view over the table, and the user has no way of being really >> clear about which behavior they want. From a security point of view, >> that seems very bad. 
> > I don't see this as being an insurmountable issue. I agree that having > a way for pg_dump to run safely is important and the superuser check > does address that, given that we don't have a "read-only (and > everything)" capability today. Once we do (and I surely hope that will > come sooner rather than later), such a role should also have the 'no > RLS' bit, as it wouldn't make any sense for such a role anyway. The > lack of that is not a strike against RLS though. It addresses running pg_dump *as the superuser*, but not as a database owner or just a regular user. If unprivileged user A runs pg_dump -t some_table_owned_by_user_b, and falls victim to a Trojan horse, that is going to get reported as a security defect in PostgreSQL. Telling the person who reports that issue that it's design behavior is not going to make them happy, or result in good press coverage for PostgreSQL. >> 2. There's a danger that the functionality available in the two models >> will diverge, so that certain things can only be done in one world or >> the other. > > They will always be distinct, intentionally so. I think that's an absolutely terrible idea. We do not want to be in the business of having two parallel systems with slightly different capabilities and syntax that are providing the same fundamental functionality. And they are: the proposal for RLS is to make it work just like a security_barrier view, sharing a common implementation. >> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield. > > While I agree that we need to consider this, I don't think it will be a > "minefield", but rather something we need to document and educate our > users about. If you'd like a "disable-all-RLS" GUC, I'm all for it. I would definitely like that. I have proposed it in the past. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Robert Haas (robertmhaas@gmail.com) wrote: > >> I'm really concerned about the security implications of this patch. I > >> think we're setting ourselves up for a whole lot of hurt for somewhat > >> unclear gain. > > > I'm certainly of a different opinion and, for the most part, I feel that > > if there are security concerns then they need to be addressed- and > > better by us than by asking users to use some other mechanism to > > implement RLS. > > TBH, I found Robert's argument pretty persuasive. The idea that > "SELECT * FROM table" might invoke arbitrary processing ought to scare > anyone who's concerned about security, because that's going to completely > break any assumptions about pg_dump being safe for instance, as well as > force top-to-bottom rethinking of many other security assumptions. SELECT triggers for a wide variety of use-cases are pretty commonly asked for here and are something I'd like to see us support also. There are also quite a few ways in which a select can end up executing code. Today it requires more than 'select * from table;', but not very much.. I agree that it'd be good if we had a way to address that but I continue to view that as an independent issue. What I haven't heard any comments on, yet found interesting, was the idea of having the RLS quals run as the owner of the table. Would that address these concerns? I seem to recall wondering why we didn't do that for views in the first place, though I doubt we could change it now even if we wanted to (and I'm guessing the spec has something to say about this, though I haven't gone and looked and don't remember offhand). It's certainly rather curious that functions called under a view are run as the calling user while permissions checks on relations referred to by the view are as the view owner. 
Hopefully that will make the rest of this discussion less relevant, but I'll respond with my feelings anyway. > > ... That commit was > > the ground-work to allow us to finally get proper RLS and I'm very > > disappointed to hear that the mechanical pieces around making RLS easy > > for users to use (and getting that check-box taken care of in a wide > > variety of fields that we are being exposed to now, see the PGConf.NYC > > keynote speakers...) is receiving such push-back. > > If this is being sold as merely "ease of use", then it is probably going > to get rejected. In order to get some extra ease of use for the minority > of users who need RLS, you are going to significantly complicate the lives > of all Postgres users. That's not a net win in any sane calculation of > ease of use. I don't view this as being at all accurate- how is this complicating the lives of all Postgres users? If they are worried about running user defined code then they *already* have a lot to worry about. While the users of RLS might be less than 50% and therefore the minority, I expect it will have quite a bit of up-take in certain industries and I know that our lack of any RLS is currently preventing use of Postgres in some rather important cases. As for it being ease-of-use, again, there are ways in which column level privileges could have been dealt with using views, rules, security definer functions, etc, but that doesn't mean we don't want that feature. I certainly view RLS (and have for quite some time..) as a much needed capability, even if it can be done today using a bunch of user written code that must be security audited. > Maybe the right thing to think about is how we can make it easier to set > up table + view combinations according to the pattern Robert described. While this sounds interesting, I don't see adding columns or redefining them as being in the purview of RLS. 
The current approach of allowing a boolean expression to be defined is both extremely flexible while also being simple when the requirement is simple. Having to create, manage, update, etc, an independent object would add unnecessary complexity. Perhaps having it be a boolean expression is too much flexibility but the alternatives that I can think of aren't terribly attractive to me and the boolean expression approach is what folks coming from other RDBMS's will be familiar with and understand how to build their applications around. We may need to provide some additional pieces around this (perhaps a trigger-like function type which also gets information about the object being queried, etc) but the point is to have a straight-forward and simply reasoned about way of limiting what data is returned. > I wouldn't have a problem with some more-or-less-automated support for > doing that. (Consider SERIAL as a possible precedent here: it's basically > a table creation macro.) Perhaps there's a way to make that work, but personally it looks like a whole bunch more work and I don't see the gain. How would adding RLS to an existing table work? It's worse than the SERIAL case as at least a default clause can be added later without impacting the application code. Would the functions referenced through such a view run as the user of the view? > > You're suggesting that we use views instead, which clearly could run > > someone else's code. Perhaps the user will notice that they're > > selecting from a view instead of a table, but I've never seen a security > > design around making sure that what is being select'd from is a table > > vs. a view. > > pg_dump is a sufficient counterexample to that statement. No, it isn't. pg_dump's defined purpose is explicitly to pull out the data contents underneath, or the definition of the object, which means it happens to issue explicit select * from table's (or COPY commands) for tables and pull the view definition for views. 
There's no way to even ask it to dump out the contents of a view (rather than the definition of it). I don't consider that a security design which checks whether the object *that we're asking to select the contents of* is a view or a table in order to avoid calling user-defined code. I agree that pg_dump takes many precautions to avoid running user code in a way which could be dangerous, both to avoid security issues and because its goal is to reproduce the system exactly as it was, and running user code would likely cause problems for that. I still do not buy this argument that individuals or applications pay much more attention to selecting from views than they do selecting from tables, or generally go out of their way to try and avoid running user defined code (indeed, much of the point is to be able to add such things without having to change the application around..). We care about these issues a great deal in pg_dump, rightfully, but psql, pgAdmin3, Perl DBD/DBI, libpq-using applications, etc, etc, have no mechanism to say "give me just the data and only the data and don't run any user-defined code". Adding that capability might be interesting if we can figure out how exactly to define it but it's still an orthogonal issue to RLS, imv. Thanks, Stephen
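The pg_dump behavior being argued over here (read the rows of a table, but only emit the definition of a view) amounts to a dispatch on the relation kind. A minimal C sketch, assuming the `pg_class.relkind` codes `'r'` for an ordinary table and `'v'` for a view; `DumpActionSketch` and `choose_dump_action` are hypothetical names, and the real pg_dump handles many more relation kinds than shown.

```c
/* Hypothetical outcome of deciding how to dump one relation. */
typedef enum DumpActionSketch
{
    DUMP_TABLE_DATA,       /* read the rows, e.g. via COPY or SELECT * */
    DUMP_DEFINITION_ONLY,  /* emit the defining query, never read through it */
    DUMP_SKIP              /* kinds this sketch does not model */
} DumpActionSketch;

/*
 * Decide the action from the relation kind alone. Under the assumption that
 * reading a plain table runs no owner-defined code, only the 'r' branch ever
 * selects rows; a view's query is dumped as text and never executed.
 */
DumpActionSketch
choose_dump_action(char relkind)
{
    switch (relkind)
    {
        case 'r':
            return DUMP_TABLE_DATA;
        case 'v':
            return DUMP_DEFINITION_ONLY;
        default:
            return DUMP_SKIP;
    }
}
```

The disagreement in the thread is about the comment on the `'r'` branch: once a table can carry an owner-specified qual, that assumption no longer holds without some additional switch.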
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost <sfrost@snowman.net> wrote: > > This argument could have been made for column-level privileges also, no? > > Not really. First of all, we didn't have security_barrier views at > that time, let alone security barrier views that are auto-updateable. We had security definer functions, even set-returning ones, along with rules and triggers. > That's a really important piece of technology which makes filtering > access via views feasible in ways that really were not feasible in the > past. Secondly, column-level permissions - like every other > currently-existing type of permissions - are declarative. They are an > additional opportunity for the system to say "no" to something it > otherwise would have allowed, but no user-defined code is executed. We could try to avoid calling user-defined code for RLS, but it'd add a whole lot of complexity and as far as I can see and your proposed solution isn't avoiding the user-defined code anyway, so I'm not sure why this solution should be required to meet that. > Row-level security is not a chance for the system to deny access; it's > a chance for user-defined code to take control and perform arbitrary > operations. So the scope of what we're contemplating for row-level > security is really far, far more invasive than what we did for > column-level privileges. In this case the user-defined code needs to return a boolean. We don't currently do anything to prevent it from having side-effects, no, but the same is true with views which incorporate functions. I agree that it makes a difference when compared to column-level privileges, but my point was that we have provided easier ways to do things which were possible using more complicated methods before. 
Perhaps the risk with RLS is higher but these issues look manageable to me and the level of doubt about our ability to provide this feature in a reasonable and principled way that our users will understand surprises me. > > I agree that views, or even security-definer functions, offer a great > > deal of flexibility, and that may be necessary in some use-cases, but I > > fail to see why that means we should avoid providing the mechanics to > > achieve simple and usable RLS akin to what other major RDBMS's have. > > Because we don't have a good design. We're using a design that's found in multiple other RDBMS's and used extensively by certain industries which use those RDBMS's today. I'm certainly open to improving what is found in other systems for PG but I have a hard time seeing this approach as a bad design. Perhaps you're referring to our implementation, in which case I might agree and things like running the quals as the table owner is something which should be considered (I don't know how the other RDBMS's operate in this regard offhand- it'd be good to find out). > I'm not categorically opposed to adding more RLS features to > PostgreSQL and never have been; in fact, I was deeply involved in the > original design of security barrier views and committed the original > patch to add that functionality to PostgreSQL, without which none of > what we're talking about here would be possible. But the > currently-proposed design is very unappealing to me, for the reasons > that I've explained. The right answer to "this feature doesn't > provide anything that we don't already have and will introduce major > new security exposures that haven't been adequate thought" is > debatable, but "well other people have this so we should too" is > definitely not it. How about "it's in high demand by our user base"? In particular, it's being asked for by a *highly* technical section of our user base who uses these capabilities today, with this design, in those other databases. 
> Craig's patch really hasn't grappled with any of > these thorny definition and security issues; it's just about making > the basic functionality work. That's fine for a POC, but it's not > enough for a feature that the project would be committing to maintain > for the indefinite future. Improving the patch is exactly what I'd like to do, but throwing out the notion that RLS can't be allowed to execute user-defined code is cutting the legs out of the feature completely- particularly with our system where users can create all manner of objects in the system with their own code being run. > >> That's mighty useful for debugging, at least IMHO. And, if you want > >> to have several row-level security policies for different classes of > >> users, just create more than one view and grant different privileges > >> on each. > > > > I'm really not impressed with the idea that RLS should be done with > > multiple different views of the same underlying table. > > Are you equally unimpressed with the idea that RLS as proposed can't > support more than one security policy right now *at all*? Because it > seems to me that either you think multiple RLS policies on a single > table is important (in which case the current patch is inadequate) or > you think it's not important (in which case we need not argue about > whether doing it with multiple views over the same underlying table is > awkward). The current approach allows a nearly unlimited level of flexibility, should the user wish it, by being able to run user-defined code. Perhaps that would be considered 'one policy', but it could certainly take under consideration the calling user, the object being queried (if a function is defined per table, or if we provide a way to get that information in the function), etc. What it wouldn't require is the same object to be queried through different object names, which is what I was principally objecting to. What would it mean to have multiple RLS policies for a given object? 
There would have to be some criteria to distinguish which one would be applied, yet that can be handled with the existing design by the user already, if they wish to. Were we to preclude users from being able to have user-defined functions called, then there's quite a bit of additional complexity we'd need to replicate. Per-user policies, per-role policies, a definition of which one applies when, per-source-IP, per-connection-type (SSL vs. non-SSL), per-security-label, etc.. > >> By contrast, it seems to me that every design so far proposed for > >> something that is actually called row-level security - as opposed to > >> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* > >> row-level security, is extremely limited. Look back at all the things > >> listed in the previous paragraph; can you do those things easily with > >> the designs that have been proposed? As far as I can see, not really. > > > > I don't feel that RLS will, or even *should*, have the same level of > > flexibility that you can achieve with views and/or security definer > > functions. I expect that, over time, we will add more capabilities to > > it, but it's never going to be able to redefine the contents of a column > > as a view can, nor will it be able to add columns to a table as views > > can. I don't see those as reasons against having support for RLS. > > What this patch is doing is basically allowing a table to really be a > view over itself. I don't agree with this characterization. This patch specifically allows filtering the rows returned from the table, and it intentionally does not allow changing the data. > If we choose to support that, I think it is > absolutely inevitable that people are going to want all the same > options that they would have if they really made a separate view - > separate permissions, WITH CHECK OPTION, all of it. 
We are already looking at WITH CHECK OPTION-style support, but I disagree that separate permissions or data changing will ever be a part of RLS because then it's no longer RLS. > I find the > contrary argument - that people will only want X amount and no more - > simply not plausible. I'm not sure where you are seeing the requests for this feature from, but where I have heard them it's been to match what exists in other RDBMS's which do not have the capabilities that you're describing users will want- yet RLS is heavily used in those organizations. For the use cases that I've had in the past, RLS-as-defined would be the feature that I want for most tables, with views for joins and data-changing operations. > If it's valuable to have some of those > capabilities in an RLS framework, somebody's going to want all of > them. There's no bright line to divide the things that are valuable > in that context from those that aren't. I see the line quite clearly- RLS is about having a filtering mechanism and that's it. If it isn't filtering the rows (meaning giving back a 'true' or 'false' result for each row) then it's beyond RLS. > > I'm glad to hear your thoughts on the level of granularity which might > > be nice to have with RLS. What would be great is to spend a bit more > > time reviewing what other systems provide in this area and considering > > what makes sense for us. This will also be a feature and an area which > > we will be improving for a long time to come, but we do need this > > capability and we have to start somewhere. > > I think this definitely important. I also think that we should be > careful to study the deficiencies in those other systems and to > clearly call out what value the capabilities we're thinking of adding > to PostgreSQL 9.5 have over the status quo in PostgreSQL 9.4. I'm not > so much arguing that we shouldn't have row-level security as that, in > every way that's really meaningful, we already do. 
This is not the feeling that the users which I have been working with have, nor does it match my feelings about this. As mentioned in my email to Tom just now, having another object to deal with adds unnecessary complexity and will require application changes potentially to implement over existing tables. > >> There's no way for users who are RLS exempt to turn > >> off their exemption for testing purposes, let alone on a per-table > >> basis. > > > > I don't follow this argument entirely- users can't turn off the existing > > permissions system for testing either, unless an authorized user with > > the correct permissions makes the change to allow it- or the user bumps > > themselves up to superuser, or to a role which has broader permissions, > > both of which would also be possible to do with RLS. > > Sure, but in the existing system, the query either returns the same > results for everybody, or it fails outright with an error. It's > certainly possible to screw up the existing permissions, but this new > thing that's being proposed is much more complicated, because it's not > just whether it works that's at issue, but what results you actually > get. I agree that we'll need to make sure we return the correct answer. There is complexity there, but hopefully we've addressed much or all of that with what we have in 9.4 and this is just adding a simpler and often requested way to use that capability without the need to create and manage another object in the system. > >> There's no way to have multiple RLS policies on a single > >> table. All of those are things that we get "for free" in the > >> view-over-table model, and implementing formal RLS basically requires > >> us to either invent a new RLS-specific way of doing each of those > >> things, or suffer along with a subset of the functionality. Yuck. 
> > > > What would probably be good is to review the use-cases which the current > > patch already addresses- and we've had good responses from actual users > > who are already playing with the patch and are hearing that it is > > addressing their requirements. > > Yes. And in particular, I think we should have a much clearer > statement than we currently do about the use cases in which it falls > short. I'm happy to have that discussion with the users who are asking for this but in the conversations that I've had to date, updatable s.b. views are not RLS to them and I have to agree- having to maintain twice as many objects in the system which have to be named differently and have permissions which can be distinct from each other (which is something that could be a *problem* if it isn't intended), must both be updated when adding or removing columns, etc, makes that solution quite unappealing. > > You're suggesting that we use views instead, which clearly could run > > someone else's code. Perhaps the user will notice that they're > > selecting from a view instead of a table, but I've never seen a security > > design around making sure that what is being select'd from is a table > > vs. a view. Have you seen applications which implement such a check > > prior to running a query? > > Yes. pg_dump, to name one really important one. I wouldn't be > surprised if graphical clients did something similar - display the > table data for a table, or the view definition for a view. I'm quite sure you can select back the data from a view in every graphical client that exists- and without any warning popping up that you might be running code that someone else wrote. 
Yes, you can also get the definition of the view in many cases and you can tell if what you're selecting is a view or a table but that doesn't mean people are actively being paranoid about that distinction or worrying about the other cases where user-defined code might be run, even when selecting from a table, in general. > But I > admit to not having checked that. More than that, if I were a DBA, > I'd certainly be darn careful about selecting from untrusted views, > but I expect to be able to read a table, or run pg_dump, without > getting my account hacked. I'd love to hear how you decide which views are trusted and which are not. Last I checked, most serious attacks still come from internal individuals rather than external ones. Don't get me wrong- we definitely have an issue here that it'd be great to find a solution to, as has been discussed extensively, but I don't see RLS as making that problem particularly worse, and really, excluding superusers and having the option for other users to be excluded goes above what we've done to date in other areas. > I don't think restricting what can go into an RLS policy is the right > answer; that to me misses the point. What needs to be restricted is > the possibility that a user will inadvertently run code they didn't > mean to run. I'm glad that you agree that restricting the RLS policy isn't the right answer. I agree that we want to come up with a way to prevent users from running code that isn't safe or isn't intended. I still don't see RLS as making that particularly worse. The system is really nearly unusable in any interactive way if you restrict yourself to operations which can't possibly run any user-defined code today. 
There have been discussions about ways to possibly improve that, and those ways would need to address the RLS case in addition to the other already existing cases but I don't see that as a significant increase in the amount of work required to address that problem (which is already quite large..). > > I don't see this as being an insurmountable issue. I agree that having > > a way for pg_dump to run safely is important and the superuser check > > does address that, given that we don't have a "read-only (and > > everything)" capability today. Once we do (and I surely hope that will > > come sooner rather than later), such a role should also have the 'no > > RLS' bit, as it wouldn't make any sense for such a role anyway. The > > lack of that is not a strike against RLS though. > > It addresses running pg_dump *as the superuser*, but not as a database > owner or just a regular users. If unprivileged user A runs pg_dump -t > some_table_owned_by_user_b, and falls victim to a Trojan horse, that > is going to get reported as a security defect in PostgreSQL. Telling > the person who reports that issue that it's design behavior is not > going to make them happy, or result in good press coverage for > PostgreSQL. We have this problem with psql today, as has been discussed. The fact that pg_dump doesn't happen to have this problem is great but it's no true solution for the problem at hand. > >> 2. There's a danger that the functionality available in the two models > >> will diverge, so that certain things can only be done in one world or > >> the other. > > > > They will always be distinct, intentionally so. > > I think that's an absolutely terrible idea. We do not want to be in > the business of having two parallel systems with slightly different > capabilities and syntax that are providing the same fundamental > functionality. And they are: the proposal for RLS is to make it work > just like a security_barrier view, sharing a common implementation. 
While RLS could be viewed as providing a subset of what updatable sb views provide, I can see a clear line between the two and, for my part, we should allow users to make their own decision about if they want the complexity involved with maintaining another object in the system to provide the filtering or if they want to implement the filtering and the data manipulation, joins, etc, independently. That's really another big point to be made here- there's value in separating these concerns. Security is a big enough concern and a big enough issue that being able to address it explicitly and with a simple syntax is extremely valuable. RLS as we've been discussing it allows that, while having to include it in more complicated view definitions could make it much more difficult to reason about. I suppose one could define a view for just the filtering and then another view for the data manipulation and joining over top of the other views, but, again, that adds another level of complexity that isn't needed- and you can't be 100% sure that the only thing the supposedly filtering view is doing is *just* filtering unless you audit it regularly. > >> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield. > > > > While I agree that we need to consider this, I don't think it will be a > > "minefield", but rather something we need to document and educate our > > users about. If you'd like a "disable-all-RLS" GUC, I'm all for it. > > I would definitely like that. I have proposed it in the past. Great. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost <sfrost@snowman.net> wrote: >> Row-level security is not a chance for the system to deny access; it's >> a chance for user-defined code to take control and perform arbitrary >> operations. So the scope of what we're contemplating for row-level >> security is really far, far more invasive than what we did for >> column-level privileges. > > In this case the user-defined code needs to return a boolean. We don't > currently do anything to prevent it from having side-effects, no, but > the same is true with views which incorporate functions. I agree that > it makes a difference when compared to column-level privileges, but my > point was that we have provided easier ways to do things which were > possible using more complicated methods before. Perhaps the risk with > RLS is higher but these issues look manageable to me and the level of > doubt about our ability to provide this feature in a reasonable and > principled way that our users will understand surprises me. I'm glad the issues look manageable to you, but you haven't really explained how to manage them. The way to dispel doubt is to come up with specific technical proposals that address the technical issues that have been raised. I accept that you are surprised that someone might not think we are on the right course here, but it's entirely appropriate for me to express my doubts about this or any other patch, much as many people do in regards to many patches that are posted here - generally for good and valid reasons. For my part, I'm mildly surprised that anyone thinks it's a good idea to have SELECT * FROM tab to mean different things depending on who is typing it. To me, that seems very confusing; how does an unprivileged user with no ability to assume some other role validate that the row security policy they've configured works at all and exposes precisely the intended set of rows? 
Even aside from security exposures, how does a non-superuser who runs pg_dump know whether they've got a complete backup or a filtered dump that's missing some rows? A filtered dump might not even be restorable if foreign keys are involved. I think those are serious issues that deserve serious thought and consideration, not just a vague assurance that the issues are probably manageable. >> Because we don't have a good design. > > We're using a design that's found in multiple other RDBMS's and used > extensively by certain industries which use those RDBMS's today. I'm > certainly open to improving what is found in other systems for PG but I > have a hard time seeing this approach as a bad design. Perhaps you're > referring to our implementation, in which case I might agree and things > like running the quals as the table owner is something which should be > considered (I don't know how the other RDBMS's operate in this regard > offhand- it'd be good to find out). I'm not referring to the proposed implementation particularly; or at least not that aspect of it. I don't think trying to run the view quals as the defining user is likely to be very appealing, because I think it's going to hurt performance, for example by preventing function inlining and requiring lots of user-ID switches. But I'm not gonna complain if someone wants to mull it over and make a proposal for how to make it work. Rather, my concern is that all we've got is what might be called the core of the feature; the actual guts of it. There are a lot of ancillary details that seem to me to be not worked out at all yet, or only half-baked. >> I'm not categorically opposed to adding more RLS features to >> PostgreSQL and never have been; in fact, I was deeply involved in the >> original design of security barrier views and committed the original >> patch to add that functionality to PostgreSQL, without which none of >> what we're talking about here would be possible. 
But the >> currently-proposed design is very unappealing to me, for the reasons >> that I've explained. The right answer to "this feature doesn't >> provide anything that we don't already have and will introduce major >> new security exposures that haven't been adequate thought" is >> debatable, but "well other people have this so we should too" is >> definitely not it. > > How about "it's in high demand by our user base"? In particular, it's > being asked for by a *highly* technical section of our user base who > uses these capabilities today, with this design, in those other > databases. Sure, that's a valid reason for considering any feature. But it's not an excuse to overlook whatever design problems may exist. >> Are you equally unimpressed with the idea that RLS as proposed can't >> support more than one security policy right now *at all*? Because it >> seems to me that either you think multiple RLS policies on a single >> table is important (in which case the current patch is inadequate) or >> you think it's not important (in which case we need not argue about >> whether doing it with multiple views over the same underlying table is >> awkward). > > The current approach allows a nearly unlimited level of flexibility, > should the user wish it, by being able to run user-defined code. > Perhaps that would be considered 'one policy', but it could certainly > take under consideration the calling user, the object being queried > (if a function is defined per table, or if we provide a way to get > that information in the function), etc. In theory, that's true. But in practice, performance will suck unless the security qual is easily optimizable. If your security qual is WHERE somecomplexfunction() you're going to have to implement that by sequential-scanning the table and evaluating the function for each row. For example, I once worked at a company where we had a table containing information about our customers and potential customers. 
Sales representatives were allowed to see their own accounts, and partners were allowed to see accounts associated with that partner. These things were independent. So for a sales rep, the security qual was WHERE sales_rep_id = <something> and for a partner the security qual was WHERE partner_id = <something>. Now, you could maybe write this as a single qual, something like this:

    WHERE sales_rep_id = (SELECT oid FROM pg_authid
                          WHERE rolname = current_user
                            AND oid IN (SELECT id FROM person WHERE is_sales_rep))
       OR partner_id = (SELECT p.org_id FROM pg_authid a, person p
                        WHERE a.rolname = current_user and a.oid = p.id)

But that's probably not going to perform very well, because to match an index on sales_rep_id, or an index on partner_id, that's going to have to get simplified a whole lot, and that's probably not going to happen. If we've only got one branch of the OR, I think we'll realize we can evaluate the subquery as an InitPlan and then use an index, but with two branches I think that will fail. I don't want to overstate the importance of this particular case; but I do think scenarios in which it's advantageous to have multiple row-level security policies are plausible.

Another, perhaps-simpler example is that you might have a table containing unclassified data, classified data, and secret data. You want to give access to the unclassified data only to one category of users; access to the unclassified data and the classified data to a second group of more-trusted users; and access to all of the data to a third group of very highly trusted users. If the table can only have one security policy that applies to everyone who isn't exempt, how will you do that? This sort of use case seems very plausible to me so I think we need to give some real thought to what we will recommend to users who want to do things like this. Can the proposed patch handle it? How? 
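The sales-rep and partner scenario can be made concrete with a toy model. The Python sketch below is only an illustration and says nothing about how PostgreSQL or the proposed patch evaluates security quals: the Account and Caller classes, the sample rows, the POLICIES mapping, and the visible() helper are all hypothetical. It contrasts one combined predicate (the OR form) with one predicate per class of user, which is roughly what separate views with separate grants would give you.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Account:
    account_id: int
    sales_rep_id: int
    partner_id: int


@dataclass(frozen=True)
class Caller:
    """Hypothetical stand-in for what current_user resolves to."""
    person_id: int
    org_id: int
    is_sales_rep: bool


ACCOUNTS: List[Account] = [
    Account(1, sales_rep_id=10, partner_id=100),
    Account(2, sales_rep_id=11, partner_id=100),
    Account(3, sales_rep_id=10, partner_id=200),
]


def combined_qual(row: Account, caller: Caller) -> bool:
    # Single policy: both branches are evaluated for every caller,
    # mirroring the "sales_rep_id = ... OR partner_id = ..." qual.
    rep_match = caller.is_sales_rep and row.sales_rep_id == caller.person_id
    partner_match = row.partner_id == caller.org_id
    return rep_match or partner_match


# One predicate per class of user: each is a plain single-column equality.
POLICIES: Dict[str, Callable[[Account, Caller], bool]] = {
    "sales_rep": lambda row, c: row.sales_rep_id == c.person_id,
    "partner": lambda row, c: row.partner_id == c.org_id,
}


def visible(rows, qual, caller) -> List[int]:
    """Return the account_ids the caller may see under the given qual."""
    return [r.account_id for r in rows if qual(r, caller)]


rep = Caller(person_id=10, org_id=0, is_sales_rep=True)
partner = Caller(person_id=50, org_id=100, is_sales_rep=False)

rep_rows = visible(ACCOUNTS, POLICIES["sales_rep"], rep)
partner_rows = visible(ACCOUNTS, POLICIES["partner"], partner)
```

Both forms return the same rows for these sample callers. The point of the per-class form is that each predicate stays a simple equality on one column, which is the shape of qual that has a better chance of being matched to an index.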
>> > I don't feel that RLS will, or even *should*, have the same level of >> > flexibility that you can achieve with views and/or security definer >> > functions. I expect that, over time, we will add more capabilities to >> > it, but it's never going to be able to redefine the contents of a column >> > as a view can, nor will it be able to add columns to a table as views >> > can. I don't see those as reasons against having support for RLS. >> >> What this patch is doing is basically allowing a table to really be a >> view over itself. > > I don't agree with this characterization. This patch specifically > allows filtering the rows returned from the table, and it intentionally > does not allow changing the data. I don't know what to say to this. What I said is, quite literally, what the patch does. It wraps the table in a subquery RTE that is precisely the same thing you would get if you defined a security_barrier view with the security qual in the WHERE clause. This is not a question of opinion; the patch either does that or it doesn't, and I think it does. >> If we choose to support that, I think it is >> absolutely inevitable that people are going to want all the same >> options that they would have if they really made a separate view - >> separate permissions, WITH CHECK OPTION, all of it. > > We are already looking at WITH CHECK OPTION-style support, but I > disagree that separate permissions or data changing will ever be a part > of RLS because then it's no longer RLS. What do you mean by "data changing"? If you mean inserts, updates, and deletes, I am very sure people are going to want to perform those operations on RLS-enabled tables. Do you find it implausible that someone will want to exempt a certain role from RLS on only one table but not on other tables in the system? Do you find it implausible that someone will want to allow a certain table to bypass RLS when selecting rows, but not when updating or deleting them? 
I find those scenarios very plausible. >> It addresses running pg_dump *as the superuser*, but not as a database >> owner or just a regular users. If unprivileged user A runs pg_dump -t >> some_table_owned_by_user_b, and falls victim to a Trojan horse, that >> is going to get reported as a security defect in PostgreSQL. Telling >> the person who reports that issue that it's design behavior is not >> going to make them happy, or result in good press coverage for >> PostgreSQL. > > We have this problem with psql today, as has been discussed. The fact > that pg_dump doesn't happen to have this problem is great but it's no > true solution for the problem at hand. It's true that users can break security by being incautious about the queries they type into psql, and I'm all for having better tools to manage that. But a feature that causes currently-safe uses of pg_dump to become unsafe is, in my opinion, absolutely not OK. I do agree with your argument that things like adding and removing columns, or changing their data types, could be simpler with RLS than in the view-over-table model - because in the view-over-table model, we don't really know whether the user would like a new column to cascade to the view, whereas in the RLS model, we can automatically do the right thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Kevin Grittner
Date:
Robert Haas <robertmhaas@gmail.com> wrote: > Even aside from security exposures, how > does a non-superuser who runs pg_dump know whether they've got a > complete backup or a filtered dump that's missing some rows? This seems to me to be a killer objection to the feature as proposed, and points out a huge difference between column level security and the proposed implementation of row level security. (In fact it is a difference between just about any GRANTed permission and row level security.) If you try to SELECT * FROM sometable and you don't have rights to all the columns, you get an error. A dump would always either work as expected or generate an error.

    test=# create user bob;
    CREATE ROLE
    test=# create user bill;
    CREATE ROLE
    test=# set role bob;
    SET
    test=> create table person (person_id int not null primary key, name text not null, ssn text);
    CREATE TABLE
    test=> grant select (person_id, name) on table person to bill;
    GRANT
    test=> reset role;
    RESET
    test=# set role bill;
    SET
    test=> select person_id, name from person;
     person_id | name
    -----------+------
    (0 rows)

    test=> select * from person;
    ERROR:  permission denied for relation person

The proposed approach would leave the validity of any dump which was not run as a superuser in doubt. The last thing we need, in terms of improving security, is another thing you can't do without connecting as a superuser. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
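The contrast between "deny" and "filter" can be sketched with a toy model. The Python below is not PostgreSQL code: the PermissionDenied exception, the select_with_column_privs and select_with_row_filter helpers, the dump_outcome wrapper, and the sample rows are all invented for illustration. It only shows why a caller can detect an incomplete result in the column-privilege case but not in the row-filter case.

```python
class PermissionDenied(Exception):
    """Hypothetical stand-in for 'permission denied for relation ...'."""


PERSON = [
    {"person_id": 1, "name": "alice", "ssn": "111"},
    {"person_id": 2, "name": "carol", "ssn": "222"},
]
ALL_COLUMNS = ["person_id", "name", "ssn"]


def select_with_column_privs(rows, requested, granted):
    # Declarative check: either every requested column is granted,
    # or the whole query fails. No partial result is ever returned.
    missing = [c for c in requested if c not in granted]
    if missing:
        raise PermissionDenied(f"permission denied for columns {missing}")
    return [{c: r[c] for c in requested} for r in rows]


def select_with_row_filter(rows, row_qual):
    # Row filtering: the query always succeeds; rows failing the qual
    # are silently left out of the result.
    return [dict(r) for r in rows if row_qual(r)]


def dump_outcome(run_query):
    """What a dump tool could observe: an error, or a row count."""
    try:
        return ("ok", len(run_query()))
    except PermissionDenied:
        return ("error", None)


# "select *" with rights on only two of three columns: a visible failure.
column_case = dump_outcome(
    lambda: select_with_column_privs(PERSON, ALL_COLUMNS, granted={"person_id", "name"})
)

# "select *" under a row qual: success, with no hint that a row is missing.
row_case = dump_outcome(
    lambda: select_with_row_filter(PERSON, lambda r: r["person_id"] == 1)
)
```

In this model the column case reports an error, while the row case reports success with one row out of two, and nothing in the result tells the caller that the other row exists.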
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Gregory Smith
Date:
On 6/11/14, 10:26 AM, Robert Haas wrote: > Now, as soon as we introduce the concept that selecting from a table > might not really mean "read from the table" but "read from the table > after applying this owner-specified qual", we're opening up a whole > new set of attack surfaces. Every pg_dump is an opportunity to hack > somebody else's account, or at least audit their activity. I'm in full agreement we should clearly communicate the issues around pg_dump in particular, because they can't necessarily be eliminated altogether without some major work that's going to take a while to finish. And if the work-around is some sort of GUC for killing RLS altogether, that's ugly but not unacceptable to me as a short-term fix. One of the difficult design requests in my inbox right now asks how pg_dump might be changed both to reduce its overlap with superuser permissions and to allow auditing of its activity. Those requests aren't going away; their incoming frequency is actually rising quite fast right now. They're both things people expect from serious SQL oriented commercial database products, and I'd like to see PostgreSQL continue to displace those as we reach feature parity in those areas. Any way you implement finer grained user permissions and auditing features will be considered a new attack vector when you use those features. The way the proposed RLS feature inserts an arbitrary function for reads has a similar new attack vector when you use that feature. I'm kind of surprised to see this turn into a hot button all of a sudden though, because my thought on all that so far has been a giant so what? This is what PostgreSQL does. You wanna write your own C code and then link the thing right into the server, so that bugs can expose data and crash the whole server? Not only can you shoot yourself in the foot that way, we supply a sample gun and bullets in contrib. 
How about writing arbitrary code in any one of a dozen server-side languages of wildly varying quality, then hooking that code so it runs as a trigger function whenever you change a row? PostgreSQL is *on it*; we love letting people write some random thing, and then running that random thing against your data as a side-effect of doing an operation. And if you like that...just wait until you learn about this half-assed rules feature we have too! And when the database breaks because the functions people inserted were garbage, that's their fault, not a cause for a CVE. And when someone blindly installs adminpack because it sounded like a pgAdmin requirement, lets a monitoring system run as root so it can watch pg_stat_activity, and then discovers that pair of reasonable decisions suddenly means any fool with monitoring access can call pg_file_unlink...that's their fault too. These are powerful tools with serious implications, and they're expected to be used by equally serious users. We as a development community do need to put a major amount of work into refactoring all of these security mechanisms. There should be less of these embarrassing incidents where bad software design really forced the insecure thing to happen, which I'd argue is the case for that pg_stat_activity example. And luckily so far development resources are appearing for organizations I know of working in that direction recently, as fast as the requirements are rising. I think there's a good outcome at the end of that road. But let's not act like RLS is a scary bogeyman because it introduces a new way to hack the server or get surprising side-effects. That's expected and possibly unavoidable behavior in a feature like this, and there are much worse instances of arbitrary function risk throughout the core code already. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Greg, all,
I will reply to the emails in detail when I get a chance but am out of town at a funeral, so it'll likely be delayed. I did want to echo my agreement for the most part with Greg and in particular...
On Thursday, June 12, 2014, Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> On 6/11/14, 10:26 AM, Robert Haas wrote:
>> Now, as soon as we introduce the concept that selecting from a table might not really mean "read from the table" but "read from the table after applying this owner-specified qual", we're opening up a whole new set of attack surfaces. Every pg_dump is an opportunity to hack somebody else's account, or at least audit their activity.
> I'm in full agreement we should clearly communicate the issues around pg_dump in particular, because they can't necessarily be eliminated altogether without some major work that's going to take a while to finish. And if the work-around is some sort of GUC for killing RLS altogether, that's ugly but not unacceptable to me as a short-term fix.
A GUC which is enable / disable / error-instead may work quite well, with error-instead as the default for pg_dump if people really want it (there would have to be a way to disable that though, imv).
Note that enable is default in general, disable would be for superuser only (or on start-up) to disable everything, and error-instead anyone could use but it would error instead of implementing RLS when querying an RLS-enabled table.
This approach was suggested by an existing user testing out this RLS approach, to be fair, but it looks pretty sane to me as a way to address some of these concerns. Certainly open to other ideas and thoughts though.
Thanks,
Stephen
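The three-way GUC described above can be modelled roughly as follows. This is only a sketch of the proposed semantics, not PostgreSQL code: the names RowSecurityMode, RowSecurityError and resolve_access are hypothetical, and the superuser check is done at query time here purely for simplicity.

```python
from enum import Enum


class RowSecurityMode(Enum):
    """Hypothetical model of the proposed enable / disable / error-instead GUC."""
    ENABLE = "enable"                # apply RLS quals (the proposed default)
    DISABLE = "disable"              # skip RLS entirely; superuser only
    ERROR_INSTEAD = "error-instead"  # raise instead of silently filtering


class RowSecurityError(Exception):
    """Raised when a query would be filtered but the session asked for an error."""


def resolve_access(mode, table_has_rls, is_superuser):
    """Return 'filtered' or 'unfiltered' for a table read, or raise."""
    if not table_has_rls:
        # Tables without row security behave exactly as they do today.
        return "unfiltered"
    if mode is RowSecurityMode.DISABLE:
        if not is_superuser:
            raise PermissionError("only a superuser may disable row security")
        return "unfiltered"
    if mode is RowSecurityMode.ERROR_INSTEAD:
        # What pg_dump would want by default: everything or an error.
        raise RowSecurityError("query would be filtered by row security")
    return "filtered"
```

In this model anyone may choose error-instead, which is what lets a non-superuser pg_dump run refuse to produce a silently filtered dump.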
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Dean Rasheed
Date:
On 13 June 2014 01:13, Stephen Frost <sfrost@snowman.net> wrote: > Greg, all, > > I will reply to the emails in detail when I get a chance but am out of town > at a funeral, so it'll likely be delayed. I did want to echo my agreement > for the most part with Greg and in particular... > > On Thursday, June 12, 2014, Gregory Smith <gregsmithpgsql@gmail.com> wrote: >> >> On 6/11/14, 10:26 AM, Robert Haas wrote: >>> >>> Now, as soon as we introduce the concept that selecting from a table >>> might not really mean "read from the table" but "read from the table after >>> applying this owner-specified qual", we're opening up a whole new set of >>> attack surfaces. Every pg_dump is an opportunity to hack somebody else's >>> account, or at least audit their activity. >> >> >> I'm in full agreement we should clearly communicate the issues around >> pg_dump in particular, because they can't necessarily be eliminated >> altogether without some major work that's going to take a while to finish. >> And if the work-around is some sort of GUC for killing RLS altogether, >> that's ugly but not unacceptable to me as a short-term fix. > > > A GUC which is enable / disable / error-instead may work quiet well, with > error-instead for pg_dump default if people really want it (there would have > to be a way to disable that though, imv). > > Note that enable is default in general, disable would be for superuser only > (or on start-up) to disable everything, and error-instead anyone could use > but it would error instead of implementing RLS when querying an RLS-enabled > table. > > This approach was suggested by an existing user testing out this RLS > approach, to be fair, but it looks pretty sane to me as a way to address > some of these concerns. Certainly open to other ideas and thoughts though. > Yeah, I was thinking something like this could work, but I would go further. Suppose you had separate GRANTable privileges for direct access to individual tables, bypassing RLS, e.g. 
GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name Combined with the GUC (direct_table_access, say) to request direct access to all tables. Then with direct_table_access = true/required, a SELECT from a table would error if the user hadn't been granted the DIRECT SELECT privilege on all the tables referenced in the query. Tools like pg_dump would require direct_table_access, but there might be other levels of access that didn't error out. I think if I were using RLS, I would definitely want/expect this level of fine-grained control over permissions on a per-table basis, rather than the superuser/non-superuser level of control, or having RLS-exempt users. Actually, given the fact that the majority of users won't be using RLS, I would be tempted to invert the above logic and have the new privilege be for LIMITED access (via RLS quals). So a user granted the normal SELECT privilege would be able to bypass RLS, but a user only granted LIMITED SELECT wouldn't. Regards, Dean
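A rough model of the check described above, for illustration only. GRANT DIRECT and direct_table_access are proposals from this message, not existing features, and check_direct_access and its arguments are hypothetical names.

```python
def check_direct_access(user_grants, referenced_tables, direct_table_access):
    """Model the proposed direct_table_access check for a SELECT.

    user_grants is a set of (table_name, privilege) pairs, where the
    privilege string "DIRECT SELECT" stands in for the proposed grant.
    """
    if not direct_table_access:
        # Normal mode: the query runs with RLS quals applied.
        return "rls_applied"
    # Direct mode: every referenced table must carry the DIRECT privilege.
    missing = sorted(
        table for table in referenced_tables
        if (table, "DIRECT SELECT") not in user_grants
    )
    if missing:
        raise PermissionError(
            "DIRECT SELECT not granted on: " + ", ".join(missing)
        )
    return "direct"
```

The point of the all-or-nothing check is that a tool such as pg_dump either reads every referenced table unfiltered or fails outright.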
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Adam Brightwell <adam.brightwell@crunchydatasolutions.com> writes: > > Through this effort, we have concluded that for RLS the case of > > invalidating a plan is only necessary when switching between a superuser > > and a non-superuser. Obviously, re-planning on every role change would be > > too costly, but this approach should help minimize that cost. As well, > > there were not any cases outside of this one that were immediately apparent > > with respect to RLS that would require re-planning on a per userid basis. > > Hm ... I'm not following why we'd need a special case for superusers and > not anyone else? Seems like any useful RLS scheme is going to require > more privilege levels than just superuser and not-superuser. Just to clarify this- the proposal allows RLS to be implemented essentially by any user-defined qual, where that qual can include the current user, the IP the user is connecting from, or more-or-less anything else, possibly even via a user-defined function or security module. It is not superuser-or-not. This discussion is about how to support users for which RLS should not be applied. I can see that being useful at a more granular level than superuser-or-not, but even at that level, RLS is still extremely useful. > Could we put the "if superuser then ok" test into the RLS condition test > and thereby not need more than one plan at all? As discussed, that unfortunately doesn't quite work. This discussion, in general, has been quite useful and I'll work on adding documentation to the wiki pages which discusses the consideration and suggestions for a GUC to disable-or-error when RLS is encountered, along with a per-role capability to bypass RLS; that is in line with the goal of avoiding adding superuser-specific capabilities. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Craig Ringer <craig@2ndquadrant.com> writes: > > I agree, and now that the urgency of trying to deliver this for 9.4 is > > over it's worth seeing if we can just run as table owner. > > > Failing that, we could take the approach a certain other RDBMS does and > > make the ability to define row security quals a GRANTable right > > initially held only by the superuser. > > Hmm ... that might be a workable compromise. I think the main issue here > is whether we expect that RLS quals will be something that the planner > could optimize to any meaningful extent. If they're always (in effect) > wrapped in SECURITY DEFINER functions, I think that largely blocks any > optimizations; but maybe that wouldn't matter in practice. From what I've heard from actual users with other RDBMS's who are coming to PostgreSQL- the reality is that they're going to be using a security module (eg: SELinux) whose responsibility it is to manage this whole question of "can this user see this row", meaning there's zero chance of optimization. I'd certainly like to see the ability to optimize remain in cases where the qual itself gives us a way to filter (eg: a table partitioned based on some security level, where another table maps users to levels), but that is, from a practical standpoint, not an immediate concern from real users and I don't believe our approach paints us into a corner which would prevent that. What that would require is better support for true partitioning rather than constraint exclusions. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost <sfrost@snowman.net> wrote: > > In this case the user-defined code needs to return a boolean. We don't > > currently do anything to prevent it from having side-effects, no, but > > the same is true with views which incorporate functions. I agree that > > it makes a difference when compared to column-level privileges, but my > > point was that we have provided easier ways to do things which were > > possible using more complicated methods before. Perhaps the risk with > > RLS is higher but these issues look manageable to me and the level of > > doubt about our ability to provide this feature in a reasonable and > > principled way that our users will understand surprises me. > > I'm glad the issues look manageable to you, but you haven't really > explained how to manage them. There have been a number of suggestions made and it'd be great to get more feedback on them- running the quals as the table owner, having a GUC which can be set to run 'as normal', to ignore RLS (if the user has that right), or to error out if RLS would be applied, and undoubtedly there are other ideas along those same lines to address the pg_dump and other concerns. > For my part, I'm mildly surprised that anyone thinks it's a good idea > to have SELECT * FROM tab to mean different things depending on who is > typing it. Realistically, in the RDBMS realm we're in and are working to break into, this is absolutely a given and expected. It's new to PostgreSQL, certainly, but it's not uncommon or surprising at all in our industry. > To me, that seems very confusing; how does an unprivileged > user with no ability to assume some other role validate that the row > security policy they've configured works at all and exposes precisely > the intended set of rows? 
While I see what you're getting at, I'm not convinced it's really all that different from being set up without access to some schema or table which the administrator setting up accounts didn't include for you. Sure, in the case of a schema or table, you can get an error back instead of just not seeing the data, but if you're looking for specific data, chances are pretty good you'll realize the lack of data quickly and ask the same question regarding access. To wit, I've certainly had users ask exactly that question- "do I have access to all the data in this table?" even when using PG where it's a bit tricky to limit such access. Clearly, the same risk applies when using views and so the question is understandable. Perhaps these were users with more experience in other RDBMS's where it's more common to have RLS, but there are at least a couple cases which I can think of where that wouldn't apply. > Even aside from security exposures, how > does a non-superuser who runs pg_dump know whether they've got a > complete backup or a filtered dump that's missing some rows? This would be addressed with the GUC that's been proposed. As would the previous paragraph, though I wanted to reply to that independently. > I'm not referring to the proposed implementation particularly; or at > least not that aspect of it. I don't think trying to run the view > quals as the defining user is likely to be very appealing, because I > think it's going to hurt performance, for example by preventing > function inlining and requiring lots of user-ID switches. I understand that there are performance implications. As mentioned to Tom, realistically, there's no way to optimize at least some of these use-cases because they require a completely external module (eg: SELinux) to be involved in the decision about who can view what records. 
If we can optimize that, it'd be by a completely different approach whereby we pull up the qual higher because we know the whole query only involves leakproof functions or similar, allowing us to only apply the filter to the final set of records prior to them being returned to the user. The point being that such optimizations would happen independently and regardless of the quals or user-defined functions involved. At the end of the day, I can't think of a better optimization for such a case (where we have to ask an external security module if a row is acceptable to return to the user) than that. Is there something specific you're thinking about that we'd be missing out on? > But I'm not > gonna complain if someone wants to mull it over and make a proposal > for how to make it work. Rather, my concern is that all we've got is > what might be called the core of the feature; the actual guts of it. > There are a lot of ancillary details that seem to me to be not worked > out at all yet, or only half-baked. Perhaps it's just my experience, but I've been focused on the main core feature for quite some time and it feels like we're really close to having it there. I agree that a few additional bits would be nice to have but these strike me as relatively straight-forward to implement overtop of this general construct. I do see value in documenting these concerns and will see about making that happen, along with what the general viewpoints and thoughts are about how to address the concern. > > How about "it's in high demand by our user base"? In particular, it's > > being asked for by a *highly* technical section of our user base who > > uses these capabilities today, with this design, in those other > > databases. > > Sure, that's a valid reason for considering any feature. But it's not > an excuse to overlook whatever design problems may exist. Agreed- improvements in the design, provided it continues to meet the expectations of the user-base, are absolutely welcome. 
> > The current approach allows a nearly unlimited level of flexibility, > > should the user wish it, by being able to run user-defined code. > > Perhaps that would be considered 'one policy', but it could certainly > > take under consideration the calling user, the object being queried > > (if a function is defined per table, or if we provide a way to get > > that information in the function), etc. > > In theory, that's true. But in practice, performance will suck unless > the security qual is easily optimizable. If your security qual is > WHERE somecomplexfunction() you're going to have to implement that by > sequential-scanning the table and evaluating the function for each > row. That's not actually true today, is it? Given our leak-proof attribute, if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()" then we would be able to push down the leak-proof function and not necessarily run a straight sequential scan, no? Even so, though, we've had users who have tested exactly what this patch implements and they've been happy with their real-world use-cases. I'm certainly all for optimization and would love to see us make this better for everyone, but I don't view that as a reason to delay this particular feature which is really just bringing us up to parity with other RDBMS's. > For example, I once worked at a company where we had a table > containing information about our customers and potential customers. > Sales representatives were allowed to see their own accounts, and > partners were allowed to see accounts associated with that partner. > These things were independent. So for a sales rep, the security qual > was WHERE sales_rep_id = <something> and for a partner the security > qual was WHERE partner_id = <something>. 
Now, you could maybe write > this as a single qual, something like this: > > WHERE sales_rep_id = (SELECT oid FROM pg_authid WHERE rolname = > current_user AND oid IN (SELECT id FROM person WHERE is_sales_rep)) OR > partner_id = (SELECT p.org_id FROM pg_authid a, person p WHERE > a.rolname = current_user and a.oid = p.id) That looks like it'd work, or a pl/pgsql function which did the same. > But that's probably not going to perform very well, because to match > an index on sales_rep_id, or an index on partner_id, that's going to > have to get simplified a whole lot, and that's probably not going to > happen. If we've only got one branch of the OR, I think we'll realize > we can evaluate the subquery as an InitPlan and then use an index, but > with two branches I think that will fail. You're right- we could perform better in such a case. What solution did you come up with for this case, which performed well and was also secure..? > I don't want to overstate the importance of this particular case; but > I do think scenarios in which it's advantageous to have multiple > row-level security policies are plausible. I'm not against this in general. The question, in my mind, is what level of granularity we would provide this at. As I tried to outline previously, there's a huge number of combinations which we could come up with to support this, and I'm not 100% sure that it'd actually end up being better than the simplicity of a single qual where the user gets to define any kind of relationship they want between the various policies; even programmatically if they want. > Another, perhaps-simpler > example is that you might have a table containing unclassified data, > classified data, and secret data. You want to give access to the > unclassified data only to one category of users; access to the > unclassified data and the classified data to a second group of > more-trusted users; and access to all of the data to a third group of > very highly trusted users. 
If the table can only have one security > policy that applies to everyone who isn't exempt, how will you do > that? This sort of use case seems very plausible to me so I think we > need to give some real thought to what we will recommend to users who > want to do things like this. Can the proposed patch handle it? How? There are multiple ways this could be implemented- the first, basic, way would be through a table which maps users to security levels via an enum where more privileged levels are higher in value and therefore a simple greater-than could be applied after a join which would implement this particular policy. The reality (which I've had discussions with users about..) is actually much more complicated where an external security module makes the decision about whether a given user/connection can have access to a specific bit of labeled data. The reason is that things are not classified so simply as "unclass", "class" and "secret" but rather into much more granular pieces- user X might have access to A and B, but not C, while user Y can access B and C. The absolute levels described above may exist for less sensitive data but for data beyond that (which I'd hazard to guess is most of it...), more granularity and control is needed. > >> What this patch is doing is basically allowing a table to really be a > >> view over itself. > > > > I don't agree with this characterization. This patch specifically > > allows filtering the rows returned from the table, and it intentionally > > does not allow changing the data. > > I don't know what to say to this. What I said is, quite literally, > what the patch does. It wraps the patch in a subquery RTE that is > precisely the same thing you would get if you defined a > security_barrier view with the security qual in the WHERE clause. Exactly- it does *not* allow changing the SELECT clause, or adding in a GROUP BY, or a JOIN, or tossing in a windowing function, etc. 
> This is not a question of opinion; the patch either does that or it > doesn't, and I think it does. Apologies for not being clearer but my point was that only the WHERE clause can be modified by this patch, which is quite intentional. This separates the concerns of "can I access this data" from "modify the data to represent it in X way". > > We are already looking at WITH CHECK OPTION-style support, but I > > disagree that separate permissions or data changing will ever be a part > > of RLS because then it's no longer RLS. > > What do you mean by "data changing"? If you mean inserts, updates, > and deletes, I am very sure people are going to want to perform those > operations on RLS-enabled tables. Yes, they'll want to support those operations. However, they will not expect RLS to allow them to redefine a column as "x+10" instead of "x", which a view does allow. > Do you find it implausible that someone will want to exempt a certain > role from RLS on only one table but not on other tables in the system? No- exempting certain roles from RLS makes sense as a capability. > Do you find it implausible that someone will want to allow a certain > table to bypass RLS when selecting rows, but not when updating or > deleting them? I find those scenarios very plausible. This is also plausible and something which we were anticipating while developing this patch. Simon, KaiGai and I specifically discussed addressing SELECT vs UPDATE/DELETE earlier this year, as I recall. Providing that level of flexibility is absolutely on the road map, but I don't know that it all has to exist in 9.5; it may, which would be great, but I don't view it as required. > > We have this problem with psql today, as has been discussed. The fact > > that pg_dump doesn't happen to have this problem is great but it's no > > true solution for the problem at hand. 
> > It's true that users can break security by being incautious about the > queries they type into psql, and I'm all for having better tools to > manage that. But a feature that causes currently-safe uses of pg_dump > to become unsafe is, in my opinion, absolutely not OK. I don't particularly like it, and would require a way to override it, but a GUC which pg_dump sets by default that says "give me everything or error" would work to address this. I'm open to other thoughts, of course, but it does seem like a relatively simple solution (which is a good thing when it comes to security concerns, imv). > I do agree with your argument that things like adding and removing > columns, or changing their data types, could be simpler with RLS than > in the view-over-table model - because in the view-over-table model, > we don't really know whether the user would like a new column to > cascade to the view, whereas in the RLS model, we can automatically do > the right thing. Agreed. Thanks, Stephen
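The "basic" level-mapping approach mentioned earlier in this message (a table mapping users to security levels, with a simple comparison after the join) might be modelled like this. It is a sketch under stated assumptions only: SecurityLevel, visible_rows, and the row and mapping shapes are all hypothetical, and a real policy would be expressed as a qual, not as Python.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    """Ordered levels; a higher value means more privileged."""
    UNCLASSIFIED = 1
    CLASSIFIED = 2
    SECRET = 3


def visible_rows(rows, user_levels, user):
    """Keep rows whose label does not exceed the user's clearance.

    rows is a list of dicts with a 'level' key; user_levels plays the
    role of the table that maps users to security levels.
    """
    clearance = user_levels[user]
    return [row for row in rows if clearance >= row["level"]]
```

This covers only the strictly ordered case. The more granular labelling described above (user X sees A and B, user Y sees B and C) does not reduce to a single comparison, which is why an external security module comes into the picture.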
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Kevin, * Kevin Grittner (kgrittn@ymail.com) wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > > Even aside from security exposures, how > > does a non-superuser who runs pg_dump know whether they've got a > > complete backup or a filtered dump that's missing some rows? > > This seems to me to be a killer objection to the feature as > proposed, and points out a huge difference between column level > security and the proposed implementation of row level security. I really hate this notion of "killer objection". It's been discussed (perhaps not seen by all) at least one suggestion for how to address this specific issue and there are other ways in which to address it (having COPY have the same behavior as the GUC being discussed, instead of having a GUC, though I feel like the GUC is a better approach..). > (In fact it is a difference between just about any GRANTed > permission and row level security.) If you try to SELECT * FROM > sometable and you don't have rights to all the columns, you get an > error. A dump would always either work as expected or generate an > error. Provided you know all of the tables and other objects which need to be included in such a partial dump (as a full dump, today, must be run by a superuser to be sure you're actually getting everything anyway...). > The proposed approach would leave the validity of any dump which > was not run as a superuser in doubt. The last thing we need, in > terms of improving security, is another thing you can't do without > connecting as a superuser. Any dump not run by a superuser is already in doubt, imv. That is a problem we already have which really needs to be addressed, but I view that as an independent issue. I agree with avoiding adding another superuser-only capability; see the other sub-thread about making this a per-user capability. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Dean, * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 13 June 2014 01:13, Stephen Frost <sfrost@snowman.net> wrote: > > This approach was suggested by an existing user testing out this RLS > > approach, to be fair, but it looks pretty sane to me as a way to address > > some of these concerns. Certainly open to other ideas and thoughts though. > > Yeah, I was thinking something like this could work, but I would go > further. Suppose you had separate GRANTable privileges for direct > access to individual tables, bypassing RLS, e.g. > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name This is certainly an interesting idea and I'm glad we're getting this level of discussion early on in the 9.5 cycle as I'd really like to see a good solution implemented for 9.5. I've been going back-and-forth about this and what's really swaying me right now is that it'd be nearly impossible to determine if a given RLS qual actually allows full access to a table for a given user without going through the entire table and testing the qual against each row. With this GRANT ability, we'd be able to completely avoid calling the RLS quals when the user is granted this right. Not sure offhand how many bits we've got left at the per-table level though; we added TRUNCATE rights not that long ago and this looks like another good right to add, but there are only so many bits available.. At the same time, I do think this is something we could also add later, perhaps after figuring out a good way to extend the set of bits available for privileges on tables. > Combined with the GUC (direct_table_access, say) to request direct > access to all tables. Then with direct_table_access = true/required, a > SELECT from a table would error if the user hadn't been granted the > DIRECT SELECT privilege on all the tables referenced in the query. I can see this working. One thing I'm curious about is if we would want to support this inside of the SELECT statement (or perhaps COPY?) 
directly, rather than making a user have to flip a GUC back and forth while they're doing something. I can imagine, during testing, a session looking like this:

select * from table;
@#@!$!
set direct_table_access = true;
select * from table;
select * from table where blah = x;
alter table set row level security blah = x;
select * from table;
select * from table;
select * from table;
@!#$!@#!
set direct_table_access = false;
select * from table;
...

Would 'select direct' or 'select * from DIRECT table' (or maybe 'ONLY'?) be workable? There are certainly SQL standard concerns to be thought of here which might preclude anything we do with SELECT, but we could support something with COPY. > Tools like pg_dump would require direct_table_access, but there might > be other levels of access that didn't error out. pg_dump would need an option to set direct_table_access or not. Having it ask by default is acceptable to me, but I do think we need to be able to tell it to *not* set that. > I think if I were using RLS, I would definitely want/expect this level > of fine-grained control over permissions on a per-table basis, rather > than the superuser/non-superuser level of control, or having > RLS-exempt users. I agree that it'd be great to have- and we need to make sure we don't paint ourselves into a corner with the initial versions. What I'm worried about is that we're going to end up feature-creeping this to death and ending up with nothing in 9.5. I'll try to get a wiki page going to discuss these items (as mentioned up-thread) and we can look at prioritizing them and looking at what dependencies exist on other parts of the system and seeing what's required for the initial version. > Actually, given the fact that the majority of users won't be using > RLS, I would be tempted to invert the above logic and have the new > privilege be for LIMITED access (via RLS quals). 
So a user granted the > normal SELECT privilege would be able to bypass RLS, but a user only > granted LIMITED SELECT wouldn't. This I don't agree with- it goes against what is done on existing systems afaik and part of the idea is that you can minimize changes to the applications or users but still be able to curtail what they can see. Making regular SELECTs start erroring if they haven't set some GUC because RLS has been implemented on a given table would be quite annoying, imv. Now, that said, wouldn't the end user be able to control this for their particular environment by setting the GUC accordingly in postgresql.conf? I'd still argue that it should be defaulted to what I view as the 'normal' case, where RLS is applied unless you asked for your queries to error instead, but if a user wants to have it flipped around the other way, they could update their postgresql.conf to make it so. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Kevin Grittner
Date:
Stephen Frost <sfrost@snowman.net> wrote: > Kevin Grittner (kgrittn@ymail.com) wrote: >> The proposed approach would leave the validity of any dump which >> was not run as a superuser in doubt. The last thing we need, in >> terms of improving security, is another thing you can't do >> without connecting as a superuser. > > Any dump not run by a superuser is already in doubt, imv. That > is a problem we already have which really needs to be addressed, > but I view that as an independent issue. I'm not seeing that. If the user can't dump, you get an error and pg_dump returns something other than SUCCESS.

test=# create user bob;
CREATE ROLE
test=# create user tom;
CREATE ROLE
test=# set role bob;
SET
test=> create table person(person_id int primary key, name text not null, ssn text);
CREATE TABLE
test=> insert into person values (1, 'Stephen Frost', '123-45-6789');
INSERT 0 1
test=> insert into person values (2, 'Kevin Grittner');
INSERT 0 1
test=> grant select (person_id, name) on person to tom;
GRANT
test=> \q
kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U bob test >bob-backup.sql
kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U tom test >tom-backup.sql
pg_dump: [archiver (db)] query failed: ERROR: permission denied for relation person
pg_dump: [archiver (db)] query was: LOCK TABLE public.person IN ACCESS SHARE MODE
kgrittn@Kevin-Desktop:~/pg/master$ echo $?
1

> I agree with avoiding adding another superuser-only capability; > see the other sub-thread about making this a per-user capability. It should be possible to design something which does not have this risk. What I was saying was that what was being described at that point wasn't it, and IMV was not acceptable. I think that there should never be any doubt that a pg_dump run which completes without error copied all requested tables in their entirety, not a subset of the rows in the tables. 
A GUC which only caused an error on the attempt to actually read specific rows which the user does not have permission to see would leak too much information. A GUC which caused a SELECT or COPY from a table to throw an error if the user was not entitled to see all rows in the table could work. Another thing which could work, if it can be coded, would be a GUC which would throw an error if the there were not quals on the query to prohibit seeing rows which the security conditions would prohibit, whether or not any matching rows actually existed. The latter would match the behavior of column level security -- you get an error when trying to select a prohibited column even if there are no rows in the table. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
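The point about pg_dump's exit status can be sketched in a few lines of Python. This is only an illustration of "treat a dump as valid only when the command exits with status 0"; the helper name `dump_succeeded` is hypothetical, and the demo uses stand-in commands so it runs without a database.

```python
import subprocess
import sys


def dump_succeeded(cmd):
    """Run a dump command and return True only if it exits with status 0.

    Hypothetical helper for illustration; it is not part of pg_dump.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Relay whatever the command wrote to stderr (pg_dump reports the
        # failing query there, as in the session above).
        sys.stderr.write(result.stderr)
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in commands (assumption): a real call might look like
    # ["pg_dump", "-U", "tom", "-f", "tom-backup.sql", "test"].
    ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
    bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
    print(dump_succeeded(ok_cmd))
    print(dump_succeeded(bad_cmd))
```

A backup script built this way would discard the output file whenever the helper returns False, which only gives a complete-or-nothing guarantee if a successful exit never corresponds to a silently filtered dump.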
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Kevin Grittner (kgrittn@ymail.com) wrote: > Stephen Frost <sfrost@snowman.net> wrote: > > Any dump not run by a superuser is already in doubt, imv. That > > is a problem we already have which really needs to be addressed, > > but I view that as an independent issue. > > I'm not seeing that. If the user can't dump, you get an error and > pg_dump returns something other than SUCCESS. We've outlined an approach with RLS which would do the same. I'm still of the opinion that, today, we have a problem that only a superuser-run dump has any chance of success (and even if you get it working today it'll probably break tomorrow, and you had better be paying attention). I'd like to fix that situation, but it's an independent effort from this. We've had issues in the past with pg_dump creating things that can't be restored and they're certainly bugs but trying to make that work with a regular user as a whole system backup strategy, today, is just asking for trouble. > > I agree with avoiding adding another superuser-only capability; > > see the other sub-thread about making this a per-user capability. > > It should be possible to design something which does not have this > risk. The risk that pg_dump might create a dump which can't be restored? Agreed, and I'd love to hear your thoughts on the proposal. > What I was saying was that what was being described at that > point wasn't it, and IMV was not acceptable. I think that there > should never by any doubt that a pg_dump run which completes > without error copied all requested tables in their entirety, not a > subset of the rows in the tables. pg_dump needs to be able to have an option to go either way on this case, as I can see value in running pg_dump in "RLS-enforcing" mode, but it could default to "error-if-RLS". > A GUC which only caused an error on the attempt to actually read > specific rows which the user does not have permission to see would > leak too much information. 
A GUC which caused a SELECT or COPY > from a table to throw an error if the user was not entitled to see > all rows in the table could work. Right- this would be the 'DIRECT SELECT' which would allow bypassing all RLS and therefore mean that the user is allowed to see ALL rows of a table. That's one of the reasons why I agree with Dean's approach, because we really need to know at the outset if the calling user is allowed to extract all rows from a table or not- we can't go looking through the entire table testing each row before we start running the query. > Another thing which could work, > if it can be coded, would be a GUC which would throw an error if > the there were not quals on the query to prohibit seeing rows which > the security conditions would prohibit, whether or not any matching > rows actually existed. If I'm following you correctly, this would be an optimization that allows avoiding RLS in the case where some information about the user causes the overall qual to always return 'true', correct? I'd certainly like to see what happens in that case today and agree that it'd be great to optimize for and perhaps even allow a user for which that is true to not need the 'DIRECT SELECT' privilege, but in practice, I don't think it'll be possible in most cases (certainly not in the case where an external security module is deciding the access) and the optimization may not be worth it. > The latter would match the behavior of > column level security -- you get an error when trying to select a > prohibited column even if there are no rows in the table. Agreed, but that would be a relaxation of the proposed approach and therefore something which could be added later, if it's deemed worthwhile. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Thu, Jun 12, 2014 at 6:33 PM, Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> I'm kind of surprised to see this turn into a hot button all of the sudden
> though, because my thought on all that so far has been a giant so what?
> This is what PostgreSQL does.
[...]
> But let's not act like RLS is a scary bogeyman because it introduces a new
> way to hack the server or get surprising side-effects. That's expected and
> possibly unavoidable behavior in a feature like this, and there are much
> worse instances of arbitrary function risk throughout the core code already.

I have some technical comments on later emails in this thread, but first let me address this point. In the past, people have sometimes complained that reviewers waited until very late in the cycle to complain about issues which they found problematic. By the time the issues were pointed out, insufficient time remained before feature freeze to get those issues addressed, causing the patch to slip out of the release and provoking developer frustration. It has therefore been requested numerous times by numerous people that potential issues be raised as early as possible.

The concerns that I have raised in this thread are not new; I have raised them before. However, we are now at the beginning of a new development cycle, and it seems fair to assume that the people who are working on this patch are hoping very much that something will get committed to 9.5. Since it seems to me that there are unaddressed issues with the design of this patch, I felt that it was a good idea to make sure that those concerns were on the table right from the beginning of the process, rather than waiting until the patch was on the verge of commit or, indeed, already committed. That is why, when this thread was revived on June 10th, I decided that it was a good time to again comment on the design points about which I was concerned.

After sending that one (1) email, I was promptly told that "I'm very disappointed to hear that the mechanical pieces around making RLS easy for users to use ... is receiving such push-back." The push-back, at that point in time, consisted of one (1) email. Several more emails have been sent since that time, including the above-quoted text, seeming to me to imply that the people who are concerned about this feature are being unreasonable. I don't believe I am the only such person, although I may be the main one right at the moment, and you may not be entirely surprised to hear that I don't think I'm being unreasonable.

I will admit that my initial email may have contained just a touch of hyperbole. But I won't admit to more than a touch, and frankly, I think it was warranted. I perfectly well understand that people really, really, really want this feature, and if I hadn't understood that before, I certainly understand it now. However, I believe that there has been a lack of focus in the development of the patch thus far in a couple of key areas - first in terms of articulating how it is different from and better than a writeable security barrier view, and second on how to manage the security and operational aspects of having a feature like this. I think that the discussion subsequent to my June 10th email has led to some good discussion on both points, which was my intent, but I still think much more time and thought needs to be spent on those issues if we are to have a feature which is up to our usual standards. I do apologize to anyone who interpreted that initial email as a pure rant, because it really wasn't intended that way. Contrariwise, I hope that the people defending this patch will admit that the issues I am raising are real and focus on whether and how those concerns can be addressed.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > Yeah, I was thinking something like this could work, but I would go > further. Suppose you had separate GRANTable privileges for direct > access to individual tables, bypassing RLS, e.g. > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name So, is this one new privilege (DIRECT) or four separate new privileges that are variants of the existing privileges (DIRECT SELECT, DIRECT INSERT, DIRECT UPDATE, DIRECT DELETE)? > Actually, given the fact that the majority of users won't be using > RLS, I would be tempted to invert the above logic and have the new > privilege be for LIMITED access (via RLS quals). So a user granted the > normal SELECT privilege would be able to bypass RLS, but a user only > granted LIMITED SELECT wouldn't. Well, for the people who are not using RLS, there's no difference anyway. I think it matters more what users of RLS will expect from a command like GRANT SELECT ... and I'm guessing they'll prefer that RLS always apply unless they very specifically grant the right for RLS to not apply. I might be wrong, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I'm in full agreement we should clearly communicate the issues around
>> pg_dump in particular, because they can't necessarily be eliminated
>> altogether without some major work that's going to take a while to finish.
>> And if the work-around is some sort of GUC for killing RLS altogether,
>> that's ugly but not unacceptable to me as a short-term fix.
>
> A GUC which is enable / disable / error-instead may work quite well, with
> error-instead for pg_dump default if people really want it (there would have
> to be a way to disable that though, imv).
>
> Note that enable is default in general, disable would be for superuser only
> (or on start-up) to disable everything, and error-instead anyone could use
> but it would error instead of implementing RLS when querying an RLS-enabled
> table.
>
> This approach was suggested by an existing user testing out this RLS
> approach, to be fair, but it looks pretty sane to me as a way to address
> some of these concerns. Certainly open to other ideas and thoughts though.

In general, I agree that this is a good approach. I think it will be difficult to have a GUC with three values, one of which is superuser-only and the other two of which are not. I don't think there's any precedent for something like that in the existing framework, and I think it's likely we'll run into unpleasant corner cases if we try to graft it in. Also, I think we need to separate things: whether the system is willing to allow the user to access the table without RLS, and whether the user is willing to accept RLS if the system deems it necessary.

For the first one, two solutions have been proposed. The initial proposal was to insist on RLS except for the superuser (and maybe the table owner?). Having a separate grantable privilege, as Dean suggests, may be better. I'll reply separately to that email also, as I have a question about what he's proposing.

For the second one, I think the two most useful behaviors are "normal mode" - i.e. allow access to the table, applying RLS predicates if required and not applying them if I am exempt - and "error-instead" mode - i.e. if my access to this table would be mediated by an RLS predicate, then throw an error instead. There's a third mode which might be useful as well, which is "even though I have the *right* to bypass the RLS predicate on this table, please apply the predicate anyway". This could be used by the table owner in testing, for example.

Here again, the level of granularity we want to provide is an interesting question. Having a GUC (e.g. enable_row_level_security = on, off, force) would be adequate for pg_dump, but allowing the table name to be qualified in the query, as proposed downthread, would be more granular, admittedly at some parser cost. I'm personally of the view that we *at least* need the GUC, because that seems like the best way to secure pg_dump, and perhaps other applications. We can and should give pg_dump an --allow-row-level-security flag, I think, but pg_dump's default behavior should be to configure the system in such a way that the dump will fail rather than complete with a subset of the data. I'm less sure whether we should have something that can be used to qualify table names in particular queries. I think it would be really useful, but I'm concerned that it will require creating additional fully-reserved keywords, which are somewhat painful for users.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
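The three behaviors described in this message ("normal mode", "error-instead" mode, and "apply the predicate anyway") can be modelled as a small decision function. The following Python sketch is only an illustration of that description; the names `RlsMode`, `RlsError`, and `resolve_access` are hypothetical, and nothing here reflects an actual PostgreSQL implementation or the final spelling of any GUC values.

```python
from enum import Enum


class RlsMode(Enum):
    """Hypothetical names for the three behaviors discussed above."""
    NORMAL = "normal"                # apply RLS unless the user is exempt
    ERROR_INSTEAD = "error-instead"  # fail rather than return filtered rows
    FORCE = "force"                  # apply RLS even if the user is exempt


class RlsError(Exception):
    """Raised when access would be mediated by an RLS predicate."""


def resolve_access(mode, table_has_policy, user_is_exempt):
    """Return "filtered" or "unfiltered", or raise RlsError."""
    if not table_has_policy:
        # No RLS predicate on the table: nothing to apply in any mode.
        return "unfiltered"
    if mode is RlsMode.FORCE:
        # The user asked for the predicate even though they may bypass it.
        return "filtered"
    if user_is_exempt:
        return "unfiltered"
    if mode is RlsMode.ERROR_INSTEAD:
        raise RlsError("access would be filtered by row-level security")
    return "filtered"
```

Under this model, a pg_dump session would run in the error-instead mode by default, so a non-exempt user gets an error rather than a dump containing a subset of the rows.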
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Robert,
On Tuesday, June 17, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
> After sending that one (1) email, I was promptly told that "I'm very
> disappointed to hear that the mechanical pieces around making RLS easy
> for users to use ... is receiving such push-back." The push-back, at
> that point in time, consisted of one (1) email. Several more emails
> have been sent since that time, including the above-quoted text, seeming to
> me to imply that the people who are concerned about this feature are
> being unreasonable. I don't believe I am the only such person,
> although I may be the main one right at the moment, and you may not be
> entirely surprised to hear that I don't think I'm being unreasonable.
I'm on my phone at the moment but that looks like a quote from me. My email and concern there was regarding the specific suggestion that we could check off the "RLS" capability which users have been asking us to provide nearly since I started with PG by saying that they could use Updatable SB views. I did not intend it as a comment regarding the specific technical concerns raised and have been responding to and trying to address those independently and openly.
I've expressed elsewhere on this thread my gratitude that the technical concerns are being brought up now, near the beginning of the cycle, so we can address them. I've been working with others who are interested in RLS on a wiki page to outline and understand the options and identify dependencies and priorities. Hopefully the link will be posted shortly (again, not at a computer right now) and we can get comments back. There are some very specific questions which really need to be addressed and which I've mentioned before (in particular the question of what user the functions in a view definition should run as, both for "normal" views, for SB views, and for when an RLS qual is included and run through that framework, and if doing so would address some of the concerns which have been raised regarding selects running code).
Thanks,
Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost <sfrost@snowman.net> wrote: >> I'm not referring to the proposed implementation particularly; or at >> least not that aspect of it. I don't think trying to run the view >> quals as the defining user is likely to be very appealing, because I >> think it's going to hurt performance, for example by preventing >> function inlining and requiring lots of user-ID switches. > > I understand that there are performance implications. As mentioned to > Tom, realistically, there's zero way to optimized at least some of these > use-cases because they require a completely external module (eg: > SELlinux) to be involved in the decision about who can view what > records. If we can optimize that, it'd be by a completely different > approach whereby we pull up the qual higher because we know the whole > query only involves leakproof functions or similar, allowing us to only > apply the filter to the final set of records prior to them being > returned to the user. The point being that such optimizations would > happen independently and regardless of the quals or user-defined > functions involved. At the end of the day, I can't think of a better > optimization for such a case (where we have to ask an external security > module if a row is acceptable to return to the user) than that. Is > there something specific you're thinking about that we'd be missing out > on? Yeah, if we have to ask an external security module a question for each row, there's little hope of any real optimization. However, I think there will be a significant number of cases where people will want filtering clauses that can be realized by doing an index scan instead of a sequential scan, and if we end up forcing a sequential scan anyway, the feature will be useless to those people. >> But I'm not >> gonna complain if someone wants to mull it over and make a proposal >> for how to make it work. 
Rather, my concern is that all we've got is >> what might be called the core of the feature; the actual guts of it. >> There are a lot of ancillary details that seem to me to be not worked >> out at all yet, or only half-baked. > > Perhaps it's just my experience, but I've been focused on the main core > feature for quite some time and it feels like we're really close to > having it there. I agree that a few additional bits would be nice to > have but these strike me as relatively straight-forward to implement > overtop of this general construct. I do see value in documenting these > concerns and will see about making that happen, along with what the > general viewpoints and thoughts are about how to address the concern. I feel like there's quite a bit of work left to do around these issues. The technical bits may not be too hard, but deciding what we want will take some thought and discussion. >> > The current approach allows a nearly unlimited level of flexibility, >> > should the user wish it, by being able to run user-defined code. >> > Perhaps that would be considered 'one policy', but it could certainly >> > take under consideration the calling user, the object being queried >> > (if a function is defined per table, or if we provide a way to get >> > that information in the function), etc. >> >> In theory, that's true. But in practice, performance will suck unless >> the security qual is easily optimizable. If your security qual is >> WHERE somecomplexfunction() you're going to have to implement that by >> sequential-scanning the table and evaluating the function for each >> row. > > That's not actualy true today, is it? Given our leak-proof attribute, > if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()" > then we would be able to push down the leak-proof function and not > necessairly run a straight sequential scan, no? 
> Even so, though, we've
> had users who have tested exactly what this patch implements and they've
> been happy with their real-world use-cases. I'm certainly all for
> optimization and would love to see us make this better for everyone, but
> I don't view that as a reason to delay this particular feature which is
> really just bringing us up to parity with other RDBMS's.

I'm a bit confused here, because your example seems to be totally different from my example. In my example, somecomplexfunction() will get pushed down because it's the security qual; that needs to be inside the security_barrier view, or a malicious user can subvert the system by getting some other qual evaluated first. In your example, you seem to be imagining WHERE somecomplexfunction() AND leakprooffunctionx() as queries sent by the untrusted user, in which case, yes, the leak-proof one will get pushed down and the other one will not.

>> But that's probably not going to perform very well, because to match
>> an index on sales_rep_id, or an index on partner_id, that's going to
>> have to get simplified a whole lot, and that's probably not going to
>> happen. If we've only got one branch of the OR, I think we'll realize
>> we can evaluate the subquery as an InitPlan and then use an index, but
>> with two branches I think that will fail.
>
> You're right- we could perform better in such a case.
>
> What solution did you come up with for this case, which performed well
> and was also secure..?

I put the logic in the client. :-(

>> I don't want to overstate the importance of this particular case; but
>> I do think scenarios in which it's advantageous to have multiple
>> row-level security policies are plausible.
>
> I'm not against this in general. The question, in my mind, is what
> level of granularity we would provide this at. As I tried to outline
> previously, there's a huge number of combinations which we could come up
> with to support this under and I'm not 100% sure that it'd actually end
> up being better than the simplicity of a single qual where the user gets
> to define any kind of relationship they want between the various
> policies; even programmatically if they want.

I agree. That's why I think we need some more design work in this area. Perhaps it's OK to allow only one RLS-qual per table at most, and tell people that if you want more than that, you need to use security-barrier views as wrappers instead. But I'm not sure; that feels like it's giving something up that might be important. And I think that the kinds of syntax we're discussing won't support leaving that out of the initial version and adding it later, so if we commit to this syntax, we're stuck with that behavior. To avoid that, we'd need something like this:

ALTER TABLE tab ADD POLICY polname WHERE quals;
GRANT SELECT (polname) ON TABLE tab TO role;

>> Another, perhaps-simpler
>> example is that you might have a table containing unclassified data,
>> classified data, and secret data. You want to give access to the
>> unclassified data only to one category of users; access to the
>> unclassified data and the classified data to a second group of
>> more-trusted users; and access to all of the data to a third group of
>> very highly trusted users. If the table can only have one security
>> policy that applies to everyone who isn't exempt, how will you do
>> that? This sort of use case seems very plausible to me so I think we
>> need to give some real thought to what we will recommend to users who
>> want to do things like this. Can the proposed patch handle it? How?
> > There are multiple ways this could be implemented- the first, basic, way > would be through a table which maps users to security levels via an enum > where more privileged levels are higher in value and therefore a simple > greater-than could be applied after a join which would implement this > particular policy. Interesting. >> What do you mean by "data changing"? If you mean inserts, updates, >> and deletes, I am very sure people are going to want to perform those >> operations on RLS-enabled tables. > > Yes, they'll want to support those operations. However, they will not > expect RLS to allow them to redefine a columns as "x+10" instead of "x", > which a view does allow. Hmm, I think some users do want to do things like this. There are previous discussions of wanting to fuzz a set of coordinates, for example, or blank out a certain list of columns. >> Do you find it implausible that someone will want to allow a certain >> table to bypass RLS when selecting rows, but not when updating or >> deleting them? I find those scenarios very plausible. > > This is also plausible and something which we were anticipating while > developing this patch. Simon, KaiGai and I specifically discussed > addressing SELECT vs UPDATE/DELETE earlier this year, as I recall. > Providing that level of flexibility is absolutely on the road map, but I > don't know that it all has to exist in 9.5; it may, which would be > great, but I don't view it as required. I think we at least need to have a clear design for it before committing anything. Otherwise we may find that we've committed to syntax which backs us into a corner. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
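The enum-based mapping from the quoted reply (users mapped to security levels, with a greater-than comparison deciding visibility) can be illustrated outside the database. The Python sketch below is a toy model under stated assumptions: the level names follow the unclassified/classified/secret example, while the user names, the sample rows, and the function `visible_rows` are all hypothetical. A real policy would express the same comparison as a qual over a join.

```python
from enum import IntEnum


class Level(IntEnum):
    """More privileged levels have higher values, as in the quoted reply."""
    UNCLASSIFIED = 1
    CLASSIFIED = 2
    SECRET = 3


# Hypothetical stand-in for the table mapping users to security levels.
USER_LEVELS = {
    "alice": Level.UNCLASSIFIED,
    "bob": Level.CLASSIFIED,
    "carol": Level.SECRET,
}

# Hypothetical stand-in for the protected table; each row carries a level.
ROWS = [
    {"id": 1, "level": Level.UNCLASSIFIED},
    {"id": 2, "level": Level.CLASSIFIED},
    {"id": 3, "level": Level.SECRET},
]


def visible_rows(user, rows=ROWS, user_levels=USER_LEVELS):
    """Return the rows whose level does not exceed the user's clearance."""
    clearance = user_levels.get(user)
    if clearance is None:
        # Users with no mapping see nothing.
        return []
    return [row for row in rows if clearance >= row["level"]]
```

This covers the three-tier case with a single qual per table, but it does not by itself address the concern about whether such a qual can be matched to an index.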
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
"Brightwell, Adam"
Date:
Robert,
> However, I believe that
> there has been a lack of focus in the development of the patch thus
> far in a couple of key areas - first in terms of articulating how it
> is different from and better than a writeable security barrier view,
> and second on how to manage the security and operational aspects of
> having a feature like this. I think that the discussion subsequent to
> my June 10th email has led to some good discussion on both points,
> which was my intent, but I still think much more time and thought
> needs to be spent on those issues if we are to have a feature which is
> up to our usual standards. I do apologize to anyone who interpreted
> that initial email as a pure rant, because it really wasn't intended that
> way. Contrariwise, I hope that the people defending this patch will
> admit that the issues I am raising are real and focus on whether and
> how those concerns can be addressed.
I absolutely appreciate all of the feedback that has been provided. It has been educational. To your point above, I started putting together a wiki page, as Stephen has spoken to, that is meant to capture these concerns and considerations as well as to capture ideas around solutions.
This page is obviously not complete, but I think it is a good start. Hopefully this document will help to continue the conversation and assist in addressing all the concerns that have been brought to the table. As well, I hope that this document serves to demonstrate our intent and that we *are* taking these concerns seriously. I assure you that as one of the individuals who is working towards the acceptance of this feature/patch, I am very much concerned about meeting the expected standards of quality and security.
Thanks,
Adam
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote: > On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost <sfrost@snowman.net> wrote: > > This approach was suggested by an existing user testing out this RLS > > approach, to be fair, but it looks pretty sane to me as a way to address > > some of these concerns. Certainly open to other ideas and thoughts though. > > In general, I agree that this is a good approach. I think it will be > difficult to have a GUC with three values, one of which is > superuser-only and the other two of which are not. I don't think > there's any precedent for something like that in the existing > framework, and I think it's likely we'll run into unpleasant corner > cases if we try to graft it in. Also, I think we need to separate > things: whether the system is willing to allow the user to access the > table without RLS, and whether the user is willing to accept RLS if > the system deems it necessary. Good point- I agree that it's best to avoid having to support individual superuser-only only options on a GUC. Also, addressing the issues independently also makes sense to me. > For the first one, two solutions have been proposed. The initial > proposal was to insist on RLS except for the superuser (and maybe the > table owner?). Having a separate grantable privilege, as Dean > suggests, may be better. I'll reply separately to that email also, as > I have a question about what he's proposing. I like the idea of a grantable privilege as it allows the granularity that some users may require (or be frustrated that we don't have it). > For the second one, I think the two most useful behaviors are "normal > mode" - i.e. allow access to the table, applying RLS predicates if > required and not applying them if I am exempt - and "error-instead" > mode - i.e. if my access to this table would be mediated by an RLS > predicate, then throw an error instead. Right, makes sense. 
> There's a third mode which > might be useful as well, which is "even though I have the *right* to > bypass the RLS predicate on this table, please apply the predicate > anyway". This could be used by the table owner in testing, for > example. Agreed, this sounds very useful too. > Here again, the level of granularity we want to provide is > an interesting question. Having a GUC (e.g. enable_row_level_security > = on, off, force) would be adequate for pg_dump, but allowing the > table name to be qualified in the query, as proposed downthread, would > be more granular, admittedly at some parser cost. I'm personally of > the view that we *at least* need the GUC, because that seems like the > best way to secure pg_dump, and perhaps other applications. We can > and should give pg_dump an--allow-row-level-security flag, I think, > but pg_dump's default behavior should be to configure the system in > such a way that the dump will fail rather than complete with a subset > of the data. This sounds good to me. > I'm less sure whether we should have something that can > be used to qualify table names in particular queries. I think it > would be really useful, but I'm concerned that it will require > creating additional fully-reserved keywords, which are somewhat > painful for users. I've been trying to think of the use-case for this. It certainly *sounds* nice, but on reflection, the use-case for this seems to me to be that you're trying to develop some application which will be constrained by RLS totally and therefore want to flip back-and-forth between "RLS on" and "RLS off" (for the tables involved). When would you really need, in the same query, to have RLS enabled for table X but disabled for table Y? I do like the idea of an *independent* option to (just) COPY which says "give me all the data or error, independent of the GUC for the same purpose". Would be curious to hear what others think of that proposal. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote: > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > Yeah, I was thinking something like this could work, but I would go > > further. Suppose you had separate GRANTable privileges for direct > > access to individual tables, bypassing RLS, e.g. > > > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > So, is this one new privilege (DIRECT) or four separate new privileges > that are variants of the existing privileges (DIRECT SELECT, DIRECT > INSERT, DIRECT UPDATE, DIRECT DELETE)? I had taken it to be a single privilege, but you're right, it could be done for each of those.. I really don't think we have the bits for more than one case here though (if that) without a fair bit of additional rework. I'm not against that rework (and called for it wayyy back when I proposed the TRUNCATE privilege, as I recall) but that's a whole different challenge and no small bit of work.. > > Actually, given the fact that the majority of users won't be using > > RLS, I would be tempted to invert the above logic and have the new > > privilege be for LIMITED access (via RLS quals). So a user granted the > > normal SELECT privilege would be able to bypass RLS, but a user only > > granted LIMITED SELECT wouldn't. > > Well, for the people who are not using RLS, there's no difference > anyway. I think it matters more what users of RLS will expect from a > command like GRANT SELECT ... and I'm guessing they'll prefer that RLS > always apply unless they very specifically grant the right for RLS to > not apply. I might be wrong, though. The preference from the folks using RLS that I've talked to is absolutely that it be applied by default for all 'normal' (eg: non-pg_dump) sessions. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > I understand that there are performance implications. As mentioned to Tom, realistically, there's zero way to optimize at least some of these use-cases because they require a completely external module (eg: SELinux) to be involved in the decision about who can view what records. If we can optimize that, it'd be by a completely different approach whereby we pull up the qual higher because we know the whole query only involves leakproof functions or similar, allowing us to only apply the filter to the final set of records prior to them being returned to the user. The point being that such optimizations would happen independently and regardless of the quals or user-defined functions involved. At the end of the day, I can't think of a better optimization for such a case (where we have to ask an external security module if a row is acceptable to return to the user) than that. Is there something specific you're thinking about that we'd be missing out on?
>
> Yeah, if we have to ask an external security module a question for each row, there's little hope of any real optimization. However, I think there will be a significant number of cases where people will want filtering clauses that can be realized by doing an index scan instead of a sequential scan, and if we end up forcing a sequential scan anyway, the feature will be useless to those people.

I agree that we want to support that, if we can do so reasonably. What I was trying to get at is simply this- don't we provide that already with the leakproof attribute and functions? If we don't have enough there to allow index scans then we should be looking to add more, I'm thinking.

> > Perhaps it's just my experience, but I've been focused on the main core feature for quite some time and it feels like we're really close to having it there. I agree that a few additional bits would be nice to have but these strike me as relatively straight-forward to implement on top of this general construct. I do see value in documenting these concerns and will see about making that happen, along with what the general viewpoints and thoughts are about how to address the concern.
>
> I feel like there's quite a bit of work left to do around these issues. The technical bits may not be too hard, but deciding what we want will take some thought and discussion.

I agree on this point, but I'm still hopeful that we'll be able to get a good feature into 9.5. There are quite a few resources available for the 'just programming' part, so the long pole in the tent here is absolutely hashing out what we want and how it should function.

I'd be happy to host or participate in a conference call or similar if that would be useful to move this along- or we can continue to communicate via email. There's a bit of a lull in conferences to which I'm going to right now, so in person is unlikely, unless folks want to get together somewhere on the east coast (I'd be happy to travel to Philly, Pittsburgh, NYC, etc, if it'd help..).

> > That's not actually true today, is it? Given our leak-proof attribute, if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()" then we would be able to push down the leak-proof function and not necessarily run a straight sequential scan, no? Even so, though, we've had users who have tested exactly what this patch implements and they've been happy with their real-world use-cases. I'm certainly all for optimization and would love to see us make this better for everyone, but I don't view that as a reason to delay this particular feature which is really just bringing us up to parity with other RDBMSs.
>
> I'm a bit confused here, because your example seems to be totally different from my example. In my example, somecomplexfunction() will get pushed down because it's the security qual; that needs to be inside the security_barrier view, or a malicious user can subvert the system by getting some other qual evaluated first. In your example, you seem to be imagining WHERE somecomplexfunction() AND leakprooffunctionx() as queries sent by the untrusted user, in which case, yes, the leak-proof one will get pushed down and the other one will not.

Right- my point there was that the leakproof one might allow an index scan to be run. This is all pretty hand-wavey, I admit, so I'll see if I can get more details about how the currently-proposed patch is performing for the users who are testing it and what kind of plans they're seeing. If that falls through, I'll try and build up my own set of realistic-looking (to myself and the users who are testing) examples.

> > What solution did you come up with for this case, which performed well and was also secure..?
>
> I put the logic in the client. :-(

Well, that's not helpful here. ;)

> >> I don't want to overstate the importance of this particular case; but I do think scenarios in which it's advantageous to have multiple row-level security policies are plausible.
> >
> > I'm not against this in general. The question, in my mind, is what level of granularity we would provide this at. As I tried to outline previously, there's a huge number of combinations which we could come up with to support this under and I'm not 100% sure that it'd actually end up being better than the simplicity of a single qual where the user gets to define any kind of relationship they want between the various policies; even programmatically if they want.
>
> I agree. That's why I think we need some more design work in this area. Perhaps it's OK to allow only one RLS-qual per table at most, and tell people that if you want more than that, you need to use security-barrier views as wrappers instead.

Note that my suggestion would be to simply put a pl/pgsql call (perhaps a security definer one) into the RLS definition- not to say "use views".

> But I'm not sure; that feels like it's giving something up that might be important. And I think that the kinds of syntax we're discussing won't support leaving that out of the initial version and adding it later, so if we commit to this syntax, we're stuck with that behavior. To avoid that, we'd need something like this:
>
> ALTER TABLE tab ADD POLICY polname WHERE quals;
> GRANT SELECT (polname) ON TABLE tab TO role;

Right, if we were to support multiple policies on a given table then we would have to support adding and removing them individually, as well as specify when they are to be applied- and what if that "when" overlaps? Do we apply both and only a row which passed them all gets sent to the user? Essentially we'd be defining the RLS policies to be AND'd together, right? Would we want to support both AND-based and OR-based, and allow users to pick what set of conditionals they want applied to their various overlapping RLS policies?

Sounds all rather painful and much better done programmatically by the user in a language which is suited to that task- eg: pl/pgsql, perl, C, or something besides our ALTER syntax + catalog representation.

> >> What do you mean by "data changing"? If you mean inserts, updates, and deletes, I am very sure people are going to want to perform those operations on RLS-enabled tables.
> >
> > Yes, they'll want to support those operations. However, they will not expect RLS to allow them to redefine a column as "x+10" instead of "x", which a view does allow.
>
> Hmm, I think some users do want to do things like this. There are previous discussions of wanting to fuzz a set of coordinates, for example, or blank out a certain list of columns.

Absolutely they'll want to be able to do this- but that's going to be a case which I (and others, I think) feel comfortable going back and saying "use views for that". I'm trying to draw that line in the ground between what is RLS and what are views and keeping RLS to the WHERE clause strikes me as a good line to draw (and one which matches up with existing expectations in this space).

> > This is also plausible and something which we were anticipating while developing this patch. Simon, KaiGai and I specifically discussed addressing SELECT vs UPDATE/DELETE earlier this year, as I recall. Providing that level of flexibility is absolutely on the road map, but I don't know that it all has to exist in 9.5; it may, which would be great, but I don't view it as required.
>
> I think we at least need to have a clear design for it before committing anything. Otherwise we may find that we've committed to syntax which backs us into a corner.

Fair enough. There was some support for this idea in the original patch by Craig, but we can further develop this syntax (and what it may look like for 9.5, if it ends up not covering all cases).

Thanks!

Stephen
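The push-down rule discussed in this message (a qual supplied by the untrusted user may be evaluated ahead of the security qual only if it is leakproof) can be modelled in a few lines. This is a toy illustration in Python, not planner code: the `Qual` class and `split_user_quals` function are hypothetical names, and the two function names are simply the placeholders used in the message.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Qual:
    """A WHERE-clause condition supplied by the (untrusted) user."""
    text: str
    leakproof: bool  # True if everything the qual calls is marked LEAKPROOF


def split_user_quals(user_quals: List[Qual]) -> Tuple[List[Qual], List[Qual]]:
    """Split user quals into (pushed_down, kept_above).

    Leakproof quals may be pushed below the security qual, where they can
    help choose an index scan. Non-leakproof quals stay above it, so they
    only ever see rows the security qual has already accepted.
    """
    pushed_down = [q for q in user_quals if q.leakproof]
    kept_above = [q for q in user_quals if not q.leakproof]
    return pushed_down, kept_above


# The example from the message: WHERE somecomplexfunction() AND leakprooffunctionx()
user_quals = [
    Qual("somecomplexfunction()", leakproof=False),
    Qual("leakprooffunctionx()", leakproof=True),
]
pushed_down, kept_above = split_user_quals(user_quals)
```

Whether the pushed-down qual actually yields an index scan still depends on the available indexes and on selectivity, which this sketch does not model.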
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 17, 2014 at 9:45 PM, Brightwell, Adam <adam.brightwell@crunchydatasolutions.com> wrote:
> I absolutely appreciate all of the feedback that has been provided. It has been educational. To your point above, I started putting together a wiki page, as Stephen has spoken to, that is meant to capture these concerns and considerations as well as to capture ideas around solutions.
>
> https://wiki.postgresql.org/wiki/Row_Security_Considerations
>
> This page is obviously not complete, but I think it is a good start. Hopefully this document will help to continue the conversation and assist in addressing all the concerns that have been brought to the table. As well, I hope that this document serves to demonstrate our intent and that we *are* taking these concerns seriously. I assure you that as one of the individuals who is working towards the acceptance of this feature/patch, I am very much concerned about meeting the expected standards of quality and security.

Cool, thanks for weighing in. I think that page is a good start. An item that I think should be added there is the potential overlap between security_barrier views and row-level security. How can we reuse code (and SQL syntax?) for existing features like WITH CHECK OPTION instead of writing new code (and inventing new syntax) for very similar concepts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> > Yeah, I was thinking something like this could work, but I would go further. Suppose you had separate GRANTable privileges for direct access to individual tables, bypassing RLS, e.g.
>> >
>> > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name
>>
>> So, is this one new privilege (DIRECT) or four separate new privileges that are variants of the existing privileges (DIRECT SELECT, DIRECT INSERT, DIRECT UPDATE, DIRECT DELETE)?
>
> I had taken it to be a single privilege, but you're right, it could be done for each of those.. I really don't think we have the bits for more than one case here though (if that) without a fair bit of additional rework. I'm not against that rework (and called for it wayyy back when I proposed the TRUNCATE privilege, as I recall) but that's a whole different challenge and no small bit of work..

Technically, there are 4 bits left, and that's what we need for separate privileges. We last consumed bits in 2008 (for TRUNCATE) and 2006 (for GRANT ON DATABASE), so even if we used all of the remaining bits it might be another 5 years before anyone has to do that refactoring.

But even if the refactoring needs to be done now for some reason, it's only June, and the last CommitFest doesn't start until February 15th. I think we're being way too quick to jump to talking about what can and can't be done in time for 9.5. Let's start by figuring out how we'd really like it to work and then, if it's too ambitious, we can scale it back.

My main concern about using only one bit is that someone might want to allow a user to bypass RLS on SELECT while still enforcing it for data-modifying operations. That seems like a plausible use case to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
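The bit accounting in the message above can be made concrete with a small model. This is a sketch in Python, not PostgreSQL source: the 16-bit width and the list of twelve existing privileges are assumptions chosen to match the "4 bits left" figure, and the `DIRECT_*` names are hypothetical, following the GRANT DIRECT proposal quoted above.

```python
# Assumption: privileges share one 16-bit field, of which 12 bits are in use.
ACL_PRIVILEGE_BITS = 16
EXISTING_PRIVILEGES = [
    "INSERT", "SELECT", "UPDATE", "DELETE", "TRUNCATE", "REFERENCES",
    "TRIGGER", "EXECUTE", "USAGE", "CREATE", "CREATE_TEMP", "CONNECT",
]
# Hypothetical new privileges, one per command, per the GRANT DIRECT proposal.
DIRECT_PRIVILEGES = [
    "DIRECT_SELECT", "DIRECT_INSERT", "DIRECT_UPDATE", "DIRECT_DELETE",
]


def assign_bits(names, start):
    """Give each privilege name its own bit, counting up from bit `start`."""
    return {name: 1 << (start + i) for i, name in enumerate(names)}


existing = assign_bits(EXISTING_PRIVILEGES, 0)
bits_left = ACL_PRIVILEGE_BITS - len(existing)

direct = assign_bits(DIRECT_PRIVILEGES, len(existing))
bits_left_after = bits_left - len(direct)


def has_privilege(mask, bit):
    """True if `bit` is set in the privilege mask."""
    return (mask & bit) != 0


def rls_applies(mask, command):
    """RLS is enforced for `command` unless the matching DIRECT bit is set."""
    return not has_privilege(mask, direct["DIRECT_" + command])


# A role that may bypass RLS when reading but not when updating.
role_mask = existing["SELECT"] | existing["UPDATE"] | direct["DIRECT_SELECT"]
```

With four separate bits, the use case named in the message (bypass on SELECT, enforcement on data-modifying commands) is expressible; with a single DIRECT bit it would not be. The cost is that the four new privileges consume every remaining bit in this model.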
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I had taken it to be a single privilege, but you're right, it could be done for each of those.. I really don't think we have the bits for more than one case here though (if that) without a fair bit of additional rework. I'm not against that rework (and called for it wayyy back when I proposed the TRUNCATE privilege, as I recall) but that's a whole different challenge and no small bit of work..
>
> Technically, there are 4 bits left, and that's what we need for separate privileges.

I'd really hate to chew them all up..

> We last consumed bits in 2008 (for TRUNCATE) and 2006 (for GRANT ON DATABASE), so even if we used all of the remaining bits it might be another 5 years before anyone has to do that refactoring.

Perhaps, or we might come up with some new whiz-bang permission to add next year. :/

> But even if the refactoring needs to be done now for some reason, it's only June, and the last CommitFest doesn't start until February 15th. I think we're being way too quick to jump to talking about what can and can't be done in time for 9.5. Let's start by figuring out how we'd really like it to work and then, if it's too ambitious, we can scale it back.

Alright- perhaps we can discuss what kind of refactoring would be needed for such a change then, to get a better idea as to the scope of the change and the level of effort required.

My thoughts on how to address this were to segregate the ACL bits by object type. That is to say, the AclMode stored for databases might only use bits 0-2 (create/connect/temporary), while tables would use bits 0-7 (insert/select/update/delete/references/trigger). This would allow us to more easily add more rights at the database and/or tablespace level too.

> My main concern about using only one bit is that someone might want to allow a user to bypass RLS on SELECT while still enforcing it for data-modifying operations. That seems like a plausible use case to me.

I absolutely agree that it's a real use-case and one which we should support, just trying to avoid biting off more than can be done between now and February. Still, if we get things hammered out and more-or-less agreement on the way forward, getting the code written may move quickly.

Thanks,

Stephen
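The idea of segregating ACL bits by object type can be sketched as one bit namespace per object type, so that the same bit position means different things for a database and for a table. The following Python model is only an illustration: the table layout, the 16-bit width, and the helper names are assumptions, and the bit positions are not taken from any PostgreSQL header.

```python
# Hypothetical per-object-type privilege lists; list position is the bit number.
PRIVILEGES_BY_OBJECT = {
    "database": ["CREATE", "CONNECT", "TEMPORARY"],
    "table": ["INSERT", "SELECT", "UPDATE", "DELETE",
              "TRUNCATE", "REFERENCES", "TRIGGER"],
}
ACL_PRIVILEGE_BITS = 16  # assumed width of the privilege field


def bit_for(object_type, privilege):
    """Bit for `privilege`, interpreted within `object_type`'s own namespace."""
    return 1 << PRIVILEGES_BY_OBJECT[object_type].index(privilege)


def free_bits(object_type):
    """Bits still unassigned for this object type."""
    return ACL_PRIVILEGE_BITS - len(PRIVILEGES_BY_OBJECT[object_type])


def describe(object_type, mask):
    """Decode a mask back into privilege names for the given object type."""
    names = PRIVILEGES_BY_OBJECT[object_type]
    return [name for i, name in enumerate(names) if mask & (1 << i)]


table_mask = bit_for("table", "SELECT") | bit_for("table", "UPDATE")
```

In this model bit 0 is CREATE for a database and INSERT for a table, so a mask can only be decoded together with its object type. That is the trade-off: every object type gets far more headroom, but any code that currently treats privilege bits as globally meaningful would need to know which kind of object it is looking at.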
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> Yeah, if we have to ask an external security module a question for each row, there's little hope of any real optimization. However, I think there will be a significant number of cases where people will want filtering clauses that can be realized by doing an index scan instead of a sequential scan, and if we end up forcing a sequential scan anyway, the feature will be useless to those people.
>
> I agree that we want to support that, if we can do so reasonably. What I was trying to get at is simply this- don't we provide that already with the leakproof attribute and functions? If we don't have enough there to allow index scans then we should be looking to add more, I'm thinking.

So the reason why we got onto this particular topic was because of the issue of multiple security policies for a single table. Of course, multiple security policies can always be merged into a single more-complex policy, but the resulting policy may be so complex that the query-planner is no longer capable of doing a good job optimizing it. I won't mention here exactly what a certain large commercial database vendor has implemented here; suffice it to say, however, that their design avoids this pitfall, and ours currently does not.

> I agree on this point, but I'm still hopeful that we'll be able to get a good feature into 9.5. There are quite a few resources available for the 'just programming' part, so the long pole in the tent here is absolutely hashing out what we want and how it should function.

Agreed.

> I'd be happy to host or participate in a conference call or similar if that would be useful to move this along- or we can continue to communicate via email. There's a bit of a lull in conferences to which I'm going to right now, so in person is unlikely, unless folks want to get together somewhere on the east coast (I'd be happy to travel to Philly, Pittsburgh, NYC, etc, if it'd help..).

For me, email is easiest; but there are other options, too.

>> > What solution did you come up with for this case, which performed well and was also secure..?
>>
>> I put the logic in the client. :-(
>
> Well, that's not helpful here. ;)

Sure. The reason I brought it up is to say - hey, look, I had this come up in the real world. What would it take to be able to actually do it in the database server? And the answer is - something that will handle multiple security policies cleanly.

>> But I'm not sure; that feels like it's giving something up that might be important. And I think that the kinds of syntax we're discussing won't support leaving that out of the initial version and adding it later, so if we commit to this syntax, we're stuck with that behavior. To avoid that, we'd need something like this:
>>
>> ALTER TABLE tab ADD POLICY polname WHERE quals;
>> GRANT SELECT (polname) ON TABLE tab TO role;
>
> Right, if we were to support multiple policies on a given table then we would have to support adding and removing them individually, as well as specify when they are to be applied- and what if that "when" overlaps? Do we apply both and only a row which passed them all gets sent to the user? Essentially we'd be defining the RLS policies to be AND'd together, right? Would we want to support both AND-based and OR-based, and allow users to pick what set of conditionals they want applied to their various overlapping RLS policies?

AND is not a sensible policy; it would need to be OR. If you grant someone access to two different subsets of the rows in a table, it stands to reason that they will expect to have access to all of the rows that are in at least one of those subsets. If you give someone your car key and your house key, that means they can operate your car or enter your house; it does not mean that they can operate your car but only when it's inside your garage.

Alternatively, we could:

- Require the user to specify in some way which of the available policies they want applied, and then apply only that one.

or

- Decide that such scenarios constitute misconfiguration. Throw an error and make the table owner or other relevant local authority fix it.

> Sounds all rather painful and much better done programmatically by the user in a language which is suited to that task- eg: pl/pgsql, perl, C, or something besides our ALTER syntax + catalog representation.

I think exactly the opposite, for the query planning reasons previously stated. I think the policies will quickly get so complicated that they're no longer optimizable. Here's a simple example:

- Policy 1 allows the user to access rows for which complexfunc() returns true.
- Policy 2 allows the user to access rows for which a = 1.

Most users have access only through policy 2, but some have access through policy 1. Users who have access through policy 1 will always get a sequential scan, but users who have access through policy 2 have an excellent chance of getting an index scan if the selectivity of a = 1 is high. When you merge those two things into a single policy, no matter how you do it, everyone gets sequential scans all the time. That sucks.

>> Hmm, I think some users do want to do things like this. There are previous discussions of wanting to fuzz a set of coordinates, for example, or blank out a certain list of columns.
>
> Absolutely they'll want to be able to do this- but that's going to be a case which I (and others, I think) feel comfortable going back and saying "use views for that". I'm trying to draw that line in the ground between what is RLS and what are views and keeping RLS to the WHERE clause strikes me as a good line to draw (and one which matches up with existing expectations in this space).

Fair.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
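The difference between OR-combined and AND-combined policies in the message above can be shown on a three-row table. This is a minimal Python model, assuming an in-memory list of rows; `complexfunc` here is a hypothetical stand-in for the expensive predicate of Policy 1, and nothing in the sketch reflects actual planner or executor behaviour.

```python
# Toy table: column "a" plus a flag that the stand-in complexfunc() inspects.
rows = [
    {"id": 1, "a": 1, "flagged": False},
    {"id": 2, "a": 2, "flagged": True},
    {"id": 3, "a": 1, "flagged": True},
]


def complexfunc(row):
    """Hypothetical stand-in for Policy 1's expensive, opaque predicate."""
    return row["flagged"]


policies = {
    "policy1": complexfunc,              # rows for which complexfunc() is true
    "policy2": lambda row: row["a"] == 1,  # rows for which a = 1
}


def visible_ids(granted, combine=any):
    """Ids of rows the role can see, given the policies it has been granted.

    combine=any models OR semantics (a row passes if any policy accepts it);
    combine=all models AND semantics (every granted policy must accept it).
    """
    return [
        row["id"]
        for row in rows
        if combine(policies[name](row) for name in granted)
    ]


only_policy2 = visible_ids(["policy2"])
both_or = visible_ids(["policy1", "policy2"], combine=any)
both_and = visible_ids(["policy1", "policy2"], combine=all)
```

Under OR, being granted a second policy can only widen what the role sees, which matches the car-key and house-key analogy. Under AND, the same grant narrows it to the intersection. The sketch also shows why a role holding only policy2 never needs to evaluate `complexfunc` at all, which is the property that keeps an index scan on `a = 1` possible.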
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> > I had taken it to be a single privilege, but you're right, it could be done for each of those.. I really don't think we have the bits for more than one case here though (if that) without a fair bit of additional rework. I'm not against that rework (and called for it wayyy back when I proposed the TRUNCATE privilege, as I recall) but that's a whole different challenge and no small bit of work..
>>
>> Technically, there are 4 bits left, and that's what we need for separate privileges.
>
> I'd really hate to chew them all up..

Usually it's the patch author who WANTS to chew up all the available bit space and OTHER people who say no. :-)

>> We last consumed bits in 2008 (for TRUNCATE) and 2006 (for GRANT ON DATABASE), so even if we used all of the remaining bits it might be another 5 years before anyone has to do that refactoring.
>
> Perhaps, or we might come up with some new whiz-bang permission to add next year. :/

Well, people proposed separate permissions for things like VACUUM and ANALYZE around the time TRUNCATE was added, and those were rejected on the grounds that they didn't add enough value to justify wasting bits on them. I think we should see whether there's a workable system such that marginal permissions (like TRUNCATE) that won't be checked often don't have to consume bits.

>> But even if the refactoring needs to be done now for some reason, it's only June, and the last CommitFest doesn't start until February 15th. I think we're being way too quick to jump to talking about what can and can't be done in time for 9.5. Let's start by figuring out how we'd really like it to work and then, if it's too ambitious, we can scale it back.
>
> Alright- perhaps we can discuss what kind of refactoring would be needed for such a change then, to get a better idea as to the scope of the change and the level of effort required.
>
> My thoughts on how to address this were to segregate the ACL bits by object type. That is to say, the AclMode stored for databases might only use bits 0-2 (create/connect/temporary), while tables would use bits 0-7 (insert/select/update/delete/references/trigger). This would allow us to more easily add more rights at the database and/or tablespace level too.

Yeah, that's another idea. But it really deserves its own thread. I'm still not convinced we have to do this at all to meet this need, but that should be argued back and forth on that other thread.

>> My main concern about using only one bit is that someone might want to allow a user to bypass RLS on SELECT while still enforcing it for data-modifying operations. That seems like a plausible use case to me.
>
> I absolutely agree that it's a real use-case and one which we should support, just trying to avoid biting off more than can be done between now and February. Still, if we get things hammered out and more-or-less agreement on the way forward, getting the code written may move quickly.

Nifty.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I agree that we want to support that, if we can do so reasonably. What I was trying to get at is simply this- don't we provide that already with the leakproof attribute and functions? If we don't have enough there to allow index scans then we should be looking to add more, I'm thinking.
>
> So the reason why we got onto this particular topic was because of the issue of multiple security policies for a single table. Of course, multiple security policies can always be merged into a single more-complex policy, but the resulting policy may be so complex that the query-planner is no longer capable of doing a good job optimizing it.

Yeah, I could see that happening with some use-cases.

> >> ALTER TABLE tab ADD POLICY polname WHERE quals;
> >> GRANT SELECT (polname) ON TABLE tab TO role;
> >
> > Right, if we were to support multiple policies on a given table then we would have to support adding and removing them individually, as well as specify when they are to be applied- and what if that "when" overlaps? Do we apply both and only a row which passed them all gets sent to the user? Essentially we'd be defining the RLS policies to be AND'd together, right? Would we want to support both AND-based and OR-based, and allow users to pick what set of conditionals they want applied to their various overlapping RLS policies?
>
> AND is not a sensible policy; it would need to be OR. If you grant someone access to two different subsets of the rows in a table, it stands to reason that they will expect to have access to all of the rows that are in at least one of those subsets.

I think I can buy off on this. What that also means is that any 'short-circuiting' that we try to do here would be based on "stop once we get back a 'true'". This could seriously change how we actually implement RLS though as doing it all through query rewrites and making this work with multiple security policies which are OR'd together and yet keeping the optimization and qual push-down and index-based plans is looking pretty daunting.

I'm also of the opinion that this isn't strictly necessary for the initial RLS offering in PG- there's a clear way we could migrate existing users to a multi-policy system from a single-policy system. Sure, to get the performance and optimization benefits that we'd presumably have in the multi-policy case they'd need to re-work their RLS configuration, but for users who care, they'll likely be very happy to do so to gain those benefits.

Perhaps the question here is- if we implement RLS one way for the single case and then change the implementation all around for the multi case, will we end up breaking the single case? Or destroying the performance for it? I can't see either of those cases being allowed- if and when we support multi, we must still support single and the whole point of multi would be to allow more performant implementations and that solution will require the single case to be at least as performant as what we're proposing to do today, I believe.

Or are you thinking that we would never support calling user-defined functions in any RLS scheme because we want to be able to do that optimization? I don't see that being acceptable from a feature standpoint.

> Alternatively, we could:
>
> - Require the user to specify in some way which of the available policies they want applied, and then apply only that one.

I'd want to at least see a way to apply an ordering to the policies being applied, or have PG work out which one is "cheapest" and try that one first.

> - Decide that such scenarios constitute misconfiguration. Throw an error and make the table owner or other relevant local authority fix it.

Having them all be OR'd together feels simpler and easier to work with than trying to provide the user with all the knobs necessary to select which subset of users they want the policy applied to when (user X from IP range a.b.c.d/24 at time 1500). We could probably make it work with exclusion constraints, range types, etc, and perhaps it'd be a reason to bring btree_gist into core (which I'm all for) and make it work with catalog tables, but... just 'yuck' all around, for my part.

> > Sounds all rather painful and much better done programmatically by the user in a language which is suited to that task- eg: pl/pgsql, perl, C, or something besides our ALTER syntax + catalog representation.
>
> I think exactly the opposite, for the query planning reasons previously stated. I think the policies will quickly get so complicated that they're no longer optimizable. Here's a simple example:
>
> - Policy 1 allows the user to access rows for which complexfunc() returns true.
> - Policy 2 allows the user to access rows for which a = 1.
>
> Most users have access only through policy 2, but some have access through policy 1. Users who have access through policy 1 will always get a sequential scan,

This is the thing which I most object to- if the quals being provided at any level are leakproof and would be able to reduce the returned set sufficiently that an index scan is the best bet, we should be doing that. I don't anticipate the RLS quals to be as selective as the quals which the user is adding. I agree that in cases where the user isn't using a leakproof function in their quals and the policy is complex, a sequential scan would have to be done over the table, but looking at the set of leakproof vs not leakproof functions used by operators which return boolean, certainly the most common of the index using cases are covered and we may be able to add more leakproof functions, should we get user complaints that the function they're using works fine with an index but isn't leakproof.

> but users who have access through policy 2 have an excellent chance of getting an index scan if the selectivity of a = 1 is high. When you merge those two things into a single policy, no matter how you do it, everyone gets sequential scans all the time. That sucks.

It just strikes me as unlikely that in such a simple policy the selectivity of the RLS qual used will be high and this feels like a lot of mechanism and complication to be adding for that use-case. If the selectivity is actually high in terms of what the RLS qual will allow, then it seems likely, to me at least, that it's going to need to depend on another table or function, eg:

    exists(select 1 from security_table where (current_user(),a) = (sec_user,sec_label))

Still thinking about this approach in general. Having a good answer to the question about granularity and how this multiple RLS-policy would actually work would certainly help. Being able to pick a single policy (rather than deal with overlapping policies that all have to be tested) would definitely make this simpler, but I suppose we could build up "X OR Y OR Z..." inside the query..

Thanks,

Stephen
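Two ideas from this message combine naturally: evaluating OR'd policies cheapest-first, and stopping as soon as one returns true. The sketch below models that per-row check in Python. The cost numbers, policy predicates, and the `row_allowed` helper are all hypothetical, and the model says nothing about how a planner would actually estimate cost or build a plan.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (name, estimated cost, predicate); the cost values are made-up estimates.
Policy = Tuple[str, float, Callable[[Dict], bool]]


def row_allowed(row: Dict, policies: List[Policy],
                trace: Optional[List[str]] = None) -> bool:
    """OR the policies together, trying the cheapest first.

    Evaluation stops at the first policy that returns True. If `trace` is
    given, the name of every policy actually evaluated is appended to it.
    """
    for name, _cost, predicate in sorted(policies, key=lambda p: p[1]):
        if trace is not None:
            trace.append(name)
        if predicate(row):
            return True
    return False


policies: List[Policy] = [
    ("policy1", 100.0, lambda row: row["label"] == "public"),  # expensive stand-in
    ("policy2", 1.0, lambda row: row["a"] == 1),               # cheap comparison
]

cheap_hit_trace: List[str] = []
cheap_hit = row_allowed({"a": 1, "label": "secret"}, policies, cheap_hit_trace)

fallback_trace: List[str] = []
fallback_hit = row_allowed({"a": 2, "label": "public"}, policies, fallback_trace)
```

When the cheap policy accepts a row, the expensive one is never called for that row. Rows rejected by every policy still pay for all of them, so the ordering only helps to the extent that cheap policies accept a large share of the rows.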
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost <sfrost@snowman.net> wrote: > > * Robert Haas (robertmhaas@gmail.com) wrote: > >> Technically, there are 4 bits left, and that's what we need for > >> separate privileges. > > > > I'd really hate to chew them all up.. > > Usually it's the patch author who WANTS to chew up all the available > bit space and OTHER people who say no. :-) Ah, well, technically I'm not the patch author here, though I would like to see it happen. :) Still, have to balance these features and capabilities against the future unknown options we might want to add and it certainly doesn't seem terribly nice to chew up all that remain rather than addressing the need to support more. Still, perhaps we can put together a patch for this and then review the implementation and, if we like it and that functionality, we can make the decision about if it should be on this patch to make more bits available. > > Perhaps, or we might come up with some new whiz-bang permission to add > > next year. :/ > > Well, people proposed separate permissions for things like VACUUM and > ANALYZE around the time TRUNCATE was added, and those were rejected on > the grounds that they didn't add enough value to justify wasting bits > on them. I think we see whether there's a workable system such > that marginal permissions (like TRUNCATE) that won't be checked often > don't have to consume bits. That's an interesting approach but I'm not sure that we need to go to a system where we segregate "often-used" bits from "less-used" ones. > > My thoughts on how to address this were to segregate the ACL bits by > > object type. That is to say, the AclMode stored for databases might > > only use bits 0-2 (create/connect/temporary), while tables would use > > bits 0-7 (insert/select/update/delete/references/trigger). This would > > allow us to more easily add more rights at the database and/or > > tablespace level too. 
> > Yeah, that's another idea. But it really deserves its own thread. > I'm still not convinced we have to do this at all to meet this need, > but that should be argued back and forth on that other thread. Fair enough. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes: > * Robert Haas (robertmhaas@gmail.com) wrote: >> [ counting bits in ACLs ] Wouldn't it be fairly painless to widen AclMode from 32 bits to 64, and thereby double the number of available bits? That code was all written before we required platforms to have an int64 primitive type, but of course now we expect that. In any case, I concur with the position that this feature patch should be separate from a patch to make additional bitspace available. regards, tom lane
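To make the bit-counting in Tom's suggestion concrete, here is a minimal sketch in C. It assumes the layout in which the low half of the ACL word holds privilege bits and the high half holds the matching grant-option bits, with 12 privileges currently defined (which is consistent with the "4 bits left" figure quoted above). The type and function names (AclMode32, AclMode64, grant_option_for32, and so on) are hypothetical and chosen for illustration; this is not the actual backend code.

```c
#include <stdint.h>

/* Hypothetical names; a sketch of the layout, not the backend's definitions. */
typedef uint32_t AclMode32;   /* 16 privilege bits + 16 grant-option bits */
typedef uint64_t AclMode64;   /* 32 privilege bits + 32 grant-option bits */

#define PRIV_BITS_32 16
#define PRIV_BITS_64 32

/* Map a set of privilege bits to the corresponding grant-option bits. */
static inline AclMode32 grant_option_for32(AclMode32 privs)
{
    return (privs & 0xFFFFu) << PRIV_BITS_32;
}

static inline AclMode64 grant_option_for64(AclMode64 privs)
{
    return (privs & UINT64_C(0xFFFFFFFF)) << PRIV_BITS_64;
}

/* Number of privilege bits still unassigned for a given half-width. */
static inline int free_priv_bits(int half_width, int rights_in_use)
{
    return half_width - rights_in_use;
}
```

Under these assumptions, widening the word doubles the privilege half from 16 to 32 bits, so the number of unassigned privilege bits would go from 4 to 20 without changing how existing bits are interpreted.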
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Robert Haas (robertmhaas@gmail.com) wrote: > >> [ counting bits in ACLs ] > > Wouldn't it be fairly painless to widen AclMode from 32 bits to 64, > and thereby double the number of available bits? Thanks for commenting on this. I hadn't considered that but I don't see any particular problem with it either.. > In any case, I concur with the position that this feature patch should > be separate from a patch to make additional bitspace available. Certainly. Thanks for your thoughts. Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Dean Rasheed
Date:
On 17 June 2014 20:19, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> Yeah, I was thinking something like this could work, but I would go >> further. Suppose you had separate GRANTable privileges for direct >> access to individual tables, bypassing RLS, e.g. >> >> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > So, is this one new privilege (DIRECT) or four separate new privileges > that are variants of the existing privileges (DIRECT SELECT, DIRECT > INSERT, DIRECT UPDATE, DIRECT DELETE)? > I was thinking it would be 4 new privileges, so that a user could for example be granted DIRECT SELECT permission on a table, but not DIRECT UPDATE. On reflection though, I think I prefer the approach of allowing multiple named security policies per table, because it gives the planner more opportunity to optimize queries against specific RLS quals, which won't work if the ACL logic is embedded in functions. That seems like something that would have to be designed in now, because it's difficult to see how you could add it later. Managing policy names becomes an issue though, because if you have 2 tables each with 1 policy, but you give them different names, how can the user querying the data specify that they want policy1 for table1 and policy2 for table2, possibly in the same query? I think that can be made more manageable by making policies top-level objects that exist independently of any particular tables. So you might do something like: \c - alice CREATE POLICY policy1; CREATE POLICY policy2; ALTER TABLE t1 SET POLICY policy1 TO t1_quals; ALTER TABLE t2 SET POLICY policy1 TO t2_quals; ... 
GRANT SELECT ON TABLE t1, t2 TO bob USING policy1; GRANT SELECT ON TABLE t1, t2 TO manager; -- Can use any policy, or bypass all policies Then a particular user would typically only have to set their policy once per session, for accessing multiple tables: \c - bob SET rls_policy = policy1; SELECT * FROM t1 JOIN t2; -- OK SET rls_policy = policy2; SELECT * FROM t1; -- ERROR: no permission to access t1 using policy2 or you'd be able to set a default policy for users, so that they wouldn't need to explicitly choose one: ALTER ROLE bob SET rls_policy = policy1; Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON TABLE tab TO role --- doesn't work because it conflicts with the syntax for granting column privileges, so there needs to be a distinct syntax for this, and I think it ought to ultimately allow things like GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; Regards, Dean
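The permission check implied by Dean's GRANT ... USING proposal can be sketched in C as follows. This is only an illustration of the intended semantics, assuming that a grant without a USING clause allows any policy (or none), and that a grant with USING allows access only while that policy is the active one. The Grant struct and may_select function are hypothetical names, not part of any patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of "GRANT SELECT ON TABLE t TO role [USING policy]". */
typedef struct {
    const char *role;
    const char *table;
    const char *policy;   /* NULL: no USING clause, any policy or bypass */
} Grant;

/*
 * Return true if `role` may SELECT from `table` while `active_policy`
 * (the session's rls_policy setting, or NULL if unset) is in effect.
 */
static bool may_select(const Grant *grants, size_t n, const char *role,
                       const char *table, const char *active_policy)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(grants[i].role, role) != 0 ||
            strcmp(grants[i].table, table) != 0)
            continue;
        if (grants[i].policy == NULL)
            return true;              /* unrestricted grant */
        if (active_policy != NULL &&
            strcmp(grants[i].policy, active_policy) == 0)
            return true;              /* grant matches the active policy */
    }
    return false;
}
```

With the grants from the example, bob can read t1 under policy1 but is refused under policy2, while manager can read either table regardless of the active policy.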
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Dean, * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 17 June 2014 20:19, Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > >> Yeah, I was thinking something like this could work, but I would go > >> further. Suppose you had separate GRANTable privileges for direct > >> access to individual tables, bypassing RLS, e.g. > >> > >> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > > > So, is this one new privilege (DIRECT) or four separate new privileges > > that are variants of the existing privileges (DIRECT SELECT, DIRECT > > INSERT, DIRECT UPDATE, DIRECT DELETE)? > > I was thinking it would be 4 new privileges, so that a user could for > example be granted DIRECT SELECT permission on a table, but not DIRECT > UPDATE. Ok. > On reflection though, I think I prefer the approach of allowing > multiple named security policies per table, because it gives the > planner more opportunity to optimize queries against specific RLS > quals, which won't work if the ACL logic is embedded in functions. Having more than one policy for the purpose of performance really doesn't make a huge amount of sense to me. Perhaps someone could explain the use-case with specific example applications where they would benefit from this? Based on the discussion, they would have to be OR'd together in the query as built with any result being marked as success. One could build an SQL function which could be in-lined potentially which does the same if their case is that simple. Being able to define the policy based on some criteria may allow it to be simpler (eg: policy 'a' applies for certain roles, while policy 'b' applies for other roles), but I'm not enthusiastic about that approach because there could be a huge number of permutations to allow. How about another approach- what about having a function which is called (as the table owner, I'm thinking..) 
that then returns the qual to be included, instead of having to define a specific qual which is included in the catalog? That function could take into consideration the user, table, etc, and return a qual which includes constants to compare rows against for planning purposes. This would have to be done early enough, of course, which might be difficult. For my part, having that capability would be neat, but nothing we're trying to do here would preclude us from adding it later either. > That seems like something that would have to be designed in now, > because it's difficult to see how you could add it later. I don't follow this at all. Going from supporting one qual to supporting multiple seems like it'd be quite straight-forward to add in later? Going the other way would be difficult. > Managing policy names becomes an issue though, because if you have 2 > tables each with 1 policy, but you give them different names, how can > the user querying the data specify that they want policy1 for table1 > and policy2 for table2, possibly in the same query? From my experience, users don't pick the policy any more than they get to pick which set of permissions get applied to them when querying tables (modulo roles, of course, but that's a mechanism for changing users, not for saying which set of permissions you get). All that you describe could be done for regular permissions also, but we don't, and I don't think we get complaints about that because we have roles- which would work just the same for RLS (assuming the RLS policy defined has a role component). Having a function be able to be called to return a qual to be used would be a way to have per-role RLS also, along with providing the flexibility to have per-source-IP, per-connection-type, etc, RLS policies also. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost <sfrost@snowman.net> wrote: > I'm also of the opinion that this isn't strictly necessary for the > initial RLS offering in PG- there's a clear way we could migrate > existing users to a multi-policy system from a single-policy system. > Sure, to get the performance and optimization benefits that we'd > presumably have in the multi-policy case they'd need to re-work their > RLS configuration, but for users who care, they'll likely be very happy > to do so to gain those benefits. I think a lot depends on the syntax we choose. If we choose a syntax that only makes sense in a single-policy framework, then I think allowing upgrades to a multi-policy syntax is going to be really difficult. On the other hand, if we choose a syntax that allows multiple policies, I suspect we can support multiple policies from the beginning without much extra effort. >> - Require the user to specify in some way which of the available >> policies they want applied, and then apply only that one. > > I'd want to at least see a way to apply an ordering to the policies > being applied, or have PG work out which one is "cheapest" and try that > one first. Cost-based comparison of policies that return different results doesn't seem sensible to me. >> I think exactly the opposite, for the query planning reasons >> previously stated. I think the policies will quickly get so >> complicated that they're no longer optimizable. Here's a simple >> example: >> >> - Policy 1 allows the user to access rows for which complexfunc() returns true. >> - Policy 2 allows the user to access rows for which a = 1. >> >> Most users have access only through policy 2, but some have access >> through policy 1. 
Users who have access through policy 1 will always >> get a sequential scan, > > This is the thing which I most object to- if the quals being provided at > any level are leakproof and would be able to reduce the returned set > sufficiently that an index scan is the best bet, we should be doing > that. I don't anticipate the RLS quals to be as selective as the > quals which the user is adding. I think it would be a VERY bad idea to design the system around the assumption that the RLS quals will be much more or less selective than the user-supplied quals. That's going to be different in different environments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost <sfrost@snowman.net> wrote: > > I'm also of the opinion that this isn't strictly necessary for the > > initial RLS offering in PG- there's a clear way we could migrate > > existing users to a multi-policy system from a single-policy system. > > Sure, to get the performance and optimization benefits that we'd > > presumably have in the multi-policy case they'd need to re-work their > > RLS configuration, but for users who care, they'll likely be very happy > > to do so to gain those benefits. > > I think a lot depends on the syntax we choose. If we choose a syntax > that only makes sense in a single-policy framework, then I think > allowing upgrades to a multi-policy syntax is going to be really > difficult. On the other hand, if we choose a syntax that allows > multiple policies, I suspect we can support multiple policies from the > beginning without much extra effort. What are these policies going to depend on? Will they be allowed to overlap? I don't see multi-policy support as being very easily added. If there are specific ways to design the syntax which would make it easier to support multiple policies in the future, I'm all for it. Have any specific thoughts regarding that? > >> - Require the user to specify in some way which of the available > >> policies they want applied, and then apply only that one. > > > > I'd want to at least see a way to apply an ordering to the policies > > being applied, or have PG work out which one is "cheapest" and try that > > one first. > > Cost-based comparison of policies that return different results > doesn't seem sensible to me. I keep coming back to the thought that, really, having multiple overlapping policies just adds unnecessary complication to the system for not much gain in real functionality. 
Being able to specify a policy per-role might be useful, but that's only one dimension and I can imagine a lot of other dimensions that one might want to use to control which policy is used. > >> I think exactly the opposite, for the query planning reasons > >> previously stated. I think the policies will quickly get so > >> complicated that they're no longer optimizable. Here's a simple > >> example: > >> > >> - Policy 1 allows the user to access rows for which complexfunc() returns true. > >> - Policy 2 allows the user to access rows for which a = 1. > >> > >> Most users have access only through policy 2, but some have access > >> through policy 1. Users who have access through policy 1 will always > >> get a sequential scan, > > > > This is the thing which I most object to- if the quals being provided at > > any level are leakproof and would be able to reduce the returned set > > sufficiently that an index scan is the best bet, we should be doing > > that. I don't anticipate the RLS quals to be as selective as the > > quals which the user is adding. > > I think it would be a VERY bad idea to design the system around the > assumption that the RLS quals will be much more or less selective than > the user-supplied quals. That's going to be different in different > environments. Fine- but do you really see the query planner having a problem pushing down whichever is the more selective qual, if the user-provided qual is marked as leakproof? I realize that you want multiple policies because you'd like a way for the RLS qual to be made simpler for certain cases while also having more complex quals for other cases. What I keep waiting to hear is exactly how you want to specify which policy is used because that's where it gets ugly and complicated. I still really don't like the idea of trying to apply multiple policies inside of a single query execution. Thanks, Stephen
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Mon, Jun 23, 2014 at 2:29 PM, Stephen Frost <sfrost@snowman.net> wrote: > What are these policies going to depend on? Will they be allowed to > overlap? I don't see multi-policy support as being very easily added. We discussed the point about overlap upthread, and I gave specific examples. If there's something else you want me to provide here, please be more clear about it. > If there are specific ways to design the syntax which would make it > easier to support multiple policies in the future, I'm all for it. Have > any specific thoughts regarding that? I did propose something already upthread, and then Dean said this: # Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON # TABLE tab TO role --- doesn't work because it conflicts with the # syntax for granting column privileges, so there needs to be a distinct # syntax for this, and I think it ought to ultimately allow things like # # GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; He's got a good point there. I don't know whether the policy should be given inline (e.g. GRANT ... WHERE stuff()) or out-of-line (GRANT ... USING policy1) but it seems like specifying it as some sort of GRANT modifier might make sense. I'm sure there are other ways also, of course. >> >> - Require the user to specify in some way which of the available >> >> policies they want applied, and then apply only that one. >> > >> > I'd want to at least see a way to apply an ordering to the policies >> > being applied, or have PG work out which one is "cheapest" and try that >> > one first. >> >> Cost-based comparison of policies that return different results >> doesn't seem sensible to me. > > I keep coming back to the thought that, really, having multiple > overlapping policies just adds unnecessary complication to the system > for not much gain in real functionality. 
Being able to specify a policy > per-role might be useful, but that's only one dimension and I can > imagine a lot of other dimensions that one might want to use to control > which policy is used. Well, I don't agree, and I've given examples upthread showing the kinds of scenarios that I'm concerned about, which are drawn from real experiences I've had. It may be that I'm the only one who has had such experiences, of course; or that there aren't enough people who have to justify catering to such use cases. But I'm not sure there's much point in trying to have a conversation about how such a thing could be made to work if you're just going to revert back to "well, we don't really need this anyway" each time I make or refute a technical point. >> I think it would be a VERY bad idea to design the system around the >> assumption that the RLS quals will be much more or less selective than >> the user-supplied quals. That's going to be different in different >> environments. > > Fine- but do you really see the query planner having a problem pushing > down whichever is the more selective qual, if the user-provided qual is > marked as leakproof? I'm not quite sure I understand the scenario you're describing here. Can you provide a tangible example? I expect that most of the things the RLS-limited user might write in the WHERE clause will NOT get pushed down because most functions are not leakproof. However, the issue I'm actually concerned about is whether the *security* qual is simple enough to permit an index-scan. Anything with an OR clause in it probably won't be, and any function call definitely won't be. > I realize that you want multiple policies because you'd like a way for > the RLS qual to be made simpler for certain cases while also having more > complex quals for other cases. What I keep waiting to hear is exactly > how you want to specify which policy is used because that's where it > gets ugly and complicated. 
I still really don't like the idea of trying > to apply multiple policies inside of a single query execution. See above comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Alvaro Herrera
Date:
Robert Haas wrote: > > Right, if we were to support multiple policies on a given table then we > > would have to support adding and removing them individually, as well as > > specify when they are to be applied- and what if that "when" overlaps? > > Do we apply both and only a row which passed them all gets sent to the > > user? Essentially we'd be defining the RLS policies to be AND'd > > together, right? Would we want to support both AND-based and OR-based, > > and allow users to pick what set of conditionals they want applied to > > their various overlapping RLS policies? > > AND is not a sensible policy; it would need to be OR. If you grant > someone access to two different subsets of the rows in a table, it > stands to reason that they will expect to have access to all of the > rows that are in at least one of those subsets. I haven't been following this thread, but this bit caught my attention. I'm not sure I agree that OR is always the right policy either. There is a case for a policy that says "forbid these rows to these guys, even if they have read permissions from elsewhere". If OR is the only way to mix multiple policies there might not be a way to implement this. So ISTM each policy must be able to indicate what to do -- sort of how PAM config files allow you to specify "required", "optional" and so forth for each module. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
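One possible reading of the PAM-style idea above, sketched in C: each policy is tagged as either "required" or "optional", and a row is visible only if every required policy passes and, when optional policies exist, at least one of them passes. The exact combination rule, the PolicyKind enum, and the row_visible function are assumptions made for illustration; the thread does not settle on these semantics.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-policy flag, loosely modeled on PAM control keywords. */
typedef enum { POLICY_OPTIONAL, POLICY_REQUIRED } PolicyKind;

typedef struct {
    PolicyKind kind;
    bool passes;   /* result of evaluating this policy's qual for one row */
} PolicyResult;

/*
 * Assumed rule: all REQUIRED policies are AND'd together; OPTIONAL
 * policies are OR'd together. With no optional policies, only the
 * required ones decide.
 */
static bool row_visible(const PolicyResult *results, size_t n)
{
    bool saw_optional = false;
    bool optional_passed = false;

    for (size_t i = 0; i < n; i++) {
        if (results[i].kind == POLICY_REQUIRED) {
            if (!results[i].passes)
                return false;         /* a required policy vetoes the row */
        } else {
            saw_optional = true;
            if (results[i].passes)
                optional_passed = true;
        }
    }
    return saw_optional ? optional_passed : true;
}
```

Under this rule, a failing required policy hides a row even when an optional policy would have allowed it, which is the "forbid these rows to these guys, even if they have read permissions from elsewhere" case that pure OR cannot express.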
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 24, 2014 at 10:30 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Robert Haas wrote: >> > Right, if we were to support multiple policies on a given table then we >> > would have to support adding and removing them individually, as well as >> > specify when they are to be applied- and what if that "when" overlaps? >> > Do we apply both and only a row which passed them all gets sent to the >> > user? Essentially we'd be defining the RLS policies to be AND'd >> > together, right? Would we want to support both AND-based and OR-based, >> > and allow users to pick what set of conditionals they want applied to >> > their various overlapping RLS policies? >> >> AND is not a sensible policy; it would need to be OR. If you grant >> someone access to two different subsets of the rows in a table, it >> stands to reason that they will expect to have access to all of the >> rows that are in at least one of those subsets. > > I haven't been following this thread, but this bit caught my attention. > I'm not sure I agree that OR is always the right policy either. > There is a case for a policy that says "forbid these rows to these guys, > even if they have read permissions from elsewhere". If OR is the only > way to mix multiple policies there might not be a way to implement this. > So ISTM each policy must be able to indicate what to do -- sort of how > PAM config files allow you to specify "required", "optional" and so > forth for each module. Hmm. Well, that could be useful, but I'm not sure I'd view it as something we absolutely have to have... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Robert,

I feel like we are getting to the point of simply talking past each other and so I'll try anew, and I'll include my understanding of how the different approaches would address the specific use-case you outlined up-thread.

Single policy
-------------

The current implementation approach only allows a single policy to be included. The concern raised with this approach is that it won't be very performant due to the qual complexity, which you outlined (reformatted a bit) as:

    WHERE sales_rep_id = (SELECT oid FROM pg_roles
                          WHERE rolname = current_user
                            AND oid IN (SELECT id FROM person WHERE is_sales_rep))
       OR partner_id = (SELECT p.org_id FROM pg_roles a, person p
                        WHERE a.rolname = current_user AND a.oid = p.id)

Which I take to mean there is a 'person' table which looks like:

    id, is_sales_rep, org_id

and a table which has the RLS qual which looks like:

    pk_id, sales_rep_id, partner_id

Then, if the individual is_sales_rep and it's their account by sales_rep_id, or if the individual's org_id matches the partner_id, they can see the record.

Using this example with security barrier views and indexes on person.id, data.pk_id, data.sales_rep_id, and data.partner_id, we'll get a bitmap heap scan across the 'data' table by having the two OR's run as InitPlan 1 and InitPlan 2.

Does that address the concern you had around multi-branch OR policies? This works with more than two OR branches also, though of course we need appropriate indexes to make use of a Bitmap Heap Scan.

Even with per-user policies, we would define a policy along these lines, for the "sfrost" role:

    WHERE sales_rep_id = 16384 OR partner_id = 1

Which also ends up doing a Bitmap Heap Scan across the data table.

For the case where a sales rep isn't also a partner, you could simplify this to:

    WHERE sales_rep_id = 16384

but I'm not sure that really buys you much? With the bitmap heap scan, if one side of the OR ends up not returning anything then it doesn't contribute to the blocks which have to be scanned. The index might still need to be scanned, although I think you could avoid even that with an EXISTS check to see if the user is a partner at all. That's not to say that a bitmap scan is equivalent to an index scan, but it's certainly likely to be far better than a sequential scan.

Now, if the query is "select * from data_view where pk_id = 1002;", then we get an indexed lookup on the data table based on the PK. That's what I was trying to point out previously regarding leakproof functions (which comprise about half of the boolean functions we provide, if I recall my previous analysis correctly). We also get indexed lookups with "pk_id < 10" or similar as those are also leakproof.

Multiple, Overlapping policies
------------------------------

Per discussion, these would generally be OR'd together.

Building up the overall qual which has to include an OR branch for each individual policy qual(s) looks like a complicated bit of work and one which might be better left to the user (and, as just pointed out, the user may actually want AND instead of OR in some cases..).

Managing the plan cache in a sensible way is certainly made more complicated by this and might mean that it can't be used at all, which has already been raised as a show-stopper issue.

In the example which you provided, we could represent in the catalog that the two policies exist (sales representatives vs partners) and that they are to be OR'd together, but I don't immediately see how that would change the qual which ends up being added to the query in this case or really improve the overall query plan; at least, not without eliminating one of the OR branches somehow- which I discuss below.

Multiple, Non-overlapping policies
----------------------------------

Preventing the overlap of policies ends up being very complicated if many dimensions are allowed. For the simple case, perhaps only the 'current role' dimension is useful. I expect that going down that route would very quickly lead to requests for other dimensions (client IP, etc) which is why I'm not a big fan of it, but if that's the consensus then let's work out the syntax and update the patch and move on.

Another option might be to have a qual for each policy which the user can define that indicates if that policy is to be applied or not and then simply pick the first policy for which that qual returns 'true'. We would require an ordering to be defined in this case, which I believe was an issue up-thread. If we allow all policies matching the quals then we run into the complications mentioned under "Overlapping policies" above.

If we decide that per-role policies need to be supported, I very quickly see the need to have "groups" of roles to which a policy is to be applied. This would differ from roles today as they would not be allowed to overlap (otherwise we are into overlapping policies again, or having to figure out which of the overlapping policies should be applied for each query; another option would be to error at run-time, but that seems pretty ugly).

In this case we would still need to support "all" as an option, which is what I would expect to have implemented for 9.5, or at least in the early part of 9.5 (I really don't want to wait until the last CF or even the CF before that to get anything in as I suspect it will have grown by that point to be large enough to be an issue..), adding the per-role(s) option could be for 9.6/10.0.

In your example, if sales representatives have distinct roles from partners, then the specific policy could be chosen and used based on which role is running the query, which might lead to improved performance, perhaps only marginally, in those specific cases, as discussed above.

General multi-policy concerns
-----------------------------

Choosing which policy or policies to apply for a given query gets very complicated very quickly if we're to do so in an automated way. Dean suggests that the user would pick which policy to use, to which I argued that roles could be used to manage that instead (a user could 'set role' to a role which has the access requested). That mechanism would also work in the existing single-policy approach by having a policy which depends on the calling role (eg: by looking up that role in a table which defines what access that role should have). It would also work in the above proposal for multiple non-overlapping policies where the policy to use is based on the current role.

Overall, while I'm interested in defining where this is going in a way which allows us to implement an initial RLS capability while avoiding future upgrade issues, I am perfectly happy to say that the 9.5 RLS implementation may not be exactly syntax-compatible with 9.6 or 10.0. What I wish to avoid is a case where what's in 9.5 includes RLS definitions which can't be implemented in 9.6/10.0 and so would cause upgrade problems. As long as what's in 9.5 can be represented and supported in 9.6/10.0, we can implement the necessary logic to migrate from one to the other in pg_dump. We do not guarantee syntax compatibility between major versions and we often warn users of newer features that there may be some changes in subsequent releases which they'll need to address when they upgrade (and, of course, these are noted in the release notes).

Hopefully this will help us move the discussion forward to a point where we have a long-term design as well as a short-term goal which is actionable for 9.5. The current work is around adding the GUCs discussed previously to the RLS patch and modifying pg_dump to use them, to address the concerns raised previously about pg_dump running user code and possibly not having a complete copy of the data.

Thanks,

Stephen
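The "pick the first policy whose applicability qual returns true" option described under non-overlapping policies could look roughly like the following C sketch. It assumes the only dimension is the current role and that policies are kept in a user-defined order with a catch-all "all" policy last. The Session and Policy types, the predicate functions, and the policy names are hypothetical and exist only to illustrate the selection rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical session context; only the current role is modeled here. */
typedef struct {
    const char *role;
} Session;

typedef bool (*AppliesFn)(const Session *);

typedef struct {
    const char *name;
    AppliesFn applies;   /* the per-policy "is this policy in effect?" qual */
} Policy;

static bool is_sales_rep(const Session *s) { return strcmp(s->role, "sales_rep") == 0; }
static bool is_partner(const Session *s)   { return strcmp(s->role, "partner") == 0; }
static bool applies_to_all(const Session *s) { (void) s; return true; }

/*
 * Walk the policies in their defined order and return the first one whose
 * applicability qual is true, so exactly one policy is used per query.
 * Returns NULL if nothing applies.
 */
static const Policy *choose_policy(const Policy *policies, size_t n,
                                   const Session *session)
{
    for (size_t i = 0; i < n; i++)
        if (policies[i].applies(session))
            return &policies[i];
    return NULL;
}
```

Because selection stops at the first match, overlap between applicability quals is resolved by the ordering rather than by OR'ing policy quals together, which is why an explicit ordering would have to be defined.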
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Dean Rasheed
Date:
On 24 June 2014 17:27, Stephen Frost <sfrost@snowman.net> wrote: > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies > What I was describing upthread was multiple non-overlapping policies. I disagree that this will be more complicated to use. It's a strict superset of the single policy functionality, so if you want to do it all using a single policy then you can. But I think that once the ACLs reach a certain level of complexity, you probably will want to break it up into multiple policies, and I think doing so will make things simpler, not more complicated. Taking a specific, simplistic example, suppose you had 2 groups of users - some are normal users who should only be able to access their own records. For these users, you might have a policy like WHERE person_id = current_user which would be highly selective, and probably use an index scan. Then there might be another group of users who are managers with access to the records of, say, everyone in their department. This might then be a more complex qual along the lines of WHERE person_id IN (SELECT ... FROM person_department WHERE mgr_id = current_user AND ...) which might end up being a hash or merge join, depending on any user-supplied quals. You _could_ combine those into a single policy, but I think it would be much better to have 2 distinct policies, since they're 2 very different queries, for different use cases. Normal users would only be granted permission to use the normal_user_policy. Managers might be granted permission to use either the normal_user_policy or the manager_policy (but not both at the same time). That's a very simplified example. In more realistic situations there are likely to be many more classes of users, and trying to enforce all the logic in a single WHERE clause is likely to get unmanageable, or inefficient if it involves lots of logic hidden away in functions. 
Allowing multiple, non-overlapping policies allows the problem to be broken up into more manageable pieces, which also makes the planner's job easier, since only a single, simpler policy is in effect in any given query. Regards, Dean
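A minimal sketch of the non-overlapping model described above, written in Python only to make the selection rule concrete. The role names (alice, bob), the dictionaries and the function name are hypothetical; the quals are copied from the example with their elisions left in place. Each role is granted exactly one policy on the table, so a query only ever carries one qual.

```python
# Hypothetical model of multiple, non-overlapping policies on one table.
# The quals are opaque strings taken from the example; "..." is left elided.
POLICY_QUALS = {
    "normal_user_policy": "person_id = current_user",
    "manager_policy": (
        "person_id IN (SELECT ... FROM person_department "
        "WHERE mgr_id = current_user AND ...)"
    ),
}

# role -> the single policy granted to that role (assumed example data)
GRANTS = {
    "alice": "normal_user_policy",
    "bob": "manager_policy",
}


def qual_for(role, grants=GRANTS, quals=POLICY_QUALS):
    """Return the one qual that would be added to queries run by `role`."""
    try:
        policy = grants[role]
    except KeyError:
        raise PermissionError(f"no policy granted to role {role!r}") from None
    return quals[policy]
```

Because the mapping from role to policy is a plain lookup, there is nothing to combine or order; that is the property the non-overlapping proposal relies on.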
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Dean Rasheed
Date:
Thinking about the examples upthread, a separate issue occurs to me --- when defining a RLS qual, I think that there has to be a syntax to specify an alias for the main table, so that correlated subqueries can refer to it. I'm not sure if that's been mentioned in any of the discussions so far, but it might be quite hard to define certain quals without it. Regards, Dean
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Dean,

* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
> Thinking about the examples upthread, a separate issue occurs to me
> --- when defining a RLS qual, I think that there has to be a syntax to
> specify an alias for the main table, so that correlated subqueries can
> refer to it. I'm not sure if that's been mentioned in any of the
> discussions so far, but it might be quite hard to define certain quals
> without it.

Yeah, that thought had occurred to me also. Have any suggestions about how to approach that issue? The way triggers have OLD/NEW comes to mind but I'm not sure how easily that'd work.

Thanks,

Stephen
Dean, all, Changing the subject of this thread (though keeping it threaded) as we've really moved on to a much broader discussion. * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 24 June 2014 17:27, Stephen Frost <sfrost@snowman.net> wrote: > > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies > > What I was describing upthread was multiple non-overlapping policies. Ok. > I disagree that this will be more complicated to use. It's a strict > superset of the single policy functionality, so if you want to do it > all using a single policy then you can. But I think that once the ACLs > reach a certain level of complexity, you probably will want to break > it up into multiple policies, and I think doing so will make things > simpler, not more complicated. If we keep it explicitly to per-role only, with only one policy ever being applied, then perhaps it would be, but I'm not convinced.. > Taking a specific, simplistic example, suppose you had 2 groups of > users - some are normal users who should only be able to access their > own records. For these users, you might have a policy like > > WHERE person_id = current_user > > which would be highly selective, and probably use an index scan. Then > there might be another group of users who are managers with access to > the records of, say, everyone in their department. This might then be > a more complex qual along the lines of > > WHERE person_id IN (SELECT ... FROM person_department > WHERE mgr_id = current_user AND ...) > > which might end up being a hash or merge join, depending on any > user-supplied quals. Certainly my experience with such a setup is that it includes at least 4 levels (self, manager, director, officer). Now, officer you could perhaps exclude as being simply RLS-exempt but with such a structure I would think we'd just make that a special kind of policy (and not chew up those last 4 bits). 
As for this example, it's quite naturally done with a recursive query as it's a tree structure, but if you want to keep the qual simple and fast, you'd materialize the results of such a query and simply have: WHERE EXISTS (SELECT 1 from org_chart WHERE current_user = emp_id AND person_id = org_chart.id) > You _could_ combine those into a single policy, but I think it would > be much better to have 2 distinct policies, since they're 2 very > different queries, for different use cases. Normal users would only be > granted permission to use the normal_user_policy. Managers might be > granted permission to use either the normal_user_policy or the > manager_policy (but not both at the same time). I can't recall a system where managers have to request access to their manager role. Having another way of changing the permissions which are applied to a session (the existing one being 'set role') doesn't strike me as a great idea either. > That's a very simplified example. In more realistic situations there > are likely to be many more classes of users, and trying to enforce all > the logic in a single WHERE clause is likely to get unmanageable, or > inefficient if it involves lots of logic hidden away in functions. Functions and external security systems are exactly the real-world use-case which users I've talked to are looking for. All of this discussion is completely orthogonal to their requirements. I understand that there are simpler use-cases than those and we may be able to provide an approach which performs better for those. > Allowing multiple, non-overlapping policies allows the problem to be > broken up into more manageable pieces, which also makes the planner's > job easier, since only a single, simpler policy is in effect in any > given query. Let's try to outline what this would look like then. 
Taking your approach, we'd have:

    CREATE POLICY p1;
    CREATE POLICY p2;

    ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
    ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;

    GRANT SELECT ON TABLE t1 TO role1 USING p1;
    GRANT SELECT ON TABLE t1 TO role2 USING p2;

I'm guessing we would need to further support:

    GRANT INSERT ON TABLE t1 TO role1 USING p2;

as we've already discussed being able to support per-action (SELECT, INSERT, UPDATE, DELETE) policies. I'm not quite sure how to address that though.

Further, as you mention, users would be able to do:

    SET rls_policy = whatever;

and things would appear fine, until they tried to access a table for which they didn't have that policy, at which point they'd get an error.

You mention:

    GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;

but, to be clear, there would be no option for policies to be column-specific, right? The policy would apply to the whole row and just the SELECT/UPDATE privileges would be on the specific columns (as exists today).

From this what I'm gathering is that we'd need catalog tables along these lines:

    rls_policy
      oid, polname name, polowner oid, polnamespace oid, polacl aclitem[]
      (oid, policy name, policy owner, policy namespace, ACL, eg: usage?)

    rls_policy_table
      ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]?
      (policy oid, table/relation oid, quals, ACL)

    pg_class
      relhasrls boolean ?

An extension to the existing ACLs which are for GRANT to include a policy OID, eg:

    typedef struct AclItem
    {
        Oid     ai_grantee;
        Oid     ai_grantor;
        AclMode ai_privs;
        Oid     rls_policy;
    } AclItem;

and further:

    role1=r|p1/postgres
    role2=r|p2/postgres

or even:

    bob=|policy1/postgres

with no table-level privileges and only column-level privileges granted to role3 for this table.

The plan cache would include what policy OID a given plan was run under (with InvalidOid indicating an "everything-allowed" policy).
This doesn't address the concern raised about having different policies depending on the action type (SELECT, INSERT, etc) though, as mentioned above.. For that we may have to add "Oid rls_select_policy", etc, to AclItem, which would be pretty painful. Other thoughts?

This certainly feels like quite a bit to try and bite off for 9.5 and, as mentioned, this would be a strict superset of the current approach, which could be implemented under this structure as:

    CREATE POLICY t1_p1_policy;
    ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
    GRANT (user's rights) ON t1 TO user USING policy1;

The main downside here is that we'd have to create a policy for every table in the system which had RLS applied, to avoid granting more than should be. Perhaps the 9.4 approach could include the 'CREATE POLICY' and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would, for the 9.5 -> 9.6 upgrade, pg_dump:

    GRANT (user's rights) ON t1 TO user USING policy1;

We would still need the GUCs for "rls_enable = on/off" and perhaps the role-level "bypass_rls" attribute, but those wouldn't change with this.

Thoughts?

Thanks!

Stephen
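As a rough illustration of the materialized org_chart qual mentioned earlier in this message, the Python sketch below flattens a reporting tree into (emp_id, id) pairs so that the qual becomes a simple membership test. The function names and the manager_of mapping are hypothetical, and pairing each person with themselves (the "self" level) is an assumption about how such a table would be populated.

```python
def materialize_org_chart(manager_of):
    """Return the set of (emp_id, id) pairs: emp_id may see rows of id.

    `manager_of` maps each person to their direct manager (None at the top).
    Each person is paired with themselves and with everyone above them,
    which is what a recursive query over the tree would produce.
    """
    pairs = set()
    for person in manager_of:
        viewer = person
        seen = set()  # guards against accidental cycles in the input
        while viewer is not None and viewer not in seen:
            seen.add(viewer)
            pairs.add((viewer, person))
            viewer = manager_of.get(viewer)
    return pairs


def visible(org_chart, current_user, person_id):
    """Mirror of the EXISTS qual: is there a matching org_chart row?"""
    return (current_user, person_id) in org_chart
```

With the pairs precomputed, the per-row check no longer has to walk the tree, which is the point of materializing the recursive query.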
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Craig Ringer
Date:
On 06/24/2014 10:30 PM, Alvaro Herrera wrote:
> I haven't been following this thread, but this bit caught my attention.
> I'm not sure I agree that OR is always the right policy either.
> There is a case for a policy that says "forbid these rows to these guys,
> even if they have read permissions from elsewhere".

That's generally considered a "DENY" policy, a concept borrowed from ACLs.

You have access to a resource if:

- You have at least one policy that gives you access AND
- You have no policies that deny you access

> If OR is the only
> way to mix multiple policies there might not be a way to implement this.

I think that's a "later" myself, but we shouldn't design ourselves into a corner where we can't support deny rules either.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
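The allow/deny rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not a proposal for how the rewriter would implement it; policies are modelled as hypothetical predicates over a row, and the function name is made up.

```python
def has_access(row, allow_policies, deny_policies):
    """Grant access if some allow policy matches and no deny policy does."""
    allowed = any(policy(row) for policy in allow_policies)
    denied = any(policy(row) for policy in deny_policies)
    return allowed and not denied
```

Note that with an empty list of allow policies nothing is visible, and a single matching deny policy overrides any number of allow policies, which is what distinguishes this from combining everything with OR.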
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Stephen Frost
Date:
Craig, * Craig Ringer (craig@2ndquadrant.com) wrote: > On 06/24/2014 10:30 PM, Alvaro Herrera wrote: > > I haven't been following this thread, but this bit caught my attention. > > I'm not sure I agree that OR is always the right policy either. > > There is a case for a policy that says "forbid these rows to these guys, > > even if they have read permissions from elsewhere". > > That's generally considered a "DENY" policy, a concept borrowed from ACLs. Right. > > If OR is the only > > way to mix multiple policies there might not be a way to implement this. > > I think that's a "later" myself, but we shouldn't design ourselves into > a corner where we can't support deny rules either. Agreed, but I don't want to get so wrapped up in all of this that we end up with a set of requirements so long that we'll never be able to accomplish them all in a single release... Thanks! Stephen
On 25 June 2014 01:49, Stephen Frost <sfrost@snowman.net> wrote: > Dean, all, > > Changing the subject of this thread (though keeping it threaded) as > we've really moved on to a much broader discussion. > > * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: >> On 24 June 2014 17:27, Stephen Frost <sfrost@snowman.net> wrote: >> > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies >> >> What I was describing upthread was multiple non-overlapping policies. > > Ok. > >> I disagree that this will be more complicated to use. It's a strict >> superset of the single policy functionality, so if you want to do it >> all using a single policy then you can. But I think that once the ACLs >> reach a certain level of complexity, you probably will want to break >> it up into multiple policies, and I think doing so will make things >> simpler, not more complicated. > > If we keep it explicitly to per-role only, with only one policy ever > being applied, then perhaps it would be, but I'm not convinced.. > >> Taking a specific, simplistic example, suppose you had 2 groups of >> users - some are normal users who should only be able to access their >> own records. For these users, you might have a policy like >> >> WHERE person_id = current_user >> >> which would be highly selective, and probably use an index scan. Then >> there might be another group of users who are managers with access to >> the records of, say, everyone in their department. This might then be >> a more complex qual along the lines of >> >> WHERE person_id IN (SELECT ... FROM person_department >> WHERE mgr_id = current_user AND ...) >> >> which might end up being a hash or merge join, depending on any >> user-supplied quals. > > Certainly my experience with such a setup is that it includes at least 4 > levels (self, manager, director, officer). 
Now, officer you could > perhaps exclude as being simply RLS-exempt but with such a structure I > would think we'd just make that a special kind of policy (and not chew > up those last 4 bits). As for this example, it's quite naturally done > with a recursive query as it's a tree structure, but if you want to keep > the qual simple and fast, you'd materialize the results of such a query > and simply have: > > WHERE EXISTS (SELECT 1 from org_chart > WHERE current_user = emp_id > AND person_id = org_chart.id) > >> You _could_ combine those into a single policy, but I think it would >> be much better to have 2 distinct policies, since they're 2 very >> different queries, for different use cases. Normal users would only be >> granted permission to use the normal_user_policy. Managers might be >> granted permission to use either the normal_user_policy or the >> manager_policy (but not both at the same time). > > I can't recall a system where managers have to request access to their > manager role. Having another way of changing the permissions which are > applied to a session (the existing one being 'set role') doesn't strike > me as a great idea either. > Actually I think it's quite common to build applications where more privileged users might want to initially log in with normal privileges, and then only escalate to a higher privilege level if needed (much like only being root on a machine when absolutely necessary). But as you say, that can be done through 'set role' so I don't think being able to choose between policies is as important as being able to define different policies for different roles. >> That's a very simplified example. In more realistic situations there >> are likely to be many more classes of users, and trying to enforce all >> the logic in a single WHERE clause is likely to get unmanageable, or >> inefficient if it involves lots of logic hidden away in functions. 
> > Functions and external security systems are exactly the real-world > use-case which users I've talked to are looking for. All of this > discussion is completely orthogonal to their requirements. I understand > that there are simpler use-cases than those and we may be able to > provide an approach which performs better for those. > OK. >> Allowing multiple, non-overlapping policies allows the problem to be >> broken up into more manageable pieces, which also makes the planner's >> job easier, since only a single, simpler policy is in effect in any >> given query. > > Let's try to outline what this would look like then. > > Taking your approach, we'd have: > > CREATE POLICY p1; > CREATE POLICY p2; > > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals; > ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals; > > GRANT SELECT ON TABLE t1 TO role1 USING p1; > GRANT SELECT ON TABLE t1 TO role2 USING p2; > > I'm guessing we would need to further support: > > GRANT INSERT ON TABLE t1 TO role1 USING p2; > > as we've already discussed being able to support per-action (SELECT, > INSERT, UPDATE, DELETE) policies. I'm not quite sure how to address > that though. > > Further, as you mention, users would be able to do: > > SET rls_policy = whatever; > > and things would appear fine, until they tried to access a table to > which they didn't have that policy for, at which point they'd get an > error. > > You mention: > > GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; > > but, to be clear, there would be no option for policies to be > column-specific, right? The policy would apply to the whole row and > just the SELECT/UPDATE privileges would be on the specific columns (as > exists today). > I think that would be OK for the first release. It could be extended in a future release to support column-specific policy ACLs, as long as we don't preclude that in the syntax we choose now. 
The syntax GRANT <command> [,<command>] ON table TO role USING policy works because columns can be added to it later. > From this what I'm gathering is that we'd need catalog tables along > these lines: > > rls_policy > oid, polname name, polowner oid, polnamespace oid, polacl aclitme[] > (oid, policy name, policy owner, policy namespace, ACL, eg: usage?) > > rls_policy_table > ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]? > (policy oid, table/relation oid, quals, ACL) > > pg_class > relhasrls boolean ? > Seems about right. > An extension to the existing ACLs which are for GRANT to include a > policy OID, eg: > > typedef struct AclItem > { > Oid ai_grantee; > Oid ai_grantor; > AclMode ai_privs; > Oid rls_policy; > } > Alternatively, use the ACLs on rls_policy_table - i.e., to SELECT from a table using a particular policy, you would need to have the SELECT bit assigned to you in the corresponding rls_policy_table entry's ACLs. That seems like it would be a less invasive change, but I don't know if there are other problems with that approach. > and further: > > role1=r|p1/postgres > role2=r|p2/postgres > Or just table1: role1=rw/grantor table1 using policy1: role2=rw/grantor to avoid changing the privilege display pattern. That's also more in keeping with the model of storing the per-policy ACLs in rls_policy_table. > or even: > > bob=|policy1/postgres > > with no table-level privileges and only column-level privileges granted > to role3 for this table. > I don't get that last one. If there are no table-level privileges, would it not just be empty? > The plan cache would include what policy OID a given plan was run under > (with InvalidOid indicating an "everything-allowed" policy). > > This doesn't address the concern raised about having different policies > depending on the action type (SELECT, INSERT, etc) though, as mentioned > above.. For that we may have to add "Oid rls_select_policy", etc, to > AclItem, which would be pretty painful. 
Other thoughts? > Huh? Isn't it just another column in rls_policy_table to specify the action type? > This certainly feels like quite a bit to try and bite off for 9.5 and, > as mentioned, this would be a strict superset of the current approach, > which could be implemented under this structure as: > > CREATE POLICY t1_p1_policy; > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals; > GRANT (user's rights) ON t1 TO user USING policy1; > > Tha main downside here is that we'd have to create a policy for every > table in the system which had RLS applied, to avoid granting more than > should be. Perhaps the 9.4 approach could include the 'CREATE POLICY' > and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would, > for the 9.5 -> 9.6 upgrade, pg_dump: > > GRANT (user's rights) ON t1 TO user USING policy1; > > We would still need the GUCs for "rls_enable = on/off" and perhaps the > role-level "bypass_rls" attribute, but those wouldn't change with this. > > Thoughts? > Well I think you'd have to flesh out the alternatives to a similar level of detail to assess the relative effort involved, but I think it's encouraging to see this level of design this early in the 9.5 cycle. Regards, Dean
Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
From
Robert Haas
Date:
On Tue, Jun 24, 2014 at 12:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
> I feel like we are getting to the point of simply talking past each
> other and so I'll try anew, and I'll include my understanding of how the
> different approaches would address the specific use-case you outlined
> up-thread.

Thanks, you're right, and this is a good write-up.

> Single policy
> -------------
> The current implementation approach only allows a single policy to be
> included.
> [...snip...]
> For the case where a sales rep isn't also a partner, you could simplify
> this to:
>
>     WHERE
>       sales_rep_id = 16384
>
> but I'm not sure that really buys you much? With the bitmap heap
> scan, if one side of the OR ends up not returning anything then it
> doesn't contribute to the blocks which have to be scanned. The index
> might still need to be scanned, although I think you could avoid even
> that with an EXISTS check to see if the user is a partner at all.
> That's not to say that a bitmap scan is equivalent to an index scan, but
> it's certainly likely to be far better than a sequential scan.

True, but the wins could be much larger if one policy is WHERE sales_rep_id = (SELECT oid FROM pg_roles WHERE rolname = current_user) and the other policy is WHERE complexfn(). I'll also throw out a +1 for Dean's comments on this topic.

> Multiple, Non-overlapping policies
> ----------------------------------
> Preventing the overlap of policies ends up being very complicated if
> many dimensions are allowed. For the simple case, perhaps only the
> 'current role' dimension is useful. I expect that going down that
> route would very quickly lead to requests for other dimensions (client
> IP, etc) which is why I'm not a big fan of it, but if that's the
> consensus then let's work out the syntax and update the patch and move
> on.

I think role is good enough. That's the primary identifier for all access-control related decisions, so it should be good enough here, too.
I don't really understand your concerns about overlapping policies being complex. If you've got a couple of WHERE clauses, combining them with OR is not hard. Now, the query optimizer may have trouble with it, but on the whole I expect to win more than we lose, by entirely excluding some branches of an OR for users for whom entire policies can be excluded.

> Overall, while I'm interested in defining where this is going in a way
> which allows us to implement an initial RLS capability while avoiding
> future upgrade issues, I am perfectly happy to say that the 9.5 RLS
> implementation may not be exactly syntax-compatible with 9.6 or 10.0.

Again, I think that's completely non-viable. Are you going to tell people they can't pg_upgrade, and they can't dump-and-reload either, without manual fiddling? There's no way that's going to be accepted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
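To make the OR-combining argument concrete, here is a small Python sketch, offered as an illustration rather than as how the rewriter would do it. The (qual, roles) pairs, the role names and the "false" fallback for a role with no applicable policy are all assumptions; the two quals are the ones from the example above.

```python
def combined_qual(role, policies):
    """OR together the quals of every policy that applies to `role`.

    `policies` is a list of (qual, roles) pairs. Policies that do not apply
    to the role are dropped before combining, so their branch of the OR
    never reaches the planner at all.
    """
    quals = [qual for qual, roles in policies if role in roles]
    if not quals:
        return "false"  # assumption: no applicable policy means no rows
    return " OR ".join(f"({qual})" for qual in quals)


POLICIES = [
    ("sales_rep_id = (SELECT oid FROM pg_roles WHERE rolname = current_user)",
     {"sales_rep", "partner_rep"}),
    ("complexfn()", {"partner_rep"}),
]
```

A role such as the hypothetical sales_rep gets only the cheap qual, while partner_rep pays for both branches; that per-role pruning is the win being described.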
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 25 June 2014 01:49, Stephen Frost <sfrost@snowman.net> wrote: > > I can't recall a system where managers have to request access to their > > manager role. Having another way of changing the permissions which are > > applied to a session (the existing one being 'set role') doesn't strike > > me as a great idea either. > > > > Actually I think it's quite common to build applications where more > privileged users might want to initially log in with normal > privileges, and then only escalate to a higher privilege level if > needed (much like only being root on a machine when absolutely > necessary). But as you say, that can be done through 'set role' so I > don't think being able to choose between policies is as important as > being able to define different policies for different roles. For those kinds of applications (eg: sudo), yes. I was, perhaps, looking at your example a bit too literally- I was thinking of HR management type systems (timecard systems, etc). > > You mention: > > > > GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; > > > > but, to be clear, there would be no option for policies to be > > column-specific, right? The policy would apply to the whole row and > > just the SELECT/UPDATE privileges would be on the specific columns (as > > exists today). > > > > I think that would be OK for the first release. It could be extended > in a future release to support column-specific policy ACLs, as long as > we don't preclude that in the syntax we choose now. The syntax > > GRANT <command> [,<command>] ON table TO role USING policy > > works because columns can be added to it later. What would per-column RLS policies mean..? Would we have to work out which columns are being updated vs. select'd on before being able to choose the policy/quals to include? Seems like that's probably workable but I've not thought about it very hard. 
> > From this what I'm gathering is that we'd need catalog tables along > > these lines: > > > > rls_policy > > oid, polname name, polowner oid, polnamespace oid, polacl aclitme[] > > (oid, policy name, policy owner, policy namespace, ACL, eg: usage?) > > > > rls_policy_table > > ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]? > > (policy oid, table/relation oid, quals, ACL) > > > > pg_class > > relhasrls boolean ? > > Seems about right. > > > An extension to the existing ACLs which are for GRANT to include a > > policy OID, eg: > > > > typedef struct AclItem > > { > > Oid ai_grantee; > > Oid ai_grantor; > > AclMode ai_privs; > > Oid rls_policy; > > } > > > > Alternatively, use the ACLs on rls_policy_table - i.e., to SELECT from > a table using a particular policy, you would need to have the SELECT > bit assigned to you in the corresponding rls_policy_table entry's > ACLs. That seems like it would be a less invasive change, but I don't > know if there are other problems with that approach. Ah, that's a good thought. My original thinking for that column was some kind of privilege structure around who is allowed to modify the quals for a given policy+table, but using that as the definition of who has what policies does make sense and means we can leave AclItem more-or-less alone, which is very nice. The relhasrls boolean would allow us to only query that catalog in cases where a policy exists, hopefully minimizing the impact for users who are not using RLS. > > and further: > > > > role1=r|p1/postgres > > role2=r|p2/postgres > > Or just > > table1: > role1=rw/grantor > table1 using policy1: > role2=rw/grantor > > to avoid changing the privilege display pattern. That's also more in > keeping with the model of storing the per-policy ACLs in > rls_policy_table. I like that output, but do we expect any pushback from users who parse out that field? 
Admittedly, they really shouldn't be doing that, but I'm sure most actually do, and "table1 using policy1" isn't terribly nice to parse. > > or even: > > > > bob=|policy1/postgres > > > > with no table-level privileges and only column-level privileges granted > > to role3 for this table. > > I don't get that last one. If there are no table-level privileges, > would it not just be empty? No, as there could be column-level privileges. Note that table-level privileges get you access to all columns, and column level privileges (as defined by SQL spec) give you access even if you don't have any table-level privileges. As I was trying to exclude the notion of column-level policies, I figured policies would always show up in the "table" level ACL fields, but if there aren't any table-level rights, what would that look like? With your proposal, it'd be: table1 using policy1: bob=/grantor ? > > The plan cache would include what policy OID a given plan was run under > > (with InvalidOid indicating an "everything-allowed" policy). > > > > This doesn't address the concern raised about having different policies > > depending on the action type (SELECT, INSERT, etc) though, as mentioned > > above.. For that we may have to add "Oid rls_select_policy", etc, to > > AclItem, which would be pretty painful. Other thoughts? > > > > Huh? Isn't it just another column in rls_policy_table to specify the > action type? I had been trying to fit it into the ACL structure somehow. What would it look like to have multiple action types then? Here's one thought: table1 using policy1 for INSERT: bob=rw/grantor table1 using policy1 for SELECT: bob=r/grantor Or how about: table1|policy1/w: bob=rw/grantor table1|policy1/r: bob=r/grantor Another question is about showing what the actual quals are for a given policy which is being applied to a table. Would we want that to show up in \d, \d+, or only be available through querying the catalog..? 
> > This certainly feels like quite a bit to try and bite off for 9.5 and, > > as mentioned, this would be a strict superset of the current approach, > > which could be implemented under this structure as: > > > > CREATE POLICY t1_p1_policy; > > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals; > > GRANT (user's rights) ON t1 TO user USING policy1; > > > > Tha main downside here is that we'd have to create a policy for every > > table in the system which had RLS applied, to avoid granting more than > > should be. Perhaps the 9.4 approach could include the 'CREATE POLICY' > > and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would, > > for the 9.5 -> 9.6 upgrade, pg_dump: > > > > GRANT (user's rights) ON t1 TO user USING policy1; > > > > We would still need the GUCs for "rls_enable = on/off" and perhaps the > > role-level "bypass_rls" attribute, but those wouldn't change with this. > > Well I think you'd have to flesh out the alternatives to a similar > level of detail to assess the relative effort involved, but I think > it's encouraging to see this level of design this early in the 9.5 > cycle. I'm not sure which other alternatives you're thinking about here- could you be more specific..? I can try to flesh them out but I had actually been hoping that this would be a good compromise position among the alternatives. This provides the per-role policy granularity which has been mentioned a few times, but doesn't allow the policies to overlap. Overlapping policies could be added to this general design, I believe, though we'd have to make a few catalog changes and invent some new syntax to define how the policies are to be combined (ANDs vs ORs, etc). I had brought up the idea of ordering/prioritizing policies, but I didn't particularly like the suggestion when I made it and I don't recall anyone else voicing interest in that approach. For my part, I don't see the GUCs as really being "alternatives" so much as pre-requisites. 
Even with all the granularity and comprehensive set of features which we're talking about here, we're going to need a way for pg_dump to simply say "do not apply RLS to me and ERROR out if that's an issue".

I agree that it's great to get these design discussions happening now but I really do not want this to become a behemoth patch by the last CF that ends up bounced because of that. What I'd like to work through is the minimal set which would be accepted and get that in, in a way that doesn't prevent further improvements, and then see what can be done to get those improvements and refinements in during the 9.5 cycle and what gets bounced to the next release.

To that end, I've been trying to gauge interest in this and get some feel for who is interested in helping push this forward- your help was instrumental in getting updatable security barrier views into 9.4, would you have time to help with this also..?

Thanks!

Stephen
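On the question raised earlier in this message about users who parse the ACL field, the Python sketch below shows what such a parser might look like for the "grantee=privs/grantor" text form, extended with the "|policy" suffix floated up-thread. The "|policy" part is purely the hypothetical display format from this discussion, the function name is made up, and quoted identifiers are deliberately not handled.

```python
import re

# grantee=privs[|policy]/grantor ; the "|policy" part is the hypothetical
# extension discussed in this thread, not an existing display format.
ACL_RE = re.compile(
    r"^(?P<grantee>[^=]*)=(?P<privs>[^|/]*)"
    r"(?:\|(?P<policy>[^/]*))?/(?P<grantor>.*)$"
)


def parse_acl(item):
    """Split one ACL entry into grantee, privilege letters, policy, grantor."""
    match = ACL_RE.match(item)
    if match is None:
        raise ValueError(f"unrecognised ACL entry: {item!r}")
    return {
        "grantee": match["grantee"] or "PUBLIC",  # empty grantee is PUBLIC
        "privs": match["privs"],
        "policy": match["policy"],  # None when no policy is attached
        "grantor": match["grantor"],
    }
```

A parser written against today's format would likely reject or misread the extended entries, which is the kind of pushback being asked about; keeping the policy out of the entry itself (as in the "table1 using policy1:" heading form) avoids that.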
Robert, all,

Changing the thread topic to match the other one, and adding Dean in explicitly since we're talking about the design discussed with him.

* Robert Haas (robertmhaas@gmail.com) wrote:
> I think role is good enough. That's the primary identifier for all
> access-control related decisions, so it should be good enough here,
> too.

Alright. That works for me.

> I don't really understand your concerns about overlapping policies
> being complex. If you've got a couple of WHERE clauses, combining
> them with OR is not hard. Now, the query optimizer may have trouble
> with it, but on the whole I expect to win more than we lose, by
> entirely excluding some branches of an OR for users for whom entire
> policies can be excluded.

On the thread with Dean we're proposing some specific catalog designs and part of that included (fleshing it out a bit more) something like:

CREATE TABLE pg_relrlspolicy (  -- relation RLS policy table
    ptblrelid   oid,            -- Relation/table
    ptblaction  text,           -- SELECT, INSERT, UPDATE, DELETE
    ptblpolid   oid,            -- Policy
    ptblquals   text,           -- Quals to add
    ptblacl     aclitem[],      -- Rights to use this policy on the table
    primary key (ptblrelid, ptblaction)
);

And note that I had expected aclitem to only include one entry per role.

To support overlapping policies, we could add 'ptblpolid' into the primary key and then simply extract out all of the entries for the relation and action that we're currently running and step through each to find which of the policies apply to the current_role...? If a role has policyA with 'INSERT' rights, but no rights to SELECT, but they also have an entry for policyB with 'SELECT' rights, we would use only policyB for a SELECT query? Does that approach mean we don't need 'ptblaction' after all? I'm thinking this approach would also toss out the "pick your policy" concept that Dean had proposed up-thread.

How would these interact with the existing table-level rights? For column-level rights, if you have access to SELECT the column then you don't need any table-level rights (and the table-level rights mean you can SELECT from any column), are we thinking the same would apply here, such that having 'USING POLICY' rights means you can SELECT from the table and the table-level rights end up being the 'DIRECT' rights which had been discussed up-thread? Not sure that I like that approach, though I understand some others might find it appealing.. As we're integrating this with the GRANT command, perhaps it'd be alright.

> > Overall, while I'm interested in defining where this is going in a way
> > which allows us to implement an initial RLS capability while avoiding
> > future upgrade issues, I am perfectly happy to say that the 9.5 RLS
> > implementation may not be exactly syntax-compatible with 9.6 or 10.0.
>
> Again, I think that's completely non-viable. Are you going to tell
> people they can't pg_upgrade, and they can't dump-and-reload either,
> without manual fiddling? There's no way that's going to be accepted.

I don't understand what you're getting at here. We dump the catalog using the newer version of pg_dump for pg_upgrade and we could handle any *syntax* change required during that process to ensure that the same access is granted in the new cluster as existed in the old cluster. We do the exact same thing every time we add a new reserved keyword- anything which used that keyword before ends up getting double-quoted by the new version of pg_dump and both pg_dump and pg_upgrade work just fine. We routinely break some syntax compatibility between major versions, address those changes in the newer version of pg_dump, and move on.

I am not proposing that users won't be able to upgrade from 9.5 to 9.6 if they have RLS, and I agree that that would be a non-starter.

Thanks,

Stephen
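
The lookup described above (pull every pg_relrlspolicy entry for the relation and action, then keep the ones the current role may use) can be modelled in a few lines. This is only a sketch of the semantics under discussion, not PostgreSQL code: the PolicyRow class, the applicable_quals function, the use of names instead of oids, and the sample quals are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRow:
    """One row of the proposed pg_relrlspolicy catalog (illustrative model only)."""
    ptblrelid: str       # relation the policy is attached to (a name here, an oid in the proposal)
    ptblaction: str      # SELECT, INSERT, UPDATE or DELETE
    ptblpolid: str       # policy identifier
    ptblquals: str       # qual text to add to the query
    ptblacl: frozenset   # roles allowed to use this policy on the table


def applicable_quals(catalog, relid, action, current_role):
    """Return the quals of every row matching (relid, action) that current_role may use."""
    return [
        row.ptblquals
        for row in catalog
        if row.ptblrelid == relid
        and row.ptblaction == action
        and current_role in row.ptblacl
    ]


# Hypothetical data mirroring the policyA/policyB question in the mail.
catalog = [
    PolicyRow("t1", "INSERT", "policyA", "owner = current_user", frozenset({"role1"})),
    PolicyRow("t1", "SELECT", "policyB", "dept = 'sales'", frozenset({"role1"})),
]

select_quals = applicable_quals(catalog, "t1", "SELECT", "role1")
```

In this model a SELECT by role1 picks up only policyB's quals, which matches the "we would use only policyB for a SELECT query" reading; whether 'ptblaction' stays in the key is left open, as it is in the mail.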
On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Let's try to outline what this would look like then.
>
> Taking your approach, we'd have:
>
> CREATE POLICY p1;
> CREATE POLICY p2;
>
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;

This seems like a very nice, flexible framework.

> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> GRANT SELECT ON TABLE t1 TO role2 USING p2;

Instead of doing it this way, we could instead do:

ALTER ROLE role1 ADD POLICY p1;
ALTER ROLE role2 ADD POLICY p2;

We could possibly allow multiple policies to be set for the same user, but give an error (or OR the quals together) if there are conflicting policies for the same table. A user with no policies would see everything to which they've been granted access.

To support different policies on different operations, you could have something like:

ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;

Without the ON clause, it would establish the given policy for all operations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > Let's try to outline what this would look like then.
> >
> > Taking your approach, we'd have:
> >
> > CREATE POLICY p1;
> > CREATE POLICY p2;
> >
> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> > ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;
>
> This seems like a very nice, flexible framework.
>
> > GRANT SELECT ON TABLE t1 TO role1 USING p1;
> > GRANT SELECT ON TABLE t1 TO role2 USING p2;
>
> Instead of doing it this way, we could instead do:
>
> ALTER ROLE role1 ADD POLICY p1;
> ALTER ROLE role2 ADD POLICY p2;
>
> We could possibly allow multiple policies to be set for the same user,
> but give an error (or OR the quals together) if there are conflicting
> policies for the same table. A user with no policies would see
> everything to which they've been granted access.

Ok, I like that as it means that "normal" GRANTs, etc, remain the same and are just constrained by RLS when there is an RLS policy on the table, which I believe is really the right approach.

> To support different policies on different operations, you could have
> something like:
>
> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>
> Without the ON clause, it would establish the given policy for all operations.

Right, this makes sense also and is similar to what we were angling towards initially. I'll think further on this and propose a catalog structure and try to delve into the semantics of query operations, etc.

One issue that occurs to me is trying to think through how to address the plancache invalidation, such that we are not invalidating constantly but are still setting the correct quals for the current query. We had gone down a road where we saved a plan for each role under which a query was run but then ripped that out because the RLS policy would handle the per-role issues (modulo whether RLS should be enabled or not). This approach means that we'd have to bring back the notion of per-role plan caching. I'm not against that, just making note of it.

Thanks,

Stephen
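
The per-role plan caching mentioned above amounts to keying saved plans on the role as well as the query, so that a plan built with one role's quals is never reused for another role. A minimal sketch of that idea follows; it is not the PostgreSQL plancache, and PerRolePlanCache, get_plan, invalidate_role and the fake planner are hypothetical names used only for illustration.

```python
class PerRolePlanCache:
    """Toy cache that stores one plan per (query text, role) pair."""

    def __init__(self, planner):
        self._planner = planner   # callable: (query_text, role) -> plan
        self._plans = {}

    def get_plan(self, query_text, role):
        """Return the cached plan for this role, planning only on a miss."""
        key = (query_text, role)
        if key not in self._plans:
            self._plans[key] = self._planner(query_text, role)
        return self._plans[key]

    def invalidate_role(self, role):
        """Drop every plan built for one role, e.g. after its policies change."""
        self._plans = {k: v for k, v in self._plans.items() if k[1] != role}


# Stand-in planner that records each call so replanning is observable.
planner_calls = []


def fake_planner(query_text, role):
    planner_calls.append((query_text, role))
    return f"plan for {role}: {query_text}"


cache = PerRolePlanCache(fake_planner)
```

Invalidating per role rather than globally is one way to avoid "invalidating constantly"; the trade-off is that the cache grows with the number of roles that run each query.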
On 25 June 2014 16:44, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> Let's try to outline what this would look like then.
>>
>> Taking your approach, we'd have:
>>
>> CREATE POLICY p1;
>> CREATE POLICY p2;
>>
>> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
>> ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;
>
> This seems like a very nice, flexible framework.
>
>> GRANT SELECT ON TABLE t1 TO role1 USING p1;
>> GRANT SELECT ON TABLE t1 TO role2 USING p2;
>
> Instead of doing it this way, we could instead do:
>
> ALTER ROLE role1 ADD POLICY p1;
> ALTER ROLE role2 ADD POLICY p2;
>
> We could possibly allow multiple policies to be set for the same user,
> but give an error (or OR the quals together) if there are conflicting
> policies for the same table. A user with no policies would see
> everything to which they've been granted access.
>

I'm a bit uneasy about allowing overlapping policies like this, because I think it is more likely to lead to unintended consequences than solve real use cases. For example, suppose you define policies p1 and p2 and set them up on table t1, and you grant role1 permissions on t1 and allow role1 the use of policy p1. Then you set up policy p2 on another table t2, and decide you want to allow role1 access to t2 using this policy. The only way to do it is to add p2 to role1, but doing so also then gives role1 access to t1 using p2, which might not have been what you intended.

With the GRANT ... USING policy syntax, you have greater flexibility to pick and choose which policies each user has permission to use with each table. To me at least, that seems much less error prone, since you are being much more explicit about exactly what privileges you are granting. The ALTER ROLE ... ADD POLICY syntax is potentially adding a whole bunch of extra privileges to the role, and you have to work quite hard to see exactly what it's adding.

> To support different policies on different operations, you could have
> something like:
>
> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>
> Without the ON clause, it would establish the given policy for all operations.
>

Yes, that makes sense. But as I was arguing above, I think the ACLs should be attached to the specific RLS policy identified uniquely by (table, policy, command). So, for example, if you did

ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
ALTER TABLE t1 SET POLICY p1 ON UPDATE TO t1_p1_upd_quals;

you could also do

GRANT SELECT ON TABLE t1 TO role1 USING p1;
GRANT UPDATE ON TABLE t1 TO role1 USING p1;

but it would be an error to do

GRANT DELETE ON TABLE t1 TO role1 USING p1;

because there is no p1 delete policy for t1.

Regards,
Dean
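
The rule above, that a grant names a policy identified by (table, policy, command) and fails when no such entry exists, can be sketched as follows. This models the proposed behaviour only; table_policies, grants, grant_using and PolicyError are hypothetical names, and the dictionary stands in for whatever catalog would hold the quals.

```python
class PolicyError(Exception):
    """Raised when a grant names a (table, policy, command) that was never defined."""


# State after the two ALTER TABLE ... SET POLICY statements in the example.
table_policies = {
    ("t1", "p1", "SELECT"): "t1_p1_sel_quals",
    ("t1", "p1", "UPDATE"): "t1_p1_upd_quals",
}

# Each grant is recorded as (role, table, policy, command).
grants = set()


def grant_using(command, table, role, policy):
    """Model of GRANT <command> ON TABLE <table> TO <role> USING <policy>."""
    key = (table, policy, command)
    if key not in table_policies:
        raise PolicyError(f"no {command} policy {policy} defined for table {table}")
    grants.add((role, table, policy, command))


grant_using("SELECT", "t1", "role1", "p1")
grant_using("UPDATE", "t1", "role1", "p1")
```

Calling grant_using("DELETE", "t1", "role1", "p1") raises PolicyError here, mirroring the GRANT DELETE case that the mail says should be an error.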
On Wed, Jun 25, 2014 at 4:48 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> Instead of doing it this way, we could instead do:
>>
>> ALTER ROLE role1 ADD POLICY p1;
>> ALTER ROLE role2 ADD POLICY p2;
>>
>> We could possibly allow multiple policies to be set for the same user,
>> but give an error (or OR the quals together) if there are conflicting
>> policies for the same table. A user with no policies would see
>> everything to which they've been granted access.
>>
> I'm a bit uneasy about allowing overlapping policies like this,
> because I think it is more likely to lead to unintended consequences
> than solve real use cases. For example, suppose you define policies p1
> and p2 and set them up on table t1, and you grant role1 permissions on
> t1 and allow role1 the use of policy p1. Then you set up policy p2 on
> another table t2, and decide you want to allow role1 access to t2
> using this policy. The only way to do it is to add p2 to role1, but
> doing so also then gives role1 access to t1 using p2, which might not
> have been what you intended.

I guess that's true but it just seems like a configuration error. I have it in mind that most people will define policies for non-overlapping sets of tables and then apply those policies as appropriate to each user. Whether that's true or not, I don't see it as being materially different from granting membership in a role - you could easily give the user permission to do stuff they shouldn't be able to do, but if you don't carefully examine the bundle of privileges that come with that GRANT before executing on it, that's your fault, not the system's.

>> To support different policies on different operations, you could have
>> something like:
>>
>> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>>
>> Without the ON clause, it would establish the given policy for all operations.
>
> Yes, that makes sense. But as I was arguing above, I think the ACLs
> should be attached to the specific RLS policy identified uniquely by
> (table, policy, command). So, for example, if you did
>
> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
> ALTER TABLE t1 SET POLICY p1 ON UPDATE TO t1_p1_upd_quals;
>
> you could also do
>
> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> GRANT UPDATE ON TABLE t1 TO role1 USING p1;
>
> but it would be an error to do
>
> GRANT DELETE ON TABLE t1 TO role1 USING p1;

As I see it, the downside of this is that it gets a lot more complex. We have to revise the ACL representation, which is already pretty darn complicated, to keep track not only of the grantee, grantor, and permissions, but also the policies qualifying those permissions. The changes to GRANT will need to propagate into GRANT ON ALL TABLES IN SCHEMA and ALTER DEFAULT PRIVILEGES. There is administrative complexity as well, because if you want to policy-protect an additional table, you've got to add the table to the policy and then update all the grants as well. I think what will happen in practice is that people will grant to PUBLIC all rights on the policy, and then do all the access control through the GRANT statements.

An interesting question we haven't much considered is: who can set up policies and add them to users? Maybe we should flip this around, and instead of adding users to policies, we should exempt users from policies.

CREATE POLICY p1;

And then, if they own p1 and t1, they can do:

ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
(or maybe we should associate it to the policy instead of the table:
ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)

And then the policy applies to everyone who doesn't have the grantable EXEMPT privilege on the policy. The policy owner and superuser have that privilege by default and it can be handed out to others like this:

GRANT EXEMPT ON POLICY p1 TO snowden;

Then users who have row_level_security=on will bypass RLS if possible, and otherwise it will be applied. Users who have row_level_security=off will bypass RLS if possible, and otherwise error. And users who have row_level_security=force will apply RLS even if they are entitled to bypass it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
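
The three row_level_security settings described above reduce to a small decision table over the setting and whether the user is entitled to bypass the policy (for example by holding EXEMPT). The sketch below models that table only; rls_decision and the returned strings are assumptions for illustration, not an existing PostgreSQL interface.

```python
def rls_decision(setting, can_bypass):
    """Return 'bypass' or 'apply' for one query, or raise if the query must fail.

    setting    -- proposed row_level_security value: 'on', 'off' or 'force'
    can_bypass -- True if the user is entitled to bypass the policy
    """
    if setting == "on":
        # Bypass when entitled, otherwise apply the policy quals.
        return "bypass" if can_bypass else "apply"
    if setting == "off":
        # Bypass when entitled, otherwise refuse to run the query.
        if can_bypass:
            return "bypass"
        raise PermissionError("RLS cannot be bypassed for this user")
    if setting == "force":
        # Always apply the policy, even for users entitled to bypass it.
        return "apply"
    raise ValueError(f"unknown row_level_security setting: {setting!r}")


# Decision for a user who may bypass the policy, under each setting.
decisions_when_exempt = {s: rls_decision(s, True) for s in ("on", "off", "force")}
```

The "off" row is the behaviour pg_dump would want: either the dump sees every row or it fails loudly, never a silently filtered result.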
On 26 June 2014 18:04, Robert Haas <robertmhaas@gmail.com> wrote:
>> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
>> GRANT SELECT ON TABLE t1 TO role1 USING p1;
>
> As I see it, the downside of this is that it gets a lot more complex.
> We have to revise the ACL representation, which is already pretty darn
> complicated, to keep track not only of the grantee, grantor, and
> permissions, but also the policies qualifying those permissions. The
> changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
> SCHEMA and ALTER DEFAULT PRIVILEGES.

No, it can be done without any changes to the permissions code by storing the ACLs on the catalog entries where the RLS quals are held, rather than modifying the ACL items on the table. I.e., instead of thinking of "USING polname" as a modifier to the grant, think of it as an additional qualifier on the thing being granted.

That means the syntax I proposed earlier is wrong/misleading. Instead of

GRANT SELECT ON TABLE tbl TO role USING polname;

it should really be

GRANT SELECT USING polname ON TABLE tbl TO role;

> There is administrative
> complexity as well, because if you want to policy-protect an
> additional table, you've got to add the table to the policy and then
> update all the grants as well. I think what will happen in practice
> is that people will grant to PUBLIC all rights on the policy, and then
> do all the access control through the GRANT statements.
>

If you assume that most users will only have one policy through which they can access any given table, then there is no more administrative overhead than we have right now. Right now you have to grant each user permissions on each table you define. The only difference is that now you throw in a "USING polname". We could also simplify administration by supporting

GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role;

The important distinction is that this is only granting permissions on tables that exist now, not on tables that might be created later.

> An interesting question we haven't much considered is: who can set up
> policies and add them to users? Maybe we should flip this around, and
> instead of adding users to policies, we should exempt users from
> policies.
>
> CREATE POLICY p1;
>
> And then, if they own p1 and t1, they can do:
>
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> (or maybe we should associate it to the policy instead of the table:
> ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
>
> And then the policy applies to everyone who doesn't have the grantable
> EXEMPT privilege on the policy. The policy owner and superuser have
> that privilege by default and it can be handed out to others like
> this:
>
> GRANT EXEMPT ON POLICY p1 TO snowden;
>
> Then users who have row_level_security=on will bypass RLS if possible,
> and otherwise it will be applied. Users who have
> row_level_security=off will bypass RLS if possible, and otherwise
> error. And users who have row_level_security=force will apply RLS
> even if they are entitled to bypass it.
>

That's interesting. I need to think some more about what that means.

Regards,
Dean
Robert, Dean,

* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
> On 26 June 2014 18:04, Robert Haas <robertmhaas@gmail.com> wrote:
> >> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
> >> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> >
> > As I see it, the downside of this is that it gets a lot more complex.
> > We have to revise the ACL representation, which is already pretty darn
> > complicated, to keep track not only of the grantee, grantor, and
> > permissions, but also the policies qualifying those permissions. The
> > changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
> > SCHEMA and ALTER DEFAULT PRIVILEGES.
>
> No, it can be done without any changes to the permissions code by
> storing the ACLs on the catalog entries where the RLS quals are held,
> rather than modifying the ACL items on the table. I.e., instead of
> thinking of "USING polname" as a modifier to the grant, think of it as
> an additional qualifier on the thing being granted.

Yeah, I agree that we could do it without changing the existing ACL structure by using another table and having a flag in pg_class which indicates if there are RLS policies on the table or not.

Regarding the concerns about users not using the RLS capabilities correctly- I find that concern to be much more appropriate for the current permissions system rather than RLS. If a user is going to the level of even looking at RLS then, I'd hope at least, they'll be able to understand and make good use of RLS to implement what they need and they would appreciate the flexibility.

To try and clarify what this distinction is-

Dean's approach with GRANT allows specifying the policy to be used when a given role queries a given table. Through this mechanism, one role might have access to many different tables, possibly with a different policy granting that access for each table.

Robert's approach defines a policy for a user and that policy is used for all tables that user accesses. This ties the policy very closely to the role.

With either approach, I wonder how we are going to address the role membership question. Do you inherit policies through role membership? What happens on 'set role'? Robert points out that we should be using "OR" for these situations of overlapping policies and I tend to agree. Therefore, we would look at the RLS policies for a table and extract out all of them for all of the roles which the current user is a member of, OR them together and that would be the set of quals used.

I'm leaning towards Dean's approach. With Robert's approach, one could emulate Dean's approach but I suspect it would devolve quickly into one policy per user with that policy simply being a proxy for the role instead of being useful on its own. With Dean's approach though, I don't think there's a need for a policy to be a stand-alone object. The policy is simply a proxy for the set of quals to be added and therefore the policy could really live as a per-table object.

> That means the syntax I proposed earlier is wrong/misleading. Instead of
>
> GRANT SELECT ON TABLE tbl TO role USING polname;
>
> it should really be
>
> GRANT SELECT USING polname ON TABLE tbl TO role;

This would work, though the 'polname' could be a per-table object, no?

This could even be:

GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;

> > There is administrative
> > complexity as well, because if you want to policy-protect an
> > additional table, you've got to add the table to the policy and then
> > update all the grants as well. I think what will happen in practice
> > is that people will grant to PUBLIC all rights on the policy, and then
> > do all the access control through the GRANT statements.

I agree that if you want to policy protect a table that you'll need to set the policies on the table (that's required either way) and that, with Dean's approach, you'd have to modify the GRANTs done to that table as well. I don't follow what you're suggesting with granting to PUBLIC all rights on the policy though..?

With your approach though, if you have a policy which covers all managers and one which covers all VPs and then you have one VP whose access should be different, you'd have to create a new policy just for that VP and then modify all of the tables which have manager/VP access to also have that new VP's policy too, or something along those lines, no?

> If you assume that most users will only have one policy through which
> they can access any given table, then there is no more administrative
> overhead than we have right now. Right now you have to grant each user
> permissions on each table you define. The only difference is that now
> you throw in a "USING polname". We could also simplify administration
> by supporting
>
> GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role;
>
> The important distinction is that this is only granting permissions on
> tables that exist now, not on tables that might be created later.

Sure, that's the same as it is now.. Robert's correct, imv, that we'll need to make GRANT .. ON ALL, and ALTER DEFAULT PRIVS work with this.

> > An interesting question we haven't much considered is: who can set up
> > policies and add them to users? Maybe we should flip this around, and
> > instead of adding users to policies, we should exempt users from
> > policies.
> >
> > CREATE POLICY p1;
> >
> > And then, if they own p1 and t1, they can do:
> >
> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> > (or maybe we should associate it to the policy instead of the table:
> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
> >
> > And then the policy applies to everyone who doesn't have the grantable
> > EXEMPT privilege on the policy. The policy owner and superuser have
> > that privilege by default and it can be handed out to others like
> > this:
> >
> > GRANT EXEMPT ON POLICY p1 TO snowden;
> >
> > Then users who have row_level_security=on will bypass RLS if possible,
> > and otherwise it will be applied. Users who have
> > row_level_security=off will bypass RLS if possible, and otherwise
> > error. And users who have row_level_security=force will apply RLS
> > even if they are entitled to bypass it.
>
> That's interesting. I need to think some more about what that means.

I'm not a fan of the EXEMPT approach..

Thanks,

Stephen
On Sun, Jun 29, 2014 at 3:42 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> > An interesting question we haven't much considered is: who can set up
>> > policies and add them to users? Maybe we should flip this around, and
>> > instead of adding users to policies, we should exempt users from
>> > policies.
>> >
>> > CREATE POLICY p1;
>> >
>> > And then, if they own p1 and t1, they can do:
>> >
>> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
>> > (or maybe we should associate it to the policy instead of the table:
>> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
>> >
>> > And then the policy applies to everyone who doesn't have the grantable
>> > EXEMPT privilege on the policy. The policy owner and superuser have
>> > that privilege by default and it can be handed out to others like
>> > this:
>> >
>> > GRANT EXEMPT ON POLICY p1 TO snowden;
>> >
>> > Then users who have row_level_security=on will bypass RLS if possible,
>> > and otherwise it will be applied. Users who have
>> > row_level_security=off will bypass RLS if possible, and otherwise
>> > error. And users who have row_level_security=force will apply RLS
>> > even if they are entitled to bypass it.
>>
>> That's interesting. I need to think some more about what that means.
>
> I'm not a fan of the EXEMPT approach..

Just out of curiosity, why not?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Jun 29, 2014 at 3:42 PM, Stephen Frost <sfrost@snowman.net> wrote:
> >> > An interesting question we haven't much considered is: who can set up
> >> > policies and add them to users? Maybe we should flip this around, and
> >> > instead of adding users to policies, we should exempt users from
> >> > policies.
> >> >
> >> > CREATE POLICY p1;
> >> >
> >> > And then, if they own p1 and t1, they can do:
> >> >
> >> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> >> > (or maybe we should associate it to the policy instead of the table:
> >> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
> >> >
> >> > And then the policy applies to everyone who doesn't have the grantable
> >> > EXEMPT privilege on the policy. The policy owner and superuser have
> >> > that privilege by default and it can be handed out to others like
> >> > this:
> >> >
> >> > GRANT EXEMPT ON POLICY p1 TO snowden;
> >> >
> >> > Then users who have row_level_security=on will bypass RLS if possible,
> >> > and otherwise it will be applied. Users who have
> >> > row_level_security=off will bypass RLS if possible, and otherwise
> >> > error. And users who have row_level_security=force will apply RLS
> >> > even if they are entitled to bypass it.
> >>
> >> That's interesting. I need to think some more about what that means.
> >
> > I'm not a fan of the EXEMPT approach..
>
> Just out of curiosity, why not?

I don't see it as really solving the flexibility need and it feels quite a bit more complicated to reason about. Would someone who is EXEMPT from one policy on a given table still have other policies on that table applied to them? Would a user be able to be EXEMPT from multiple policies? I feel like that's what you're suggesting with this approach, otherwise I don't see it as really different from the 'DIRECT SELECT' privilege discussed previously..

Thanks,

Stephen
On Mon, Jun 30, 2014 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> > I'm not a fan of the EXEMPT approach..
>>
>> Just out of curiosity, why not?
>
> I don't see it as really solving the flexibility need and it feels quite
> a bit more complicated to reason about. Would someone who is EXEMPT
> from one policy on a given table still have other policies on that table
> applied to them?

Yes; otherwise, EXEMPT couldn't be granted by non-superusers, and the whole point of that proposal was to come up with something that would be clearly safe for ordinary users to use.

> Would a user be able to be EXEMPT from multiple
> policies?

Yes, clearly. It would be a privilege on the policy object, so different objects can have different privileges.

> I feel like that's what you're suggesting with this approach,
> otherwise I don't see it as really different from the 'DIRECT SELECT'
> privilege discussed previously..

Right. If you took that away, it wouldn't be different.

The number of possible approaches here has expanded beyond what I can keep in my head; I'm assuming you are planning to think this over and propose something comprehensive, or maybe Dean or someone else will do that. But I'm not sure that all the approaches proposed would make it safe for non-superusers to use RLS, and I think it would be good if they could.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Jun 30, 2014 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > I don't see it as really solving the flexibility need and it feels quite
> > a bit more complicated to reason about. Would someone who is EXEMPT
> > from one policy on a given table still have other policies on that table
> > applied to them?
>
> Yes; otherwise, EXEMPT couldn't be granted by non-superusers, and the
> whole point of that proposal was to come up with something that would
> be clearly safe for ordinary users to use.

I'm confused on this part- granting EXEMPT and/or DIRECT SELECT would definitely need to be supported by a non-superuser, though by someone who had the appropriate rights on the object involved (either the policy or the table, depending on where we feel that definition should go).

> > Would a user be able to be EXEMPT from multiple
> > policies?
>
> Yes, clearly. It would be a privilege on the policy object, so
> different objects can have different privileges.

Ok.. then I'm not entirely sure how this is different from Dean's proposal except that it's a way of defining the inverse, which we don't do anywhere else in the system today..

> > I feel like that's what you're suggesting with this approach,
> > otherwise I don't see it as really different from the 'DIRECT SELECT'
> > privilege discussed previously..
>
> Right. If you took that away, it wouldn't be different.

Ok.

> The number of possible approaches here has expanded beyond what I can
> keep in my head; I'm assuming you are planning to think this over and
> propose something comprehensive, or maybe Dean or someone else will do
> that. But I'm not sure that all the approaches proposed would make it
> safe for non-superusers to use RLS, and I think it would be good if
> they could.

I've been thinking about it quite a bit over the past few days (weeks?) and trying to continue to outline the proposals as they've changed. I'll try and work up another comprehensive email which covers the options currently under discussion as I understand them. Allowing non-superusers to use RLS is absolutely key to this in any case- it'd be great if you could voice any specific concerns you see there.

We've already been working through the GUCs previously discussed, as I feel those will be necessary for any of these approaches (in particular the "bypass RLS-or-error" GUC which pg_dump will enable by default).

Thanks,

Stephen
On 29 June 2014 20:42, Stephen Frost <sfrost@snowman.net> wrote: > To try and clarify what this distinction is- > > Dean's approach with GRANT allows specifying the policy to be > used when a given role queries a given table. Through this mechanism, > one role might have access to many different tables, possibly with a > different policy granting that access for each table. > > Robert's approach defines a policy for a user and that policy is used > for all tables that user accesses. This ties the policy very closely to > the role. > Actually I think they were both originally Robert's ideas in one form or another, but at this point I'm losing track a bit :-) > With either approach, I wonder how we are going to address the role > membership question. Do you inherit policies through role membership? > What happens on 'set role'? Robert points out that we should be using > "OR" for these situations of overlapping policies and I tend to agree. > Therefore, we would look at the RLS policies for a table and extract out > all of them for all of the roles which the current user is a member of, > OR them together and that would be the set of quals used. > Yes I think that's right. I had hoped to avoid overlapping policies, but maybe they're more-or-less inevitable and we should just allow them. It seems justifiable in terms of GRANTs --- one GRANT gives you permission to access one set of rows from a table, another GRANT gives you permission to access another set of rows, so in the end you have access to the union of both sets. > I'm leaning towards Dean's approach. With Robert's approach, one could > emulate Dean's approach but I suspect it would devolve quickly into one > policy per user with that policy simply being a proxy for the role > instead of being useful on its own. With Dean's approach though, I > don't think there's a need for a policy to be a stand-alone object. 
> The policy is simply a proxy for the set of quals to be added and therefore
> the policy could really live as a per-table object.

Yes I think that's right too. I had thought that stand-alone policies would be useful for selecting which policies to apply to each role, but maybe that's not necessary if you rely entirely on GRANTs to decide which policies apply.

>> That means the syntax I proposed earlier is wrong/misleading. Instead of
>>
>> GRANT SELECT ON TABLE tbl TO role USING polname;
>>
>> it should really be
>>
>> GRANT SELECT USING polname ON TABLE tbl TO role;
>
> This would work, though the 'polname' could be a per-table object, no?

Right.

> This could even be:
>
> GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;

Maybe. The important thing is that it's granting the role access to a {table,command,policy} set or equivalently a {table,command,quals} set --- i.e., the right to access a sub-set of the table's rows with a particular command.

Let's explore this further to see where it leads. In some ways, I think it has ended up even simpler than I thought. To set up RLS, you would just need to do 2 things:

1). Add a bunch of RLS policies to your tables (not connected to any particular commands, since that is done using GRANTs). This could use Robert's earlier syntax:

ALTER TABLE t1 ADD POLICY p1 WHERE p1_quals;
ALTER TABLE t1 ADD POLICY p2 WHERE p2_quals;
...

(or some similar syntax) where the policy names p1 and p2 need only be unique within the table. For maintenance purposes you'd also need to be able to do

ALTER TABLE t1 DROP POLICY pol;

and maybe in the future we'd support

ALTER TABLE t1 ALTER POLICY pol TO new_quals;

2). Once each table has the required set of policies, grant each role permissions, specifying the allowed commands and policies together:

GRANT SELECT USING p1 ON TABLE t1 TO role1;
GRANT SELECT USING p2 ON TABLE t1 TO role1;
GRANT UPDATE USING p3 ON TABLE t1 TO role1;
...
(or some similar syntax) So in this example, if role1 SELECTed from t1, the system would automatically apply the combined quals (p1_quals OR p2_quals), whereas when role1 UPDATEd t1, it would apply p3_quals. So that takes care of the different-quals-for-different-commands requirement without even needing any special syntax for it in ALTER TABLE. A straight "GRANT SELECT ON TABLE .. TO .." would grant access to the whole table without any RLS quals, as it always has done, which is good because it means nothing changes for users who aren't interested in RLS. Finally, pg_dump would require a GUC to ensure that RLS was not in effect. Perhaps something like SET require_direct_table_access = true, which would cause an error to be thrown if the user hadn't been granted straight select permissions on the tables in question. That all seems relatively easy to understand, whilst giving a lot of flexibility. An annoying complication, however, is how this interacts with column privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1 access to every row in col1, and I think that has to remain the case, since GRANTs only ever give you more access. But that leads to a situation where the RLS quals applied would depend on the columns selected. That could be avoided by consistent use of GRANT SELECT(col1,col2,...) USING p1 ON TABLE t1 TO role1; so that the same policy applied to all accessible columns. But what if different policies applied to different columns? Logically that would require the sets of quals for each of the selected columns to be ANDed together, or perhaps we would throw an error in that case. My inclination is to allow it, because it's probably as much effort to detect and forbid it. Despite this complication, I still quite like this approach because it seems to build naturally on existing technology, giving a lot of flexibility, without requiring too much additional syntax. Regards, Dean
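[Editor's sketch] The grant model Dean describes can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the policy predicates, the row layout, and the helper name `visible_rows` are all hypothetical, and a plain GRANT without USING (which would give access to the whole table) is deliberately not modeled.

```python
# Hypothetical policies; the predicates are invented for illustration only.
policies = {
    "p1": lambda row: row["dept"] == "sales",
    "p2": lambda row: row["public"],
    "p3": lambda row: row["owner"] == "role1",
}

# (role, command) -> policy names, mirroring
#   GRANT SELECT USING p1 ON TABLE t1 TO role1;
#   GRANT SELECT USING p2 ON TABLE t1 TO role1;
#   GRANT UPDATE USING p3 ON TABLE t1 TO role1;
grants = {
    ("role1", "SELECT"): ["p1", "p2"],
    ("role1", "UPDATE"): ["p3"],
}

def visible_rows(rows, role, command):
    """Rows reachable for this command: the OR of every granted policy."""
    names = grants.get((role, command), [])
    return [r for r in rows if any(policies[n](r) for n in names)]

t1 = [
    {"id": 1, "dept": "sales", "public": False, "owner": "role1"},
    {"id": 2, "dept": "hr", "public": True, "owner": "role2"},
    {"id": 3, "dept": "hr", "public": False, "owner": "role1"},
]

select_ids = [r["id"] for r in visible_rows(t1, "role1", "SELECT")]
update_ids = [r["id"] for r in visible_rows(t1, "role1", "UPDATE")]
print(select_ids, update_ids)
```

Here SELECT sees the rows matching p1 OR p2, while UPDATE is limited to the rows matching p3, which is the different-quals-for-different-commands behaviour described above.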
On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> An annoying complication, however, is how this interacts with column
> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1
> access to every row in col1, and I think that has to remain the case,
> since GRANTs only ever give you more access. But that leads to a
> situation where the RLS quals applied would depend on the columns
> selected.

Wow, that seems pretty horrible to me. That means that if I do:

SELECT a FROM tab;

and then:

SELECT a, b FROM tab;

...the second one might return fewer rows than the first one.

I think there's a good argument that RLS is unlike other grantable privileges, and that it really ought to be defined as something which is imposed rather than a kind of access grant. If RLS is merely a modifier to an access grant, then every access grant has to make sure to include that modifier, or you have a security hole. But if it's a separate constraint on access, then you just do it once, and exempt people from it only as needed. That seems less error-prone to me -- it's sort of a default-deny policy, which is generally viewed as good for security -- and it avoids weird cases like the above, which I think could easily break application logic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
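[Editor's sketch] The anomaly Robert points out can be reproduced with a toy model. The assumptions here are mine, not the thread's: column `a` is granted with no RLS qual, column `b` is granted with a hypothetical policy `b < 10`, and the quals of all selected columns are ANDed together as Dean suggested.

```python
# Hypothetical per-column grants: column -> row predicate (None = no RLS qual).
column_policy = {
    "a": None,                       # like GRANT SELECT(a) ON tab TO role1
    "b": lambda row: row["b"] < 10,  # like GRANT SELECT(b) USING p1 ON tab TO role1
}

def select(rows, columns):
    """Apply the AND of the quals attached to every selected column."""
    quals = [column_policy[c] for c in columns if column_policy[c] is not None]
    return [tuple(r[c] for c in columns) for r in rows if all(q(r) for q in quals)]

tab = [{"a": 1, "b": 5}, {"a": 2, "b": 50}, {"a": 3, "b": 7}]

only_a = select(tab, ["a"])        # SELECT a FROM tab;
a_and_b = select(tab, ["a", "b"])  # SELECT a, b FROM tab;
print(len(only_a), len(a_and_b))
```

Asking for one more column silently drops a row, which is the kind of surprise that could break application logic.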
On 1 July 2014 17:42, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> An annoying complication, however, is how this interacts with column >> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1 >> access to every row in col1, and I think that has to remain the case, >> since GRANTs only ever give you more access. But that leads to a >> situation where the RLS quals applied would depend on the columns >> selected. > > Wow, that seems pretty horrible to me. That means that if I do: > > SELECT a FROM tab; > > and then: > > SELECT a, b FROM tab; > > ...the second one might return fewer rows than the first one. > > I think there's a good argument that RLS is unlike other grantable > privileges, and that it really ought to be defined as something which > is imposed rather than a kind of access grant. If RLS is merely a > modifier to an access grant, then every access grant has to make sure > to include that modifier, or you have a security hole. But if it's a > separate constrain on access, then you just do it once, and exempt > people from it only as needed. That seems less error-prone to me -- > it's sort of a default-deny policy, which is generally viewed as good > for security -- and it avoids weird cases like the above, which I > think could easily break application logic. > That seems like a pretty strong argument. If RLS quals are instead regarded as constraints on access, and multiple policies apply, then it seems that the quals should now be combined with AND rather than OR, right? Regards, Dean
On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On 1 July 2014 17:42, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>> An annoying complication, however, is how this interacts with column >>> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1 >>> access to every row in col1, and I think that has to remain the case, >>> since GRANTs only ever give you more access. But that leads to a >>> situation where the RLS quals applied would depend on the columns >>> selected. >> >> Wow, that seems pretty horrible to me. That means that if I do: >> >> SELECT a FROM tab; >> >> and then: >> >> SELECT a, b FROM tab; >> >> ...the second one might return fewer rows than the first one. >> >> I think there's a good argument that RLS is unlike other grantable >> privileges, and that it really ought to be defined as something which >> is imposed rather than a kind of access grant. If RLS is merely a >> modifier to an access grant, then every access grant has to make sure >> to include that modifier, or you have a security hole. But if it's a >> separate constrain on access, then you just do it once, and exempt >> people from it only as needed. That seems less error-prone to me -- >> it's sort of a default-deny policy, which is generally viewed as good >> for security -- and it avoids weird cases like the above, which I >> think could easily break application logic. > > That seems like a pretty strong argument. > > If RLS quals are instead regarded as constraints on access, and > multiple policies apply, then it seems that the quals should now be > combined with AND rather than OR, right? Yeah, maybe. I intuitively feel that OR would be more useful, so it would be nice to find a design where that makes sense. But it depends a lot, in my view, on what syntax we end up with. 
For example, suppose we add just one command:

ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;

If the given role inherits from multiple roles that have different filters, I think the user will naturally expect all of the filters to be applied.

But you could do it other ways. For example:

ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

If a table is set to NO ROW LEVEL SECURITY then it behaves just like it does now: anyone who accesses it sees all the rows, restricted to those columns for which they have permission. If the table is set to ROW LEVEL SECURITY then the default is to show no rows. The second command then allows access to a subset of the rows for a given role name. In this case, it is probably logical for access to be combined via OR.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 01/07/14 21:51, Robert Haas wrote: > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> >> That seems like a pretty strong argument. >> >> If RLS quals are instead regarded as constraints on access, and >> multiple policies apply, then it seems that the quals should now be >> combined with AND rather than OR, right? > Yeah, maybe. I intuitively feel that OR would be more useful, so it > would be nice to find a design where that makes sense. Looking at the use cases we described earlier in http://www.postgresql.org/message-id/attachment/32196/mini-rim.sql I see more OR than AND, for instance 'if the row is sensitive then the user must be related to the row' which translates to (NOT sensitive) OR the user is related. An addition to that rule could be a breakglass method or other reasons to access, e.g. (NOT sensitive) OR user is related OR break glass OR legally required access. > But it depends > a lot, in my view, on what syntax we end up with. For example, > suppose we add just one command: > > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual; > > If the given role inherits from multiple roles that have different > filters, I think the user will naturally expect all of the filters to > be applied. Suppose a building administrator gives a single person that has multiple roles multiple key cards to access appropriate rooms in a building. You could draw a venn diagram of the rooms those key cards open, and the intuition here probably is that the person can enter any room if one of the key cards matches, not all cards. > But you could do it other ways. For example: > > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; > ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual; > > If a table is set to NO ROW LEVEL SECURITY then it behaves just like > it does now: anyone who accesses it sees all the rows, restricted to > those columns for which they have permission. 
If the table is set to > ROW LEVEL SECURITY then the default is to show no rows. The second > command then allows access to a subset of the rows for a give role > name. In this case, it is probably logical for access to be combined > via OR. > regards, Yeb Havinga
* Robert Haas (robertmhaas@gmail.com) wrote: > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > If RLS quals are instead regarded as constraints on access, and > > multiple policies apply, then it seems that the quals should now be > > combined with AND rather than OR, right? I do feel that RLS quals are constraints on access, but I don't see how it follows that multiple quals should be AND'd together because of that. I view the RLS policies on each table as being independent and "standing alone" regarding what can be seen. If you have access to a table today through policy A, and then later policy B is added, using AND would mean that the set of rows returned is less than if only policy A existed. That doesn't seem correct to me. > Yeah, maybe. I intuitively feel that OR would be more useful, so it > would be nice to find a design where that makes sense. But it depends > a lot, in my view, on what syntax we end up with. For example, > suppose we add just one command: > > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual; > > If the given role inherits from multiple roles that have different > filters, I think the user will naturally expect all of the filters to > be applied. Agreed. > But you could do it other ways. For example: > > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; > ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual; > > If a table is set to NO ROW LEVEL SECURITY then it behaves just like > it does now: anyone who accesses it sees all the rows, restricted to > those columns for which they have permission. If the table is set to > ROW LEVEL SECURITY then the default is to show no rows. The second > command then allows access to a subset of the rows for a give role > name. In this case, it is probably logical for access to be combined > via OR. 
I can see value in having a table-level option to indicate if RLS is applied for that table or not, but I had been thinking we'd just automatically manage that. That is to say that once you define an RLS policy for a table, we go look and see what policy should be applied in each case. With the user able to control that, what happens if they say "row security" on the table and there are no policies? All access would show the table as empty? What if policies exist and they decide to 'turn off' RLS for the table- suddenly everyone can see all the rows?

My answers to the above (which are making me like the idea more, actually...) would be:

Yes, if they turn on RLS for the table and there aren't any policies, then the table appears empty for anyone with normal SELECT rights (table owner and superusers would still see everything).

If policies exist and the user asks to turn off RLS, I'd throw an ERROR as there is a security risk there. We could support a CASCADE option which would go and drop the policies from the table first.

Otherwise, I'm generally liking Dean's thoughts in http://www.postgresql.org/message-id/CAEZATCVftksFH=X+9mVmBNMZo5KsUP+RK0kb4oRO92JOfjO29g@mail.gmail.com along with the table-level "enable RLS" option.

Are we getting to a point where there is sufficient agreement that it'd be worthwhile to really start implementing this? I'd suggest that we either forgo or at least table the notion of per-column policy definitions- RLS controls whole rows and so I don't feel that per-column policies really make sense.

Thanks,

Stephen
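[Editor's sketch] Stephen's objection to AND, that adding policy B should not shrink what policy A already allowed, can be checked with sets. The two predicates below are arbitrary stand-ins chosen for illustration; nothing about them comes from the thread.

```python
rows = range(10)

# Two arbitrary, hypothetical policies over integer "rows".
policy_a = lambda x: x < 5
policy_b = lambda x: x % 2 == 0

only_a = {x for x in rows if policy_a(x)}
a_or_b = {x for x in rows if policy_a(x) or policy_b(x)}
a_and_b = {x for x in rows if policy_a(x) and policy_b(x)}

# With OR, adding policy B can only keep or widen what A allowed.
# With AND, adding policy B can only keep or narrow it.
print(sorted(only_a), sorted(a_or_b), sorted(a_and_b))
```

Which direction is the right one depends on whether a policy is read as a grant of access (OR) or as a restriction on access (AND), which is the disagreement running through this thread.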
On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost <sfrost@snowman.net> wrote: >> But you could do it other ways. For example: >> >> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; >> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual; >> >> If a table is set to NO ROW LEVEL SECURITY then it behaves just like >> it does now: anyone who accesses it sees all the rows, restricted to >> those columns for which they have permission. If the table is set to >> ROW LEVEL SECURITY then the default is to show no rows. The second >> command then allows access to a subset of the rows for a give role >> name. In this case, it is probably logical for access to be combined >> via OR. > > I can see value is having a table-level option to indicate if RLS is > applied for that table or not, but I had been thinking we'd just > automatically manage that. That is to say that once you define an RLS > policy for a table, we go look and see what policy should be applied in > each case. With the user able to control that, what happens if they say > "row security" on the table and there are no policies? All access would > show the table as empty? I said the same thing in the text you quoted immediately above this reply. > What if policies exist and they decide to > 'turn off' RLS for the table- suddenly everyone can see all the rows? That'd be my vote. Sorta like disabling triggers. > Are we getting to a point where there is sufficient agreement that it'd > be worthwhile to really start implementing this? I think we're converging, but it might be a good idea to summarize a specific proposal before you start implementing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost <sfrost@snowman.net> wrote: > >> But you could do it other ways. For example: > >> > >> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; > >> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual; > >> > >> If a table is set to NO ROW LEVEL SECURITY then it behaves just like > >> it does now: anyone who accesses it sees all the rows, restricted to > >> those columns for which they have permission. If the table is set to > >> ROW LEVEL SECURITY then the default is to show no rows. The second > >> command then allows access to a subset of the rows for a give role > >> name. In this case, it is probably logical for access to be combined > >> via OR. > > > > I can see value is having a table-level option to indicate if RLS is > > applied for that table or not, but I had been thinking we'd just > > automatically manage that. That is to say that once you define an RLS > > policy for a table, we go look and see what policy should be applied in > > each case. With the user able to control that, what happens if they say > > "row security" on the table and there are no policies? All access would > > show the table as empty? > > I said the same thing in the text you quoted immediately above this reply. huh. Somehow I managed to only read the first sentence in that paragraph. Clearly I need to go get (more) coffee. Still- sounds like agreement. :) > > What if policies exist and they decide to > > 'turn off' RLS for the table- suddenly everyone can see all the rows? > > That'd be my vote. Sorta like disabling triggers. Hmm. Ok- how would you feel about at least spitting out a WARNING if there are still policies on the table in that case..? Just makes me a bit nervous to have a case where policies can be defined on a table but are not actually being enforced.. 
> > Are we getting to a point where there is sufficient agreement that it'd > > be worthwhile to really start implementing this? > > I think we're converging, but it might be a good idea to summarize a > specific proposal before you start implementing. Right- will do so later today. Thanks! Stephen
On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost <sfrost@snowman.net> wrote: >> > What if policies exist and they decide to >> > 'turn off' RLS for the table- suddenly everyone can see all the rows? >> >> That'd be my vote. Sorta like disabling triggers. > > Hmm. Ok- how would you feel about at least spitting out a WARNING if > there are still policies on the table in that case..? Just makes me a > bit nervous to have a case where policies can be defined on a table but > are not actually being enforced.. Sounds like nanny-ism to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost <sfrost@snowman.net> wrote: > >> > What if policies exist and they decide to > >> > 'turn off' RLS for the table- suddenly everyone can see all the rows? > >> > >> That'd be my vote. Sorta like disabling triggers. > > > > Hmm. Ok- how would you feel about at least spitting out a WARNING if > > there are still policies on the table in that case..? Just makes me a > > bit nervous to have a case where policies can be defined on a table but > > are not actually being enforced.. > > Sounds like nanny-ism to me. Alright, fair enough. Clearly, the individual changing the RLS on the table will have to have appropriate rights to do so. Thanks, Stephen
Robert, all,

* Robert Haas (robertmhaas@gmail.com) wrote:
> I think we're converging, but it might be a good idea to summarize a
> specific proposal before you start implementing.

Alright, apologies for it being a bit later than intended, but here's what I've come up with thus far.

-- policies defined at a table scope
-- allows using the same policy name for different tables
-- with quals appropriate for each table
ALTER TABLE t1 ADD POLICY p1 USING p1_quals;
ALTER TABLE t1 ADD POLICY p2 USING p2_quals;

-- used to drop a policy definition from a table
ALTER TABLE t1 DROP POLICY p1;

-- cascade required when references exist for the policy
-- from roles
ALTER TABLE t1 DROP POLICY p1 CASCADE;

ALTER TABLE t1 ALTER POLICY p1 USING new_quals;

-- Controls if any RLS is applied to this table or not
-- If enabled, all users must access through some policy
ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;

-- Associates roles to policies
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1;
ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1;

-- "all" provides a policy which equates to full access (eg: 'true' or
-- 'direct' access). Used to explicitly state when RLS can be bypassed
-- and therefore a GUC can be set which says "bypass-RLS-or-error" and
-- not have an error if this policy is granted to the role.
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING all;

-- Per-command-type control
ALTER TABLE table_name GRANT SELECT ROW ACCESS TO role_name USING all;
ALTER TABLE table_name GRANT UPDATE ROW ACCESS TO role_name USING all;

Policies for a table are checked against pg_has_role() and all which apply are OR'd together.

Added to pg_class:
    relrlsenabled boolean

pg_rowsecurity
    oid oid
    rlsrel oid
    rlspol name
    rlsquals text
    rlsacls aclitem[]..? cmdtype(s)+ role

If relrlsenabled then scan pg_rowsecurity for the policies associated with the table, testing each to see if any apply for the current role based on pg_has_role() against the aclitem array.
Any which apply are added and OR'd together. Thoughts? Thanks, Stephen
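[Editor's sketch] The lookup described in this proposal might be modeled roughly as follows. Everything here is an assumption made for illustration: the `Policy` dataclass stands in for a pg_rowsecurity row, a plain set of role names stands in for the aclitem array, `has_role` is a rough stand-in for pg_has_role(), and owner/superuser bypass, PUBLIC, and per-command grants are left out.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    """One hypothetical pg_rowsecurity row."""
    rlsrel: str       # table the policy belongs to
    rlspol: str       # policy name, unique within the table
    rlsquals: str     # qual text
    roles: frozenset  # stand-in for the aclitem[] column

def has_role(user, role, memberships):
    """Rough stand-in for pg_has_role(): direct or inherited membership."""
    seen, stack = set(), [user]
    while stack:
        current = stack.pop()
        if current == role:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(memberships.get(current, ()))
    return False

def rls_qual(table, user, rls_enabled, catalog, memberships):
    """Return None (no filter), 'false' (no rows), or the OR'd qual text."""
    if not rls_enabled.get(table, False):  # relrlsenabled is off
        return None
    quals = [p.rlsquals for p in catalog
             if p.rlsrel == table
             and any(has_role(user, r, memberships) for r in p.roles)]
    return " OR ".join(f"({q})" for q in quals) if quals else "false"

catalog = [
    Policy("t1", "p1", "dept = 'sales'", frozenset({"sales"})),
    Policy("t1", "p2", "is_public", frozenset({"staff"})),
]
memberships = {"alice": ["sales"], "sales": ["staff"], "bob": []}
enabled = {"t1": True}

alice_qual = rls_qual("t1", "alice", enabled, catalog, memberships)
bob_qual = rls_qual("t1", "bob", enabled, catalog, memberships)
t2_qual = rls_qual("t2", "alice", enabled, catalog, memberships)
print(alice_qual, bob_qual, t2_qual)
```

A role with no applicable policy on an RLS-enabled table gets a constant-false qual, which corresponds to the "table appears empty" behaviour discussed earlier; a table without RLS enabled gets no filter at all.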
Sorry for my late responding, now I'm catching up the discussion. > * Robert Haas (robertmhaas@gmail.com) wrote: > > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean.a.rasheed@gmail.com> > wrote: > > > If RLS quals are instead regarded as constraints on access, and > > > multiple policies apply, then it seems that the quals should now be > > > combined with AND rather than OR, right? > > I do feel that RLS quals are constraints on access, but I don't see how > it follows that multiple quals should be AND'd together because of that. > I view the RLS policies on each table as being independent and "standing > alone" regarding what can be seen. If you have access to a table today > through policy A, and then later policy B is added, using AND would mean > that the set of rows returned is less than if only policy A existed. > That doesn't seem correct to me. > It seems to me direction of the constraints (RLS-policy) works to is reverse. In case when we have no RLS-policy, 100% of rows are visible isn't it? Addition of a constraint usually reduces the number of rows being visible, or same number of rows at least. Constraint shall never work to the direction to increase the number of rows being visible. If multiple RLS-policies are connected with OR-operator, the first policy works to the direction to reduce number of visible rows, but the second policy works to the reverse direction. If we would have OR'd RLS-policy, how does it merged with user given qualifiers with? For example, if RLS-policy of t1 is (t1.credential < get_user_credential) and user's query is: SELECT * FROM t1 WHERE t1.x = t1.x; Do you think RLS-policy shall be merged with OR'd form? > > Yeah, maybe. I intuitively feel that OR would be more useful, so it > > would be nice to find a design where that makes sense. But it depends > > a lot, in my view, on what syntax we end up with. 
For example, > > suppose we add just one command: > > > > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual; > > > > If the given role inherits from multiple roles that have different > > filters, I think the user will naturally expect all of the filters to > > be applied. > > Agreed. > > > But you could do it other ways. For example: > > > > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; ALTER TABLE > > table_name GRANT ROW ACCESS TO role_name USING qual; > > > > If a table is set to NO ROW LEVEL SECURITY then it behaves just like > > it does now: anyone who accesses it sees all the rows, restricted to > > those columns for which they have permission. If the table is set to > > ROW LEVEL SECURITY then the default is to show no rows. The second > > command then allows access to a subset of the rows for a give role > > name. In this case, it is probably logical for access to be combined > > via OR. > > I can see value is having a table-level option to indicate if RLS is applied > for that table or not, but I had been thinking we'd just automatically manage > that. That is to say that once you define an RLS policy for a table, we > go look and see what policy should be applied in each case. With the user > able to control that, what happens if they say "row security" on the table > and there are no policies? All access would show the table as empty? What > if policies exist and they decide to 'turn off' RLS for the table- suddenly > everyone can see all the rows? > > My answers to the above (which are making me like the idea more, > actually...) would be: > > Yes, if they turn on RLS for the table and there aren't any policies, then > the table appears empty for anyone with normal SELECT rights (table owner > and superusers would still see everything). > > If policies exist and the user asks to turn off RLS, I'd throw an ERROR > as there is a security risk there. 
> We could support a CASCADE option which
> would go and drop the policies from the table first.
>
Hmm... This approach starts from the empty permission then adds permission to reference a particular range of the configured table. It's one attitude. However, I think it has a dark side we cannot ignore. Usually, the purpose of a security mechanism is to ensure that what is readable/writable follows the rules. Once multiple RLS policies are merged in OR'd form, the results are unpredictable.

Please assume there are two individual applications that use RLS on table-X. Even if application-1 wants only "public" rows to be visible, "credential" or "secret" rows may be exposed through interaction with an orthogonal policy configured by application-2 (which may configure its policy according to the source IP address). It seems to me application-2 partially invalidates the RLS policy configured by application-1.

I think an important characteristic is that things meant to be invisible stay invisible even when multiple rules are configured.

> Otherwise, I'm generally liking Dean's thoughts in
> http://www.postgresql.org/message-id/CAEZATCVftksFH=X+9mVmBNMZo5KsUP+RK0kb4oRO92JOfjO29g@mail.gmail.com
> along with the table-level "enable RLS" option.
>
> Are we getting to a point where there is sufficient agreement that it'd
> be worthwhile to really start implementing this? I'd suggest that we either
> forgo or at least table the notion of per-column policy
> definitions- RLS controls whole rows and so I don't feel that per-column
> policies really make sense.
>
Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
Kaigai,
On Thursday, July 3, 2014, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
Can you clarify where this is coming from..? It sounds like you're referring to an existing implementation and, if so, it'd be good to get more information on how that works exactly.
You are suggesting instead that if application 2 sets up policies on the table and then application 1 adds another policy that it should reduce what application 2's users can see? That doesn't make any sense to me. I'd actually expect these applications to at least use different roles anyway, which means they could each have a single role specific policy which only returns what that application is allowed to see.
On Thursday, July 3, 2014, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
Sorry for my late responding, now I'm catching up the discussion.
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean.a.rasheed@gmail.com>
> wrote:
> > > If RLS quals are instead regarded as constraints on access, and
> > > multiple policies apply, then it seems that the quals should now be
> > > combined with AND rather than OR, right?
>
> I do feel that RLS quals are constraints on access, but I don't see how
> it follows that multiple quals should be AND'd together because of that.
> I view the RLS policies on each table as being independent and "standing
> alone" regarding what can be seen. If you have access to a table today
> through policy A, and then later policy B is added, using AND would mean
> that the set of rows returned is less than if only policy A existed.
> That doesn't seem correct to me.
>
It seems to me direction of the constraints (RLS-policy) works to is reverse.
In case when we have no RLS-policy, 100% of rows are visible isn't it?
No, as outlined later, the table would appear empty if no policies exist and RLS is enabled for the table.
Addition of a constraint usually reduces the number of rows being visible,
or same number of rows at least. Constraint shall never work to the direction
to increase the number of rows being visible.
If multiple RLS-policies are connected with OR-operator, the first policy
works to the direction to reduce number of visible rows, but the second
policy works to the reverse direction.
This isn't accurate, as mentioned. Each policy stands alone to define what is visible through it and if no policy exists then no rows are visible.
If we would have OR'd RLS-policy, how does it merged with user given
qualifiers with?
The RLS quals are all applied together with ORs and the result is AND'd with any user quals provided. This is only when multiple policies are being applied for a given query and seems pretty straightforward to me.
For example, if RLS-policy of t1 is (t1.credential < get_user_credential)
and user's query is:
SELECT * FROM t1 WHERE t1.x = t1.x;
Do you think RLS-policy shall be merged with OR'd form?
Only the RLS policies are OR'd together, not user-provided quals. The above would result in:
WHERE t1.x = t1.x AND (t1.credential < get_user_credential)
If another policy also applies for this query, such as t1.cred2 < get_user_credential, then we would have:
WHERE t1.x = t1.x AND (t1.credential < get_user_credential OR t1.cred2 < get_user_credential)
This is similar to how roles work- your overall access includes all access granted to any roles you are a member of. You don't need SELECT rights granted to every role you are a member of to select from the table. Additionally, if an admin wants to AND the quals together then they can simply create a policy which does that rather than have 2 policies.
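[Editor's sketch] The combination rule stated here (policy quals OR'd with each other, then AND'd with the user's own quals) can be written as a small string-building helper. The function name `combine_quals` is hypothetical, and treating "no applicable policy" as a constant-false qual is an assumption based on the default-deny behaviour described earlier in the thread.

```python
def combine_quals(user_qual, policy_quals):
    """OR the policy quals together, then AND the result with the user qual."""
    if policy_quals:
        rls = "(" + " OR ".join(policy_quals) + ")"
    else:
        rls = "false"  # assumption: RLS enabled but no applicable policy
    return f"{user_qual} AND {rls}" if user_qual else rls

one_policy = combine_quals(
    "t1.x = t1.x",
    ["t1.credential < get_user_credential"],
)
two_policies = combine_quals(
    "t1.x = t1.x",
    ["t1.credential < get_user_credential", "t1.cred2 < get_user_credential"],
)
print("WHERE " + one_policy)
print("WHERE " + two_policies)
```

The user's qual can only narrow the result further; it is never OR'd with a policy, so it cannot widen what the policies allow.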
> > Yeah, maybe. I intuitively feel that OR would be more useful, so it
> > would be nice to find a design where that makes sense. But it depends
> > a lot, in my view, on what syntax we end up with. For example,
> > suppose we add just one command:
> >
> > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;
> >
> > If the given role inherits from multiple roles that have different
> > filters, I think the user will naturally expect all of the filters to
> > be applied.
>
> Agreed.
>
> > But you could do it other ways. For example:
> >
> > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; ALTER TABLE
> > table_name GRANT ROW ACCESS TO role_name USING qual;
> >
> > If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> > it does now: anyone who accesses it sees all the rows, restricted to
> > those columns for which they have permission. If the table is set to
> > ROW LEVEL SECURITY then the default is to show no rows. The second
> > command then allows access to a subset of the rows for a given role
> > name. In this case, it is probably logical for access to be combined
> > via OR.
>
> I can see value in having a table-level option to indicate if RLS is applied
> for that table or not, but I had been thinking we'd just automatically manage
> that. That is to say that once you define an RLS policy for a table, we
> go look and see what policy should be applied in each case. With the user
> able to control that, what happens if they say "row security" on the table
> and there are no policies? All access would show the table as empty? What
> if policies exist and they decide to 'turn off' RLS for the table- suddenly
> everyone can see all the rows?
>
> My answers to the above (which are making me like the idea more,
> actually...) would be:
>
> Yes, if they turn on RLS for the table and there aren't any policies, then
> the table appears empty for anyone with normal SELECT rights (table owner
> and superusers would still see everything).
>
> If policies exist and the user asks to turn off RLS, I'd throw an ERROR
> as there is a security risk there. We could support a CASCADE option which
> would go and drop the policies from the table first.
>
Hmm... This approach starts from the empty permission then adds permission
to reference a particular range of the configured table. It's one attitude.
Right- just like how our grant system works.
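A rough Python model of the table-level switch as described in the quoted text: no rows by default once RLS is on, owner and superuser exempt, and an error when turning RLS off while policies exist unless a cascade is requested. All names here (`Table`, `visible_rows`, `set_row_level_security`) are made up for illustration, and policies are not yet tied to roles in this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    rows: list
    owner: str
    rls_enabled: bool = False
    # policy name -> predicate(row) -> bool (hypothetical in-memory catalog)
    policies: dict = field(default_factory=dict)


def visible_rows(table, role, is_superuser=False):
    """Rows a role can see under the proposed on/off flag semantics."""
    # NO ROW LEVEL SECURITY, or an exempt role: behaves as it does today.
    if not table.rls_enabled or is_superuser or role == table.owner:
        return list(table.rows)
    # any() over zero policies is False, so the table appears empty.
    return [r for r in table.rows if any(p(r) for p in table.policies.values())]


def set_row_level_security(table, enabled, cascade=False):
    """Toggle the flag; refuse to silently expose rows guarded by policies."""
    if not enabled and table.policies:
        if not cascade:
            raise RuntimeError("policies exist on table; use cascade to drop them")
        table.policies.clear()
    table.rls_enabled = enabled


t = Table(rows=[1, 2, 3], owner="alice")
set_row_level_security(t, True)
print(visible_rows(t, "bob"))    # no policies yet
t.policies["p1"] = lambda r: r > 1
print(visible_rows(t, "bob"))    # rows allowed by p1
print(visible_rows(t, "alice"))  # owner is exempt
```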
However, I think it has a dark side we cannot ignore. Usually, the purpose
of a security mechanism is to ensure what is readable/writable according to
the rules. Once multiple RLS-policies are merged in OR'd form, the results
are unpredictable.
I don't see how it's unpredictable at all.
Please assume there are two individual applications that use RLS on table-X.
Even if application-1 wants only "public" rows to be visible, it may
expose "credential" or "secret" rows through interaction with an orthogonal
policy configured by application-2 (which may configure its policy according
to the source ip-address). It seems to me application-2 partially invalidates
the RLS-policy configured by application-1.
I think an important characteristic is that things meant to be invisible stay
invisible even when multiple rules are configured.
This is addressed through the ability to associate roles to policies.
Thanks,
Stephen
> Kaigai,
>
> On Thursday, July 3, 2014, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>
> > Sorry for my late responding, now I'm catching up the discussion.
> > > * Robert Haas (robertmhaas@gmail.com) wrote:
> > > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed
> > > <dean.a.rasheed@gmail.com> wrote:
> > > > If RLS quals are instead regarded as constraints on access, and
> > > > multiple policies apply, then it seems that the quals should now be
> > > > combined with AND rather than OR, right?
> >
> > I do feel that RLS quals are constraints on access, but I don't see how
> > it follows that multiple quals should be AND'd together because of that.
> > I view the RLS policies on each table as being independent and "standing
> > alone" regarding what can be seen. If you have access to a table today
> > through policy A, and then later policy B is added, using AND would mean
> > that the set of rows returned is less than if only policy A existed.
> > That doesn't seem correct to me.
>
> > It seems to me direction of the constraints (RLS-policy) works to is
> > reverse.
> > In case when we have no RLS-policy, 100% of rows are visible isn't it?
>
> No, as outlined later, the table would appear empty if no policies exist
> and RLS is enabled for the table.
>
> Addition of a constraint usually reduces the number of rows being visible,
> or same number of rows at least. Constraint shall never work to the
> direction to increase the number of rows being visible.
>
> Can you clarify where this is coming from..? It sounds like you're
> referring to an existing implementation and, if so, it'd be good to get
> more information on how that works exactly.
>
Oracle VPD - Multiple Policies for Each Table, View, or Synonym
http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351

It says - Note that all policies applied to a table are enforced with AND syntax.

Not only Oracle VPD, it fits attitude of defense in depth. Please assume a system that installs network firewall, unix permissions and selinux. If somebody wants to reference an information asset within a file, he has to connect the server from the network address being allowed by the firewall configuration AND both of DAC and MAC has to allow his access. Usually, we have to pass all the access control to reference the target information, not one of the access control stuffs being installed.

> For example, if RLS-policy of t1 is (t1.credential < get_user_credential)
> and user's query is:
> SELECT * FROM t1 WHERE t1.x = t1.x;
> Do you think RLS-policy shall be merged with OR'd form?
>
> Only the RLS policies are OR'd together, not user provided quals. The above
> would result in:
>
> Where t1.x = t1.x and (t1.credential < get_user_credential)
>
> If another policy also applies for this query, such as t1.cred2 <
> get_user_credential then we would have:
>
> Where t1.x = t1.x and (t1.credential < get_user_credential OR t1.cred2 <
> get_user_credential)
>
> This is similar to how roles work- your overall access includes all access
> granted to any roles you are a member of. You don't need SELECT rights granted
> to every role you are a member of to select from the table. Additionally,
> if an admin wants to AND the quals together then they can simply create
> a policy which does that rather than have 2 policies.
>
It seems to me a pain on database administration, if we have to pay attention not to conflict each RLS-policy. I expect 90% of RLS-policy will be configured to PUBLIC user, to apply everybody same rules on access. In this case, DBA has to ensure the target table has no policy or existing policy does not conflict with the new policy to be set. I don't think it is a good idea to enforce DBA these checks.

> Please assume here are two individual applications that use RLS on table-X.
> Even if application-1 want only rows being "public" become visible, it may
> expose "credential" or "secret" rows by interaction of orthogonal policy
> configured by application-2 (that may configure the policy according to the
> source ip-address). It seems to me application-2 partially invalidated the
> RLS-policy configured by application-1.
>
> You are suggesting instead that if application 2 sets up policies on the
> table and then application 1 adds another policy that it should reduce what
> application 2's users can see? That doesn't make any sense to me. I'd
> actually expect these applications to at least use different roles anyway,
> which means they could each have a single role specific policy which only
> returns what that application is allowed to see.
>
I don't think this assumption is reasonable.
Please expect two applications: app-X that is a database security product to apply access control based on remote ip-address of the client for any table accesses by any database roles. app-Y that is a usual enterprise package for daily business data, with RLS-policy.
What is the expected behavior in this case?
App-X provides overall access control towards whole of the database. So, it expects any client out of 192.168.0.0/16 should not reference any credential information for example. How does it interact with the RLS-policy by app-Y?
If RLS-policies are merged with OR'd form, it seems to me it invalidate control of app-Y if client connected from inside of 192.168.0.0/16 or if client connects with a particular app-Y's role from out of 192.168.0.0/16.
How to solve the situation above?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
Kaigai,

* Kouhei Kaigai (kaigai@ak.jp.nec.com) wrote:
> > Can you clarify where this is coming from..? It sounds like you're
> > referring to an existing implementation and, if so, it'd be good to get
> > more information on how that works exactly.
>
> Oracle VPD - Multiple Policies for Each Table, View, or Synonym
> http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351
>
> It says - Note that all policies applied to a table are enforced with AND syntax.

While I'm not against using this as an example to consider, it's much more complex than what we're talking about here- and it supports application contexts which allow groups of RLS rights to be applied or not applied; essentially it allows both "AND" and "OR" for sets of RLS policies, along with "default" policies which are applied no matter what.

> Not only Oracle VPD, it fits attitude of defense in depth.
> Please assume a system that installs network firewall, unix permissions
> and selinux. If somebody wants to reference an information asset within
> a file, he has to connect the server from the network address being allowed
> by the firewall configuration AND both of DAC and MAC has to allow his
> access.

These are not independent systems and your argument would apply to our GRANT system also, which I hope it's agreed would make it far less useful. Note also that SELinux brings in another complexity- it needs to make system calls out to check the access.

> Usually, we have to pass all the access control to reference the target
> information, not one of the access control stuffs being installed.

This is true in some cases, and not in others. Only one role you are a member of needs to have access to a relation, not all of them. There are other examples of 'OR'-style security policies, this is merely one. I'm simply not convinced that it applies in the specific case we're talking about.

In the end, I expect that either way people will be upset because they won't be able to specify fully which should be AND vs. which should be OR with the kind of flexibility other systems provide. What I'm trying to get to is an initial implementation which is generally useful and is able to add such support later.

> > This is similar to how roles work- your overall access includes all access
> > granted to any roles you are a member of. You don't need SELECT rights granted
> > to every role you are a member of to select from the table. Additionally,
> > if an admin wants to AND the quals together then they can simply create
> > a policy which does that rather than have 2 policies.
> >
> It seems to me a pain on database administration, if we have to pay attention
> not to conflict each RLS-policy.

This notion of 'conflict' doesn't make much sense to me. What is 'conflicting' here? Each policy would simply need to stand on its own for the role which it's being applied to. That's very simple and straight-forward.

> I expect 90% of RLS-policy will be configured
> to PUBLIC user, to apply everybody same rules on access. In this case, DBA
> has to ensure the target table has no policy or existing policy does not
> conflict with the new policy to be set.
> I don't think it is a good idea to enforce DBA these checks.

If the DBA only uses PUBLIC then they have to ensure that each policy they set up for PUBLIC can stand on its own- though, really, I expect if they go that route they'd end up with just one policy that calls a stored procedure...

> > You are suggesting instead that if application 2 sets up policies on the
> > table and then application 1 adds another policy that it should reduce what
> > application 2's users can see? That doesn't make any sense to me. I'd
> > actually expect these applications to at least use different roles anyway,
> > which means they could each have a single role specific policy which only
> > returns what that application is allowed to see.
> >
> I don't think this assumption is reasonable.
> Please expect two applications: app-X that is a database security product
> to apply access control based on remote ip-address of the client for any
> table accesses by any database roles. app-Y that is a usual enterprise
> package for daily business data, with RLS-policy.
> What is the expected behavior in this case?

That the DBA manage the rights on the tables. I expect that will be required for quite a while with PG. It's nice to think of these application products that will manage all access for users by setting up their own policies, but we have yet to even discuss how they would have appropriate rights on the table to be able to do so (and to not interfere with each other..).

Let's at least get something which is generally useful in. I'm all for trying to plan out how to get there and would welcome suggestions you have which are specific to PG on what we could do here (I'm not keen on just trying to mimic another product completely...), but at the level we're talking about (either AND them all or OR them all), I don't think we'd actually solve the use-cases you're describing with either answer.

Without getting to the full level of having the flexibility to choose which policies should be AND'd and which should be OR'd, do you see an issue with adding initial support where each policy has to stand on its own and then working to address the more complex cases later?

Thanks,
Stephen
On Thu, Jul 3, 2014 at 1:14 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Alright, apologies for it being a bit later than intended, but here's
> what I've come up with thus far.
>
> -- policies defined at a table scope
> -- allows using the same policy name for different tables
> -- with quals appropriate for each table
> ALTER TABLE t1 ADD POLICY p1 USING p1_quals;
> ALTER TABLE t1 ADD POLICY p2 USING p2_quals;
>
> -- used to drop a policy definition from a table
> ALTER TABLE t1 DROP POLICY p1;
>
> -- cascade required when references exist for the policy
> -- from roles
> ALTER TABLE t1 DROP POLICY p1 CASCADE;
>
> ALTER TABLE t1 ALTER POLICY p1 USING new_quals;
>
> -- Controls if any RLS is applied to this table or not
> -- If enabled, all users must access through some policy
> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
>
> -- Associates roles to policies
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1;
> ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1;

If you're going to have predicates be table-level and access grants be table-level, then what's the value in having policies? You could just do:

ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;

As I see it, the only value in having policies as separate objects is that you can then, by granting access to the policy, give a particular user a bundle of rights rather than having to grant each right individually. But with this design, you've got to create the policy, then add the quals to it for each table, and then you still have to give access individually for every <row, table> combination, so what value is the policy object itself providing?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi all

I was jotting notes about this last sleepless night, and was really glad to see the suggestion of enabling RLS on a table being a requirement for OR-style quals suggested in the thread when I woke.

The only sane way to do OR-ing of multiple rules is to require that tables be switched to deny-by-default before RLS quals can be added to then selectively enable access.

The next step is DENY rules that override ALLOW rules, and are also ORed, so any DENY rule overrides any ALLOW rule. Like in ACLs. But that can be a "later" - I just think room for it should be left in any catalog definition.

My concern with the talk of policies, etc, is with making it possible to implement this for 9.5. I'd really like to see a robust declarative row-security framework with access policies - but I'm not sure it's a good idea to try to assemble policies directly out of low level row security predicates.

Tying things into a policy model that isn't tried or tested might create more problems than it solves unless we implement multiple real-world test cases on top of the model to show it works.

For how I think we should be pursuing this in the long run, take a look at how TeraData does it, with hierarchical and non-hierarchical rules - basically bitmaps or thresholds - that get grouped into access policies. It's a very good way to abstract the low level stuff. If we have low level table predicate filters, we can build this sort of thing on top.

For 9.5, unless the basics turn out to be way easier than they look and it's all done soon in the release process, surely we should be sticking to just getting the basics of row security in place? Leaving room for enhancement, sure, but sticking to the core feature which to my mind is:

- A row security on/off flag for a table;

- Room in the catalogs for multiple row security rules per table and a type flag for them. The initial type flag, for ALLOW rules, specifies that all ALLOW rules be ORed together.

- Syntax for creating and dropping row security predicates. If there can be multiple ones per table they'll need names, like we have with triggers, indexes, etc.

- psql support for listing row security predicates on a table if running as superuser or if you've been explicitly GRANTed access to the catalog table listing row security quals.

- The hooks for contribs to inject their own row security rules. The API will need a tweak - right now it assumes these rules are ANDed with any row security predicates in the catalogs, but we'd want the option of treating them as ALLOW or DENY rules to get ORed with the rest of the set *or* as a pre-filter predicate like currently.

- A row-security-exempt right, at the user-level, to assuage the concerns about malicious predicates. I maintain that in the first rev this should be simple: "superuser is row security exempt". I don't think I'm going to win that one though, so a user/role attribute that makes the role ignore row security seems like the next simplest option.

- A way to test whether the current user is row-security exempt so pg_dump can complain unless explicitly told it's allowed to do a selective dump via a cmdline option;

Plus a number of fixes:

- Fixing the security barrier view issue with row level lock pushdown that's breaking the row security regression tests;

- Enhancing plan cache invalidation so that row-security exempt-ness of a user is part of the plancache key;

- Adding session state like current_user to portals, so security_barrier functions returning refcursor, and cursors created before SET SESSION AUTHORIZATION or SET ROLE, get the correct values when they use session information like current_user

Note that this doesn't even consider the "with check option" style write-filtering side of row security and the corresponding challenges with the semantics around RETURNING.

It's already a decent sized amount of work on top of the existing row security patch. If we start adding policy groups, etc, this will never get done.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
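As a rough illustration of the catalog "type flag" idea above, the following Python sketch models ALLOW rules being ORed together, with any DENY rule overriding any ALLOW rule. The DENY part is the "later" step mentioned in the message, not something agreed for 9.5, and `RuleType`, `RowSecurityRule` and `row_visible` are hypothetical names.

```python
from enum import Enum
from typing import Callable, NamedTuple


class RuleType(Enum):
    ALLOW = "allow"
    DENY = "deny"


class RowSecurityRule(NamedTuple):
    name: str                          # rules are named, like triggers or indexes
    rule_type: RuleType                # the per-rule type flag
    predicate: Callable[[dict], bool]  # stands in for the stored qual


def row_visible(row, rules):
    """ALLOW rules are ORed; DENY rules are ORed and win over any ALLOW."""
    allowed = any(r.predicate(row) for r in rules if r.rule_type is RuleType.ALLOW)
    denied = any(r.predicate(row) for r in rules if r.rule_type is RuleType.DENY)
    # With no ALLOW rule at all, nothing is visible (deny by default).
    return allowed and not denied


rules = [
    RowSecurityRule("own_rows", RuleType.ALLOW, lambda r: r["owner"] == "bob"),
    RowSecurityRule("public_rows", RuleType.ALLOW, lambda r: r["public"]),
    RowSecurityRule("no_secrets", RuleType.DENY, lambda r: r["secret"]),
]
print(row_visible({"owner": "bob", "public": False, "secret": False}, rules))
print(row_visible({"owner": "bob", "public": True, "secret": True}, rules))
```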
2014-07-06 14:19 GMT+09:00 Stephen Frost <sfrost@snowman.net>:
> Kaigai,
>
> * Kouhei Kaigai (kaigai@ak.jp.nec.com) wrote:
> > > Can you clarify where this is coming from..? It sounds like you're
> > > referring to an existing implementation and, if so, it'd be good to get
> > > more information on how that works exactly.
> >
> > Oracle VPD - Multiple Policies for Each Table, View, or Synonym
> > http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351
> >
> > It says - Note that all policies applied to a table are enforced with AND syntax.
>
> While I'm not against using this as an example to consider, it's much
> more complex than what we're talking about here- and it supports
> application contexts which allow groups of RLS rights to be applied or
> not applied; essentially it allows both "AND" and "OR" for sets of RLS
> policies, along with "default" policies which are applied no matter
> what.
>
> > Not only Oracle VPD, it fits attitude of defense in depth.
> > Please assume a system that installs network firewall, unix permissions
> > and selinux. If somebody wants to reference an information asset within
> > a file, he has to connect the server from the network address being allowed
> > by the firewall configuration AND both of DAC and MAC has to allow his
> > access.
>
> These are not independent systems and your argument would apply to our
> GRANT system also, which I hope it's agreed would make it far less
> useful. Note also that SELinux brings in another complexity- it needs
> to make system calls out to check the access.
>
> > Usually, we have to pass all the access control to reference the target
> > information, not one of the access control stuffs being installed.
>
> This is true in some cases, and not in others. Only one role you are a
> member of needs to have access to a relation, not all of them. There
> are other examples of 'OR'-style security policies, this is merely one.
> I'm simply not convinced that it applies in the specific case we're
> talking about.
>
> In the end, I expect that either way people will be upset because they
> won't be able to specify fully which should be AND vs. which should be
> OR with the kind of flexibility other systems provide. What I'm trying
> to get to is an initial implementation which is generally useful and is
> able to add such support later.
>
> > > This is similar to how roles work- your overall access includes all access
> > > granted to any roles you are a member of. You don't need SELECT rights granted
> > > to every role you are a member of to select from the table. Additionally,
> > > if an admin wants to AND the quals together then they can simply create
> > > a policy which does that rather than have 2 policies.
> > >
> > It seems to me a pain on database administration, if we have to pay attention
> > not to conflict each RLS-policy.
>
> This notion of 'conflict' doesn't make much sense to me. What is
> 'conflicting' here? Each policy would simply need to stand on its own
> for the role which it's being applied to. That's very simple and
> straight-forward.
>
> > I expect 90% of RLS-policy will be configured
> > to PUBLIC user, to apply everybody same rules on access. In this case, DBA
> > has to ensure the target table has no policy or existing policy does not
> > conflict with the new policy to be set.
> > I don't think it is a good idea to enforce DBA these checks.
>
> If the DBA only uses PUBLIC then they have to ensure that each policy
> they set up for PUBLIC can stand on its own- though, really, I expect if
> they go that route they'd end up with just one policy that calls a
> stored procedure...
>
> > > You are suggesting instead that if application 2 sets up policies on the
> > > table and then application 1 adds another policy that it should reduce what
> > > application 2's users can see? That doesn't make any sense to me. I'd
> > > actually expect these applications to at least use different roles anyway,
> > > which means they could each have a single role specific policy which only
> > > returns what that application is allowed to see.
> > >
> > I don't think this assumption is reasonable.
> > Please expect two applications: app-X that is a database security product
> > to apply access control based on remote ip-address of the client for any
> > table accesses by any database roles. app-Y that is a usual enterprise
> > package for daily business data, with RLS-policy.
> > What is the expected behavior in this case?
>
> That the DBA manage the rights on the tables. I expect that will be
> required for quite a while with PG. It's nice to think of these
> application products that will manage all access for users by setting up
> their own policies, but we have yet to even discuss how they would have
> appropriate rights on the table to be able to do so (and to not
> interfere with each other..).
>
> Let's at least get something which is generally useful in. I'm all for
> trying to plan out how to get there and would welcome suggestions you
> have which are specific to PG on what we could do here (I'm not keen on
> just trying to mimic another product completely...), but at the level
> we're talking about (either AND them all or OR them all), I don't think
> we'd actually solve the use-cases you're describing with either answer.
>
> Without getting to the full level of having the flexibility to choose
> which policies should be AND'd and which should be OR'd, do you see an
> issue with adding initial support where each policy has to stand on its
> own and then working to address the more complex cases later?
>
Let me sort out.

Probably, the reason of opinion differences come from the point where I and you focus on. It seems to me you try to position the upcoming RLS feature in the context of existing database role and acl mechanism. I think it is a straightforward approach and never argue.

On the other hand, I'm worrying about whether we can utilize the RLS feature as a basis to implement different security model that performs independently from database roles and acl.

As long as RLS-policy quals are connected with OR, it is a design choice to fit behavior of database acl and grant / revoke. Things I'd like you to pay attention is, how much flexible to use this RLS feature as a basis of other security model. One candidate is selinux; that does not pay attention on database roles, so row-level security policy attached by selinux should not be over-written by database roles.

As you mentioned above, RLS-policy is connected with user-given quals by AND'd manner, like:
  SELECT * FROM t1 WHERE x like '%abc%';
being replaced to
  SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS);

What I'd like to implement is adjustment of query like:
  SELECT * FROM t1 WHERE (x like '%abc%')
      AND (quals by built-in RLS)
      AND (quals by extension-1)
      AND ...
      AND (quals by extension-N);

I never mind even if qualifiers in the second block are connected with OR'd manner, however, I want RLS infrastructure to accept additional security models provided by extensions.

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
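The layering requested here can be sketched as a predicate composition in Python. This is only a model of the requested semantics, under the assumption that built-in policies are OR'd as a group while each extension qual is AND'd on top; `make_row_filter` and the sample row fields are invented for the example.

```python
def make_row_filter(user_qual, builtin_policies, extension_quals):
    """user qual AND (any built-in policy) AND (every extension qual)."""
    def row_filter(row):
        return (
            user_qual(row)
            # Built-in RLS policies: one matching policy is enough.
            and any(policy(row) for policy in builtin_policies)
            # Extension quals (e.g. a label-based model): all must pass,
            # so no built-in policy can widen what an extension forbids.
            and all(qual(row) for qual in extension_quals)
        )
    return row_filter


visible = make_row_filter(
    user_qual=lambda r: "abc" in r["x"],
    builtin_policies=[lambda r: r["level"] < 2, lambda r: r["owner"] == "bob"],
    extension_quals=[lambda r: r["label"] != "secret"],
)
print(visible({"x": "xabcx", "level": 1, "owner": "amy", "label": "public"}))
print(visible({"x": "xabcx", "level": 5, "owner": "bob", "label": "secret"}))
```

In the second call a built-in policy matches, yet the extension qual still hides the row, which is the property asked for above.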
Craig, * Craig Ringer (craig@2ndquadrant.com) wrote: > I was jotting notes about this last sleepless night, and was really glad > to see the suggestion of enabling RLS on a table being a requirement for > OR-style quals suggested in the thread when I woke. Thanks for your thoughts and input! > The only sane way to do OR-ing of multiple rules is to require that > tables be switched to deny-by-default before RLS quals can be added to > then selectively enable access. Right. > The next step is DENY rules that override ALLOW rules, and are also > ORed, so any DENY rule overrides any ALLOW rule. Like in ACLs. But that > can be a "later" - I just think room for it should be left in any > catalog definition. I'm not convinced regarding DENY rules, and I've seen very little of their use in practice.. The expectation is generally a deny-by-default setups with access granted explicity. > My concern with the talk of policies, etc, is with making it possible to > impliment this for 9.5. I'd really like to see a robust declarative > row-security framework with access policies - but I'm not sure sure it's > a good idea to try to assemble policies directly out of low level row > security predicates. +1000%- we really need to solidify what should go into 9.5 and get that committed, then work out if there is more we can do in this release cycle. I'm fine with a simple approach to begin with, provided we can build on it moving forward without causing upgrade headaches, provided we can get to where we want to go, of course. > Tying things into a policy model that isn't tried or tested might create > more problems than it solves unless we implement multiple real-world > test cases on top of the model to show it works. To this I would say- the original single-policy-per-table approach has been vetted by actual users to be valuable in their environments. It does not solve all cases, certainly, but it's simple and usable as-is and is the minimum which I would like to see in 9.5. 
Ideally, we can do better than that, but let's not throw out that win because we insist on a complete solution before it goes into core- because then we'll never get there.
> For how I think we should be pursuing this in the long run, take a look
> at how Teradata does it, with hierarchical and non-hierarchical rules -
> basically bitmaps or thresholds - that get grouped into access policies.
> It's a very good way to abstract the low level stuff. If we have low
> level table predicate filters, we can build this sort of thing on top.
I keep thinking that a bitmap or similar might make sense here.. Consider a set of policies where we assign them numbers-per-table, and we can then build a bitmap of them, and then store what bitmap is applied to a given query. That then allows us to compare those bitmaps during plan cache checking to make sure that the policies applied last time are the same which we would be applying now, and therefore the existing cached plan is sufficient. It gets a bit more complicated when you allow AND-vs-OR and groups or hierarchies of policies, of course, but I'd like to think we can come up with a sensible way to represent that to allow for a quick check during plan cache lookup.
> For 9.5, unless the basics turn out to be way easier than they look and
> it's all done soon in the release process, surely we should be sticking
> to just getting the basics of row security in place? Leaving room for
> enhancement, sure, but sticking to the core feature which to my mind is:
Agreed..
> - A row security on/off flag for a table;
Yes; I like this approach in general.
> - Room in the catalogs for multiple row security rules per table
>   and a type flag for them. The initial type flag, for ALLOW rules,
>   specifies that all ALLOW rules be ORed together.
Works for me. I'm open to a per-table toggle which says "AND" instead of "OR", provided we could implement that sanely and simply.
> - Syntax for creating and dropping row security predicates. If there
>   can be multiple ones per table they'll need names, like we have with
>   triggers, indexes, etc.
Agreed. To Robert's question about having policy names at all, rather than just quals, I feel like we'll need them eventually anyway and having them earlier will simplify things. Additionally, it's simpler to reason about and to manage- one can expect a one-to-many relationship between policies and roles, making it simpler to work with the policy name when associating it with a role rather than having to remember all of the quals involved.
> - psql support for listing row security predicates on a table if running
>   as superuser or if you've been explicitly GRANTed access to the
>   catalog table listing row security quals.
We need psql support to list the RLS policies.. I don't wish to get into the question about what kind of access that requires though. At least initially, I wouldn't try to limit access to the policies or quals in the catalog... Perhaps we need that but I'd like a bit more discussion about it first- and we'll need to figure out how to address that when it comes to both psql and the 'rlsenabled' flag.
> - The hooks for contribs to inject their own row security rules. The
>   API will need a tweak - right now it assumes these rules are ANDed
>   with any row security predicates in the catalogs, but we'd want the
>   option of treating them as ALLOW or DENY rules to get ORed with the
>   rest of the set *or* as a pre-filter predicate like currently.
I'm really not interested in contrib modules with this first go around.. We can work to address their requests later on. I don't think many contrib authors will be very happy with the low-level support which we'll provide in 9.5 anyway and it'd probably be better for everyone if we hold off on adding hooks, etc, for them until we have a better idea about how this will be used and how it will work.
> - A row-security-exempt right, at the user-level,
>   to assuage the concerns about malicious predicates. I maintain that
>   in the first rev this should be simple: "superuser is row security
>   exempt". I don't think I'm going to win that one though, so a
>   user/role attribute that makes the role ignore row security
>   seems like the next simplest option.
Yes, we'll need this.
> - A way to test whether the current user is row-security exempt
>   so pg_dump can complain unless explicitly told it's allowed
>   to do a selective dump via a cmdline option;
Agreed. Adam has a patch for this already, more or less.
> Plus a number of fixes:
>
> - Fixing the security barrier view issue with row level lock pushdown
>   that's breaking the row security regression tests;
No- this is not the responsibility of this particular patch or functionality. I agree that we will want to address it at some point, but it's very complicated and not required at this time.
> - Enhancing plan cache invalidation so that row-security exempt-ness
>   of a user is part of the plancache key;
We need to ensure that the plan cache is handled correctly. I'm not convinced, at this point, that we actually need to include the user as part of the key for looking up a plan cache. It might come to that, but I'm not quite convinced it's necessary yet.
> - Adding session state like current_user to portals, so security_barrier
>   functions returning refcursor, and cursors created before SET SESSION
>   AUTHORIZATION or SET ROLE, get the correct values when they use
>   session information like current_user
Yeah, we need to consider this and how it *should* behave. Have we really thought about and documented that, ideally as regression tests? We need to do so, to ensure that we have the correct behavior in this case.
> Note that this doesn't even consider the "with check option" style
> write-filtering side of row security and the corresponding challenges
> with the semantics around RETURNING.
Yeah, not sure how we want to handle these. At this point, I'm open to simply throwing an ERROR in cases which are not well defined or which do not work as expected. Ideally we can do better than that, but throwing an ERROR for cases which don't exist today and which are not yet supported is reasonable, imv.
> It's already a decent sized amount of work on top of the existing row
> security patch.
Indeed.
> If we start adding policy groups, etc, this will never get done.
Agreed!
Thanks!
Stephen
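The bitmap idea floated above for plan cache checking could look roughly like the following. This is only a sketch under stated assumptions: the function names and the dictionary keyed by table name are hypothetical, nothing here is PostgreSQL code, and it ignores the AND-vs-OR and policy-group complications mentioned in the message.

```python
# Hypothetical sketch: none of these names exist in PostgreSQL.

def policy_bitmap(applied_policy_numbers):
    """Pack per-table policy numbers (0, 1, 2, ...) into an integer bitmap."""
    bitmap = 0
    for number in applied_policy_numbers:
        bitmap |= 1 << number
    return bitmap


def cached_plan_still_valid(cached_bitmaps, current_bitmaps):
    """A cached plan is reusable only if every table has the same policy
    bitmap now as it had when the plan was built."""
    return cached_bitmaps == current_bitmaps


# Bitmaps recorded when the plan was built, keyed by table name.
planned = {"t1": policy_bitmap([0, 2]), "t2": policy_bitmap([1])}

# The same policies apply now (order does not matter): reuse the plan.
same = {"t1": policy_bitmap([2, 0]), "t2": policy_bitmap([1])}

# Policy 1 on t1 now applies as well: the plan would have to be rebuilt.
changed = {"t1": policy_bitmap([0, 1, 2]), "t2": policy_bitmap([1])}

print(cached_plan_still_valid(planned, same))
print(cached_plan_still_valid(planned, changed))
```

The appeal of this representation is that the check during plan cache lookup is a cheap integer comparison per table rather than a comparison of qual trees.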
KaiGai, * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: > What I'd like to implement is adjustment of query like: > SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS) > AND (quals by extension-1) AND ... AND (quals by extension-N); > I never mind even if qualifiers in the second block are connected with OR'd > manner, however, I want RLS infrastructure to accept additional security > models provided by extensions. Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies on that table be sufficient for what you're looking for? That seems a simple enough addition which would still allow more complex groups to be developed later on... Thanks! Stephen
Robert,
* Robert Haas (robertmhaas@gmail.com) wrote:
> If you're going to have predicates be table-level and access grants be
> table-level, then what's the value in having policies? You could just
> do:
>
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;
Yes, this would be possible (and is nearly identical to the original patch, except that this includes per-role considerations), however, my thinking is that it'd be simpler to work with policy names rather than sets of quals, to use when mapping to roles, and they would potentially be useful later for other things (eg: for setting up which policies should be applied when, or which should be OR'd or AND'd with other policies, or having groups of policies, etc).
> As I see it, the only value in having policies as separate objects is
> that you can then, by granting access to the policy, give a particular
> user a bundle of rights rather than having to grant each right
> individually. But with this design, you've got to create the policy,
> then add the quals to it for each table, and then you still have to
> give access individually for every <row, table> combination, so what
> value is the policy object itself providing?
To clarify this part- the idea is that you would simply declare a policy name to be a set of quals for a particular table, so you declare them and then map a policy to roles for which it should be used. In this arrangement, you don't declare the policy explicitly before setting the quals; those are done at the same time.
Thanks,
Stephen
2014-07-09 15:07 GMT+09:00 Stephen Frost <sfrost@snowman.net>: > KaiGai, > > * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: >> What I'd like to implement is adjustment of query like: >> SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS) >> AND (quals by extension-1) AND ... AND (quals by extension-N); >> I never mind even if qualifiers in the second block are connected with OR'd >> manner, however, I want RLS infrastructure to accept additional security >> models provided by extensions. > > Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies > on that table be sufficient for what you're looking for? That seems a > simple enough addition which would still allow more complex groups to be > developed later on... > Probably, things I'm considering is more simple. If a table has multiple built-in RLS policies, its expression node will be represented as a BoolExpr with OR_EXPR and every policies are linked to its args field, isn't it? We assume the built-in RLS model merges multiple policies by OR manner. In case when an extension want to apply additional security model on top of RLS infrastructure, a straightforward way is to add its own rules in addition to the built-in rules. If extension can get control to modify the above expression node and RLS infrastructure works well on the modified expression node, I think it's sufficient to implement multiple security models on the RLS infrastructure. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
KaiGai, * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: > 2014-07-09 15:07 GMT+09:00 Stephen Frost <sfrost@snowman.net>: > > * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: > >> What I'd like to implement is adjustment of query like: > >> SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS) > >> AND (quals by extension-1) AND ... AND (quals by extension-N); > >> I never mind even if qualifiers in the second block are connected with OR'd > >> manner, however, I want RLS infrastructure to accept additional security > >> models provided by extensions. > > > > Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies > > on that table be sufficient for what you're looking for? That seems a > > simple enough addition which would still allow more complex groups to be > > developed later on... > > > Probably, things I'm considering is more simple. > If a table has multiple built-in RLS policies, its expression node will be > represented as a BoolExpr with OR_EXPR and every policies are linked > to its args field, isn't it? We assume the built-in RLS model merges > multiple policies by OR manner. > In case when an extension want to apply additional security model on > top of RLS infrastructure, a straightforward way is to add its own rules > in addition to the built-in rules. If extension can get control to modify > the above expression node and RLS infrastructure works well on the > modified expression node, I think it's sufficient to implement multiple > security models on the RLS infrastructure. Another way would be to have a single RLS policy which extensions can modify, sure. That was actually along the lines of the originally proposed patch.. That approach would work if we OR'd multiple policies together too, provided the user took care to only have one policy implemented.. Not sure how easy that would be to work with for extension authors though. Thanks, Stephen
On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Robert,
>
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> If you're going to have predicates be table-level and access grants be
>> table-level, then what's the value in having policies? You could just
>> do:
>>
>> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;
>
> Yes, this would be possible (and is nearly identical to the original
> patch, except that this includes per-role considerations), however, my
> thinking is that it'd be simpler to work with policy names rather than
> sets of quals, to use when mapping to roles, and they would potentially
> be useful later for other things (eg: for setting up which policies
> should be applied when, or which should be OR'd or AND'd with other
> policies, or having groups of policies, etc).
Hmm. I guess that's reasonable. Should the policy be a per-table object (like rules, constraints, etc.) instead of a global object?
You could do:
ALTER TABLE table_name ADD POLICY policy_name (quals);
ALTER TABLE table_name POLICY FOR role_name IS policy_name;
ALTER TABLE table_name DROP POLICY policy_name;
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thursday, July 10, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> Yes, this would be possible (and is nearly identical to the original
>> patch, except that this includes per-role considerations), however, my
>> thinking is that it'd be simpler to work with policy names rather than
>> sets of quals, to use when mapping to roles, and they would potentially
>> be useful later for other things (eg: for setting up which policies
>> should be applied when, or which should be OR'd or AND'd with other
>> policies, or having groups of policies, etc).
> Hmm. I guess that's reasonable. Should the policy be a per-table
> object (like rules, constraints, etc.) instead of a global object?
> You could do:
> ALTER TABLE table_name ADD POLICY policy_name (quals);
> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
> ALTER TABLE table_name DROP POLICY policy_name;
Right, I was thinking they would be per table as they would specifically provide a name for a set of quals, and quals are naturally table-specific. I don't see a need to have them be global- that had been brought up before with the notion of applications picking their policy, but we could also add that later through another term (eg: contexts) which would then map to policies or similar. We could even extend policies to be global by mapping existing per-table ones to be global if we really needed to...
My feeling at the moment is that having them be per-table makes sense and we'd still have flexibility to change later if we had some compelling reason to do so.
Thanks!
Stephen
On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost <sfrost@snowman.net> wrote:
> On Thursday, July 10, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> > Yes, this would be possible (and is nearly identical to the original
>> > patch, except that this includes per-role considerations), however, my
>> > thinking is that it'd be simpler to work with policy names rather than
>> > sets of quals, to use when mapping to roles, and they would potentially
>> > be useful later for other things (eg: for setting up which policies
>> > should be applied when, or which should be OR'd or AND'd with other
>> > policies, or having groups of policies, etc).
>>
>> Hmm. I guess that's reasonable. Should the policy be a per-table
>> object (like rules, constraints, etc.) instead of a global object?
>>
>> You could do:
>>
>> ALTER TABLE table_name ADD POLICY policy_name (quals);
>> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
>> ALTER TABLE table_name DROP POLICY policy_name;
>
> Right, I was thinking they would be per table as they would specifically
> provide a name for a set of quals, and quals are naturally table-specific. I
> don't see a need to have them be global- that had been brought up before
> with the notion of applications picking their policy, but we could also add
> that later through another term (eg: contexts) which would then map to
> policies or similar. We could even extend policies to be global by mapping
> existing per-table ones to be global if we really needed to...
>
> My feeling at the moment is that having them be per-table makes sense and
> we'd still have flexibility to change later if we had some compelling reason
> to do so.
I don't think you can really change it later. If policies are per-table, then you could have a policy p1 on table t1 and also on table t2; if they become global objects, then you can't have p1 in two places. I hope I'm not beating a dead horse here, but changing syntax after it's been released is very, very hard.
But that's not an argument against doing it this way; I think per-table policies are probably simpler and better here. It means, for example, that policies need not have their own permissions and ownership structure - they're part of the table, just like a constraint, trigger, or rule, and the table owner's permissions control. I like that, and I think our users will, too.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
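The point about not being able to switch from per-table to global policy names later can be illustrated with a small sketch. The two dictionaries below are hypothetical stand-ins for catalog tables, and the quals are made-up strings; the only thing being shown is the difference in the uniqueness key.

```python
# Hypothetical catalog sketch; the dictionaries stand in for catalog tables.

per_table = {}   # key: (table_name, policy_name) -> quals
global_ns = {}   # key: policy_name -> (table_name, quals)


def add_per_table(table, name, quals):
    """Per-table policies: the name only has to be unique within its table."""
    key = (table, name)
    if key in per_table:
        raise ValueError(f"policy {name!r} already exists on {table!r}")
    per_table[key] = quals


def add_global(table, name, quals):
    """Global policies: the name has to be unique across all tables."""
    if name in global_ns:
        raise ValueError(f"policy {name!r} already exists")
    global_ns[name] = (table, quals)


add_per_table("t1", "p1", "owner = current_user")
add_per_table("t2", "p1", "dept = 'sales'")   # fine: names are scoped to the table

# Trying to carry both p1 policies over into a global namespace collides.
add_global("t1", "p1", "owner = current_user")
try:
    add_global("t2", "p1", "dept = 'sales'")
    migrated = True
except ValueError:
    migrated = False

print(len(per_table), migrated)
```

With per-table scoping the natural identifier is the pair of policy name and table, which is why existing per-table names could not simply be promoted to global objects after release.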
Robert,
On Friday, July 11, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> My feeling at the moment is that having them be per-table makes sense and
>> we'd still have flexibility to change later if we had some compelling reason
>> to do so.
> I don't think you can really change it later. If policies are
> per-table, then you could have a policy p1 on table t1 and also on
> table t2; if they become global objects, then you can't have p1 in two
> places. I hope I'm not beating a dead horse here, but changing syntax
> after it's been released is very, very hard.
Fair enough. My thinking was we'd come up with a way to map them (eg: table_policy), but I do agree that changing it later would really suck and having them be per-table makes a lot of sense.
> But that's not an argument against doing it this way; I think
> per-table policies are probably simpler and better here. It means,
> for example, that policies need not have their own permissions and
> ownership structure - they're part of the table, just like a
> constraint, trigger, or rule, and the table owner's permissions
> control. I like that, and I think our users will, too.
Agreed and I believe this is more-or-less what I had proposed up-thread (not at a computer at the moment). I hope to have a chance to review and update the design and flesh out the catalog definition this weekend.
Thanks!
Stephen
"Brightwell, Adam" <adam.brightwell@crunchydatasolutions.com> writes: >> You could do: >> >> ALTER TABLE table_name ADD POLICY policy_name (quals); >> ALTER TABLE table_name POLICY FOR role_name IS policy_name; >> ALTER TABLE table_name DROP POLICY policy_name; > I am attempting to modify the grammar to support the above syntax. > Unfortunately, I am encountering quite a number (280) shift/reduce > errors/conflicts in bison. I have reviewed the bison documentation as well > as the wiki page on resolving such conflicts. However, I am not entirely > certain on the direction I should take in order to resolve these conflicts. > I attempted to create a more redundant production like the wiki described, > but unfortunately that was not successful. I have attached both the patch > and bison report. Any help, recommendations or suggestions would be > greatly appreciated. 20MB messages to the list aren't that friendly. Please don't do that again, unless asked to. FWIW, the above syntax is a nonstarter, at least unless we're willing to make POLICY a reserved word (hint: we're not). The reason is that the ADD/DROP COLUMN forms consider COLUMN to be optional, meaning that the column name could directly follow ADD; and the column type name, which could also be just a plain identifier, would directly follow that. So there's no way to resolve the ambiguity with one token of lookahead. This actually isn't just bison being stupid: in fact, you simply cannot tell whether ALTER TABLE tab ADD POLICY varchar(42); is an attempt to add a column named "policy" of type varchar(42), or an attempt to add a policy named "varchar" with quals "42". Pick a different syntax. regards, tom lane
Adam, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > "Brightwell, Adam" <adam.brightwell@crunchydatasolutions.com> writes: > >> ALTER TABLE table_name ADD POLICY policy_name (quals); > >> ALTER TABLE table_name POLICY FOR role_name IS policy_name; > >> ALTER TABLE table_name DROP POLICY policy_name; [...] > This actually isn't just bison being stupid: in fact, you simply > cannot tell whether > > ALTER TABLE tab ADD POLICY varchar(42); > > is an attempt to add a column named "policy" of type varchar(42), or an > attempt to add a policy named "varchar" with quals "42". > > Pick a different syntax. Yeah, now that we're trying to bake this into ALTER TABLE we need to be a bit more cautious. I'd think: ALTER TABLE tab POLICY ADD ... Would work though? (note: haven't looked/tested myself) Thanks! Stephen
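The ambiguity described above can be made concrete with a toy matcher. This is a rough sketch, not how the PostgreSQL grammar or bison works: the two functions are hypothetical hand-written recognizers for simplified forms of the two ALTER TABLE sub-clauses, and they only show that the same token sequence satisfies both.

```python
import re


def tokenize(sql):
    """Split a clause into identifiers, integers and parentheses."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[()]", sql)


def as_add_column(tokens):
    """Simplified form: ADD [COLUMN] col_name type_name [ ( length ) ]"""
    t = list(tokens)
    if not t or t[0].upper() != "ADD":
        return None
    t = t[1:]
    if t and t[0].upper() == "COLUMN":   # COLUMN is optional
        t = t[1:]
    if len(t) == 2 and t[0].isidentifier() and t[1].isidentifier():
        return {"column": t[0], "type": t[1]}
    if (len(t) == 5 and t[0].isidentifier() and t[1].isidentifier()
            and t[2] == "(" and t[3].isdigit() and t[4] == ")"):
        return {"column": t[0], "type": f"{t[1]}({t[3]})"}
    return None


def as_add_policy(tokens):
    """Simplified form: ADD POLICY policy_name ( quals )"""
    t = list(tokens)
    if (len(t) >= 5 and t[0].upper() == "ADD" and t[1].upper() == "POLICY"
            and t[2].isidentifier() and t[3] == "(" and t[-1] == ")"):
        return {"policy": t[2], "quals": " ".join(t[4:-1])}
    return None


tokens = tokenize("ADD POLICY varchar(42)")
print(as_add_column(tokens))   # a column named POLICY of type varchar(42)
print(as_add_policy(tokens))   # a policy named varchar with quals 42

# Putting POLICY first removes the overlap with the ADD [COLUMN] form.
print(as_add_column(tokenize("POLICY ADD varchar(42)")))
```

Because POLICY is not reserved, it is a legal column name, so both readings are valid and no amount of lookahead can choose between them; leading with the POLICY keyword sidesteps that.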
Tom,
Thanks for the feedback.
> 20MB messages to the list aren't that friendly. Please don't do that
> again, unless asked to.
Apologies, I didn't realize it was so large until after it was sent. At any rate, it won't happen again.
> FWIW, the above syntax is a nonstarter, at least unless we're willing to
> make POLICY a reserved word (hint: we're not). The reason is that the
> ADD/DROP COLUMN forms consider COLUMN to be optional, meaning that the
> column name could directly follow ADD; and the column type name, which
> could also be just a plain identifier, would directly follow that. So
> there's no way to resolve the ambiguity with one token of lookahead.
> This actually isn't just bison being stupid: in fact, you simply
> cannot tell whether
> ALTER TABLE tab ADD POLICY varchar(42);
> is an attempt to add a column named "policy" of type varchar(42), or an
> attempt to add a policy named "varchar" with quals "42".
Ok. Makes sense, and I was afraid that was the case.
Thanks,
Adam
--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
Stephen,
> Yeah, now that we're trying to bake this into ALTER TABLE we need to be
> a bit more cautious. I'd think:
> ALTER TABLE tab POLICY ADD ...
> Would work though? (note: haven't looked/tested myself)
Yes, I just tested it and the following would work from a grammar perspective:
ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals)
ALTER TABLE <table_name> POLICY DROP <policy_name>
Though, it would obviously require the addition of POLICY to the list of unreserved keywords. I don't suspect that would be a concern, as it is not "reserved", but thought I would point it out just in case.
Another thought I had was, would we also want the following, so that policies could be modified?
ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals)
Thanks,
Adam
--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
Adam, * Brightwell, Adam (adam.brightwell@crunchydatasolutions.com) wrote: > > Yeah, now that we're trying to bake this into ALTER TABLE we need to be > > a bit more cautious. I'd think: > > > > ALTER TABLE tab POLICY ADD ... > > > > Would work though? (note: haven't looked/tested myself) > > Yes, I just tested it and the following would work from a grammar > perspective: > > ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals) > ALTER TABLE <table_name> POLICY DROP <policy_name> Excellent, glad to hear it. > Though, it would obviously require the addition of POLICY to the list of > unreserved keywords. I don't suspect that would be a concern, as it is not > "reserved", but thought I would point it out just in case. Right, I don't anticipate anyone complaining too loudly about that.. > Another thought I had was, would we also want the following, so that > policies could be modified? > > ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals) Sounds like a good idea to me. Thanks! Stephen
Tom Lane wrote: > 20MB messages to the list aren't that friendly. Please don't do that > again, unless asked to. FWIW the message was not distributed to the list. I got a note from Adam and dropped it from the moderation queue. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jul 16, 2014 at 10:04 PM, Brightwell, Adam <adam.brightwell@crunchydatasolutions.com> wrote: > Yes, I just tested it and the following would work from a grammar > perspective: > > ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals) > ALTER TABLE <table_name> POLICY DROP <policy_name> > > Though, it would obviously require the addition of POLICY to the list of > unreserved keywords. I don't suspect that would be a concern, as it is not > "reserved", but thought I would point it out just in case. > > Another thought I had was, would we also want the following, so that > policies could be modified? > > ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals) I think we do want a way to modify policies. However, we tend to avoid syntax that involves unnatural word order, as this certainly does. Maybe it's better to follow the example of CREATE RULE and CREATE TRIGGER and do something this instead: CREATE POLICY policy_name ON table_name USING quals; ALTER POLICY policy_name ON table_name USING quals; DROP POLICY policy_name ON table_name; The advantage of this is that you can regard "policy_name ON table_name" as the identifier for the policy throughout the system. You need some kind of identifier of that sort anyway to support COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> I think we do want a way to modify policies. However, we tend to
> avoid syntax that involves unnatural word order, as this certainly
> does. Maybe it's better to follow the example of CREATE RULE and
> CREATE TRIGGER and do something this instead:
> CREATE POLICY policy_name ON table_name USING quals;
> ALTER POLICY policy_name ON table_name USING quals;
> DROP POLICY policy_name ON table_name;
> The advantage of this is that you can regard "policy_name ON
> table_name" as the identifier for the policy throughout the system.
> You need some kind of identifier of that sort anyway to support
> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
Sounds good. I certainly think it makes a lot of sense to include the ALTER functionality, if for no other reason than ease of use.
Another item to consider, though I believe it can come later, is per-action policies. Following the above suggested syntax, perhaps that might look like the following?
CREATE POLICY policy_name ON table_name FOR action USING quals;
ALTER POLICY policy_name ON table_name FOR action USING quals;
DROP POLICY policy_name ON table_name FOR action;
Thanks,
Adam
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam <adam.brightwell@crunchydatasolutions.com> wrote: >> I think we do want a way to modify policies. However, we tend to >> avoid syntax that involves unnatural word order, as this certainly >> does. Maybe it's better to follow the example of CREATE RULE and >> CREATE TRIGGER and do something this instead: >> >> CREATE POLICY policy_name ON table_name USING quals; >> ALTER POLICY policy_name ON table_name USING quals; >> DROP POLICY policy_name ON table_name; >> >> The advantage of this is that you can regard "policy_name ON >> table_name" as the identifier for the policy throughout the system. >> You need some kind of identifier of that sort anyway to support >> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies. > > Sounds good. I certainly think it makes a lot of sense to include the ALTER > functionality, if for no other reason than ease of use. > > Another item to consider, though I believe it can come later, is per-action > policies. Following the above suggested syntax, perhaps that might look > like the following? > > CREATE POLICY policy_name ON table_name FOR action USING quals; > ALTER POLICY policy_name ON table_name FOR action USING quals; > DROP POLICY policy_name ON table_name FOR action; That seems reasonable. You need to give some thought to what happens if the user types: CREATE POLICY pol1 ON tab1 FOR SELECT USING q1; ALTER POLICY pol1 ON tab1 FOR INSERT USING q2; I guess you end up with q1 as the SELECT policy and q2 as the INSERT policy. Similarly, had you typed: CREATE POLICY pol1 ON tab1 USING q1; ALTER POLICY pol1 ON tab1 FOR INSERT USING q2; ...then I guess you end up with q2 for INSERTs and q1 for everything else. I'm wondering if it might be better, though, not to allow the quals to be specified in CREATE POLICY, or else to allow multiple actions. Otherwise, getting pg_dump to DTRT might be complicated. 
Perhaps: CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ] USING quals ] ... ); where operation = SELECT | INSERT | UPDATE | DELETE So that you can write things like: CREATE POLICY pol1 ON tab1 (USING a = 1); CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b = 1, FOR DELETE USING c = 1); And then, for ALTER, just allow one change at a time, syntax as you proposed. That way each policy can be dumped as a single CREATE statement. > I was also giving some thought to the use of "POLICY", perhaps I am wrong, > but it does seem it could be at risk of becoming ambiguous down the road. I > can't think of any specific examples at the moment, but my concern is what > happens if we wanted to add another "type" of policy, whatever that might > be, later? Would it make more sense to go ahead and qualify this a little > more with "ROW SECURITY POLICY"? I think that's probably over-engineering. I'm not aware of anything else we might add that would be likely to be called a policy, and if we did add something we could probably call it something else instead. And long command names are annoying. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
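The per-command semantics guessed at above, and the pg_dump concern, can be sketched as follows. This is a hedged illustration only: the dictionary representation, the function names, and the grouping of commands that share quals are all assumptions, and the output merely follows the parenthesised CREATE POLICY form floated in the message rather than any committed syntax.

```python
COMMANDS = ("SELECT", "INSERT", "UPDATE", "DELETE")


def create_policy(quals, command=None):
    """With no command given, the quals apply to every command."""
    targets = COMMANDS if command is None else (command,)
    return {cmd: quals for cmd in targets}


def alter_policy(policy, quals, command=None):
    """Change the quals for one command (or all) and return a new policy."""
    targets = COMMANDS if command is None else (command,)
    updated = dict(policy)
    for cmd in targets:
        updated[cmd] = quals
    return updated


def dump_policy(name, table, policy):
    """Emit a single CREATE POLICY statement, grouping commands by quals."""
    by_quals = {}
    for cmd in COMMANDS:
        if cmd in policy:
            by_quals.setdefault(policy[cmd], []).append(cmd)
    clauses = [f"FOR {' OR '.join(cmds)} USING {quals}"
               for quals, cmds in by_quals.items()]
    return f"CREATE POLICY {name} ON {table} ({', '.join(clauses)});"


# CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
# ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
first = alter_policy(create_policy("q1", command="SELECT"), "q2", command="INSERT")

# CREATE POLICY pol1 ON tab1 USING q1;
# ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
second = alter_policy(create_policy("q1"), "q2", command="INSERT")

print(dump_policy("pol1", "tab1", first))
print(dump_policy("pol1", "tab1", second))
```

Allowing several FOR clauses in one CREATE statement is what lets the state reached through a CREATE followed by an ALTER be dumped back out as a single statement.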
All,
Attached is a patch for RLS that incorporates the following changes:
* Syntax:
- CREATE POLICY <policy_name> ON <table_name> FOR <command> USING ( <qual> )
- ALTER POLICY <policy_name> ON <table_name> FOR <command> USING ( <qual> )
- DROP POLICY <policy_name> ON <table_name> FOR <command>
* "row_security" GUC Setting - enable/disable row level security.
* BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if row_security GUC is set to OFF.
There are still some remaining issues but we hope to have those resolved soon.
Any comments or suggestions would be greatly appreciated.
Thanks,
Adam
On Mon, Jul 21, 2014 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:
>> I think we do want a way to modify policies. However, we tend to
>> avoid syntax that involves unnatural word order, as this certainly
>> does. Maybe it's better to follow the example of CREATE RULE and
>> CREATE TRIGGER and do something this instead:
>>
>> CREATE POLICY policy_name ON table_name USING quals;
>> ALTER POLICY policy_name ON table_name USING quals;
>> DROP POLICY policy_name ON table_name;
>>
>> The advantage of this is that you can regard "policy_name ON
>> table_name" as the identifier for the policy throughout the system.
>> You need some kind of identifier of that sort anyway to support
>> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
>
> Sounds good. I certainly think it makes a lot of sense to include the ALTER
> functionality, if for no other reason than ease of use.
>
> Another item to consider, though I believe it can come later, is per-action
> policies. Following the above suggested syntax, perhaps that might look
> like the following?
>
> CREATE POLICY policy_name ON table_name FOR action USING quals;
> ALTER POLICY policy_name ON table_name FOR action USING quals;
> DROP POLICY policy_name ON table_name FOR action;
That seems reasonable. You need to give some thought to what happens
if the user types:
CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
I guess you end up with q1 as the SELECT policy and q2 as the INSERT
policy. Similarly, had you typed:
CREATE POLICY pol1 ON tab1 USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
...then I guess you end up with q2 for INSERTs and q1 for everything
else. I'm wondering if it might be better, though, not to allow the
quals to be specified in CREATE POLICY, or else to allow multiple
actions. Otherwise, getting pg_dump to DTRT might be complicated.
Perhaps:
CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ]
USING quals ] ... );
where operation = SELECT | INSERT | UPDATE | DELETE
So that you can write things like:
CREATE POLICY pol1 ON tab1 (USING a = 1);
CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b
= 1, FOR DELETE USING c = 1);
And then, for ALTER, just allow one change at a time, syntax as you
proposed. That way each policy can be dumped as a single CREATE
statement.
> I was also giving some thought to the use of "POLICY", perhaps I am wrong,
> but it does seem it could be at risk of becoming ambiguous down the road. I
> can't think of any specific examples at the moment, but my concern is what
> happens if we wanted to add another "type" of policy, whatever that might
> be, later? Would it make more sense to go ahead and qualify this a little
> more with "ROW SECURITY POLICY"?
I think that's probably over-engineering. I'm not aware of anything
else we might add that would be likely to be called a policy, and if
we did add something we could probably call it something else instead.
And long command names are annoying.
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
All,
Attached is a patch for RLS that was created against master at 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
Overview:
This patch provides the capability to create multiple named row level security policies for a table on a per command basis and assign them to be applied to specific roles/users.
It contains the following changes:
* Syntax:
CREATE POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ TO { PUBLIC | <role> [, <role> ] } ]
USING (<condition>)
Creates an RLS policy named <name> on <table>. Specifying a command is optional; the default is ALL. Specifying a role is optional; the default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC is applied and a warning is raised.
ALTER POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ TO { PUBLIC | <role> [, <role> ] } ]
USING (<condition>)
Alters an RLS policy named <name> on <table>. Specifying a command is optional: if provided, the policy's command is changed; otherwise it is left as-is. Specifying a role is optional: if provided, the policy's role is changed; otherwise it is left as-is. The <condition> must always be provided and is therefore always replaced.
DROP POLICY <name> ON <table>
Drops an RLS policy named <name> on <table>.
* Plancache Invalidation: If a relation has a row-security policy and row-security is enabled, then invalidation occurs when either the row_security GUC is changed OR the current user changes. This invalidation ONLY takes place for cached plans where the target relation has a row security policy.
* Security Qual Expression: All row-security policies are OR'ed together. Where another security qual is added, such as in the case of a security barrier view, the row-security policies are AND'ed with those quals.
Example:
If a table has policies p1 and p2 and a security barrier view is created for that table called rls_sbv, then SELECT * FROM rls_sbv WHERE <some_condition> would result in the following expression: <some_condition> AND (p1 OR p2)
* row_security GUC - enable/disable row level security.
* BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if row_security GUC is set to OFF. If a user sets row_security to OFF and does not have this attribute, then an error is raised when attempting to query a relation with a RLS policy.
* psql \d <table> support: psql describe support for listing policy information per table.
* pg_policies system view: lists all row-security policy information.
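To make the OR/AND composition described under "Security Qual Expression" concrete, here is a minimal C sketch. It is not code from the patch: `any_policy_passes` and `row_visible` are hypothetical names, and each policy's USING clause is assumed to have been evaluated already for the row in question. The sketch also assumes at least one policy exists; what happens with zero policies is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: policy_results[i] is the outcome of policy i's
 * USING clause for a single row.  Policies are OR'ed together.
 */
static bool
any_policy_passes(const bool *policy_results, size_t npolicies)
{
    for (size_t i = 0; i < npolicies; i++)
    {
        if (policy_results[i])
            return true;
    }
    return false;
}

/*
 * A row is visible only if the other quals hold AND at least one
 * policy passes, i.e. <some_condition> AND (p1 OR p2 OR ...).
 */
static bool
row_visible(bool other_quals, const bool *policy_results, size_t npolicies)
{
    /* This sketch only covers tables that have at least one policy. */
    assert(policy_results != NULL && npolicies > 0);

    return other_quals && any_policy_passes(policy_results, npolicies);
}
```

With policies p1 and p2, a row that fails p1 but passes p2 is visible as long as the remaining quals also hold.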
Any feedback, comments or suggestions would be greatly appreciated.
Thanks,
Adam
On Mon, Aug 18, 2014 at 10:19 PM, Brightwell, Adam <adam.brightwell@crunchydatasolutions.com> wrote:
All,
Attached is a patch for RLS that incorporates the following changes:
* Syntax:
- CREATE POLICY <policy_name> ON <table_name> FOR <command> USING ( <qual> )
- ALTER POLICY <policy_name> ON <table_name> FOR <command> USING ( <qual> )
- DROP POLICY <policy_name> ON <table_name> FOR <command>
* "row_security" GUC Setting - enable/disable row level security.
* BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if row_security GUC is set to OFF.
There are still some remaining issues but we hope to have those resolved soon.
Any comments or suggestions would be greatly appreciated.
Thanks,
Adam
On Mon, Jul 21, 2014 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:
>> I think we do want a way to modify policies. However, we tend to
>> avoid syntax that involves unnatural word order, as this certainly
>> does. Maybe it's better to follow the example of CREATE RULE and
>> CREATE TRIGGER and do something this instead:
>>
>> CREATE POLICY policy_name ON table_name USING quals;
>> ALTER POLICY policy_name ON table_name USING quals;
>> DROP POLICY policy_name ON table_name;
>>
>> The advantage of this is that you can regard "policy_name ON
>> table_name" as the identifier for the policy throughout the system.
>> You need some kind of identifier of that sort anyway to support
>> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
>
> Sounds good. I certainly think it makes a lot of sense to include the ALTER
> functionality, if for no other reason than ease of use.
>
> Another item to consider, though I believe it can come later, is per-action
> policies. Following the above suggested syntax, perhaps that might look
> like the following?
>
> CREATE POLICY policy_name ON table_name FOR action USING quals;
> ALTER POLICY policy_name ON table_name FOR action USING quals;
> DROP POLICY policy_name ON table_name FOR action;
That seems reasonable. You need to give some thought to what happens
if the user types:
CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
I guess you end up with q1 as the SELECT policy and q2 as the INSERT
policy. Similarly, had you typed:
CREATE POLICY pol1 ON tab1 USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
...then I guess you end up with q2 for INSERTs and q1 for everything
else. I'm wondering if it might be better, though, not to allow the
quals to be specified in CREATE POLICY, or else to allow multiple
actions. Otherwise, getting pg_dump to DTRT might be complicated.
Perhaps:
CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ]
USING quals ] ... );
where operation = SELECT | INSERT | UPDATE | DELETE
So that you can write things like:
CREATE POLICY pol1 ON tab1 (USING a = 1);
CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b
= 1, FOR DELETE USING c = 1);
And then, for ALTER, just allow one change at a time, syntax as you
proposed. That way each policy can be dumped as a single CREATE
statement.
> I was also giving some thought to the use of "POLICY", perhaps I am wrong,
> but it does seem it could be at risk of becoming ambiguous down the road. I
> can't think of any specific examples at the moment, but my concern is what
> happens if we wanted to add another "type" of policy, whatever that might
> be, later? Would it make more sense to go ahead and qualify this a little
> more with "ROW SECURITY POLICY"?
I think that's probably over-engineering. I'm not aware of anything
else we might add that would be likely to be called a policy, and if
we did add something we could probably call it something else instead.
And long command names are annoying.
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
Adam, all,
* Brightwell, Adam (adam.brightwell@crunchydatasolutions.com) wrote:
> Attached is a patch for RLS that was create against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
Many thanks for posting this. As others may realize already, I've reviewed and modified this patch already, working with Adam to get it ready. I'm continuing to review and test it, but in general I'm quite happy with how it's shaping up- additional review, testing, and comments are always appreciated though.
> Alter a RLS policy named <name> on <table>. Specifying a command is
> optional, if provided then the policy's command is changed otherwise it is
> left as-is. Specifying a role is optional, if provided then the policy's
> role is changed otherwise it is left as-is. The <condition> must always be
> provided and is therefore always replaced.
I'm pretty sure the <condition> is also optional in this patch (that was a late change that I made), but the documentation needs to be updated.
> * Plancache Invalidation: If a relation has a row-security policy and
> row-security is enabled then the invalidation will occur when either the
> row_security GUC is changed OR when a the current user changes. This
> invalidation ONLY takes place for cached plans where the target relation
> has a row security policy.
I know there was a lot of discussion about this previously, but I'm fine with the initial version simply invalidating plans which involve RLS-enabled relations and role changes. This patch doesn't create any regressions for individuals who are not using RLS. We can certainly look into improving this in the future to have per-role plan caches but it's a fair bit of additional non-trivial code that can be added independently.
> * Security Qual Expression: All row-security policies are OR'ed together.
This was also a point of much discussion, but I continue to feel this is the right approach for the initial version. We can add flexibility here later, if necessary, but OR'ing these together is in line with how role membership works today (you have rights for all roles you are a member of, pursuant to inherit/noinherit status, of course).
> * row_security GUC - enable/disable row level security.
Note that, as discussed, pg_dump will set row_security off, unless specifically asked to enable it. row_security will also be set to off when the user logging in is a superuser or does a 'set role' to a superuser. Currently, if a user logging in is *not* a superuser, or a 'set role' is done to a non-superuser, row_security gets re-set to enabled. This is one aspect of the patch that I think we should change (which is a matter of removing just a few lines of code and then updating the regression tests to do 'set row_security = on;' before running), because if you log in as a superuser and then 'set role' to a non-superuser, it occurs to me now (it didn't really when I wrote this originally) as a bit surprising that row_security gets set to 'on' when doing a 'set role'.
One thing that I really like about this approach is that a superuser can explicitly set 'row_security' to on and be able to see what happens. Clearly, in an environment of untrusted users, that could be dangerous, but it can also be an extremely useful way of testing things, particularly in development environments where everyone is a superuser. This deserves a bit more documentation also.
> * BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
> row_security GUC is set to OFF. If a user sets row_security to OFF and
> does not have this attribute, then an error is raised when attempting to
> query a relation with a RLS policy.
(note that the superuser is always considered to have the bypassrls attribute)
> * psql \d <table> support: psql describe support for listing policy
> information per table.
This works pretty well for me, but we may want to add some indication that RLS is on a table in the \dp listing.
Thanks!
Stephen
On Fri, Aug 29, 2014 at 8:16 PM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:
> Attached is a patch for RLS that was create against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
>
> Overview:
>
> This patch provides the capability to create multiple named row level
> security policies for a table on a per command basis and assign them to be
> applied to specific roles/users.
>
> It contains the following changes:
>
> * Syntax:
>
> CREATE POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Creates a RLS policy named <name> on <table>. Specifying a command is
> optional, but the default is ALL. Specifying a role is options, but the
> default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC is
> applied and a warning is raised.
>
> ALTER POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Alter a RLS policy named <name> on <table>. Specifying a command is
> optional, if provided then the policy's command is changed otherwise it is
> left as-is. Specifying a role is optional, if provided then the policy's
> role is changed otherwise it is left as-is. The <condition> must always be
> provided and is therefore always replaced.
This is not a full review of this patch; as we're mid-CommitFest, I
assume this will get added to the next CommitFest.
In earlier discussions, it was proposed (and I thought the proposal
was viewed favorably) that when enabling row-level security for a
table (i.e. before doing CREATE POLICY), you'd have to first flip the
table to a default-deny mode:
ALTER TABLE <name> ENABLE ROW LEVEL SECURITY;
In this design, I'm not sure what happens when there are policies for
some but not all users or some but not all actions. Does creating an
INSERT policy for one particular user cause a default-deny policy to
be turned on for all other users and all other operations? That might
be OK, but at the very least it should be documented more clearly.
Does dropping the very last policy then instantaneously flip the table
back to default-allow?
As far as I can tell from the patch, and that's not too far since I've
only looked at it briefly, there's a default-deny policy only if there is
at least 1 policy that applies to your user ID for this operation. As
far as making it easy to create a watertight combination of policies,
that seems like a bad plan.
+ elog(ERROR, "Table \"%s\" already has a policy named \"%s\"."
+ " Use a different name for the policy or to modify this policy"
+ " use ALTER POLICY %s ON %s USING (qual)",
+ RelationGetRelationName(target_table), stmt->policy_name,
+ RelationGetRelationName(target_table), stmt->policy_name);
+
That needs to be an ereport, be capitalized properly, and the hint, if
it's to be included at all, needs to go into errhint().
+ errhint("all roles are considered members of public")));
Wrong message style for a hint. Also, not sure that's actually
appropriate for a hint.
+ case EXPR_KIND_ROW_SECURITY:
+ return "ROW SECURITY";
This is quite simply bizarre. That's not the SQL syntax of anything.
+ | ROW SECURITY row_security_option
+ {
+ VariableSetStmt *n = makeNode(VariableSetStmt);
+ n->kind = VAR_SET_VALUE;
+ n->name = "row_security";
+ n->args = list_make1(makeStringConst($3, @3));
+ $$ = n;
+ }
I object to this. There's no reason that we should bloat the parser
to allow SET ROW SECURITY in lieu of SET row_security unless this is a
standard-mandated syntax with standard-mandated semantics, which I bet
it isn't.
/*
+ * Although only "on" and"off" are documented, we accept all likely variants of
+ * "on" and "off".
+ */
+ static const struct config_enum_entry row_security_options[] = {
+ {"off", ROW_SECURITY_OFF, false},
+ {"on", ROW_SECURITY_ON, false},
+ {"true", ROW_SECURITY_ON, true},
+ {"false", ROW_SECURITY_OFF, true},
+ {"yes", ROW_SECURITY_ON, true},
+ {"no", ROW_SECURITY_OFF, true},
+ {"1", ROW_SECURITY_ON, true},
+ {"0", ROW_SECURITY_OFF, true},
+ {NULL, 0, false}
+ };
Just make it a bool and you get all this for free.
+ /*
+ * is_rls_enabled -
+ * determines if row-security is enabled by checking the value of the system
+ * configuration "row_security".
+ */
+ bool
+ is_rls_enabled()
+ {
+ char const *rls_option;
+
+ rls_option = GetConfigOption("row_security", true, false);
+
+ return (strcmp(rls_option, "on") == 0);
+ }
Words fail me.
+ if (AuthenticatedUserIsSuperuser)
+ SetConfigOption("row_security", "off", PGC_INTERNAL, PGC_S_OVERRIDE);
Injecting this kind of magic into InitializeSessionUserId(),
SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
unacceptable to me.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hey Robert,
On my phone at the moment but wanted to reply.
I'm working through a few of these issues already actually (noticed as I've been going over it with Adam), but certainly appreciate the additional review. We've not posted another update quite yet but plan to shortly.
Thanks!
Stephen
Robert,
Alright, I can't help it so I'll try and reply from my phone for a couple of these. :)
On Wednesday, September 3, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
Already addressed.
Will address.
On Wednesday, September 3, 2014, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 29, 2014 at 8:16 PM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:
> Attached is a patch for RLS that was create against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
>
> Overview:
>
> This patch provides the capability to create multiple named row level
> security policies for a table on a per command basis and assign them to be
> applied to specific roles/users.
>
> It contains the following changes:
>
> * Syntax:
>
> CREATE POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Creates a RLS policy named <name> on <table>. Specifying a command is
> optional, but the default is ALL. Specifying a role is options, but the
> default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC is
> applied and a warning is raised.
>
> ALTER POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Alter a RLS policy named <name> on <table>. Specifying a command is
> optional, if provided then the policy's command is changed otherwise it is
> left as-is. Specifying a role is optional, if provided then the policy's
> role is changed otherwise it is left as-is. The <condition> must always be
> provided and is therefore always replaced.
This is not a full review of this patch; as we're mid-CommitFest, I
assume this will get added to the next CommitFest.
As per usual, the expectation is that the patch is reviewed and updated during the commitfest. Given that the commitfest isn't even over according to the calendar it seems a bit premature to talk about the next one, but certainly if it's not up to a committable level before the end of this commitfest then it'll be submitted for the next.
In earlier discussions, it was proposed (and I thought the proposal
was viewed favorably) that when enabling row-level security for a
table (i.e. before doing CREATE POLICY), you'd have to first flip the
table to a default-deny mode:
I do recall that (now that you remind me- clearly it had been lost during the subsequent discussion, from my point of view at least) and agree that it'd be useful. I don't believe it'll be difficult to address.
ALTER TABLE <name> ENABLE ROW LEVEL SECURITY;
Sounds reasonable to me.
+ elog(ERROR, "Table \"%s\" already has a policy named \"%s\"."
+ " Use a different name for the policy or to modify this policy"
+ " use ALTER POLICY %s ON %s USING (qual)",
+ RelationGetRelationName(target_table), stmt->policy_name,
+ RelationGetRelationName(target_table), stmt->policy_name);
+
That needs to be an ereport, be capitalized properly, and the hint, if
it's to be included at all, needs to go into errhint().
+ errhint("all roles are considered members of public")));
Wrong message style for a hint. Also, not sure that's actually
appropriate for a hint.
Fair enough. Will address.
+ case EXPR_KIND_ROW_SECURITY:
+ return "ROW SECURITY";
This is quite simply bizarre. That's not the SQL syntax of anything.
+ | ROW SECURITY row_security_option
+ {
+ VariableSetStmt *n = makeNode(VariableSetStmt);
+ n->kind = VAR_SET_VALUE;
+ n->name = "row_security";
+ n->args = list_make1(makeStringConst($3, @3));
+ $$ = n;
+ }
I object to this. There's no reason that we should bloat the parser
to allow SET ROW SECURITY in lieu of SET row_security unless this is a
standard-mandated syntax with standard-mandated semantics, which I bet
it isn't.
Agreed. Seemed like a nice idea but it's not necessary.
/*
+ * Although only "on" and"off" are documented, we accept all likely variants of
+ * "on" and "off".
+ */
+ static const struct config_enum_entry row_security_options[] = {
+ {"off", ROW_SECURITY_OFF, false},
+ {"on", ROW_SECURITY_ON, false},
+ {"true", ROW_SECURITY_ON, true},
+ {"false", ROW_SECURITY_OFF, true},
+ {"yes", ROW_SECURITY_ON, true},
+ {"no", ROW_SECURITY_OFF, true},
+ {"1", ROW_SECURITY_ON, true},
+ {"0", ROW_SECURITY_OFF, true},
+ {NULL, 0, false}
+ };
Just make it a bool and you get all this for free.
Right- holdover from an earlier attempt to make it more complicated but now we've simplified it and so it should just be a bool.
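As a point of reference for the exchange above, the spellings in the quoted enum table can all be handled by one small boolean parser, which is the kind of thing a plain boolean setting gives you without a hand-written table. The sketch below is a simplified stand-alone model with hypothetical names (`parse_bool_model`, `equals_ignore_case`); it is not PostgreSQL's own parser, which may accept more forms than the eight spellings listed here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Case-insensitive comparison of two NUL-terminated strings. */
static bool
equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical model of boolean parsing: returns true and sets *result
 * when value is one of the recognized spellings, false otherwise.
 */
static bool
parse_bool_model(const char *value, bool *result)
{
    static const char *const true_words[] = {"on", "true", "yes", "1"};
    static const char *const false_words[] = {"off", "false", "no", "0"};

    assert(value != NULL && result != NULL);

    for (size_t i = 0; i < sizeof(true_words) / sizeof(true_words[0]); i++)
    {
        if (equals_ignore_case(value, true_words[i]))
        {
            *result = true;
            return true;
        }
    }
    for (size_t i = 0; i < sizeof(false_words) / sizeof(false_words[0]); i++)
    {
        if (equals_ignore_case(value, false_words[i]))
        {
            *result = false;
            return true;
        }
    }
    return false;               /* unrecognized spelling */
}
```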
+ if (AuthenticatedUserIsSuperuser)
+ SetConfigOption("row_security", "off", PGC_INTERNAL, PGC_S_OVERRIDE);
Injecting this kind of magic into InitializeSessionUserId(),
SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
unacceptable to me.
I was struggling with the right way to address this and welcome suggestions. The primary issue is that I really want to support a superuser turning it on, so we can't simply have it disabled for all superusers all the time. The requirement that it not be enabled by default for superusers makes sense, but how far does that extend and how do we address upgrades? In particular, can we simply set row_security=off as a custom GUC setting when superusers are created or roles altered to be made superusers? Would we do that in pg_upgrade?
Thanks!
Stephen
On Wed, Sep 3, 2014 at 10:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> This is not a full review of this patch; as we're mid-CommitFest, I
>> assume this will get added to the next CommitFest.
>
> As per usual, the expectation is that the patch is reviewed and updated
> during the commitfest. Given that the commitfest isn't even over according
> to the calendar it seems a bit premature to talk about the next one, but
> certainly if it's not up to a commitable level before the end of this
> commitfest then it'll be submitted for the next.
The first version of this patch that was described as "ready for
review" was submitted on August 29th. The previous revision was
submitted on August 18th. Both of those dates are after the
CommitFest deadline of August 15th. So from where I sit this is not
timely submitted for this CommitFest. The last version before August
was submitted in April (there's a link to a version supposedly
submitted in June in the CommitFest application, but it doesn't point
to an email with a patch attached). I don't want to (and don't feel I
should have to) decide between dropping everything to review an
untimely-submitted patch and having it get committed with no review
from anyone who wasn't involved in writing it.
>> + if (AuthenticatedUserIsSuperuser)
>> + SetConfigOption("row_security", "off", PGC_INTERNAL,
>> PGC_S_OVERRIDE);
>>
>> Injecting this kind of magic into InitializeSessionUserId(),
>> SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
>> unacceptable to me.
>
> I was struggling with the right way to address this and welcome suggestions.
> The primary issue is that I really want to support a superuser turning it
> on, so we can't simply have it disabled for all superusers all the time. The
> requirement that it not be enabled by default for superusers makes sense,
> but how far does that extend and how do we address upgrades? In particular,
> can we simply set row_security=off as a custom GUC setting when superusers
> are created or roles altered to be made superusers? Would we do that in
> pg_upgrade?
I think you need to have the GUC have one default value, not one
default for superusers and another default for everybody else. I
previously proposed making the GUC on/off/force, with "on" meaning
"apply row-level security unless we have permission to bypass it,
either because we are the table owner or the superuser", "off" meaning
"error out if we would be forced to apply row-level security", and
"force" meaning "always apply row-level security even if we have
permission to bypass it". I still think that's a good proposal.
There may be other reasonable alternatives as well, but making changes
to one GUC magically change other GUCs under the hood isn't one of
them.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
All,
Attached is an updated patch taking into account the recommendations provided.
This patch was created against master at ad5d46a4494b0b480a3af246bb4227d9bdadca37
The following items have been addressed:
* Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - sets a flag on the table to allow for a default-deny capability. If RLS is enabled on a table that has no policies, then a default-deny policy is automatically applied. If RLS is disabled on a table that still has policies on it, then an error is raised. If DISABLE is accompanied by CASCADE, however, all policies are removed and no error is raised.
* Update CREATE POLICY to include WITH CHECK ( <expression> ). Therefore, the syntax is now as follows:
CREATE POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ USING ( <expression> ) ]
[ WITH CHECK ( <expression> ) ]
A WITH CHECK expression is required for creating an INSERT policy and is optional on UPDATE and ALL. The intended purpose is to provide a VIEW-like WITH CHECK OPTION functionality to RLS.
* Add ALTER POLICY <name> ON <table> RENAME TO <new_name> - renames a policy.
* Updated GUC row_security to allow ON | OFF | FORCE. Each option breaks down as follows:
- ON - RLS is applied to all roles except the table owner and superusers.
- OFF - RLS can be bypassed, but only by roles with BYPASSRLS. If the role does not have BYPASSRLS, then an error is raised.
- FORCE - RLS is applied to all roles, regardless of ownership, superuser or BYPASSRLS.
* Removed SET ROW SECURITY { ON | OFF } as requested.
* Removed all GetConfigOption for "row_security" GUC.
* Removed setting row_security GUC to OFF in SET SESSION/SET ROLE for superuser.
* Add psql \dp support. Displays RLS information in new column "Policies".
* Updated documentation.
* Other cleanup and improvements.
There are still some minor issues being worked through, but it is expected that those will be resolved soon. Any feedback, comments or suggestions on the above and in general would be greatly appreciated.
Thanks,
Adam
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
All,
* Brightwell, Adam (adam.brightwell@crunchydatasolutions.com) wrote:
> Attached is a updated patch taking into account the recommendations
> provided.
Alright, attached is a patch which I've been over in a great deal more detail, as we seem to have moved beyond grammar and simple functionality. It's been much reworked and improved (particularly in rewrite/rowsecurity.c, but also commands/policy.c).
Other improvements of note (not including the improvements made and mentioned by Adam previously):
    Lots of additional comments around what's happening
    Improved SGML documentation
    Better \d and \dp support
    Explicit function for check row-security requirements
    Correct handling for views run under policies
    Simplified changes to copy.c
    Use normal DROP and RENAME processes (eg: DropStmt and friends)
    Default-deny policy implementation, and regression tests
    Handle sub-queries in WITH CHECK
    Avoid duplicate policy application
    Corrected plancache invalidation
    Improved and additional regression tests
    tab completion
This addresses all of the comments brought up previously, as far as I'm aware, along with quite a few other issues which I found while doing my review and rework.
As always- testing, reviews, comments are welcome. We've done a fair bit of testing internally, but it's great to see how others are imagining and trying to use new capabilities like these- especially if they run into any problems! :)
This took quite a bit longer than I had expected, but I think the rework, review and additional testing was well worth it. I'm planning to break from this for a few days and resume helping with the commitfest more-or-less full-time until I have to head out for PostgresOpen.
Thanks!
Stephen
On Wed, September 10, 2014 23:50, Stephen Frost wrote:
> [rls_9-10-2014.patch]
I can't get this to apply; I attach the complaints of patch.
Erik Rijkers
Erik,
* Erik Rijkers (er@xs4all.nl) wrote:
> On Wed, September 10, 2014 23:50, Stephen Frost wrote:
> > [rls_9-10-2014.patch]
>
> I can't get this to apply; I attach the complaints of patch.
Thanks for taking a look at this!
[...]
> patching file src/include/catalog/catversion.h
> Hunk #1 FAILED at 53.
> 1 out of 1 hunk FAILED -- saving rejects to file src/include/catalog/catversion.h.rej
That's just the catversion bump- you can simply ignore it and everything should be fine.
Look forward to hearing how it works for you!
Thanks again,
Stephen
On Sat, Sep 6, 2014 at 2:54 AM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:
> * Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - set flag
> on table to allow for a default-deny capability. If RLS is enabled on a
> table and has no policies, then a default-deny policy is automatically
> applied. If RLS is disabled on table and the table still has policies on it
> then then an error is raised. Though if DISABLE is accompanied with
> CASCADE, then all policies will be removed and no error is raised.
This text doesn't make it clear that all of the cases have been
covered; in particular, you didn't specify whether an error is thrown
if you try to add a policy to a table with DISABLE ROW LEVEL SECURITY
in effect. Backing up a bit, I think there are two sensible designs
here:
1. Row level security policies can't exist for a table with DISABLE
ROW LEVEL SECURITY in effect. It sounds like this is what you have
implemented, modulo any hypothetical bugs. You can't add policies
without enabling RLS, and you can't disable RLS without dropping them
all.
2. Row level security policies can exist for a table with DISABLE ROW
LEVEL SECURITY in effect, but they don't do anything until RLS is
enabled. A possible advantage of this approach is that you could
*temporarily* shut off RLS for a table without having to drop all of
your policies and put them back. I kind of like this approach; we
have something similar for triggers, and I think it could be useful to
people.
If you stick with approach #1, make sure pg_dump is guaranteed to
enable RLS before applying the policies. And either way, you should
check that pg_dump behaves sanely in the case where there are circular
dependencies, like you have two tables A and B, and each has a RLS
policy that manages to use the other table's row-type. (You probably
also want to check that DROP and DROP .. CASCADE on either policy or
either table does the right thing in that situation, but that's
probably easier to get right.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > On Sat, Sep 6, 2014 at 2:54 AM, Brightwell, Adam > <adam.brightwell@crunchydatasolutions.com> wrote: > > * Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - set flag > > on table to allow for a default-deny capability. If RLS is enabled on a > > table and has no policies, then a default-deny policy is automatically > > applied. If RLS is disabled on table and the table still has policies on it > > then then an error is raised. Though if DISABLE is accompanied with > > CASCADE, then all policies will be removed and no error is raised. > > This text doesn't make it clear that all of the cases have been > covered; in particular, you didn't specify whether an error is thrown > if you try to add a policy to a table with DISABLE ROW LEVEL SECURITY > in effect. Backing up a bit, I think there are two sensible designs > here: Ah, yeah, the text could certainly be clearer. > 1. Row level security policies can't exist for a table with DISABLE > ROW LEVEL SECURITY in effect. It sounds like this is what you have > implemented, modulo any hypothetical bugs. You can't add policies > without enabling RLS, and you can't disable RLS without dropping them > all. Right, this was the approach we were taking. Specifically, adding policies would implicitly enable RLS for the relation. > 2. Row level security policies can exist for a table with DISABLE ROW > LEVEL SECURITY in effect, but they don't do anything until RLS is > enabled. A possible advantage of this approach is that you could > *temporarily* shut off RLS for a table without having to drop all of > your policies and put them back. I kind of like this approach; we > have something similar for triggers, and I think it could be useful to > people. I like the idea of being able to turn them off without dropping them. We have that with row_security = off, but that would only work for the owner or a superuser (or a user with bypassrls). 
This would allow disabling RLS temporarily for everything accessing the table. The one thing I'm wondering about with this design is- what happens when a policy is initially added? Currently, we automatically turn on RLS for the table when that happens. I'm not thrilled with the idea that you have to add policies AND turn on RLS explicitly- someone might add policies but then forget to turn RLS on.. > If you stick with approach #1, make sure pg_dump is guaranteed to > enable RLS before applying the policies. Currently, adding a policy automatically turns on RLS, so we don't have any issue with pg_dump from that perspective. Handling cases where RLS is disabled but policies exist would get more complicated for pg_dump if we keep the current idea that adding policies implicitly turns on RLS- it'd essentially have to go back and turn it off after the policies are added. Not a big fan of that either. > And either way, you should > that pg_dump behaves sanely in the case where there are circular > dependencies, like you have two table A and B, and each has a RLS > policy that manages to use the other table's row-type. (You probably > also want to check that DROP and DROP .. CASCADE on either policy or > either table does the right thing in that situation, but that's > probably easier to get right.) Agreed, we'll double-check that this is working. As these are attributes of the table which get added later on by pg_dump, similar to permissions, I'd think it'd all work fine, but good to make sure (and ditto with DROP/DROP CASCADE.. We have some checks for that, but good to make sure it works in a circular-dependency case too). If we want to be able to disable RLS w/o dropping the policies, then I think we have to completely de-couple the two and users would then have to both add policies AND turn on RLS to have RLS actually be enabled for a given table. I'm on the fence about that. Thoughts? Thanks! Stephen
On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote: >> 2. Row level security policies can exist for a table with DISABLE ROW >> LEVEL SECURITY in effect, but they don't do anything until RLS is >> enabled. A possible advantage of this approach is that you could >> *temporarily* shut off RLS for a table without having to drop all of >> your policies and put them back. I kind of like this approach; we >> have something similar for triggers, and I think it could be useful to >> people. > > I like the idea of being able to turn them off without dropping them. > We have that with row_security = off, but that would only work for the > owner or a superuser (or a user with bypassrls). This would allow > disabling RLS temporairly for everything accessing the table. > > The one thing I'm wondering about with this design is- what happens when > a policy is initially added? Currently, we automatically turn on RLS > for the table when that happens. I'm not thrilled with the idea that > you have to add policies AND turn on RLS explicitly- someone might add > policies but then forget to turn RLS on.. Whoa. I think that's a bad idea. I think the default value for RLS should be disabled, and users should have to turn it on explicitly if they want to get it. It's arguable whether the behavior if you try to create a policy beforehand should be (1) outright failure or (2) command accepted but no effect, but I think (3) automagically enable the feature is a POLA violation. When somebody adds a policy and then drops it again, they will expect to be back in the same state they started out in, and for good reason. > If we want to be able to disable RLS w/o dropping the policies, then I > think we have to completely de-couple the two and users would then have > both add policies AND turn on RLS to have RLS actually be enabled for a > given table. I'm on the fence about that. > > Thoughts? A strong +1 for doing just that. 
Look, anybody who is going to use row-level security but isn't careful enough to verify that it's actually working as desired after configuring it is a lost cause anyway. That is the moral equivalent of a locksmith who comes out and replaces a lock for you and at no point while he's there does he ever close the door and verify that it latches and won't reopen. I'm sure somebody has done that, but if a security breach results, surely everybody would agree that the locksmith is at fault, not the lock manufacturer. Personally, I have to test every GRANT and REVOKE I issue, because there's no error for granting a privilege that the target already has or revoking one they don't, and with group membership and PUBLIC it's quite easy to have not done what you thought you did. Fixing that might be worthwhile but it doesn't take away from the fact that, like any other configuration change you make, security-relevant changes need to be tested. There is another possible advantage of the explicit-enable approach as well, which is that you might want to create several policies and then turn them all on at once. With what you have now, creating the first policy will enable RLS on the table and then everyone who wasn't the beneficiary of that initial policy is locked out. Now, granted, you can probably get around that by doing all of the operations in one transaction, so it's a minor point. But it's still nice to think about being able to add several policies and then flip them on. If it doesn't work out, flip them off, adjust, and flip them back on again. Now, again, the core design issue, IMHO, is that the switch from default-allow to default-deny should be explicit and unmistakable, so the rest of this is just tinkering around the edges. But we might as well make those edges as nice as possible, and the usability of this approach feels good to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
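The explicit-enable workflow described in the message above (define several policies, then flip them on together, and flip them off again to adjust) can be sketched roughly as follows. This is only an illustration: it assumes the CREATE POLICY and ALTER TABLE ... ENABLE ROW LEVEL SECURITY syntax as later released in PostgreSQL, which may differ in detail from the patch under discussion, and the table `accounts`, the roles `alice` and `auditor`, and the policy names are all hypothetical.

```sql
-- Hypothetical roles and table, for illustration only.
CREATE ROLE alice;
CREATE ROLE auditor;
CREATE TABLE accounts (id serial PRIMARY KEY, owner_name text NOT NULL, balance numeric);
GRANT SELECT, UPDATE ON accounts TO alice, auditor;

-- Step 1: define several policies while RLS is still disabled.
-- Under the de-coupled design they exist but have no effect yet.
CREATE POLICY own_rows ON accounts FOR ALL TO alice
    USING (owner_name = current_user);
CREATE POLICY audit_read ON accounts FOR SELECT TO auditor
    USING (true);

-- Step 2: switch from default-allow to default-deny in one explicit step.
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- Step 3: if it doesn't work out, flip RLS off, adjust, and flip it back on.
ALTER TABLE accounts DISABLE ROW LEVEL SECURITY;
DROP POLICY audit_read ON accounts;
CREATE POLICY audit_read ON accounts FOR SELECT TO auditor
    USING (balance IS NOT NULL);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
```

Because creating a policy does not enable anything by itself, dropping it again returns the table to its previous state, which is the property argued for above.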
* Robert Haas (robertmhaas@gmail.com) wrote: > On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote: > > The one thing I'm wondering about with this design is- what happens when > > a policy is initially added? Currently, we automatically turn on RLS > > for the table when that happens. I'm not thrilled with the idea that > > you have to add policies AND turn on RLS explicitly- someone might add > > policies but then forget to turn RLS on.. > > Whoa. I think that's a bad idea. I think the default value for RLS > should be disabled, and users should have to turn it on explicitly if > they want to get it. It's arguable whether the behavior if you try to > create a policy beforehand should be (1) outright failure or (2) > command accepted but no effect, but I think (3) automagically enable > the feature is a POLA violation. When somebody adds a policy and then > drops it again, they will expect to be back in the same state they > started out in, and for good reason. Yeah, that I can agree with. Prior to adding the ability to explicitly enable RLS, that's what they got, but that's changed now that we've made the ability to turn on/off RLS half-way independent of policies. Also.. > > If we want to be able to disable RLS w/o dropping the policies, then I > > think we have to completely de-couple the two and users would then have > > both add policies AND turn on RLS to have RLS actually be enabled for a > > given table. I'm on the fence about that. > > A strong +1 for doing just that. Look, anybody who is going to use > row-level security but isn't careful enough to verify that it's > actually working as desired after configuring it is a lost cause > anyway. I had been thinking the same, which is why I was on the fence about if it was really an issue or not. This all amounts to actually making the patch smaller also, which isn't a bad thing. 
> Personally, I have to test every GRANT and REVOKE I > issue, because there's no error for granting a privilege that the > target already has or revoking one they don't, and with group > membership and PUBLIC it's quite easy to have not done what you > thought you did. Fixing that might be worthwhile but it doesn't take > away from the fact that, like any other configuration change you make, > security-relevant changes need to be tested. Hmm, pretty sure that'd end up going against the spec too, but that's a whole different discussion anyway. > There is another possible advantage of the explicit-enable approach as > well, which is that you might want to create several policies and then > turn them all on at once. With what you have now, creating the first > policy will enable RLS on the table and then everyone who wasn't the > beneficiary of that initial policy is locked out. Now, granted, you > can probably get around that by doing all of the operations in one > transaction, so it's a minor point. But it's still nice to think > about being able to add several policies and then flip them on. If it > doesn't work out, flip them off, adjust, and flip them back on again. > Now, again, the core design issue, IMHO, is that the switch from > default-allow to default-deny should be explicit and unmistakable, so > the rest of this is just tinkering around the edges. But we might as > well make those edges as nice as possible, and the usability of this > approach feels good to me. Fair enough. Thanks! Stephen
* Robert Haas (robertmhaas@gmail.com) wrote: > On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote: > > If we want to be able to disable RLS w/o dropping the policies, then I > > think we have to completely de-couple the two and users would then have > > both add policies AND turn on RLS to have RLS actually be enabled for a > > given table. I'm on the fence about that. > > > > Thoughts? > > A strong +1 for doing just that. Alright, updated patch attached which does just that (thanks to Adam for the updates for this and testing pg_dump- I just reviewed it and added some documentation updates and other minor improvements), and rebased to master. Also removed the catversion bump, so it should apply cleanly for people, for a while anyway. Thanks! Stephen
On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost@snowman.net> wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote: >> > If we want to be able to disable RLS w/o dropping the policies, then I >> > think we have to completely de-couple the two and users would then have >> > both add policies AND turn on RLS to have RLS actually be enabled for a >> > given table. I'm on the fence about that. >> > >> > Thoughts? >> >> A strong +1 for doing just that. > > Alright, updated patch attached which does just that (thanks to Adam > for the updates for this and testing pg_dump- I just reviewed it and > added some documentation updates and other minor improvements), and > rebased to master. Also removed the catversion bump, so it should apply > cleanly for people, for a while anyway. I specifically asked you to hold off on committing this until there was adequate opportunity for review, and explained my reasoning. You committed it anyway. I wonder if I am equally free to commit my own patches without properly observing the CommitFest process, because it would be a whole lot faster. My pg_background patches have been pending since before the start of the August CommitFest and I accepted that I would have to wait an extra two months to commit those because of a *clerical error*, namely my failure to actually add them to the CommitFest. This patch, on the other hand, was massively revised after the start of the CommitFest after many months of inactivity and committed with no thorough review by anyone who was truly independent of the development effort. It was then committed with no warning over a specific request, from another committer, that more time be allowed for review. I'm really disappointed by that. 
I feel I'm essentially getting punished for trying to follow what I understand to be the process, which has involved me doing huge amounts of review of other people's patches and waiting a very long time to get my own stuff committed, while you bull ahead with your own patches. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-09-19 11:53:06 -0400, Robert Haas wrote: > On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost@snowman.net> wrote: > > * Robert Haas (robertmhaas@gmail.com) wrote: > >> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote: > >> > If we want to be able to disable RLS w/o dropping the policies, then I > >> > think we have to completely de-couple the two and users would then have > >> > both add policies AND turn on RLS to have RLS actually be enabled for a > >> > given table. I'm on the fence about that. > >> > > >> > Thoughts? > >> > >> A strong +1 for doing just that. > > > > Alright, updated patch attached which does just that (thanks to Adam > > for the updates for this and testing pg_dump- I just reviewed it and > > added some documentation updates and other minor improvements), and > > rebased to master. Also removed the catversion bump, so it should apply > > cleanly for people, for a while anyway. > > I specifically asked you to hold off on committing this until there > was adequate opportunity for review, and explained my reasoning. You > committed it anyway. I was also rather surprised by the push. I wanted to write something about it, but: > This patch, on the other hand, was massively revised after the start > of the CommitFest after many months of inactivity and committed with > no thorough review by anyone who was truly independent of the > development effort. It was then committed with no warning over a > specific request, from another committer, that more time be allowed > for review. says it better. I think that's generally the case, but doubly so with sensitive stuff like this. > I wonder if I am equally free to commit my own patches without > properly observing the CommitFest process, because it would be a whole > lot faster. 
My pg_background patches have been pending since before > the start of the August CommitFest and I accepted that I would have to > wait an extra two months to commit those because of a *clerical > error*, namely my failure to actually add them to the CommitFest. FWIW, I think if a patch has been sent in time and has gotten a decent amount of review *and* agreement it's fair for a committer to push forward. That doesn't apply to this thread, but sometimes does for others. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 14 September 2014 16:38, Stephen Frost <sfrost@snowman.net> wrote:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > If we want to be able to disable RLS w/o dropping the policies, then I
> > think we have to completely de-couple the two and users would then have
> > both add policies AND turn on RLS to have RLS actually be enabled for a
> > given table. I'm on the fence about that.
> >
> > Thoughts?
>
> A strong +1 for doing just that.
Alright, updated patch attached which does just that (thanks to Adam
for the updates for this and testing pg_dump- I just reviewed it and
added some documentation updates and other minor improvements), and
rebased to master. Also removed the catversion bump, so it should apply
cleanly for people, for a while anyway.
This is testing what has been committed:
# create table colours (id serial, name text, visible boolean);
CREATE TABLE
# insert into colours (name, visible) values ('blue',true),('yellow',true),('ultraviolet',false),('green',true),('infrared',false);
INSERT 0 5
# create policy visible_colours on colours for all to joe using (visible = true);
CREATE POLICY
# grant all on colours to public;
GRANT
# grant all on sequence colours_id_seq to public;
GRANT
# alter table colours enable row level security ;
ALTER TABLE
\c - joe
> select * from colours;
 id |  name  | visible
----+--------+---------
  1 | blue   | t
  2 | yellow | t
  4 | green  | t
(3 rows)
> insert into colours (name, visible) values ('purple',true);
INSERT 0 1
> insert into colours (name, visible) values ('transparent',false);
ERROR: new row violates WITH CHECK OPTION for "colours"
DETAIL: Failing row contains (7, transparent, f).
> select * from pg_policies ;
   policyname    | tablename | roles | cmd |       qual       | with_check
-----------------+-----------+-------+-----+------------------+------------
 visible_colours | colours   | {joe} | ALL | (visible = true) |
(1 row)
There was no WITH CHECK OPTION.
Thom
Thom,
Thanks!
* Thom Brown (thom@linux.com) wrote:
> On 14 September 2014 16:38, Stephen Frost <sfrost@snowman.net> wrote:
> # create policy visible_colours on colours for all to joe using (visible =
> true);
> CREATE POLICY
[...]
> > insert into colours (name, visible) values ('transparent',false);
> ERROR: new row violates WITH CHECK OPTION for "colours"
> DETAIL: Failing row contains (7, transparent, f).
>
> > select * from pg_policies ;
>    policyname    | tablename | roles | cmd |       qual       | with_check
> -----------------+-----------+-------+-----+------------------+------------
>  visible_colours | colours   | {joe} | ALL | (visible = true) |
> (1 row)
>
> There was no WITH CHECK OPTION.
As I hope is clear if you look at the documentation- if the WITH CHECK clause is omitted, then the USING clause is used for both filtering and checking new records, otherwise you'd be able to add records which aren't visible to you.
Thanks!
Stephen
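To make the rule in the reply above concrete, here is a minimal sketch. It assumes the `colours` table and the role `joe` from the test session quoted in this thread, and it uses the CREATE POLICY syntax as later released; treat it as an illustration of the described behaviour rather than a statement about what the committed patch does in every case.

```sql
-- Assumes the "colours" table and role "joe" from the test session in this thread.
-- The policy in that session omitted WITH CHECK:
--   CREATE POLICY visible_colours ON colours FOR ALL TO joe USING (visible = true);
-- Per the explanation, that is treated as if the USING expression were also
-- the check for new rows, i.e. as if it had been written like this:
DROP POLICY IF EXISTS visible_colours ON colours;
CREATE POLICY visible_colours ON colours FOR ALL TO joe
    USING (visible = true)        -- filters the rows joe can see
    WITH CHECK (visible = true);  -- validates rows joe inserts or updates

ALTER TABLE colours ENABLE ROW LEVEL SECURITY;

-- Run as joe: expected to be rejected, because the new row fails the check.
INSERT INTO colours (name, visible) VALUES ('transparent', false);

-- Run as joe: expected to be accepted.
INSERT INTO colours (name, visible) VALUES ('purple', true);
```

This is why the session reports a WITH CHECK violation even though no WITH CHECK clause was written, and why the `with_check` column in `pg_policies` is empty.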
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost@snowman.net> wrote: > > Alright, updated patch attached which does just that (thanks to Adam > > for the updates for this and testing pg_dump- I just reviewed it and > > added some documentation updates and other minor improvements), and > > rebased to master. Also removed the catversion bump, so it should apply > > cleanly for people, for a while anyway. > > I specifically asked you to hold off on committing this until there > was adequate opportunity for review, and explained my reasoning. You > committed it anyway. Hum- my apologies, I honestly don't recall you specifically asking for it to be held off indefinitely. :( There was discussion back and forth, quite a bit of it with you, and I thank you for your help with that and certainly welcome any additional comments. > This patch, on the other hand, was massively revised after the start > of the CommitFest after many months of inactivity and committed with > no thorough review by anyone who was truly independent of the > development effort. It was then committed with no warning over a > specific request, from another committer, that more time be allowed > for review. I would not (nor do I feel that I did..) have committed it over a specific request to not do so from another committer. I had been hoping that there would be another review coming from somewhere, but there is always a trade-off between waiting longer to get a review ahead of a commit and having it committed and then available more easily for others to work with, review, and generally moving forward. > I'm really disappointed by that. I feel I'm essentially getting > punished for trying to follow what I understand to the process, which > has involved me doing huge amounts of review of other people's patches > and waiting a very long time to get my own stuff committed, while you > bull ahead with your own patches. 
While I wasn't public about it, I actually specifically discussed this question with others, a few times even, to try and make sure that I wasn't stepping out of line by moving forward. That said, I do see that Andres feels similarly. It certainly wasn't my intent to surprise anyone by it but simply to continue to move forward- in part, to allow me to properly break from it and work on other things, including reviewing other patches in the commitfest. I fear I've simply been overly focused on it these past few weeks, for a variety of reasons that would likely best be discussed at the pub. All-in-all, I feel appropriately chastised and certainly don't wish to be surprising fellow committers. Perhaps we can discuss at the dev meeting. Thanks, Stephen
On 19 September 2014 17:32, Stephen Frost <sfrost@snowman.net> wrote:
Thom,
Thanks!
* Thom Brown (thom@linux.com) wrote:
> On 14 September 2014 16:38, Stephen Frost <sfrost@snowman.net> wrote:
> # create policy visible_colours on colours for all to joe using (visible =
> true);
> CREATE POLICY
[...]
> > insert into colours (name, visible) values ('transparent',false);
> ERROR: new row violates WITH CHECK OPTION for "colours"
> DETAIL: Failing row contains (7, transparent, f).
>
> > select * from pg_policies ;
> policyname | tablename | roles | cmd | qual | with_check
> -----------------+-----------+-------+-----+------------------+------------
> visible_colours | colours | {joe} | ALL | (visible = true) |
> (1 row)
>
> There was no WITH CHECK OPTION.
As I hope is clear if you look at the documentation- if the WITH CHECK
clause is omitted, then the USING clause is used for both filtering and
checking new records, otherwise you'd be able to add records which
aren't visible to you.
I can see that now, although I do find the error message somewhat confusing. Firstly, it looks like "OPTION" is part of the parameter name, which it isn't.
Also, I seem to get an error message with the following:
# create policy nice_colours ON colours for all to joe using (visible = true) with check (name in ('blue','green','yellow'));
CREATE POLICY
\c - joe
> insert into colours (name, visible) values ('blue',false);
ERROR: function with OID 0 does not exist
And if this did work, but I only violated the USING clause, would this still say the WITH CHECK clause was the cause?
On 2014-09-19 12:38:39 -0400, Stephen Frost wrote: > I would not (nor do I feel that I did..) have committed it over a > specific request to not do so from another committer. I had been hoping > that there would be another review coming from somewhere, but there is > always a trade-off between waiting longer to get a review ahead of a > commit and having it committed and then available more easily for others > to work with, review, and generally moving forward. Sure, there is such a tradeoff. But others have to wait months to get enough review. The first revision of the patch in the form you committed was sent 2014-08-19, the first marked *ready for review* (not my words) is from 2014-08-30. 19 days really isn't very far along the tradeoff from waiting for a review to uselessly waiting. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Sep 19, 2014 at 12:38 PM, Stephen Frost <sfrost@snowman.net> wrote: >> This patch, on the other hand, was massively revised after the start >> of the CommitFest after many months of inactivity and committed with >> no thorough review by anyone who was truly independent of the >> development effort. It was then committed with no warning over a >> specific request, from another committer, that more time be allowed >> for review. > > I would not (nor do I feel that I did..) have committed it over a > specific request to not do so from another committer. Well, you're wrong. How could this email possibly have been any more clear? http://www.postgresql.org/message-id/CA+TgmoYA=uixXmN390SFgfQgVmLL-As5bJaL0oM7yrpPVwNPxQ@mail.gmail.com You can hardly tell me you didn't see that email when you incorporated the technical content into the next patch version. > While I wasn't public about it, I actually specifically discussed this > question with others, a few times even, to try and make sure that I > wasn't stepping out of line by moving forward. And yet you completely ignored the only public commentary on the issue, which was from me. I *should not have had* to object to this patch going in. It was clearly untimely for the August CommitFest, and as a long-time community member, you ought to know full well that any such patch should be resubmitted to a later CommitFest. This patch sat on the shelf for 4 months because you were too busy to work on it, and was committed 5 days from the last posted version, which version had zero review comments. If you didn't have time to work on it for 4 months, you can hardly expect everyone else who has an opinion to comment within 5 days. But, you know, because I could tell that you were fixated on pushing this patch through to commit quickly, I took the time to send you a message on that specific point, even though you should have known full well. In fact I took the time to send TWO. 
Here's the other one: http://www.postgresql.org/message-id/CA+TgmobqO0z87EiVfDEwjCac1dC4ahh5wCVoQoxrSaTeU1T-RA@mail.gmail.com > All-in-all, I feel appropriately chastised and certainly don't wish to > be surprising fellow committers. Perhaps we can discuss at the dev > meeting. No, I think we should discuss it right now, not nine months from now when the issue has faded from everyone's mind. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thom, * Thom Brown (thom@linux.com) wrote: > On 19 September 2014 17:32, Stephen Frost <sfrost@snowman.net> wrote: > > * Thom Brown (thom@linux.com) wrote: > > > On 14 September 2014 16:38, Stephen Frost <sfrost@snowman.net> wrote: > > > # create policy visible_colours on colours for all to joe using (visible > > = > > > true); > > > CREATE POLICY > > [...] > > > > insert into colours (name, visible) values ('transparent',false); > > > ERROR: new row violates WITH CHECK OPTION for "colours" > > > DETAIL: Failing row contains (7, transparent, f). > > > > > > > select * from pg_policies ; > > > policyname | tablename | roles | cmd | qual | > > with_check > > > > > -----------------+-----------+-------+-----+------------------+------------ > > > visible_colours | colours | {joe} | ALL | (visible = true) | > > > (1 row) > > > > > > There was no WITH CHECK OPTION. > > > > As I hope is clear if you look at the documentation- if the WITH CHECK > > clause is omitted, then the USING clause is used for both filtering and > > checking new records, otherwise you'd be able to add records which > > aren't visible to you. > > I can see that now, although I do find the error message somewhat > confusing. Firstly, it looks like "OPTION" is part of the parameter name, > which it isn't. Hmm, the notion of 'with check option' is from the SQL standard, which is why I felt the error message was appropriate as-is.. > Also, I seem to get an error message with the following: > > # create policy nice_colours ON colours for all to joe using (visible = > true) with check (name in ('blue','green','yellow')); > CREATE POLICY > > \c - joe > > > insert into colours (name, visible) values ('blue',false); > ERROR: function with OID 0 does not exist Now *that* one is interesting and I'll definitely go take a look at it. We added quite a few regression tests to try and make sure these things work. 
> And if this did work, but I only violated the USING clause, would this > still say the WITH CHECK clause was the cause? WITH CHECK applies for INSERT and UPDATE for the new records going into the table. You can't actually violate the USING clause for an INSERT as USING is for filtering records, not checking that records being added to the table are valid. To try and clarify- by explicitly setting both USING and WITH CHECK, you *are* able to INSERT records which are not visible to you. We felt that was an important capability to support. Thanks for taking a look at it! Stephen
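The distinction drawn above (USING filters existing rows, WITH CHECK validates new ones) can be sketched as follows. The policy name `submit_any_colour` is hypothetical, the `colours` table and role `joe` are taken from the test session in this thread, and the syntax is that of the later released feature. The comments describe the outcome the explanation implies, assuming no other policy on the table changes the result; they are not verified output.

```sql
-- Hypothetical policy on the thread's "colours" table for role "joe".
CREATE POLICY submit_any_colour ON colours FOR ALL TO joe
    USING (visible = true)   -- which existing rows joe may see
    WITH CHECK (true);       -- no restriction on rows joe adds

ALTER TABLE colours ENABLE ROW LEVEL SECURITY;

-- Run as joe: should be accepted, since only WITH CHECK applies to a new row.
INSERT INTO colours (name, visible) VALUES ('infrablue', false);

-- Run as joe: the row just inserted should be filtered out by USING,
-- so joe can add a record that is not visible to him afterwards.
SELECT * FROM colours WHERE name = 'infrablue';
```

An INSERT therefore never reports a USING violation: a failing new row is always attributed to the WITH CHECK expression, whether written explicitly or defaulted from USING.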
On 09/20/2014 12:38 AM, Stephen Frost wrote: > I would not (nor do I feel that I did..) have committed it over a > specific request to not do so from another committer. I had been hoping > that there would be another review coming from somewhere, but there is > always a trade-off between waiting longer to get a review ahead of a > commit and having it committed and then available more easily for others > to work with, review, and generally moving forward. Y'know what helps with that? Publishing clean git branches for non-trivial work, rather than just lobbing patches around. I'm finding the reliance on a patch based workflow increasingly frustrating for complex work, and wonder if it's time to revisit introducing a git repo+ref to the commitfest app. I find the need to find the latest patch on the list, apply it, and fix it up really frustrating. "git am --3way" helps a lot, but only if the patch is created with "git format-patch". Perhaps it's time to look at whether git can do more to help us with the testing and review process. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/20/2014 12:23 AM, Craig Ringer wrote: > On 09/20/2014 12:38 AM, Stephen Frost wrote: > >> I would not (nor do I feel that I did..) have committed it over a >> specific request to not do so from another committer. I had been hoping >> that there would be another review coming from somewhere, but there is >> always a trade-off between waiting longer to get a review ahead of a >> commit and having it committed and then available more easily for others >> to work with, review, and generally moving forward. > > Y'know what helps with that? Publishing clean git branches for > non-trivial work, rather than just lobbing patches around. > > I'm finding the reliance on a patch based workflow increasingly > frustrating for complex work, and wonder if it's time to revisit > introducing a git repo+ref to the commitfest app. > > I find the need to find the latest patch on the list, apply it, and fix > it up really frustrating. "git am --3way" helps a lot, but only if the > patch is created with "git format-patch". > > Perhaps it's time to look at whether git can do more to help us with the > testing and review process. We discussed this at the last developer meeting, without coming up with a written procedure. Your ideas can help ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 19 September 2014 17:54, Stephen Frost <sfrost@snowman.net> wrote: > > Thom, > > * Thom Brown (thom@linux.com) wrote: > > On 19 September 2014 17:32, Stephen Frost <sfrost@snowman.net> wrote: > > > * Thom Brown (thom@linux.com) wrote: > > > > On 14 September 2014 16:38, Stephen Frost <sfrost@snowman.net> wrote: > > > > # create policy visible_colours on colours for all to joe using (visible > > > = > > > > true); > > > > CREATE POLICY > > > [...] > > > > > insert into colours (name, visible) values ('transparent',false); > > > > ERROR: new row violates WITH CHECK OPTION for "colours" > > > > DETAIL: Failing row contains (7, transparent, f). > > > > > > > > > select * from pg_policies ; > > > > policyname | tablename | roles | cmd | qual | > > > with_check > > > > > > > -----------------+-----------+-------+-----+------------------+------------ > > > > visible_colours | colours | {joe} | ALL | (visible = true) | > > > > (1 row) > > > > > > > > There was no WITH CHECK OPTION. > > > > > > As I hope is clear if you look at the documentation- if the WITH CHECK > > > clause is omitted, then the USING clause is used for both filtering and > > > checking new records, otherwise you'd be able to add records which > > > aren't visible to you. > > > > I can see that now, although I do find the error message somewhat > > confusing. Firstly, it looks like "OPTION" is part of the parameter name, > > which it isn't. > > Hmm, the notion of 'with check option' is from the SQL standard, which > is why I felt the error message was appropriate as-is.. > > > Also, I seem to get an error message with the following: > > > > # create policy nice_colours ON colours for all to joe using (visible = > > true) with check (name in ('blue','green','yellow')); > > CREATE POLICY > > > > \c - joe > > > > > insert into colours (name, visible) values ('blue',false); > > ERROR: function with OID 0 does not exist > > Now *that* one is interesting and I'll definitely go take a look at it. 
> We added quite a few regression tests to try and make sure these things > work. > > > And if this did work, but I only violated the USING clause, would this > > still say the WITH CHECK clause was the cause? > > WITH CHECK applies for INSERT and UPDATE for the new records going into > the table. You can't actually violate the USING clause for an INSERT > as USING is for filtering records, not checking that records being added > to the table are valid. > > To try and clarify- by explicitly setting both USING and WITH CHECK, you > *are* able to INSERT records which are not visible to you. We felt that > was an important capability to support. I find it a bit of a limitation that I can't specify both INSERT and UPDATE for a policy. I'd want to be able to specify something like this: CREATE POLICY no_greys_allowed ON colours FOR INSERT, UPDATE WITH CHECK (name NOT IN ('grey','gray')); I would expect this to be rather common to prevent certain values making their way into a table. Instead I'd have to create 2 policies as it stands. In order to debug issues with accessing table data, perhaps it would be useful to output the name of the policy that was violated. If a table had 20 policies on, it could become time-consuming to debug. I keep getting tripped up by overlapping policies. On the one hand, I created a policy to ensure rows being added or selected have a "visible" column set to true. On the other hand, I have a policy that ensures that the name of a colour doesn't appear in a list. Policy 1 is violated until policy 2 is added: (using the table I created in a previous post on this thread...) # create policy must_be_visible ON colours for all to joe using (visible = true) with check (visible = true); CREATE POLICY \c - joe > insert into colours (name, visible) values ('pink',false); ERROR: new row violates WITH CHECK OPTION for "colours" DETAIL: Failing row contains (28, pink, f). 
\c - thom # create policy no_greys_allowed on colours for insert with check (name not in ('grey','gray')); CREATE POLICY \c - joe # insert into colours (name, visible) values ('pink',false); INSERT 0 1 I expected this to still trigger an error due to the first policy. Am I to infer from this that the policy model is permissive rather than restrictive? I've also attached a few corrections for the docs. Thom
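The USING / WITH CHECK behaviour discussed in this message can be sketched in C. This is only an illustration, not PostgreSQL's implementation: the types ColourRow and Policy and the functions effective_check, insert_allowed, is_visible and is_nice_colour are hypothetical names invented for this sketch. It assumes, following Stephen's explanation, that an omitted WITH CHECK clause falls back to the USING clause when new rows are checked, and that an explicit WITH CHECK is the only thing applied to an INSERT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row type mirroring the "colours" table used in the thread. */
typedef struct {
    const char *name;
    bool visible;
} ColourRow;

typedef bool (*RowPredicate)(const ColourRow *row);

/* Hypothetical policy; with_check == NULL models an omitted WITH CHECK. */
typedef struct {
    RowPredicate using_qual;  /* filters rows that already exist */
    RowPredicate with_check;  /* validates new rows; may be NULL */
} Policy;

/* Predicate applied to new rows: WITH CHECK if given, otherwise USING. */
RowPredicate effective_check(const Policy *policy)
{
    assert(policy != NULL && policy->using_qual != NULL);
    return policy->with_check != NULL ? policy->with_check
                                      : policy->using_qual;
}

/* An INSERT is judged only by the effective check, never by USING alone. */
bool insert_allowed(const Policy *policy, const ColourRow *new_row)
{
    return effective_check(policy)(new_row);
}

/* USING (visible = true) */
bool is_visible(const ColourRow *row)
{
    return row->visible;
}

/* WITH CHECK (name in ('blue','green','yellow')) */
bool is_nice_colour(const ColourRow *row)
{
    return strcmp(row->name, "blue") == 0
        || strcmp(row->name, "green") == 0
        || strcmp(row->name, "yellow") == 0;
}
```

Under these assumptions, a policy like visible_colours (USING only) rejects ('transparent', false), while a policy like nice_colours (explicit WITH CHECK) would accept ('blue', false) even though that row is not visible to the inserting user. The second case reflects the intended behaviour Stephen describes, not the "function with OID 0 does not exist" error reported above.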
Thom, * Thom Brown (thom@linux.com) wrote: > I find it a bit of a limitation that I can't specify both INSERT and > UPDATE for a policy. I'd want to be able to specify something like > this: > > CREATE POLICY no_greys_allowed > ON colours > FOR INSERT, UPDATE > WITH CHECK (name NOT IN ('grey','gray')); > > I would expect this to be rather common to prevent certain values > making their way into a table. Instead I'd have to create 2 policies > as it stands. That's not actually the case...

CREATE POLICY no_greys_allowed
ON colours
FOR ALL
USING (true) -- assuming this is what you intended
WITH CHECK (name NOT IN ('grey','gray'));

Right? That said, I'm not against the idea of supporting multiple commands with one policy (similar to how ALL is done). It wouldn't be difficult or much of a change- make the 'cmd' a bitfield instead. If others feel the same then I'll look at doing that. > In order to debug issues with accessing table data, perhaps it would > be useful to output the name of the policy that was violated. If a > table had 20 policies on, it could become time-consuming to debug. Good point. That'll involve a bit more as I'll need to look at the existing with check options structure, but I believe it's just adding the field to the structure, populating it when adding the WCO entries, and then checking for it in the ereport() call. The policy name is already stashed in the relcache entry, so it's already pretty easily available. > I keep getting tripped up by overlapping policies. On the one hand, I > created a policy to ensure rows being added or selected have a > "visible" column set to true. On the other hand, I have a policy that > ensures that the name of a colour doesn't appear in a list. Policy 1 > is violated until policy 2 is added: > > (using the table I created in a previous post on this thread...) 
> > # create policy must_be_visible ON colours for all to joe using > (visible = true) with check (visible = true); > CREATE POLICY > > \c - joe > > > insert into colours (name, visible) values ('pink',false); > ERROR: new row violates WITH CHECK OPTION for "colours" > DETAIL: Failing row contains (28, pink, f). > > \c - thom > > # create policy no_greys_allowed on colours for insert with check > (name not in ('grey','gray')); > CREATE POLICY > > \c - joe > > # insert into colours (name, visible) values ('pink',false); > INSERT 0 1 > > I expected this to still trigger an error due to the first policy. Am > I to infer from this that the policy model is permissive rather than > restrictive? That's correct and I believe pretty clear in the documentation- policies are OR'd together, just the same as how roles are handled. As a logged-in user, you have the rights of all of the roles you are a member of (subject to inheritance rules, of course), and similarly, you are able to view and add all rows which match any policy which applies to you (either through role membership or through different policies). > I've also attached a few corrections for the docs. Thanks! I'll plan to include these with a few other typos and the fix for the bug that Andres pointed out, once I finish testing (and doing another CLOBBER_CACHE_ALWAYS run..). Thanks again, Stephen
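The idea of making the policy's 'cmd' a bitfield, so that one policy can cover several commands, might look roughly like the following. This is a sketch under assumptions: the POLICY_CMD_* constants and the function policy_applies_to are hypothetical names made up here, and nothing in this message says how the patch actually stores the command.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values: one bit per command a policy can cover. */
enum {
    POLICY_CMD_SELECT = 1 << 0,
    POLICY_CMD_INSERT = 1 << 1,
    POLICY_CMD_UPDATE = 1 << 2,
    POLICY_CMD_DELETE = 1 << 3,
    /* ALL is simply every bit set, so it needs no special casing. */
    POLICY_CMD_ALL = POLICY_CMD_SELECT | POLICY_CMD_INSERT
                   | POLICY_CMD_UPDATE | POLICY_CMD_DELETE
};

/* Returns true if a policy whose command mask is policy_cmds covers cmd. */
bool policy_applies_to(unsigned int policy_cmds, unsigned int cmd)
{
    /* cmd must name exactly one command, i.e. have a single bit set. */
    assert(cmd != 0 && (cmd & (cmd - 1)) == 0);
    return (policy_cmds & cmd) != 0;
}
```

With such a mask, the "FOR INSERT, UPDATE" policy Thom asked for would carry POLICY_CMD_INSERT | POLICY_CMD_UPDATE, and the existing FOR ALL case becomes the mask with every bit set.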
On 25 September 2014 15:26, Stephen Frost <sfrost@snowman.net> wrote: >> I expected this to still trigger an error due to the first policy. Am >> I to infer from this that the policy model is permissive rather than >> restrictive? > > That's correct and I believe pretty clear in the documentation- policies > are OR'd together, just the same as how roles are handled. As a > logged-in user, you have the rights of all of the roles you are a member > of (subject to inheritance rules, of course), and similarly, you are > able to view and add all rows which match any policy which applies to > you (either through role membership or through different policies). Okay, I see now. This is a mindset issue for me as I'm looking at them like constraints rather than permissions. Thanks for the explanation. Thom
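The difference between the two mindsets (permissions that are OR'd together versus constraints that must all hold) can be shown with a small C sketch. It is illustrative only: ColourRow, RowCheck, passes_any, passes_all and the two check functions are hypothetical names, and the sketch assumes both policies from the example apply to the inserting user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row type mirroring the "colours" table used in the thread. */
typedef struct {
    const char *name;
    bool visible;
} ColourRow;

typedef bool (*RowCheck)(const ColourRow *row);

/* Permissive model: the row is accepted if ANY applicable check passes. */
bool passes_any(const RowCheck *checks, size_t n, const ColourRow *row)
{
    assert(checks != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (checks[i](row)) {
            return true;
        }
    }
    return false;
}

/* Constraint-like model: the row is accepted only if ALL checks pass. */
bool passes_all(const RowCheck *checks, size_t n, const ColourRow *row)
{
    assert(checks != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (!checks[i](row)) {
            return false;
        }
    }
    return true;
}

/* with check (visible = true) */
bool must_be_visible(const ColourRow *row)
{
    return row->visible;
}

/* with check (name not in ('grey','gray')) */
bool no_greys_allowed(const ColourRow *row)
{
    return strcmp(row->name, "grey") != 0 && strcmp(row->name, "gray") != 0;
}
```

For the row ('pink', false), must_be_visible fails but no_greys_allowed passes, so passes_any accepts it (matching the INSERT 0 1 seen above) while passes_all would reject it, which is the result Thom originally expected.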