Thread: [v9.4] row level security
The attached patch implements row-level security feature; that allows to enforce a pre-configured security policy on reference of tables with the row-level security policy. It enables to isolate records to be visible from others according to access control decision, usually done based on current user's credential. It will make sense to ensure correctness of security for SaaS style applications that typically share a common table for multiple users but correctness of access control was ensured with correctness of application itself. Here is not functional update since the last commit fest for v9.3 except for adjustment towards the latest master branch. So, the explanation below might be bored for someone. This feature enhances ALTER TABLE statement as follows: ALTER TABLE <tablename> SET ROW SECURITY FOR <command> TO (<expression>); ALTER TABLE <tablename> RESET ROW SECURITY FOR <command>; <command> := { ALL | SELECT | INSERT | UPDATE | DELETE } Right now, only "ALL" is supported command, even though syntax reserves future enhancement allows to set individual security policy for each command. The <expression> should be an expression that returns a bool value. It can reference any column in the target table and contain sub-query that reference another tables. Then, the pre-configured expression shall be added when the table is referenced. See below, it gives "(X % 2 = 1)" as security policy, user can see the record that has odd-number at X. The EXPLAIN output below shows this expression was automatically attached. postgres=> ALTER TABLE tbl SET ROW SECURITY FOR ALL TO (x % 2 = 1); ALTER TABLE postgres=> EXPLAIN SELECT * FROM tbl WHERE y like '%abc%'; QUERY PLAN ----------------------------------------------------------------- Subquery Scan on tbl (cost=0.00..28.52 rows=1 width=36) Filter: (tbl.y ~~ '%abc%'::text) -> Seq Scan on tbl tbl_1 (cost=0.00..28.45 rows=6 width=36) Filter: ((x % 2) = 1) (4 rows) An important point is, reference to a particular relation is replaced with a sub- query that has security policy expression and security barrier attribute. That prevent any (non leakproof) user given condition earlier than security poliy itself, thus, it ensures all records user can see is satisfies the security policy. On writer-queries, things to do are similar. It adds security policy expression on the scan phase of UPDATE or DELETE statement. Thus, only visible records are updatable or deletable. postgres=> EXPLAIN UPDATE tbl SET y = y || '_update' WHERE y like '%xyz%'; QUERY PLAN ----------------------------------------------------------------------- Update on tbl (cost=0.00..28.53 rows=1 width=42) -> Subquery Scan on tbl_1 (cost=0.00..28.53 rows=1 width=42) Filter: (tbl_1.y ~~ '%xyz%'::text) -> Seq Scan on tbl tbl_2 (cost=0.00..28.45 rows=6 width=42) Filter: ((x % 2) = 1) (5 rows) I had a relevant presentation at PGcon last month. I think its slides are good summary to know brief background of the long-standing problem. http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachment
Hackers, Please, oh please, won't someone review this? This patch has been 3 years in the making, so I suspect that it will NOT be a fast review. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
I was pointed out that the previous patch was not applied cleanly because of enhancement on pg_class system catalog, and related pg_dump portion was getting broken. The attached patch fixes this matters. Please reference this patch instead. Thanks, 2013/6/13 Kohei KaiGai <kaigai@kaigai.gr.jp>: > The attached patch implements row-level security feature; that allows to > enforce a pre-configured security policy on reference of tables with the > row-level security policy. > It enables to isolate records to be visible from others according to access > control decision, usually done based on current user's credential. > It will make sense to ensure correctness of security for SaaS style > applications that typically share a common table for multiple users but > correctness of access control was ensured with correctness of application > itself. > > Here is not functional update since the last commit fest for v9.3 except > for adjustment towards the latest master branch. > > So, the explanation below might be bored for someone. > > This feature enhances ALTER TABLE statement as follows: > ALTER TABLE <tablename> SET ROW SECURITY > FOR <command> TO (<expression>); > ALTER TABLE <tablename> RESET ROW SECURITY > FOR <command>; > <command> := { ALL | SELECT | INSERT | UPDATE | DELETE } > > Right now, only "ALL" is supported command, even though syntax reserves > future enhancement allows to set individual security policy for each command. > The <expression> should be an expression that returns a bool value. It can > reference any column in the target table and contain sub-query that reference > another tables. > Then, the pre-configured expression shall be added when the table is referenced. > > See below, it gives "(X % 2 = 1)" as security policy, user can see the record > that has odd-number at X. The EXPLAIN output below shows this expression > was automatically attached. > > postgres=> ALTER TABLE tbl SET ROW SECURITY FOR ALL TO (x % 2 = 1); > ALTER TABLE > postgres=> EXPLAIN SELECT * FROM tbl WHERE y like '%abc%'; > QUERY PLAN > ----------------------------------------------------------------- > Subquery Scan on tbl (cost=0.00..28.52 rows=1 width=36) > Filter: (tbl.y ~~ '%abc%'::text) > -> Seq Scan on tbl tbl_1 (cost=0.00..28.45 rows=6 width=36) > Filter: ((x % 2) = 1) > (4 rows) > > An important point is, reference to a particular relation is replaced > with a sub- > query that has security policy expression and security barrier attribute. > That prevent any (non leakproof) user given condition earlier than > security poliy > itself, thus, it ensures all records user can see is satisfies the > security policy. > > On writer-queries, things to do are similar. It adds security policy expression > on the scan phase of UPDATE or DELETE statement. Thus, only visible records > are updatable or deletable. > > postgres=> EXPLAIN UPDATE tbl SET y = y || '_update' WHERE y like '%xyz%'; > QUERY PLAN > ----------------------------------------------------------------------- > Update on tbl (cost=0.00..28.53 rows=1 width=42) > -> Subquery Scan on tbl_1 (cost=0.00..28.53 rows=1 width=42) > Filter: (tbl_1.y ~~ '%xyz%'::text) > -> Seq Scan on tbl tbl_2 (cost=0.00..28.45 rows=6 width=42) > Filter: ((x % 2) = 1) > (5 rows) > > I had a relevant presentation at PGcon last month. I think its slides > are good summary > to know brief background of the long-standing problem. > http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf > > Thanks, > -- > KaiGai Kohei <kaigai@kaigai.gr.jp> -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachment
Sorry, this included a garbage chunk when I made a patch. The attached one is corrected revision. Please see this patch instead. Thanks, 2013/7/8 Kohei KaiGai <kaigai@kaigai.gr.jp>: > I was pointed out that the previous patch was not applied cleanly because of > enhancement on pg_class system catalog, and related pg_dump portion was > getting broken. > > The attached patch fixes this matters. Please reference this patch instead. > Thanks, > > 2013/6/13 Kohei KaiGai <kaigai@kaigai.gr.jp>: >> The attached patch implements row-level security feature; that allows to >> enforce a pre-configured security policy on reference of tables with the >> row-level security policy. >> It enables to isolate records to be visible from others according to access >> control decision, usually done based on current user's credential. >> It will make sense to ensure correctness of security for SaaS style >> applications that typically share a common table for multiple users but >> correctness of access control was ensured with correctness of application >> itself. >> >> Here is not functional update since the last commit fest for v9.3 except >> for adjustment towards the latest master branch. >> >> So, the explanation below might be bored for someone. >> >> This feature enhances ALTER TABLE statement as follows: >> ALTER TABLE <tablename> SET ROW SECURITY >> FOR <command> TO (<expression>); >> ALTER TABLE <tablename> RESET ROW SECURITY >> FOR <command>; >> <command> := { ALL | SELECT | INSERT | UPDATE | DELETE } >> >> Right now, only "ALL" is supported command, even though syntax reserves >> future enhancement allows to set individual security policy for each command. >> The <expression> should be an expression that returns a bool value. It can >> reference any column in the target table and contain sub-query that reference >> another tables. >> Then, the pre-configured expression shall be added when the table is referenced. >> >> See below, it gives "(X % 2 = 1)" as security policy, user can see the record >> that has odd-number at X. The EXPLAIN output below shows this expression >> was automatically attached. >> >> postgres=> ALTER TABLE tbl SET ROW SECURITY FOR ALL TO (x % 2 = 1); >> ALTER TABLE >> postgres=> EXPLAIN SELECT * FROM tbl WHERE y like '%abc%'; >> QUERY PLAN >> ----------------------------------------------------------------- >> Subquery Scan on tbl (cost=0.00..28.52 rows=1 width=36) >> Filter: (tbl.y ~~ '%abc%'::text) >> -> Seq Scan on tbl tbl_1 (cost=0.00..28.45 rows=6 width=36) >> Filter: ((x % 2) = 1) >> (4 rows) >> >> An important point is, reference to a particular relation is replaced >> with a sub- >> query that has security policy expression and security barrier attribute. >> That prevent any (non leakproof) user given condition earlier than >> security poliy >> itself, thus, it ensures all records user can see is satisfies the >> security policy. >> >> On writer-queries, things to do are similar. It adds security policy expression >> on the scan phase of UPDATE or DELETE statement. Thus, only visible records >> are updatable or deletable. >> >> postgres=> EXPLAIN UPDATE tbl SET y = y || '_update' WHERE y like '%xyz%'; >> QUERY PLAN >> ----------------------------------------------------------------------- >> Update on tbl (cost=0.00..28.53 rows=1 width=42) >> -> Subquery Scan on tbl_1 (cost=0.00..28.53 rows=1 width=42) >> Filter: (tbl_1.y ~~ '%xyz%'::text) >> -> Seq Scan on tbl tbl_2 (cost=0.00..28.45 rows=6 width=42) >> Filter: ((x % 2) = 1) >> (5 rows) >> >> I had a relevant presentation at PGcon last month. I think its slides >> are good summary >> to know brief background of the long-standing problem. >> http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf >> >> Thanks, >> -- >> KaiGai Kohei <kaigai@kaigai.gr.jp> > > > > -- > KaiGai Kohei <kaigai@kaigai.gr.jp> -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachment
With the current HEAD and v3 patch I'm seeing the following error with "make check".
------
============== creating temporary installation ==============
============== initializing database system ==============
pg_regress: initdb failed
Examine /usr/local/src/postgres.patched.v3/src/test/regress/log/initdb.log for the reason.
Command was: "/usr/local/src/postgres.patched.v3/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb" -D "/usr/local/src/postgres.patched.v3/src/test/regress/./tmp_check/data" -L "/usr/local/src/postgres.patched.v3/src/test/regress/./tmp_check/install//usr/local/pgsql/share" --noclean --nosync > "/usr/local/src/postgres.patched.v3/src/test/regress/log/initdb.log" 2>&1
make[1]: *** [check] Error 2
make[1]: Leaving directory `/usr/local/src/postgres.patched.v3/src/test/regress'
make: *** [check] Error 2
__________________________________________________________________________________
Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
Mike.Blackwell@rrd.com
http://www.rrdonnelley.com
Current head 4cbe3ac3e86790d05c569de4585e5075a62a9b41 -> patch applies correct (only change needed in parallel_schedule). However it fails on own regression tests (other tests pass). Regards, Karol
On 7/18/13 7:57 PM, Karol Trzcionka wrote: > Current head 4cbe3ac3e86790d05c569de4585e5075a62a9b41 -> patch applies > correct (only change needed in parallel_schedule). > However it fails on own regression tests (other tests pass). I got a rejected hunk in src/backend/nodes/nodeFuncs.c as well as that parallel_schedule issue. Maybe you didn't get the nodeFuncs change but didn't notice that? That might explain why the tests didn't work for you either. Attached is an updated patch where I tried to only fix the two small hunks of bit rot. I get "All 140 tests passed" here, on a Mac no less. I did a brief code scan through the patch just to get a feel for how the feature is put together, and what you'd need to know for a deeper review. (I'm trying to get customer time approved to work on this a lot more) The code was easier to follow than I expected. The way it completely avoids even getting into the security label integration yet seems like a successful design partitioning. This isn't nearly as scary as the SEPostgres patches. There are some useful looking utility functions that dump information about what's going on too. The bulk of the complexity is how the feature modifies query nodes to restrict what rows come through them. Some familiarity with that part of the code is what you'd need to take on reviewing this in detail. That and a week of time to spend trudging through it. If anyone is looking for an educational challenge on query execution, marching through all of these changes to validate they work as expected would do that. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Attachment
* Greg Smith (greg@2ndQuadrant.com) wrote: > On 7/18/13 7:57 PM, Karol Trzcionka wrote: > >Current head 4cbe3ac3e86790d05c569de4585e5075a62a9b41 -> patch applies > >correct (only change needed in parallel_schedule). > >However it fails on own regression tests (other tests pass). > > I got a rejected hunk in src/backend/nodes/nodeFuncs.c as well as > that parallel_schedule issue. Maybe you didn't get the nodeFuncs > change but didn't notice that? That might explain why the tests > didn't work for you either. The nodeFuncs.c hunk seems likely to have been impacted by the patch I committed today (WITH CHECK OPTION), so I doubt that was the issue. > Attached is an updated patch where I tried to only fix the two small > hunks of bit rot. I get "All 140 tests passed" here, on a Mac no > less. Thanks for updating the patch, I ran into the failed hunks too and expected to have to deal with them. :) > I did a brief code scan through the patch just to get a feel for how > the feature is put together, and what you'd need to know for a > deeper review. That would be extremely helpful.. Wasn't there a wiki page about this feature which might also help? Bigger question- is it correct for the latest version of the patch? > (I'm trying to get customer time approved to work on > this a lot more) The code was easier to follow than I expected. > The way it completely avoids even getting into the security label > integration yet seems like a successful design partitioning. This > isn't nearly as scary as the SEPostgres patches. There are some > useful looking utility functions that dump information about what's > going on too. > > The bulk of the complexity is how the feature modifies query nodes > to restrict what rows come through them. Some familiarity with that > part of the code is what you'd need to take on reviewing this in > detail. That and a week of time to spend trudging through it. If > anyone is looking for an educational challenge on query execution, > marching through all of these changes to validate they work as > expected would do that. I'm hoping to find time this weekend to look into this patch myself, but the weekend is also filling up with other activities, so we'll see. Thanks! Stephen
On 7/18/13 11:03 PM, Stephen Frost wrote: > Wasn't there a wiki page about this > feature which might also help? Bigger question- is it correct for the > latest version of the patch? https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris that may or may not be useful here. This useful piece was just presented at PGCon: http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf That is very up to date intro to the big picture issues. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On 19 July 2013 04:09, Greg Smith <greg@2ndquadrant.com> wrote: > On 7/18/13 11:03 PM, Stephen Frost wrote: >> >> Wasn't there a wiki page about this >> feature which might also help? Bigger question- is it correct for the >> latest version of the patch? > > > https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris > that may or may not be useful here. > > This useful piece was just presented at PGCon: > http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf > That is very up to date intro to the big picture issues. > Hi, I took a look at this patch too. I didn't read all the code in detail, but there was one area I wondered about --- is it still necessary to add the new field rowsec_relid to RangeTblEntry? As far as I can see, it is only used in the new function getrelid() which replaces the old macro of the same name, so that it can work if it is passed the index of the sourceRelation subquery RTE instead of the resultRelation. This seems to be encoding a specific assumption that a subquery RTE is always going to come from row-level security expansion. Is it the case that this can only happen from expand_targetlist(), in which case why not pass both source_relation and result_relation to expand_targetlist(), which I think will make that code neater. As it stands, what expand_targetlist() sees as result_relation is actually source_relation, which getrelid() then handles specially. Logically I think expand_targetlist() should pass the actual result_relation to getrelid(), opening the target table, and then pass source_relation to makeVar() when building new TLEs. With that change to expand_targetlist(), the change to getrelid() may be unnecessary, and then also the new rowsec_relid field in RangeTblEntry may not be needed. Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > I took a look at this patch too. I didn't read all the code in detail, > but there was one area I wondered about --- is it still necessary to > add the new field rowsec_relid to RangeTblEntry? > As far as I can see, it is only used in the new function getrelid() > which replaces the old macro of the same name, so that it can work if > it is passed the index of the sourceRelation subquery RTE instead of > the resultRelation. This seems to be encoding a specific assumption > that a subquery RTE is always going to come from row-level security > expansion. I haven't read the patch at all, but I would opine that anything that's changing the behavior of getrelid() is broken by definition. Something is being done at the wrong level of abstraction if you think that you need to do that. regards, tom lane
2013/7/19 Stephen Frost <sfrost@snowman.net>: > * Greg Smith (greg@2ndQuadrant.com) wrote: >> On 7/18/13 7:57 PM, Karol Trzcionka wrote: >> >Current head 4cbe3ac3e86790d05c569de4585e5075a62a9b41 -> patch applies >> >correct (only change needed in parallel_schedule). >> >However it fails on own regression tests (other tests pass). >> >> I got a rejected hunk in src/backend/nodes/nodeFuncs.c as well as >> that parallel_schedule issue. Maybe you didn't get the nodeFuncs >> change but didn't notice that? That might explain why the tests >> didn't work for you either. > > The nodeFuncs.c hunk seems likely to have been impacted by the patch I > committed today (WITH CHECK OPTION), so I doubt that was the issue. > >> Attached is an updated patch where I tried to only fix the two small >> hunks of bit rot. I get "All 140 tests passed" here, on a Mac no >> less. > > Thanks for updating the patch, I ran into the failed hunks too and > expected to have to deal with them. :) > Thanks for pointing out this problem. I synchronized my local master with the upstream one, then adjusted the row-security branch. I'll submit the patch soon, with fixing-up the portion that Tom pointed out. Thanks,
2013/7/19 Dean Rasheed <dean.a.rasheed@gmail.com>: > On 19 July 2013 04:09, Greg Smith <greg@2ndquadrant.com> wrote: >> On 7/18/13 11:03 PM, Stephen Frost wrote: >>> >>> Wasn't there a wiki page about this >>> feature which might also help? Bigger question- is it correct for the >>> latest version of the patch? >> >> >> https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris >> that may or may not be useful here. >> >> This useful piece was just presented at PGCon: >> http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf >> That is very up to date intro to the big picture issues. >> > > Hi, > > I took a look at this patch too. I didn't read all the code in detail, > but there was one area I wondered about --- is it still necessary to > add the new field rowsec_relid to RangeTblEntry? > > As far as I can see, it is only used in the new function getrelid() > which replaces the old macro of the same name, so that it can work if > it is passed the index of the sourceRelation subquery RTE instead of > the resultRelation. This seems to be encoding a specific assumption > that a subquery RTE is always going to come from row-level security > expansion. > > Is it the case that this can only happen from expand_targetlist(), in > which case why not pass both source_relation and result_relation to > expand_targetlist(), which I think will make that code neater. As it > stands, what expand_targetlist() sees as result_relation is actually > source_relation, which getrelid() then handles specially. Logically I > think expand_targetlist() should pass the actual result_relation to > getrelid(), opening the target table, and then pass source_relation to > makeVar() when building new TLEs. > > With that change to expand_targetlist(), the change to getrelid() may > be unnecessary, and then also the new rowsec_relid field in > RangeTblEntry may not be needed. > Hmm. I didn't have this idea. It seems to me fair enough and kills necessity to enhance RangeTblEntry and getrelid() indeed. I try to fix up this implementation according to your suggestion. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On 7/20/13 10:08 AM, Kohei KaiGai wrote: >> With that change to expand_targetlist(), the change to getrelid() may >> be unnecessary, and then also the new rowsec_relid field in >> RangeTblEntry may not be needed. >> > Hmm. I didn't have this idea. It seems to me fair enough and kills > necessity to enhance RangeTblEntry and getrelid() indeed. > I try to fix up this implementation according to your suggestion. Great, there's one useful bit of feedback for you then, and that seems to address Tom's getrelid concern too. For the active CommitFest, I don't see any place we can go with this right now except for "Returned with Feedback". We really need more reviewers willing to put a significant amount of time into going through this code. Anyone who would like to see RLS committed should consider how you can get resources to support a skilled PostgreSQL reviewer spending time on it. (This is a bit much to expect new reviewers to chew on usefully) I've been working on that here, but I don't have anything I can publicly commit to yet. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
> (This is a bit much to expect new reviewers to chew on usefully) I've been > working on that here, but I don't have anything I can publicly commit to > yet. > True that. I spent some time on it, but couldn't come up with anything useful. Mike's extensive testing seems good enough for me, though. One thing I was a bit concerned about (I don't know if it has been resolved already, or if it isn't a concern) is that there was an issue about a security part being visible in the syntax (or something like that, my memory isn't too good today morning). Regards, Atri Regards, Atri l'apprenant
On 07/22/2013 01:27 PM, Greg Smith wrote: > > Anyone who would like to see RLS committed should consider how you can > get resources to support a skilled PostgreSQL reviewer spending time on > it. (This is a bit much to expect new reviewers to chew on usefully) > I've been working on that here, but I don't have anything I can publicly > commit to yet. Apparently it's a little much for experienced reviewers to chew on, too, since I've been trying to get someone to review it since the beginning of the Commitfest. While I understand the call for "resources", this is a bit unfair to KaiGai, who has put in his time reviewing other people's patches. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 7/23/13 12:10 PM, Josh Berkus wrote: > Apparently it's a little much for experienced reviewers to chew on, too, > since I've been trying to get someone to review it since the beginning > of the Commitfest. It's more than the available experienced reviewers are willing to chew on fully as volunteers. The reward for spending review time is pretty low right now. > While I understand the call for "resources", this is a bit unfair to > KaiGai, who has put in his time reviewing other people's patches. If you read Dean Rasheed's comments, it's obvious he put a useful amount of work into his review suggestions. It is not the case here that KaiGai worked on a review and got nothing in return. Unfortunately that has happened to this particular patch before, but the community did a little better this time. The goal of the CF is usually to reach one of these outcomes for every submission: -The submission is ready for commit -The submission needs improvement in X Review here went far enough to identify an X to be improved. It was a big enough issue that a rewrite is needed, commit at this time isn't possible, and now KaiGai has something we hope is productive he can continue working on. That's all we can really promise here--that if we say something isn't ready for commit yet, that there's some feedback as to why. I would have preferred to see multiple X issues identified here, given that we know there has to be more than just the one in a submission of this size. The rough fairness promises of the CommitFest seem satisfied to me though. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Greg, > It's more than the available experienced reviewers are willing to chew > on fully as volunteers. The reward for spending review time is pretty > low right now. Short of paying for review time, I don't think we have another solution for getting the "big patches" reviewed, except to rely on the major contributors who are paid full-time to hack Postgres. You know as well as me that, as consultants, we can get clients to pay for 10% extra time for review in the course of developing a feature, but the kind of time which patches like Row Security, Changesets, or other "big patches" need nobody is going to pay for on a contract basis. And nobody who is doing this in their "spare time" has that kind of block. So I don't think there's any good solution for the "big patches". I do think our project could do much more to recruit reviewers for the small-medium patches, to take workload off the core contributors in general. Historically, however, this project (and the contributors on this list) has made material decisions not to encourage or recruit new people as reviewers, and has repeatedly stated that reviewers are not important. Until that changes, we are not going to get more reviewers (and I'm not going to have much sympathy for existing contributors who say they have no time for review). If we want more reviewers and people spending more time on review, then we need to give reviewers the *same* respect and the *same* rewards that feature contributors get. Not something else, the exact same. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 7/23/13 2:30 PM, Josh Berkus wrote: > You know as well as me that, as consultants, we can get clients to pay for 10% extra time > for review in the course of developing a feature Before this number gets quoted anywhere, I think it's on the low side. I've spent a good bit of time measuring how much time it takes to do a fair offsetting review--one where you put as much time in as it takes to review your submission--and I keep getting numbers more in the 20 to 25% range. The work involved to do a high quality review takes a while. I happen to think the review structure is one of the most important components to PostgreSQL release quality. It used to be a single level review with just the committers, now it's a two level structure. The reason the Postgres code is so good isn't that the submitted development is inherently any better than other projects. There's plenty of bogus material submitted here. It's the high standards for review and commit that are the key filter. The importance of the process to the result isn't weighed as heavily as I think it should be. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On 07/23/2013 03:34 PM, Greg Smith wrote: > I happen to think the review structure is one of the most important > components to PostgreSQL release quality. It used to be a single level > review with just the committers, now it's a two level structure. The > reason the Postgres code is so good isn't that the submitted development > is inherently any better than other projects. There's plenty of bogus > material submitted here. It's the high standards for review and commit > that are the key filter. The importance of the process to the result > isn't weighed as heavily as I think it should be. I think we're in violent agreement here. Now, we just need to convince everyone else on this list. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Tue, Jul 23, 2013 at 11:30:14AM -0700, Josh Berkus wrote: > Greg, > > > It's more than the available experienced reviewers are willing to chew > > on fully as volunteers. The reward for spending review time is pretty > > low right now. > > Short of paying for review time, I don't think we have another solution > for getting the "big patches" reviewed, except to rely on the major > contributors who are paid full-time to hack Postgres. You know as well > as me that, as consultants, we can get clients to pay for 10% extra time > for review in the course of developing a feature, but the kind of time > which patches like Row Security, Changesets, or other "big patches" need > nobody is going to pay for on a contract basis. And nobody who is doing > this in their "spare time" has that kind of block. > > So I don't think there's any good solution for the "big patches". Let me echo Josh's comments above --- in the early years, we had trouble creating new features that required more than 1-2 weekends of development. We now have enough full-time developers that this is not a problem, but now it seems features requiring more than a weekend to _review_ are a problem, so full-time folks are again required here. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 7/20/13 10:08 AM, Kohei KaiGai wrote: > Hmm. I didn't have this idea. It seems to me fair enough and kills > necessity to enhance RangeTblEntry and getrelid() indeed. > I try to fix up this implementation according to your suggestion. How is that going? I'm going to do a serious review of this myself over the next few weeks. I have a good chunk of time set aside for it as part of a larger project. I'm hoping to get more people here involved in that effort too, starting in the November CF if that works out. I've been trying to catch up with your larger plan for this feature for 9.4. You made this comment earlier: > Also, I'd like to have discussion for this feature in earlier half of> v9.4 to keep time for the remaining features, suchas check on> writer-side, integration with selinux, and so on Is any of that code around yet? I see that you have split your submissions so that a smaller program can be reviewed today. I'd like to start taking a look at the next step too though. For the project I'm starting to work on here, getting the integration with labeling also done is a very important thing to target for 9.4. It would be nice to see how that fits together today, even if the code for it isn't being reviewed heavily yet. I don't quite understand yet what's missing on the writer side. If you could help explain what's missing there, I would like to read about that. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
btw, there is serious problem with row-level security and constraints. For example, user with low security level could use unique constraint to know about existence of a row with higher security. I don't know, what is the best practice to avoid this.
On Wed, Aug 28, 2013 at 1:37 AM, Greg Smith <greg@2ndquadrant.com> wrote:
On 7/20/13 10:08 AM, Kohei KaiGai wrote:Hmm. I didn't have this idea. It seems to me fair enough and kills
necessity to enhance RangeTblEntry and getrelid() indeed.
I try to fix up this implementation according to your suggestion.
How is that going? I'm going to do a serious review of this myself over the next few weeks. I have a good chunk of time set aside for it as part of a larger project. I'm hoping to get more people here involved in that effort too, starting in the November CF if that works out.
I've been trying to catch up with your larger plan for this feature for 9.4. You made this comment earlier:
> Also, I'd like to have discussion for this feature in earlier half of
> v9.4 to keep time for the remaining features, such as check on
> writer-side, integration with selinux, and so on
Is any of that code around yet? I see that you have split your submissions so that a smaller program can be reviewed today. I'd like to start taking a look at the next step too though. For the project I'm starting to work on here, getting the integration with labeling also done is a very important thing to target for 9.4. It would be nice to see how that fits together today, even if the code for it isn't being reviewed heavily yet.
I don't quite understand yet what's missing on the writer side. If you could help explain what's missing there, I would like to read about that.
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2013/8/27 Greg Smith <greg@2ndquadrant.com>: > On 7/20/13 10:08 AM, Kohei KaiGai wrote: >> >> Hmm. I didn't have this idea. It seems to me fair enough and kills >> necessity to enhance RangeTblEntry and getrelid() indeed. >> I try to fix up this implementation according to your suggestion. > > > How is that going? I'm going to do a serious review of this myself over the > next few weeks. I have a good chunk of time set aside for it as part of a > larger project. I'm hoping to get more people here involved in that effort > too, starting in the November CF if that works out. > Sorry, I tried to rework the portions that were not graceful so much, like system column reference or update/delete on inherited tables, however, it eventually came back to my original implementation. :-( The attached patch fixed the portion I was pointed out, so its overall design has not been changed so much. > I've been trying to catch up with your larger plan for this feature for 9.4. > You made this comment earlier: > >> Also, I'd like to have discussion for this feature in earlier half of >> v9.4 to keep time for the remaining features, such as check on >> writer-side, integration with selinux, and so on > > Is any of that code around yet? I see that you have split your submissions > so that a smaller program can be reviewed today. I'd like to start taking a > look at the next step too though. For the project I'm starting to work on > here, getting the integration with labeling also done is a very important > thing to target for 9.4. It would be nice to see how that fits together > today, even if the code for it isn't being reviewed heavily yet. > The biggest (and most important) portion of overall picture is this patch; that allows to restrict rows to be visible according to pre-configured security policy per table. Towards label based row-level security, it needs an infrastructure to append before-row trigger on the fly, according to row-level security configuration, because we need a check to prevent users to write a record with unprivileged label. It is probably annoying requirement for users to set up triggers for each table with security policy. Then, I'd like to offer special functions of sepgsql that shall be used to security policy function to filter out unprivileged tuples. On the v9.4 time-frame, I'm not certain whether we can implement facility to manage security label of user tables. So, I assume label based row-level security is activated when a special named text column is defined by users. Towards v9.5, I'd like to have a feature to add hidden column for security label purpose on table creation time without user's consciousness. > I don't quite understand yet what's missing on the writer side. If you > could help explain what's missing there, I would like to read about that. > It needs to back the discussion at CF-4th of v9.3. We discussed whether we should have a special feature to check records to be inserted or newer records to be updated, because before-row trigger can offer its infrastructure, even though it requires users an extra configuration job for each tables that have row-level security policy. What I'd like to add on writer-side is an infrastructure for extensions to inject before-row trigger functions on the tail of call-chain, that allows to prevent records with violated values, and also allows sepgsql to assign a default security label on new records being inserted. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachment
2013/8/28 Oleg Bartunov <obartunov@gmail.com>: > btw, there is serious problem with row-level security and constraints. For > example, user with low security level could use unique constraint to know > about existence of a row with higher security. I don't know, what is the > best practice to avoid this. > It is out of scope for this feature. We usually calls this type of information leakage "covert channel"; that is not avoidable in principle. However, its significance is minor, because attacker must know identical data to be here, or must have proving for each possible values. Its solution is simple. DBA should not use value to be confidential as unique key. If needed, our recommendation is alternative key, instead of natural key, because its value itself does not have worth. I should add a note of caution onto the documentation according to the previous consensus, however, I noticed it had gone from the sgml files while I was unaware. So, let me add description on the documentation. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
<div dir="ltr">Any constraints could be "covert channel".<br /></div><div class="gmail_extra"><br /><br /><div class="gmail_quote">OnWed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <span dir="ltr"><<a href="mailto:kaigai@kaigai.gr.jp"target="_blank">kaigai@kaigai.gr.jp</a>></span> wrote:<br /><blockquote class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">2013/8/28 Oleg Bartunov <<a href="mailto:obartunov@gmail.com">obartunov@gmail.com</a>>:<br/><div class="im">> btw, there is serious problem withrow-level security and constraints. For<br /> > example, user with low security level could use unique constraintto know<br /> > about existence of a row with higher security. I don't know, what is the<br /> > best practiceto avoid this.<br /> ><br /></div>It is out of scope for this feature. We usually calls this type of information<br/> leakage "covert channel"; that is not avoidable in principle.<br /> However, its significance is minor,because attacker must know identical<br /> data to be here, or must have proving for each possible values.<br /> Itssolution is simple. DBA should not use value to be confidential as unique<br /> key. If needed, our recommendation isalternative key, instead of natural key,<br /> because its value itself does not have worth.<br /><br /> I should add anote of caution onto the documentation according to<br /> the previous consensus, however, I noticed it had gone from thesgml files<br /> while I was unaware. So, let me add description on the documentation.<br /><div class="HOEnZb"><div class="h5"><br/> Thanks,<br /> --<br /> KaiGai Kohei <<a href="mailto:kaigai@kaigai.gr.jp">kaigai@kaigai.gr.jp</a>><br/></div></div></blockquote></div><br /></div>
On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
------
With best regards,
Alexander Korotkov.
2013/8/28 Oleg Bartunov <obartunov@gmail.com>:> btw, there is serious problem with row-level security and constraints. ForIt is out of scope for this feature. We usually calls this type of information
> example, user with low security level could use unique constraint to know
> about existence of a row with higher security. I don't know, what is the
> best practice to avoid this.
>
leakage "covert channel"; that is not avoidable in principle.
However, its significance is minor, because attacker must know identical
data to be here, or must have proving for each possible values.
Its solution is simple. DBA should not use value to be confidential as unique
key. If needed, our recommendation is alternative key, instead of natural key,
because its value itself does not have worth.
I should add a note of caution onto the documentation according to
the previous consensus, however, I noticed it had gone from the sgml files
while I was unaware. So, let me add description on the documentation.
I think there is another "covert channel" much more serious than constrains. You can gather information about hidden data by reading query plans.
CREATE TABLE test (id SERIAL PRIMARY KEY, value INTEGER);
INSERT INTO test (value) VALUES (1234), (4321), (2356), (6542);
ALTER TABLE test SET ROW SECURITY FOR ALL TO (id % 2 = 1);
CREATE INDEX test_idx ON test (value);
User sees only 1 and 3 rows:
postgres=> select * from test;
id | value
----+-------
1 | 1234
3 | 2356
(2 rows)
But user can probe values column using explain command.
postgres=> set enable_seqscan = off;
SET
postgres=> explain analyze select * from test where value between 1111 and 3333;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on test (cost=0.13..8.16 rows=1 width=8) (actual time=0.021..0.024 rows=2 loops=1)
Index Cond: ((value >= 1111) AND (value <= 3333))
Filter: ((id % 2) = 1)
Total runtime: 0.056 ms
(4 rows)
postgres=> explain analyze select * from test where value between 1111 and 5555;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on test (cost=0.13..8.16 rows=1 width=8) (actual time=0.020..0.024 rows=2 loops=1)
Index Cond: ((value >= 1111) AND (value <= 5555))
Filter: ((id % 2) = 1)
Rows Removed by Filter: 1
Total runtime: 0.057 ms
(5 rows)
In given example user can realize that there is a hidden value in index between 3334 and 5555. Using dichotomy he can find exact value.
I didn't find if there was discussion about it. This example is only my first idea about using plans for probing hidden values. Probably, there are some other ways to do it.
I don't think we can say that indexed data is not sensitive for leakage. Prohibiting push down of all operators which could be used for such kind of attacks also doesn't seem acceptable for me because of huge impact to performance. Another option I see is to hide some sensitive parts of plan from unprivileged user. There is still a room for timing attack, but it doesn't seem to be feasible in practice to apply.
------
With best regards,
Alexander Korotkov.
Alexander Korotkov <aekorotkov@gmail.com> writes: > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> It is out of scope for this feature. We usually calls this type of >> information leakage "covert channel"; that is not avoidable in principle. > I think there is another "covert channel" much more serious than > constrains. You can gather information about hidden data by reading query > plans. I'm not convinced by this argument that covert channels are "out of scope". That would be a fine justification for, say, a thesis topic. However, what we're talking about here is a real-world feature that will be of no real-world use if it can't stand up against rather obvious attack techniques. I'm not interested in carrying the maintenance and runtime overhead of a feature that's only of academic value. regards, tom lane
2013/8/29 Alexander Korotkov <aekorotkov@gmail.com>: > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> 2013/8/28 Oleg Bartunov <obartunov@gmail.com>: >> > btw, there is serious problem with row-level security and constraints. >> > For >> > example, user with low security level could use unique constraint to >> > know >> > about existence of a row with higher security. I don't know, what is >> > the >> > best practice to avoid this. >> > >> It is out of scope for this feature. We usually calls this type of >> information >> leakage "covert channel"; that is not avoidable in principle. >> However, its significance is minor, because attacker must know identical >> data to be here, or must have proving for each possible values. >> Its solution is simple. DBA should not use value to be confidential as >> unique >> key. If needed, our recommendation is alternative key, instead of natural >> key, >> because its value itself does not have worth. >> >> I should add a note of caution onto the documentation according to >> the previous consensus, however, I noticed it had gone from the sgml files >> while I was unaware. So, let me add description on the documentation. > > > I think there is another "covert channel" much more serious than constrains. > You can gather information about hidden data by reading query plans. > > CREATE TABLE test (id SERIAL PRIMARY KEY, value INTEGER); > INSERT INTO test (value) VALUES (1234), (4321), (2356), (6542); > ALTER TABLE test SET ROW SECURITY FOR ALL TO (id % 2 = 1); > CREATE INDEX test_idx ON test (value); > > User sees only 1 and 3 rows: > > postgres=> select * from test; > id | value > ----+------- > 1 | 1234 > 3 | 2356 > (2 rows) > > But user can probe values column using explain command. > > postgres=> set enable_seqscan = off; > SET > postgres=> explain analyze select * from test where value between 1111 and > 3333; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------- > Index Scan using test_idx on test (cost=0.13..8.16 rows=1 width=8) (actual > time=0.021..0.024 rows=2 loops=1) > Index Cond: ((value >= 1111) AND (value <= 3333)) > Filter: ((id % 2) = 1) > Total runtime: 0.056 ms > (4 rows) > > postgres=> explain analyze select * from test where value between 1111 and > 5555; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------- > Index Scan using test_idx on test (cost=0.13..8.16 rows=1 width=8) (actual > time=0.020..0.024 rows=2 loops=1) > Index Cond: ((value >= 1111) AND (value <= 5555)) > Filter: ((id % 2) = 1) > Rows Removed by Filter: 1 > Total runtime: 0.057 ms > (5 rows) > > In given example user can realize that there is a hidden value in index > between 3334 and 5555. Using dichotomy he can find exact value. > I didn't find if there was discussion about it. This example is only my > first idea about using plans for probing hidden values. Probably, there are > some other ways to do it. > Hmm. It is a new scenario for me, even though its basis is similar to the scenario that abuses constraints because it still does not leak the hidden value itself and it requires "estimation" by human. If we tackle this issue, an easy solution is to hide "rows removed by filter" output if plan is underlying sub-query plan that was constructed by row-level security feature. > I don't think we can say that indexed data is not sensitive for leakage. > Prohibiting push down of all operators which could be used for such kind of > attacks also doesn't seem acceptable for me because of huge impact to > performance. Another option I see is to hide some sensitive parts of plan > from unprivileged user. There is still a room for timing attack, but it > doesn't seem to be feasible in practice to apply. > A principle of this row-level security feature is, it prohibits to leak invisible datum itself, but might allow users to expect existence of records with a particular value. In fact, we never push down function that may leak the given argument, that does not have leakproof attribute, even if it can be utilized for index-scan. My opinion is, we should deal with it is "a limitation" of this feature, as long as it does not expose the raw data to be hidden. Estimation takes time to carry out much hidden data via covert channel, thus traditional secure operating system specification with MAC implementation says its degree of threat is not significant as long as bandwidth of covert channel is not so much. I think it is a reasonable standpoint. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Thu, Aug 29, 2013 at 04:14:53PM +0200, Kohei KaiGai wrote: > 2013/8/29 Alexander Korotkov <aekorotkov@gmail.com>: > > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > >> > >> 2013/8/28 Oleg Bartunov <obartunov@gmail.com>: > >> > btw, there is serious problem with row-level security and constraints. > >> > For > >> > example, user with low security level could use unique constraint to > >> > know > >> > about existence of a row with higher security. I don't know, what is > >> > the > >> > best practice to avoid this. > >> > ... > > > A principle of this row-level security feature is, it prohibits to > leak invisible > datum itself, but might allow users to expect existence of records with > a particular value. In fact, we never push down function that may leak > the given argument, that does not have leakproof attribute, even if it can > be utilized for index-scan. > My opinion is, we should deal with it is "a limitation" of this feature, as > long as it does not expose the raw data to be hidden. Estimation takes > time to carry out much hidden data via covert channel, thus traditional > secure operating system specification with MAC implementation says > its degree of threat is not significant as long as bandwidth of covert > channel is not so much. I think it is a reasonable standpoint. > > Thanks, > -- > KaiGai Kohei <kaigai@kaigai.gr.jp> > Okay, given that argument, how would you monitor such attempts to access data through the covert channel and shut it down? Regards, Ken
2013/8/29 ktm@rice.edu <ktm@rice.edu>: > On Thu, Aug 29, 2013 at 04:14:53PM +0200, Kohei KaiGai wrote: >> 2013/8/29 Alexander Korotkov <aekorotkov@gmail.com>: >> > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> >> >> 2013/8/28 Oleg Bartunov <obartunov@gmail.com>: >> >> > btw, there is serious problem with row-level security and constraints. >> >> > For >> >> > example, user with low security level could use unique constraint to >> >> > know >> >> > about existence of a row with higher security. I don't know, what is >> >> > the >> >> > best practice to avoid this. >> >> > > ... >> > >> A principle of this row-level security feature is, it prohibits to >> leak invisible >> datum itself, but might allow users to expect existence of records with >> a particular value. In fact, we never push down function that may leak >> the given argument, that does not have leakproof attribute, even if it can >> be utilized for index-scan. >> My opinion is, we should deal with it is "a limitation" of this feature, as >> long as it does not expose the raw data to be hidden. Estimation takes >> time to carry out much hidden data via covert channel, thus traditional >> secure operating system specification with MAC implementation says >> its degree of threat is not significant as long as bandwidth of covert >> channel is not so much. I think it is a reasonable standpoint. >> >> Thanks, >> -- >> KaiGai Kohei <kaigai@kaigai.gr.jp> >> > > Okay, given that argument, how would you monitor such attempts to access > data through the covert channel and shut it down? > Although I didn't touch this task by myself, my senior colleague said that we calculated some possible bandwidth of leakage as an evident when we ship supercomputer system with MAC feature for customer who worry about security. I'm not sure how to measure it. However, if we assume a human can run up to 5 query per seconds, he needs 2-3 seconds to identify a particular integer value less than 10000, it means bandwidth of this covert channel is less than 5bps. I'm not sure whether enterprise-grade dbms has to care about these amount of covert channel. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Kaigai, > Although I didn't touch this task by myself, my senior colleague said that we > calculated some possible bandwidth of leakage as an evident when we ship > supercomputer system with MAC feature for customer who worry about security. > I'm not sure how to measure it. However, if we assume a human can run up to > 5 query per seconds, he needs 2-3 seconds to identify a particular integer value > less than 10000, it means bandwidth of this covert channel is less than 5bps. > I'm not sure whether enterprise-grade dbms has to care about these amount > of covert channel. Why are you assuming a human needs to do it? Given the explain vector, I could write a rather simple python or perl script which would find values by EXPLAIN leakage, at 1000 explain plans per minute. It's one thing to day "we can't solve this covert channel issue right now in this patch", but saying "we don't plan to solve it at all" is likely to doom the patch. I'm not sure what the solution would be, exactly. Deny permission for EXPLAIN on certain tables? Surely someone in the security community has discussed this? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
* Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: > Although I didn't touch this task by myself, my senior colleague said that we > calculated some possible bandwidth of leakage as an evident when we ship > supercomputer system with MAC feature for customer who worry about security. > I'm not sure how to measure it. However, if we assume a human can run up to > 5 query per seconds, he needs 2-3 seconds to identify a particular integer value > less than 10000, it means bandwidth of this covert channel is less than 5bps. > I'm not sure whether enterprise-grade dbms has to care about these amount > of covert channel. A human isn't necessary in this particular scenario- you're doing a simple binary search through the space, which computers can be awful good at. Using the type's bounds (eg: an 'integer' field is only 32bits) and forcing index scans, you could tell how many records exist and then break down how many exist in the first half, second half, and then split those.. Eventually, you could work out every value in the column. That could be applied to variable length values also, though it'd be more costly to get down to the exact values. I don't have a solution to these issues offhand except to suggest that, in such an environment, having a "don't allow these users to run EXPLAIN" and similar options would probably be welcome. Even then there are potential timing attacks, but that certainly increases the level of effort involved. Thanks, Stephen
Josh Berkus <josh@agliodbs.com> writes: > It's one thing to day "we can't solve this covert channel issue right > now in this patch", but saying "we don't plan to solve it at all" is > likely to doom the patch. > I'm not sure what the solution would be, exactly. Deny permission for > EXPLAIN on certain tables? That would close only one covert channel. Others were already pointed out upthread, and I'll bet there are more ... regards, tom lane
>> I'm not sure what the solution would be, exactly. Deny permission for >> EXPLAIN on certain tables? > > That would close only one covert channel. Others were already pointed out > upthread, and I'll bet there are more ... Mind you, fundamentally this is no different from allowing INSERT permission on a table but denying SELECT, or denying SELECT on certain columns. In either case, covert channels for some data are available. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Aug 29, 2013 at 10:05:14AM -0400, Tom Lane wrote: > Alexander Korotkov <aekorotkov@gmail.com> writes: > > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > >> It is out of scope for this feature. We usually calls this type > >> of information leakage "covert channel"; that is not avoidable in > >> principle. > > > I think there is another "covert channel" much more serious than > > constrains. You can gather information about hidden data by > > reading query plans. > > I'm not convinced by this argument that covert channels are "out of > scope". That would be a fine justification for, say, a thesis > topic. However, what we're talking about here is a real-world > feature that will be of no real-world use if it can't stand up > against rather obvious attack techniques. I'm not interested in > carrying the maintenance and runtime overhead of a feature that's > only of academic value. Looking at the real-world perspective, what covert channels do our competitors in the space currently claim to do anything about? This would represent the bar we need to clear at least as far as documenting what we do (do the access constraint before anything else, e.g.) or why we don't do things (disabling EXPLAIN, e.g.). Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Josh Berkus <josh@agliodbs.com> writes: >> That would close only one covert channel. Others were already pointed out >> upthread, and I'll bet there are more ... > Mind you, fundamentally this is no different from allowing INSERT > permission on a table but denying SELECT, or denying SELECT on certain > columns. In either case, covert channels for some data are available. Certainly. But INSERT's purpose in life is not to prevent people from inferring what data is in the table. What we have to ask here is whether a "row level security" feature that doesn't deal with these real-world attack techniques is worth having. regards, tom lane
2013/8/29 Josh Berkus <josh@agliodbs.com>: > Kaigai, > >> Although I didn't touch this task by myself, my senior colleague said that we >> calculated some possible bandwidth of leakage as an evident when we ship >> supercomputer system with MAC feature for customer who worry about security. >> I'm not sure how to measure it. However, if we assume a human can run up to >> 5 query per seconds, he needs 2-3 seconds to identify a particular integer value >> less than 10000, it means bandwidth of this covert channel is less than 5bps. >> I'm not sure whether enterprise-grade dbms has to care about these amount >> of covert channel. > > Why are you assuming a human needs to do it? Given the explain vector, > I could write a rather simple python or perl script which would find > values by EXPLAIN leakage, at 1000 explain plans per minute. > > It's one thing to day "we can't solve this covert channel issue right > now in this patch", but saying "we don't plan to solve it at all" is > likely to doom the patch. > > I'm not sure what the solution would be, exactly. Deny permission for > EXPLAIN on certain tables? > > Surely someone in the security community has discussed this? > Security community considers covert channel is a hard to solve problem; nearly impossible to eliminate. Let's see the summary in wikipedia: http://en.wikipedia.org/wiki/Covert_channel It does not require countermeasure of covert channels in middle or entry class security evaluation; that is usually required for enterprise grade, even though it is required for the product being designed for military grade. The reason why its priority is relatively lower, is that degree of threats with information leakage via covert channel has limited bandwidth in comparison to main channel. I also follow this standpoint; that is enough reasonable between functionality and its strictness under limited resources. Even if we could close a certain channel, we never can all other channels, like a signal by namespace contention on table creation as covert channel. Also, I don't know major commercial dbms handles these scenario well. Of course, it should be described in the document for users not to apply these security features onto the region that overs our capability. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/8/29 David Fetter <david@fetter.org>: > On Thu, Aug 29, 2013 at 10:05:14AM -0400, Tom Lane wrote: >> Alexander Korotkov <aekorotkov@gmail.com> writes: >> > On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> >> It is out of scope for this feature. We usually calls this type >> >> of information leakage "covert channel"; that is not avoidable in >> >> principle. >> >> > I think there is another "covert channel" much more serious than >> > constrains. You can gather information about hidden data by >> > reading query plans. >> >> I'm not convinced by this argument that covert channels are "out of >> scope". That would be a fine justification for, say, a thesis >> topic. However, what we're talking about here is a real-world >> feature that will be of no real-world use if it can't stand up >> against rather obvious attack techniques. I'm not interested in >> carrying the maintenance and runtime overhead of a feature that's >> only of academic value. > > Looking at the real-world perspective, what covert channels do our > competitors in the space currently claim to do anything about? > I'm not sure whether minor dbms that is designed for extreme secure environment already got certified. (If they have such functionality, they should take certification for promotion.) Oracle lists some of their certified products: http://www.oracle.com/technetwork/topics/security/security-evaluations-099357.html However, these are based on protection profile for basic robustness that is designed for environment where we don't care about covert channel. > This would represent the bar we need to clear at least as far as > documenting what we do (do the access constraint before anything else, > e.g.) or why we don't do things (disabling EXPLAIN, e.g.). > +1. I'd like to add description about this scenario. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/8/29 Tom Lane <tgl@sss.pgh.pa.us>: > Josh Berkus <josh@agliodbs.com> writes: >>> That would close only one covert channel. Others were already pointed out >>> upthread, and I'll bet there are more ... > >> Mind you, fundamentally this is no different from allowing INSERT >> permission on a table but denying SELECT, or denying SELECT on certain >> columns. In either case, covert channels for some data are available. > > Certainly. But INSERT's purpose in life is not to prevent people from > inferring what data is in the table. What we have to ask here is whether > a "row level security" feature that doesn't deal with these real-world > attack techniques is worth having. > I think, we should clearly note that row-level security feature does not have capability to control information leakage via covert channel but very limited bandwidth, even though it control information leakage and manipulation via main channel. It depends on user's environment and expectation. If they need rdbms with security feature for military grade, it is not recommendable. However, it is a recommended solution for regular enterprise grade environment. Anything depends on user's environment, threats and worth of values to be protected. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On 08/30/2013 03:05 AM, Kohei KaiGai wrote: >> Surely someone in the security community has discussed this? >> > Security community considers covert channel is a hard to solve problem; > nearly impossible to eliminate. > Let's see the summary in wikipedia: > http://en.wikipedia.org/wiki/Covert_channel Well, in each of the cases covered in that article, the given technology (OSI, TCP, etc.) takes specific provisions to limit the ability of attackers to discover information via the covert channel. However, we have yet to talk about taking any such provisions with Postgres. If we commit this patch, arguably we'll have a row-level security feature which only protects data from well-behaved users, which seems counterproductive. So, arguments in favor of this patch: a) it's as good as Oracle's security features, giving us "check-box compliance". b) it allows securing individual rows against attackers with limited technical knowledge or limited database access, and could be very hardened in combination with intelligent access control. c) it is an improvement on techniques like Veil (is it)? d) we plan to continue improving it and closing covert channels, or limiting their bandwidth. Arguments against: m) covert channels are currently broad enough to make it trivially circumventable (are they?) n) overhead and code maintenance required is substantial So, determinative questions: 1) are the security mechanisms supplied by this patch superior in some way to approaches like Veil for multi-tenant applications? (this is a serious question, as multi-tenant applications are far less concerned about covert channels) 2) do we plan to reduce the accessibility of data via covert channels over successive releases? How? 3) will accepting this patch allow our users in the Government space to more freely adopt PostgreSQL? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh, * Josh Berkus (josh@agliodbs.com) wrote: > On 08/30/2013 03:05 AM, Kohei KaiGai wrote: > > Security community considers covert channel is a hard to solve problem; > > nearly impossible to eliminate. While impossible to eliminate, we should certainly consider cases like this where we can do better and fix them. RLS certainly brings another level of consideration to the overall PG security environment by requiring we think about security on a row level rather than just a table or column level. We have issues with covert channels even without RLS though and holding up RLS because it doesn't fix all the covert channels isn't sensible. Column-level privleges have a similar problem, where you can read the default value for a column using the catalogs. Perhaps the default isn't sensetive (you'd certainly hope not), but it's still an issue. It wouldn't surprise me to find that there are ways to abuse a multi-column index which includes both a column you can manipulate and one you don't have access to read to determine something about the hidden column (maybe you have access to the 2nd field in the index and you can encourage an in-order index traversal and then look at filtered rows, or just work out a way to do timing attacks to determine the btree depth). > Well, in each of the cases covered in that article, the given technology > (OSI, TCP, etc.) takes specific provisions to limit the ability of > attackers to discover information via the covert channel. The work we've done around secure views would lend credit to our attempts at taking specific provisions as well; sadly, PG is slightly more complicated than TCP. We do what we can and we've got a great community which will point out where we can do better- and we work on it and improve over time. Hell, when roles were first added we had a *massive* security hole because we didn't check to make sure we weren't overrunning the length of the GUC. It was a mistake and we should have done better, but that doesn't mean adding roles was the wrong decision. > However, we have yet to talk about taking any such provisions with > Postgres. If we commit this patch, arguably we'll have a row-level > security feature which only protects data from well-behaved users, which > seems counterproductive. I would argue both that we *have* been taking provisions to avoid obvious and big covert channels, and that this patch adds value even if it doesn't protect the system perfectly from malicious users. We're all certainly aware of the ability for an attacker to cause major problems to a PG system if they can issue arbitrary SQL and our permissions system doesn't do much to protect us. A single query which doesn't require any privileges could cause havok on the system (massive on-disk temp file, which could be shared with pg_xlog causing the system to PANIC, massive CPU load if they can execute multiple commands in parallel...). Not to mention the default installation of pl/pgsql and anonymous functions. I could see many a web app (things like LedgerSMB) which could benefit from having more fine-grained in-database control because they already authenticate to the database as the user and have a static or at least controlled set of queries which they run. Today, any of those kinds of systems have to implement their own RLS (though sometimes it's done through independent tables for each customer or similar, rather than as conditionals added to queries). > a) it's as good as Oracle's security features, giving us "check-box > compliance". I'd argue that this is definitely much more than 'check-box' compliance. > b) it allows securing individual rows against attackers with limited > technical knowledge or limited database access, and could be very > hardened in combination with intelligent access control. > c) it is an improvement on techniques like Veil (is it)? > d) we plan to continue improving it and closing covert channels, or > limiting their bandwidth. > > Arguments against: > m) covert channels are currently broad enough to make it trivially > circumventable (are they?) There are some which are and likely deserve to be fixed. Do they all need to be addressed before this patch goes in? I'd argue 'no'. > n) overhead and code maintenance required is substantial > > So, determinative questions: > > 1) are the security mechanisms supplied by this patch superior in some > way to approaches like Veil for multi-tenant applications? (this is a > serious question, as multi-tenant applications are far less concerned > about covert channels) I'd argue 'yes' if just for the fact that it'd be simpler and easier to use, both because it's in core and because you're using an actual grammar instead of function calls- but this RLS does more than just that, it's going to cause us to improve things that Veil probably can't fix and simply ignores today. > 2) do we plan to reduce the accessibility of data via covert channels > over successive releases? How? By discovering them and fixing them as we go..? I can't imagine there being one massive patch which goes into a single major release that fixes *all* of them- there's going to be ones we can't even imagine today that we discover later. Should we fix *all* of the ones that we discover? Probably not- it's simply not possible. Should we fix the ones that we can easily correct? Of course. > 3) will accepting this patch allow our users in the Government space to > more freely adopt PostgreSQL? There's two parts to this. On the one hand, the 'check-box' would be filled, which by itself would make it easier (at least based on my experience w/ the US gov't, ymmv), but also because it would require *less work* to build a new application on PG which needed RLS. You can already do it today, but you have to bake that into the cost of the implementation of the overall application and accept the limitations which come with it- trivally gotten around once you get a direct connection to PG. Would this be perfect? No, but it'd be quite a bit better. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > We have issues with covert channels even without RLS though and holding > up RLS because it doesn't fix all the covert channels isn't sensible. I think it's entirely sensible to question whether we should reject (not "hold up") RLS if it has major covert-channel problems. The scenario I'm worried about is where somebody says "hey, Postgres has RLS now, I can rely on that to hide my sooper sekrit data from other users in the same database", and later they have a security breach through some covert-channel attack. Are they going to blame themselves? No, they're gonna blame Postgres. Or consider the case where some bozo publishes a method for such an attack and uses it to badmouth us as insecure. I don't think we need the headaches that will result from promising (or at least appearing to promise) something we can't deliver. Nor am I convinced that we're really doing users any favors by providing such a feature. They'd be *far* better advised to put their critical data in a separate database. In short, "we can check some check-box" is a really, really bad reason to accept a security-related feature. If we're going to put up with all the downsides of RLS, I want the end result to be something that's actually secure, not something that gives the illusion of security. And right now, I do not believe we can get past the illusion stage, ever (certainly not in a release or two). regards, tom lane
On 08/30/2013 12:43 PM, Tom Lane wrote: > In short, "we can check some check-box" is a really, really bad reason > to accept a security-related feature. If we're going to put up with > all the downsides of RLS, I want the end result to be something that's > actually secure, not something that gives the illusion of security. > And right now, I do not believe we can get past the illusion stage, > ever (certainly not in a release or two). Can you be more explicit about "all the downsides of RLS"? I was just looking over the patch, which is less than 5000 lines. While it's not small, we have larger patches in the CF. So what's the specific downsides of this? The reason I brought up multi-tenant applications ("MTA"), BTW, is that this is the other major potential utility of RLS, and for such users the covert channel limitations are acceptable (as long as we publish them).That is, for a "thing-as-a-service" application, usersare not assumed to have unlimited access to the SQL command line anyway; RLS is employed just to limit the damage they can do if they get access, and to limit the disclosure if some app programmer makes a mistake. Right now, the primary tool for doing row filtering for MTA is Veil, which has numerous and well-known limitations. If RLS has fewer limitations, or is easier to deploy, maintain, and/or understand, then it's a valuable feature for that user base, even if it doesn't address the covert channels we've brought up at all. That is, if RLS is your *second* level of defense, instead of your primary defense, covert channels are not a make-or-break issue. It just has to be better than what we had before. Note that I have NOT done an evaluation of Veil vs. RLS for MTA at this point. I'm hoping someone else will ;-) -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > We have issues with covert channels even without RLS though and holding > > up RLS because it doesn't fix all the covert channels isn't sensible. > > I think it's entirely sensible to question whether we should reject (not > "hold up") RLS if it has major covert-channel problems. Rejecting RLS because we've suddently realized that covert channels exist is foolishness. It's akin to rejecting the ability to add stored procedures because we don't protect prosrc from people who don't own or can't execute the function. > The scenario I'm worried about is where somebody says "hey, Postgres has > RLS now, I can rely on that to hide my sooper sekrit data from other users > in the same database", and later they have a security breach through some > covert-channel attack. Are they going to blame themselves? No, they're > gonna blame Postgres. Or consider the case where some bozo publishes > a method for such an attack and uses it to badmouth us as insecure. In my experience, issues are discovered, patched, and released as security updates. Does adding RLS mean we might have more of those? Sure, as did column level privileges and as does *any* such increase in the granularity of what we can provide security-wise. > I don't think we need the headaches that will result from promising > (or at least appearing to promise) something we can't deliver. Nor am > I convinced that we're really doing users any favors by providing such a > feature. They'd be *far* better advised to put their critical data in a > separate database. We've barely got cross-database queries with FDWs. The notion that adding such complexity into those as RLS, which each individual user will need to figure out how to do themselves and most will likely get far wrong and much worse than what we'd implement, is "better" for our users is just ridiculous. > In short, "we can check some check-box" is a really, really bad reason > to accept a security-related feature. If we're going to put up with > all the downsides of RLS, I want the end result to be something that's > actually secure, not something that gives the illusion of security. > And right now, I do not believe we can get past the illusion stage, > ever (certainly not in a release or two). I'm not argueing for this because it fulfills some check-box; the question about if it would help a given set of clients (ones which I no longer have any direct relationship with, as it turns out) adopt PG was asked and I answered it as best I could. I certainly think we need to get past the 'illusion' stage also. I'm certainly more optimistic about that than you are but I also understand it's not going to be perfect in the first release- but I do think it'll be better than the 'illusion' stage. It'll get there because we'll continue to discuss it, people will test it, etc; as one hopes happens with all new features, but this even more than others. Thanks, Stephen
Josh Berkus <josh@agliodbs.com> writes: > On 08/30/2013 12:43 PM, Tom Lane wrote: >> In short, "we can check some check-box" is a really, really bad reason >> to accept a security-related feature. If we're going to put up with >> all the downsides of RLS, I want the end result to be something that's >> actually secure, not something that gives the illusion of security. > Can you be more explicit about "all the downsides of RLS"? I was just > looking over the patch, which is less than 5000 lines. While it's not > small, we have larger patches in the CF. So what's the specific > downsides of this? I think it's going to be an ongoing maintenance headache and an endless source of security bugs, even disregarding covert-channel issues. I have pretty much zero faith in the planner changes, in particular, and would still not have a lot if they were adequately documented, which they absolutely are not. The whole thing depends on nowhere-clearly-stated assumptions that plan-time transformations will never allow an RLS check to be bypassed. I foresee future planner work breaking this in non-obvious ways on a regular basis (even granting the assumption that it's bulletproof today, which is at best unproven). > The reason I brought up multi-tenant applications ("MTA"), BTW, is that > this is the other major potential utility of RLS, and for such users the > covert channel limitations are acceptable (as long as we publish them). [ shrug... ] You might've noticed I work for a multi-tenant shop now. I'm still not excited about this. > That is, if RLS is your *second* level of defense, instead of your > primary defense, covert channels are not a make-or-break issue. It just > has to be better than what we had before. Yeah, that's a fair point. I'm just not convinced that it's enough better to justify the maintenance burden we'll be taking on. I'm not thrilled about the "every bug is a security bug" angle, either. regards, tom lane
2013/8/30 Josh Berkus <josh@agliodbs.com>: > On 08/30/2013 03:05 AM, Kohei KaiGai wrote: > >>> Surely someone in the security community has discussed this? >>> >> Security community considers covert channel is a hard to solve problem; >> nearly impossible to eliminate. >> Let's see the summary in wikipedia: >> http://en.wikipedia.org/wiki/Covert_channel > > Well, in each of the cases covered in that article, the given technology > (OSI, TCP, etc.) takes specific provisions to limit the ability of > attackers to discover information via the covert channel. > > However, we have yet to talk about taking any such provisions with > Postgres. If we commit this patch, arguably we'll have a row-level > security feature which only protects data from well-behaved users, which > seems counterproductive. > The point we shouldn't forget is information leakage via covert-channel has less grade of threat than leakage via main-channel, because of much less bandwidth. Security community also concludes it is not avoidable nature as long as human can observe system behavior and estimate something, thus, security evaluation criteria does not require eliminate covert-channels or does not pay attention about covert-channels for the products that is installed on the environment with basic robustness (that means, non-military, regular enterprise usage). I don't think PostgreSQL goes into military-grade secure database system. If so, it has to sacrifice many thing (like, performance, usability, nature of open source, ...) for security. It's not a fact. > So, arguments in favor of this patch: > a) it's as good as Oracle's security features, giving us "check-box > compliance". > b) it allows securing individual rows against attackers with limited > technical knowledge or limited database access, and could be very > hardened in combination with intelligent access control. Even if attacker has enough knowledge, the ratio they can estimate hidden values is very limited because of much less bandwidth of covert channels. > c) it is an improvement on techniques like Veil (is it)? > d) we plan to continue improving it and closing covert channels, or > limiting their bandwidth. > > Arguments against: > m) covert channels are currently broad enough to make it trivially > circumventable (are they?) > n) overhead and code maintenance required is substantial > > So, determinative questions: > > 1) are the security mechanisms supplied by this patch superior in some > way to approaches like Veil for multi-tenant applications? (this is a > serious question, as multi-tenant applications are far less concerned > about covert channels) > Yes. This RLS implementation targets the environment that does not have requirement for a particular bandwidth upperbound on covert- channels. It allows to centralize the place where we have to care about access control on main-channel, so it well fits multi-tenant applications. > 2) do we plan to reduce the accessibility of data via covert channels > over successive releases? How? > Less covert channels is better than massive, if we could close it with reasonable cost; that means run-time performance, code complexity and so on. However, in general, it is impossible to eliminate anything in spite of less degree of threats because of smaller bandwidth. So, I don't think this is an issue to spent much efforts to solve. > 3) will accepting this patch allow our users in the Government space to > more freely adopt PostgreSQL? > Government has two spaces. Most of their environment requires similar requirement as enterprise grade system doing. On the other hand, military environment often requires upper-bound of covert channel, as a story I introduce on upthread, but they are minority and have budget to develop special purpose database system designed for security with first priority. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/8/30 Tom Lane <tgl@sss.pgh.pa.us>: > Josh Berkus <josh@agliodbs.com> writes: >> On 08/30/2013 12:43 PM, Tom Lane wrote: >>> In short, "we can check some check-box" is a really, really bad reason >>> to accept a security-related feature. If we're going to put up with >>> all the downsides of RLS, I want the end result to be something that's >>> actually secure, not something that gives the illusion of security. > >> Can you be more explicit about "all the downsides of RLS"? I was just >> looking over the patch, which is less than 5000 lines. While it's not >> small, we have larger patches in the CF. So what's the specific >> downsides of this? > > I think it's going to be an ongoing maintenance headache and an endless > source of security bugs, even disregarding covert-channel issues. I have > pretty much zero faith in the planner changes, in particular, and would > still not have a lot if they were adequately documented, which they > absolutely are not. The whole thing depends on nowhere-clearly-stated > assumptions that plan-time transformations will never allow an RLS check > to be bypassed. I foresee future planner work breaking this in > non-obvious ways on a regular basis (even granting the assumption that > it's bulletproof today, which is at best unproven). > In general, we will adopt / enhance features as long as PostgreSQL runs with evolution. It can never be free from bugs or maintenance, regardless of its nature. Later half seems to me a bit unfair because any features may conflict with some future works, not only RLS. Also, if you have some tangible planner enhancement plan, could you inform us which plan shall be in progress? At least, it is impossible to adjust my implementation because of abstract concern towards future conflict. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
KaiGai, * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: > The point we shouldn't forget is information leakage via covert-channel > has less grade of threat than leakage via main-channel, because of > much less bandwidth. While true, this argument can't be used to simply ignore any and all covert channels. There are covert channels which are *not* much less bandwidth, and the Filtered Rows one is one of those- it's simply too big to ignore. There are likely other which are equally large and until we clean those up our RLS implementation will be questioned by our users. This does not mean that we need to clean up all covert channels and things which are clearly intractable don't need to be addressed (eg: the unique constraint situation; we can't both guarantee uniqueness and hide the existance of an entry). > Security community also concludes it is not avoidable nature as long > as human can observe system behavior and estimate something, This particular case is actually beyond 'estimation'. > Even if attacker has enough knowledge, the ratio they can estimate > hidden values is very limited because of much less bandwidth of > covert channels. You really need to back away from this argument in this case. The example shown isn't based on estimation and provides quite high bandwidth because it can be run by a computer. This argument can't be used for every situation where information is leaked. > However, in general, it is impossible to eliminate anything in spite of > less degree of threats because of smaller bandwidth. So, I don't think > this is an issue to spent much efforts to solve. I don't see a lot of complexity required to fix this specific case. A great deal of effort will be involved in going through the rest of the code and various options and working to eliminate other similar cases, but that's a reality we need to deal with. Hopefully the cases found will not require a lot of additional code to deal with. We will need to be mindful of new code which adds leaks or changes behavior such that RLS doesn't function properly (hopefully, sufficient regression tests will help to address that as well). Thanks, Stephen
> I think there is another "covert channel" much more serious than > constrains. You can gather information about hidden data by reading > query plans. I think a lot of people would be quite happy to simply disallow EXPLAIN. Define a permission for it, grant it to public and newly created users/groups by default (for BC), and allow it to be revoked. To define what is/isn't reasonable in terms of covert channel leakage, I (reluctantly) suggest checking out the Common Criteria stuff. Yes, it's verbose and questionably useful, but it's something that already exists and that has lots of weight in government / large orgs. I've started reading into it but don't have enough info to comment on RLS specifically yet. Regarding unique keys and other constraints as leakage channels, I'm inclined to think that this is partly a documentation issue. The docs can explain what is/isn't protected against. Suggest creating keys incorporating security domain identifiers so that users in different security domains can't create conflicting values. Possibly provide a mechanism to enforce that so that users can't attempt to insert/update rows with a different security identifier. Or even make it transparently part of the system's operations, an implicit extra column in PRIMARY and FOREIGN keys on RLS-enabled tables. What I'm more worried about re the covert uniqueness issue is that any solution might run the risk of creating situations where changes in access rights can make previously valid data in tables invalid. If two users A and B can't see each other's data and create the same values for a key, then B is given the rights to see A's data ... those unique values now have duplicates. So: I do think we need to step back a little when it comes to covert channel attacks and define what we do/don't defend against. What about timing attacks - do we need to make sure we don't leak information about the number of RLS rows via scan durations? Make everything constant-time? Yick. Crypto systems have a hard enough time getting this stuff right, and I just think we say we don't defend against timing attacks. If someone wants to mess around with adding random sleeps in an executor hook, well, that's their pain to deal with. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 30.08.2013 22:57, Josh Berkus wrote: > Right now, the primary tool for doing row filtering for MTA is Veil, > which has numerous and well-known limitations. If RLS has fewer > limitations, or is easier to deploy, maintain, and/or understand, then > it's a valuable feature for that user base, even if it doesn't address > the covert channels we've brought up at all. > > That is, if RLS is your*second* level of defense, instead of your > primary defense, covert channels are not a make-or-break issue. It just > has to be better than what we had before. > > Note that I have NOT done an evaluation of Veil vs. RLS for MTA at this > point. I'm hoping someone else will ;-) I'd also like to hear how Veil differs from RLS. From what I've understood this far, they are the same in terms of what you can and cannot do. To phrase it differently: We already have RLS. It's shipped as an extension called Veil. Now please explain what's wrong with that statement, if anything. - Heikki
On 2013-09-01 16:38:51 +0300, Heikki Linnakangas wrote: > On 30.08.2013 22:57, Josh Berkus wrote: > >Right now, the primary tool for doing row filtering for MTA is Veil, > >which has numerous and well-known limitations. If RLS has fewer > >limitations, or is easier to deploy, maintain, and/or understand, then > >it's a valuable feature for that user base, even if it doesn't address > >the covert channels we've brought up at all. > > > >That is, if RLS is your*second* level of defense, instead of your > >primary defense, covert channels are not a make-or-break issue. It just > >has to be better than what we had before. > > > >Note that I have NOT done an evaluation of Veil vs. RLS for MTA at this > >point. I'm hoping someone else will ;-) > > I'd also like to hear how Veil differs from RLS. From what I've understood > this far, they are the same in terms of what you can and cannot do. > > To phrase it differently: We already have RLS. It's shipped as an extension > called Veil. Now please explain what's wrong with that statement, if > anything. I don't think veil really is an argument for much in this discussion. I don't know who in this discussion has used it, I have. Admittedly a good bit ago, 8.2, 8.3 days I think. There hasn't been much fundamental development since though, if my quick look is right. Veil gives you a rather low level toolbox for developing an RLS and you're left with a *huge* amount of work needed ontop. It basically gives you a bunch of new datatypes (sets, bitmaps, ranges) and provides some support for sharing variables across backends (shared memory). That's nearly it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 9/1/13 9:38 AM, Heikki Linnakangas wrote: > To phrase it differently: We already have RLS. It's shipped as an > extension called Veil. Now please explain what's wrong with that > statement, if anything. Veil was last updated for 9.1 to work against that version, so the first thing is that it's two versions back from being current. The main improvement for a few now core features, compared to their external/extension predecessors, is that they go through a real review process. I suspect a lot of the criticisms being lobbied against the core RLS feature would also hit Veil if it were evaluated to the same standard. Regardless, I'm seeing a few review themes pop up from this thread: -Comparison against the Veil feature set. -Competitive review against industry expectations, AKA "checkbox" compliance. -Confirm feature set is useful to government security clearance applications and multi-tenant applications. There's also a secured web application use case that's popped up a few times too; KaiGai has used secured Apache installs for example. -Summary of known covert channels, with documentation coverage. -Assess odds of this implementation's future issues turning into security bugs. My personal hotspot here is that I'd like minimal code exposure to people who don't use this feature at all. Are there parts here that should be compile time enabled? Of course those are all on top of the usual code quality review. Did I miss any big themes on that list? -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Kaigai, >> However, we have yet to talk about taking any such provisions with >> Postgres. If we commit this patch, arguably we'll have a row-level >> security feature which only protects data from well-behaved users, which >> seems counterproductive. >> > The point we shouldn't forget is information leakage via covert-channel > has less grade of threat than leakage via main-channel, because of > much less bandwidth. That's an astonishingly weak argument, because the bandwidth you're talking about is still very high, as in *hundreds* or *thousands* of values per minute. It's one thing to make the bandwidth argument for things like monitoring power draw, where the leakage is one character per hour, but the covert channels we're talking about are several orders of magnitude faster than that. So, I repeat: if you continue to pursue the argument that the covert channels identified aren't significant because of bandwidth, you will doom this patch. The valid arguments are only two: a) clearly identified use case X doesn't care about covert channels for reason Y, and will find RLS extremely useful. b) we can't fix these covert channels now, but plan to work on them in the future, and have ideas for how to approach them. > Security community also concludes it is not avoidable nature as long > as human can observe system behavior and estimate something, thus, > security evaluation criteria does not require eliminate covert-channels > or does not pay attention about covert-channels for the products that > is installed on the environment with basic robustness (that means, > non-military, regular enterprise usage). To be completely blunt, the security community does not understand databases. At all. I'd think if anything had become clear through the course of work on SEPosgres, it would be that. >> So, determinative questions: >> >> 1) are the security mechanisms supplied by this patch superior in some >> way to approaches like Veil for multi-tenant applications? (this is a >> serious question, as multi-tenant applications are far less concerned >> about covert channels) >> > Yes. This RLS implementation targets the environment that does not > have requirement for a particular bandwidth upperbound on covert- > channels. It allows to centralize the place where we have to care > about access control on main-channel, so it well fits multi-tenant > applications. Again, please abandon your bandwidth arguments. They are irrelevant to whether or not this patch gets accepted. So, please try again? >> 3) will accepting this patch allow our users in the Government space to >> more freely adopt PostgreSQL? >> > Government has two spaces. Most of their environment requires similar > requirement as enterprise grade system doing. On the other hand, > military environment often requires upper-bound of covert channel, > as a story I introduce on upthread, but they are minority and have > budget to develop special purpose database system designed for > security with first priority. I don't think I understood this answer. What I'm asking is: will government agencies be sigificantly more likely to adopt PostgreSQL if we have RLS, in this form? Or do we need "military-grade" before it makes a difference? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
2013/8/31 Stephen Frost <sfrost@snowman.net>: > KaiGai, > > * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote: >> The point we shouldn't forget is information leakage via covert-channel >> has less grade of threat than leakage via main-channel, because of >> much less bandwidth. > > While true, this argument can't be used to simply ignore any and all > covert channels. There are covert channels which are *not* much less > bandwidth, and the Filtered Rows one is one of those- it's simply too > big to ignore. There are likely other which are equally large and > until we clean those up our RLS implementation will be questioned by > our users. > > This does not mean that we need to clean up all covert channels and > things which are clearly intractable don't need to be addressed (eg: > the unique constraint situation; we can't both guarantee uniqueness > and hide the existance of an entry). > >> Security community also concludes it is not avoidable nature as long >> as human can observe system behavior and estimate something, > > This particular case is actually beyond 'estimation'. > >> Even if attacker has enough knowledge, the ratio they can estimate >> hidden values is very limited because of much less bandwidth of >> covert channels. > > You really need to back away from this argument in this case. The > example shown isn't based on estimation and provides quite high > bandwidth because it can be run by a computer. This argument can't > be used for every situation where information is leaked. > >> However, in general, it is impossible to eliminate anything in spite of >> less degree of threats because of smaller bandwidth. So, I don't think >> this is an issue to spent much efforts to solve. > > I don't see a lot of complexity required to fix this specific case. A > great deal of effort will be involved in going through the rest of the > code and various options and working to eliminate other similar cases, > but that's a reality we need to deal with. Hopefully the cases found > will not require a lot of additional code to deal with. We will need to > be mindful of new code which adds leaks or changes behavior such that > RLS doesn't function properly (hopefully, sufficient regression tests > will help to address that as well). > You're saying that we need to fix up cover-channels with unignorable threat degree, even though it does not mean elimination of all the covert-channels. I almost agree and feel it a reasonable stance. One thing we need to pay attention is, how large covert-channel is we need to fix-up and how small is we can ignore. The reason why I used the term of "estimation", even heuristic or machinery way, is this covert-channel does not expose the raw data on the display. I (personally) used this criteria to decide whether the covert-channel is ignorable, or not. Thus, I tackled leaky-function problem on interaction with views for security purpose. It seems to me a rough watermark, rather than my criteria above, is necessary to determine which covert-channel is hit for us. Anyway, I try to investigate the scenario that Korotkov pointed out. It should be a common problem for views with security_barrier attribute, not only RLS feature, because it is designed on a common infrastructure. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Josh, * Josh Berkus (josh@agliodbs.com) wrote: > That's an astonishingly weak argument, because the bandwidth you're > talking about is still very high, as in *hundreds* or *thousands* of > values per minute. I agree that, in this specific case, we should do something about it. > It's one thing to make the bandwidth argument for > things like monitoring power draw, where the leakage is one character > per hour, but the covert channels we're talking about are several orders > of magnitude faster than that. That's actually not an accurate representation of the bandwidth associated with power draw measurements, but even if I convinced you it was in the hundreds or thousands of values per minute, I doubt you'd change your mind regarding the Filtered Rows issue or claim that we should fix the power draw measurement issue. :) > So, I repeat: if you continue to pursue the argument that the covert > channels identified aren't significant because of bandwidth, you will > doom this patch. The valid arguments are only two: There is actually another argument, which I mentioned up-thread.. There are cases where you can't guarantee all the promises which can be asked for. In the unique-value case, we can't guarantee both that the values are unique and that existing values can't be detected by an individual with insert access. We should make the user aware that allowing insert access (or perhaps 'explain') opens up these cases. > a) clearly identified use case X doesn't care about covert channels for > reason Y, and will find RLS extremely useful. These certainly exist and I'd argue that's part of why Veil exists.. Users of Veil (which I admit, I'm not) would likely be much happier with an in-core solution because it will be much easier to use and much more performant. > b) we can't fix these covert channels now, but plan to work on them in > the future, and have ideas for how to approach them. I expect we will find more covert chanels which need to be fixed. > To be completely blunt, the security community does not understand > databases. At all. I'd think if anything had become clear through the > course of work on SEPosgres, it would be that. That's just false and it's poor of you to claim it. The security community is not one individual nor even some small group. They're not very obviously involved with PostgreSQL but that's no reason to claim that they do not understand databases. > I don't think I understood this answer. What I'm asking is: will > government agencies be sigificantly more likely to adopt PostgreSQL if > we have RLS, in this form? Or do we need "military-grade" before it > makes a difference? From my experience, the answer would be 'yes' to your first question. Thanks, Stephen
2013/9/1 Josh Berkus <josh@agliodbs.com>: > Kaigai, > >>> However, we have yet to talk about taking any such provisions with >>> Postgres. If we commit this patch, arguably we'll have a row-level >>> security feature which only protects data from well-behaved users, which >>> seems counterproductive. >>> >> The point we shouldn't forget is information leakage via covert-channel >> has less grade of threat than leakage via main-channel, because of >> much less bandwidth. > > That's an astonishingly weak argument, because the bandwidth you're > talking about is still very high, as in *hundreds* or *thousands* of > values per minute. It's one thing to make the bandwidth argument for > things like monitoring power draw, where the leakage is one character > per hour, but the covert channels we're talking about are several orders > of magnitude faster than that. > > So, I repeat: if you continue to pursue the argument that the covert > channels identified aren't significant because of bandwidth, you will > doom this patch. The valid arguments are only two: > > a) clearly identified use case X doesn't care about covert channels for > reason Y, and will find RLS extremely useful. > > b) we can't fix these covert channels now, but plan to work on them in > the future, and have ideas for how to approach them. > An important point is, what covert-channel should be closed, and what other covert-channel can be ignored. Even though the scenario Korotkov reported would be enough to make us surprised, we may need to consider how wide covert channel should be closed soon or later. * A covert-channel that can expose raw-data directly, should be fixed up. => Thus, leaky-view problem needed to be fixed. * A covert-channel that can expose existence of a particular value with less than Log(N) order trials to the possible rangeof hidden value. => Thus, this scenario needs to be cared? Or, any other criteria even though? My (current) preference is plan (c: we will be able to fix up *this* cover-channel with reasonable efforts on explain code. probably, we can close it if we don't print filtered rows under the sub-query with security-barrier attribute. It does not mean we shall or can fix up any possible covert-channels being found in the future, however, I also agree that we shall tackle covert-channel scenario being enough serious; that's criteria may or may not be bandwidth of information leakage. >> Security community also concludes it is not avoidable nature as long >> as human can observe system behavior and estimate something, thus, >> security evaluation criteria does not require eliminate covert-channels >> or does not pay attention about covert-channels for the products that >> is installed on the environment with basic robustness (that means, >> non-military, regular enterprise usage). > > To be completely blunt, the security community does not understand > databases. At all. I'd think if anything had become clear through the > course of work on SEPosgres, it would be that. > >>> So, determinative questions: >>> >>> 1) are the security mechanisms supplied by this patch superior in some >>> way to approaches like Veil for multi-tenant applications? (this is a >>> serious question, as multi-tenant applications are far less concerned >>> about covert channels) >>> >> Yes. This RLS implementation targets the environment that does not >> have requirement for a particular bandwidth upperbound on covert- >> channels. It allows to centralize the place where we have to care >> about access control on main-channel, so it well fits multi-tenant >> applications. > > Again, please abandon your bandwidth arguments. They are irrelevant to > whether or not this patch gets accepted. > > So, please try again? > >>> 3) will accepting this patch allow our users in the Government space to >>> more freely adopt PostgreSQL? >>> >> Government has two spaces. Most of their environment requires similar >> requirement as enterprise grade system doing. On the other hand, >> military environment often requires upper-bound of covert channel, >> as a story I introduce on upthread, but they are minority and have >> budget to develop special purpose database system designed for >> security with first priority. > > I don't think I understood this answer. What I'm asking is: will > government agencies be sigificantly more likely to adopt PostgreSQL if > we have RLS, in this form? Or do we need "military-grade" before it > makes a difference? > Even though I'm not a marketer for public sector, not a small number of Oracle virtual private database, that provide more simple functionality, has been accepted for public sectors also, I believe it encourages adoption of PostgreSQL on government agencies. However, military division often requires special treatment for security functionality but very small number of users are interested in, thus, it is not a good idea to focus on this grade here. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Sun, Sep 1, 2013 at 8:31 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > Or, any other criteria even though? > > My (current) preference is plan (c: we will be able to fix up *this* > cover-channel with reasonable efforts on explain code. probably, > we can close it if we don't print filtered rows under the sub-query > with security-barrier attribute. I think the criteria being discussed in this thread are too strict. It may be the case that Postgres cannot make a strong general case that it protects against covert channels. However it may still be able to make the much weaker case that it is *possible* to arrange your database such that the covert channels are kept under control. So I think up above Tom is wrong about why it's ok that INSERT leaks keys when it reports a unique key violation. The reason why it's ok that there's a covert channel there is that the DBA can avoid that covert channel by being careful when creating unique constraints. He or she should be aware that creating a unique constraint implicitly provides a kind of limited access to data to users who have INSERT privilege even if they lack the real SELECT privilege. Likewise, as long as the covert channels in RLS are things the DBA has even a modicum of control over I wouldn't be too worried. Afaict from skimming the thread it looks like creating any indexes on columns leaks what values of the index key exist in the table. Is it the case that non-indexed columns do not get leaked? -- greg
On 9/1/13 5:54 PM, Greg Stark wrote: > So I think up above Tom is wrong about why it's ok that INSERT leaks > keys when it reports a unique key violation. The reason why it's ok > that there's a covert channel there is that the DBA can avoid that > covert channel by being careful when creating unique constraints. He > or she should be aware that creating a unique constraint implicitly > provides a kind of limited access to data to users who have INSERT > privilege even if they lack the real SELECT privilege. And if someone can INSERT values that they can't actually see once they're committed, that's a similarly bad we should describe. People should be dumping their trash in their neighbor's yard. I think eventually this needs to be wrestled to the ground in a robust way. I want to see if all unique violations might be changed to give less information in this sort of RLS context. One rough early idea is to create a new error condition that means you hit something protected by RLS, but doesn't leak any more information than that. Just a generic "Security restriction operation" that comes out of fishing for keys, inserting outside your area, etc. I want to think through some use cases and review the code to see whether that concept helps or not. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On Sun, Sep 1, 2013 at 11:05:58AM -0700, Josh Berkus wrote: > > Security community also concludes it is not avoidable nature as long > > as human can observe system behavior and estimate something, thus, > > security evaluation criteria does not require eliminate covert-channels > > or does not pay attention about covert-channels for the products that > > is installed on the environment with basic robustness (that means, > > non-military, regular enterprise usage). > > To be completely blunt, the security community does not understand > databases. At all. I'd think if anything had become clear through the > course of work on SEPosgres, it would be that. Agreed. The security community realizes these covert channels exist, but doesn't really have any recommendations on how to avoid them. You could argue that avoiding them is too tied to specific database implementations, but there are general channels, like insert with a unique key, that should at least have well-defined solutions. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Fri, Aug 30, 2013 at 04:24:06PM -0400, Stephen Frost wrote: > > The scenario I'm worried about is where somebody says "hey, Postgres has > > RLS now, I can rely on that to hide my sooper sekrit data from other users > > in the same database", and later they have a security breach through some > > covert-channel attack. Are they going to blame themselves? No, they're > > gonna blame Postgres. Or consider the case where some bozo publishes > > a method for such an attack and uses it to badmouth us as insecure. > > In my experience, issues are discovered, patched, and released as > security updates. Does adding RLS mean we might have more of those? > Sure, as did column level privileges and as does *any* such increase in > the granularity of what we can provide security-wise. Releasing a feature we think is perfect, and finding out later is isn't, and fixing it, is not the same as releasing a feature we _know_ isn't perfect, and having no idea how to fix it. From later discussions, it seems like we have to accept that we will never be able to avoid all convert channels, e.g. timing queries, but we can avoid the most obvious ones, e.g. EXPLAIN, and improve it as we go. Knowing there is no end to improving it does make some people feel uncomfortable, and I can't think of another feature we have added with such an open-ended nature. We do have open-ended performance features, but here is seems the value of the feature itself, security, is not attainable. Perhaps reasonable security is attainable. I am not saying we should reject this feature --- just that the calculus of how we decide to add this feature doesn't fit our normal analysis. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Fri, Aug 30, 2013 at 3:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think it's entirely sensible to question whether we should reject (not > "hold up") RLS if it has major covert-channel problems. We've already had this argument before, about the security_barrier view stuff, and that code got committed and is already released. So the horse is already out of the barn and no amount of wishing will put it back in. I haven't reviewed this patch in a long time, but I would expect that it's basically just reusing that same infrastructure; in fact, I'd expect that it's little more than syntactic sugar around that infrastructure. (If it it's instead introducing a whole new mechanism, then I think that's reason enough to reject it right there.) My main question about this is whether that syntactic sugar is really worth having given that it doesn't add any real new functionality, not about the covert channel issues, which are already a done deal. And frankly, I'm with the group that says the covert channel issues are not really a big deal. In many real-world cases, the user can control only the values that get subbed into queries that get sent to the database, not the queries themselves, which eliminates a large category of attacks. Real-world example, from last job: sales reps only get to see their own accounts, not accounts of other sales reps. They could input new accounts (with sales_rep_id set to their ID) and they could query the list of accounts (limited to those where sales_rep_id matched their ID) - pulling either all of them or searching by account name, both through a web application. Yeah, a sales rep could have launched a timing attack through the web interface, and they could also have polled for the existence of accounts by trying to create accounts with names that might already exist in the system to see whether a duplicate got flagged. But neither attack had enough useful bandwidth to matter; a sales rep wishing to learn our full account list (so that he could try to poach them after leaving the company) could have learned a lot more a lot faster via social engineering, and with less risk of being caught doing something that would have resulted in his or her immediate termination. The point is, I don't think RLS needs to solve every problem. What it needs to do is solve one problem well, so that it can be used in combination with other techniques to achieve a certain set of security objectives. If, in a particular environment, EXPLAIN is an issue, then it can be blocked in that environment via a variety of well-understood techniques. That's not very hard to do even without core support, and if we want to add core support, fine, but that's a separate patch. This is a patch to add row-level security, and it deserves to be judged on how well or poorly it does that, not on whether it solves covert channel problems that represent a mostly orthogonal set of technical issues. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Aug 30, 2013 at 3:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think it's entirely sensible to question whether we should reject (not >> "hold up") RLS if it has major covert-channel problems. > We've already had this argument before, about the security_barrier > view stuff, and that code got committed and is already released. So > the horse is already out of the barn and no amount of wishing will put > it back in. Well, the security-barrier view stuff did not present itself as a 100% solution. But perhaps more to the point, it was conceptually simple to implement, ie don't flatten views if they have this bit set, and don't push down quals into such sub-selects unless they're marked leakproof. > I haven't reviewed this patch in a long time, but I would > expect that it's basically just reusing that same infrastructure; in > fact, I'd expect that it's little more than syntactic sugar around > that infrastructure. I've not read it in great detail, but it isn't that. It's whacking the planner around in ways that I have no confidence in, and probably still wouldn't have any confidence in if they'd been adequately documented. regards, tom lane
On Sun, Sep 1, 2013 at 11:47 PM, Greg Smith <greg@2ndquadrant.com> wrote: > And if someone can INSERT values that they can't actually see once they're > committed, that's a similarly bad we should describe. This is desirable in some cases but not others. If the goal is compartmentalization, then it's sensible to prevent this. But you might also have a "drop-box" environment - e.g. a student submits coursework to a professor, and can't access the submitted work after it's submitted. FWIW, my CS classes in college had a tool that worked just this way. Or maybe an analyst writes a report and is then permitted to "give away" the document to his boss for revisions. Once the ownership of the document has changed, the analyst can't see it any more, because he can only see the documents he owns. And maybe he's not permitted to give away documents to just anyone (polluting their sandbox), but he can give them to his boss (who expects to receive them). The point is that we should be in the business of providing mechanism, not policy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Sep 4, 2013 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, the security-barrier view stuff did not present itself as a 100% > solution. But perhaps more to the point, it was conceptually simple to > implement, ie don't flatten views if they have this bit set, and don't > push down quals into such sub-selects unless they're marked leakproof. Right. IMHO, this new feature should be similarly simple: when an unprivileged user references a table, treat that as a reference to a leakproof view over the table, with the RLS qual injected into the view. >> I haven't reviewed this patch in a long time, but I would >> expect that it's basically just reusing that same infrastructure; in >> fact, I'd expect that it's little more than syntactic sugar around >> that infrastructure. > > I've not read it in great detail, but it isn't that. It's whacking the > planner around in ways that I have no confidence in, and probably still > wouldn't have any confidence in if they'd been adequately documented. If that's the case, then I agree that we should not accept it, at least in its present form. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Sep 4, 2013 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Well, the security-barrier view stuff did not present itself as a 100% >> solution. But perhaps more to the point, it was conceptually simple to >> implement, ie don't flatten views if they have this bit set, and don't >> push down quals into such sub-selects unless they're marked leakproof. > Right. IMHO, this new feature should be similarly simple: when an > unprivileged user references a table, treat that as a reference to a > leakproof view over the table, with the RLS qual injected into the > view. And for insert/update/delete, we do what exactly? regards, tom lane
* Robert Haas (robertmhaas@gmail.com) wrote: > On Sun, Sep 1, 2013 at 11:47 PM, Greg Smith <greg@2ndquadrant.com> wrote: > > And if someone can INSERT values that they can't actually see once they're > > committed, that's a similarly bad we should describe. > > This is desirable in some cases but not others. If the goal is > compartmentalization, then it's sensible to prevent this. But you > might also have a "drop-box" environment - e.g. a student submits > coursework to a professor, and can't access the submitted work after > it's submitted. FWIW, my CS classes in college had a tool that worked > just this way. Agreed, and part of the discussion that I had w/ KaiGai and Simon was that we should provide a way to let the user pick which they'd like. This is the concept around 'insert privileges' being different from 'select privileges' wrt RLS. > The point is that we should be in the business of providing mechanism, > not policy. ++ Thanks, Stephen
On Wed, Sep 4, 2013 at 10:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Sep 4, 2013 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Well, the security-barrier view stuff did not present itself as a 100% >>> solution. But perhaps more to the point, it was conceptually simple to >>> implement, ie don't flatten views if they have this bit set, and don't >>> push down quals into such sub-selects unless they're marked leakproof. > >> Right. IMHO, this new feature should be similarly simple: when an >> unprivileged user references a table, treat that as a reference to a >> leakproof view over the table, with the RLS qual injected into the >> view. > > And for insert/update/delete, we do what exactly? The same mechanism will prevent UPDATE and DELETE from seeing any rows the user shouldn't be able to touch. Simon and Greg are arguing that when an INSERT or UPDATE happens, we ought to also check that the NEW row matches the RLS qual. I don't find that to be terribly important because you can accomplish the same thing with a per-row trigger today; and I also don't think everyone will want that behavior. Some people will, I'm pretty sure, want to let users "give away" rows, either unconditionally or subject to defined restrictions. Perhaps it's worth having, but it's a separate feature, IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2013/9/1 Greg Stark <stark@mit.edu>: > On Sun, Sep 1, 2013 at 8:31 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >> Or, any other criteria even though? >> >> My (current) preference is plan (c: we will be able to fix up *this* >> cover-channel with reasonable efforts on explain code. probably, >> we can close it if we don't print filtered rows under the sub-query >> with security-barrier attribute. > > I think the criteria being discussed in this thread are too strict. > > It may be the case that Postgres cannot make a strong general case > that it protects against covert channels. However it may still be able > to make the much weaker case that it is *possible* to arrange your > database such that the covert channels are kept under control. > Yes. I have to admit it is difficult to determine a strict and regular rule to handle covert-channel scenario. Sorry, the later half of this sentence is uncertain for me. Are you saying, even if we could have a strict rule, we may have many possible covert channel for information leakage?? > So I think up above Tom is wrong about why it's ok that INSERT leaks > keys when it reports a unique key violation. The reason why it's ok > that there's a covert channel there is that the DBA can avoid that > covert channel by being careful when creating unique constraints. He > or she should be aware that creating a unique constraint implicitly > provides a kind of limited access to data to users who have INSERT > privilege even if they lack the real SELECT privilege. > IIRC, we discussed and concluded that the above information leakage scenario shall be described in the documentation, and the way to avoid valuable information leakage using alternative keys, a few years before. > Likewise, as long as the covert channels in RLS are things the DBA has > even a modicum of control over I wouldn't be too worried. Afaict from > skimming the thread it looks like creating any indexes on columns > leaks what values of the index key exist in the table. Is it the case > that non-indexed columns do not get leaked? > According to the scenario reported by Korotkov, he could find number of rows being filtered by the given qualifier, thus it implies existence of the row with a value in a particular range. Its solution is that I noted above, and I'm working for it now. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/9/3 Bruce Momjian <bruce@momjian.us>: > On Sun, Sep 1, 2013 at 11:05:58AM -0700, Josh Berkus wrote: >> > Security community also concludes it is not avoidable nature as long >> > as human can observe system behavior and estimate something, thus, >> > security evaluation criteria does not require eliminate covert-channels >> > or does not pay attention about covert-channels for the products that >> > is installed on the environment with basic robustness (that means, >> > non-military, regular enterprise usage). >> >> To be completely blunt, the security community does not understand >> databases. At all. I'd think if anything had become clear through the >> course of work on SEPosgres, it would be that. > > Agreed. The security community realizes these covert channels exist, > but doesn't really have any recommendations on how to avoid them. You > could argue that avoiding them is too tied to specific database > implementations, but there are general channels, like insert with a > unique key, that should at least have well-defined solutions. > The security community also provides an extreme solution, but I don't think it is suitable for flexible security policy and PostgreSQL wants it. Their "extreme" solution manipulate definition of PK that automatically become combined key that takes user-given key and security level being set mandatory. Thus, it does not conflict even if two different users with different security level tries to insert a row with same primary key. This technology is called polyinstantiation. http://en.wikipedia.org/wiki/Polyinstantiation However, of course, I'm not favor to port this technology to PostgreSQL world. Its side-effects are too much towards the problem to be solved. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Sep 4, 2013 at 10:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Right. IMHO, this new feature should be similarly simple: when an >>> unprivileged user references a table, treat that as a reference to a >>> leakproof view over the table, with the RLS qual injected into the >>> view. >> And for insert/update/delete, we do what exactly? > The same mechanism will prevent UPDATE and DELETE from seeing any rows > the user shouldn't be able to touch. No, it won't, because we don't support direct update/delete on views (and if you look, you'll notice the auto-updatable-view stuff doesn't think a security-barrier view is auto-updatable). AFAICT, to deal with update/delete the RLS patch needs to constrain order of qual application without the crutch of having a separate level of subquery; and it's that behavior that I have zero confidence in, either as to whether it works as submitted or as to our odds of not breaking it in the future. regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> IMHO, this new feature should be similarly simple: when an >>> unprivileged user references a table, treat that as a reference >>> to a leakproof view over the table, with the RLS qual injected >>> into the view. >> >> And for insert/update/delete, we do what exactly? > > The same mechanism will prevent UPDATE and DELETE from seeing any > rows the user shouldn't be able to touch. +1 > Simon and Greg are arguing that when an INSERT or UPDATE happens, > we ought to also check that the NEW row matches the RLS qual. I > don't find that to be terribly important because you can > accomplish the same thing with a per-row trigger today; and I > also don't think everyone will want that behavior. As an example from my Wisconsin Courts days, source documents come in which need to be entered, which may contain a Social Security Number, and most of the Clerk of Courts staff should be authorized to enter that into the database. Once it is entered, most people should not be allowed to view it, including many of the people who were authorized to enter it initially. It's one thing for a line staff person to have a handful of documents come across their desk with SSN on a given day; it's quite another if they could dump a list of names, addresses, dates of birth, and SSNs for the entire database on demand. In a way this issue is similar to the covert channel issue -- volume matters. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2013/9/4 Tom Lane <tgl@sss.pgh.pa.us>: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Sep 4, 2013 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Well, the security-barrier view stuff did not present itself as a 100% >>> solution. But perhaps more to the point, it was conceptually simple to >>> implement, ie don't flatten views if they have this bit set, and don't >>> push down quals into such sub-selects unless they're marked leakproof. > >> Right. IMHO, this new feature should be similarly simple: when an >> unprivileged user references a table, treat that as a reference to a >> leakproof view over the table, with the RLS qual injected into the >> view. > > And for insert/update/delete, we do what exactly? > This patch does not care about insert, because it shall be done around the place where we usually put before-row-insert; that is not related to planner. Regarding to update/delete, this patch also enhanced to allow update or delete mechanism allows to take a sub-query on top of the table scan plan. So, its explain output shows as follows: postgres=> EXPLAIN (costs off) UPDATE customer SET email = 'alice@example.com'; QUERY PLAN --------------------------------------------------Update on customer -> Subquery Scan on customer -> Seq Scan oncustomer customer_1 Filter: ("current_user"() = uname) You can see this update has Subquery plan instead of regular relation scan. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/9/4 Tom Lane <tgl@sss.pgh.pa.us>: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Sep 4, 2013 at 10:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Right. IMHO, this new feature should be similarly simple: when an >>>> unprivileged user references a table, treat that as a reference to a >>>> leakproof view over the table, with the RLS qual injected into the >>>> view. > >>> And for insert/update/delete, we do what exactly? > >> The same mechanism will prevent UPDATE and DELETE from seeing any rows >> the user shouldn't be able to touch. > > No, it won't, because we don't support direct update/delete on views > (and if you look, you'll notice the auto-updatable-view stuff doesn't > think a security-barrier view is auto-updatable). > > AFAICT, to deal with update/delete the RLS patch needs to constrain order > of qual application without the crutch of having a separate level of > subquery; and it's that behavior that I have zero confidence in, either > as to whether it works as submitted or as to our odds of not breaking it > in the future. > Are you suggesting to rewrite update / delete statement to filter out unprivileged rows from manipulation? Yes. I also thought it is a simple solution that does not need additional enhancement to allow update / delete to take sub-query on top of reader side plan. For example, if security policy is (t1.owner = current_user) and the given query was "UPDATE t1 SET value = value || '_updated' WHERE value like '%abc%'", this query may be able to rewritten as follows: UPDATE t1 SET value = value || '_updated' WHERE tid = ( SELECT tid FROMt1 WHERE t1.owner = current_user ) AND value like '%abc%'; This approach makes implementation simple, but it has to scan the relation twice, thus its performance it not ideal, according to the past discussion. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/9/4 Kevin Grittner <kgrittn@ymail.com>: > Robert Haas <robertmhaas@gmail.com> wrote: >> Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: > >>>> IMHO, this new feature should be similarly simple: when an >>>> unprivileged user references a table, treat that as a reference >>>> to a leakproof view over the table, with the RLS qual injected >>>> into the view. >>> >>> And for insert/update/delete, we do what exactly? >> >> The same mechanism will prevent UPDATE and DELETE from seeing any >> rows the user shouldn't be able to touch. > > +1 > >> Simon and Greg are arguing that when an INSERT or UPDATE happens, >> we ought to also check that the NEW row matches the RLS qual. I >> don't find that to be terribly important because you can >> accomplish the same thing with a per-row trigger today; and I >> also don't think everyone will want that behavior. > > As an example from my Wisconsin Courts days, source documents come > in which need to be entered, which may contain a Social Security > Number, and most of the Clerk of Courts staff should be authorized > to enter that into the database. Once it is entered, most people > should not be allowed to view it, including many of the people who > were authorized to enter it initially. It's one thing for a line > staff person to have a handful of documents come across their desk > with SSN on a given day; it's quite another if they could dump a > list of names, addresses, dates of birth, and SSNs for the entire > database on demand. > > In a way this issue is similar to the covert channel issue -- > volume matters. > I think an important nature of this behavior is it is configurable. In case when both of reader and writer side need to have same security policy, it's good. One configuration allows to apply a consistent security policy to fetch a row from table, and to write a row to table. If they don't want to check security policy on writer side, all they need to do is setting a security policy for SELECT only, even though its functionality is not implemented yet. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Kohei KaiGai <kaigai@kaigai.gr.jp> writes: > 2013/9/4 Tom Lane <tgl@sss.pgh.pa.us>: >> And for insert/update/delete, we do what exactly? > Regarding to update/delete, this patch also enhanced to allow update or > delete mechanism allows to take a sub-query on top of the table scan plan. > So, its explain output shows as follows: > postgres=> EXPLAIN (costs off) UPDATE customer SET email = 'alice@example.com'; > QUERY PLAN > -------------------------------------------------- > Update on customer > -> Subquery Scan on customer > -> Seq Scan on customer customer_1 > Filter: ("current_user"() = uname) > You can see this update has Subquery plan instead of regular relation scan. Really? That wasn't apparent from reading the patch. (Have I mentioned it's desperately underdocumented? Aside from needing a lot more in-code comments than it's got, it would benefit from having an overview section added to optimizer/README explaining stuff at the level of this discussion.) I'm a bit surprised that setrefs.c doesn't eliminate the Subquery Scan as being a no-op, given that no quals end up getting applied at that level. You might look into why not, since if that plan node were eliminated at the end, it'd fix any efficiency complaints about this approach. regards, tom lane
On Wed, Sep 4, 2013 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The same mechanism will prevent UPDATE and DELETE from seeing any rows >> the user shouldn't be able to touch. > > No, it won't, because we don't support direct update/delete on views > (and if you look, you'll notice the auto-updatable-view stuff doesn't > think a security-barrier view is auto-updatable). > > AFAICT, to deal with update/delete the RLS patch needs to constrain order > of qual application without the crutch of having a separate level of > subquery; and it's that behavior that I have zero confidence in, either > as to whether it works as submitted or as to our odds of not breaking it > in the future. I don't really see why. AIUI, the ModifyTable node just needs to get the right TIDs. It's not like that has to be stacked directly on top of a scan; indeed, in cases like UPDATE .. FROM and DELETE .. USING it already isn't. Maybe there's some reason why the intervening level can be a Join but not a SubqueryScan, but if so I'd expect we could find some way of lifting that limitation without suffering too much pain. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, 2013-08-28 at 13:56 +0200, Kohei KaiGai wrote: > The attached patch fixed the portion I was pointed out, so its overall > design has not been changed so much. The documentation doesn't build: openjade:catalogs.sgml:237:28:X: reference to non-existent ID "CATALOG-PG-ROWLEVELSEC"
Now I'm trying to tackle the covert-channel problem that Korotkov pointed out at upthread. The attached patch works "almost" well. It prevents to print number of rows being filtered if the target plan node is under sub-query with security-barrier attribute; because row- level security feature is constructed on existing security-barrier view infrastructure. One remaining issue is that planner pulls-up underlying relation scan if top-level sub-query is enough simple; probably, in case when targetlist of upper sub-query is compatible with underlying relation scan. See, the example below: postgres=# CREATE TABLE t1 (a int primary key, b int); CREATE TABLE postgres=# INSERT INTO t1 VALUES (1,1111),(2,2222),(3,3333),(4,4444),(5,5555); INSERT 0 5 postgres=# CREATE VIEW v1 WITH(security_barrier) AS SELECT * FROM t1 WHERE a % 2 = 1; CREATE VIEW postgres=# SET enable_seqscan = off; SET postgres=# EXPLAIN ANALYZE SELECT 1 FROM v1 WHERE b BETWEEN 2000 AND 4000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Subquery Scan on v1 (cost=10000000000.00..10000000052.81 rows=1 width=0) (actual time=0.016..0.017 rows=1 loops=1) -> Seq Scan on t1 (cost=10000000000.00..10000000052.80 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=1) Filter: ((b >= 2000) AND (b <= 4000) AND ((a % 2) = 1)) Total runtime: 0.067 ms (4 rows) According to the scenario that Korotkov pointed out, number of filtered rows shall be printed, thus, it allows to estimate particular value with log(N) times trials. However, this example hides number of rows being done within security-barrier sub- query as I expected. On the other hand, if planner pulled-up underlying relation scan, it does NOT work fine. postgres=# EXPLAIN ANALYZE SELECT * FROM v1 WHERE b BETWEEN 2000 AND 4000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Seq Scan on t1 (cost=10000000000.00..10000000052.80 rows=1 width=8) (actual time=0.019..0.021 rows=1 loops=1) Filter: ((b >= 2000) AND (b <= 4000) AND ((a % 2) = 1)) Rows Removed by Filter: 4 Total runtime: 0.055 ms (4 rows) It seems to me we need to prohibit to pull-up security-barrier sub-query here, or to mark underlying relation scan security-barrier attribute to solve this issue. (I'm still looking for which routine handles this pull-up job, however, I didn't find it yet.) How about opinion for this solution? Thanks, 2013/9/7 Peter Eisentraut <peter_e@gmx.net>: > On Wed, 2013-08-28 at 13:56 +0200, Kohei KaiGai wrote: >> The attached patch fixed the portion I was pointed out, so its overall >> design has not been changed so much. > > The documentation doesn't build: > > openjade:catalogs.sgml:237:28:X: reference to non-existent ID "CATALOG-PG-ROWLEVELSEC" > > -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachment
On Wed, 2013-09-04 at 14:35 +0000, Robert Haas wrote: > > On Fri, Aug 30, 2013 at 3:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I think it's entirely sensible to question whether we should reject > (not > > "hold up") RLS if it has major covert-channel problems. > > We've already had this argument before, about the security_barrier [ . . . ] Sorry for following up on this so late, I have just been trying to catch up with the mailing lists. I am the developer of Veil, which this thread mentioned a number of times. I wanted to state/confirm a number of things: Veil is not up to date wrt Postgres versions. I didn't release a new version for 9.2, and when no-one complained I figured no-one other than me was using it. I'll happily update it if anyone wants it. Veil makes no attempt to avoid covert channels. It can't. Veil is a low-level toolset designed for optimising queries about privileges. It allows you to build RLS with reasonable performance, but it is not in itself a solution for RLS. I wish the Postgres RLS project well and look forward to its release in Postgres 9.4. __ Marc
Because of CF-2nd end, it was moved to the next commit-fest. In my personal opinion, it probably needs a few groundwork to get RLS commitable, prior to RLS itself. One is a correct handling towards the scenario that Korotkov pointed out. (How was it concluded?) I think it is a problem we can fix with reasonable way. Likely, solution is to prohibit to show number of rows being filtered if plan node is underlying a sub- query with security-barrier. One other stuff I'm concerned about is, existing implementation assumes a certain rtable entry performs as a source relation, also result relation in same time on DELETE and UPDATE. We usually scan a regular relation to fetch ctid of the tuples to be modified, then ModifyTable deletes or updates the tuple being identified. Here has been no matter for long period, even if same rtable entry is used to point out a relation to be scanned and to be modified. It however makes RLS implementation complicated, because this feature tries to replace a rtable entry to relation with row-level security policy by a simple sub-query with security-barrier attribute. Our implementation does not assume a sub-query performs as a result relation, so I had to have some ad-hoc coding, like adjustment on varno/varattno of Var objects, to avoid problem. I think, solution is to separate a rtable entry of result-relation from rtable entry to be scanned. It makes easier to implement RLS feature because all we need to do is just replacement of rtable entry for scanning stage, and no need to care about ModifyTable operations towards sub-query. Is it a right direction to go? Thanks, 2013/10/10 Marc Munro <marc@bloodnok.com>: > On Wed, 2013-09-04 at 14:35 +0000, Robert Haas wrote: >> >> On Fri, Aug 30, 2013 at 3:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> > I think it's entirely sensible to question whether we should reject >> (not >> > "hold up") RLS if it has major covert-channel problems. >> >> We've already had this argument before, about the security_barrier > [ . . . ] > > Sorry for following up on this so late, I have just been trying to catch > up with the mailing lists. > > I am the developer of Veil, which this thread mentioned a number of > times. I wanted to state/confirm a number of things: > > Veil is not up to date wrt Postgres versions. I didn't release a new > version for 9.2, and when no-one complained I figured no-one other than > me was using it. I'll happily update it if anyone wants it. > > Veil makes no attempt to avoid covert channels. It can't. > > Veil is a low-level toolset designed for optimising queries about > privileges. It allows you to build RLS with reasonable performance, but > it is not in itself a solution for RLS. > > I wish the Postgres RLS project well and look forward to its release in > Postgres 9.4. > > __ > Marc > > -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On 09/04/2013 11:22 PM, Tom Lane wrote: > AFAICT, to deal with update/delete the RLS patch needs to constrain order > of qual application without the crutch of having a separate level of > subquery; and it's that behavior that I have zero confidence in, either > as to whether it works as submitted or as to our odds of not breaking it > in the future. Wouldn't CASE do that job, albeit in a somewhat ugly manner, and also protect against malicious RLS functions? For any subclause in a WHERE clause "(a) OR (b)" you can instead write a short-circuit OR version as: CASE WHEN (a) THEN true ELSE (b) END no? So, given a row security condition like "(rowowner = current_user AND rls_malicious_leak())" on table "test", this: SELECT * FROM test WHERE client_leak_fn(); could be rewritten by row security as: SELECT * FROM test WHERE CASE WHEN CASE WHEN is_superuser() THEN true ELSE (rowowner = current_user AND rls_malicious_leak()) END THEN client_leak_fn() END; in other words: if the user is the superuser, don't evaluate the RLS predicate, just assume they have the right to see the row. Otherwise evaluate the RLS predicate to determine whether they can see the row. In either case, if they have the right to see the row, run the original WHERE clause. My main concern is that it'd be relying on a guarantee that CASE is always completely ordered, and that might not be ideal. It's also hideously ugly, and future optimiser smarts could create unexpected issues. Unless you propose the creaton of a new short-circuit left-to-right order guaranteed OR operator, and think the planner/optimizer should be taught special case rules to prevent it from doing pull-up / push-down or pre-evaluating subclauses for it, I suspect the notion of using security barrier views or subqueries is still going to be the sanest way to do it. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > On 09/04/2013 11:22 PM, Tom Lane wrote: >> AFAICT, to deal with update/delete the RLS patch needs to constrain order >> of qual application without the crutch of having a separate level of >> subquery; and it's that behavior that I have zero confidence in, either >> as to whether it works as submitted or as to our odds of not breaking it >> in the future. > Wouldn't CASE do that job, albeit in a somewhat ugly manner, and also > protect against malicious RLS functions? You mean wrap all the query quals in a CASE? Sure, if you didn't mind totally destroying any optimization possibilities. If you did that, every table scan would become a seqscan and every join a nestloop. regards, tom lane
On Mon, Nov 4, 2013 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 09/04/2013 11:22 PM, Tom Lane wrote: >>> AFAICT, to deal with update/delete the RLS patch needs to constrain order >>> of qual application without the crutch of having a separate level of >>> subquery; and it's that behavior that I have zero confidence in, either >>> as to whether it works as submitted or as to our odds of not breaking it >>> in the future. > >> Wouldn't CASE do that job, albeit in a somewhat ugly manner, and also >> protect against malicious RLS functions? > > You mean wrap all the query quals in a CASE? Sure, if you didn't mind > totally destroying any optimization possibilities. If you did that, > every table scan would become a seqscan and every join a nestloop. I'd still like to here what's wrong with what I said here: http://www.postgresql.org/message-id/CA+TgmoYr1PHw3X9vnVuWDcfXkzK2p_jhtWc0fV2Q58NEgcxyTA@mail.gmail.com -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/04/2013 11:17 PM, Robert Haas wrote: > I'd still like to here what's wrong with what I said here: > > http://www.postgresql.org/message-id/CA+TgmoYr1PHw3X9vnVuWDcfXkzK2p_jhtWc0fV2Q58NEgcxyTA@mail.gmail.com For me, just my understanding. I'm still too new to the planner and rewriter to grasp that properly as written. I was responding to Tom's objection, and he makes a good point about CASE and optimisation. We have to be free to re-order and pre-evaluate where LEAKPROOF flags make it safe and permissible, without ever otherwise doing so. That makes the SECURITY BARRIER subquery look better, since the limited pull-up / push-down is already implemented there. Robert, any suggesitons on how to approach what you suggest? I'm pretty new to the planner's guts, but I know there've been some complaints about the way the current RLS code fiddles with Vars when it changes a direct scan of a rel into a subquery scan. The code in: https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1647 and https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1591 seems to be the one folks were complaining about earlier. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Nov 4, 2013 at 8:46 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 11/04/2013 11:17 PM, Robert Haas wrote: >> I'd still like to here what's wrong with what I said here: >> >> http://www.postgresql.org/message-id/CA+TgmoYr1PHw3X9vnVuWDcfXkzK2p_jhtWc0fV2Q58NEgcxyTA@mail.gmail.com > > For me, just my understanding. I'm still too new to the planner and > rewriter to grasp that properly as written. > > I was responding to Tom's objection, and he makes a good point about > CASE and optimisation. We have to be free to re-order and pre-evaluate > where LEAKPROOF flags make it safe and permissible, without ever > otherwise doing so. That makes the SECURITY BARRIER subquery look > better, since the limited pull-up / push-down is already implemented there. > > Robert, any suggesitons on how to approach what you suggest? I'm pretty > new to the planner's guts, but I know there've been some complaints > about the way the current RLS code fiddles with Vars when it changes a > direct scan of a rel into a subquery scan. > > The code in: > > https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1647 > > and > > https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1591 > > seems to be the one folks were complaining about earlier. I haven't studied this patch in detail, but I see why there's some unhappiness about that code: it's an RLS-specific kluge. Just shooting from the hip here, maybe we should attack the problem of making security-barrier views updatable first, as a separate patch. I would think that if we make that work, this will also work without, hopefully, any special hackery. And we'd get a separate, independently useful feature out of it, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/05/2013 09:36 PM, Robert Haas wrote: > I haven't studied this patch in detail, but I see why there's some > unhappiness about that code: it's an RLS-specific kluge. Just > shooting from the hip here, maybe we should attack the problem of > making security-barrier views updatable first, as a separate patch. That's the approach I've been considering. There are a few wrinkles with it, though: (a) Updatable views are implemented in the rewriter, not the planner. The rewriter is not re-run when plans are invalidated or when the session authorization changes, etc. This means that we can't simply omit the RLS predicate for superuser because the same rewritten parse tree might get used for both superuser and non-superuser queries. Options: * Store the before-rewrite parse tree when RLS is discovered on one of the rels in the tree. Re-run the rewriter when re-planning. Ensure a change in current user always invalidates plans. * Declare that it's not legal to run a query originally parsed and rewritten as superuser as a non-superuser or vice versa. This would cause a great deal of pain with PL/PgSQL. * Always add the RLS predicate and solve the problem of reliably short-circuiting the user-supplied predicate for RLS-exempt users. We'd need a way to allow direct (not query-based) COPY TO for tables with RLS applied, preventing the rewriting of direct table access into subqueries for COPY, but since we never save plans for COPY that may be fine. * ... ? (b) Inheritance is a problem when RLS is done in the rewriter. As I understood it from Kohei KaiGai's description to me earlier, there was a strong preference on -hackers to enforce RLS predicates for child and parent tables completely independently. That's how RLS currently works, but it might be hard to get the same effect when applying RLS in the rewriter. We'd need to solve that, or redefine RLS's behaviour so that the predicate on a parent table applies to any child tables too. Personally I'd prefer the latter. (c) RLS might interact differently with rules declared on tables if implemented in the rewriter, so some investigation into that would be needed. > I > would think that if we make that work, this will also work without, > hopefully, any special hackery. And we'd get a separate, > independently useful feature out of it, too. I tend to agree. I'm just a bit concerned about dealing with the issues around RLS-exempt operations and users. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 6 November 2013 06:38, Craig Ringer <craig@2ndquadrant.com> wrote: > On 11/05/2013 09:36 PM, Robert Haas wrote: >> I haven't studied this patch in detail, but I see why there's some >> unhappiness about that code: it's an RLS-specific kluge. Just >> shooting from the hip here, maybe we should attack the problem of >> making security-barrier views updatable first, as a separate patch. > > That's the approach I've been considering. FWIW, I have a half-complete patch to make SB views auto-updatable, borrowing some code from the RLS patch. I hadn't looked at it for some time, but I've just brushed the dust off it and rebased it to head. There are several things I don't like in it, and it's probably not close to being committable, but it might give you some ideas if you were going to take a crack at this yourself. The basic idea is to have rewriteTargetView() collect up any quals from SB views in a new list on the target RTE, instead of adding them to the main query's predicates (it needs to be a list of SB quals, in case there are SB views on top of other SB views, in which case they need to be kept separate from one another). Then at the end of the rewriting process (after any views referenced in the SB quals have been expanded), a new piece of code kicks in to process any RTEs with SB quals, turning them into (possibly nested) subquery RTEs. The complication is that the query's resultRelation RTE mustn't be a subquery. This is handled this in a similar way to the trigger-updatable views code, producing 2 RTEs --- the resultRelation RTE becomes a direct reference to the base relation, and a separate subquery RTE acts as the source of rows to update. An additional complication arising from that approach is that the planner's preptlist code (expand_targetlist) may add additional targetlist entries representing identity assignments, so it needs to know that the source RTE is now different from the result RTE in the query, otherwise it all falls apart because the target RTE isn't in the query's fromList. That's one of the things I don't like about this. In fact it seems messy that there are 2 separate places in the code that expand the targetlist. Anyway, feel free to do what you like with this. I wasn't planning on submitting it to the next commitfest myself, because my non-PG workload is too high, and I don't expect to get much time to hack on postgresql during the next couple of months. Regards, Dean
Attachment
On 11/06/2013 05:02 PM, Dean Rasheed wrote: > The basic idea is to have rewriteTargetView() collect up any quals > from SB views in a new list on the target RTE, instead of adding them > to the main query's predicates (it needs to be a list of SB quals, in > case there are SB views on top of other SB views, in which case they > need to be kept separate from one another). Then at the end of the > rewriting process (after any views referenced in the SB quals have > been expanded), a new piece of code kicks in to process any RTEs with > SB quals, turning them into (possibly nested) subquery RTEs. That makes sense, though presumably you face the same problem that the existing RLS code does with references to system columns that don't normally exist in subqueries? Since this happens during query rewrite, what prevents the optimizer from pushing outer quals down into the subqueries? > The complication is that the query's resultRelation RTE mustn't be a > subquery. I think this is what Robert was alluding to earlier with his comments about join relations: ____ Robert Haas wrote: > I don't really see why. AIUI, the ModifyTable node just needs to get > the right TIDs. It's not like that has to be stacked directly on top > of a scan; indeed, in cases like UPDATE .. FROM and DELETE .. USING it > already isn't. Maybe there's some reason why the intervening level > can be a Join but not a SubqueryScan, but if so I'd expect we could > find some way of lifting that limitation without suffering too much > pain. (http://www.postgresql.org/message-id/CA+TgmoYr1PHw3X9vnVuWDcfXkzK2p_jhtWc0fV2Q58NEgcxyTA@mail.gmail.com) ____ Maybe we just need to make a subquery scan a valid target for an update, so those fixups aren't required anymore? > This is handled this in a similar way to the > trigger-updatable views code, producing 2 RTEs --- the resultRelation > RTE becomes a direct reference to the base relation, and a separate > subquery RTE acts as the source of rows to update. Some of the complexity of the current RLS code is caused by the need to do similar fixups to handle the case where the input relation isn't the same as the target relation, but is a subquery over it instead. > Anyway, feel free to do what you like with this. I wasn't planning on > submitting it to the next commitfest myself, because my non-PG > workload is too high, and I don't expect to get much time to hack on > postgresql during the next couple of months. Thanks for sending what you have. It's informative, and it shows that some of the same issues must be solved for writable security barrier views and for RLS. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 6 November 2013 09:23, Craig Ringer <craig@2ndquadrant.com> wrote: > On 11/06/2013 05:02 PM, Dean Rasheed wrote: > >> The basic idea is to have rewriteTargetView() collect up any quals >> from SB views in a new list on the target RTE, instead of adding them >> to the main query's predicates (it needs to be a list of SB quals, in >> case there are SB views on top of other SB views, in which case they >> need to be kept separate from one another). Then at the end of the >> rewriting process (after any views referenced in the SB quals have >> been expanded), a new piece of code kicks in to process any RTEs with >> SB quals, turning them into (possibly nested) subquery RTEs. > > That makes sense, though presumably you face the same problem that the > existing RLS code does with references to system columns that don't > normally exist in subqueries? > Yeah, that feels like an ugly hack. > Since this happens during query rewrite, what prevents the optimizer > from pushing outer quals down into the subqueries? > The subquery RTE is marked with the security_barrier flag, which prevents quals from being pushed down in the presence of leaky functions (see set_subquery_pathlist). >> The complication is that the query's resultRelation RTE mustn't be a >> subquery. > > I think this is what Robert was alluding to earlier with his comments > about join relations: > > ____ > Robert Haas wrote: >> I don't really see why. AIUI, the ModifyTable node just needs to get >> the right TIDs. It's not like that has to be stacked directly on top >> of a scan; indeed, in cases like UPDATE .. FROM and DELETE .. USING it >> already isn't. Maybe there's some reason why the intervening level >> can be a Join but not a SubqueryScan, but if so I'd expect we could >> find some way of lifting that limitation without suffering too much >> pain. > (http://www.postgresql.org/message-id/CA+TgmoYr1PHw3X9vnVuWDcfXkzK2p_jhtWc0fV2Q58NEgcxyTA@mail.gmail.com) > ____ > Yeah. We already do a similar thing for trigger-updatable views. So with this approach you end up with similar plans, for example: Update on base_tbl -> Subquery Scan on base_tbl ... > > Maybe we just need to make a subquery scan a valid target for an update, > so those fixups aren't required anymore? > Possibly. That feels like it would require much more extensive surgery on the planner though. I've not explored that idea, but I suspect it would quickly turn into a whole new can of worms. >> This is handled this in a similar way to the >> trigger-updatable views code, producing 2 RTEs --- the resultRelation >> RTE becomes a direct reference to the base relation, and a separate >> subquery RTE acts as the source of rows to update. > > Some of the complexity of the current RLS code is caused by the need to > do similar fixups to handle the case where the input relation isn't the > same as the target relation, but is a subquery over it instead. > >> Anyway, feel free to do what you like with this. I wasn't planning on >> submitting it to the next commitfest myself, because my non-PG >> workload is too high, and I don't expect to get much time to hack on >> postgresql during the next couple of months. > > Thanks for sending what you have. It's informative, and it shows that > some of the same issues must be solved for writable security barrier > views and for RLS. > Agreed. I'm not sure what the best way to fix those issues is though. The currently proposed approach feels pretty ugly, but I can't see a better way at the moment. Regards, Dean
All, Just a comment: I'm really glad to see the serious work on this. If RLS doesn't make it into 9.4, it'll be because the problems of RLS are fundamentally unsolvable, not because we didn't give it our best. Great work, all! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
2013/11/6 Craig Ringer <craig@2ndquadrant.com>: > On 11/05/2013 09:36 PM, Robert Haas wrote: >> I haven't studied this patch in detail, but I see why there's some >> unhappiness about that code: it's an RLS-specific kluge. Just >> shooting from the hip here, maybe we should attack the problem of >> making security-barrier views updatable first, as a separate patch. > > That's the approach I've been considering. There are a few wrinkles with > it, though: > > (a) Updatable views are implemented in the rewriter, not the planner. > The rewriter is not re-run when plans are invalidated or when the > session authorization changes, etc. This means that we can't simply omit > the RLS predicate for superuser because the same rewritten parse tree > might get used for both superuser and non-superuser queries. > > Options: > > * Store the before-rewrite parse tree when RLS is discovered on one of > the rels in the tree. Re-run the rewriter when re-planning. Ensure a > change in current user always invalidates plans. > > * Declare that it's not legal to run a query originally parsed and > rewritten as superuser as a non-superuser or vice versa. This would > cause a great deal of pain with PL/PgSQL. > > * Always add the RLS predicate and solve the problem of reliably > short-circuiting the user-supplied predicate for RLS-exempt users. We'd > need a way to allow direct (not query-based) COPY TO for tables with RLS > applied, preventing the rewriting of direct table access into subqueries > for COPY, but since we never save plans for COPY that may be fine. > > * ... ? > How about an idea that uses two different type of rules: the existing one is expanded prior to planner stage as we are doing now, and the newer one is expanded on the head of planner stage. The argument of planner() is still parse tree, so it seems to me here is no serious problem to call rewriter again to handle second stage rules. If we go on this approach, ALTER TABLE ... SET ROW SECURITY will become a synonym to declare a rule with special attribute. > (b) Inheritance is a problem when RLS is done in the rewriter. As I > understood it from Kohei KaiGai's description to me earlier, there was a > strong preference on -hackers to enforce RLS predicates for child and > parent tables completely independently. That's how RLS currently works, > but it might be hard to get the same effect when applying RLS in the > rewriter. We'd need to solve that, or redefine RLS's behaviour so that > the predicate on a parent table applies to any child tables too. > Personally I'd prefer the latter. > I'm not certain whether it was a "strong preference", even though I followed the consensus at that time. So, I think it makes sense to discuss how RLS policy shall be enforced on the child tables. As long as we can have consistent view on child tables even if it is referenced without parent tables, I don't have any arguments to your preference. Also, it makes implementation simple than the approach I tried to have; that enforces RLS policy of tables individually, because of utilization of existing rule mechanism. It is not difficult to enforce parent's RLS policy on the child relation even if it is referenced individually. All we need to do special is append RLS policy of its parent, not only child's one, if referenced table has parent. > (c) RLS might interact differently with rules declared on tables if > implemented in the rewriter, so some investigation into that would be > needed. > >> I >> would think that if we make that work, this will also work without, >> hopefully, any special hackery. And we'd get a separate, >> independently useful feature out of it, too. > > I tend to agree. I'm just a bit concerned about dealing with the issues > around RLS-exempt operations and users. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On 6 November 2013 19:12, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > 2013/11/6 Craig Ringer <craig@2ndquadrant.com>: >> On 11/05/2013 09:36 PM, Robert Haas wrote: >>> I haven't studied this patch in detail, but I see why there's some >>> unhappiness about that code: it's an RLS-specific kluge. Just >>> shooting from the hip here, maybe we should attack the problem of >>> making security-barrier views updatable first, as a separate patch. >> >> That's the approach I've been considering. There are a few wrinkles with >> it, though: >> >> (a) Updatable views are implemented in the rewriter, not the planner. >> The rewriter is not re-run when plans are invalidated or when the >> session authorization changes, etc. This means that we can't simply omit >> the RLS predicate for superuser because the same rewritten parse tree >> might get used for both superuser and non-superuser queries. >> >> Options: >> >> * Store the before-rewrite parse tree when RLS is discovered on one of >> the rels in the tree. Re-run the rewriter when re-planning. Ensure a >> change in current user always invalidates plans. >> >> * Declare that it's not legal to run a query originally parsed and >> rewritten as superuser as a non-superuser or vice versa. This would >> cause a great deal of pain with PL/PgSQL. >> >> * Always add the RLS predicate and solve the problem of reliably >> short-circuiting the user-supplied predicate for RLS-exempt users. We'd >> need a way to allow direct (not query-based) COPY TO for tables with RLS >> applied, preventing the rewriting of direct table access into subqueries >> for COPY, but since we never save plans for COPY that may be fine. >> >> * ... ? >> > How about an idea that uses two different type of rules: the existing one > is expanded prior to planner stage as we are doing now, and the newer > one is expanded on the head of planner stage. > The argument of planner() is still parse tree, so it seems to me here is > no serious problem to call rewriter again to handle second stage rules. > If we go on this approach, ALTER TABLE ... SET ROW SECURITY > will become a synonym to declare a rule with special attribute. > I don't really get this part of the discussion. Why would you want to make updatable SB views do any of that? With RLS on tables, there is only one object in play - the table itself. So I can see that there is this requirement for certain privileged users to be able to bypass the RLS quals, and hence the need to invalidate cached plans. With SB views, however, you can have multiple SB views on top of the same base table, each giving different users access to different subsets of the data, and controlled by suitable GRANTs, and suitably privileged users can be given direct access to the base table. This also gives greater flexibility than the superuser/non-superuser distinction being discussed here. I don't think a view should ever show different data to different users (unless it has been deliberately set up to do so) because that would likely lead to confusion. Is there some other use-case that I'm missing here? Regards, Dean
<div dir="ltr"><div class="gmail_extra"><br /><div class="gmail_quote">On Wed, Nov 6, 2013 at 6:38 AM, Craig Ringer <spandir="ltr"><<a href="mailto:craig@2ndquadrant.com" target="_blank">craig@2ndquadrant.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="overflow:hidden">That'sthe approach I've been considering. There are a few wrinkles with<br /> it, though:<br /><br/> (a) Updatable views are implemented in the rewriter, not the planner.<br /> The rewriter is not re-run when plansare invalidated or when the<br /> session authorization changes, etc. This means that we can't simply omit<br /> theRLS predicate for superuser because the same rewritten parse tree<br /> might get used for both superuser and non-superuserqueries.<br /></div></blockquote></div><br /></div><div class="gmail_extra">Incidentally I still feel this isat root the problem with updateable views in general. I know it's a bit off to be tossing in concerns from the peanut gallerywhen I'm not actually offering to work on it and others are having putting in serious efforts in this area and havingsome success. So take this for what it's worth...<br /><br /></div><div class="gmail_extra">I think the right approachfor updateable views would be to support a syntax like this in the planner:<br /><br /></div><div class="gmail_extra">UPDATE(select * from tab1,tab2 ...) WHERE <a href="http://tab1.id" target="_blank">tab1.id</a> = .. SET...<br /><br /></div><div class="gmail_extra">Then updateable views would just rewrite the query mechanically the wayregular views work by substituting the view definition in place of the view name. Since all the work would be done inthe planner it would have access to the same kinds of information that regular join planning etc have.<br /></div><divclass="gmail_extra"><br /></div><div class="gmail_extra">I'm not sure if this solves all the problems with RLSbut it would solve the concern about plan invalidations and I think it would make it simpler to reason about securityrules that are otherwise happening at plan time.<br clear="all" /></div><div class="gmail_extra"><br />-- <br />greg<br/></div></div>
On 11/07/2013 09:47 PM, Greg Stark wrote: > Incidentally I still feel this is at root the problem with updateable > views in general. I know it's a bit off to be tossing in concerns from > the peanut gallery when I'm not actually offering to work on it and > others are having putting in serious efforts in this area and having > some success. So take this for what it's worth... Frankly, the peanut gallery input can be quite handy. It's easy to get so stuck in the way you've seen it thought about already that you don't see other ways to view it. Plus, sometimes the peanut gallery becomes the "oh, I don't like this at all" crowd when commit time is approaching, so early comments are better than no comments then last minute complaints. > I think the right approach for updateable views would be to support a > syntax like this in the planner: > > UPDATE (select * from tab1,tab2 ...) WHERE tab1.id <http://tab1.id> = .. > SET ... I want to support that for rewritten parse trees, and therefore (because of recursive rewrite) in pre-rewrite parse trees. It's exactly what's needed to make this sane, IMO, and I think this is what Robert was suggesting with making UPDATE capable of dealing with operating directly on a subquery scan. I'm not at all convinced it should be exposed to the user and accepted by the parser as SQL, but I don't know if that's what you were suggesting. Robert? Is this what you meant? If so, any chance you can point a planner neophyte like me in vaguely the right direction? > I'm not sure if this solves all the problems with RLS but it would solve > the concern about plan invalidations and I think it would make it > simpler to reason about security rules that are otherwise happening at > plan time. I don't think it'd help with plan invalidations. View expansion happens at rewrite time, and we don't re-run the rewriter on the original parse tree when we invalidate plans and re-plan. The current patch entirely elides the RLS quals and subquery when running as superuser and that won't work with view expansion at the rewrite stage. So we still have that to deal with, and need to handle a few side issues with portals and user id changes etc, but the biggest issue is coming up with an acceptable way to update a security barrier view or subquery. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 11/07/2013 06:11 PM, Dean Rasheed wrote: > I don't really get this part of the discussion. Why would you want to > make updatable SB views do any of that? I don't, especially. If we're going to use updatable SB views as the basis for RLS then we need the option to skip adding the qual for superuser. That might be done outside the SB code its self, by not invoking the SB view rewrite on the base rel when we see we're running as an RLS-exempt user. It's all about dealing with the re-plan problem really. > With SB views, however, you can have multiple SB views on top of the > same base table, each giving different users access to different > subsets of the data, and controlled by suitable GRANTs, and suitably > privileged users can be given direct access to the base table. This > also gives greater flexibility than the superuser/non-superuser > distinction being discussed here. That's a thought. Can we munge what the planner sees in pg_class instead? So when we see a ref to an RLS relation we transparently substitute the oid for a hidden SB view over the relation instead. For RLS exempt users we omit that substitution. Since the lookups and view subs are done in the rewrite phase it probably doesn't help us much, but it'd get rid of the need to play about with substituting a RTE_RELATION with an RTE_SUBQUERY dynamically during rewrite. > I don't think a view should ever show different data to different > users (unless it has been deliberately set up to do so) because that > would likely lead to confusion. Is there some other use-case that I'm > missing here? The main concern is pg_dump - it's important that dumps be able to take a complete copy without relying on hacks or bug-free user-written RLS quals. Highly privileged users should also be exempt from RLS so they don't invoke untrusted functions that're part of RLS quals written by lower-rights users. This isn't really "superuser" vs "not superuser. In fact we'll want a new right that controls whether RLS can be bypassed, and another that controls the ability to set RLS rights on tables. Both of those would be superuser only by default, but could be delegated. (Note: It's important that table owners _not_ get the right to set RLS on tables by default for security reasons. I'll explain in more detail later.) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 6, 2013 at 1:38 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > (a) Updatable views are implemented in the rewriter, not the planner. > The rewriter is not re-run when plans are invalidated or when the > session authorization changes, etc. This means that we can't simply omit > the RLS predicate for superuser because the same rewritten parse tree > might get used for both superuser and non-superuser queries. My impression was that we had discussed this problem with respect to some earlier version of the RLS patch and that the conclusion of that discussion was that we needed to record in the cached plan whether the plan was one which is sensitive to the user ID and, if so, avoid using that plan with a different user ID. I am murky on the details; I believe the original discussion of this topic was a year or more ago. > (b) Inheritance is a problem when RLS is done in the rewriter. As I > understood it from Kohei KaiGai's description to me earlier, there was a > strong preference on -hackers to enforce RLS predicates for child and > parent tables completely independently. Not to put a too fine a point on it, but I think that's a really bad plan; and I don't remember any such discussion. > That's how RLS currently works, > but it might be hard to get the same effect when applying RLS in the > rewriter. We'd need to solve that, or redefine RLS's behaviour so that > the predicate on a parent table applies to any child tables too. > Personally I'd prefer the latter. Yes, let's please redefine it. The goal here ought to be to make RLS work as smoothly as possible with the rest of the system, not to invent weird semantics that are both unlike what we do elsewhere - and difficult to implement, to boot. I thought the whole point of implementing security barrier views was that read-side RLS would work just the same way, not having randomly incompatible semantics. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Nov 7, 2013 at 9:08 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 11/07/2013 09:47 PM, Greg Stark wrote: >> Incidentally I still feel this is at root the problem with updateable >> views in general. I know it's a bit off to be tossing in concerns from >> the peanut gallery when I'm not actually offering to work on it and >> others are having putting in serious efforts in this area and having >> some success. So take this for what it's worth... > > Frankly, the peanut gallery input can be quite handy. It's easy to get > so stuck in the way you've seen it thought about already that you don't > see other ways to view it. Plus, sometimes the peanut gallery becomes > the "oh, I don't like this at all" crowd when commit time is > approaching, so early comments are better than no comments then last > minute complaints. > >> I think the right approach for updateable views would be to support a >> syntax like this in the planner: >> >> UPDATE (select * from tab1,tab2 ...) WHERE tab1.id <http://tab1.id> = .. >> SET ... > > I want to support that for rewritten parse trees, and therefore (because > of recursive rewrite) in pre-rewrite parse trees. It's exactly what's > needed to make this sane, IMO, and I think this is what Robert was > suggesting with making UPDATE capable of dealing with operating directly > on a subquery scan. > > I'm not at all convinced it should be exposed to the user and accepted > by the parser as SQL, but I don't know if that's what you were suggesting. > > Robert? Is this what you meant? If so, any chance you can point a > planner neophyte like me in vaguely the right direction? I haven't studied this issue well enough to know what's really needed here, but Dean Rasheed's approach sounded like a promising tack to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
While Craig has been soldiering onward to the write side of things, I've been working with some groups who are evaluating if the RLS feature is good enough for switching some real world apps over to use PostgreSQL. Amazingly I can even say that these are applications involving the US Department of Defense, and they're going well so far. There has been some concern on this list as to whether this feature is being evaluated by people who are serious about real-world database security issues. The crowd I'm talking to now sure is. There's some interesting bad news, like how we're going to have to completely refactor all the "if (!superuser())" shortcuts into real permissions one day to make them happy. But for the most part the prototype conversions have been working out. The RLS feature set available with the CF submission is good enough for those projects to continue exploring PostgreSQL. I feel good now that if the code issues around what's already been developed are sorted out to commit quality, the users really will queue once this hits. == Review demo code == Attached is a small demo piece built by my new co-worker Jeff McCormick that I've been hacking lightly on. The idea was to get something a step beyond trivial that maps into real-world need, well enough that we could get people to agree it was usefully representative. What they did is take the sort of security levels these apps have, which often includes two different criteria for each row, and mapped that into what the RLS submission supports. There are a set of roles for the permission to see various shapes (triangle/circle/square/heart) and colors, and then people can see a row only if they have both color and shape permissions. The part that I thought was cute when I first read it is how this query looks up data in the catalog to accomplish that: ALTER TABLE colors_shapes SET ROW SECURITY FOR ALL TO ( color IN ( SELECT permission_role.rolname FROM pg_roles active_role, pg_auth_members, pg_roles permission_role WHERE active_role.rolname = current_user AND active_role.oid = pg_auth_members.member AND permission_role.oid = pg_auth_members.roleid ) AND shape IN ( SELECT permission_role.rolname FROM pg_roles active_role, pg_auth_members, pg_roles permission_role WHERE active_role.rolname = current_user AND active_role.oid = pg_auth_members.member AND permission_role.oid = pg_auth_members.roleid ) ); Insert some data that looks like this: id | color | shape ----+-------------+--------------- 1 | blue_role | triangle_role 2 | red_role | heart_role 3 | purple_role | circle_role And then users can see rows only if they have a pair of matches on both those role sets. The fact that you can just plug in whatever you want into SET ROW SECURITY allows all kinds of things. I have a more complicated sample we're still poking at, using a full pl/pgsql function there, but I've already got enough to try and cover in this message to bring that in today. One reason the users I've been talking to feel comfortable that they'll be able to build apps usefully with this feature is this interface. It feels more like a pluggable API for defining rules than a specific implementation. Congratulations to the usual community members who expect complete feature overkill in every design. == Code review == Onto coding. Rather than trying to cope with patches I pulled KaiGai's tree from https://github.com/kaigai/sepgsql/tree/rowsec to try things out. That hasn't been merged against head in a while and it's suffered serious but not terminal bit rot. The REPLICA IDENTITY changes in particular touched adjacent code all over the place, since it and RLS are both added things to the end of existing lists. The merges to get thing back to working again were only hard when my eyes crossed on the always fun pg_class bootstrapping table. I did all that myself up to committed changes on Dec 3. The only real change I made was to punt the OID number assignment into a whole different range, to make future merge rot less likely. Those are going to get reassigned at commit time anyway, no need to try and guess the appropriate next number for each today. I just pushed a fork of KaiGai's repo to https://github.com/gregs1104/sepgsql/tree/rls that includes all of the must fix items to get the feature working again. That's what the attached demo code was tested again. The regression tests are still seriously busted, beyond just regenerating new output. I pushed that off for now, I need to study recent commits better to catch up on what changed. For community reference I'm attaching an updated patch that forks master after dfd5151c5800448a2be521797f95e1aa63d87b67 and then has the merging I did. I'm hoping that getting a positive functional review will lure KaiGai back to help me update things. Hopefully pulling or looking at commits from my merge attempt helps with that. The (blue) elephant in the room here for this feature is the list of committers in particular who are very picky about the parts of the code this feature is touching (*cough* Tom). One reason I'm trying to get these demo pieces together--Jeff here is building a whole RLS test sample repo--is to try and make the job of testing this out easier. As I dig myself out of the hole I'm in after getting really sick instead of going to PG.EU, I've got a whole lot of time set aside to work on community features lined up. == Outstanding issues == Things I can already see to work on here are: -Fix the regression tests -Catch up to master again -Continue to expand the functional tests -Is there enough information about row security available in psql output? For example, there's nothing in "\d" output that suggests it might exist. pg_rowsecurity is a monster to try and read. -Documentation needs plenty of editing, which I can take care of. == Cover Channels == One last topic since I know it's been a sore point. Covert channels. There's still some obvious ones around. In this example I was plenty uneasy about some of what you can learn from the EXPLAIN output too. But no one I've talked to about deploying PostgreSQL RLS seem terrible uncomfortable with the exposure at this point. RLS is just one layer of extra protection on checks that are also happening in the application API exposing the data to users. No one expects any one layer to be the impenetrable one. In the interest of CYA, I've talked through some scenarios where attackers have resources like a hacked libpq client and enough info to search the key space. Instead of expecting that the database can survive that, which is pretty unreasonable, what I'm getting requests for now is better audit logging to make it easier to detect that. We should certainly close any holes we can, but where things are at today doesn't seem functionally unacceptable anymore. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
Attachment
On 12/14/2013 11:24 AM, Gregory Smith wrote: > The RLS feature set available with the CF submission is good enough > for those projects to continue exploring PostgreSQL You may want to check out the updated writable security-barrier views patch. http://www.postgresql.org/message-id/52AB112B.6020403@2ndquadrant.com It may offer a path forward for the CF submission for RLS, letting us get rid of the var/attr fiddling that many here objected to. > There's some interesting bad > news, like how we're going to have to completely refactor all the "if > (!superuser())" shortcuts into real permissions one day to make them > happy. We'll need that for mandatory access control. Initially I'd probably want to do it through SEPostgreSQL role checks, but it'd be desirable to have a generic mechanism that'd work with any OS's role management eventually. I think we're going to need finer grained rights than we have now, as "superuser or not" isn't really good enough for some things. In particular, the fact that if plperl or plpython are installed the superuser (often accessible remotely) can trivially run arbitrary C-level functions and invoke commands as the PostgreSQL OS user isn't going to be acceptable in these environments, but they may still want the trusted PLs. > Attached is a small demo piece built by my new co-worker Jeff McCormick > that I've been hacking lightly on. The idea was to get something a > step beyond trivial that maps into real-world need, well enough that > we could get people to agree it was usefully representative. I really appreciate that. I suspect more work will be needed on the interface for the actual row-level security code, and work like this helps point in the direction of what's needed. > One reason the users I've been talking to feel > comfortable that they'll be able to build apps usefully with this > feature is this interface. It feels more like a pluggable API for > defining rules than a specific implementation. Congratulations to the > usual community members who expect complete feature overkill in every > design. There's a C-level API that's completely pluggable, for what it's worth. I'm actually unconvinced that the SQL-level API for row-security is appropriate - I'm persuaded by comments made on the list that the RLS syntax and security model needs to be easy enough to use to be useful if it's to add much on top of what we get with updatable s.b. views, security barrier, leakproof, etc. I don't think "run some SQL for every row" fits that. We have that at the C level in the RLS patch, and I wonder if it should stay there. I find the heirachical and non-heirachical label security model used in Teradata to be extremely interesting and worthy of study. The concept is that there are heirachical label policies (think "classified", "unclassified", etc) or non-heirachical (mutually exclusive labels). A user can see a row if they have all the policies used in a table, and each of the user's policy values is sufficient to see the row. For non-heiracical policies that means the user and row must have at least one label in common; for heiracical policies it means the user label must be greater than or equal to the row label. That model lets you quite flexibly and easily build complex security models. It'd work well with PostgreSQL, too: We could implement something like non-heirachical policies using a system varbit column that's added whenever a non-heirachical policy gets added to a table, and simply AND it with the varbit on the user's policy to see if they can access the row. Or use a compact fixed-width bitfield. Heirachical policies would use a 1 or 2 byte int system column to store the label. So we'd have one system column per policy that's been applied to each table. For users we'd need a catalog table to describe policies, and a join table mapping users to policies with a user policy value. Optimization by storing policy values for users directly in pg_role might be worthwhile. At the SQL level, an admin would then: - Define policies - Assign users labels in policies - Apply policies to tables PostgreSQL would be responsible for ensuring that rows written by a user were labeled with that user's non-heirachical labels, and with their current session level for their heirachical labels. It'd also be responsible for enforcing matching on reads. Internally, we can do all of this with the functoinality provided by security-barrier views once write support is added. The interface exposed to the user is made drastically easier to use, though. Your example would become two non-heirachical security policies: CREATE SECURITY POLICY colors AS (blue, red, purple); CREATE SECURITY POLICY shapes AS (triangle, circle, square); ALTER TABLE color_shapes ADD SECURITY POLICY colors; ALTER TABLE colors_shapes ADD SECURITY POLICY shapes; You'd assign users to the policies: ALTER USER user1 SET SECURITY LABEL FOR POLICY colors TO blue, purple; ALTER USER user2 SET SECURITY LABEL FOR POLICY colors TO red; ALTER USER user1 SET SECURITY LABEL FOR POLICY shapes TO circle; ALTER USER user2 SET SECURITY LABEL FOR POLICY shapes TO triangle; user1 would now write rows with colors=blue|purple, shapes=circle. user2 would write rows with colors=red, shapes=triangle. We'd provide session-level control over the current active label set, eg: SET (LOCAL) LABELS FOR POLICY ... TO ... so users can further control the rows they write and see at the session level, but only within the limits of their authorized labels. Much like what we have with SET ROLE at the moment. This would be particularly crucial with heirachical policies, where rows get written at the current session level. I'll look at the rest of the mail a bit later; it's Saturday morning here. I just wanted to raise this for your consideration now, as I think this offers us a way to provide row-security that doesn't require every user to just roll their own. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/14/2013 11:24 AM, Gregory Smith wrote: > > > Things I can already see to work on here are: > > -Fix the regression tests > -Catch up to master again I've got much of that in the tree: https://github.com/ringerc/postgres/tree/rls-9.4 -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi all I'd like to outline a path toward getting row-security into core. Comments / discussion appreciated. I think I/we need to: - Finish and commit updatable security barrier views. I've still got a lot of straightening out to do there. - Add an attribute to portals that stores the user ID at the time the portal was planned. Possibly extensibly; I'd be surprised if we won't need to associate other local info with a portal later. - Allow storage of the pre-rewrite query plan and let saved plans be marked as needing rewrite when replanned. We'll need this to permit some users (initially just a hardcoded "superuser"; later by posession of a certain right) to be totally exempt from row-security policy. (Alternative: store separate with- and without-row-security plan trees and pick which we use). - Decide on and implement a structure for row-security functionality its self. I'm persuaded by Robert's comments here, that whatever we expose must be significantly more usable than a DIY on top of views (with the fix for updatable security barrier views to make that possible). I outlined the skeleton of a possible design in my earlier post, with the heirachical and non-heirachical access labels. Should be implemented using the same API we expose for extensions (like SEPostgresql RLS). - Produce and commit a patch that adds the C API for row-security, including calls to make it easy to wrap any relation in a dynamically created or stored updatable security barrier subquery during rewrite. - Produce and commit a patch series implementing the syntax, catalog tables / catalog changes, documentation, etc for row-security that uses this C API and commit it to core. SEPostgreSQL RLS can then be built on top of the same API, using the same core support. Thoughts? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > - Add an attribute to portals that stores the user ID at the time the > portal was planned. Possibly extensibly; I'd be surprised if we won't > need to associate other local info with a portal later. This bit seems rather confused. A portal is a runnable query; we do not support replanning midstream, and I don't think we support changes of UID either. I agree that the plan cache needs to support treating change of UID as a reason to discard a cached plan, but that's nothing to do with portals. regards, tom lane
On 12/16/2013 10:43 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> - Add an attribute to portals that stores the user ID at the time the >> portal was planned. Possibly extensibly; I'd be surprised if we won't >> need to associate other local info with a portal later. > > This bit seems rather confused. A portal is a runnable query; we > do not support replanning midstream, and I don't think we support > changes of UID either. We _do_ support changes of UID, or rather, current_user returns the session user ID at the point in time it runs in the portal. This can be observed with SECURITY DEFINER pl/pgsql functions returning refcursor, and with cursors that're retained across a SET SESSION AUTHORIZATION. They don't even need to be WITH HOLD, and s.s.a. can occur within a transaction. The point is to return the user ID at the time the portal was created, rather than whatever the session now is. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/16/13 9:36 AM, Craig Ringer wrote: > - Finish and commit updatable security barrier views. I've still got a > lot of straightening out to do there. I don't follow why you've put this part first. It has a lot of new development and the risks that go along with that, but the POC projects I've been testing are more interested in the view side issues. > - Decide on and implement a structure for row-security functionality its > self. I'm persuaded by Robert's comments here, that whatever we expose > must be significantly more usable than a DIY on top of views (with the > fix for updatable security barrier views to make that possible). Can't say i agree we should be getting tangled into making this slick on the user side right now. If you open a new can of worms like heirachical access labels, this goes back into basic design, and we'll spend months on that before returning to exactly the same implementation details. I tried to make a case for how having a really generic mechanism that's doesn't presume to know how labels will be assigned can be a good thing. Given the state this is all in right now, I'd much rather publish a hard to use but powerful API than to presume you're going to get an easier to use design right. The place I'm at here is trying to figure out the simplest useful thing that could be committed and then hammer on the details. (Not the first time I've beat that drum on a feature) Your roadmap goes way past that, which is great to avoid being painted into a corner, but I'm thinking more about what a useful feature freeze point would look like given it's December 16 now. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
On Mon, Dec 16, 2013 at 3:12 PM, Gregory Smith <gregsmithpgsql@gmail.com> wrote: > On 12/16/13 9:36 AM, Craig Ringer wrote: >> >> - Finish and commit updatable security barrier views. I've still got a >> lot of straightening out to do there. > > I don't follow why you've put this part first. It has a lot of new > development and the risks that go along with that, but the POC projects I've > been testing are more interested in the view side issues. I don't really see a way that any of this can work without that. To be clear, that work is required even just for read-side security. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 16 December 2013 14:36, Craig Ringer <craig@2ndquadrant.com> wrote: > - Decide on and implement a structure for row-security functionality its > self. I'm persuaded by Robert's comments here, that whatever we expose > must be significantly more usable than a DIY on top of views (with the > fix for updatable security barrier views to make that possible). I > outlined the skeleton of a possible design in my earlier post, with the > heirachical and non-heirachical access labels. Should be implemented > using the same API we expose for extensions (like SEPostgresql RLS). That part isn't clear why we "must" do better than that. Having declarative security is a huge step forward, in just the same way that updateable views were. They save the need for writing scripts to implement things, rather than just having a useful default. If there is a vision for that, lets see the vision. And then decide whether its worth waiting for. Personally, I see no reason not to commit the syntax we have now. So people can see what we'll be supporting, whenever that is. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 17 December 2013 17:03, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Dec 16, 2013 at 3:12 PM, Gregory Smith <gregsmithpgsql@gmail.com> wrote: >> On 12/16/13 9:36 AM, Craig Ringer wrote: >>> >>> - Finish and commit updatable security barrier views. I've still got a >>> lot of straightening out to do there. >> >> I don't follow why you've put this part first. It has a lot of new >> development and the risks that go along with that, but the POC projects I've >> been testing are more interested in the view side issues. > > I don't really see a way that any of this can work without that. To > be clear, that work is required even just for read-side security. Not sure I'd say required, but its certainly desirable to have updateable security barrier views in themselves. And it comes across to me as a cleaner and potentially more performant way of doing the security checks for RLS. So I think its the right thing to do to wait for this, even if we can't do that for 9.4 Realistically, we have a significant amount of work before we're ready to pass a high security audit based around these features. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/18/2013 01:03 AM, Robert Haas wrote: > On Mon, Dec 16, 2013 at 3:12 PM, Gregory Smith <gregsmithpgsql@gmail.com> wrote: >> > On 12/16/13 9:36 AM, Craig Ringer wrote: >>> >> >>> >> - Finish and commit updatable security barrier views. I've still got a >>> >> lot of straightening out to do there. >> > >> > I don't follow why you've put this part first. It has a lot of new >> > development and the risks that go along with that, but the POC projects I've >> > been testing are more interested in the view side issues. > I don't really see a way that any of this can work without that. To > be clear, that work is required even just for read-side security. It's possible to build limited read-side-only security on top of the existing s.b. views as they stand, with no update support. You can grant write-only access to the base relations, and require people to use a different relation name / schema when they want to access a relation for write vs for read. You can't use RETURNING, and you can still learn from result rowcounts etc. It's clumsy but usable-ish. So it works - as long as you're using absolutely 100% read-only access for users you need to constrain, or you don't mind explicitly referring to the base table for write operations and not being able to use RETURNING. I've been looking at write support primarily because I was under the impression from prior discussion I read that the feature wasn't considered committable as a read-only feature. If a consensus can be built that read-only RLS would be acceptable after all, then I'll happily defer that in favour of the other work items. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/18/2013 02:21 AM, Simon Riggs wrote: > On 16 December 2013 14:36, Craig Ringer <craig@2ndquadrant.com> wrote: > >> - Decide on and implement a structure for row-security functionality its >> self. I'm persuaded by Robert's comments here, that whatever we expose >> must be significantly more usable than a DIY on top of views (with the >> fix for updatable security barrier views to make that possible). I >> outlined the skeleton of a possible design in my earlier post, with the >> heirachical and non-heirachical access labels. Should be implemented >> using the same API we expose for extensions (like SEPostgresql RLS). > > That part isn't clear why we "must" do better than that. > > Having declarative security is a huge step forward, in just the same > way that updateable views were. They save the need for writing scripts > to implement things, rather than just having a useful default. In my view the proposed patch doesn't offer a significant improvement in declarative security, beyond what we can get by just adding update support to s.b. views and using search_path to control whether a user sees the view or the base table. It's a lot like Oracle Virtual Private Database (VPD): A set of low level building blocks you can build your own flexible row security model with. One where you have to very carefully write security-sensitive predicates and define all your security model tables, etc yourself. That's why I'm now of the opinion that we should make it possible to achieve the same thing with s.b. views and the search_path (by adding update support)... then build a declarative row-security system that doesn't require the user fiddling with delicate queries and sensitive scripts on top of that. > Personally, I see no reason not to commit the syntax we have now. So > people can see what we'll be supporting, whenever that is. Do you mean commit the syntax without the supporting functionality, so it doesn't operate / just ERROR's initially? Or are you suggesting that the project commit the current patch with the known issues with user ID changes and the concerns about its internal structure and operation, then clean it up down the track? > If there is a vision for that, lets see the vision. And then decide > whether its worth waiting for. OK, I'll put an example together as a follow-up to this post - the "how it should look" part. Below I explain the *why* part. What the current patch does is add something a bit like a simplified and cut-down version of Oracle VPD. Oracle's VPD predicates are generated by a PL/SQL procedure that emits the SQL for a predicate. It has varying levels of caching to mitigate performance effects. The proposed patch for Pg doesn't do that. We'll have the predicate fixed and stored in the catalogs. It'll be like VPD with policy level SHARED_STATIC locked in as the only option. See http://docs.oracle.com/cd/B28359_01/network.111/b28531/vpd.htm#DBSEG98250 for details on Oracle VPD. It's essentially a few tools you can build your own row security model on, making the job a little easier. The DB designer / DBA still does a lot of work to define security on each table and do a bunch of procedure writing and setup to make it work. They have to get potentially security sensitive queries just right for each table. IMO it really isn't much different to just having an updatable security barrier view. Differences arise when you look at VPD-exempt user acess rights, at policy groups, etc, but we don't have any of that anyway. The only big differences are that you can apply a VPD policy to an existing table w/o having to refresh all views/functions that point to it, and that some users can be made exempt. (Note: I have not used VPD or Label Security myself; my comments are intentionally based only on public documentation and on discussions with others). Oracle then built Oracle Label Security (their row-security solution) *on* *top* of VPD. It's much more of a ready-to-go, configure it and use it product, where you don't have to write procedures and define security tables and generally build most of it yourself. See: http://docs.oracle.com/cd/B14117_01/network.101/b10774.pdf The Teradata implementation Robert pointed out skips the VPD-like layer entirely. It just implements a fairly simple and flexible label security model directly. You don't have to muck about with the details, you don't have to deal with the guts of the row security implementation, you just say "here's what I want". I'd prefer to avoid adding syntax and interfaces for a feature that'll likely become mostly implementation detail for the feature users actually want - declarative security. That's part of why I've been focused on upd. s. b. views. It'll provide a way for users to do flexible DIY row-security without introducing a big chunk of new syntax and functionality we'll be stuck with supporting. It'll also be a useful building block for declarative row security. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-14 05:40, Craig Ringer wrote: > I find the heirachical and non-heirachical label security model used in > Teradata to be extremely interesting and worthy of study. > > The concept is that there are heirachical label policies (think > "classified", "unclassified", etc) or non-heirachical (mutually > exclusive labels). A user can see a row if they have all the policies > used in a table, and each of the user's policy values is sufficient to > see the row. For non-heiracical policies that means the user and row > must have at least one label in common; for heiracical policies it means > the user label must be greater than or equal to the row label. > > That model lets you quite flexibly and easily build complex security > models. It'd work well with PostgreSQL, too: We could implement > something like non-heirachical policies using a system varbit column > that's added whenever a non-heirachical policy gets added to a table, > and simply AND it with the varbit on the user's policy to see if they > can access the row. Or use a compact fixed-width bitfield. Heirachical > policies would use a 1 or 2 byte int system column to store the label. Is is necessary to fix the attribute type of the security label? The label model described above seems to restrictive to support e.g. the labels described on p18 of the 'healthcare classification system' (HCS) (http://www.hl7.org/documentcenter/public/wg/secure/3.%20HCS%20Guide%20Final%202013%200322%20JMD.pdf). The HCS security label is based on the NIST's FIPS188 standard / http://www.itl.nist.gov/fipspubs/fip188.htm regards, Yeb Havinga
On Tue, Dec 17, 2013 at 1:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Not sure I'd say required, but its certainly desirable to have > updateable security barrier views in themselves. And it comes across > to me as a cleaner and potentially more performant way of doing the > security checks for RLS. Yes, that's how I'm thinking of it. It's required in the sense that if we don't do it as a separate patch, we'll need to fold many of changes into the RLS patch, which IMV is not desirable. We'd end up with more complexity and less functionality with no real upside that I can see. But I think we are basically saying the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Dec 18, 2013 at 3:30 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > In my view the proposed patch doesn't offer a significant improvement in > declarative security, beyond what we can get by just adding update > support to s.b. views and using search_path to control whether a user > sees the view or the base table. > > It's a lot like Oracle Virtual Private Database (VPD): A set of low > level building blocks you can build your own flexible row security model > with. One where you have to very carefully write security-sensitive > predicates and define all your security model tables, etc yourself. > > That's why I'm now of the opinion that we should make it possible to > achieve the same thing with s.b. views and the search_path (by adding > update support)... then build a declarative row-security system that > doesn't require the user fiddling with delicate queries and sensitive > scripts on top of that. To be clear, I wasn't advocating for a declarative approach; I think predicates are fine. There are usability issues to worry about either way, and my concern is that we address those. A declarative approach would certainly be valuable in that, for those people for whom it is sufficiently flexible, it's probably quite a lot easier than writing predicates. But I fear that some people will want a lot more generality than a label-based system can accommodate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/18/2013 11:14 PM, Robert Haas wrote: > To be clear, I wasn't advocating for a declarative approach; I think > predicates are fine. There are usability issues to worry about either > way, and my concern is that we address those. A declarative approach > would certainly be valuable in that, for those people for whom it is > sufficiently flexible, it's probably quite a lot easier than writing > predicates. But I fear that some people will want a lot more > generality than a label-based system can accommodate. Having spent some time reading the HL7 spec, I now tend to agree that labels alone are not going to be sufficient. There are security decisions made based on: - Security labels - User/group/role memberships, both for accessor and target entity owner (down to the row level) - Black/white list ACLs, with inheritance and deny rules. - One-off or narrow authorizations. "I permit my GP to examine my mental health record details, but only over the last year, and the authorization stands only for today." - Authorization assertions. "I declare that the patient told me it is OK to access these, let me see them." - "Break glass" access. "This is an emergency. Give me access and I'll deal with the consequences later." So while security labels are an important part of the picture I'm forced to agree that they are not sufficient, and that a generalized row security mechanism actually is necessary. We don't have the time or resources to build all these sorts of things individually, and if we did it'd still be too inflexible for somebody. In the end, sometimes I guess there's no replacement for "WHERE call_some_procedure()" In particular, labels are totally orthogonal to entity identity in the data model, and being able to make row access decisions based on information already in the data model is important. FK relationships to owning entities and relationships between entities must be usable for security access policy decisions. So: I'm withdrawing the proposal to go straight to label security; I concede that it's insufficiently expressive to meet all possible needs, and we don't have the time to build anything declarative and user-friendly that would be. I do want to make sure that whatever we include this time around does not create problems for a future label security approach, but that's kind of required by the desire to add SEPostgreSQL RLS down the track anyway. Given that: What are your specific usability concerns about the current approach proposed in KaiGai's patch? My main worry was that it requires the user to build everything manually, and is potentially error prone as a result. To address that we can build convenience features (label security, ACL types and operators, etc) on top of the same infrastructure later - it doesn't have to go in right away. So putting that side, the concerns I have are: - The current RLS design is restricted to one predicate per table with no contextual control over when that predicate applies. That means we can't implement anything like "policy groups" or overlapping sets of predicates, anything like that has to be built by the user. We don't need multiple predicates to start with but I want to make sure there's room for them later, particularly so that different apps / extensions that use row-security don't have to fight with each other. - You can't declare a predicate then apply it to a bunch of tables with consistent security columns. Updating/changing predicates will be a pain. We might be able to get around that by recommending that people use an inlineable SQL function to declare their predicates, but "inlineable" can be hard to pin down sometimes. If not, we need something akin to GRANT ... ALL TABLES IN SCHEMA ... for row security, and ALTER DEFAULT PRIVILEGES ... too. - It's too hard to see what tables have row-security and what impact it has. Needs psql improvements. - There's no way to control whether or not a client can see the row-security policy definition. The other one I want to deal with is secure session variables, but that's mostly a performance optimisation we can add later. What's your list? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/18/13 10:21 PM, Craig Ringer wrote: > In the end, sometimes I guess there's no replacement for "WHERE > call_some_procedure()" That's where I keep ending up at. The next round of examples I'm reviewing this week plug pl/pgsql code into that model. And the one after that actually references locally cached data that starts stored in LDAP on another machine altogether. That one I haven't even asked for permission to share with the community because of my long standing LDAP allergy, but the whole thing plugs into the already submitted patch just fine. (Shrug) I started calling all of the things that generate data for RLS to filter on "label providers". You've been using SELinux as an example future label provider. Things like this LDAP originated bit are another provider. Making the database itself a richer label provider one day is an interesting usability improvement to map out. But on the proof of concept things I've been getting passed I haven't seen an example where I'd use that yet anyway. The real world label providers are too complicated.
On Wed, Dec 18, 2013 at 10:21 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > My main worry was that it requires the user to build everything > manually, and is potentially error prone as a result. To address that we > can build convenience features (label security, ACL types and operators, > etc) on top of the same infrastructure later - it doesn't have to go in > right away. > > So putting that side, the concerns I have are: > > - The current RLS design is restricted to one predicate per table with > no contextual control over when that predicate applies. That means we > can't implement anything like "policy groups" or overlapping sets of > predicates, anything like that has to be built by the user. We don't > need multiple predicates to start with but I want to make sure there's > room for them later, particularly so that different apps / extensions > that use row-security don't have to fight with each other. > > - You can't declare a predicate then apply it to a bunch of tables with > consistent security columns. Updating/changing predicates will be a > pain. We might be able to get around that by recommending that people > use an inlineable SQL function to declare their predicates, but > "inlineable" can be hard to pin down sometimes. If not, we need > something akin to GRANT ... ALL TABLES IN SCHEMA ... for row security, > and ALTER DEFAULT PRIVILEGES ... too. > > - It's too hard to see what tables have row-security and what impact it > has. Needs psql improvements. > > - There's no way to control whether or not a client can see the > row-security policy definition. > > > The other one I want to deal with is secure session variables, but > that's mostly a performance optimisation we can add later. > > What's your list? I don't have a lot of specific concerns apart from what I mentioned here: http://www.postgresql.org/message-id/CA+TgmoYC37qWNQkKQx3P3GU3Psfh3Tcox8uE06nnUiqn_4RAqA@mail.gmail.com One thing we do need to think about is our good friend search_path, and making sure that it's relatively easy to implement RLS in a way that's secure against malicious manipulation thereof. I do also agree with some of your concerns, particularly the first two ("is it too manual?" and "insufficient contextual control"). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/13/13 11:40 PM, Craig Ringer wrote: > You may want to check out the updated writable security-barrier views patch. > > http://www.postgresql.org/message-id/52AB112B.6020403@2ndquadrant.com > > It may offer a path forward for the CF submission for RLS, letting us > get rid of the var/attr fiddling that many here objected to. With my advocacy hat on, I'd like to revisit this idea now that there's a viable updatable security barrier view submission. I thought the most serious showstopper feedback from the last CF's RLS submission was that this needed to be sorted out first. Reworking KaiGai's submission to merge against Dean's new one makes it viable again in my mind, and I'd like to continue toward re-reviewing it as part of this CF in that light. Admittedly it's not ideal to try and do that at the same time the barrier view patch is being modified, but I see that as a normal CF merge of things based on other people's submissions. I mentioned advocacy because the budding new PostgreSQL test instances I'm seeing now will lose a lot of momentum if we end up with no user visible RLS features in 9.4. The pieces we have now can assemble into something that's useful, and I don't think that goal is unreasonably far away. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
On 01/18/2014 03:27 AM, Gregory Smith wrote: > With my advocacy hat on, I'd like to revisit this idea now that there's > a viable updatable security barrier view submission. I thought the most > serious showstopper feedback from the last CF's RLS submission was that > this needed to be sorted out first. Reworking KaiGai's submission to > merge against Dean's new one makes it viable again in my mind, and I'd > like to continue toward re-reviewing it as part of this CF in that > light. I had hoped to have this done weeks ago, but was blocked on getting a viable approach to updatable security barrier views in place. I really appreciate Dean, with his greater experience and skill in this area, looking into it. As it is I'm spending today reworking the RLS patch on top of the new approach to updatable security barrier views. Then it'll be a matter of tests, lots and lots of tests of every weird corner I can think of. > Admittedly it's not ideal to try and do that at the same time > the barrier view patch is being modified, but I see that as a normal CF > merge of things based on other people's submissions. I tend to agree - and the whole idea of reworking RLS on top of updatable security barrier views is that it makes the patch for RLS's UI and catalogs largely independent from the mechanics of filtering rows. > I mentioned advocacy because the budding new PostgreSQL test instances > I'm seeing now will lose a lot of momentum if we end up with no user > visible RLS features in 9.4. The pieces we have now can assemble into > something that's useful, and I don't think that goal is unreasonably far > away. If it's possible, getting _something_ into 9.4 would be great. I'm aware of multiple interested users who originally expected this in 9.3. That hasn't worked out, but it'd be great to make 9.4. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/20/2014 09:58 AM, Craig Ringer wrote: > As it is I'm spending today reworking the RLS patch on top of the new > approach to updatable security barrier views. To get that rolling I've split the RLS patch up into chunks, so we can argue about the catalogs, ALTER syntax, and the actual row-filtering implementation separately ;-) It's currently on git@github.com:ringerc/postgres.git in the branch rls-9.4-split, which is subject to rebasing. I'm still going through it making sure each chunk at least compiles and preferably passes "make check". The first version is on the tag rls-9.4-split-v1, which will remain static and contains an initial patch-split. The patch series for this version is attached. This is a clean tree on top of today's git master, it's not descended from KaiGai's / Greg's trees. That means it doesn't track RLS's development details, merge commits, etc. It's just a multi-stage patch merge of RLS on top of master. Hopefully it'll be a useful working point. If you omit "guts" commit: RLS: Enforce row-security by transforming query plans you'll get the skeleton of the catalogs and and supporting commands without the actual planner/optimizer changes, forming the useful basis for rebuilding RLS on top of Dean's updatable s.b. views patch. I'm going through the patch series to make sure the split is consistent and each piece at least builds by its self, then I'm going to rip out the above-mentioned commit and rework it on top of the updatable security barriers code. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- 0001-RLS-Make-plan-caching-dependent-on-user-ID.patch
- 0002-RLS-add-pg_rowsecurity-catalog.patch
- 0003-RLS-Add-rowsec_relid-to-ResultRelInfo.patch
- 0004-Add-ALTER-TABLE-commands-for-row-security.patch
- 0005-RLS-pg_dump-support-for-dumping-rowsecurity-catalogs.patch
- 0006-RLS-psql-support-for-reporting-row-security-constrai.patch
- 0007-RLS-Enforce-row-security-by-transforming-query-plans.patch
- 0008-RLS-Enforce-row-security-on-COPY.patch
- 0009-RLS-Regression-tests.patch
- 0010-RLS-Supplimental-docs.patch
(Re-sending; I forgot to cc the list) On 01/20/2014 02:15 PM, Craig Ringer wrote: > On 01/20/2014 09:58 AM, Craig Ringer wrote: >> As it is I'm spending today reworking the RLS patch on top of the new >> approach to updatable security barrier views. > > To get that rolling I've split the RLS patch up into chunks, so we can > argue about the catalogs, ALTER syntax, and the actual row-filtering > implementation separately ;-) > > It's currently on git@github.com:ringerc/postgres.git in the branch > rls-9.4-split, which is subject to rebasing. I'm still going through it > making sure each chunk at least compiles and preferably passes "make > check". That branch is now pretty stable, and passes checks at every stage up to the new RLS regression tests. I've pushed a new version to branch rls-9.4-split. Further updates will rebase this branch. The tag rls-9.4-split-v5 identifies this particular push, and won't get rebased away. I found a number of merge errors and problems that I've corrected and incorporated into the history, including a bad pg_class.h edit, a missing break; in tablecmds.c, a missing Node* assignment in the parser grammar for RESET ROW SECURITY, etc. A number of concerns remain outstanding, including those Greg listed except for the regression tests, which I've fixed, and rebasing on top of current master. -Continue to expand the functional tests -Is there enough information about row security available in psql output? For example, there's nothing in "\d" output that suggests it might exist. pg_rowsecurity is a monster to try and read. -Documentation needs plenty of editing, which I can take care of. Additionally, the issue I found earlier remains outstanding: - Fails to deal with user-id change during portal execution in (eg) SECURITY DEFINER funcs returning refcursor, or SQL cursors open across SET SESSION AUTHORIZATION. `current_user` returns value after change, not original value, causing issues with RLS predicates. Now that the patch is split up into stages I'm going to work on replacing patches 7 and 8 (query plan transformation, COPY support) with an approach based on Dean's updaable security barriers code. If anyone else picks up work on this, **please** try to keep this patch series intact, or even better, move the regression tests back into the relevant patches as you go. If you send an updated version of the RLS patch please send a git ref too; that'll make it much easier to pull your changes and integrate them into the relevant patch in the series with appropriate "git rebase --interactive" (ab)use. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- 0001-RLS-Make-plan-caching-dependent-on-user-ID.patch
- 0002-RLS-add-pg_rowsecurity-catalog.patch
- 0003-RLS-Add-rowsec_relid-to-ResultRelInfo.patch
- 0004-Add-ALTER-TABLE-commands-for-row-security.patch
- 0005-RLS-pg_dump-support-for-dumping-rowsecurity-catalogs.patch
- 0006-RLS-psql-support-for-reporting-row-security-constrai.patch
- 0007-RLS-Enforce-row-security-by-transforming-query-plans.patch
- 0008-RLS-Enforce-row-security-on-COPY.patch
- 0009-RLS-Regression-tests.patch
- 0010-RLS-Supplimental-docs.patch
On 01/24/2014 10:12 AM, Craig Ringer wrote: > (Re-sending; I forgot to cc the list) > > On 01/20/2014 02:15 PM, Craig Ringer wrote: >> On 01/20/2014 09:58 AM, Craig Ringer wrote: >>> As it is I'm spending today reworking the RLS patch on top of the new >>> approach to updatable security barrier views. >> >> To get that rolling I've split the RLS patch up into chunks, so we can >> argue about the catalogs, ALTER syntax, and the actual row-filtering >> implementation separately ;-) >> >> It's currently on git@github.com:ringerc/postgres.git in the branch >> rls-9.4-split, which is subject to rebasing. I'm still going through it >> making sure each chunk at least compiles and preferably passes "make >> check". > > That branch is now pretty stable, and passes checks at every stage up to > the new RLS regression tests. I've pushed a new version to branch > rls-9.4-split. Further updates will rebase this branch. > > The tag rls-9.4-split-v5 identifies this particular push, and won't get > rebased away. Pushed a new rebase to the main working branch, merging in the fixes I made to KaiGai's patch last time. Tagged rls-9.4-split-v6 I haven't bothered with a patchset for this one, I'll be replacing it again soon. This is just for anybody following along. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi all I've hit an interesting wrinkle and am interested in opinions. By integrating updatable security barrier view support with row-security, it has become a lot harder to detect and prevent infinite recursion (including mutual recursion) in row-security policy expansion. The code is attached, but it's not an independent patch so it's way easier to grab it from git: git@github.com:ringerc/postgres.git branch rls-9.4-upd-sb-views (subject to rebase); or tag rls-9.4-upd-sb-views-v1 (Only contains half the old row-security patch; I'll rebase the docs, tests, etc on top of this once it's working properly). I've integrated the updatable security barrier view support into RLS by injecting securityQuals in subquery_planner() just before preprocess_rowmarks . (I'm still thinking about some inheritance related aspects to that, but that's for later). The problem is that this causes infinite recursion - the securityQuals get expanded into a subquery over the original RTE that had the row-security policy on it. Then subquery_planner is re-entered when processing the subquery, a row-security qual is found on the target RTE, and ... *boom*. Fixing this is not as simple as suppressing expansion of row-security policy when processing a security barrier subquery created by a row-security policy, as it is desirable to respect the row-security policy of *other* tables that get referenced in the expanded row-security qual. If we just record the relid of the relation a subquery was expanded from and avoid expanding that inside the generated subquery we prevent simple linear recursion, but not mutual recursion between two or more rels with row-security policy. KaiGai's row-security patch handles this because it does its own recursive expansion, so (like the rewriter) it can keep a breadcrumb trail and detect when it is repeating a path. That's not so practical when row-security code tags RTEs with policy, then updatable s.b. views goes and expands them. So. Options. 1.Come up with a reasonable way to track recursive row-security expansion, detect infinite recursion, and bail out. Likely to involve a new global structure with planner/optimizer lifetime that gets checked and maintained by apply_row_security_rangetbl. 2.Solve the linear recursion case by storing a relid that should not get expanded for security quals when processing a subquery. Say "don't do that, expect stack exceeded crashes if you do" for the mutual recursion case. Seems user hostile, incomplete, and likely to break people's systems when they really don't expect it. 3.Disregard row-security policy on referenced tables when expanding row-security qualifiers. There's precendent here in foreign keys, which ignore row-security policy, but I don't think this is at all appealing. 4.Magic? Unless someone has a suggestion for #4, I'll be going with #1. I'd appreciate tips on doing this in a sane and efficient manner if anyone has them. I'll be reading over the rewriter's infinite loop protection and that of KaiGai's RLS patch for ideas in the mean time, and will give it a good go. BTW, I feel like we should be letting the rewriter do this job; it's good at dealing with recursion problems already. That won't work so long as security barrier qual expansion happens during planning, not rewrite, though - and we've already explored the fun problems with doing upd. s.b. qual expansion in rewrite. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 24 January 2014 09:04, Craig Ringer <craig@2ndquadrant.com> wrote: > Hi all > > I've hit an interesting wrinkle and am interested in opinions. By > integrating updatable security barrier view support with row-security, > it has become a lot harder to detect and prevent infinite recursion > (including mutual recursion) in row-security policy expansion. > > The code is attached, but it's not an independent patch so it's way > easier to grab it from git: > > git@github.com:ringerc/postgres.git > branch rls-9.4-upd-sb-views (subject to rebase); or > tag rls-9.4-upd-sb-views-v1 > > (Only contains half the old row-security patch; I'll rebase the docs, > tests, etc on top of this once it's working properly). > > > I've integrated the updatable security barrier view support into RLS by > injecting securityQuals in subquery_planner() just before > preprocess_rowmarks . (I'm still thinking about some inheritance related > aspects to that, but that's for later). > > The problem is that this causes infinite recursion - the securityQuals > get expanded into a subquery over the original RTE that had the > row-security policy on it. Then subquery_planner is re-entered when > processing the subquery, a row-security qual is found on the target RTE, > and ... *boom*. > > Fixing this is not as simple as suppressing expansion of row-security > policy when processing a security barrier subquery created by a > row-security policy, as it is desirable to respect the row-security > policy of *other* tables that get referenced in the expanded > row-security qual. > > If we just record the relid of the relation a subquery was expanded from > and avoid expanding that inside the generated subquery we prevent simple > linear recursion, but not mutual recursion between two or more rels with > row-security policy. > > KaiGai's row-security patch handles this because it does its own > recursive expansion, so (like the rewriter) it can keep a breadcrumb > trail and detect when it is repeating a path. That's not so practical > when row-security code tags RTEs with policy, then updatable s.b. views > goes and expands them. > > So. Options. > > 1.Come up with a reasonable way to track recursive row-security > expansion, detect infinite recursion, and bail out. Likely to involve > a new global structure with planner/optimizer lifetime that gets > checked and maintained by apply_row_security_rangetbl. > > 2.Solve the linear recursion case by storing a relid that should not get > expanded for security quals when processing a subquery. Say "don't do > that, expect stack exceeded crashes if you do" for the mutual > recursion case. Seems user hostile, incomplete, and likely to break > people's systems when they really don't expect it. > > 3.Disregard row-security policy on referenced tables when expanding > row-security qualifiers. There's precendent here in foreign keys, > which ignore row-security policy, but I don't think this is at all > appealing. > > 4.Magic? > My first thought is to add a boolean flag to RangeTblEntry (similar to the "inh" flag) to say whether RLS expansion is requested for that RTE. Then set it to false on each RTE as you add new RLS quals to it's securityQuals. In addition, when adding RLS quals to an RTE, I think they should be fully and recursively expanded immediately, before setting the new flag to false and moving on --- think recursively calling the rewriter to expand view references in the new RLS qual, and expand_security_qual() to expand any additional RLS quals in the securityQuals list --- with loop detection there. I'm not pretending that's going to be easy, but there are a couple of existing pieces of code to borrow ideas from. Doing it this way should make it possible to do the loop detection without any global structures. Regards, Dean
On 01/24/2014 07:16 PM, Dean Rasheed wrote: > My first thought is to add a boolean flag to RangeTblEntry (similar to > the "inh" flag) to say whether RLS expansion is requested for that > RTE. Then set it to false on each RTE as you add new RLS quals to it's > securityQuals. That's what I was getting at with adding flags to RangeTblEntry, yes. Given the number of flags we're growing I wonder if they should be consolodated into a bitmask, but I'll leave that problem for later. I'll do this part first, since it seems you agree that a RangeTblEntry flag is the appropriate path. That'll make row-security checking work and make the patch testable. It won't deal with recursive rules, they'll still crash the backend. I'll deal with that as a further step. > In addition, when adding RLS quals to an RTE, I think they should be > fully and recursively expanded immediately, before setting the new > flag to false and moving on --- think recursively calling the rewriter > to expand view references in the new RLS qual, and > expand_security_qual() to expand any additional RLS quals in the > securityQuals list --- with loop detection there. I'm not pretending > that's going to be easy, but there are a couple of existing pieces of > code to borrow ideas from. Doing it this way should make it possible > to do the loop detection without any global structures. Ugh. I was really hoping to avoid *another* place where we do recursive expansion and infinite recursion checking, especially when the rewriter already does this job. Unfortunately, so long as the rewriter doesn't know about inheritance, it cannot fully solve this problem. A mutually recursive rule involving inheritance wouldn't get detected by rewriter based checking. The original RLS patch only detects direct recursion, btw, it looks like it won't catch mutual recursion. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/24/2014 07:16 PM, Dean Rasheed wrote: > think recursively calling the rewriter > to expand view references in the new RLS qual, and > expand_security_qual() to expand any additional RLS quals in the > securityQuals list With this, it'd be helpful if expand_security_qual(...) took a RangeTblEntry instead of an rt_index. That'd also be much more efficient with large rtables if we can arrange a scan through the rtable when looking for security quals. Like other places that operate on the rangetable while it's being modified, we can walk the rangetable list up until the final entry that existed when we started walking. This approach saves the series of rt_fetch calls, which are something like O(n log n) for n relations. It's safe because the operation will only append rangetable entries. (I can't help wonder how much we'd gain by making the rtable an array that gets doubled in size and copied whenever it overflows, rather than a linked list, given all the walking of it that gets done, and how dead entries to get flagged as dead rather than actually removed.) I'm looking for where I found the code that already does this so I can point and say "I'm not crazy, we already do it here". Will follow up with a patch. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/28/2014 04:39 PM, Craig Ringer wrote: > I'm looking for where I found the code that already does this so I can > point and say "I'm not crazy, we already do it here". Will follow up > with a patch. I spoke to soon. The code I'm talking about is expand_inherited_tables(...) and it still uses rt_fetch, it just avoids foreach(...) in favour of stopping scanning at the end of the original rtable. So I get to be crazy after all. I really don't like how many places we're rt_fetch'ing the same RTE from in updatable s.b. views and its interaction with row-security, but that can be a "later" problem. For now I'll stick with RTIs. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/28/2014 02:11 PM, Craig Ringer wrote: >> > My first thought is to add a boolean flag to RangeTblEntry (similar to >> > the "inh" flag) to say whether RLS expansion is requested for that >> > RTE. Then set it to false on each RTE as you add new RLS quals to it's >> > securityQuals. > That's what I was getting at with adding flags to RangeTblEntry, yes. > > Given the number of flags we're growing I wonder if they should be > consolodated into a bitmask, but I'll leave that problem for later. > > I'll do this part first, since it seems you agree that a RangeTblEntry > flag is the appropriate path. That'll make row-security checking work > and make the patch testable. > > It won't deal with recursive rules, they'll still crash the backend. > I'll deal with that as a further step. > I've put together a working RLS patch on top of updatable security barrier views. It has some known issues remaining; it doesn't do recursion checking yet, and it fails some of the regression tests in exciting ways. I'm looking into them step by step. Some differences in the tests behaviours that have changed due to the inheritance rules changing; many appear to be oversights or bugs yet to be chased down. You can find it here; https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views i.e. https://github.com/ringerc/postgres.git , branch rls-9.4-upd-sb-views (subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2 The guts of the patch appear as a diff, attached, but it's not standalone so I suggest using git. I'll be looking into recursion issues and the test failures tomorrow. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 01/29/2014 09:47 PM, Craig Ringer wrote: > https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views > > i.e. https://github.com/ringerc/postgres.git , > branch rls-9.4-upd-sb-views > > (subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2 Pushed an update to the branch. New update tagged rls-9.4-upd-sb-views-v3 . Fixes an issue with rowmarking that stems from the underlying updatable s.b. views patch. Other tests continue to fail, this isn't ready yet. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/30/2014 01:25 PM, Craig Ringer wrote: > On 01/29/2014 09:47 PM, Craig Ringer wrote: >> https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views >> >> i.e. https://github.com/ringerc/postgres.git , >> branch rls-9.4-upd-sb-views >> >> (subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2 > > Pushed an update to the branch. New update tagged > rls-9.4-upd-sb-views-v3 . Fixes an issue with rowmarking that stems from > the underlying updatable s.b. views patch. > > Other tests continue to fail, this isn't ready yet. Specifically: - Needs checks in AT INHERITS, AT SET ROW SECURITY, and CT INHERITS to prohibit any combination of inheritance and row-security, per: http://www.postgresql.org/message-id/52EA01C3.70804@2ndquadrant.com - row-security rule recursion detection isn't solved yet, it just overflows the stack. - COPY doesn't know anything about row-security - I'm just starting to chase some odd errors in the tests, "ERROR: failed to find unique expression in subplan tlist" and "ERROR: could not open file "base/16384/30070": No such file or directory". Their cause/origin is not yet known, but they're specific to when row-security policy is being applied. - policies based on current_user don't "remember" current_user when rows are pulled from refcursor returned by a security definer function. There is a chunk of work here. Anybody who wants row-security to happen for 9.4, please pick something and pitch in. (Or we could just decide that my rebased and tweaked version of KaiGai's original patch internal query structure twiddling aside, is the best way forward after all. That leaves only the last item to deal with.) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/30/2014 04:05 PM, Craig Ringer wrote: > On 01/30/2014 01:25 PM, Craig Ringer wrote: >> On 01/29/2014 09:47 PM, Craig Ringer wrote: >>> https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views >>> >>> i.e. https://github.com/ringerc/postgres.git , >>> branch rls-9.4-upd-sb-views >>> >>> (subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2 >> >> Pushed an update to the branch. New update tagged >> rls-9.4-upd-sb-views-v3 . Fixes an issue with rowmarking that stems from >> the underlying updatable s.b. views patch. >> >> Other tests continue to fail, this isn't ready yet. > > Specifically: > - row-security rule recursion detection isn't solved yet, it just > overflows the stack. This is now fixed in the newly tagged rls-9.4-upd-sb-views-v4 in git@github.com:ringerc/postgres.git . I landed up adding a field to RangeTblEntry that keeps track of all the oids of relations row-security expanded to produce this RTE. When testing an RTE for row-security policy, this list is checked to see if the oid of the relation being expanded is already on the list. The list is copied to all RTEs inside sublinks after a relation is expanded using a query_tree_walker. > - COPY doesn't know anything about row-security COPY will ERROR on row-security rels in rls-9.4-upd-sb-views-v4. I'm looking at integrating the approach Kohei KaiGai took in the original patch now, then I'll be moving on to: > - I'm just starting to chase some odd errors in the tests, "ERROR: > failed to find unique expression in subplan tlist" and "ERROR: could > not open file "base/16384/30070": No such file or directory". Their > cause/origin is not yet known, but they're specific to when row-security > policy is being applied. I was hoping these would be fixed by solving the recursion issues, but that wasn not the case. Input/comments would be appreciated. I haven't looked into this yet. > - policies based on current_user don't "remember" current_user when rows > are pulled from refcursor returned by a security definer function. This is actually a separate, existing bug, or surprising behaviour. I'd like to address it, but it's really a separate patch. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > I landed up adding a field to RangeTblEntry that keeps track of all the > oids of relations row-security expanded to produce this RTE. When > testing an RTE for row-security policy, this list is checked to see if > the oid of the relation being expanded is already on the list. The list > is copied to all RTEs inside sublinks after a relation is expanded using > a query_tree_walker. That's impossibly ugly, both as to the idea that an RTE is a mutable data structure for this processing, and as to having to copy the list around (meaning that you can't even claim to have one source of truth for the state...) regards, tom lane
On 02/04/2014 03:14 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> I landed up adding a field to RangeTblEntry that keeps track of all the >> oids of relations row-security expanded to produce this RTE. When >> testing an RTE for row-security policy, this list is checked to see if >> the oid of the relation being expanded is already on the list. The list >> is copied to all RTEs inside sublinks after a relation is expanded using >> a query_tree_walker. > > That's impossibly ugly, both as to the idea that an RTE is a mutable > data structure for this processing, and as to having to copy the list > around (meaning that you can't even claim to have one source of truth > for the state...) I see. Do you have any suggestion about how this might be approached in a manner that would be more satisfactory? It's not strictly necessary to duplicate the parent list when annotating the RTEs in an expanded row-security qual; I did so only out of concern that an object of indeterminate/shared ownership may not be considered desirable. The only time it's strictly neccessary to copy the parent list is when a new row-security qual is being expanded. In this case it cannot be shared, because there may be multiple relations with row-security applied, and a separate "path" is required for each. If the list is shared, infinite recursion is falsely reported in cases like: rel a has a qual referring to b rel b has a qual referring to c rel c has a qual referring to d select ... from a inner join b on ... There's no infinite recursion there, but a simple approach that keeps a global list of expanded quals will think there is as it'll see "b" and "c" expanded twice. So whenever you expand a qual, you have to "fork" the parent list, copying it. The approach used in the rewriter won't work here, because the rewriter eagerly depth-first recurses when expanding things. In the optimizer, all securityQuals get expanded in one pass, _then_ sublink processing expands any added sublinks in a separate pass. It's possible the parent list could be associated with the Query that's produced by securityQual expansion instead, but at the time you're expanding row-security on any RTEs within that you don't necessarily have access to the Query node that might be a couple of subquery levels up, since the security qual can be an arbitrary expression. I looked at keeping a structure in PlannerGlobal that tracked the chain of row-security qual expansion, but I couldn't figure out a way to cleanly tell what "branch" of the query a given RTE that's being expanded was in, and thus how to walk the global tree to check for infinite recursion. The only alternative I see is to immediately and recursively expand row-security entries within subqueries using a walker, as soon as the initial row-security qual is expanded on the top level rel. I'm concerned about code duplication when following that path, since that's pretty much what's already happening with sublink expansion, just using the existing execution paths. If I'm right, and immediate recursive expansion isn't an acceptable alternative, any ideas on how the row-security expansion breadcrumb trail might be stored in a more appropriate manner and accessed from where they're needed? Note that it's trivial to prevent simple, direct recursion. Preventing mutual recursion without also breaking non-recursive cases where rels are used in totally different parts of the query is harder. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/04/2014 02:43 PM, Craig Ringer wrote: > On 01/30/2014 04:05 PM, Craig Ringer wrote: >> On 01/30/2014 01:25 PM, Craig Ringer wrote: >>> On 01/29/2014 09:47 PM, Craig Ringer wrote: >>>> https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views >>>> >>>> i.e. https://github.com/ringerc/postgres.git , >>>> branch rls-9.4-upd-sb-views >>>> >>>> (subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2 >>> >>> Pushed an update to the branch. New update tagged >>> rls-9.4-upd-sb-views-v3 . Fixes an issue with rowmarking that stems from >>> the underlying updatable s.b. views patch. >>> >>> Other tests continue to fail, this isn't ready yet. >> >> Specifically: > >> - row-security rule recursion detection isn't solved yet, it just >> overflows the stack. > > This is now fixed in the newly tagged rls-9.4-upd-sb-views-v4 in > git@github.com:ringerc/postgres.git . Based on Tom's objections, another approach is presented in rls-9.4-upd-sb-views-v5 on git@github.com:ringerc/postgres.git . The Query node is used to record the recursive expansion parent list instead, and copying is avoided. However, I've separately tracked down the cause of the test failures like: ERROR: could not open file "base/16384/30135": No such file or directory This occurs where a row-security qual is declared to use a view. Row-security quals get stored without rewriting (which is necessary, see below). The qual is injected into securityQuals and expanded, but *not rewritten*. So the executor sees an unexpected view in the tree. Because a view RTE has its relid field set to the view's oid, this doesn't get picked up until we try to actually scan the view relation in the executor. (I'd like to add Asserts to make the executor fail a bit more informatively when you try to scan a view, but that's separate.) So, that's clearly broken. There are really two possible solutions: 1. Try (again) to do row-security in the rewriter. This was previously impossible because of the definition of row-security behaviour around inheritance, but with the simplified inheritance model now proposed I think it's possible. 2. Invoke RewriteQuery from within expand_security_quals, rewriting the query after security qual expansion. This is only needed for row-security; for updatable s.b. views rewriting has happened with recursion into securityQuals during the original rewrite pass. I suspect that (2) will result in a resounding "yuck". So I have to see if I can now turn around *again* and plug row-security into the rewriter after all. That's a pretty frustrating thing to discover in mid-late CF4. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-02-06 05:43, Craig Ringer wrote: > Based on Tom's objections, another approach is presented in > rls-9.4-upd-sb-views-v5 on git@github.com:ringerc/postgres.git . The > Query node is used to record the recursive expansion parent list > instead, and copying is avoided. Cannot fetch or clone. github on web says "This repository is temporarily unavailable. The backend storage is temporarily offline. Usually this means the storage server is undergoing maintenance. Please contact support if the problem persists." regards, Yeb
On 02/06/2014 04:54 PM, Yeb Havinga wrote: > On 2014-02-06 05:43, Craig Ringer wrote: >> Based on Tom's objections, another approach is presented in >> rls-9.4-upd-sb-views-v5 on git@github.com:ringerc/postgres.git . The >> Query node is used to record the recursive expansion parent list >> instead, and copying is avoided. > > Cannot fetch or clone. > > github on web says "This repository is temporarily unavailable. The > backend storage is temporarily offline. Usually this means the storage > server is undergoing maintenance. Please contact support if the problem > persists." If this persists, or someone else has the same issue, try the HTTPS url: https://github.com/ringerc/postgres.git per: http://github.com/ringerc/postgres/ (I need to chase up my git.postgresql.org account request, it seems to have got lost). -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/06/2014 12:43 PM, Craig Ringer wrote: > 1. Try (again) to do row-security in the rewriter. This was previously > impossible because of the definition of row-security behaviour around > inheritance, but with the simplified inheritance model now proposed I > think it's possible. Thanks to the simplified requirements for inheritance, this turns out to be fairly easy. There's a version rewritten to use the rewriter in the tag: rls-9.4-upd-sb-views-v6 on https://github.com/ringerc/postgres.git The trickiest bit remaining is how to register the PlanInvalItem to force plan invalidation when the user-id changes. This was easy in the optimizer, but it's not obvious how to do it cleanly in the rewriter. I've got a couple of ideas but don't much like either of them. Recommendations from the experienced welcomed. Other more minor open items, each of which look quite quick: I haven't plugged in rewriter-based recursion detection yet, but that looks fairly easy (famous last words?). It's 10pm and I have an early start, so that won't happen tonight. I haven't removed some cruft from the previous approach from the Query node either; I need to trim the diff. The regression test expected file needs adjustment to match the new inheritance rules, and to cover a couple of new tests I've added. Needs a better error when it gets an unacceptable rewrite product during security qual rewriting (modeled on the errors for CTEs). Easy, just some cookie cutter code. Need to cherry-pick the docs patch back on top of the patch tree now things are settling. Amazingly, I think this has a hope. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/06/2014 10:19 PM, Craig Ringer wrote: > On 02/06/2014 12:43 PM, Craig Ringer wrote: >> 1. Try (again) to do row-security in the rewriter. This was previously >> impossible because of the definition of row-security behaviour around >> inheritance, but with the simplified inheritance model now proposed I >> think it's possible. > > Thanks to the simplified requirements for inheritance, this turns out to > be fairly easy. There's a version rewritten to use the rewriter in the tag: > > rls-9.4-upd-sb-views-v6 > > on https://github.com/ringerc/postgres.git > > The trickiest bit remaining is how to register the PlanInvalItem to > force plan invalidation when the user-id changes. This was easy in the > optimizer, but it's not obvious how to do it cleanly in the rewriter. > I've got a couple of ideas but don't much like either of them. > Recommendations from the experienced welcomed. Or, after thinking about it for a second with my tired brain, "not so much". We don't rerun rewrite on plan invalidation. So that means the superuser exemption won't work properly with this patch. So much for having a hope, that's not a small thing to fix. So: either I invoke the rewriter from within the optimizer on the security quals, or I make the rewriter re-run on plan invalidation. Neither is small or simple. Blast. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > We don't rerun rewrite on plan invalidation. Don't we? plancache.c certainly does, in fact it starts from the raw grammar output. Skipping the rewriter would mean failing to respond to CREATE OR REPLACE VIEW, for example. regards, tom lane
On 02/06/2014 11:11 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> We don't rerun rewrite on plan invalidation. > > Don't we? plancache.c certainly does, in fact it starts from the raw > grammar output. Skipping the rewriter would mean failing to respond > to CREATE OR REPLACE VIEW, for example. I was thinking about exactly that case as I went to sleep - especially as it's things like CREATE OR REPLACE VIEW that prevent me from just rewriting the security qual when it's initially added, before storage in the catalogs. I could've sworn discussion around row security in the past concluded that it couldn't be properly done in the rewriter because of an inability to correctly invalidate plans. Searching the archives, though, I don't find anything to support that. I'll just say I'm glad to be wrong, and proceed from there. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Review of RLS on inheritance schema HL7 RIM (was Re: Row-security on updatable s.b. views)
From
Yeb Havinga
Date:
On 06/02/14 15:19, Craig Ringer wrote: > > Thanks to the simplified requirements for inheritance, this turns out to > be fairly easy. There's a version rewritten to use the rewriter in the tag: > > rls-9.4-upd-sb-views-v6 > > on https://github.com/ringerc/postgres.git Hi Craig, list, This is review of the current RLS patch on a database schema that uses inheritance in the 'classical' sense (not partitioning). The review was done on rls-9.4-upd-sb-views-v4 and hence all comments are about that version. Comparing output of the minisql script between v4 and v6 gives differences, as v6 seems to be WIP. Our goal is to implement the HL7 Reference Information Model (RIM) in PostgreSQL. A fine-grained access control on the tables would have a practical use in the context of RIM. So, we have made some preliminary tests of the Row Security patch for such a specific data model. For the purpose of reviewing RLS, we have restricted the full RIM to just a few tables which we call the mini-RIM. It is important to observe that the RIM uses inheritance, and we use PostgreSQL inheritance to implement the RIM's inheritance. More details about the RIM are presented below. In the attached SQL script we list a mini-RIM, along with examples of RLS enforcement. General comments about RLS applied on (a minimalistic version of) the RIM can be summarized as follows: 1. The current RLS implementation works for use cases where confidentiality attributes are specified in the inheritance root relation. Since security labelling in the RIM is done on confidentialitycodes that are present in the inheritance roots (e.g., Role and Act), the current RLS works for the RIM. 2. Infinite recursion is well captured in case of recursive restrictions to tables. 3. RLS syntax is readable and easy to use. 4. Documentation needs work. 5. Subqueries in RLS quals can be pulled up, so opens the ability for fast processing. Overall from a users perspective the patch gave a solid impression. regards, Yeb Havinga Albana Gaba Henk-Jan Meijer Portavita B.V. The Netherlands BACKGROUND ON THE REFERENCE INFORMATION MODEL: To understand how The HL7 Reference Information Model (RIM) uses PostgreSQL inheritance, it is helpful to understand the meaning of the content of the parent and child tables. This section describes the background of the RIM, and describes a few classes of the “Act” hierarchy. The HL7 RIM[1] is not just yet another information model. It is a mature, standard information model that has been used and refined over the course of many years [2][3]. Its purpose is to capture detailed information for medical records. Pivotal in the RIM is its action-based modelling, based on ideas that can be traced back to the American philosopher C.S. Peirce. A direct line from Peirce’s Logic of Relatives to the foundations of relational databases has been introduced in [4]. Versions of the RIM are now being released as an ANSI standard. An illustration of the RIM is available at http://www.healthintersections.com.au/wp-content/uploads/2011/05/RIM.png The RIM is a set of UML classes, each containing one or more attributes. The classes are an abstraction of subjects or other concepts that are relevant within the healthcare domain. To avoid a model with a huge number of classes, the RIM defines six core classes whereas the vast majority of the classes are defined as specializations based on the core ones. The specialization classes inherit all the properties of the generalization classes while adding specific attributes of its own. To make matters concrete, let us look at "Act" class. “Act”: Looking at the right hand side of the RIM illustration referenced above, we can see the class “Act” and its specializations, and this is the focal point for the RIM’s action based modeling. Description from the standard: “Acts are the pivot of the RIM: domain information and process records are represented primarily in Acts. Any profession or business, including healthcare, is primarily constituted of intentional and occasionally non-intentional actions, performed and recorded by responsible actors. An Act-instance is a record of such an action.” Notable attributes of “Act” are: “id” - A unique identifier for the Act. Each Act is associated with a unique id. All specialization of Act inherit this id. This means that if there is, for example, an instance of Observation with id 5, there exist no other acts with id 5. In fact, since technically in the RIM all identifiers stem from a single infrastructure root, the identifiers are globally unique: there exists a single object with id 5. This single object is an instance of Observation, and since Observation is a specialization of Act, it is also an instance of Act. “classCode” – The major class of Acts to which an Act-instance belongs. The allowed codes in classCode form a hierarchical code system. In the 2011 RIM, there are 124 different class codes. This is a larger number than the number of specializations in the class diagram: only the classes that need additional properties have their own class definition in the diagram. “moodCode” - The intended use of the Act statement: as a report of fact, a command, a possibility, a goal, etc. “code” - The particular kind of Act that the Act-instance represents within its class. “confidentialityCode” - Codes that identify how sensitive a piece of information is and/or that indicate how the information may be made available or disclosed. Notable specializations of “Act” are “Observation” and “SubstanceAdministration”.: “Observation” - The main difference between Observations and other Acts is that Observations have a value attribute. The code attribute of Observation and the value attribute of Observation must be considered in combination to determine the semantics of the observation. Structurally, many observations are name-value-pairs, where the Observation.code (inherited from Act) is the name and the Observation.value is the value of the property. Such a construct is also known as a "variable" (a named feature that can assume a value); hence, the Observation class is always used to hold generic name-value-pairs or variables, even though the variable valuation may not be the result of an elaborate observation method. It may be a simple answer to a question or it may be an assertion or setting of a parameter. “SubstanceAdministration” - A type of procedure that involves a performer introducing or otherwise applying a material into or to the subject. Substance administration is distinguished from an exposure by the participation of a performer in the act. The substance administered by a performer physically interacts with the subject or is otherwise "taken in" by the subject during the act of administration. Detailed information about the supplied substance is represented using the entity class or one of its subtypes. The performer of a substance administration may be another entity such as a person, device, plant, e.g. poison ivy, animal, e.g. mosquito bite, or it may be the same entity as the subject, as in self-administration. In the intent moods, substance administration represents the plan to apply a given material. This includes (but is not limited to) prescriptions in which it might also be associated with a request for supply. In event mood, substance administration represents a record of the actual application of a substance. On the left hand side of the RIM picture we see the “Entity” hierarchy, with notable specializations “Person” and “Organization. Entities are linked together in “Roles”: a “Patient” is a specialization of “Role” where the player is the person that is patient, and the scoper is the organization where the person is patient. “Roles” can ‘participate’ in “Acts”. These participations are registered using the “Participation” class. [1] ‘HL7 Reference Information Model’ http://www.hl7.org/implement/standards/rim.cfm [2] ‘ Influences of the Unified Service Action Model on the HL7 Reference Information Model.’ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2232835/ [3] History of the HL7 RIM http://www.ringholm.de/docs/04500_en_History_of_the_HL7_RIM.htm [4] ‘Charles Peirce’ http://www.newworldencyclopedia.org/entry/Charles_Peirce
Attachment
On 02/06/2014 10:19 PM, Craig Ringer wrote: > On 02/06/2014 12:43 PM, Craig Ringer wrote: >> 1. Try (again) to do row-security in the rewriter. This was previously >> impossible because of the definition of row-security behaviour around >> inheritance, but with the simplified inheritance model now proposed I >> think it's possible. > > Thanks to the simplified requirements for inheritance, this turns out to > be fairly easy. There's a version rewritten to use the rewriter in the tag: > > rls-9.4-upd-sb-views-v6 > > on https://github.com/ringerc/postgres.git ... which was totally wrong, and I blame lack of sleep for it ever getting pushed. I didn't understand the rewriter as well as I thought. v7 applies row-security quals in fireRIRrules . It handles recursion correctly now, and works fine for both target and non-target relations. I've cleaned out most of the cruft from the previous optimizer based approach too. I haven't figured out how to pass the plan invalidation information (so plans are invalidated properly when they depend on row security quals) down into the planner yet, that's next. COPY still just ERROR's if you try to copy to/from a rel with row-security quals, but again, just a matter of getting it done, I have KaiGai's patch to work from. Regression tests fail, including a segfault in the executor. Cause as yet unknown, but it takes a hairy view+rowsecrule combo to trigger it. New tag: rls-9.4-upd-sb-views-v6 > The regression test expected file needs adjustment to match the new > inheritance rules, and to cover a couple of new tests I've added. Still true. > Need to cherry-pick the docs patch back on top of the patch tree now > things are settling. Still true. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-02-11 09:36, Craig Ringer wrote: > On 02/06/2014 10:19 PM, Craig Ringer wrote: >> On 02/06/2014 12:43 PM, Craig Ringer wrote: >>> 1. Try (again) to do row-security in the rewriter. This was previously >>> impossible because of the definition of row-security behaviour around >>> inheritance, but with the simplified inheritance model now proposed I >>> think it's possible. >> Thanks to the simplified requirements for inheritance, this turns out to >> be fairly easy. There's a version rewritten to use the rewriter in the tag: >> >> rls-9.4-upd-sb-views-v6 >> >> on https://github.com/ringerc/postgres.git > ... which was totally wrong, and I blame lack of sleep for it ever > getting pushed. I didn't understand the rewriter as well as I thought. > > v7 applies row-security quals in fireRIRrules . > New tag: > > rls-9.4-upd-sb-views-v6 Hi Craig, This looks to be the same v6 version as the initial rewriter version. https://github.com/ringerc/postgres/commits/rls-9.4-upd-sb-views-v6 regards, Yeb
On 02/11/2014 06:05 PM, Yeb Havinga wrote: > On 2014-02-11 09:36, Craig Ringer wrote: >> On 02/06/2014 10:19 PM, Craig Ringer wrote: >>> On 02/06/2014 12:43 PM, Craig Ringer wrote: >>>> 1. Try (again) to do row-security in the rewriter. This was previously >>>> impossible because of the definition of row-security behaviour around >>>> inheritance, but with the simplified inheritance model now proposed I >>>> think it's possible. >>> Thanks to the simplified requirements for inheritance, this turns out to >>> be fairly easy. There's a version rewritten to use the rewriter in >>> the tag: >>> >>> rls-9.4-upd-sb-views-v6 >>> >>> on https://github.com/ringerc/postgres.git >> ... which was totally wrong, and I blame lack of sleep for it ever >> getting pushed. I didn't understand the rewriter as well as I thought. >> >> v7 applies row-security quals in fireRIRrules . > >> New tag: >> >> rls-9.4-upd-sb-views-v6 > > Hi Craig, > > This looks to be the same v6 version as the initial rewriter version. > https://github.com/ringerc/postgres/commits/rls-9.4-upd-sb-views-v6 Whoops, wrong paste. rls-9.4-upd-sb-views-v7 -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-02-11 12:09, Craig Ringer wrote: > On 02/11/2014 06:05 PM, Yeb Havinga wrote: >> On 2014-02-11 09:36, Craig Ringer wrote: >>> On 02/06/2014 10:19 PM, Craig Ringer wrote: >>>> On 02/06/2014 12:43 PM, Craig Ringer wrote: >>>>> 1. Try (again) to do row-security in the rewriter. This was previously >>>>> impossible because of the definition of row-security behaviour around >>>>> inheritance, but with the simplified inheritance model now proposed I >>>>> think it's possible. >>>> Thanks to the simplified requirements for inheritance, this turns out to >>>> be fairly easy. There's a version rewritten to use the rewriter in >>>> the tag: >>>> >>>> rls-9.4-upd-sb-views-v6 >>>> >>>> on https://github.com/ringerc/postgres.git >>> ... which was totally wrong, and I blame lack of sleep for it ever >>> getting pushed. I didn't understand the rewriter as well as I thought. >>> >>> v7 applies row-security quals in fireRIRrules . >>> New tag: >>> >>> rls-9.4-upd-sb-views-v6 >> Hi Craig, >> >> This looks to be the same v6 version as the initial rewriter version. >> https://github.com/ringerc/postgres/commits/rls-9.4-upd-sb-views-v6 > Whoops, wrong paste. > > rls-9.4-upd-sb-views-v7 > Hi Craig, I compared output of psql -ef of the minirim.sql script posted earlier in http://www.postgresql.org/message-id/52F54927.1040102@gmail.com between v4 and v7. Not everything is ok. Seq Scan on patient (cost=0.00..29589.31 rows=495 width=52) Filter: (SubPlan 1) SubPlan 1 @@ -555,7 +592,7 @@ -> Materialize (cost=26.39..570.62 rows=1014 width=4) -> SubqueryScan on act (cost=26.39..565.55 rows=1014 width=4) -> Nested Loop Semi Join (cost=26.39..555.41 rows=1014 width=108) - Join Filter: (((part.act = act_1.id) AND (emp_2.pgname = ("current_user"())::text)) OR (NOT ((act_1.confidentialitycode)::text[] @> '{s}'::text[]))) + Join Filter: (((part.act = act_1.id) AND (emp_2.pgname = ("current_user"())::text)) OR (NOT ((act_1.effectivetime)::text[] @> '{s}'::text[]))) -> Append (cost=0.00..31.19 rows=1019 width=108) -> Seq Scan on act act_1 (cost=0.00..1.59 rows=59 width=108) @@ -587,12 +624,8 @@ FROM patient, person, organization WHERE patient.player = person.id AND patient.scoper = organization.id; - id | vipcode | name | birthtime | name -----+---------+----------+---------------------+-------------------------------- - 10 | | John Doe | 1963-04-01 00:00:00 | Community Health and Hospitals - 16 | | John Doe | 1963-04-01 00:00:00 | Community Mental Health Clinic -(2 rows) - +psql:/home/m/minirim2.sql:409: ERROR: attribute 6 has wrong type +DETAIL: Table has type tsrange, but query expects _confidentialitycode. @@ -629,7 +662,4 @@ SET SESSION AUTHORIZATION sigmund; SET SELECT * FROM test; - id | classcode | moodcode | code | confidentialitycode | effectivetime -----+-----------+----------+------+---------------------+--------------- -(0 rows) - +psql:/home/m/minirim2.sql:439: connection to server was lost regards, Yeb Havinga
On 02/11/2014 08:19 PM, Yeb Havinga wrote: > On 2014-02-11 12:09, Craig Ringer wrote: >> rls-9.4-upd-sb-views-v7 >> > Hi Craig, > > I compared output of psql -ef of the minirim.sql script posted earlier > in http://www.postgresql.org/message-id/52F54927.1040102@gmail.com > between v4 and v7. > > Not everything is ok. > +psql:/home/m/minirim2.sql:409: ERROR: attribute 6 has wrong type > +DETAIL: Table has type tsrange, but query expects _confidentialitycode. That's downright strange. I'll need to look into that one. > +psql:/home/m/minirim2.sql:439: connection to server was lost I've seen a server crash for causes as yet unknown in the main regression tests too. I'd love to stick with the in-optimizer approach used in v4, which - after all - works. The trouble is that it cannot support row-security quals that incorporate views correctly. I would have to invoke the rewriter from the optimizer and deal with recursion detection to make that work, and it looks pretty ugly. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/11/2014 08:19 PM, Yeb Havinga wrote: > I compared output of psql -ef of the minirim.sql script posted earlier > in http://www.postgresql.org/message-id/52F54927.1040102@gmail.com > between v4 and v7. > > Not everything is ok. > +psql:/home/m/minirim2.sql:409: ERROR: attribute 6 has wrong type > +DETAIL: Table has type tsrange, but query expects _confidentialitycode. This looks like an issue with attribute transformations made when applying security barrier quals. > +psql:/home/m/minirim2.sql:439: connection to server was lost That's dying with: > Program received signal SIGABRT, Aborted. > 0x0000003f02c359e9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 > 56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); > (gdb) bt > #0 0x0000003f02c359e9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 > #1 0x0000003f02c370f8 in __GI_abort () at abort.c:90 > #2 0x0000000000758d9d in ExceptionalCondition (conditionName=conditionName@entry=0x8b7bf0 "!(!bms_is_empty(upper_varnos))", > errorType=errorType@entry=0x78e89c "FailedAssertion", fileName=fileName@entry=0x8b78d7 "subselect.c", lineNumber=lineNumber@entry=1394)at assert.c:54 > #3 0x000000000061de2b in convert_EXISTS_sublink_to_join (root=<optimized out>, sublink=<optimized out>, under_not=under_not@entry=0'\000', available_rels=0x117d038) > at subselect.c:1394 > #4 0x000000000061efa9 in pull_up_sublinks_qual_recurse (root=root@entry=0x117d058, node=0x117a0f8, jtlink1=jtlink1@entry=0x7fff6d97f708, > available_rels1=available_rels1@entry=0x117d038, jtlink2=jtlink2@entry=0x0, available_rels2=available_rels2@entry=0x0)at prepjointree.c:391 > #5 0x000000000061f339 in pull_up_sublinks_jointree_recurse (root=root@entry=0x117d058, jtnode=0x117a040, relids=relids@entry=0x7fff6d97f758)at prepjointree.c:208 > #6 0x000000000062066f in pull_up_sublinks (root=root@entry=0x117d058) at prepjointree.c:147 > #7 0x000000000061967e in subquery_planner (glob=0x10c3fb8, parse=parse@entry=0x1179af8, parent_root=parent_root@entry=0x1182fd0,hasRecursion=hasRecursion@entry=0 '\000', ` > #8 0x00000000005fc093 in set_subquery_pathlist (root=root@entry=0x1182fd0, rel=rel@entry=0x1179370, rti=rti@entry=4, rte=rte@entry=0x1183980)at allpaths.c:1209 > #9 0x00000000005fc54e in set_rel_size (root=root@entry=0x1182fd0, rel=0x1179370, rti=rti@entry=4, rte=0x1183980) at allpaths.c:265 > #10 0x00000000005fc62e in set_base_rel_sizes (root=root@entry=0x1182fd0) at allpaths.c:180 > #11 0x00000000005fd584 in make_one_rel (root=root@entry=0x1182fd0, joinlist=joinlist@entry=0x1179560) at allpaths.c:138 > #12 0x000000000061617a in query_planner (root=root@entry=0x1182fd0, tlist=tlist@entry=0x11771c8, qp_callback=qp_callback@entry=0x616fd6<standard_qp_callback>, > qp_extra=qp_extra@entry=0x7fff6d97fa00) at planmain.c:236 > #13 0x00000000006182f2 in grouping_planner (root=root@entry=0x1182fd0, tuple_fraction=tuple_fraction@entry=0) at planner.c:1280 > #14 0x0000000000619a11 in subquery_planner (glob=glob@entry=0x10c3fb8, parse=parse@entry=0x10c3d30, parent_root=parent_root@entry=0x0, > hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0, subroot=subroot@entry=0x7fff6d97fb58) at planner.c:574 > #15 0x0000000000619c1f in standard_planner (parse=0x10c3d30, cursorOptions=0, boundParams=0x0) at planner.c:211 > #16 0x0000000000619e35 in planner (parse=parse@entry=0x10c3d30, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at planner.c:139 > #17 0x000000000068c64b in pg_plan_query (querytree=0x10c3d30, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at postgres.c:759 > #18 0x000000000068c6d3 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at postgres.c:818 > #19 0x000000000068c932 in exec_simple_query (query_string=query_string@entry=0x10c2d78 "SELECT * FROM hl7.test;") at postgres.c:983 > #20 0x000000000068e46e in PostgresMain (argc=<optimized out>, argv=argv@entry=0x10cd1f0, dbname=0x10cd058 "regress", username=<optimizedout>) at postgres.c:4003 > #21 0x00000000006419a7 in BackendRun (port=port@entry=0x10c7e50) at postmaster.c:4080 > #22 0x00000000006432c2 in BackendStartup (port=port@entry=0x10c7e50) at postmaster.c:3769 > #23 0x0000000000643526 in ServerLoop () at postmaster.c:1580 > #24 0x000000000064467c in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x10a8750) at postmaster.c:1235 > #25 0x00000000005d692c in main (argc=4, argv=0x10a8750) at main.c:205 > (gdb) It's crashing while pulling up the query over "emp" (hl7.employee) and "part" (hl7.participation). Given the simplicity of what the row-security code its self is doing, I'm wondering if this is a case that isn't handled in updatable s.b. views. I'll look into it. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 13 February 2014 04:12, Craig Ringer <craig@2ndquadrant.com> wrote: > On 02/11/2014 08:19 PM, Yeb Havinga wrote: > >> I compared output of psql -ef of the minirim.sql script posted earlier >> in http://www.postgresql.org/message-id/52F54927.1040102@gmail.com >> between v4 and v7. >> >> Not everything is ok. > >> +psql:/home/m/minirim2.sql:409: ERROR: attribute 6 has wrong type >> +DETAIL: Table has type tsrange, but query expects _confidentialitycode. > > This looks like an issue with attribute transformations made when > applying security barrier quals. > >> +psql:/home/m/minirim2.sql:439: connection to server was lost > > That's dying with: > >> Program received signal SIGABRT, Aborted. >> 0x0000003f02c359e9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 >> 56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); >> (gdb) bt >> #0 0x0000003f02c359e9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 >> #1 0x0000003f02c370f8 in __GI_abort () at abort.c:90 >> #2 0x0000000000758d9d in ExceptionalCondition (conditionName=conditionName@entry=0x8b7bf0 "!(!bms_is_empty(upper_varnos))", >> errorType=errorType@entry=0x78e89c "FailedAssertion", fileName=fileName@entry=0x8b78d7 "subselect.c", lineNumber=lineNumber@entry=1394)at assert.c:54 >> #3 0x000000000061de2b in convert_EXISTS_sublink_to_join (root=<optimized out>, sublink=<optimized out>, under_not=under_not@entry=0'\000', available_rels=0x117d038) >> at subselect.c:1394 >> #4 0x000000000061efa9 in pull_up_sublinks_qual_recurse (root=root@entry=0x117d058, node=0x117a0f8, jtlink1=jtlink1@entry=0x7fff6d97f708, >> available_rels1=available_rels1@entry=0x117d038, jtlink2=jtlink2@entry=0x0, available_rels2=available_rels2@entry=0x0)at prepjointree.c:391 >> #5 0x000000000061f339 in pull_up_sublinks_jointree_recurse (root=root@entry=0x117d058, jtnode=0x117a040, relids=relids@entry=0x7fff6d97f758)at prepjointree.c:208 >> #6 0x000000000062066f in pull_up_sublinks (root=root@entry=0x117d058) at prepjointree.c:147 >> #7 0x000000000061967e in subquery_planner (glob=0x10c3fb8, parse=parse@entry=0x1179af8, parent_root=parent_root@entry=0x1182fd0,hasRecursion=hasRecursion@entry=0 '\000', > ` >> #8 0x00000000005fc093 in set_subquery_pathlist (root=root@entry=0x1182fd0, rel=rel@entry=0x1179370, rti=rti@entry=4,rte=rte@entry=0x1183980) at allpaths.c:1209 >> #9 0x00000000005fc54e in set_rel_size (root=root@entry=0x1182fd0, rel=0x1179370, rti=rti@entry=4, rte=0x1183980) at allpaths.c:265 >> #10 0x00000000005fc62e in set_base_rel_sizes (root=root@entry=0x1182fd0) at allpaths.c:180 >> #11 0x00000000005fd584 in make_one_rel (root=root@entry=0x1182fd0, joinlist=joinlist@entry=0x1179560) at allpaths.c:138 >> #12 0x000000000061617a in query_planner (root=root@entry=0x1182fd0, tlist=tlist@entry=0x11771c8, qp_callback=qp_callback@entry=0x616fd6<standard_qp_callback>, >> qp_extra=qp_extra@entry=0x7fff6d97fa00) at planmain.c:236 >> #13 0x00000000006182f2 in grouping_planner (root=root@entry=0x1182fd0, tuple_fraction=tuple_fraction@entry=0) at planner.c:1280 >> #14 0x0000000000619a11 in subquery_planner (glob=glob@entry=0x10c3fb8, parse=parse@entry=0x10c3d30, parent_root=parent_root@entry=0x0, >> hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0, subroot=subroot@entry=0x7fff6d97fb58) at planner.c:574 >> #15 0x0000000000619c1f in standard_planner (parse=0x10c3d30, cursorOptions=0, boundParams=0x0) at planner.c:211 >> #16 0x0000000000619e35 in planner (parse=parse@entry=0x10c3d30, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at planner.c:139 >> #17 0x000000000068c64b in pg_plan_query (querytree=0x10c3d30, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at postgres.c:759 >> #18 0x000000000068c6d3 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0)at postgres.c:818 >> #19 0x000000000068c932 in exec_simple_query (query_string=query_string@entry=0x10c2d78 "SELECT * FROM hl7.test;") at postgres.c:983 >> #20 0x000000000068e46e in PostgresMain (argc=<optimized out>, argv=argv@entry=0x10cd1f0, dbname=0x10cd058 "regress", username=<optimizedout>) at postgres.c:4003 >> #21 0x00000000006419a7 in BackendRun (port=port@entry=0x10c7e50) at postmaster.c:4080 >> #22 0x00000000006432c2 in BackendStartup (port=port@entry=0x10c7e50) at postmaster.c:3769 >> #23 0x0000000000643526 in ServerLoop () at postmaster.c:1580 >> #24 0x000000000064467c in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x10a8750) at postmaster.c:1235 >> #25 0x00000000005d692c in main (argc=4, argv=0x10a8750) at main.c:205 >> (gdb) > > > It's crashing while pulling up the query over "emp" (hl7.employee) and > "part" (hl7.participation). > > Given the simplicity of what the row-security code its self is doing, > I'm wondering if this is a case that isn't handled in updatable s.b. > views. I'll look into it. > I'm not sure how much further you've got with this, but I think the issue is that the securityQuals that you're adding don't refer to the correct RTE. When adding securityQuals to an RTE, they are expected to have Vars whose varno matches the rt_index of the RTE (see for example the code in rewriteTargetView() which calls ChangeVarNodes() on viewqual before adding the qual to securityQuals or the main query jointree). prepend_row_security_quals() doesn't appear to have any similar code, and it would need to be passed the rt_index to do that. Regards, Dean
On 02/25/2014 01:28 AM, Dean Rasheed wrote: > On 13 February 2014 04:12, Craig Ringer <craig@2ndquadrant.com> wrote: >> >> It's crashing while pulling up the query over "emp" (hl7.employee) and >> "part" (hl7.participation). >> >> Given the simplicity of what the row-security code its self is doing, >> I'm wondering if this is a case that isn't handled in updatable s.b. >> views. I'll look into it. > > I'm not sure how much further you've got with this, but I think the > issue is that the securityQuals that you're adding don't refer to the > correct RTE. When adding securityQuals to an RTE, they are expected to > have Vars whose varno matches the rt_index of the RTE (see for example > the code in rewriteTargetView() which calls ChangeVarNodes() on > viewqual before adding the qual to securityQuals or the main query > jointree). prepend_row_security_quals() doesn't appear to have any > similar code, and it would need to be passed the rt_index to do that. Thanks for the pointer. That was indeed the issue. I've pushed an update to the branch with the fix for varno handling. Thanks. It's tagged rls-9.4-upd-sb-views-v8 . I've almost run out of time to spend on row security for this commitfest, unfortunately. I'm putting a blog together with a current status update. Frustrating, as it's coming together now. Open issues include: - Passing plan inval items from rewriter into planner - COPY support pending - Clear syntax in DDL Most of the rest are solved; it's actually looking pretty good. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/03/14 02:36, Craig Ringer wrote: > On 02/25/2014 01:28 AM, Dean Rasheed wrote: >> On 13 February 2014 04:12, Craig Ringer <craig@2ndquadrant.com> wrote: >>> It's crashing while pulling up the query over "emp" (hl7.employee) and >>> "part" (hl7.participation). >>> >>> Given the simplicity of what the row-security code its self is doing, >>> I'm wondering if this is a case that isn't handled in updatable s.b. >>> views. I'll look into it. >> I'm not sure how much further you've got with this, but I think the >> issue is that the securityQuals that you're adding don't refer to the >> correct RTE. When adding securityQuals to an RTE, they are expected to >> have Vars whose varno matches the rt_index of the RTE (see for example >> the code in rewriteTargetView() which calls ChangeVarNodes() on >> viewqual before adding the qual to securityQuals or the main query >> jointree). prepend_row_security_quals() doesn't appear to have any >> similar code, and it would need to be passed the rt_index to do that. > Thanks for the pointer. That was indeed the issue. > > I've pushed an update to the branch with the fix for varno handling. > Thanks. It's tagged rls-9.4-upd-sb-views-v8 . > > I've almost run out of time to spend on row security for this > commitfest, unfortunately. I'm putting a blog together with a current > status update. Frustrating, as it's coming together now. > > Open issues include: > > - Passing plan inval items from rewriter into planner > - COPY support pending > - Clear syntax in DDL > > Most of the rest are solved; it's actually looking pretty good. Hi Craig, I've tested the results from the minirim.sql that was posted earlier, and the v8 gives the same results as v4 :-) Thanks for all the work! Yeb
On 04/03/14 02:36, Craig Ringer wrote: > On 02/25/2014 01:28 AM, Dean Rasheed wrote: >> On 13 February 2014 04:12, Craig Ringer <craig@2ndquadrant.com> wrote: >>> It's crashing while pulling up the query over "emp" (hl7.employee) and >>> "part" (hl7.participation). >>> >>> Given the simplicity of what the row-security code its self is doing, >>> I'm wondering if this is a case that isn't handled in updatable s.b. >>> views. I'll look into it. >> I'm not sure how much further you've got with this, but I think the >> issue is that the securityQuals that you're adding don't refer to the >> correct RTE. When adding securityQuals to an RTE, they are expected to >> have Vars whose varno matches the rt_index of the RTE (see for example >> the code in rewriteTargetView() which calls ChangeVarNodes() on >> viewqual before adding the qual to securityQuals or the main query >> jointree). prepend_row_security_quals() doesn't appear to have any >> similar code, and it would need to be passed the rt_index to do that. > Thanks for the pointer. That was indeed the issue. > > I've pushed an update to the branch with the fix for varno handling. > Thanks. It's tagged rls-9.4-upd-sb-views-v8 . > > I've almost run out of time to spend on row security for this > commitfest, unfortunately. I'm putting a blog together with a current > status update. Frustrating, as it's coming together now. > > Open issues include: > > - Passing plan inval items from rewriter into planner > - COPY support pending > - Clear syntax in DDL > > Most of the rest are solved; it's actually looking pretty good. Hi Craig, I've tested the results from the minirim.sql that was posted earlier, and the v8 gives the same results as v4 :-) Thanks for all the work! Yeb
On 03/04/2014 09:41 PM, Yeb Havinga wrote: > On 04/03/14 02:36, Craig Ringer wrote: >> >> I've pushed an update to the branch with the fix for varno handling. >> Thanks. It's tagged rls-9.4-upd-sb-views-v8 . >> >> I've almost run out of time to spend on row security for this >> commitfest, unfortunately. I'm putting a blog together with a current >> status update. Frustrating, as it's coming together now. >> >> Open issues include: >> >> - Passing plan inval items from rewriter into planner >> - COPY support pending >> - Clear syntax in DDL >> >> Most of the rest are solved; it's actually looking pretty good. > > Hi Craig, > > I've tested the results from the minirim.sql that was posted earlier, > and the v8 gives the same results as v4 :-) Good to hear. The main known issue remaining is plan invalidation. Row security needs to create PlanInvalItems during _rewrite_ and also needs to set a PlannerGlobal field for the user ID the plan depends on. If it fails to do this then the superuser will sometimes run queries and have row-security applied, non-superusers might skip row security when it should be applied. This seems to be the cause of foreign key check issues I've observed, too. That's not trivial, because right now the rewriter only returns a Query node. It doesn't have anywhere to stash information that's global across the whole query, and adding fields to Query for the purpose doesn't seem ideal, since it's also used for subqueries, and in the planner. Changing the rewriter to return a RewriteResult struct that contains the Query and some other global info would be very intrusive, though, as it'd affect all the plan cache and invalidation machinery and the various rewriter/planner call sites. I've also got some concerns about the user visible API; I'm not sure it makes sense to set row security policy for row reads per-command in PostgreSQL, since we have the RETURNING clause. Read-side policy should just be "FOR READ". Initially I think we should be using triggers to enforce write side policy, which should raise errors if you try to insert/update/delete the wrong thing. Could move to something working a bit more like WITH CHECK OPTION later. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-03-05 04:02, Craig Ringer wrote: > On 03/04/2014 09:41 PM, Yeb Havinga wrote: >> On 04/03/14 02:36, Craig Ringer wrote: >>> I've pushed an update to the branch with the fix for varno handling. >>> Thanks. It's tagged rls-9.4-upd-sb-views-v8 . >>> >>> I've almost run out of time to spend on row security for this >>> commitfest, unfortunately. I'm putting a blog together with a current >>> status update. Frustrating, as it's coming together now. >>> >>> Open issues include: >>> >>> - Passing plan inval items from rewriter into planner >>> - COPY support pending >>> - Clear syntax in DDL >>> >>> Most of the rest are solved; it's actually looking pretty good. >> Hi Craig, >> >> I've tested the results from the minirim.sql that was posted earlier, >> and the v8 gives the same results as v4 :-) > Good to hear. > > The main known issue remaining is plan invalidation. Row security needs > to create PlanInvalItems during _rewrite_ and also needs to set a > PlannerGlobal field for the user ID the plan depends on. If it fails to > do this then the superuser will sometimes run queries and have > row-security applied, non-superusers might skip row security when it > should be applied. This seems to be the cause of foreign key check > issues I've observed, too. > > That's not trivial, because right now the rewriter only returns a Query > node. It doesn't have anywhere to stash information that's global across > the whole query, and adding fields to Query for the purpose doesn't seem > ideal, since it's also used for subqueries, and in the planner. I looked up the Query structure and steps of e.g. exec_simple_query(), but ISTM that Query would be the place to store a used id. After all it is meta data about the query. Duplication of this information in the presence of subqueries seems less ugly to me than trying to evade duplication by wrapping a structure around a query list. Maybe a naive thought, but shouldn't all plans that include a table with an RLS clause be invalidated when the session role switches, regardless of which users from and to? > I've also got some concerns about the user visible API; I'm not sure it > makes sense to set row security policy for row reads per-command in > PostgreSQL, since we have the RETURNING clause. Read-side policy should > just be "FOR READ". Hmm but FOR READ would mean new keywords, and SELECT is also a concept known to users. I didn't find the api problematic to understand, on the contrary. It might be an idea to add the SELECT RLS clause for DML queries that contain a RETURNING clause. regards, Yeb
On 03/05/2014 05:25 PM, Yeb Havinga wrote: > Maybe a naive thought, but shouldn't all plans that include a table with > an RLS clause be invalidated when the session role switches, regardless > of which users from and to? Only if the plan is actually accessed when under a different user ID. Consider SECURITY DEFINER functions; you don't want to flush all cached plans just because you ran a SECURITY DEFINER function that doesn't even share any statements with the outer transaction. Anyway, the same issue remains: how to pass the information "this plan is user-id specific" from rewriter to planner. >> I've also got some concerns about the user visible API; I'm not sure it >> makes sense to set row security policy for row reads per-command in >> PostgreSQL, since we have the RETURNING clause. Read-side policy should >> just be "FOR READ". > > Hmm but FOR READ would mean new keywords, and SELECT is also a concept > known to users. I didn't find the api problematic to understand, on the > contrary. Would you expect that FOR SELECT also affects rows you can see to UPDATE, INSERT, or DELETE? Because that's what it would have to mean, really. Otherwise, you could just use `UPDATE thetable SET id = id RETURNING *` (or whatever) to read the rows out if you had UPDATE rights. Or do the same with DELETE. With RETURNING, it doesn't make much sense for different statements to have different read access. Can you think of a case where it'd be reasonable to deny SELECT, but allow someone to see the same rows with `UPDATE ... RETURNING` ? > It might be an idea to add the SELECT RLS clause for DML > queries that contain a RETURNING clause. That way lies madness: A DML statement that affects *a different set of rows* depending on whether or not it has a RETURNING clause. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 03/06/2014 04:56 AM, Yeb Havinga wrote: >>> It might be an idea to add the SELECT RLS clause for DML >>> queries that contain a RETURNING clause. >> That way lies madness: A DML statement that affects *a different set of >> rows* depending on whether or not it has a RETURNING clause. > If you state it like that, it sounds like a POLA violation. But the > complete story is: "A user is allowed to UPDATE a set of rows from a > table that is not a subsect of the set of rows he can SELECT from the > table, iow he can UPDATE rows he is not allowed to SELECT. This can lead > to unexpected results: When the user issues an UPDATE of the table > without a returning clause, all rows the user may UPDATE are affected. > When the user issues an UPDATE of the table with a returning clause, all > rows the user may UPDATE and SELECT are affected." Well, I think that's bizarre behaviour. It doesn't make sense for RETURNING to affect the behaviour of the command it's applied to. It never has before, and it's defined to return the set of rows affected by the command. It shouldn't be changing that set, it isn't a predicate. > So the madness comes from the fact that it is allowed to define RLS that > allow to modify rows you cannot select. Either prevent these conditions > (i.e. proof that all DML RLS qual implies the SELECT qual, otherwise > give an error on DML with a RETURNING clause) Equivalence proofs in predicates are WAY outside what's going to be reasonable to tackle in a feature like this. Especially since the row security expression may be "my_c_function(the_row)", and all the detail of behaviour is hidden behind some C function. We need to treat row security expressions as pretty much opaque. > or allow it without > violating the RLS rules but accept that a DML with RETURNING is > different from a DML only. I don't think that's acceptable. Too many tools automatically add a RETURNING clause, or use one in generated SQL. Notably, PgJDBC will append a RETURNING clause to your query so it can return generated keys. With regular permissions, we require that the user has SELECT rights if they use a RETURNING clause. That works because it's a simple permissions check. With row security, we'd be affecting a different set of rows instead, and doing so silently. That's just ugly. (It also creates execution inefficiencies where the SELECT and UPDATE/DELETE predicates are the same). Additionally, in PostgreSQL if you can supply a predicate for a row, you can leak the row via a RAISE NOTICE or other tricks. So even without RETURNING, allowing a user to update/delete a row permits them to potentially see the row. (This is an issue with our current permissions too; if you DELETE with a leaky predicate, you can see the rows you deleted even without SELECT rights on a table). That, IMO, is two good reasons not to differentiate between command types for the purpose of which rows they can see in a scan. We already have a mechanism for allowing users to do things that they can't normally do under controlled and restricted circumstances: SECURITY DEFINER functions. I don't think we need to introduce bizarre and surprising behaviour in DML when we have a viable mechanism for people who need to do this. Is there a compelling use case for this? Where it really makes sense to let users update/delete rows they cannot see via row security? We support it in the table based permissions model, but it's possible to do it with much saner semantics there. And with row security, it'll be possible with security definer functions. I intend to add a "row security exempt" flag for functions down the track, too. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 06/03/14 02:56, Craig Ringer wrote: > On 03/06/2014 04:56 AM, Yeb Havinga wrote: > >> If you state it like that, it sounds like a POLA violation. But the >> complete story is: "A user is allowed to UPDATE a set of rows from a >> table that is not a subsect of the set of rows he can SELECT from the >> table, iow he can UPDATE rows he is not allowed to SELECT. > Is there a compelling use case for this? Where it really makes sense to > let users update/delete rows they cannot see via row security? We > support it in the table based permissions model, but it's possible to do > it with much saner semantics there. And with row security, it'll be > possible with security definer functions. I intend to add a "row > security exempt" flag for functions down the track, too. > Use case: https://en.wikipedia.org/wiki/Bell-La_Padula_model - being able to write up and read down access levels. So for instance in healthcare, a data enterer may enter from hand written notes sensitive data (like subject has aids) in the electronic health record, without having general read access of the level of sensitivity of aids diagnosis. I think what is important in use cases like this, is that at data entry time, the actual data is still at the desk, so having data returned for inserts in the running transaction might not be problematic. As most EHR's today are additive in nature, future additions about the aids conditions would be inserts as well, no updates required. For updates my best guess would be that allowing the command to run with rls permissions different from the select is not required. regards, Yeb
On 03/05/2014 11:02 AM, Craig Ringer wrote: > The main known issue remaining is plan invalidation. I've pushed a version with a plan invalidation implementation. It's tagged: rls-9.4-upd-sb-views-v9 in git@github.com:ringerc/postgres.git The invalidation implementation does not yet handle foreign key checks; that will require additional changes. I'll push an update to the rls-9.4-upd-sb-views and post an update later, at which time I'll rebase the changes back into the history. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 03/07/2014 09:33 PM, Craig Ringer wrote: > On 03/05/2014 11:02 AM, Craig Ringer wrote: >> The main known issue remaining is plan invalidation. > > I've pushed a version with a plan invalidation implementation. It's tagged: > > rls-9.4-upd-sb-views-v9 > > in > > git@github.com:ringerc/postgres.git > > The invalidation implementation does not yet handle foreign key checks; > that will require additional changes. I'll push an update to the > rls-9.4-upd-sb-views and post an update later, at which time I'll rebase > the changes back into the history. Well, that was easy. Done. rls-9.4-upd-sb-views-v11 and rebased the rls-9.4-upd-sb-views branch to incorporate the changes. The regression tests have further failures, but some are due to changes in the inheritance semantics. I'm going through them now. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 03/07/2014 10:07 PM, Craig Ringer wrote: > On 03/07/2014 09:33 PM, Craig Ringer wrote: >> On 03/05/2014 11:02 AM, Craig Ringer wrote: >>> The main known issue remaining is plan invalidation. >> >> I've pushed a version with a plan invalidation implementation. It's tagged: >> >> rls-9.4-upd-sb-views-v9 >> >> in >> >> git@github.com:ringerc/postgres.git >> >> The invalidation implementation does not yet handle foreign key checks; >> that will require additional changes. I'll push an update to the >> rls-9.4-upd-sb-views and post an update later, at which time I'll rebase >> the changes back into the history. > > Well, that was easy. Done. > > rls-9.4-upd-sb-views-v11 > > and rebased the rls-9.4-upd-sb-views branch to incorporate the changes. > > The regression tests have further failures, but some are due to changes > in the inheritance semantics. I'm going through them now. > Need a quick opinion. KaiGai's original code produced a plan like this for an inheritance set: EXPLAIN (costs off) SELECT * FROM t1 WHERE f_leak(b) FOR SHARE; ! QUERY PLAN ! ------------------------------------------- LockRows ! -> Append ! -> Subquery Scan on t1 ! Filter: f_leak(t1.b) ! -> Seq Scan on t1 t1_1 ! Filter: ((a % 2) = 0) ! -> Subquery Scan on t2 ! Filter: f_leak(t2.b) ! -> Seq Scan on t2 t2_1 ! Filter: ((a % 2) = 1) ! -> Seq Scan on t3 ! Filter: f_leak(b) (12 rows) The new code, using updatable s.b. views, instead produces: EXPLAIN (costs off) SELECT * FROM t1 WHERE f_leak(b) FOR SHARE; ! QUERY PLAN ! ------------------------------------------------------- LockRows ! -> Subquery Scan on t1 ! Filter: f_leak(t1.b) ! -> LockRows ! -> Result ! -> Append ! -> Seq Scan on t1 t1_1 ! Filter: ((a % 2) = 0) ! -> Seq Scan on t2 ! Filter: ((a % 2) = 0) ! -> Seq Scan on t3 ! Filter: ((a % 2) = 0) (12 rows) The different quals are expected, because of the change to the definition of inheritance handling in row security. What I'm concerned about is the locking. It looks to me like we're causing the user to lock rows that they may not intend to lock, by applying a LockRows step *before* the user supplied qual. (I'm going to test that tomorrow, it's sleep time in Australia). This seems to be related to RowMark handling in updatable security barrier views. I need to check whether it happens with updates to security barrier views, as well as with row security. Dean, any thoughts? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > What I'm concerned about is the locking. It looks to me like we're > causing the user to lock rows that they may not intend to lock, by > applying a LockRows step *before* the user supplied qual. (I'm going to > test that tomorrow, it's sleep time in Australia). The fact that there are two LockRows nodes seems outright broken. The one at the top of the plan is correctly placed, but how did the other one get in there? regards, tom lane
On 05/03/14 15:44, Craig Ringer wrote: > On 03/05/2014 05:25 PM, Yeb Havinga wrote: >> Maybe a naive thought, but shouldn't all plans that include a table with >> an RLS clause be invalidated when the session role switches, regardless >> of which users from and to? > Only if the plan is actually accessed when under a different user ID. > Consider SECURITY DEFINER functions; you don't want to flush all cached > plans just because you ran a SECURITY DEFINER function that doesn't even > share any statements with the outer transaction. Hmm good point. >>> I've also got some concerns about the user visible API; I'm not sure it >>> makes sense to set row security policy for row reads per-command in >>> PostgreSQL, since we have the RETURNING clause. Read-side policy should >>> just be "FOR READ". >> Hmm but FOR READ would mean new keywords, and SELECT is also a concept >> known to users. I didn't find the api problematic to understand, on the >> contrary. > Would you expect that FOR SELECT also affects rows you can see to > UPDATE, INSERT, or DELETE? Yes. > Because that's what it would have to mean, really. Otherwise, you could > just use `UPDATE thetable SET id = id RETURNING *` (or whatever) to read > the rows out if you had UPDATE rights. Or do the same with DELETE. > > With RETURNING, it doesn't make much sense for different statements to > have different read access. Can you think of a case where it'd be > reasonable to deny SELECT, but allow someone to see the same rows with > `UPDATE ... RETURNING` ? > >> It might be an idea to add the SELECT RLS clause for DML >> queries that contain a RETURNING clause. > That way lies madness: A DML statement that affects *a different set of > rows* depending on whether or not it has a RETURNING clause. If you state it like that, it sounds like a POLA violation. But the complete story is: "A user is allowed to UPDATE a set of rows from a table that is not a subsect of the set of rows he can SELECT from the table, iow he can UPDATE rows he is not allowed to SELECT. This can lead to unexpected results: When the user issues an UPDATE of the table without a returning clause, all rows the user may UPDATE are affected. When the user issues an UPDATE of the table with a returning clause, all rows the user may UPDATE and SELECT are affected." So the madness comes from the fact that it is allowed to define RLS that allow to modify rows you cannot select. Either prevent these conditions (i.e. proof that all DML RLS qual implies the SELECT qual, otherwise give an error on DML with a RETURNING clause), or allow it without violating the RLS rules but accept that a DML with RETURNING is different from a DML only. regards, Yeb
On 03/08/2014 01:56 AM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> What I'm concerned about is the locking. It looks to me like we're >> causing the user to lock rows that they may not intend to lock, by >> applying a LockRows step *before* the user supplied qual. (I'm going to >> test that tomorrow, it's sleep time in Australia). > > The fact that there are two LockRows nodes seems outright broken. > The one at the top of the plan is correctly placed, but how did the > other one get in there? I initially thought it was the updatable security barrier views code pushing the RowMark down into the generated subquery. But if I remove the pushdown code the inner LockRows node still seems to get emitted. In fact, it's not a new issue. In vanilla 9.3.1: regress=> select version(); version --------------------------------------------------------------------------------------------------------------PostgreSQL 9.3.1on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.1 20130603 (Red Hat 4.8.1-1), 64-bit (1 row) regress=> CREATE TABLE t1(x integer, y integer); CREATE TABLE regress=> INSERT INTO t1(x,y) VALUES (1,1), (2,2), (3,3), (4,4); INSERT 0 4 regress=> CREATE VIEW v1 WITH (security_barrier) AS SELECT x, y FROM t1 WHERE x % 2 = 0; CREATE VIEW regress=> CREATE OR REPLACE FUNCTION user_qual() RETURNS boolean AS $$ BEGIN RETURN TRUE; END; $$ LANGUAGEplpgsql; CREATE FUNCTION regress=> EXPLAIN SELECT * FROM v1 WHERE user_qual() FOR UPDATE; QUERY PLAN -----------------------------------------------------------------------LockRows (cost=0.00..45.11 rows=4 width=40) -> Subquery Scan on v1 (cost=0.00..45.07 rows=4 width=40) Filter: user_qual() -> LockRows (cost=0.00..42.21rows=11 width=14) -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) Filter: ((x % 2) = 0) (6 rows) so it looks like security barrier views are locking rows they should not be. I can confirm that on 9.3.1 with: CREATE OR REPLACE FUNCTION row_is(integer, integer) RETURNS boolean as $$ begin return (select $1 = $2); end; $$ language plpgsql; then in two sessions: SELECT * FROM v1 WHERE row_is(x, 2) FOR UPDATE; and SELECT * FROM v1 WHERE row_is(x, 4) FOR UPDATE; These should not block each other, but do. So there's a pre-existing bug here. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services