Thread: leaky views, yet again

leaky views, yet again

From

Robert Haas

Date:

08 July 2010, 13:08:52

I think we pretty much have conceptual agreement on the shape of the
solution to this problem: when a view is created with CREATE SECURITY
VIEW, restrict functions and operators that might disclose tuple data
from being pushed down into view (unless, perhaps, the user invoking
the view has sufficient privileges to execute the underlying query
anyhow, e.g. superuser or view owner).  What we have not resolved is
exactly how we're going to decide which functions and operators might
do such a dastardly thing.  I think the viable options are as follows:

1. Adopt Heikki's proposal of treating indexable operators as non-leaky.

http://archives.postgresql.org/pgsql-hackers/2010-06/msg00291.php

Or, perhaps, and even more restrictively, treat only
hashable/mergeable operators as non-leaky.

2. Add an explicit flag to pg_proc, which can only be set by
superusers (and which is cleared if a non-superuser modifies it in any
way), allowing a function to be tagged as non-leaky.  I believe that
it would be reasonable to set this flag for all of our built-in
indexable operators (though I'd read the code before doing it), but it
would remove the need for the assumption that third-party operators
are equally sane.

CREATE OR REPLACE FUNCTION doesnt_leak() RETURNS integer AS $$SELECT
42$$ IMMUTABLE SEAWORTHY; -- doesn't leak

This problem is not going away, so we should try to decide on something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

09 July 2010, 03:48:19

(2010/07/08 22:08), Robert Haas wrote:
> I think we pretty much have conceptual agreement on the shape of the
> solution to this problem: when a view is created with CREATE SECURITY
> VIEW, restrict functions and operators that might disclose tuple data
> from being pushed down into view (unless, perhaps, the user invoking
> the view has sufficient privileges to execute the underlying query
> anyhow, e.g. superuser or view owner).  What we have not resolved is
> exactly how we're going to decide which functions and operators might
> do such a dastardly thing.  I think the viable options are as follows:
> 
> 1. Adopt Heikki's proposal of treating indexable operators as non-leaky.
> 
> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00291.php
> 
> Or, perhaps, and even more restrictively, treat only
> hashable/mergeable operators as non-leaky.
> 
> 2. Add an explicit flag to pg_proc, which can only be set by
> superusers (and which is cleared if a non-superuser modifies it in any
> way), allowing a function to be tagged as non-leaky.  I believe that
> it would be reasonable to set this flag for all of our built-in
> indexable operators (though I'd read the code before doing it), but it
> would remove the need for the assumption that third-party operators
> are equally sane.
> 
> CREATE OR REPLACE FUNCTION doesnt_leak() RETURNS integer AS $$SELECT
> 42$$ IMMUTABLE SEAWORTHY; -- doesn't leak
> 
> This problem is not going away, so we should try to decide on something.
> 
I'd like to vote the second option, because this approach will be also
applied on another aspect of leaky views.

When leaky and non-leaky functions are chained within a WHERE clause,
it will be ordered by the cost of functions. So, we have possibility
that leaky functions are executed earlier than non-leaky functions.

It will not be an easy work to mark built-in functions as either leaky
or non-leaky, but it seems to me the most straight forward solution.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

19 July 2010, 17:08:29

2010/7/8 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> (2010/07/08 22:08), Robert Haas wrote:
>> I think we pretty much have conceptual agreement on the shape of the
>> solution to this problem: when a view is created with CREATE SECURITY
>> VIEW, restrict functions and operators that might disclose tuple data
>> from being pushed down into view (unless, perhaps, the user invoking
>> the view has sufficient privileges to execute the underlying query
>> anyhow, e.g. superuser or view owner).  What we have not resolved is
>> exactly how we're going to decide which functions and operators might
>> do such a dastardly thing.  I think the viable options are as follows:
>>
>> 1. Adopt Heikki's proposal of treating indexable operators as non-leaky.
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00291.php
>>
>> Or, perhaps, and even more restrictively, treat only
>> hashable/mergeable operators as non-leaky.
>>
>> 2. Add an explicit flag to pg_proc, which can only be set by
>> superusers (and which is cleared if a non-superuser modifies it in any
>> way), allowing a function to be tagged as non-leaky.  I believe that
>> it would be reasonable to set this flag for all of our built-in
>> indexable operators (though I'd read the code before doing it), but it
>> would remove the need for the assumption that third-party operators
>> are equally sane.
>>
>> CREATE OR REPLACE FUNCTION doesnt_leak() RETURNS integer AS $$SELECT
>> 42$$ IMMUTABLE SEAWORTHY; -- doesn't leak
>>
>> This problem is not going away, so we should try to decide on something.
>>
> I'd like to vote the second option, because this approach will be also
> applied on another aspect of leaky views.
>
> When leaky and non-leaky functions are chained within a WHERE clause,
> it will be ordered by the cost of functions. So, we have possibility
> that leaky functions are executed earlier than non-leaky functions.
>
> It will not be an easy work to mark built-in functions as either leaky
> or non-leaky, but it seems to me the most straight forward solution.

Does anyone else have an opinion on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

19 July 2010, 17:14:05

On 09/07/10 06:47, KaiGai Kohei wrote:
> When leaky and non-leaky functions are chained within a WHERE clause,
> it will be ordered by the cost of functions. So, we have possibility
> that leaky functions are executed earlier than non-leaky functions.

No, that needs to be forbidden as part of the fix. Leaky functions must
not be executed before all the quals from the view are evaluated.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

19 July 2010, 17:19:47

On 19/07/10 20:08, Robert Haas wrote:
> 2010/7/8 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> (2010/07/08 22:08), Robert Haas wrote:
>>> I think we pretty much have conceptual agreement on the shape of the
>>> solution to this problem: when a view is created with CREATE SECURITY
>>> VIEW, restrict functions and operators that might disclose tuple data
>>> from being pushed down into view (unless, perhaps, the user invoking
>>> the view has sufficient privileges to execute the underlying query
>>> anyhow, e.g. superuser or view owner).  What we have not resolved is
>>> exactly how we're going to decide which functions and operators might
>>> do such a dastardly thing.  I think the viable options are as follows:
>>>
>>> 1. Adopt Heikki's proposal of treating indexable operators as non-leaky.
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00291.php
>>>
>>> Or, perhaps, and even more restrictively, treat only
>>> hashable/mergeable operators as non-leaky.
>>>
>>> 2. Add an explicit flag to pg_proc, which can only be set by
>>> superusers (and which is cleared if a non-superuser modifies it in any
>>> way), allowing a function to be tagged as non-leaky.  I believe that
>>> it would be reasonable to set this flag for all of our built-in
>>> indexable operators (though I'd read the code before doing it), but it
>>> would remove the need for the assumption that third-party operators
>>> are equally sane.
>>>
>>> CREATE OR REPLACE FUNCTION doesnt_leak() RETURNS integer AS $$SELECT
>>> 42$$ IMMUTABLE SEAWORTHY; -- doesn't leak
>>>
>>> This problem is not going away, so we should try to decide on something.
>>>
>> I'd like to vote the second option, because this approach will be also
>> applied on another aspect of leaky views.
>>
>> When leaky and non-leaky functions are chained within a WHERE clause,
>> it will be ordered by the cost of functions. So, we have possibility
>> that leaky functions are executed earlier than non-leaky functions.
>>
>> It will not be an easy work to mark built-in functions as either leaky
>> or non-leaky, but it seems to me the most straight forward solution.
>
> Does anyone else have an opinion on this?

I have a bad feeling that marking functions explicitly as seaworthy is 
going to be hard to get right, and every time you get that wrong it's a 
security issue. On the other hand, if it's enough from a performance 
point of view to review and mark only a few built-in functions like 
index operators, maybe it's ok.

It would be easier to assess this if we had a patch to play with that 
contained all the planner changes to keep track of the seaworthiness of 
functions and apply the quals in right order. You could then try out 
different scenarios to see what the performance is like.

I guess what I'm saying is "write a patch, and I'll shoot it down when 
you're done" :-/. But hopefully the planner changes don't depend much on 
how we deduce which quals are leaky.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

Robert Haas

Date:

19 July 2010, 17:47:33

On Mon, Jul 19, 2010 at 1:19 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> I guess what I'm saying is "write a patch, and I'll shoot it down when
> you're done" :-/. But hopefully the planner changes don't depend much on how
> we deduce which quals are leaky.

Oh, great...  I love that.

/me rolls eyes

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

21 July 2010, 04:59:21

(2010/07/20 2:13), Heikki Linnakangas wrote:
> On 09/07/10 06:47, KaiGai Kohei wrote:
>> When leaky and non-leaky functions are chained within a WHERE clause,
>> it will be ordered by the cost of functions. So, we have possibility
>> that leaky functions are executed earlier than non-leaky functions.
> 
> No, that needs to be forbidden as part of the fix. Leaky functions must
> not be executed before all the quals from the view are evaluated.
> 

IIUC, a view is extracted to a subquery in the rewriter phase, then it
can be pulled up to join clause at pull_up_subqueries(). In this case,
WHERE clause may have the quals come from different origins, isn't it?

E.g) SELECT * FROM v1 WHERE f_malicious(v1.a);
 At the rewriter: -> SELECT v1.* FROM (SELECT * FROM t1 WHERE f_policy(t1.b)) v1 WHERE f_malicious(v1.a);
 At the pull_up_subqueries() -> SELECT * FROM t1 WHERE f_policy(t1.b) AND f_malicious(t1.a);
^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^                            cost = 100         cost = 0.0001
 

Apart from an idea of secure/leaky function mark, isn't it necessary any
mechanism to enforce f_policy() shall be executed earlier than f_malicious()?

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

KaiGai Kohei

Date:

21 July 2010, 05:43:39

(2010/07/20 2:19), Heikki Linnakangas wrote:
> On 19/07/10 20:08, Robert Haas wrote:
>> 2010/7/8 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> (2010/07/08 22:08), Robert Haas wrote:
>>>> I think we pretty much have conceptual agreement on the shape of the
>>>> solution to this problem: when a view is created with CREATE SECURITY
>>>> VIEW, restrict functions and operators that might disclose tuple data
>>>> from being pushed down into view (unless, perhaps, the user invoking
>>>> the view has sufficient privileges to execute the underlying query
>>>> anyhow, e.g. superuser or view owner). What we have not resolved is
>>>> exactly how we're going to decide which functions and operators might
>>>> do such a dastardly thing. I think the viable options are as follows:
>>>>
>>>> 1. Adopt Heikki's proposal of treating indexable operators as 
>>>> non-leaky.
>>>>
>>>> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00291.php
>>>>
>>>> Or, perhaps, and even more restrictively, treat only
>>>> hashable/mergeable operators as non-leaky.
>>>>
>>>> 2. Add an explicit flag to pg_proc, which can only be set by
>>>> superusers (and which is cleared if a non-superuser modifies it in any
>>>> way), allowing a function to be tagged as non-leaky. I believe that
>>>> it would be reasonable to set this flag for all of our built-in
>>>> indexable operators (though I'd read the code before doing it), but it
>>>> would remove the need for the assumption that third-party operators
>>>> are equally sane.
>>>>
>>>> CREATE OR REPLACE FUNCTION doesnt_leak() RETURNS integer AS $$SELECT
>>>> 42$$ IMMUTABLE SEAWORTHY; -- doesn't leak
>>>>
>>>> This problem is not going away, so we should try to decide on 
>>>> something.
>>>>
>>> I'd like to vote the second option, because this approach will be also
>>> applied on another aspect of leaky views.
>>>
>>> When leaky and non-leaky functions are chained within a WHERE clause,
>>> it will be ordered by the cost of functions. So, we have possibility
>>> that leaky functions are executed earlier than non-leaky functions.
>>>
>>> It will not be an easy work to mark built-in functions as either leaky
>>> or non-leaky, but it seems to me the most straight forward solution.
>>
>> Does anyone else have an opinion on this?
> 
> I have a bad feeling that marking functions explicitly as seaworthy is 
> going to be hard to get right, and every time you get that wrong it's a 
> security issue.
>
Yes, it is indeed a uphill work to mark either secure or leaky on the
functions correctly. :(

> On the other hand, if it's enough from a performance 
> point of view to review and mark only a few built-in functions like 
> index operators, maybe it's ok.
> 
I also think it is a worthful idea to try as a proof-of-concept.

If we only allow to push down indexable operators, do you have any idea
to distinguish between a purely indexable operator qual and others?
At the distribute_qual_to_rels() stage, how can we find out a qual which
is consist of only indexable operators?

> It would be easier to assess this if we had a patch to play with that 
> contained all the planner changes to keep track of the seaworthiness of 
> functions and apply the quals in right order. You could then try out 
> different scenarios to see what the performance is like.
> 
> I guess what I'm saying is "write a patch, and I'll shoot it down when 
> you're done" :-/. But hopefully the planner changes don't depend much on 
> how we deduce which quals are leaky.
> 
I agree. The decision to mark either secure or leaky on functions should
be done independently from the planner code, like contain_volatile_functions().

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

21 July 2010, 10:25:53

2010/7/21 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> (2010/07/20 2:13), Heikki Linnakangas wrote:
>> On 09/07/10 06:47, KaiGai Kohei wrote:
>>> When leaky and non-leaky functions are chained within a WHERE clause,
>>> it will be ordered by the cost of functions. So, we have possibility
>>> that leaky functions are executed earlier than non-leaky functions.
>>
>> No, that needs to be forbidden as part of the fix. Leaky functions must
>> not be executed before all the quals from the view are evaluated.
>>
>
> IIUC, a view is extracted to a subquery in the rewriter phase, then it
> can be pulled up to join clause at pull_up_subqueries(). In this case,
> WHERE clause may have the quals come from different origins, isn't it?
>
> E.g)
>  SELECT * FROM v1 WHERE f_malicious(v1.a);
>
>  At the rewriter:
>  -> SELECT v1.* FROM (SELECT * FROM t1 WHERE f_policy(t1.b)) v1 WHERE f_malicious(v1.a);
>
>  At the pull_up_subqueries()
>  -> SELECT * FROM t1 WHERE f_policy(t1.b) AND f_malicious(t1.a);
>                            ^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^
>                             cost = 100         cost = 0.0001
>
> Apart from an idea of secure/leaky function mark, isn't it necessary any
> mechanism to enforce f_policy() shall be executed earlier than f_malicious()?

I think you guys are in fact agreeing with each other.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Robert Haas

Date:

21 July 2010, 10:26:53

2010/7/21 KaiGai Kohei <kaigai@ak.jp.nec.com>:
>> On the other hand, if it's enough from a performance
>> point of view to review and mark only a few built-in functions like
>> index operators, maybe it's ok.
>>
> I also think it is a worthful idea to try as a proof-of-concept.

Yeah.  So, should we mark this patch as Returned with Feedback, and
you can submit a proof-of-concept patch for the next CF?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

21 July 2010, 11:08:22

(2010/07/21 19:26), Robert Haas wrote:
> 2010/7/21 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> On the other hand, if it's enough from a performance
>>> point of view to review and mark only a few built-in functions like
>>> index operators, maybe it's ok.
>>>
>> I also think it is a worthful idea to try as a proof-of-concept.
>
> Yeah.  So, should we mark this patch as Returned with Feedback, and
> you can submit a proof-of-concept patch for the next CF?
>
Yes, it's fair enough.

-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

KaiGai Kohei

Date:

01 September 2010, 08:38:42

(2010/07/21 19:35), KaiGai Kohei wrote:
> (2010/07/21 19:26), Robert Haas wrote:
>> 2010/7/21 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>> On the other hand, if it's enough from a performance
>>>> point of view to review and mark only a few built-in functions like
>>>> index operators, maybe it's ok.
>>>>
>>> I also think it is a worthful idea to try as a proof-of-concept.
>>
>> Yeah. So, should we mark this patch as Returned with Feedback, and
>> you can submit a proof-of-concept patch for the next CF?
>>
> Yes, it's fair enough.
>
The attached patch is a proof-of-concept one according to the suggestion
from Heikki before.

Right now, it stands on a strict assumption that considers operators
implemented with built-in functions are safe; it does not have no
possibility to leak supplied arguments anywhere.

This patch modifies the logic that the planner decides where the
given qualifier clause should be distributed.

If the clause does not contain any "leakable" functions, nothing
were changed. In this case, the clause shall be pushed down into
inside of the join, if it depends on one-side of the join.

Elsewhere, if the clause contains any "leakable" functions, this
patch prevents to push down the clause into join loop, because
the clause need to be evaluated after the condition of join.
If it would not be prevented, the "leakable" function may expose
the contents to be invisible for users; due to the VIEWs for
row-level security purpose.

Example
--------------------------------------------
postgres=# CREATE OR REPLACE FUNCTION f_policy(int)
               RETURNS bool LANGUAGE 'plpgsql'
               AS 'begin return $1 % 2 = 0; end;';
CREATE FUNCTION
postgres=# CREATE OR REPLACE FUNCTION f_malicious(text)
               RETURNS bool LANGUAGE 'plpgsql' COST 0.001
               AS 'begin raise notice ''leak: %'', $1; return true; end';
CREATE FUNCTION
postgres=# CREATE TABLE t1 (a int primary key, b text);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE
postgres=# CREATE TABLE t2 (x int primary key, y text);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1,'aaa'), (2,'bbb'), (3,'ccc');
INSERT 0 3
postgres=# INSERT INTO t2 VALUES (2,'xxx'), (3,'yyy'), (4,'zzz');
INSERT 0 3
postgres=# CREATE VIEW v AS SELECT * FROM t1 JOIN t2 ON f_policy(a+x);
CREATE VIEW

* SELECT * FROM v WHERE f_malicious(b);

[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE f_malicious(b);
                            QUERY PLAN
------------------------------------------------------------------
 Nested Loop  (cost=0.00..133685.13 rows=168100 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..24.35 rows=410 width=36)
         ->  Seq Scan on t1  (cost=0.00..22.30 rows=410 width=36)
               Filter: f_malicious(b)
(6 rows)
 ==> f_malicious() may be raise a notice about invisible tuples.

[with this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE f_malicious(b);
                            QUERY PLAN
-------------------------------------------------------------------
 Nested Loop  (cost=0.00..400969.96 rows=168100 width=72)
   Join Filter: (f_malicious(t1.b) AND f_policy((t1.a + t2.x)))
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
         ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)
 ==> f_malicious() is moved to outside of the join.
     It is evaluated earlier than f_policy() in same level due to
     the function cost, but it is another matter.


* SELECT * FROM v WHERE a = 2;
[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..353.44 rows=410 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)

[with this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..353.44 rows=410 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)

  ==> "a = 2" is a built-in operator, so we assume it is safe.
      This clause was pushed down into the join loop, then utilized to
      index scan.


* SELECT * FROM v WHERE a::text = 'a';

[without this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a::text = '2';
                           QUERY PLAN
----------------------------------------------------------------
 Nested Loop  (cost=0.00..2009.54 rows=2460 width=72)
   Join Filter: f_policy((t1.a + t2.x))
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..31.55 rows=6 width=36)
         ->  Seq Scan on t1  (cost=0.00..31.52 rows=6 width=36)
               Filter: ((a)::text = '2'::text)
(6 rows)
  ==> "a::text = 'a'" was pushed down into the join loop.

[with this patch]
postgres=# EXPLAIN SELECT * FROM v WHERE a::text = '2';
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..412312.92 rows=2521 width=72)
   Join Filter: (((t1.a)::text = '2'::text) AND f_policy((t1.a + t2.x)))
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
         ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(5 rows)
  ==> The "a::text = '2'" clause contains CoerceViaIO node, so this patch
      assumes it is not safe.


Please note that this patch does not case about a case when
a function inside a view and a function outside a view are
distributed into same level and the later function has lower
cost value.
To fix this problem definitely, we also need to mark nest-level
of the clause being originally supplied, and need to order
the clauses by the nest-level with higher priority than cost.

If we found f_malicious() and f_policy() in same level at the
above example, f_policy() should be executed earlier because
it was originally supplied in the deeper nest level.

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-fix-leaky-join-view.4.patch

Re: leaky views, yet again

From

Robert Haas

Date:

02 September 2010, 02:57:18

2010/9/1 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> Right now, it stands on a strict assumption that considers operators
> implemented with built-in functions are safe; it does not have no
> possibility to leak supplied arguments anywhere.
>
> Please note that this patch does not case about a case when
> a function inside a view and a function outside a view are
> distributed into same level and the later function has lower
> cost value.

Without making some attempt to address these two points, I don't see
the point of this patch.

Also, I believe we decided previously do this deoptimization only in
case the user requests it with CREATE SECURITY VIEW.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

02 September 2010, 03:26:08

(2010/09/02 11:57), Robert Haas wrote:
> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> Right now, it stands on a strict assumption that considers operators
>> implemented with built-in functions are safe; it does not have no
>> possibility to leak supplied arguments anywhere.
>>
>> Please note that this patch does not case about a case when
>> a function inside a view and a function outside a view are
>> distributed into same level and the later function has lower
>> cost value.
> 
> Without making some attempt to address these two points, I don't see
> the point of this patch.
> 
> Also, I believe we decided previously do this deoptimization only in
> case the user requests it with CREATE SECURITY VIEW.
> 
Perhaps, I remember the previous discussion incorrectly.

If we have a hint about whether the supplied view is intended to security
purpose, or not, it seems to me it is a reliable method to prevent pulling
up the subqueries come from security views.
Is it too much deoptimization?

If we keep security views as subqueries, the later point shall be solved
implicitly. Even if we try to optimize security views in a certain level,
it is not difficult to solve, as I demonstrated before.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

02 September 2010, 03:39:22

2010/9/1 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> (2010/09/02 11:57), Robert Haas wrote:
>> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> Right now, it stands on a strict assumption that considers operators
>>> implemented with built-in functions are safe; it does not have no
>>> possibility to leak supplied arguments anywhere.
>>>
>>> Please note that this patch does not case about a case when
>>> a function inside a view and a function outside a view are
>>> distributed into same level and the later function has lower
>>> cost value.
>>
>> Without making some attempt to address these two points, I don't see
>> the point of this patch.
>>
>> Also, I believe we decided previously do this deoptimization only in
>> case the user requests it with CREATE SECURITY VIEW.
>>
> Perhaps, I remember the previous discussion incorrectly.
>
> If we have a hint about whether the supplied view is intended to security
> purpose, or not, it seems to me it is a reliable method to prevent pulling
> up the subqueries come from security views.
> Is it too much deoptimization?

Well, that'd prevent something like id = 3 from getting pushed down,
which seems a bit harsh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

02 September 2010, 04:31:13

(2010/09/02 12:38), Robert Haas wrote:
> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> (2010/09/02 11:57), Robert Haas wrote:
>>> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>> Right now, it stands on a strict assumption that considers operators
>>>> implemented with built-in functions are safe; it does not have no
>>>> possibility to leak supplied arguments anywhere.
>>>>
>>>> Please note that this patch does not case about a case when
>>>> a function inside a view and a function outside a view are
>>>> distributed into same level and the later function has lower
>>>> cost value.
>>>
>>> Without making some attempt to address these two points, I don't see
>>> the point of this patch.
>>>
>>> Also, I believe we decided previously do this deoptimization only in
>>> case the user requests it with CREATE SECURITY VIEW.
>>>
>> Perhaps, I remember the previous discussion incorrectly.
>>
>> If we have a hint about whether the supplied view is intended to security
>> purpose, or not, it seems to me it is a reliable method to prevent pulling
>> up the subqueries come from security views.
>> Is it too much deoptimization?
> 
> Well, that'd prevent something like id = 3 from getting pushed down,
> which seems a bit harsh.
> 
Hmm. If so, we need to remember what FromExpr was come from subqueries
of security views, and what were not. Then, we need to prevent leakable
clause will be distributed to inside of them.
In addition, we also need to care about the order of function calls in
same level, because it is not implicitly solved.

At first, let's consider top-half of the matter.

When views are expanded into subqueries in query-rewriter, Query tree
lost an information OID of the view, because RangeTblEntry does not have
OID of the relation when it is RTE_SUBQUERY. So, we need to patch here
to mark a flag whether the supplied view is security focused, or not.

Then, pull_up_simple_subquery() pulls up a supplied subquery into normal
join, if possible. In this case, FromExpr is chained into the upper level.
Of course, FromExpr does not have a flag to show its origin, so we also
need to copy the new flag in RangeTblEntry to FromExpr.

Then, when distribute_qual_to_rels() is called, the caller also provides
a Bitmapset of relation-Ids which are contained under the FromExpr with
the flag saying it came from the security views.
Even if the supplied clause references a part of the Bitmapset, we need
to prevent the clause being pushed down into the relations came from
security views, except for ones we can make sure these are safe.

Perhaps, it is not impossible....

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

KaiGai Kohei

Date:

02 September 2010, 07:34:26

(2010/09/02 13:30), KaiGai Kohei wrote:
> (2010/09/02 12:38), Robert Haas wrote:
>> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> (2010/09/02 11:57), Robert Haas wrote:
>>>> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>>> Right now, it stands on a strict assumption that considers operators
>>>>> implemented with built-in functions are safe; it does not have no
>>>>> possibility to leak supplied arguments anywhere.
>>>>>
>>>>> Please note that this patch does not case about a case when
>>>>> a function inside a view and a function outside a view are
>>>>> distributed into same level and the later function has lower
>>>>> cost value.
>>>>
>>>> Without making some attempt to address these two points, I don't see
>>>> the point of this patch.
>>>>
>>>> Also, I believe we decided previously do this deoptimization only in
>>>> case the user requests it with CREATE SECURITY VIEW.
>>>>
>>> Perhaps, I remember the previous discussion incorrectly.
>>>
>>> If we have a hint about whether the supplied view is intended to security
>>> purpose, or not, it seems to me it is a reliable method to prevent pulling
>>> up the subqueries come from security views.
>>> Is it too much deoptimization?
>>
>> Well, that'd prevent something like id = 3 from getting pushed down,
>> which seems a bit harsh.
>>

I've tried to implement a proof of the concept patch according to the
following logic.
(For the quick hack, it does not include statement support. It just
 considers view with name begun from "s" as security views.)

> Hmm. If so, we need to remember what FromExpr was come from subqueries
> of security views, and what were not. Then, we need to prevent leakable
> clause will be distributed to inside of them.
> In addition, we also need to care about the order of function calls in
> same level, because it is not implicitly solved.
>
> At first, let's consider top-half of the matter.
>
> When views are expanded into subqueries in query-rewriter, Query tree
> lost an information OID of the view, because RangeTblEntry does not have
> OID of the relation when it is RTE_SUBQUERY. So, we need to patch here
> to mark a flag whether the supplied view is security focused, or not.
>
This patch added 'security_view' flag into RangeTblEntry.
It shall be set when the query rewriter expands security views.

> Then, pull_up_simple_subquery() pulls up a supplied subquery into normal
> join, if possible. In this case, FromExpr is chained into the upper level.
> Of course, FromExpr does not have a flag to show its origin, so we also
> need to copy the new flag in RangeTblEntry to FromExpr.
>
This patch also added 'security_view' flag into FromExpr.
It shall be set when the pull_up_simple_subquery() pulled up
a RangeTblEntry with security_view = true.

> Then, when distribute_qual_to_rels() is called, the caller also provides
> a Bitmapset of relation-Ids which are contained under the FromExpr with
> the flag saying it came from the security views.
> Even if the supplied clause references a part of the Bitmapset, we need
> to prevent the clause being pushed down into the relations came from
> security views, except for ones we can make sure these are safe.
>
Just before distribute_qual_to_rels(), deconstruct_recurse() is invoked.
It walks on the supplied join-tree to collect what relations are appeared
under the current FromExpr/JoinExpr.
The deconstruct_recurse() was modified to take two new arguments of
'bool below_sec_barriers' and 'Relids *sec_barriers'.
The first one means the current recursion is under the FromExpr with
security_view being true. At that time, it set appeared relations on
the sec_barriers, then returns.
In the result, the 'sec_barriers' shall become a bitmapset of relations
being under the FromExpr which is originated by security views.

Then, 'sec_barriers' shall be delivered to distribute_qual_to_rels().
If the supplied qualifier references a part of 'sec_barriers' and
contains possibly leakable functions, it appends whole of the sec_barriers
to the bitmapset of relations on which the clause is depending.
In the result, it shall not be pushed down into the security view.

Example)
testdb=# CREATE VIEW n_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW
testdb=# CREATE VIEW s_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW

testdb=# EXPLAIN SELECT * FROM n_view WHERE f_malicious(y);
                            QUERY PLAN
-------------------------------------------------------------------
 Hash Join  (cost=334.93..365.94 rows=410 width=72)
   Hash Cond: (t1.a = t2.x)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=329.80..329.80 rows=410 width=36)
         ->  Seq Scan on t2  (cost=0.00..329.80 rows=410 width=36)
               Filter: f_malicious(y)
(6 rows)

testdb=# EXPLAIN SELECT * FROM s_view WHERE f_malicious(y);
                            QUERY PLAN
-------------------------------------------------------------------
 Hash Join  (cost=37.68..384.39 rows=410 width=72)
   Hash Cond: (t1.a = t2.x)
   Join Filter: f_malicious(t2.y)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(6 rows)

  ==> f_malicious() was moved to outside of the join loop


testdb=# EXPLAIN SELECT * FROM n_view WHERE f_malicious(y) AND a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.80 rows=1 width=72)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..8.52 rows=1 width=36)
         Index Cond: (x = 2)
         Filter: f_malicious(y)
(6 rows)

testdb=# EXPLAIN SELECT * FROM s_view WHERE f_malicious(y) AND a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.80 rows=1 width=72)
   Join Filter: f_malicious(t2.y)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (x = 2)
(6 rows)
  ==> Because 'a = 2' is not leakable, this clause was pushed down into
      the join loop. But f_malicious() is not in 's_view' example.


testdb=# CREATE VIEW n_view2 AS SELECT * FROM t1 WHERE f_policy(a);
CREATE VIEW
testdb=# CREATE VIEW s_view2 AS SELECT * FROM t1 WHERE f_policy(a);
CREATE VIEW
testdb=# EXPLAIN SELECT * FROM n_view2 WHERE f_malicious(b);
                      QUERY PLAN
-------------------------------------------------------
 Seq Scan on t1  (cost=0.00..329.80 rows=137 width=36)
   Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

testdb=# EXPLAIN SELECT * FROM s_view2 WHERE f_malicious(b);
                      QUERY PLAN
-------------------------------------------------------
 Seq Scan on t1  (cost=0.00..329.80 rows=137 width=36)
   Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

  ==> But it does not affect anything on the case when a leakable
      function from outside of the view is chained to same level
      with security policy function of the view.
      In this case, we need to mark functions the original nest
      level, then sort by the nest level rather than cost on the
      order_qual_clauses().

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-fix-leaky-join-view.5.patch

Re: leaky views, yet again

From

KaiGai Kohei

Date:

06 September 2010, 04:55:39

(2010/09/02 11:57), Robert Haas wrote:
> 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> Right now, it stands on a strict assumption that considers operators
>> implemented with built-in functions are safe; it does not have no
>> possibility to leak supplied arguments anywhere.
>>
>> Please note that this patch does not case about a case when
>> a function inside a view and a function outside a view are
>> distributed into same level and the later function has lower
>> cost value.
>
> Without making some attempt to address these two points, I don't see
> the point of this patch.
>
> Also, I believe we decided previously do this deoptimization only in
> case the user requests it with CREATE SECURITY VIEW.
>

The attached patch tackles both of the above two points, and allows to
control this deoptimization when CREATE SECURITY VIEW is used.

 Example
---------
  CREATE FUNCTION f_policy(int) RETURNS bool LANGUAGE 'plpgsql'
        AS 'begin return $1 % 2 = 0; end';
  CREATE FUNCTION f_malicious(text) RETURNS bool COST 0.0001 LANGUAGE 'plpgsql'
        AS 'begin raise notice ''leak: %'', $1; return true; end';
  CREATE OR REPLACE VIEW v1
        AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE f_policy(x);
  CREATE OR REPLACE VIEW v2
        AS SELECT * FROM t1 WHERE f_policy(a);

In this case, we don't want to invoke f_malicious() with argument come
from the tuples which should be filtered inside of the views, even if
people provides it on the WHERE clause.

 Without this patch
====================

postgres=# EXPLAIN select * from v1 where f_malicious(b);
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..287.63 rows=410 width=72)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=410 width=36)
         Filter: f_malicious(b)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..0.63 rows=1 width=36)
         Index Cond: (x = t1.a)
         Filter: f_policy(x)
(6 rows)

  ==> f_policy(x) was chained inside of the join loop, so it enables to
      something to be filtered out.

postgres=# select * from v1 where f_malicious(b);
NOTICE:  leak: aaa
NOTICE:  leak: bbb
NOTICE:  leak: ccc
NOTICE:  leak: ddd
 a |  b  | x |  y
---+-----+---+-----
 2 | bbb | 2 | xxx
 4 | ddd | 4 | zzz
(2 rows)

postgres=# EXPLAIN select * from v2 where f_malicious(b);
                      QUERY PLAN
-------------------------------------------------------
 Seq Scan on t1  (cost=0.00..329.80 rows=137 width=36)
   Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

  ==> The view of v2 is pulled up to the upper level, because this subquery
      is enough simple. In this case, f_malicious() and f_policy() are
      handled in a same level, and ordered by its cost to execution.
      Since f_malicious() has lower cost (0.001) the f_policy(), so it shall
      be executed earlier, although f_policy() is used in more deep depth.

 With this patch
=================

postgres=# EXPLAIN select * from v1 where f_malicious(b);
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..287.63 rows=410 width=72)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=410 width=36)
         Filter: f_malicious(b)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..0.63 rows=1 width=36)
         Index Cond: (x = t1.a)
         Filter: f_policy(x)
(6 rows)

  ==> It has same result of the case when without this patch,
      because this view is declared without "SECURITY"

postgres=# CREATE OR REPLACE SECURITY VIEW v1
        AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE f_policy(x);
CREATE VIEW
postgres=# EXPLAIN select * from v1 where f_malicious(b);
                            QUERY PLAN
-------------------------------------------------------------------
 Hash Join  (cost=37.68..384.39 rows=137 width=72)
   Hash Cond: (t1.a = t2.x)
   Join Filter: (f_policy(t2.x) AND f_malicious(t1.b))
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
(6 rows)

  ==> After we replaced this view with SECURITY option, it prevents
      f_malicious() being come from outside of the view into inside
      of the join loop.

postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.80 rows=1 width=72)
   Join Filter: (f_policy(t2.x) AND f_malicious(t1.b))
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (a = 2)
   ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=36)
         Index Cond: (x = 2)
(6 rows)

  ==> We assume operators implemented with built-in functions are safe.
      So, we don't prevent this pushing-down, if the clause does not
      contains something leakable.

postgres=# EXPLAIN select * from v2 where f_malicious(b);
                      QUERY PLAN
-------------------------------------------------------
 Seq Scan on t1  (cost=0.00..329.80 rows=137 width=36)
   Filter: (f_policy(a) AND f_malicious(b))
(2 rows)

  ==> In addition, when functions come from different nest levels
      are distributed a certain scan filter, these are ordered by
      its original nest level, not only function cost.

 Introduction of the patch
===========================
This patch tries to tackle the two matters.

The first one is pushing down leakable functions into inside of join-loops
being originated from security view.
The distribute_qual_to_rels() distributes qualifiers of WHERE/JOIN ... ON
clauses to appropriate set of relations. Even if a qualifier is depending
on a part of join-loop inside of security view, we need to prevent it to
be pushed down.
This patch informs distribute_qual_to_rels() what relations being come from
security views using Bitmapset of 'sec_barriers'. If a qualifier overlaps
with any members of security_barriers, its dependency shall be expanded to
the union set of 'sec_barriers'. In this case, this qualifier depends on
whole of the relations inside of the security view, we cannot push it down
anymore.

Views are expanded to sub-queries at query rewriter.
This patch add a flag to FromExpr to distinguish whether this sub-query was
expanded from security view, or not. When deconstruct_recurse() walks on
the supplied join-tree, it construct a bitmapset of the relations under the
FromExpr with security_view = true, then, it shall be delivered to
the distribute_qual_to_rels().

Instead of modifying the structure of pg_class, this patch stores a flag of
security view on the reloption. So, it needed to modify routines about
reloptions because it is the first case to store reloptions of views.

The other matter is priority to order qualifiers of a certain scanning
filter. As I mentioned above, the planner pull up simple subqueries to
join, so a function originated from inside of view and outside of view are
distributed to same scan plan. Then, these are ordered by cost value.
It means functions come from outside of views (maybe leakable) are invoked
with arguments come from tuples to be filtered out with functions come from
inside of the views.

So, I added 'depth' field to FuncExpr and so on. It shall be set just after
query rewriter, then referenced in order_qual_clauses().
If the supplied plan multiple qualifiers, these are ordered by the depth of
qualifier first, then ordered by the cost when here are multiple qualifiers
with same depth.
It makes sure qualifiers being originated inside of views are executed
earlier than others.

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-fix-leaky-join-view.6.patch

Re: leaky views, yet again

From

Itagaki Takahiro

Date:

05 October 2010, 09:01:55

2010/9/6 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> The attached patch tackles both of the above two points, and allows to
> control this deoptimization when CREATE SECURITY VIEW is used.

I'll send a review for the patch.

Contents & Purpose
==================
The patch adds a new SECURITY option for VIEWs. Views defined with
CREATE SECURITY
VIEW command are not merged with external WHERE-clauses including
security-leakable
functions in queries. However, since internal functions and operators are not
considered as leakable functions, the planner still works well for
security views
unless we use user-defined functions in the WHERE-clause.

Initial Run
===========
* Because of the delayed review, the patch has one hunk:  1 out of 5 hunks FAILED -- saving rejects to file
src/backend/commands/tablecmds.c.rej but it is not serious at all, and can be easily fixed.

* It can be compiled, but has two warnings:
rewriteHandler.c: In function 'MarkFuncExprDepthWalker':
rewriteHandler.c:1155: warning: cast from pointer to integer of different size
rewriteHandler.c:1227: warning: cast to pointer from integer of different size

They should be harmless, but casting intptr_t is often used to avoid warnings: - L1155: (int) (intptr_t) context; -
L1227:(void *) (intptr_t) (depth + 1) 

* It passes all regression tests, but doesn't have own new tests.

Remaining works
===============
The patch might include only core code, but I'll let you know additional
sub-works are still required to complete the feature. Also, I've not measured
the performance yet, though performance might not be so critical for the patch.

* Regression tests and documentation for the feature are required.
* pg_dump must support to dump SECURITY VIEWs. They are dumped as normal views for now.
* pg_views can have "security" column.
* psql's \dv can show security attributes of views.

Questions
=========
> postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2;
>  ==> We assume operators implemented with built-in functions are safe.
>      So, we don't prevent this pushing-down, if the clause does not
>      contains something leakable.

The term "built-in functions" means functions written in INTERNAL language
here. But we also have SQL functions by default. Some of them are just a
wrapper to internal functions. I'm not sure the checking of INTERNAL language
is the best way for the purpose. Did you compare it with other methods?
For example, "oid < FirstNormalObjectId" looks workable for me.

> Instead of modifying the structure of pg_class, this patch stores a flag of
> security view on the reloption. So, it needed to modify routines about
> reloptions because it is the first case to store reloptions of views.

Why did you need to extend StdRdOptions strucuture? Since other fields in
the structure are not used by views at all, I think adding a new structure struct ViewOptions { vl_len, security_view }
would be better than extending StdRdOptions.

--
Itagaki Takahiro

Re: leaky views, yet again

From

KaiGai Kohei

Date:

05 October 2010, 09:32:40

(2010/10/05 18:01), Itagaki Takahiro wrote:
> 2010/9/6 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> The attached patch tackles both of the above two points, and allows to
>> control this deoptimization when CREATE SECURITY VIEW is used.
> 
> I'll send a review for the patch.
> 
Thanks for your reviewing.

> Contents&  Purpose
> ==================
> The patch adds a new SECURITY option for VIEWs. Views defined with
> CREATE SECURITY
> VIEW command are not merged with external WHERE-clauses including
> security-leakable
> functions in queries. However, since internal functions and operators are not
> considered as leakable functions, the planner still works well for
> security views
> unless we use user-defined functions in the WHERE-clause.
> 
> Initial Run
> ===========
> * Because of the delayed review, the patch has one hunk:
>     1 out of 5 hunks FAILED -- saving rejects to file
> src/backend/commands/tablecmds.c.rej
>    but it is not serious at all, and can be easily fixed.
> 
The patch was submitted a month ago. I'll fix it.

> * It can be compiled, but has two warnings:
> rewriteHandler.c: In function 'MarkFuncExprDepthWalker':
> rewriteHandler.c:1155: warning: cast from pointer to integer of different size
> rewriteHandler.c:1227: warning: cast to pointer from integer of different size
> 
> They should be harmless, but casting intptr_t is often used to avoid warnings:
>    - L1155: (int) (intptr_t) context;
>    - L1227: (void *) (intptr_t) (depth + 1)
> 
Thanks for the good idea.

> * It passes all regression tests, but doesn't have own new tests.
> 
OK, I'll add it. Perhaps, EXPLAIN enables to show us differences between
security views and others.

> Remaining works
> ===============
> The patch might include only core code, but I'll let you know additional
> sub-works are still required to complete the feature. Also, I've not measured
> the performance yet, though performance might not be so critical for the patch.
> 
> * Regression tests and documentation for the feature are required.
> * pg_dump must support to dump SECURITY VIEWs.
>    They are dumped as normal views for now.
> * pg_views can have "security" column.
> * psql's \dv can show security attributes of views.
> 
All are fair enough for me.

> Questions
> =========
>> postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2;
>>   ==>  We assume operators implemented with built-in functions are safe.
>>       So, we don't prevent this pushing-down, if the clause does not
>>       contains something leakable.
> 
> The term "built-in functions" means functions written in INTERNAL language
> here. But we also have SQL functions by default. Some of them are just a
> wrapper to internal functions. I'm not sure the checking of INTERNAL language
> is the best way for the purpose. Did you compare it with other methods?
> For example, "oid<  FirstNormalObjectId" looks workable for me.
> 
Hmm. I'm not sure they can be used for index-scans. If these operators are not
corresponding to index-scans, I want to keep the logic to check INTERNAL language,
because these have obviously no side effects (= not leakable anything).

kaigai=# select oprname, oprleft::regtype, oprright::regtype, prosrc            from pg_operator o join pg_proc p on
o.oprcode= p.oid where prolang <> 12;oprname |        oprleft         |          oprright           |
prosrc
---------+------------------------+-----------------------------+------------------------------------@>      | path
             | point                       | select pg_catalog.on_ppath($2, $1)+       | time without time zone | date
                     | select ($2 + $1)+       | time with time zone    | date                        | select ($2 +
$1)+      | bigint                 | inet                        | select $2 + $1+       | interval               |
timewithout time zone      | select $2 + $1+       | interval               | date                        | select $2 +
$1+      | interval               | time with time zone         | select $2 + $1+       | interval               |
timestampwithout time zone | select $2 + $1+       | interval               | timestamp with time zone    | select $2 +
$1+      | integer                | date                        | select $2 + $1||      | text                   |
anynonarray                | select $1 || $2::pg_catalog.text||      | anynonarray            | text
   | select $1::pg_catalog.text || $2~       | path                   | point                       | select
pg_catalog.on_ppath($2,$1)
 
(13 rows)

>> Instead of modifying the structure of pg_class, this patch stores a flag of
>> security view on the reloption. So, it needed to modify routines about
>> reloptions because it is the first case to store reloptions of views.
> 
> Why did you need to extend StdRdOptions strucuture? Since other fields in
> the structure are not used by views at all, I think adding a new structure
>    struct ViewOptions { vl_len, security_view }
> would be better than extending StdRdOptions.
> 
Hmm. It might be better than ad-hoc enhancement of StdRdOptions.
BTW, which is more preference to store the flag of security view:
reloption of the view or a new bool variable in the pg_class?

I tried to store this flag within reloptions to minimize the patch
size, but it seems to me reloption support for views makes the patch
size larger in the result.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 14:16:18

2010/10/5 KaiGai Kohei <kaigai@ak.jp.nec.com>:
>> The term "built-in functions" means functions written in INTERNAL language
>> here. But we also have SQL functions by default. Some of them are just a
>> wrapper to internal functions. I'm not sure the checking of INTERNAL language
>> is the best way for the purpose. Did you compare it with other methods?
>> For example, "oid<  FirstNormalObjectId" looks workable for me.
>>
> Hmm. I'm not sure they can be used for index-scans. If these operators are not
> corresponding to index-scans, I want to keep the logic to check INTERNAL language,
> because these have obviously no side effects (= not leakable anything).

I think the idea that all internal operators are safe has been
thoroughly discredited already.

> Hmm. It might be better than ad-hoc enhancement of StdRdOptions.
> BTW, which is more preference to store the flag of security view:
> reloption of the view or a new bool variable in the pg_class?
>
> I tried to store this flag within reloptions to minimize the patch
> size, but it seems to me reloption support for views makes the patch
> size larger in the result.

I think a boolean in pg_class is the way to go.  Is there a padding
byte we can steal, to avoid making the fixed-size portion larger?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

05 October 2010, 14:27:42

(2010/10/05 23:16), Robert Haas wrote:
> 2010/10/5 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> The term "built-in functions" means functions written in INTERNAL language
>>> here. But we also have SQL functions by default. Some of them are just a
>>> wrapper to internal functions. I'm not sure the checking of INTERNAL language
>>> is the best way for the purpose. Did you compare it with other methods?
>>> For example, "oid<    FirstNormalObjectId" looks workable for me.
>>>
>> Hmm. I'm not sure they can be used for index-scans. If these operators are not
>> corresponding to index-scans, I want to keep the logic to check INTERNAL language,
>> because these have obviously no side effects (= not leakable anything).
>
> I think the idea that all internal operators are safe has been
> thoroughly discredited already.
>
How can we distinguish an internal operator and others?
Because pg_operator does not have a property that we can use to
distinguish them, I tried to check function of the operators...

>> Hmm. It might be better than ad-hoc enhancement of StdRdOptions.
>> BTW, which is more preference to store the flag of security view:
>> reloption of the view or a new bool variable in the pg_class?
>>
>> I tried to store this flag within reloptions to minimize the patch
>> size, but it seems to me reloption support for views makes the patch
>> size larger in the result.
>
> I think a boolean in pg_class is the way to go.  Is there a padding
> byte we can steal, to avoid making the fixed-size portion larger?
>
OK, I'll add a boolean in pg_class. Perhaps, 'relissecview'?

Thanks
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 14:39:28

On Tue, Oct 5, 2010 at 10:27 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
> (2010/10/05 23:16), Robert Haas wrote:
>>
>> 2010/10/5 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>>
>>>> The term "built-in functions" means functions written in INTERNAL
>>>> language
>>>> here. But we also have SQL functions by default. Some of them are just a
>>>> wrapper to internal functions. I'm not sure the checking of INTERNAL
>>>> language
>>>> is the best way for the purpose. Did you compare it with other methods?
>>>> For example, "oid<    FirstNormalObjectId" looks workable for me.
>>>>
>>> Hmm. I'm not sure they can be used for index-scans. If these operators
>>> are not
>>> corresponding to index-scans, I want to keep the logic to check INTERNAL
>>> language,
>>> because these have obviously no side effects (= not leakable anything).
>>
>> I think the idea that all internal operators are safe has been
>> thoroughly discredited already.
>>
> How can we distinguish an internal operator and others?
> Because pg_operator does not have a property that we can use to
> distinguish them, I tried to check function of the operators...

Checking the functions of the operators is the right thing to do, but
assuming that internal = safe does not work.  For example, pushing
down arithmetic operators allows you to probe for any given value in a
hidden row by testing whether 1 / (x - constant) throws a division by
zero error.  I believe we need a flag in pg_proc, which we should
probably set only for a very limited set of operators to start.
Really, the only things that seem important to get right for the first
version are things that might allow useful indexing, and even then we
should err on the side of caution.

I think that for review purposes this should be split into two or
three patches.  First, a patch to add the flag to pg_proc (with SQL
support, pg_dump support, psql support, etc.); second, a patch to add
CREATE SECURITY VIEW (with SQL support, pg_dump support, psql support,
etc.); third (or possibly this can be done together with the previous
item), a patch to make the necessary behavior changes for security
views.  We should have all the patches in hand before we start
committing anything, but it will be easier to review in phases.

>>> Hmm. It might be better than ad-hoc enhancement of StdRdOptions.
>>> BTW, which is more preference to store the flag of security view:
>>> reloption of the view or a new bool variable in the pg_class?
>>>
>>> I tried to store this flag within reloptions to minimize the patch
>>> size, but it seems to me reloption support for views makes the patch
>>> size larger in the result.
>>
>> I think a boolean in pg_class is the way to go.  Is there a padding
>> byte we can steal, to avoid making the fixed-size portion larger?
>>
> OK, I'll add a boolean in pg_class. Perhaps, 'relissecview'?

Doesn't sound bad, but I just had another thought.  How about defining
a new relkind?  It'll make the patch a little more complex but I think
if it avoids adding a column to pg_class it's probably worth it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 14:57:01

Robert Haas <robertmhaas@gmail.com> writes:
> Checking the functions of the operators is the right thing to do, but
> assuming that internal = safe does not work.  For example, pushing
> down arithmetic operators allows you to probe for any given value in a
> hidden row by testing whether 1 / (x - constant) throws a division by
> zero error.

Well, if the goal is "make it impossible to tell whether such-and-such a
value exists", I think this approach can't meet it at all.  There are
too many side channels.  You're focusing here on the error-report side
channel, but there's also performance, ie how long did the query take.
(BTW, is the intent to somehow lobotomize EXPLAIN so you can't use that
to see what happened?)

Personally I think this is a dead end that we shouldn't be wasting
any more time on.

> Doesn't sound bad, but I just had another thought.  How about defining
> a new relkind?  It'll make the patch a little more complex but I think
> if it avoids adding a column to pg_class it's probably worth it.

New relkind would make the patch affect a lot of stuff that shouldn't
need to care, including client-side code.
        regards, tom lane

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 15:29:07

On Tue, Oct 5, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Personally I think this is a dead end that we shouldn't be wasting
> any more time on.

But you haven't proposed a reasonable alternative.

As far as I can see, there are only two ways to go here.

Option #1: Remove all mention from the documentation of using views
for security purposes.  Don't allow views to have explicit permissions
attached to them; they are merely shorthand for a SELECT, for which
you either do or do not have privileges.

Option #2: Define a standard for what constitutes acceptable
information leakage and what does not.  Then write the code to try to
meet that standard.

The status quo, whereby we advise people to security their data by
doing something that doesn't actually work, is, to use the
non-technical term, dumb.  We need to decide what we're going to do
about it, not whether we're going to do anything about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

05 October 2010, 15:33:32

(2010/10/05 23:56), Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> Checking the functions of the operators is the right thing to do, but
>> assuming that internal = safe does not work.  For example, pushing
>> down arithmetic operators allows you to probe for any given value in a
>> hidden row by testing whether 1 / (x - constant) throws a division by
>> zero error.
> 
> Well, if the goal is "make it impossible to tell whether such-and-such a
> value exists", I think this approach can't meet it at all.  There are
> too many side channels.  You're focusing here on the error-report side
> channel, but there's also performance, ie how long did the query take.
> (BTW, is the intent to somehow lobotomize EXPLAIN so you can't use that
> to see what happened?)
> 
Good point. Major commercial RDBMS with row-level access control
(such as Oracle VPD) does not care about any side channels that
allows us to infer existence of a certain value.

Their features focus on control regular data channels. It allows
malicious users to transfer contents of invisible tuples into others
unexpectedly. It corresponds to a user defined function which insert
the supplied argument into temporary table in my example.

So, if we should catch up same level earlier, I think we need to
ignore such kind of side channel attacks.

If so, the matter become much simple. We need to consider whether
contents of the error messages are delivered via main-channel or
side-channel.
If we consider it is a side-channel, we can trust all the buili-in
functions because nothing tries to write out the supplied argument
into somewhere.
If we consider it is a regular-channel, we need to distinguish safe and
unsafe functions based on a certain criteria, maybe, 'safe' flag in
pg_proc.

In my opinion, I like to categorize error messages as side-channel,
because its through-put is much less than regular-channels, so scale
of the threat is relatively small.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

KaiGai Kohei

Date:

05 October 2010, 15:49:56

(2010/10/06 0:33), KaiGai Kohei wrote:
> (2010/10/05 23:56), Tom Lane wrote:
>> Robert Haas<robertmhaas@gmail.com>   writes:
>>> Checking the functions of the operators is the right thing to do, but
>>> assuming that internal = safe does not work.  For example, pushing
>>> down arithmetic operators allows you to probe for any given value in a
>>> hidden row by testing whether 1 / (x - constant) throws a division by
>>> zero error.
>>
>> Well, if the goal is "make it impossible to tell whether such-and-such a
>> value exists", I think this approach can't meet it at all.  There are
>> too many side channels.  You're focusing here on the error-report side
>> channel, but there's also performance, ie how long did the query take.
>> (BTW, is the intent to somehow lobotomize EXPLAIN so you can't use that
>> to see what happened?)
>>
> Good point. Major commercial RDBMS with row-level access control
> (such as Oracle VPD) does not care about any side channels that
> allows us to infer existence of a certain value.
> 
> Their features focus on control regular data channels. It allows
Sorry,                                             prevents ^^^^^^

> malicious users to transfer contents of invisible tuples into others
> unexpectedly. It corresponds to a user defined function which insert
> the supplied argument into temporary table in my example.
> 
> So, if we should catch up same level earlier, I think we need to
> ignore such kind of side channel attacks.
> 
> If so, the matter become much simple. We need to consider whether
> contents of the error messages are delivered via main-channel or
> side-channel.
> If we consider it is a side-channel, we can trust all the buili-in
> functions because nothing tries to write out the supplied argument
> into somewhere.
> If we consider it is a regular-channel, we need to distinguish safe and
> unsafe functions based on a certain criteria, maybe, 'safe' flag in
> pg_proc.
> 
> In my opinion, I like to categorize error messages as side-channel,
> because its through-put is much less than regular-channels, so scale
> of the threat is relatively small.
> 
> Thanks,


-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 16:27:30

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 5, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Personally I think this is a dead end that we shouldn't be wasting
>> any more time on.

> But you haven't proposed a reasonable alternative.

Tom: "This problem is insoluble."
Robert: "You can't claim that without offering a solution."

Sorry ...

> Option #1: Remove all mention from the documentation of using views
> for security purposes.  Don't allow views to have explicit permissions
> attached to them; they are merely shorthand for a SELECT, for which
> you either do or do not have privileges.

The SQL standard requires us to attach permissions to views.  The
standard makes no claims whatsoever about how leak-proof views should
be; it only says that you can't call a view without the appropriate
permissions.

I do think it's reasonable for the docs to point out that views that do
row-filtering should not be presumed to be absolutely bulletproof.
That doesn't make permissions on them useless, so you're attacking a
straw man.
        regards, tom lane

Re: leaky views, yet again

From

"Joshua D. Drake"

Date:

05 October 2010, 16:32:20

On Tue, 2010-10-05 at 12:27 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Tue, Oct 5, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Personally I think this is a dead end that we shouldn't be wasting
> >> any more time on.
>
> > But you haven't proposed a reasonable alternative.
>
> Tom: "This problem is insoluble."
> Robert: "You can't claim that without offering a solution."
>
> Sorry ...
>
> > Option #1: Remove all mention from the documentation of using views
> > for security purposes.  Don't allow views to have explicit permissions
> > attached to them; they are merely shorthand for a SELECT, for which
> > you either do or do not have privileges.
>
> The SQL standard requires us to attach permissions to views.  The
> standard makes no claims whatsoever about how leak-proof views should
> be; it only says that you can't call a view without the appropriate
> permissions.
>
> I do think it's reasonable for the docs to point out that views that do
> row-filtering should not be presumed to be absolutely bulletproof.
> That doesn't make permissions on them useless, so you're attacking a
> straw man.

+1

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 17:03:55

On Tue, Oct 5, 2010 at 12:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Option #1: Remove all mention from the documentation of using views
>> for security purposes.  Don't allow views to have explicit permissions
>> attached to them; they are merely shorthand for a SELECT, for which
>> you either do or do not have privileges.
>
> The SQL standard requires us to attach permissions to views.  The
> standard makes no claims whatsoever about how leak-proof views should
> be; it only says that you can't call a view without the appropriate
> permissions.
>
> I do think it's reasonable for the docs to point out that views that do
> row-filtering should not be presumed to be absolutely bulletproof.
> That doesn't make permissions on them useless, so you're attacking a
> straw man.

Really?  I'm confused.  What is the use case for the status quo?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 17:38:53

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 5, 2010 at 12:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That doesn't make permissions on them useless, so you're attacking a
>> straw man.

> Really?  I'm confused.  What is the use case for the status quo?

Access to tables that you have no direct privileges on, mainly.
(This is significantly more useful when you consider updates on
views than when you consider selects alone.)  Even if it were totally
useless from the point of view you're choosing to adopt, you'd have
a hard row to hoe to convince us to take out a SQL-standard feature.

The SQL facility *is* leak-proof AFAICS for column filtering.
Row filtering, not so much.  We could make it leak-proof (by making
the view into a hard optimization boundary) but I think everyone's
agreed that the performance consequences would be so bad as to make
such a feature useless.  Unfortunately I don't see any design that
avoids the performance penalty while still being guaranteed
leak-proof.  Once you realize that performance itself can be a leak
channel, it may well be that that's simply a contradiction in terms.
And I don't see a lot of use in plugging some side channels while
others remain open.  At least, not without a much more explicit
threat model than anyone's actually offered for this patch, which
would explain exactly why we need to close these particular side
channels and not others.
        regards, tom lane

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

05 October 2010, 17:40:00

Robert Haas <robertmhaas@gmail.com> wrote:
> What is the use case for the status quo?
Much simplified:

create table party
( countyno smallint not null, caseno varchar(14) not null, partyno smallint not null, name text not null, address text,
isaddrsealedboolean not null, primary key (countyno, caseno, partyno)
 
);

create table sealedaddrauth
( userid text not null primary key
);

create view partyview as select countyno, caseno, partyno,   case     when isaddrsealed and not exists      (select *
fromsealedaddrauth       where userid = current_user)     then '*** SEALED ***'     else address   end as address,
isaddrsealedfrom party
 
;

insert into party values (1,'2010FA000123',1,'Jane Doe',
'123 Victim Ave., Anytown, WI 53599',true);
insert into party values (1,'2010FA000123',2,'John Doe',
'123 Stalker St., Hometown, WI 53666',false);
-- Kevin

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 18:01:41

On Tue, Oct 5, 2010 at 1:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Oct 5, 2010 at 12:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> That doesn't make permissions on them useless, so you're attacking a
>>> straw man.
>
>> Really?  I'm confused.  What is the use case for the status quo?
>
> Access to tables that you have no direct privileges on, mainly.
> (This is significantly more useful when you consider updates on
> views than when you consider selects alone.)  Even if it were totally
> useless from the point of view you're choosing to adopt, you'd have
> a hard row to hoe to convince us to take out a SQL-standard feature.
>
> The SQL facility *is* leak-proof AFAICS for column filtering.
> Row filtering, not so much.

OK, that's true.  I guess if you only want to filter columns, and not
rows, then the way it's done right now works.  I didn't think of that.

> We could make it leak-proof (by making
> the view into a hard optimization boundary) but I think everyone's
> agreed that the performance consequences would be so bad as to make
> such a feature useless.  Unfortunately I don't see any design that
> avoids the performance penalty while still being guaranteed
> leak-proof.  Once you realize that performance itself can be a leak
> channel, it may well be that that's simply a contradiction in terms.
> And I don't see a lot of use in plugging some side channels while
> others remain open.  At least, not without a much more explicit
> threat model than anyone's actually offered for this patch, which
> would explain exactly why we need to close these particular side
> channels and not others.

Well, the only thing I've ever wanted to do this for was to allow
sales reps to see their own customers but not the customers of other
sales reps (because if they could pull our complete customer list,
then once they left and went to work for $COMPETITOR they'd start
trying to pick off our customers; of course, we couldn't prevent them
from maintaining a list of their own customers, and no doubt they knew
who some of the other customers were, but they couldn't just dump out
the complete list from the database).  I agree it's hopeless to
prevent all side-channel leaks, but I'd describe the goal like this:

Prevent access to the actual tuple contents of the hidden rows.

Failing to solve this problem at the database level doesn't remove the
business requirement.  I've solved this problem in the past by
ensuring that only trusted users have access to the database, and
forcing everyone else to go through an application that restricts the
set of queries they can issue.  That doesn't eliminate the
side-channel leak, though: they can still pull out a stopwatch and
attempt to infer the size of the table from the query execution time.
But in the above example, the total number of our customers isn't
particularly secret, just the identities of those customers (and
related information like who the main contact is and when their
contract term is up and how much they're paying and all those sorts of
things).  I think this is a common pattern; and it seems to me that if
I can enforce this security policy by writing an application wrapper
around the database, I should also be able to have the database itself
enforce it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Greg Stark

Date:

05 October 2010, 18:08:53

On Tue, Oct 5, 2010 at 11:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, the only thing I've ever wanted to do this for was to allow
> sales reps to see their own customers but not the customers of other
> sales reps (because if they could pull our complete customer list,
> then once they left and went to work for $COMPETITOR they'd start
> trying to pick off our customers; of course, we couldn't prevent them
> from maintaining a list of their own customers, and no doubt they knew
> who some of the other customers were, but they couldn't just dump out
> the complete list from the database).  I agree it's hopeless to
> prevent all side-channel leaks, but I'd describe the goal like this:
>
> Prevent access to the actual tuple contents of the hidden rows.

Though I find it unlikely the sales people would have direct access to
run arbitrary SQL -- let alone create custom functions.

If the users that have select access on the view don't have DDL access
doesn't that make them leak-proof for those users?

--
greg

Re: leaky views, yet again

From

Stephen Frost

Date:

05 October 2010, 18:17:24

* Greg Stark (gsstark@mit.edu) wrote:
> Though I find it unlikely the sales people would have direct access to
> run arbitrary SQL -- let alone create custom functions.

I'm not really sure why..?  Perhaps not quite the same, but I've got
quite a few users who have direct SQL access (though they use ODBC on
their side, typically, there's nothing which forces them to) and I'd
certainly like to have the views that I've created which do row-level
filtering work correctly.  It's not to the point where I've started
using set-returning functions, but it's really not a situation I like
being in. :/

> If the users that have select access on the view don't have DDL access
> doesn't that make them leak-proof for those users?

Yeah..  I feel like we 'fixed' that whole problem with DO in any case..
Thanks,
    Stephen

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

05 October 2010, 18:21:18

On 05.10.2010 21:08, Greg Stark wrote:
> If the users that have select access on the view don't have DDL access
> doesn't that make them leak-proof for those users?

No. You can use built-in functions for leaking data as well.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 18:46:17

Robert Haas <robertmhaas@gmail.com> writes:
> ... I agree it's hopeless to
> prevent all side-channel leaks, but I'd describe the goal like this:

> Prevent access to the actual tuple contents of the hidden rows.

> Failing to solve this problem at the database level doesn't remove the
> business requirement.  I've solved this problem in the past by
> ensuring that only trusted users have access to the database, and
> forcing everyone else to go through an application that restricts the
> set of queries they can issue.  That doesn't eliminate the
> side-channel leak, though: they can still pull out a stopwatch and
> attempt to infer the size of the table from the query execution time.

I think you were missing the point of my comment about performance.
If the goal is "prevent users from inferring whether value X is present
in the table", I believe this patch cannot fix it because it's possible
(in some cases) to infer that from performance measurements, ie how long
does it take to execute a query that mentions X versus one that mentions
Y.  I agree that it's unlikely to be practical to extract values that
you don't already have a clue about, but broad claims like "prevent all
access" are untenable.

I believe that we might be able to solve your case of ensuring that a
user can't trivially extract the entire table contents, but I don't
believe we can solve Kevin's version of the problem, which is whether
a stalker can verify the address of a victim that he's not supposed to
be able to see.  So we need a pretty clear description of exactly what
it is we're going to be able to prevent and why such a facility is worth
the mess (and future security bugs) it's going to result in.

BTW, I thought Kevin's example view was mighty interesting, because it
applies the security check in a totally different way than what we've
all been implicitly assuming.  Ie, instead ofselect * from underlying_table where security_check();
he didselect security_wrapper(underlying_col) from underlying_table;
Offhand these approaches seem to have quite different properties.
        regards, tom lane

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 18:48:24

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 05.10.2010 21:08, Greg Stark wrote:
>> If the users that have select access on the view don't have DDL access
>> doesn't that make them leak-proof for those users?

> No. You can use built-in functions for leaking data as well.

There's a difference between "can be used to extract data wholesale"
and "can be used to probe for the existence of a specific value".
We might need to start using more specific terminology than "leak".
        regards, tom lane

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 18:49:54

On Tue, Oct 5, 2010 at 2:08 PM, Greg Stark <gsstark@mit.edu> wrote:
> Though I find it unlikely the sales people would have direct access to
> run arbitrary SQL -- let alone create custom functions.

I have definitely seen shops where virtually everyone has SQL-level
access to the database.  Several of them.  Most of them were pretty
insecure, but it certainly doesn't help anything when the database has
no capability to do anything better.  Now, I will grant you that not
everyone in those organizations was actually smart enough to do
meaningful things with the access they had, but I never found that
very comforting.

> If the users that have select access on the view don't have DDL access
> doesn't that make them leak-proof for those users?

Depends what they can do with pre-existing, or built-in, functions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

"Joshua D. Drake"

Date:

05 October 2010, 19:00:10

On Tue, 2010-10-05 at 14:49 -0400, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 2:08 PM, Greg Stark <gsstark@mit.edu> wrote:
> > Though I find it unlikely the sales people would have direct access to
> > run arbitrary SQL -- let alone create custom functions.
>
> I have definitely seen shops where virtually everyone has SQL-level
> access to the database.

Uhh... yeah it is very common to point access at the database and say go
for it. Very common.

>   Several of them.  Most of them were pretty
> insecure, but it certainly doesn't help anything when the database has
> no capability to do anything better.  Now, I will grant you that not
> everyone in those organizations was actually smart enough to do
> meaningful things with the access they had, but I never found that
> very comforting.

The better argument here is, the majority (by far, just google it) of
espionage is done IN HOUSE. It doesn't matter if it is a sales person.
It could be a disgruntled DBA.

JD



--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 19:06:23

On Tue, Oct 5, 2010 at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> On 05.10.2010 21:08, Greg Stark wrote:
>>> If the users that have select access on the view don't have DDL access
>>> doesn't that make them leak-proof for those users?
>
>> No. You can use built-in functions for leaking data as well.
>
> There's a difference between "can be used to extract data wholesale"
> and "can be used to probe for the existence of a specific value".
> We might need to start using more specific terminology than "leak".

Yeah.  There are a lot of cases.  The worst is if you can (1a) dump
the underlying table wholesale, or maybe (1b) extract it one row at a
time or something like that.  Not quite as bad is if you can (2) infer
the presence or absence of particular values in particular columns,
e.g. via division-by-zero.  This is still pretty bad though, because
you can probably just keep guessing until you eventually can enumerate
everything in that column.  If it's a text field or a UUID that may be
pretty hard, but if the range of interesting values for that column is
limited to, say, a million or so, then you can just iterate through
them until you find everything.  A related problem is where you can
(3) infer the frequency of a value based on the plan choice, either by
viewing the EXPLAIN output directly or by timing attacks; and then
there's (4) the ability to infer something about the overall amount of
data in the underlying table.  Any others?

Of those, I'm inclined to think that it's possible to close off (1)
and (2) pretty thoroughly with sufficient care, but the best you'd be
able to do for (3) and (4) is refuse to EXPLAIN to a user without
sufficient privileges; the timing attacks seem intractable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

05 October 2010, 19:15:51

On 05.10.2010 21:48, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> On 05.10.2010 21:08, Greg Stark wrote:
>>> If the users that have select access on the view don't have DDL access
>>> doesn't that make them leak-proof for those users?
>
>> No. You can use built-in functions for leaking data as well.
>
> There's a difference between "can be used to extract data wholesale"
> and "can be used to probe for the existence of a specific value".

Define wholesale. I'm also not too worried about probing attacks, where 
you can ask "does this value exist?", but there are plenty of built-in 
fúnctions that can regurgitate the argument verbatim in an error 
message. Using that, you can build a script to fetch the value for every 
row in the table, one row at a time. It's orders of magnitude slower 
than "SELECT * FROM table", but orders of magnitude faster than probing 
for every possible value for every row.
> We might need to start using more specific terminology than "leak".

Yeah. There are many different categories of leakage:

1. Wholesale retrieval of all rows. For example, a custom function that 
emits the argument with a NOTICE, used in a WHERE clause.

2. Retrieval of a given value in an arbitrary attacker-chosen row. Using 
a function like text to integer cast that emits the argument in an ERROR 
falls into this category. You can access any value in the table, but 
only one at a time because ERROR causes a rollback.

3. Retrieval of some values, not attacker-chosen. For example, 
statistics might leak the top 100 values.

4. Probing attacks. Let's you check if a row with given value exists.

5. Leakage of statistical information. Lets you know roughly how many 
rows there are in a table, for example.

There's some fuzziness in those too, you might be able to probe for 
values in an indexed column but not others, for example. Or you might be 
able to retrieve all values in one column, or all values in another 
column, but not put them together to form the original rows in the table.

IMHO we don't need to protect against 5. Probing attacks can be nasty, 
but it's next to impossible to protect from them completely. And for 
many applications, probing attacks are a non-issue. For example, imagine 
table of usernames and passwords, with a view that lets you see your own 
password. Probing for other people's passwords would be useless, you 
might as well try to log in with the guessed password directly.

Retrieval of some non-attacker chosen rows, like from statistics, would 
be nice to avoid where feasible, but I won't lose my sleep over it. I do 
think we need to protect against 1 and 2.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

05 October 2010, 19:25:09

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't believe we can solve Kevin's version of the problem, which
> is whether a stalker can verify the address of a victim that he's
> not supposed to be able to see.
I'm surprised; I thought that we were already there.  If someone has
SELECT rights on that view, how would they be able to verify an
address?  More importantly, do you see a way to find out what a
particular party's address is when it is unknown?  I'm getting the
unsettling feeling that I've been missing something important....
By the way, I didn't mean to leave the name column out of the view,
but I guess I inadvertently demonstrated another way in which I
think the current view implementation adds security.  If the column
isn't exposed to the view at all, I don't see how access to the view
can leak much about the omitted column, but perhaps I'm missing
something there, too?
> BTW, I thought Kevin's example view was mighty interesting,
> because it applies the security check in a totally different way
> than what we've all been implicitly assuming.  Ie, instead of
>   select * from underlying_table where security_check();
> he did
>   select security_wrapper(underlying_col) from underlying_table;
> Offhand these approaches seem to have quite different properties.
We do both (sometimes in the same query), but obfuscating detail
about a database object (case, party, address) is much more common
than hiding the existence of these objects.  The obfuscated columns
are usually not indexed or usable as search criteria.
-Kevin

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 19:42:27

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't believe we can solve Kevin's version of the problem, which
>> is whether a stalker can verify the address of a victim that he's
>> not supposed to be able to see.
> I'm surprised; I thought that we were already there.

Well, the approach you suggested of putting a security wrapper around
the output column might be bulletproof against that; I'm not entirely
sure, but I don't see a hole in it at the moment.  The trouble with it
is that it's pretty bad from a performance point of view, at least
for columns that people are supposed to be able to use in WHERE clauses.
You couldn't index the wrapper expression either.  So I'm not seeing
a universal solution there.
> By the way, I didn't mean to leave the name column out of the view,
> but I guess I inadvertently demonstrated another way in which I
> think the current view implementation adds security.  If the column
> isn't exposed to the view at all, I don't see how access to the view
> can leak much about the omitted column, but perhaps I'm missing
> something there, too?

Right, *column* filtering seems easy and entirely secure.  The angst
here is about row filtering.  Can we have a view in which users can see
the values of a column for some rows, with perfect security that they
can't identify values for the hidden rows?  The stronger form is that
they shouldn't even be able to tell that hidden rows exist, which is
something your view doesn't try to do; but there are at least some
applications where that would be desirable.
        regards, tom lane

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

05 October 2010, 20:25:26

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, the approach you suggested of putting a security wrapper
> around the output column might be bulletproof against that; I'm
> not entirely sure, but I don't see a hole in it at the moment. 
> The trouble with it is that it's pretty bad from a performance
> point of view, at least for columns that people are supposed to be
> able to use in WHERE clauses.
OK.  We don't really intend for such columns to be used for
selection criteria or sorting, so I think we're fine.  :-)  Thanks
for the clarification.
> The angst here is about row filtering.
OK, I think we're all back on the same page.  I only posted because
Robert questioned whether there was any security use case for
current views.
> The stronger form is that they shouldn't even be able to tell that
> hidden rows exist, which is something your view doesn't try to do;
> but there are at least some applications where that would be
> desirable.
I can understand that, but from what I've read on the topic, it
seems that even some of the most security-conscious government and
military users concede that point and just go with meaningless
identifiers for inter-table references, so that what leaks is
meaningless.
<digression>
The courts have needs which are a bit different than some other
security users; it generally comes down to balancing privacy rights
against the rights of the public to access matters of public record.
Through the continuing efforts of various committees
recommendations, supreme court rules, and legislation, we have one
view of the data which is available on the Internet, a more generous
view of a county's data if you actually walk into that county's
courthouse and use a public access workstation, another level for
all parties on certain case types (with the ability to secure some
of the data about one party from others if ordered by the court),
and then it gets really complicated when you get to what different
court staff are allowed to see or modify.  Then there can be special
exceptions for some business partners, like children's services or
police agencies.
Right now this is managed by query classes in our Java applications,
but as we're moving to a variety of new and different technologies
it's getting harder for the DBAs to ensure that nothing is leaking
to inappropriate recipients.  :-(  I think we're going to need to
move more of the enforcement to database views and/or security
restrictions based on database roles.
Some of this seems to fit fairly neatly with the general direction
in which KaiGai has been pushing; some of it maybe not so much,
because we don't operate on something as simple as "secret", "top
secret", etc.  But we could use some of the features which seem to
be considered pretty esoteric -- like showing different versions of
a row to people with different security.  For example, on a court
calendar, as available to the public in the courthouse for cases
scheduled on the current date, a juvenile case would show the
child's initials ("In the Interest of J.D."), while the case would
not show for the public on the Internet, but the judge involved and
the deputy clerks of court who deal with juvenile cases would see
the entire name.
</digression>
Sorry for drifting off topic....
-Kevin

Re: leaky views, yet again

From

bricklen

Date:

05 October 2010, 20:37:49

On Tue, Oct 5, 2010 at 1:25 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Right now this is managed by query classes in our Java applications,
> but as we're moving to a variety of new and different technologies
> it's getting harder for the DBAs to ensure that nothing is leaking
> to inappropriate recipients.  :-(  I think we're going to need to
> move more of the enforcement to database views and/or security
> restrictions based on database roles.

Does  Veil cover some of those needs?
http://veil.projects.postgresql.org/curdocs/index.html
I've never used it, but from what I recall hearing about it, it did
something similar (I thought).

Re: leaky views, yet again

From

Robert Haas

Date:

05 October 2010, 20:50:57

On Tue, Oct 5, 2010 at 4:25 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> The stronger form is that they shouldn't even be able to tell that
>> hidden rows exist, which is something your view doesn't try to do;
>> but there are at least some applications where that would be
>> desirable.
>
> I can understand that, but from what I've read on the topic, it
> seems that even some of the most security-conscious government and
> military users concede that point and just go with meaningless
> identifiers for inter-table references, so that what leaks is
> meaningless.

Even apart from inter-table references, you can potentially infer
things like table sizes from query response times.  But I think that
stuff is just intractable as a database problem.  In real world
situations, you can handle it by interjecting massive latency.  For
example, if some semi-trusted party asks the US "do you have any
nuclear submarines that are currently near this lat/long?", you can
give them the answer back, say, the next day.  At that point it's
pretty hard for them to infer anything about how long it took you to
search your database of nuclear-submarine-locations, which is
information you might want to keep secret for a variety of reasons.
But the amount of latency that you need to insert to provide a safety
valve is going to be highly application-dependent, and in many cases
it's basically "none", as in the sales rep/customer database I
mentioned earlier.

> Some of this seems to fit fairly neatly with the general direction
> in which KaiGai has been pushing; some of it maybe not so much,
> because we don't operate on something as simple as "secret", "top
> secret", etc.

I wouldn't get the particular issue of leaky views confused with
SE-Linux integration.  There is definitely a use case out there for
label-based mandatory access control, but I don't think anyone would
deny that it's a small subset of our total user base.  Being able to
use views to hide a subset of the rows in some table is a much more
generally useful thing to be able to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Tom Lane

Date:

05 October 2010, 20:55:33

bricklen <bricklen@gmail.com> writes:
> Does  Veil cover some of those needs?
> http://veil.projects.postgresql.org/curdocs/index.html

This entire discussion arose from the desire to plug the holes in Veil...
        regards, tom lane

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

05 October 2010, 20:59:37

bricklen <bricklen@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Right now this is managed by query classes in our Java
>> applications, but as we're moving to a variety of new and
>> different technologies it's getting harder for the DBAs to ensure
>> that nothing is leaking to inappropriate recipients.  :-(  I
>> think we're going to need to move more of the enforcement to
>> database views and/or security restrictions based on database
>> roles.
> 
> Does  Veil cover some of those needs?
> http://veil.projects.postgresql.org/curdocs/index.html
> I've never used it, but from what I recall hearing about it, it
> did something similar (I thought).
Possibly.  The general tone of references to it, as well as the fact
that it's still classified as being in beta testing, with a release
number less than one, have put me off from looking at it closely. 
Is anyone using it in a situation where they have three thousand
directly connected users?  Or on a database backing a web site with
five million hits a day?  On a schema with over 300 tables
(normalized, no partitioning)?  If so, I'd be interested in hearing
about it.
-Kevin

Re: leaky views, yet again

From

"Joshua D. Drake"

Date:

05 October 2010, 23:31:56

On Tue, 2010-10-05 at 12:27 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Tue, Oct 5, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Personally I think this is a dead end that we shouldn't be wasting
> >> any more time on.
> 
> > But you haven't proposed a reasonable alternative.
> 
> Tom: "This problem is insoluble."
> Robert: "You can't claim that without offering a solution."
> 
> Sorry ...
> 
> > Option #1: Remove all mention from the documentation of using views
> > for security purposes.  Don't allow views to have explicit permissions
> > attached to them; they are merely shorthand for a SELECT, for which
> > you either do or do not have privileges.
> 
> The SQL standard requires us to attach permissions to views.  The
> standard makes no claims whatsoever about how leak-proof views should
> be; it only says that you can't call a view without the appropriate
> permissions.
> 
> I do think it's reasonable for the docs to point out that views that do
> row-filtering should not be presumed to be absolutely bulletproof.
> That doesn't make permissions on them useless, so you're attacking a
> straw man.

+1

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: leaky views, yet again

From

KaiGai Kohei

Date:

06 October 2010, 00:30:34

(2010/10/06 4:06), Robert Haas wrote:
> On Tue, Oct 5, 2010 at 2:48 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>>> On 05.10.2010 21:08, Greg Stark wrote:
>>>> If the users that have select access on the view don't have DDL access
>>>> doesn't that make them leak-proof for those users?
>>
>>> No. You can use built-in functions for leaking data as well.
>>
>> There's a difference between "can be used to extract data wholesale"
>> and "can be used to probe for the existence of a specific value".
>> We might need to start using more specific terminology than "leak".
> 
> Yeah.  There are a lot of cases.  The worst is if you can (1a) dump
> the underlying table wholesale, or maybe (1b) extract it one row at a
> time or something like that.  Not quite as bad is if you can (2) infer
> the presence or absence of particular values in particular columns,
> e.g. via division-by-zero.  This is still pretty bad though, because
> you can probably just keep guessing until you eventually can enumerate
> everything in that column.  If it's a text field or a UUID that may be
> pretty hard, but if the range of interesting values for that column is
> limited to, say, a million or so, then you can just iterate through
> them until you find everything.  A related problem is where you can
> (3) infer the frequency of a value based on the plan choice, either by
> viewing the EXPLAIN output directly or by timing attacks; and then
> there's (4) the ability to infer something about the overall amount of
> data in the underlying table.  Any others?
> 
> Of those, I'm inclined to think that it's possible to close off (1)
> and (2) pretty thoroughly with sufficient care, but the best you'd be
> able to do for (3) and (4) is refuse to EXPLAIN to a user without
> sufficient privileges; the timing attacks seem intractable.
> 

Thanks for good summarize.

I also think the case (1) should be closed off soon, because it allows
to expose hidden data-contents without any inference of attacker; and
its throughput is unignorably fast, so its degree of threat is relatively
higher than other cases.

<side-note>
The idea of throughput is not my own idea. It come from the classic of
security evaluation criteria: Trusted Computer System Evaluation Criteria
(TCSEC, 1985)

See the page.80 of: http://csrc.nist.gov/publications/history/dod85.pdf

| From a security perspective, covert channels with low bandwidths represent a
| lower threat than those with high bandwidths. However, for many types of
| covert channels, techniques used to reduce the bandwidth below a certain rate
| (which depends on the specific channel mechanism and the system architecture)
| also have the effect of degrading the performance provided to legitimate
| system users. Hence, a trade-off between system performance and covert
| channel bandwidth must be made.
</side-node>

I also think we should care about a part of (2) cases.
Could you separate the (2) into two cases.

The (2a) allows people to see hidden value using error message. In this case,
people can see direct value to be hidden, but thorough-put is not fast.
The (2b) allows people to infer existence or absence of a certain value using
PK or UNIQUE conflicts.

(2a) is the reason why my patch allows to push down only operators with
internal functions, because these are not obviously leakable.
However, I don't think (2b) is the case we should fix up here, because
no commercial RDBMSs with RLS don't handle such kind of side-channel
attacks using key conflicts.

And, it seems to me the cost will be too expensive to care about the
case (3) and (4). So, I think it is worthless to fix up them.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

KaiGai Kohei

Date:

06 October 2010, 00:54:17

(2010/10/06 4:15), Heikki Linnakangas wrote:
> On 05.10.2010 21:48, Tom Lane wrote:
>> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:
>>> On 05.10.2010 21:08, Greg Stark wrote:
>>>> If the users that have select access on the view don't have DDL access
>>>> doesn't that make them leak-proof for those users?
>>
>>> No. You can use built-in functions for leaking data as well.
>>
>> There's a difference between "can be used to extract data wholesale"
>> and "can be used to probe for the existence of a specific value".
>
> Define wholesale. I'm also not too worried about probing attacks, where you can ask "does this value exist?", but
thereare plenty of built-in fúnctions that can regurgitate the argument verbatim in an error message. Using that, you
canbuild a script to
 
> fetch the value for every row in the table, one row at a time. It's orders of magnitude slower than "SELECT * FROM
table",but orders of magnitude faster than probing for every possible value for every row.
 
>
>  > We might need to start using more specific terminology than "leak".
>
> Yeah. There are many different categories of leakage:
>
> 1. Wholesale retrieval of all rows. For example, a custom function that>    emits the argument with a NOTICE, used in
aWHERE clause.
 

It contains a custom function with side-effect; such as INSERT the supplied
arguments into private tables.

> 2. Retrieval of a given value in an arbitrary attacker-chosen row. Using> a function like text to integer cast that
emitsthe argument in an ERROR> falls into this category. You can access any value in the table, but only> one at a time
becauseERROR causes a rollback.
 
>
> 3. Retrieval of some values, not attacker-chosen. For example, statistics> might leak the top 100 values.
>
> 4. Probing attacks. Let's you check if a row with given value exists.
>
> 5. Leakage of statistical information. Lets you know roughly how many rows> there are in a table, for example.
>
> There's some fuzziness in those too, you might be able to probe for values> in an indexed column but not others, for
example.Or you might be able to> retrieve all values in one column, or all values in another column, but not> put them
togetherto form the original rows in the table.
 
>
> IMHO we don't need to protect against 5. Probing attacks can be nasty, but> it's next to impossible to protect from
themcompletely. And for many> applications, probing attacks are a non-issue. For example, imagine table> of usernames
andpasswords, with a view that lets you see your own password.> Probing for other people's passwords would be useless,
youmight as well> try to log in with the guessed password directly.
 
>
> Retrieval of some non-attacker chosen rows, like from statistics, would be> nice to avoid where feasible, but I won't
losemy sleep over it. I do think> we need to protect against 1 and 2.
 
>
I also agree with this attitude.
The case 1 and 2 have differences from others. It allows to expose hidden
values, then people may be able to see these values without any inference.
So, their through-put is relatively faster than others. It means degree of
threat is also higher.

If we try to tacker to the matter 1 and 2, suggestions from Itagaki-san
are still available, because this patch was designed to prevent people
to see hidden data without inferences.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

"Joshua D. Drake"

Date:

06 October 2010, 01:59:43

On Tue, 2010-10-05 at 14:49 -0400, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 2:08 PM, Greg Stark <gsstark@mit.edu> wrote:
> > Though I find it unlikely the sales people would have direct access to
> > run arbitrary SQL -- let alone create custom functions.
> 
> I have definitely seen shops where virtually everyone has SQL-level
> access to the database.

Uhh... yeah it is very common to point access at the database and say go
for it. Very common.

>   Several of them.  Most of them were pretty
> insecure, but it certainly doesn't help anything when the database has
> no capability to do anything better.  Now, I will grant you that not
> everyone in those organizations was actually smart enough to do
> meaningful things with the access they had, but I never found that
> very comforting.

The better argument here is, the majority (by far, just google it) of
espionage is done IN HOUSE. It doesn't matter if it is a sales person.
It could be a disgruntled DBA.

JD



-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: leaky views, yet again

From

KaiGai Kohei

Date:

06 October 2010, 08:02:04

The attached patch is a revised version of the fix-leaky-view patch.

List of updates:
- It was rebased to the latest git master.
- It revised pointer castings in rewriteHandler.c.
- It added a new regression test: 'security_views'.
- It added SECURITY VIEW support to pg_dump.
- It modified the definition of pg_views to show whether it is
  a security view, or not.
- It revised psql to show SECURITY VIEW attribute of views in
  verbose mode.
- I tried to add a new field to pg_class, but scale of code changes
  are eventually similar to previous version. So, all I revised was
  to define StdViewOptions, instead of abuse of StdRdOptions.


To avoid confusion, I'd like to make clear what is the matter this
patch tries to tackle.
As we have discussed for a long time, an interaction between functions
with side-effects and query optimization on views may cause information
leaks, even if owner of the views intend to use them for row-level
security.
In this context, we define the term of 'leak' as system can expose
content of tuples to be invisible unexpectedly, then people can see
the content directly without any inferences.

Thus, if and when people supplied a function that writes out given
arguments into other tables on the WHERE clause of views, it should
be invoked after all the conditions within view definition.

  E.g)  CREATE OR REPLACE FUNCTION f_malicious(text)
          RETURNS bool COST 0.01 LANGUAGE 'sql'
          AS 'INSERT INTO my_private VALUES($1); SELECT true;';
        SELECT * FROM secret_view v WHERE f_malicious(v);

In addition, some of functions (including built-in functions) raise
an error with message that may contain given arguments. It also allows
to expose content of invisible tuples (although throughput of the
channel is relatively slow), if it would be pushed down into inside
of join-loop.

  E.g)  postgres=# select * from v2;
         a |  b  | x |  y
        ---+-----+---+-----
         2 | bbb | 2 | www
         4 | eee | 4 | yyy
        (2 rows)

        postgres=# select * from v2 where y::int = 1;
        ERROR:  invalid input syntax for integer: "vvv"

However, it is too expensive to prohibit pushing down all the function
calls into join-loops. So, we need allow to push down a part of harmless
functions. In this patch, I don't change the logic to decide it.
Thus, only operators implemented by functions with INTERNALlanguage
are allowed to push down into SECURITY VIEWs.

On the other hand, we can infer a certain value of hidden tuple with
iteration of probes. In this scenario, people never see the hidden
values directly. So, it is not a matter this patch tries to tackle.

  E.g) postgres=# select * from v1;
        a |  b
       ---+-----
        2 | bbb
        4 | eee
       (2 rows)

       postgres=# insert into t1 values (3, 'xyz');
       ERROR:  duplicate key value violates unique constraint "t1_pkey"
       DETAIL:  Key (a)=(3) already exists.
       postgres=# select * from v1 where 1/(3-a) > 0;
       ERROR:  division by zero

As long as I know, commercial RDBMSs with RLS also don't tackle such
kind of information leaks via side-channels. At least, I don't think
it is a matter that we should tackle with first priority.

Thanks,

(2010/10/05 18:01), Itagaki Takahiro wrote:
> 2010/9/6 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>> The attached patch tackles both of the above two points, and allows to
>> control this deoptimization when CREATE SECURITY VIEW is used.
>
> I'll send a review for the patch.
>
> Contents&  Purpose
> ==================
> The patch adds a new SECURITY option for VIEWs. Views defined with
> CREATE SECURITY
> VIEW command are not merged with external WHERE-clauses including
> security-leakable
> functions in queries. However, since internal functions and operators are not
> considered as leakable functions, the planner still works well for
> security views
> unless we use user-defined functions in the WHERE-clause.
>
> Initial Run
> ===========
> * Because of the delayed review, the patch has one hunk:
>     1 out of 5 hunks FAILED -- saving rejects to file
> src/backend/commands/tablecmds.c.rej
>    but it is not serious at all, and can be easily fixed.
>
> * It can be compiled, but has two warnings:
> rewriteHandler.c: In function 'MarkFuncExprDepthWalker':
> rewriteHandler.c:1155: warning: cast from pointer to integer of different size
> rewriteHandler.c:1227: warning: cast to pointer from integer of different size
>
> They should be harmless, but casting intptr_t is often used to avoid warnings:
>    - L1155: (int) (intptr_t) context;
>    - L1227: (void *) (intptr_t) (depth + 1)
>
> * It passes all regression tests, but doesn't have own new tests.
>
> Remaining works
> ===============
> The patch might include only core code, but I'll let you know additional
> sub-works are still required to complete the feature. Also, I've not measured
> the performance yet, though performance might not be so critical for the patch.
>
> * Regression tests and documentation for the feature are required.
> * pg_dump must support to dump SECURITY VIEWs.
>    They are dumped as normal views for now.
> * pg_views can have "security" column.
> * psql's \dv can show security attributes of views.
>
> Questions
> =========
>> postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2;
>>   ==>  We assume operators implemented with built-in functions are safe.
>>       So, we don't prevent this pushing-down, if the clause does not
>>       contains something leakable.
>
> The term "built-in functions" means functions written in INTERNAL language
> here. But we also have SQL functions by default. Some of them are just a
> wrapper to internal functions. I'm not sure the checking of INTERNAL language
> is the best way for the purpose. Did you compare it with other methods?
> For example, "oid<  FirstNormalObjectId" looks workable for me.
>
>> Instead of modifying the structure of pg_class, this patch stores a flag of
>> security view on the reloption. So, it needed to modify routines about
>> reloptions because it is the first case to store reloptions of views.
>
> Why did you need to extend StdRdOptions strucuture? Since other fields in
> the structure are not used by views at all, I think adding a new structure
>    struct ViewOptions { vl_len, security_view }
> would be better than extending StdRdOptions.
>


--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-fix-leaky-join-view.7.patch

Re: leaky views, yet again

From

Robert Haas

Date:

07 October 2010, 03:40:01

On Tue, Oct 5, 2010 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Right, *column* filtering seems easy and entirely secure.  The angst
> here is about row filtering.  Can we have a view in which users can see
> the values of a column for some rows, with perfect security that they
> can't identify values for the hidden rows?  The stronger form is that
> they shouldn't even be able to tell that hidden rows exist, which is
> something your view doesn't try to do; but there are at least some
> applications where that would be desirable.

I took a crack at documenting the current behavior; see attached.  It
turns out that a view which only uses boolean operators in the WHERE
clause is not obviously subvertable, because we judge those operations
to have no cost.  (It seems unwise to rely on this for security,
though.)  Anything more complicated - that does row filtering - is
easily hacked.  See within for details.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachment

document-leaky-views.patch

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

07 October 2010, 06:03:20

On 07.10.2010 06:39, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 3:42 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Right, *column* filtering seems easy and entirely secure.  The angst
>> here is about row filtering.  Can we have a view in which users can see
>> the values of a column for some rows, with perfect security that they
>> can't identify values for the hidden rows?  The stronger form is that
>> they shouldn't even be able to tell that hidden rows exist, which is
>> something your view doesn't try to do; but there are at least some
>> applications where that would be desirable.
>
> I took a crack at documenting the current behavior; see attached.

Looks good. It gives the impression that you need to be able to a create 
custom function to exploit, though. It would be good to mention that 
internal functions can be used too, revoking access to CREATE FUNCTION 
does not make you safe.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

Robert Haas

Date:

07 October 2010, 12:24:52

On Thu, Oct 7, 2010 at 2:02 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 07.10.2010 06:39, Robert Haas wrote:
>>
>> On Tue, Oct 5, 2010 at 3:42 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>>
>>> Right, *column* filtering seems easy and entirely secure.  The angst
>>> here is about row filtering.  Can we have a view in which users can see
>>> the values of a column for some rows, with perfect security that they
>>> can't identify values for the hidden rows?  The stronger form is that
>>> they shouldn't even be able to tell that hidden rows exist, which is
>>> something your view doesn't try to do; but there are at least some
>>> applications where that would be desirable.
>>
>> I took a crack at documenting the current behavior; see attached.
>
> Looks good. It gives the impression that you need to be able to a create
> custom function to exploit, though. It would be good to mention that
> internal functions can be used too, revoking access to CREATE FUNCTION does
> not make you safe.

OK, second try attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachment

document-leaky-views-v2.patch

Re: leaky views, yet again

From

Stephen Frost

Date:

07 October 2010, 13:10:33

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Oct 7, 2010 at 2:02 AM, Heikki Linnakangas
> > Looks good. It gives the impression that you need to be able to a create
> > custom function to exploit, though. It would be good to mention that
> > internal functions can be used too, revoking access to CREATE FUNCTION does
> > not make you safe.
>
> OK, second try attached.

This might be overly pedantic, but I don't think 'tampering' gives the
right impression.  Also, there's a marked difference between viewing
data by using built-ins such as casting (since you'll only get to see
the first value in a column that fails the cast) and being able to write
a function that pulls out every row of the table and dumps it into
another table.  I think it'd have a much bigger impression if you went
ahead and changed the 'raise notice' to an 'insert into table x;'.

Also, even if you can't create functions (due to lack of create
privileges on any schema), you could use DO clauses now.  Revoking
usage rights on all languages should prevent both though.
Thanks,
    Stephen

Re: leaky views, yet again

From

Heikki Linnakangas

Date:

07 October 2010, 13:17:18

On 07.10.2010 16:10, Stephen Frost wrote:
> Also, even if you can't create functions (due to lack of create
> privileges on any schema), you could use DO clauses now.

There's no way to shoehorn a DO clause into a SELECT, you can't do:

SELECT data FROM view WHERE (DO $$ RAISE NOTICE argument; $$) = 1

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: leaky views, yet again

From

Robert Haas

Date:

07 October 2010, 13:19:43

On Thu, Oct 7, 2010 at 9:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Thu, Oct 7, 2010 at 2:02 AM, Heikki Linnakangas
>> > Looks good. It gives the impression that you need to be able to a create
>> > custom function to exploit, though. It would be good to mention that
>> > internal functions can be used too, revoking access to CREATE FUNCTION does
>> > not make you safe.
>>
>> OK, second try attached.
>
> This might be overly pedantic, but I don't think 'tampering' gives the
> right impression.

I'm open to suggestions.

> Also, there's a marked difference between viewing
> data by using built-ins such as casting (since you'll only get to see
> the first value in a column that fails the cast) and being able to write
> a function that pulls out every row of the table and dumps it into
> another table.

Well, that's why I give the more serious example first.  Even with
casting failures, there's a good chance you can probe a bunch of
different rows by throwing random filter conditions into the clause.

(function_that_returns_false() OR (name = 'some_constant' AND number::box)

> I think it'd have a much bigger impression if you went
> ahead and changed the 'raise notice' to an 'insert into table x;'.

Possibly, but it makes the example slightly longer, and I think it's
clear enough as-is.

> Also, even if you can't create functions (due to lack of create
> privileges on any schema), you could use DO clauses now.  Revoking
> usage rights on all languages should prevent both though.

I don't see how to make it work with a DO clause.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: leaky views, yet again

From

Stephen Frost

Date:

07 October 2010, 13:23:37

* Heikki Linnakangas (heikki.linnakangas@enterprisedb.com) wrote:
> On 07.10.2010 16:10, Stephen Frost wrote:
>> Also, even if you can't create functions (due to lack of create
>> privileges on any schema), you could use DO clauses now.
>
> There's no way to shoehorn a DO clause into a SELECT, you can't do:
>
> SELECT data FROM view WHERE (DO $$ RAISE NOTICE argument; $$) = 1

Wow, I just kind of assumed you could; I'm not really sure why.  Perhaps
it'll be possible in the future though, so might be something to think
about if/when it happens.  Can't see a way to abuse the view from inside
a DO or in a function in the same way either.
Stephen

Re: leaky views, yet again

From

Stephen Frost

Date:

07 October 2010, 13:27:38

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Oct 7, 2010 at 9:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > This might be overly pedantic, but I don't think 'tampering' gives the
> > right impression.
>
> I'm open to suggestions.

Yeah, wasn't coming up with a better word myself. :/  Maybe something
with "circumvented"?

> > Also, even if you can't create functions (due to lack of create
> > privileges on any schema), you could use DO clauses now.  Revoking
> > usage rights on all languages should prevent both though.
>
> I don't see how to make it work with a DO clause.

Yeah, Heikki pointed out that I was assuming PG could work more magic
than it can today. :)  Sorry for the noise.
Stephen

Re: leaky views, yet again

From

Dimitri Fontaine

Date:

08 October 2010, 07:19:29

Hi,

Stephen Frost <sfrost@snowman.net> writes:
> Wow, I just kind of assumed you could; I'm not really sure why.  Perhaps
> it'll be possible in the future though, so might be something to think
> about if/when it happens.  Can't see a way to abuse the view from inside
> a DO or in a function in the same way either.

It took me some time and reviewing the patch to think about it this way,
so maybe that would help some readers here: DO is a utility command, not
a query. That also explains why it does not get parameters.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: leaky views, yet again

From

KaiGai Kohei

Date:

12 October 2010, 06:11:15

I noticed the previous patch has an obvious conflict to the latest
git mater, and it does not have any documentation updates.

So, I rebased the patch, and added descriptions about secure views.
Rest of parts are unchanged.

Thanks,

(2010/10/06 17:01), KaiGai Kohei wrote:
> The attached patch is a revised version of the fix-leaky-view patch.
>
> List of updates:
> - It was rebased to the latest git master.
> - It revised pointer castings in rewriteHandler.c.
> - It added a new regression test: 'security_views'.
> - It added SECURITY VIEW support to pg_dump.
> - It modified the definition of pg_views to show whether it is
>    a security view, or not.
> - It revised psql to show SECURITY VIEW attribute of views in
>    verbose mode.
> - I tried to add a new field to pg_class, but scale of code changes
>    are eventually similar to previous version. So, all I revised was
>    to define StdViewOptions, instead of abuse of StdRdOptions.
>
>
> To avoid confusion, I'd like to make clear what is the matter this
> patch tries to tackle.
> As we have discussed for a long time, an interaction between functions
> with side-effects and query optimization on views may cause information
> leaks, even if owner of the views intend to use them for row-level
> security.
> In this context, we define the term of 'leak' as system can expose
> content of tuples to be invisible unexpectedly, then people can see
> the content directly without any inferences.
>
> Thus, if and when people supplied a function that writes out given
> arguments into other tables on the WHERE clause of views, it should
> be invoked after all the conditions within view definition.
>
>    E.g)  CREATE OR REPLACE FUNCTION f_malicious(text)
>            RETURNS bool COST 0.01 LANGUAGE 'sql'
>            AS 'INSERT INTO my_private VALUES($1); SELECT true;';
>          SELECT * FROM secret_view v WHERE f_malicious(v);
>
> In addition, some of functions (including built-in functions) raise
> an error with message that may contain given arguments. It also allows
> to expose content of invisible tuples (although throughput of the
> channel is relatively slow), if it would be pushed down into inside
> of join-loop.
>
>    E.g)  postgres=# select * from v2;
>           a |  b  | x |  y
>          ---+-----+---+-----
>           2 | bbb | 2 | www
>           4 | eee | 4 | yyy
>          (2 rows)
>
>          postgres=# select * from v2 where y::int = 1;
>          ERROR:  invalid input syntax for integer: "vvv"
>
> However, it is too expensive to prohibit pushing down all the function
> calls into join-loops. So, we need allow to push down a part of harmless
> functions. In this patch, I don't change the logic to decide it.
> Thus, only operators implemented by functions with INTERNALlanguage
> are allowed to push down into SECURITY VIEWs.
>
> On the other hand, we can infer a certain value of hidden tuple with
> iteration of probes. In this scenario, people never see the hidden
> values directly. So, it is not a matter this patch tries to tackle.
>
>    E.g) postgres=# select * from v1;
>          a |  b
>         ---+-----
>          2 | bbb
>          4 | eee
>         (2 rows)
>
>         postgres=# insert into t1 values (3, 'xyz');
>         ERROR:  duplicate key value violates unique constraint "t1_pkey"
>         DETAIL:  Key (a)=(3) already exists.
>         postgres=# select * from v1 where 1/(3-a)>  0;
>         ERROR:  division by zero
>
> As long as I know, commercial RDBMSs with RLS also don't tackle such
> kind of information leaks via side-channels. At least, I don't think
> it is a matter that we should tackle with first priority.
>
> Thanks,
>
> (2010/10/05 18:01), Itagaki Takahiro wrote:
>> 2010/9/6 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> The attached patch tackles both of the above two points, and allows to
>>> control this deoptimization when CREATE SECURITY VIEW is used.
>>
>> I'll send a review for the patch.
>>
>> Contents&   Purpose
>> ==================
>> The patch adds a new SECURITY option for VIEWs. Views defined with
>> CREATE SECURITY
>> VIEW command are not merged with external WHERE-clauses including
>> security-leakable
>> functions in queries. However, since internal functions and operators are not
>> considered as leakable functions, the planner still works well for
>> security views
>> unless we use user-defined functions in the WHERE-clause.
>>
>> Initial Run
>> ===========
>> * Because of the delayed review, the patch has one hunk:
>>      1 out of 5 hunks FAILED -- saving rejects to file
>> src/backend/commands/tablecmds.c.rej
>>     but it is not serious at all, and can be easily fixed.
>>
>> * It can be compiled, but has two warnings:
>> rewriteHandler.c: In function 'MarkFuncExprDepthWalker':
>> rewriteHandler.c:1155: warning: cast from pointer to integer of different size
>> rewriteHandler.c:1227: warning: cast to pointer from integer of different size
>>
>> They should be harmless, but casting intptr_t is often used to avoid warnings:
>>     - L1155: (int) (intptr_t) context;
>>     - L1227: (void *) (intptr_t) (depth + 1)
>>
>> * It passes all regression tests, but doesn't have own new tests.
>>
>> Remaining works
>> ===============
>> The patch might include only core code, but I'll let you know additional
>> sub-works are still required to complete the feature. Also, I've not measured
>> the performance yet, though performance might not be so critical for the patch.
>>
>> * Regression tests and documentation for the feature are required.
>> * pg_dump must support to dump SECURITY VIEWs.
>>     They are dumped as normal views for now.
>> * pg_views can have "security" column.
>> * psql's \dv can show security attributes of views.
>>
>> Questions
>> =========
>>> postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2;
>>>    ==>   We assume operators implemented with built-in functions are safe.
>>>        So, we don't prevent this pushing-down, if the clause does not
>>>        contains something leakable.
>>
>> The term "built-in functions" means functions written in INTERNAL language
>> here. But we also have SQL functions by default. Some of them are just a
>> wrapper to internal functions. I'm not sure the checking of INTERNAL language
>> is the best way for the purpose. Did you compare it with other methods?
>> For example, "oid<   FirstNormalObjectId" looks workable for me.
>>
>>> Instead of modifying the structure of pg_class, this patch stores a flag of
>>> security view on the reloption. So, it needed to modify routines about
>>> reloptions because it is the first case to store reloptions of views.
>>
>> Why did you need to extend StdRdOptions strucuture? Since other fields in
>> the structure are not used by views at all, I think adding a new structure
>>     struct ViewOptions { vl_len, security_view }
>> would be better than extending StdRdOptions.
>>
>
>
>
>
>


--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-fix-leaky-join-view.8.patch

Re: leaky views, yet again

From

Robert Haas

Date:

13 October 2010, 06:29:10

2010/10/12 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> I noticed the previous patch has an obvious conflict to the latest
> git mater, and it does not have any documentation updates.
>
> So, I rebased the patch, and added descriptions about secure views.
> Rest of parts are unchanged.

It seems that we have rough agreement that the existing VIEW feature
is adequate for column filtering but not for row filtering.  The
attack vector is that the planner might reorder quals such that a
value not intended to be visible to the user is passed to a function
which has a side-effect that can expose the value passed to it: either
overtly (as by a user-defined function that writes it to the table or
prints it using RAISE NOTICE) or in some more subtle way (as in the
case where division by zero exposes throws an exception when passed
zero, but not some other value).  With the possible exception of Tom,
everyone seems to agree that it would be a good step forward to
provide a way of plugging these holes, even if it didn't cover subtler
information leaks such as by reading the EXPLAIN output or timing
query execution.

1. Does anyone wish to argue (or continue arguing) that plugging these
more overt information leaks is not worthwhile?

2. Supposing that the answer to question #1 is in the negative, does
anyone wish to argue that this patch as currently written is an
adequate solution to this problem?  It seems obvious to me that it
isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: leaky views, yet again

From

Tom Lane

Date:

13 October 2010, 13:43:19

Robert Haas <robertmhaas@gmail.com> writes:
> With the possible exception of Tom,
> everyone seems to agree that it would be a good step forward to
> provide a way of plugging these holes, even if it didn't cover subtler
> information leaks such as by reading the EXPLAIN output or timing
> query execution.

> 1. Does anyone wish to argue (or continue arguing) that plugging these
> more overt information leaks is not worthwhile?

Yeah, I will.  Plugging an "overt" information leak without plugging
other channels in the same area isn't a security improvement.  It's
merely PR, and rather lame PR at that.  An attacker is not bound to
use only the attack methods you'd like him to.

This would only be a security improvement if there were plausible attack
scenarios in which the attacker would have access to the plugged channel
and not access to the other known channels.  Now, perhaps that's the
case, but no one has put forward an argument showing it.  I think the
burden of proof is on those who favor the patch to put forward that
argument, not for those who don't favor it to try to prove that no such
scenario exists.

> 2. Supposing that the answer to question #1 is in the negative, does
> anyone wish to argue that this patch as currently written is an
> adequate solution to this problem?  It seems obvious to me that it
> isn't.

In that case, one's opinion about #1 hardly matters does it?
        regards, tom lane

Re: leaky views, yet again

From

KaiGai Kohei

Date:

13 October 2010, 14:34:41

(2010/10/13 22:43), Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> With the possible exception of Tom,
>> everyone seems to agree that it would be a good step forward to
>> provide a way of plugging these holes, even if it didn't cover subtler
>> information leaks such as by reading the EXPLAIN output or timing
>> query execution.
> 
>> 1. Does anyone wish to argue (or continue arguing) that plugging these
>> more overt information leaks is not worthwhile?
> 
> Yeah, I will.  Plugging an "overt" information leak without plugging
> other channels in the same area isn't a security improvement.  It's
> merely PR, and rather lame PR at that.  An attacker is not bound to
> use only the attack methods you'd like him to.
> 
It seems to me an extreme opinion, and different from the standard
point of security view.

It is a quotation from the classic of security evaluation criteria.
Trusted Computer System Evaluation Criteria (TCSEC, DoD) says in
the chapter of "A GUIDELINE ON COVERT CHANNELS" as follows:

http://csrc.nist.gov/publications/history/dod85.pdf
| From a security perspective, covert channels with low bandwidths represent a
| lower threat than those with high bandwidths. However, for many types of
| covert channels, techniques used to reduce the bandwidth below a certain rate
| (which depends on the specific channel mechanism and the system architecture)
| also have the effect of degrading the performance provided to legitimate
| system users. Hence, a trade-off between system performance and covert
| channel bandwidth must be made

The "overt" channels has a capability to leak massive invisible information,
so we need to consider them as a serious threat to be fixed up in higher
priority.
However, it is doubtful whether the rest of channels provides enough
bandwidth as actual threat. It also means degree of the threat is
relatively small than the "overt" channels.

Previous security researcher pointed out security is trading-off,
not all-or-nothing. If we can plug most part of the threat with
reasonable performance degrading, it is worthwhile to fix up.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

13 October 2010, 15:26:02

KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
> Previous security researcher pointed out security is trading-off,
> not all-or-nothing. If we can plug most part of the threat with
> reasonable performance degrading, it is worthwhile to fix up.
I had the pleasure of hearing Admiral Grace Hopper[1] speak at an
ACM luncheon once.  When she discussed security, she asserted that
there was no such thing as security which could not be breached. 
The goal of security efforts should not be to make it perfect,
because you can't; any time you convince yourself you have that you
are simply fooling yourself and missing the vulnerabilities.  In her
view the goal was to make the costs of breaching security higher to
the perpetrator than the benefits.  Each obstacle in their way helps
tip the scales in your favor.
-Kevin
http://en.wikipedia.org/wiki/Grace_Hopper

Re: leaky views, yet again

From

Tom Lane

Date:

13 October 2010, 15:46:17

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I had the pleasure of hearing Admiral Grace Hopper[1] speak at an
> ACM luncheon once.  When she discussed security, she asserted that
> there was no such thing as security which could not be breached. 
> The goal of security efforts should not be to make it perfect,
> because you can't; any time you convince yourself you have that you
> are simply fooling yourself and missing the vulnerabilities.  In her
> view the goal was to make the costs of breaching security higher to
> the perpetrator than the benefits.  Each obstacle in their way helps
> tip the scales in your favor.

That's all true, but you have to consider how much the obstacle actually
gets in their way versus how painful it is on your end to create and
maintain the obstacle.  I don't think this proposed patch measures up
very well on either end of that tradeoff.
        regards, tom lane

Re: leaky views, yet again

From

Robert Haas

Date:

13 October 2010, 16:15:47

On Wed, Oct 13, 2010 at 9:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> With the possible exception of Tom,
>> everyone seems to agree that it would be a good step forward to
>> provide a way of plugging these holes, even if it didn't cover subtler
>> information leaks such as by reading the EXPLAIN output or timing
>> query execution.
>
>> 1. Does anyone wish to argue (or continue arguing) that plugging these
>> more overt information leaks is not worthwhile?
>
> Yeah, I will.  Plugging an "overt" information leak without plugging
> other channels in the same area isn't a security improvement.It's
> merely PR, and rather lame PR at that.  An attacker is not bound to
> use only the attack methods you'd like him to.

You may as well argue that I shouldn't bother locking the doors to my
house because a determined attacker could simply break the windows.
They certainly could, and yet I am altogether convinced that a habit
of locking my doors when I am away reduces the chances that my house
will be burgled.  Breaking the windows would be altogether more
obvious and more likely to be result in the police being summoned.
Locking the doors won't protect me against someone who is bound and
determined to rob my house specifically, but it provides fairly good
protection against someone who walks around the neighborhood and robs
each unlocked house, which is not an unrealistic threat model.  But
let us suppose that I went a step further and purchased the best
burglar alarm money can buy, reinforced steel doors, and an expensive
alarm monitoring service.  Further, let us suppose that I retained a
24x7 armed guard.  This would likely be a waste of money because
there's not a whole lot in my house worth stealing (and if there were
I wouldn't post the details to a public mailing list) but let us
suppose that I did it anyway.  I would now be about as secure against
burglary as one can hope to be, and yet I'm still pretty sure that the
CIA could manage to clandestinely remove something from my home
against my will, were they of a mind to do so.

In other words - true, an attacker isn't bound to use only the attack
methods I'd like him to - but on the other hand, I'm not bound to care
about protecting myself against every type of attack.  I set a goal
for what level of security I'd like to achieve, and then I try to meet
it.  You seem to believe that being able to infer the total size of a
table or the frequency of some particular key in the table is
equivalent to being able to trivially read every row of it.  That
seems off the wall to me, and I'd like to see you justify that
position.  I and others have already posted examples of situations
where this is not the case, such as my example of allowing sales reps
to view only their own customers.  I believe Kevin has posted similar
examples: the number of cases is not a secret, but details of
individual ones may be.  These cases are taken from real business
situations and I don't understand, what, if anything, you find
implausible about them.

http://archives.postgresql.org/pgsql-hackers/2010-10/msg00299.php

> This would only be a security improvement if there were plausible attack
> scenarios in which the attacker would have access to the plugged channel
> and not access to the other known channels.  Now, perhaps that's the
> case, but no one has put forward an argument showing it.  I think the
> burden of proof is on those who favor the patch to put forward that
> argument, not for those who don't favor it to try to prove that no such
> scenario exists.

I don't favor this particular patch, but I think you're the only
person arguing that there is no subset of this problem which is both
tractable and useful.  As far as I can tell, Heikki, KaiGai, Stephen
Frost, Kevin Grittner, and myself are all on approximately the same
page about where a meaningful dividing line can be drawn.  The
question is not really whether the attacker would have access to the
un-plugged channels but how much and what type of information they
actually leak.  AIUI, the vectors that the proposed approach doesn't
block are basically EXPLAIN and query response times.  What can you
infer from those?  AFAICS, you're not going to be able to do better
than inferring whether a given value is an MCV, which is of a totally
different order than the wholesale data leakage which can be trivially
created using today's system.  And even that can be made much harder
by blocking EXPLAIN, which can be done quite easily using
ProcessUtility_hook, and therefore doesn't need to be addressed by the
patch.  But even suppose someone can reliably infer MCVs.  It does not
follow that because I'm worried about someone enumerating the contents
of my entire table that I am also worried about them learning the MCVs
of those columns which have non-unique indices.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: leaky views, yet again

From

Robert Haas

Date:

13 October 2010, 16:18:27

On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> I had the pleasure of hearing Admiral Grace Hopper[1] speak at an
>> ACM luncheon once.  When she discussed security, she asserted that
>> there was no such thing as security which could not be breached.
>> The goal of security efforts should not be to make it perfect,
>> because you can't; any time you convince yourself you have that you
>> are simply fooling yourself and missing the vulnerabilities.  In her
>> view the goal was to make the costs of breaching security higher to
>> the perpetrator than the benefits.  Each obstacle in their way helps
>> tip the scales in your favor.
>
> That's all true, but you have to consider how much the obstacle actually
> gets in their way versus how painful it is on your end to create and
> maintain the obstacle.  I don't think this proposed patch measures up
> very well on either end of that tradeoff.

I think it would behoove us to try to separate concerns about this
particular patch from concerns about the viability of the whole
approach.  Whether or not it's useful to do X is a different question
than whether it can be done with few enough lines of code and/or
whether this patch actually does it well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: leaky views, yet again

From

Tom Lane

Date:

13 October 2010, 17:03:58

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That's all true, but you have to consider how much the obstacle actually
>> gets in their way versus how painful it is on your end to create and
>> maintain the obstacle. �I don't think this proposed patch measures up
>> very well on either end of that tradeoff.

> I think it would behoove us to try to separate concerns about this
> particular patch from concerns about the viability of the whole
> approach.  Whether or not it's useful to do X is a different question
> than whether it can be done with few enough lines of code and/or
> whether this patch actually does it well.

I think I left the wrong impression: I'm concerned about the whole
approach.  I haven't even read this particular patch lately.  I think
that trying to guarantee that the planner applies independent
constraints in a particular order will be expensive, fragile, and prone
to recurring security bugs no matter how it's implemented --- unless you
do it by lobotomizing query pullup/pushdown, which seems unacceptable
from a performance standpoint.

Just to give one example of what this patch misses (probably; as I said
I haven't read it lately), what about selectivity estimation?  One of
the things we like to do when we have an otherwise-unknown function is
to try it on all the histogram members and see what percentage yield
true.  That might already represent enough of an information leak to be
objectionable ... and yet, if we don't do it, the consequences for
performance could be pretty horrid because we'll have to work without
any reasonable selectivity estimate at all.  There are cases where this
technique isn't applied at the moment but probably should be, which is
why I characterize the leak-prevention idea as creating future security
issues: doing that is a constraint that will have to be accounted for in
every future planner change, and we are certain to miss the implications
sometimes.
        regards, tom lane

Re: leaky views, yet again

From

Tom Lane

Date:

13 October 2010, 18:39:37

Robert Haas <robertmhaas@gmail.com> writes:
> You seem to believe that being able to infer the total size of a
> table or the frequency of some particular key in the table is
> equivalent to being able to trivially read every row of it.

I don't say that they're equivalent.  I do say that what this patch is
mostly trying to do is solve a PR problem, and from the PR standpoint
it doesn't help: the "OMG Postgres exposes my information" crowd is not
going to distinguish leaks that only expose MCVs from those that
trivially allow sucking out the entire table.  There are furthermore
plenty of situations where statistical information *is* of interest to
attackers; the traditional example is obtaining the min and max of a
salary column to infer something about what particular people are
getting paid.  So I think if we accept this patch or something like it,
we are going to spend a large part of the next ten years trying to close
other holes of the same ilk, and that's not a development plan I'm
willing to buy into.  I am much happier just making the statement that
we don't try to prevent that type of leak than giving people the
impression that we are committed to trying to prevent it.
        regards, tom lane

Re: leaky views, yet again

From

"Kevin Grittner"

Date:

14 October 2010, 04:09:42

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> the "OMG Postgres exposes my information" crowd is not going to
> distinguish leaks that only expose MCVs from those that trivially
> allow sucking out the entire table.
Well, I'd be in the crowd that would go "OMG" over one but not the
other.  At least in our case management software I can't think of
any MCVs which would be a problem, while exposing entire tables
would be a big problem.
If you get the name, address, birth date, or even the social
security number in isolation, it doesn't mean much.  If you get all
of those for one party, it does.  I suppose that if you could find
that a particular name was used somewhere in the Party table but it
was not visible in the public record, you could guess that someone
by that name (which is certainly not guaranteed to be unique!) was
somehow involved in some role in a juvenile, mental commitment,
adoption, sealed, or other confidential case -- but what role in
what kind of case would still be a complete mystery, making it much
less of a leak than the row in its entirety, much less the entire
table (which could expose, for example, who adopted whom --
information not available from a single row).
If you are arguing that the ability of someone to know that someone,
somewhere, who has had contact with the Wisconsin court system has
social security number 987-65-4321 is the same as knowing who has
that social security number, and all the demographics about that
person, you're dangerously mistaken.
-Kevin

Re: leaky views, yet again

From

KaiGai Kohei

Date:

18 October 2010, 09:02:47

(2010/10/14 1:52), Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>> That's all true, but you have to consider how much the obstacle actually
>>> gets in their way versus how painful it is on your end to create and
>>> maintain the obstacle. �I don't think this proposed patch measures up
>>> very well on either end of that tradeoff.
>
>> I think it would behoove us to try to separate concerns about this
>> particular patch from concerns about the viability of the whole
>> approach.  Whether or not it's useful to do X is a different question
>> than whether it can be done with few enough lines of code and/or
>> whether this patch actually does it well.

+1.
If the patch I proposed is not enough elegant to commit immediately,
we can discuss how we can get the patch fixing the problem commitable.

> I think I left the wrong impression: I'm concerned about the whole
> approach.  I haven't even read this particular patch lately.  I think
> that trying to guarantee that the planner applies independent
> constraints in a particular order will be expensive, fragile, and prone
> to recurring security bugs no matter how it's implemented --- unless you
> do it by lobotomizing query pullup/pushdown, which seems unacceptable
> from a performance standpoint.
>
It can be controllable by users. Unless specifying "SECURITY VIEW"
flag, it does not restrain a part of optimizations that allows to
execute functions which may leak the supplied arguments unexpectedly
earlier than the functions in deeper level.

Also, it does not ignore performance point of view. Even if we apply
this patch, a part of functions that we believe they don't leak any
supplied arguments can be pushed down inside of the join-loop.
Eventually, these invocations shall be utilized for index-scanning.
The reason why I, Robert and Heikki had a discussion what type of
functions should be allowed to push down is to decide a principle
to bypass this security barrier.

> Just to give one example of what this patch misses (probably; as I said
> I haven't read it lately), what about selectivity estimation?  One of
> the things we like to do when we have an otherwise-unknown function is
> to try it on all the histogram members and see what percentage yield
> true.  That might already represent enough of an information leak to be
> objectionable ... and yet, if we don't do it, the consequences for
> performance could be pretty horrid because we'll have to work without
> any reasonable selectivity estimate at all.

It is a bit unclear for me where is the point of your concerns.
If user wants to generate a histogram from result set of a view that
filters tuples to be invisible, it should just generate the histogram
from the filtered data set.
Even if possible malicious functions are appended to WHERE clause,
the optimizer does not execute them prior to deeper level functions,
as long as is has "SECURITY VIEW" flag.
Of course, here is a performance trade-off, as earlier researcher
pointed out, but user can inform on which does he give higher priority.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

18 October 2010, 13:11:26

On Mon, Oct 18, 2010 at 5:02 AM, KaiGai Kohei <kaigai@ak.jp.nec.com> wrote:
>> Just to give one example of what this patch misses (probably; as I said
>> I haven't read it lately), what about selectivity estimation?  One of
>> the things we like to do when we have an otherwise-unknown function is
>> to try it on all the histogram members and see what percentage yield
>> true.  That might already represent enough of an information leak to be
>> objectionable ... and yet, if we don't do it, the consequences for
>> performance could be pretty horrid because we'll have to work without
>> any reasonable selectivity estimate at all.
>
> It is a bit unclear for me where is the point of your concerns.
> If user wants to generate a histogram from result set of a view that
> filters tuples to be invisible, it should just generate the histogram
> from the filtered data set.
> Even if possible malicious functions are appended to WHERE clause,
> the optimizer does not execute them prior to deeper level functions,
> as long as is has "SECURITY VIEW" flag.
> Of course, here is a performance trade-off, as earlier researcher
> pointed out, but user can inform on which does he give higher priority.

You need to go back and reread Tom's email until you understand what
he's complaining about, because it's a very serious problem.  I hope
that there is a way around it, because we're not going to be able to
implement any form of row-level security - EVER - unless we figure out
how to address it.  I've been mulling it over a bit and so far I'm
stumped (which is why I haven't replied).  Unfortunately, I don't have
any more time to devote to this right now, so I haven't studied the
code in detail, but at the moment I'd say we're dead in the water.

I am going to mark this patch Returned with Feedback.  Even were the
issue raised by Tom not a problem, it's pretty clear that this patch
as written is still going to allow an unacceptable amount of
information leakage, and wouldn't be committable anyway.  But the
problem Tom raises is so severe that there's no point in writing any
more code unless and until we have a good idea what we're going to do
about it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: leaky views, yet again

From

KaiGai Kohei

Date:

19 October 2010, 08:34:54

(2010/10/14 1:52), Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>> That's all true, but you have to consider how much the obstacle actually
>>> gets in their way versus how painful it is on your end to create and
>>> maintain the obstacle. �I don't think this proposed patch measures up
>>> very well on either end of that tradeoff.
>
>> I think it would behoove us to try to separate concerns about this
>> particular patch from concerns about the viability of the whole
>> approach.  Whether or not it's useful to do X is a different question
>> than whether it can be done with few enough lines of code and/or
>> whether this patch actually does it well.
>
> I think I left the wrong impression: I'm concerned about the whole
> approach.  I haven't even read this particular patch lately.  I think
> that trying to guarantee that the planner applies independent
> constraints in a particular order will be expensive, fragile, and prone
> to recurring security bugs no matter how it's implemented --- unless you
> do it by lobotomizing query pullup/pushdown, which seems unacceptable
> from a performance standpoint.
>
> Just to give one example of what this patch misses (probably; as I said
> I haven't read it lately), what about selectivity estimation?  One of
> the things we like to do when we have an otherwise-unknown function is
> to try it on all the histogram members and see what percentage yield
> true.  That might already represent enough of an information leak to be
> objectionable ... and yet, if we don't do it, the consequences for
> performance could be pretty horrid because we'll have to work without
> any reasonable selectivity estimate at all.  There are cases where this
> technique isn't applied at the moment but probably should be, which is
> why I characterize the leak-prevention idea as creating future security
> issues: doing that is a constraint that will have to be accounted for in
> every future planner change, and we are certain to miss the implications
> sometimes.
>
Sorry, I might misread what you pointed out.

Do you see oprrest/oprjoin being internally invoked as a problem?
Well, I also think it is a problem, as long as they can be installed
by non-superusers. But, it is a separated problem from the row-level
security issues.

In my opinion, origin of the matter is incorrect checks on creation
of operators. It allows non-superusers to define a new operator with
restriction and join estimator as long as he has EXECUTE privilege
on the functions. (not EXECUTE privilege of actual user who invokes
this function on estimation phase!)
Then, the optimizer may internally invoke these functions without
any privilege checks on either the function or the table to be
estimated. If a malicious one tries to make other users to invoke
a trojan-horse, he can define a trappy operator with malicious
estimator functions, cannot it?

I think it is a situation that we should check superuser privilege
which means the specified functions have no malicious intention
and equivalent to the privilege to grant 'EXECUTE' to public.

Here is similar problem at conversion functions.
If malicious one want to install a fake conversion function that
records all the stream between server and client, it seems to me
not impossible.

Well, I'd like to suggest to revise the specifications of permission
checks on these commands. If we can ensure these functions are not
malicious, no need to care about information leaks via untrusted
functions internally used.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: leaky views, yet again

From

Robert Haas

Date:

19 October 2010, 12:30:54

On Oct 19, 2010, at 4:34 AM, KaiGai Kohei <kaigai@ak.jp.nec.com> wrote:
> (2010/10/14 1:52), Tom Lane wrote:
>> Robert Haas<robertmhaas@gmail.com>  writes:
>>> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>>> That's all true, but you have to consider how much the obstacle actually
>>>> gets in their way versus how painful it is on your end to create and
>>>> maintain the obstacle. �I don't think this proposed patch measures up
>>>> very well on either end of that tradeoff.
>> 
>>> I think it would behoove us to try to separate concerns about this
>>> particular patch from concerns about the viability of the whole
>>> approach.  Whether or not it's useful to do X is a different question
>>> than whether it can be done with few enough lines of code and/or
>>> whether this patch actually does it well.
>> 
>> I think I left the wrong impression: I'm concerned about the whole
>> approach.  I haven't even read this particular patch lately.  I think
>> that trying to guarantee that the planner applies independent
>> constraints in a particular order will be expensive, fragile, and prone
>> to recurring security bugs no matter how it's implemented --- unless you
>> do it by lobotomizing query pullup/pushdown, which seems unacceptable
>> from a performance standpoint.
>> 
>> Just to give one example of what this patch misses (probably; as I said
>> I haven't read it lately), what about selectivity estimation?  One of
>> the things we like to do when we have an otherwise-unknown function is
>> to try it on all the histogram members and see what percentage yield
>> true.  That might already represent enough of an information leak to be
>> objectionable ... and yet, if we don't do it, the consequences for
>> performance could be pretty horrid because we'll have to work without
>> any reasonable selectivity estimate at all.  There are cases where this
>> technique isn't applied at the moment but probably should be, which is
>> why I characterize the leak-prevention idea as creating future security
>> issues: doing that is a constraint that will have to be accounted for in
>> every future planner change, and we are certain to miss the implications
>> sometimes.
>> 
> Sorry, I might misread what you pointed out.

I think you're still misreading it. Want to try a third time?

> Do you see oprrest/oprjoin being internally invoked as a problem?
> Well, I also think it is a problem, as long as they can be installed
> by non-superusers. But, it is a separated problem from the row-level
> security issues.
> 
> In my opinion, origin of the matter is incorrect checks on creation
> of operators. It allows non-superusers to define a new operator with
> restriction and join estimator as long as he has EXECUTE privilege
> on the functions. (not EXECUTE privilege of actual user who invokes
> this function on estimation phase!)
> Then, the optimizer may internally invoke these functions without
> any privilege checks on either the function or the table to be
> estimated. If a malicious one tries to make other users to invoke
> a trojan-horse, he can define a trappy operator with malicious
> estimator functions, cannot it?

This is a pretty poor excuse for a Trojan horse attack.

> 

...Robert

Re: leaky views, yet again

From

KaiGai Kohei

Date:

19 October 2010, 13:26:23

(2010/10/19 21:31), Robert Haas wrote:
> On Oct 19, 2010, at 4:34 AM, KaiGai Kohei<kaigai@ak.jp.nec.com>  wrote:
>> (2010/10/14 1:52), Tom Lane wrote:
>>> Robert Haas<robertmhaas@gmail.com>   writes:
>>>> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane<tgl@sss.pgh.pa.us>   wrote:
>>>>> That's all true, but you have to consider how much the obstacle actually
>>>>> gets in their way versus how painful it is on your end to create and
>>>>> maintain the obstacle. �I don't think this proposed patch measures up
>>>>> very well on either end of that tradeoff.
>>>
>>>> I think it would behoove us to try to separate concerns about this
>>>> particular patch from concerns about the viability of the whole
>>>> approach.  Whether or not it's useful to do X is a different question
>>>> than whether it can be done with few enough lines of code and/or
>>>> whether this patch actually does it well.
>>>
>>> I think I left the wrong impression: I'm concerned about the whole
>>> approach.  I haven't even read this particular patch lately.  I think
>>> that trying to guarantee that the planner applies independent
>>> constraints in a particular order will be expensive, fragile, and prone
>>> to recurring security bugs no matter how it's implemented --- unless you
>>> do it by lobotomizing query pullup/pushdown, which seems unacceptable
>>> from a performance standpoint.
>>>
>>> Just to give one example of what this patch misses (probably; as I said
>>> I haven't read it lately), what about selectivity estimation?  One of
>>> the things we like to do when we have an otherwise-unknown function is
>>> to try it on all the histogram members and see what percentage yield
>>> true.  That might already represent enough of an information leak to be
>>> objectionable ... and yet, if we don't do it, the consequences for
>>> performance could be pretty horrid because we'll have to work without
>>> any reasonable selectivity estimate at all.  There are cases where this
>>> technique isn't applied at the moment but probably should be, which is
>>> why I characterize the leak-prevention idea as creating future security
>>> issues: doing that is a constraint that will have to be accounted for in
>>> every future planner change, and we are certain to miss the implications
>>> sometimes.
>>>
>> Sorry, I might misread what you pointed out.
>
> I think you're still misreading it.

Hmm. In my understanding, it seems to me he concerned about possible leaky
estimate functions, so he mentioned the horrible performance degrading, if
we don't allow to execute these functions.
So, I suggested an idea that enforces all estimate functions being installed
by superusers; it enables us to assume they are enough safe.

> Want to try a third time?

However, actually, it is still unclear for me... :-(

>> Do you see oprrest/oprjoin being internally invoked as a problem?
>> Well, I also think it is a problem, as long as they can be installed
>> by non-superusers. But, it is a separated problem from the row-level
>> security issues.
>>
>> In my opinion, origin of the matter is incorrect checks on creation
>> of operators. It allows non-superusers to define a new operator with
>> restriction and join estimator as long as he has EXECUTE privilege
>> on the functions. (not EXECUTE privilege of actual user who invokes
>> this function on estimation phase!)
>> Then, the optimizer may internally invoke these functions without
>> any privilege checks on either the function or the table to be
>> estimated. If a malicious one tries to make other users to invoke
>> a trojan-horse, he can define a trappy operator with malicious
>> estimator functions, cannot it?
>
> This is a pretty poor excuse for a Trojan horse attack.
>
>>
>
> ...Robert
>


-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: leaky views, yet again

From

Tom Lane

Date:

19 October 2010, 14:41:28

KaiGai Kohei <kaigai@kaigai.gr.jp> writes:
> (2010/10/19 21:31), Robert Haas wrote:
>> I think you're still misreading it.

> Hmm. In my understanding, it seems to me he concerned about possible leaky
> estimate functions, so he mentioned the horrible performance degrading, if
> we don't allow to execute these functions.

No, it wasn't about estimator functions at all, it was about what we do
in the absence of an estimator.

> So, I suggested an idea that enforces all estimate functions being installed
> by superusers; it enables us to assume they are enough safe.

Even if we were willing to accept a limitation as stringent as that,
preventing people from installing estimators is hardly going to cure
the problem.
        regards, tom lane

Re: leaky views, yet again

From

Robert Haas

Date:

20 October 2010, 01:36:47

On Wed, Oct 13, 2010 at 12:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Oct 13, 2010 at 11:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> That's all true, but you have to consider how much the obstacle actually
>>> gets in their way versus how painful it is on your end to create and
>>> maintain the obstacle.  I don't think this proposed patch measures up
>>> very well on either end of that tradeoff.
>
>> I think it would behoove us to try to separate concerns about this
>> particular patch from concerns about the viability of the whole
>> approach.  Whether or not it's useful to do X is a different question
>> than whether it can be done with few enough lines of code and/or
>> whether this patch actually does it well.
>
> I think I left the wrong impression: I'm concerned about the whole
> approach.  I haven't even read this particular patch lately.  I think
> that trying to guarantee that the planner applies independent
> constraints in a particular order will be expensive, fragile, and prone
> to recurring security bugs no matter how it's implemented --- unless you
> do it by lobotomizing query pullup/pushdown, which seems unacceptable
> from a performance standpoint.
>
> Just to give one example of what this patch misses (probably; as I said
> I haven't read it lately), what about selectivity estimation?  One of
> the things we like to do when we have an otherwise-unknown function is
> to try it on all the histogram members and see what percentage yield
> true.  That might already represent enough of an information leak to be
> objectionable ... and yet, if we don't do it, the consequences for
> performance could be pretty horrid because we'll have to work without
> any reasonable selectivity estimate at all.  There are cases where this
> technique isn't applied at the moment but probably should be, which is
> why I characterize the leak-prevention idea as creating future security
> issues: doing that is a constraint that will have to be accounted for in
> every future planner change, and we are certain to miss the implications
> sometimes.

My concern here is that I think row-level security is important to the
future of PostgreSQL.  Much of the mind-share that we have comes from
being feature-rich, and there seems to be no shortage of people who
are looking for row-level security.  Only some of those are
specifically interested in SE-Linux integration; only some of those
are interested in using views as an RLS mechanism; but from a high
level RLS seems to be one of those things that just keeps bubbling
back up, and I think we need to find a way to have it.  Unfortunately,
I get the impression that you think that there's a problem not only
with the approach but with any approach whatsoever to that underlying
problem.  I actually can't think of a mechanism for attacking RLS that
doesn't involve some kind of planner hackery along the lines discussed
here, but if you have a better idea (other than giving up) I'm all
ears.

With respect to selectivity estimation, do we have a live bug there
now?  I'm not exactly sure in what case we do this, but I observe that
I can do EXPLAIN on a query containing a function that I don't
actually have permission to call.  So if the planner might go off and
call that function on some data (even data that I *am* allowed to
see), that seems bad.  After all the function might be
initiate_self_destruct_sequence().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: leaky views, yet again

From

Tom Lane

Date:

20 October 2010, 14:01:03

Robert Haas <robertmhaas@gmail.com> writes:
> I get the impression that you think that there's a problem not only
> with the approach but with any approach whatsoever to that underlying
> problem.

Let's just say that the approaches proposed so far have performance
and/or functionality and/or code maintenance penalties that are utterly
unacceptable from the standpoint of people who don't need RLS.  I don't
know if there is a workable solution, but I do know I've not seen one.

> With respect to selectivity estimation, do we have a live bug there
> now?

No, I don't believe so.  Given that you'd like to get the planner to
call function XYZ, you could create an operator using XYZ and attach to
it one of the estimation functions that will actually call the
underlying function --- but you have to have call permission on the
function in order to create the operator.
        regards, tom lane

Re: leaky views, yet again

From

Robert Haas

Date:

20 October 2010, 14:50:29

On Wed, Oct 20, 2010 at 10:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I get the impression that you think that there's a problem not only
>> with the approach but with any approach whatsoever to that underlying
>> problem.
>
> Let's just say that the approaches proposed so far have performance
> and/or functionality and/or code maintenance penalties that are utterly
> unacceptable from the standpoint of people who don't need RLS.  I don't
> know if there is a workable solution, but I do know I've not seen one.
>
>> With respect to selectivity estimation, do we have a live bug there
>> now?
>
> No, I don't believe so.  Given that you'd like to get the planner to
> call function XYZ, you could create an operator using XYZ and attach to
> it one of the estimation functions that will actually call the
> underlying function --- but you have to have call permission on the
> function in order to create the operator.

But suppose the operator already exists, but I don't have permission
to call the underlying function... can I get the planner to call the
function for me by EXPLAIN-ing the right query?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company