Re: [v9.2] Fix Leaky View Problem - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [v9.2] Fix Leaky View Problem |
Date | |
Msg-id | CA+TgmoaPryBCe0AbEG1s2HNA++oDw_twNDJgxHYPF+LbQ87azA@mail.gmail.com Whole thread Raw |
In response to | Re: [v9.2] Fix Leaky View Problem (Kohei KaiGai <kaigai@kaigai.gr.jp>) |
Responses |
Re: [v9.2] Fix Leaky View Problem
|
List | pgsql-hackers |
On Sun, Oct 9, 2011 at 11:50 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > I tried to refactor the patches based on the interface of WITH (...) > and usage of > pg_class.reloptions, although here is no functionality changes; including the > behavior when a view is replaced. > > My preference is WITH (...) interface, however, it is not a strong one. > So, I hope either of versions being reviewed. I spent some more time looking at this, and I guess I'm pretty unsold on the whole approach. In the part 2 patch, for example, we're doing this: +static bool +mark_qualifiers_depth_walker(Node *node, void *context) +{ + int depth = *((int *)(context)); + + if (node == NULL) + return false; + + if (IsA(node, FuncExpr)) + { + ((FuncExpr *)node)->depth = depth; + } + else if (IsA(node, OpExpr)) + { + ((OpExpr *)node)->depth = depth; + } + else if (IsA(node, DistinctExpr)) + { + ((DistinctExpr *)node)->depth = depth; + } + else if (IsA(node, ScalarArrayOpExpr)) + { + ((ScalarArrayOpExpr *)node)->depth = depth; + } + else if (IsA(node, CoerceViaIO)) + { + ((CoerceViaIO *)node)->depth = depth; + } + else if (IsA(node, ArrayCoerceExpr)) + { + ((ArrayCoerceExpr *)node)->depth = depth; + } + else if (IsA(node, NullIfExpr)) + { + ((NullIfExpr *)node)->depth = depth; + } + else if (IsA(node, RowCompareExpr)) + { + ((RowCompareExpr *)node)->depth = depth; + } + return expression_tree_walker(node, mark_qualifiers_depth_walker, context); +} It seems really ugly to me to suppose that we need to add a depth field to every single one of these node types. If you've missed one, then we have a security hole. If someone else adds another node type later that requires this field and doesn't add it, we have a security hole. And since all of these depth fields are going to make their way into stored rules, those security holes will require an initdb to fix.Ouch! And what happens if the security view becomesa non-security view or visca versa? Now all of those stored depth fields are out of date. Maybe you can argue that we can just patch that up when we reload them, but that seems to me to miss the point. If the data in a stored rule can get out of date, then it shouldn't be stored there in the first place. Tom may have a better feeling on this than I do, but my gut feeling here is that this whole approach is letting the cat out of the bag and then trying to stuff it back in. I don't think that's going to be very reliable, and more than that, I don't like our chances of having confidence in its reliability. I feel like the heart of what we're doing here ought to be preventing the subquery from getting flattened.For example: rhaas=# create table secret (a int, b text); CREATE TABLE rhaas=# insert into secret select g, random()::text||random()::text from generate_series(1,10000) g; INSERT 0 10000 rhaas=# create view window_on_secret as select * from secret where a = 1; CREATE VIEW rhaas=# create table leak (a int, b text); CREATE TABLE rhaas=# create or replace function snarf(a int, b text) returns boolean as $$begin insert into leak values ($1, $2); return true; end$$ language plpgsql cost 0.00000000000000000000000000000000000000001; CREATE FUNCTION rhaas=# explain analyze select * from window_on_secret; QUERY PLAN ---------------------------------------------------------------------------------------------------Seq Scan on secret (cost=0.00..209.00rows=1 width=39) (actual time=0.022..2.758 rows=1 loops=1) Filter: (a = 1) Rows Removed by Filter: 9999Total runtime: 2.847 ms (4 rows) rhaas=# select * from leak;a | b ---+--- (0 rows) rhaas=# explain analyze select * from window_on_secret where snarf(a, b); QUERYPLAN -----------------------------------------------------------------------------------------------------Seq Scan on secret (cost=0.00..209.00 rows=1 width=39) (actual time=0.671..126.521 rows=1 loops=1) Filter: (snarf(a, b) AND (a = 1)) Rows Removed by Filter: 9999Total runtime: 126.565ms (4 rows) Woops! I've stolen the whole table. But look what happens when I change the definition of window_on_secret so that it can't be flattened: rhaas=# truncate leak; TRUNCATE TABLE rhaas=# create or replace view window_on_secret as select * from secret where a = 1 limit 1000000000; CREATE VIEW rhaas=# explain analyze select * from window_on_secret where snarf(a, b); QUERY PLAN ------------------------------------------------------------------------------------------------------------------Subquery Scanon window_on_secret (cost=0.00..209.01 rows=1 width=39) (actual time=0.320..2.348 rows=1 loops=1) Filter: snarf(window_on_secret.a, window_on_secret.b) -> Limit (cost=0.00..209.00rows=1 width=39) (actual time=0.016..2.043 rows=1 loops=1) -> Seq Scan on secret (cost=0.00..209.00 rows=1 width=39) (actual time=0.014..2.034 rows=1 loops=1) Filter: (a = 1) Rows Removed by Filter: 9999Total runtime:2.434 ms (7 rows) rhaas=# select * from leak;a | b ---+----------------------------------1 | 0.60352857504040.928101760800928 (1 row) If we make security views work like this, then we don't need to have one mechanism to sort quals by depth and another to prevent them from being pushed down through joins. It all just works. Now, there is one problem: if snarf() were a non-leaky function rather than a maliciously crafted one, it still wouldn't get pushed down: rhaas=# explain analyze select * from window_on_secret where b = 'not so hot'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------Subquery Scanon window_on_secret (cost=0.00..209.01 rows=1 width=39) (actual time=2.080..2.080 rows=0 loops=1) Filter: (window_on_secret.b = 'not so hot'::text) Rows Removed by Filter:1 -> Limit (cost=0.00..209.00 rows=1 width=39) (actual time=0.014..2.077 rows=1 loops=1) -> Seq Scan on secret (cost=0.00..209.00 rows=1 width=39) (actual time=0.013..2.075 rows=1 loops=1) Filter: (a = 1) Rows Removed by Filter: 9999Total runtime:2.131 ms (8 rows) I don't have a clear idea what to do about that, and maybe it's an intractable problem, but I feel like once we've flattened the subquery, it's a whole lot harder to prevent bad stuff from happening, because now the bits that started inside the security view are all over the place. Somebody previously raised the issue of what happen when there are multiple security views involved in the same query. I don't see how the depth-based approach can possibly deal with that case correctly. Let's suppose that in the test case above, window_on_secret was created as a security view. Now, the bad guy comes along and creates a security view that uses with some maliciously crafted qual, and then joins that view against window_on_secret. IIUC, the quals from both views are going to have depth = 1, so from the planner's point of view it ought to be OK to interchange them - which it's not. Now, in the normal course of events it won't matter, because the quals in window_on_secret are going to apply to the "secret" table, and the quals in the other view are going to apply only to whatever view that table ranges over. But there's already at least one case in which that might not be true: if the equivalence class machinery throws a constant into the same bucket as a column in window_on_secret, it will feel free to add a qual comparing that value to the associated constant using the appropriate operator, and that qual could then (presumably) get reordered with the qual from the security view. I'm not 100% sure that it's possible to construct a security breach this way, but I'm definitely not 100% sure that it isn't. And future changes to the way we make deductions based on equivalence classes (like deducing implied equalities, something that's been requested more than once) could open up further possibilities for mischief. We could maybe hunt all of those down but it seems rather error-prone to me, and not very future-proof. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: