Re: leaky views, yet again - Mailing list pgsql-hackers
From | KaiGai Kohei |
---|---|
Subject | Re: leaky views, yet again |
Date | |
Msg-id | 4C84735E.4070604@ak.jp.nec.com Whole thread Raw |
In response to | Re: leaky views, yet again (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: leaky views, yet again
|
List | pgsql-hackers |
(2010/09/02 11:57), Robert Haas wrote: > 2010/9/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> Right now, it stands on a strict assumption that considers operators >> implemented with built-in functions are safe; it does not have no >> possibility to leak supplied arguments anywhere. >> >> Please note that this patch does not case about a case when >> a function inside a view and a function outside a view are >> distributed into same level and the later function has lower >> cost value. > > Without making some attempt to address these two points, I don't see > the point of this patch. > > Also, I believe we decided previously do this deoptimization only in > case the user requests it with CREATE SECURITY VIEW. > The attached patch tackles both of the above two points, and allows to control this deoptimization when CREATE SECURITY VIEW is used. Example --------- CREATE FUNCTION f_policy(int) RETURNS bool LANGUAGE 'plpgsql' AS 'begin return $1 % 2 = 0; end'; CREATE FUNCTION f_malicious(text) RETURNS bool COST 0.0001 LANGUAGE 'plpgsql' AS 'begin raise notice ''leak: %'', $1; return true; end'; CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE f_policy(x); CREATE OR REPLACE VIEW v2 AS SELECT * FROM t1 WHERE f_policy(a); In this case, we don't want to invoke f_malicious() with argument come from the tuples which should be filtered inside of the views, even if people provides it on the WHERE clause. Without this patch ==================== postgres=# EXPLAIN select * from v1 where f_malicious(b); QUERY PLAN ------------------------------------------------------------------------- Nested Loop (cost=0.00..287.63 rows=410 width=72) -> Seq Scan on t1 (cost=0.00..22.30 rows=410 width=36) Filter: f_malicious(b) -> Index Scan using t2_pkey on t2 (cost=0.00..0.63 rows=1 width=36) Index Cond: (x = t1.a) Filter: f_policy(x) (6 rows) ==> f_policy(x) was chained inside of the join loop, so it enables to something to be filtered out. postgres=# select * from v1 where f_malicious(b); NOTICE: leak: aaa NOTICE: leak: bbb NOTICE: leak: ccc NOTICE: leak: ddd a | b | x | y ---+-----+---+----- 2 | bbb | 2 | xxx 4 | ddd | 4 | zzz (2 rows) postgres=# EXPLAIN select * from v2 where f_malicious(b); QUERY PLAN ------------------------------------------------------- Seq Scan on t1 (cost=0.00..329.80 rows=137 width=36) Filter: (f_malicious(b) AND f_policy(a)) (2 rows) ==> The view of v2 is pulled up to the upper level, because this subquery is enough simple. In this case, f_malicious() and f_policy() are handled in a same level, and ordered by its cost to execution. Since f_malicious() has lower cost (0.001) the f_policy(), so it shall be executed earlier, although f_policy() is used in more deep depth. With this patch ================= postgres=# EXPLAIN select * from v1 where f_malicious(b); QUERY PLAN ------------------------------------------------------------------------- Nested Loop (cost=0.00..287.63 rows=410 width=72) -> Seq Scan on t1 (cost=0.00..22.30 rows=410 width=36) Filter: f_malicious(b) -> Index Scan using t2_pkey on t2 (cost=0.00..0.63 rows=1 width=36) Index Cond: (x = t1.a) Filter: f_policy(x) (6 rows) ==> It has same result of the case when without this patch, because this view is declared without "SECURITY" postgres=# CREATE OR REPLACE SECURITY VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE f_policy(x); CREATE VIEW postgres=# EXPLAIN select * from v1 where f_malicious(b); QUERY PLAN ------------------------------------------------------------------- Hash Join (cost=37.68..384.39 rows=137 width=72) Hash Cond: (t1.a = t2.x) Join Filter: (f_policy(t2.x) AND f_malicious(t1.b)) -> Seq Scan on t1 (cost=0.00..22.30 rows=1230 width=36) -> Hash (cost=22.30..22.30 rows=1230 width=36) -> Seq Scan on t2 (cost=0.00..22.30 rows=1230 width=36) (6 rows) ==> After we replaced this view with SECURITY option, it prevents f_malicious() being come from outside of the view into inside of the join loop. postgres=# EXPLAIN select * from v1 where f_malicious(b) and a = 2; QUERY PLAN ------------------------------------------------------------------------- Nested Loop (cost=0.00..16.80 rows=1 width=72) Join Filter: (f_policy(t2.x) AND f_malicious(t1.b)) -> Index Scan using t1_pkey on t1 (cost=0.00..8.27 rows=1 width=36) Index Cond: (a = 2) -> Index Scan using t2_pkey on t2 (cost=0.00..8.27 rows=1 width=36) Index Cond: (x = 2) (6 rows) ==> We assume operators implemented with built-in functions are safe. So, we don't prevent this pushing-down, if the clause does not contains something leakable. postgres=# EXPLAIN select * from v2 where f_malicious(b); QUERY PLAN ------------------------------------------------------- Seq Scan on t1 (cost=0.00..329.80 rows=137 width=36) Filter: (f_policy(a) AND f_malicious(b)) (2 rows) ==> In addition, when functions come from different nest levels are distributed a certain scan filter, these are ordered by its original nest level, not only function cost. Introduction of the patch =========================== This patch tries to tackle the two matters. The first one is pushing down leakable functions into inside of join-loops being originated from security view. The distribute_qual_to_rels() distributes qualifiers of WHERE/JOIN ... ON clauses to appropriate set of relations. Even if a qualifier is depending on a part of join-loop inside of security view, we need to prevent it to be pushed down. This patch informs distribute_qual_to_rels() what relations being come from security views using Bitmapset of 'sec_barriers'. If a qualifier overlaps with any members of security_barriers, its dependency shall be expanded to the union set of 'sec_barriers'. In this case, this qualifier depends on whole of the relations inside of the security view, we cannot push it down anymore. Views are expanded to sub-queries at query rewriter. This patch add a flag to FromExpr to distinguish whether this sub-query was expanded from security view, or not. When deconstruct_recurse() walks on the supplied join-tree, it construct a bitmapset of the relations under the FromExpr with security_view = true, then, it shall be delivered to the distribute_qual_to_rels(). Instead of modifying the structure of pg_class, this patch stores a flag of security view on the reloption. So, it needed to modify routines about reloptions because it is the first case to store reloptions of views. The other matter is priority to order qualifiers of a certain scanning filter. As I mentioned above, the planner pull up simple subqueries to join, so a function originated from inside of view and outside of view are distributed to same scan plan. Then, these are ordered by cost value. It means functions come from outside of views (maybe leakable) are invoked with arguments come from tuples to be filtered out with functions come from inside of the views. So, I added 'depth' field to FuncExpr and so on. It shall be set just after query rewriter, then referenced in order_qual_clauses(). If the supplied plan multiple qualifiers, these are ordered by the depth of qualifier first, then ordered by the cost when here are multiple qualifiers with same depth. It makes sure qualifiers being originated inside of views are executed earlier than others. Thanks, -- KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachment
pgsql-hackers by date: