Re: [v9.2] Fix Leaky View Problem - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: [v9.2] Fix Leaky View Problem
Date
Msg-id CADyhKSUCcpVP+YLFbQ87=q8O8KEyc-dPj1SqmD9j3-YOWr2zoQ@mail.gmail.com
Whole thread Raw
In response to Re: [v9.2] Fix Leaky View Problem  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: [v9.2] Fix Leaky View Problem
List pgsql-hackers
The attached patches are cut-off version based on the latest Robert's
updates. The "v8.regtest" adds regression test cases on variable
leaky-view scenarios with/without security-barrier property.

The "v8.option-1" add checks around restriction_selectivity, and prevent
to invoke estimator function if Var node touches relation with security-
barrier attribute.

The "v8.option-2" add checks around examine_simple_variable, and
prevent to reference statistical data, if Var node tries to reference
relation with security-barrier attribute.
(And, it shall be marked as "leakproof")

I initially thought restriction_selectivity called by clause_selectivity is
the best point to add checks, however, I reconsidered it might not be
the origin of this problem.
As long as user-defined functions acquires control on selectivity
estimation of operators, same problems can be re-produced;
if someone tries to reference unrelated data within estimator.

This scenario is normally prevented, because only superuser can define
a function that can bypass permission checks to reference internal data
structures; using "untrusted" procedural-language.
If my conclusion is right, what we should fix up is built-in estimators side,
and we should enforce estimator function being "leakproof", even though
we still allow unprivileged users to define operators.

Thanks,

2011/12/11 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2011/12/10 Kohei KaiGai <kaigai@kaigai.gr.jp>:
>> 2011/12/9 Robert Haas <robertmhaas@gmail.com>:
>>> I feel like there must be some logic in the planner somewhere that is
>>> "looking through" the subquery RTE and figuring out that safe_foo.a is
>>> really the same variable as foo.a, and which therefore feels entitled
>>> to apply !!'s selectivity estimator to foo.a's statistics.  If that's
>>> the case, it might be possible to handicap that logic so that when the
>>> security_barrier flag is set, it doesn't do that, and instead treats
>>> safe_foo.a as a black box.  That would, obviously, degrade the quality
>>> of complex plans involving security views, but I think we should focus
>>> on getting something that meets our security goals first and then try
>>> to improve performance later.
>>>
>> I tried to investigate the code around size-estimation, and it seems to
>> me here is two candidates to put this type of checks.
>>
>> The one is examine_simple_variable() that is used to pull a datum
>> from statistic information, but it locates on the code path restriction
>> estimator of operators; so user controlable, although it requires least
>> code changes just after if (rte->rtekind == RTE_SUBQUERY).
>> The other is clause_selectivity(). Its code path is not user controlable,
>> so we can apply necessary checks to prevent qualifier that reference
>> variable come from sub-query with security-barrier.
>>
>> In my sense, clause_selectivity() is better place to apply this type of
>> checks. But, on the other hand, it provides get_relation_stats_hook
>> to allow extensions to control references to statistic data.
>> So, I wonder whether the PG core assumes this routine covers
>> all the code path here?
>>
> The attached patch adds checks around invocation of selectivity
> estimator functions, and it changes behavior of the estimator, if the
> supplied operator tries to touch variables come from security-barrier
> relations.
> Then, it fixes the problem you mentioned.
>
>  postgres=# explain select * from safe_foo where a !! 0;
>                           QUERY PLAN
>  -------------------------------------------------------------
>   Subquery Scan on safe_foo  (cost=0.00..2.70 rows=3 width=4)
>     Filter: (safe_foo.a !! 0)
>     ->  Seq Scan on foo  (cost=0.00..1.14 rows=6 width=4)
>           Filter: (a > 5)
>  (4 rows)
>
> However, I'm still a bit skeptical on my patch, because it still allows
> to invoke estimator function when operator's argument does not
> touch values originated from security-barrier relation.
> In the case when oprrest or oprjoin are implemented without our
> regular convention (E.g, it anyway reference whole of statistical data),
> it will break this solution.
>
> Of course, it is an assumption that we cannot prevent any attack
> using binary modules, so we need to say "use it your own risk" if
> people tries to use extensional modules. And, we also need to
> keep the built-in code secure.
>
> Some of built-in estimator functions (such as eqsel) provides
> a feature that invokes operator function with arguments originated
> from pg_statistics table. It didn't harmless, however, we got understand
> that this logic can be used to break row-level security.
> So, I begin to consider the routines to be revised are some of built-in
> functions to be used for estimator functions; such as eqsel and ...
> These function eventually reference statistical data at examine_variable.
>
> It might be a better approach to add checks whether invocation of
> the supplied operator possibly leaks contents to be invisible.
> It seems to me the Idea of the attached patch depends on something
> internal stuff of existing built-in estimator functions...
>
> Thanks,
> --
> KaiGai Kohei <kaigai@kaigai.gr.jp>

--
KaiGai Kohei <kaigai@kaigai.gr.jp>

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Command Triggers
Next
From: Dimitri Fontaine
Date:
Subject: Re: Command Triggers