Re: [RFC] Interface of Row Level Security - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: [RFC] Interface of Row Level Security
Date
Msg-id CADyhKSX36tpJy86jC1QKZLN085BWr7JQoaNh8-OvdLnaMfC2qw@mail.gmail.com
In response to Re: [RFC] Interface of Row Level Security  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
List pgsql-hackers
2012/5/31 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2012/5/31 Robert Haas <robertmhaas@gmail.com>:
>>> If we would have an "ideal optimizer", I'd still like the optimizer to
>>> wipe out redundant clauses transparently, rather than RLSBYPASS
>>> permissions, because it just controls all-or-nothing stuff.
>>> For example, if tuples are categorized to unclassified, classified or
>>> secret, and RLS policy is configured as:
>>>  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
>>> 'classified')) OR (X IN ('unclassified'))),
>>> superuser can see all the tuples, and alice and bob can see
>>> up to classified tuples.
>>> Is it really hard to wipe out redundant condition at planner stage?
>>> If current_user is obviously 'kaigai', it seems to me the left-side of
>>> this clause can be wiped out at the planner stage.
>>> Do I consider the issue too simple?
>>
>> Yes.  :-)
>>
>> There are two problems.  First, if using the extended query protocol
>> (e.g. pgbench -M prepared) you can prepare a statement just once and
>> then execute it multiple times.  In this case, stable-functions cannot
>> be constant-folded at plan time, because they are only guaranteed to
>> remain constant for a *single* execution of the query, not for all
>> executions of the query.  So any optimization in this area would have
>> to be limited to cases where the simple query protocol is used.  I
>> think that might still be worth doing, but it's a significant
>> limitation, to be sure.  Second, at present, there is no guarantee
>> that the snapshot used for planning the query is the same as the
>> snapshot used for executing the query, though commit
>> d573e239f03506920938bf0be56c868d9c3416da made that happen in some
>> common cases.  If we were to do constant-folding of stable functions
>> using the planner snapshot, it would represent a behavior change from
>> previous releases.  I am not clear whether that has any real-world
>> consequences that we should be worried about.  It seems to me that the
>> path of least resistance might be to refactor the portal stuff so that
>> we can provide a uniform guarantee that, when using the simple query
>> protocol, the planner and executor snapshots will be the same ... but
>> I might be wrong.
>>
> One option is to distinguish two cases: a query that is executed
> immediately after optimization and never reused, and everything else.
> Even in the second case, we may get a better execution plan if we
> rebuild the query plan just before the executor runs, on the assumption
> that immutable / stable functions can be replaced by constant values
> prior to execution.
> In other words, the idea is to run query optimization again at EXECUTE
> time, contrary to the nature of prepared statements, so that immutable /
> stable functions are replaced by constant values and a smarter execution
> plan is generated.
> At least, it may make sense to have a flag on the prepared statement
> that indicates whether such a re-planning could yield a better plan.
>
> If so, we will be able to push the logic corresponding to RLSBYPASS
> into query optimization, where it works transparently for users.
>
> Isn't it feasible to implement?
>
If we could replace a term that consists only of constant values and
stable / immutable functions with a parameter reference, the planner
could handle the term as if it were a constant value, while the actual
calculation is delayed until the executor stage.

For example, according to this idea,

  PREPARE p1(int) AS SELECT * FROM tbl
    WHERE current_user IN ('alice','bob') AND X > $1;

shall be internally rewritten to

  PREPARE p1(int) AS SELECT * FROM tbl
    WHERE $2 AND X > $1;

Then $2 is implicitly calculated just before each execution of this
prepared statement. The snapshot used for this calculation is the same
as the executor's. It seems to me a feasible idea, and one that is less
invasive to the existing planner.
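
To make the intended behavior concrete, here is a rough sketch in Python
rather than in backend code. It is only a toy model under my own
assumptions: Session, PreparedStatement and implicit_params are
hypothetical names and do not correspond to anything in the PostgreSQL
source. It shows the stable term being hoisted into an implicit
parameter ($2) that is recomputed on every execution, so the result
follows the current_user in effect at execution time.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Session:
    """Hypothetical stand-in for per-session state such as current_user."""
    current_user: str


@dataclass
class PreparedStatement:
    """Toy model of a prepared statement whose stable terms were hoisted."""
    n_user_params: int
    # One callable per hoisted stable term; evaluated at execution time.
    implicit_params: List[Callable[[Session], Any]]
    # Row filter; params holds the user parameters followed by implicit ones.
    predicate: Callable[[Dict[str, Any], List[Any]], bool]

    def execute(self, session: Session, rows: List[Dict[str, Any]],
                *user_params: Any) -> List[Dict[str, Any]]:
        if len(user_params) != self.n_user_params:
            raise ValueError("wrong number of parameters")
        # Compute the hoisted terms once, just before this execution,
        # instead of folding them into the plan at PREPARE time.
        implicit = [term(session) for term in self.implicit_params]
        params = list(user_params) + implicit
        return [row for row in rows if self.predicate(row, params)]


# Models: SELECT * FROM tbl WHERE $2 AND X > $1
# where $2 := current_user IN ('alice', 'bob')
p1 = PreparedStatement(
    n_user_params=1,
    implicit_params=[lambda s: s.current_user in ("alice", "bob")],
    predicate=lambda row, params: bool(params[1] and row["x"] > params[0]),
)

rows = [{"x": 1}, {"x": 5}, {"x": 9}]
as_alice = p1.execute(Session("alice"), rows, 3)
as_kaigai = p1.execute(Session("kaigai"), rows, 3)
```

The same prepared statement yields the rows with X > 3 for alice and no
rows for kaigai, because $2 is recomputed per execution and never baked
into the statement.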

Does it make sense to describe the exceptional condition using a regular
clause, instead of a special permission?

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>

