Re: [RFC] Interface of Row Level Security - Mailing list pgsql-hackers
From | Kohei KaiGai |
---|---|
Subject | Re: [RFC] Interface of Row Level Security |
Date | |
Msg-id | CADyhKSX36tpJy86jC1QKZLN085BWr7JQoaNh8-OvdLnaMfC2qw@mail.gmail.com |
In response to | Re: [RFC] Interface of Row Level Security (Kohei KaiGai <kaigai@kaigai.gr.jp>) |
List | pgsql-hackers |
2012/5/31 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2012/5/31 Robert Haas <robertmhaas@gmail.com>:
>>> If we would have an "ideal optimizer", I'd still like the optimizer to
>>> wipe out redundant clauses transparently, rather than RLSBYPASS
>>> permissions, because it just controls all-or-nothing stuff.
>>> For example, if tuples are categorized to unclassified, classified or
>>> secret, and RLS policy is configured as:
>>>   ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
>>>   'classified')) OR (X IN ('unclassified'))),
>>> superuser can see all the tuples, and alice and bob can see
>>> up to classified tuples.
>>> Is it really hard to wipe out redundant condition at planner stage?
>>> If current_user is obviously 'kaigai', it seems to me the left-side of
>>> this clause can be wiped out at the planner stage.
>>> Do I consider the issue too simple?
>>
>> Yes.  :-)
>>
>> There are two problems.  First, if using the extended query protocol
>> (e.g. pgbench -M prepared) you can prepare a statement just once and
>> then execute it multiple times.  In this case, stable-functions cannot
>> be constant-folded at plan time, because they are only guaranteed to
>> remain constant for a *single* execution of the query, not for all
>> executions of the query.  So any optimization in this area would have
>> to be limited to cases where the simple query protocol is used.  I
>> think that might still be worth doing, but it's a significant
>> limitation, to be sure.  Second, at present, there is no guarantee
>> that the snapshot used for planning the query is the same as the
>> snapshot used for executing the query, though commit
>> d573e239f03506920938bf0be56c868d9c3416da made that happen in some
>> common cases.  If we were to do constant-folding of stable functions
>> using the planner snapshot, it would represent a behavior change from
>> previous releases.  I am not clear whether that has any real-world
>> consequences that we should be worried about.  It seems to me that the
>> path of least resistance might be to refactor the portal stuff so that
>> we can provide a uniform guarantee that, when using the simple query
>> protocol, the planner and executor snapshots will be the same ... but
>> I might be wrong.
>>
> It may be an option to separate the case into two; a situation to execute
> the given query immediately just after optimization and never reused,
> and others.
> Even though the second situation, it may give us better query execution
> plan, if we try to reconstruct query plan just before executor with
> assumption that expects immutable / stable function can be replaced
> by constant value prior to execution.
> In other words, this idea tries to query optimization again on EXECUTE
> statement against to its nature, to replace immutable / stable functions
> by constant value, and to generate wiser execute plan.
> At least, it may make sense to have a flag on prepared statement to
> indicate whether it has possible better plan with this re-construction.
>
> Then, if so, we will be able to push the stuff corresponding to
> RLSBYPASS into the query optimization, and works transparently
> for users.
>
> Isn't it feasible to implement?
>
If we could replace a particular term that consists only of constant
values and stable / immutable functions with a parameter reference,
the planner could handle the term as if it were a constant value,
while the actual calculation is delayed until the executor stage.

For example, according to this idea,
  PREPARE p1(int) AS SELECT * FROM tbl
      WHERE current_user in ('alice','bob') AND X > $1;
would be internally rewritten to
  PREPARE p1(int) AS SELECT * FROM tbl
      WHERE $2 AND X > $1;

Then $2 is implicitly calculated just before each execution of this
prepared statement. The snapshot used for this calculation is the
same as the executor's.

It seems to me this is a feasible idea that would be less invasive
to the existing planner.
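To make the rewrite above concrete, here is a toy model in Python. It is only a sketch of the idea, not planner code: the expression classes, the FUNCS volatility table, and the prepare() / execute() helpers are all hypothetical names made up for illustration. At prepare time it replaces each maximal subterm built only from constants and stable / immutable functions with a hidden parameter, and at execute time it computes those hidden parameters once, just before scanning the rows.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Toy expression tree (hypothetical; not PostgreSQL's node types).
@dataclass(frozen=True)
class Const:
    value: object

@dataclass(frozen=True)
class Param:
    index: int  # 1-based, like $1

@dataclass(frozen=True)
class Column:
    name: str

@dataclass(frozen=True)
class Call:
    func: str
    args: Tuple["Expr", ...]

Expr = Union[Const, Param, Column, Call]

# Stand-in for session state that a stable function may read.
SESSION = {"user": "alice"}

# name -> (implementation, volatility); an assumed catalog for this sketch.
FUNCS = {
    "current_user": (lambda: SESSION["user"], "stable"),
    "in": (lambda x, *opts: x in opts, "immutable"),
    "gt": (lambda a, b: a > b, "immutable"),
    "and": (lambda a, b: a and b, "immutable"),
}

def is_foldable(e: Expr) -> bool:
    """True if e uses only constants and stable / immutable functions."""
    if isinstance(e, Const):
        return True
    if isinstance(e, Call):
        volatility = FUNCS[e.func][1]
        return volatility in ("stable", "immutable") and all(
            is_foldable(a) for a in e.args)
    return False  # columns and user parameters are not known in advance

def hoist(e: Expr, n_user_params: int, hidden: List[Expr]) -> Expr:
    """Replace each maximal foldable call with a hidden parameter."""
    if isinstance(e, Call):
        if is_foldable(e):
            hidden.append(e)
            return Param(n_user_params + len(hidden))
        return Call(e.func, tuple(hoist(a, n_user_params, hidden)
                                  for a in e.args))
    return e

def evaluate(e: Expr, row: dict, params: list):
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Param):
        return params[e.index - 1]
    if isinstance(e, Column):
        return row[e.name]
    fn = FUNCS[e.func][0]
    return fn(*(evaluate(a, row, params) for a in e.args))

def prepare(where: Expr, n_user_params: int):
    hidden: List[Expr] = []
    rewritten = hoist(where, n_user_params, hidden)
    return rewritten, hidden

def execute(prepared, rows: List[dict], user_params: list) -> List[dict]:
    rewritten, hidden = prepared
    # Hidden parameters are computed once per execution, not per row.
    extra = [evaluate(h, {}, []) for h in hidden]
    params = list(user_params) + extra
    return [r for r in rows if evaluate(rewritten, r, params)]

# WHERE current_user IN ('alice','bob') AND x > $1
where = Call("and", (
    Call("in", (Call("current_user", ()), Const("alice"), Const("bob"))),
    Call("gt", (Column("x"), Param(1))),
))
p1 = prepare(where, n_user_params=1)
rows = [{"x": 5}, {"x": 15}]
```

Here p1[0] is the toy equivalent of "$2 AND X > $1". Because the hidden parameter is recomputed on every execute() call, the same prepared object gives different results after SESSION["user"] changes, which is the property that plan-time constant folding would lose.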
Does it make sense to describe the exceptional condition using a regular
clause, instead of a special permission?

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>