Re: API change advice: Passing plan invalidation info from the rewriter into the planner? - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Msg-id 20140623182943.GS16098@tamriel.snowman.net
In response to Re: API change advice: Passing plan invalidation info from the rewriter into the planner?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: API change advice: Passing plan invalidation info from the rewriter into the planner?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I'm also of the opinion that this isn't strictly necessary for the
> > initial RLS offering in PG- there's a clear way we could migrate
> > existing users to a multi-policy system from a single-policy system.
> > Sure, to get the performance and optimization benefits that we'd
> > presumably have in the multi-policy case they'd need to re-work their
> > RLS configuration, but for users who care, they'll likely be very happy
> > to do so to gain those benefits.
>
> I think a lot depends on the syntax we choose.  If we choose a syntax
> that only makes sense in a single-policy framework, then I think
> allowing upgrades to a multi-policy syntax is going to be really
> difficult.  On the other hand, if we choose a syntax that allows
> multiple policies, I suspect we can support multiple policies from the
> beginning without much extra effort.

What are these policies going to depend on?  Will they be allowed to
overlap?  I don't see multi-policy support as being very easily added.

If there are specific ways to design the syntax which would make it
easier to support multiple policies in the future, I'm all for it.  Do
you have any specific thoughts regarding that?

> >> - Require the user to specify in some way which of the available
> >> policies they want applied, and then apply only that one.
> >
> > I'd want to at least see a way to apply an ordering to the policies
> > being applied, or have PG work out which one is "cheapest" and try that
> > one first.
>
> Cost-based comparison of policies that return different results
> doesn't seem sensible to me.

I keep coming back to the thought that, really, having multiple
overlapping policies just adds unnecessary complication to the system
for not much gain in real functionality.  Being able to specify a policy
per-role might be useful, but that's only one dimension and I can
imagine a lot of other dimensions that one might want to use to control
which policy is used.

> >> I think exactly the opposite, for the query planning reasons
> >> previously stated.  I think the policies will quickly get so
> >> complicated that they're no longer optimizable.  Here's a simple
> >> example:
> >>
> >> - Policy 1 allows the user to access rows for which complexfunc() returns true.
> >> - Policy 2 allows the user to access rows for which a = 1.
> >>
> >> Most users have access only through policy 2, but some have access
> >> through policy 1.  Users who have access through policy 1 will always
> >> get a sequential scan,
> >
> > This is the thing which I most object to- if the quals being provided at
> > any level are leakproof and would be able to reduce the returned set
> > sufficiently that an index scan is the best bet, we should be doing
> > that.  I don't anticipate the RLS quals to be as selective as the
> > quals which the user is adding.
>
> I think it would be a VERY bad idea to design the system around the
> assumption that the RLS quals will be much more or less selective than
> the user-supplied quals.  That's going to be different in different
> environments.

Fine- but do you really see the query planner having a problem pushing
down whichever is the more selective qual, if the user-provided qual is
marked as leakproof?
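
To make the question concrete, here is a toy model of the ordering rule
I have in mind.  It is only a sketch and not planner code: the Qual
class, order_quals(), the selectivity numbers and the qual names are all
made up for illustration.  The assumption is that a leakproof user qual
may be evaluated ahead of the RLS qual when it is more selective, while
a non-leakproof user qual must wait until the RLS qual has filtered the
rows.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Qual:
    name: str           # label only; nothing is evaluated in this sketch
    selectivity: float  # hypothetical estimate: fraction of rows that pass
    leakproof: bool     # True if evaluating it cannot leak row contents


def order_quals(rls_qual: Qual, user_quals: List[Qual]) -> List[Qual]:
    """Return an evaluation order for the quals (toy model, assumed rule).

    Leakproof user quals compete with the RLS qual purely on estimated
    selectivity.  Non-leakproof user quals are always placed after the
    RLS qual so they never see rows the policy would have hidden.
    """
    safe = [q for q in user_quals if q.leakproof]
    unsafe = [q for q in user_quals if not q.leakproof]
    movable = sorted(safe + [rls_qual], key=lambda q: q.selectivity)
    return movable + sorted(unsafe, key=lambda q: q.selectivity)


# Hypothetical quals: one RLS qual and two user-supplied quals.
rls = Qual("owner = current_user", 0.30, True)
by_id = Qual("id = 42", 0.001, True)
leaky = Qual("leaky_func(payload)", 0.0001, False)

print([q.name for q in order_quals(rls, [leaky, by_id])])
```

In this model the leakproof "id = 42" qual goes first because it is the
most selective, and the non-leakproof qual stays behind the RLS qual no
matter how selective it claims to be.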

I realize that you want multiple policies because you'd like a way for
the RLS qual to be made simpler for certain cases while also having more
complex quals for other cases.  What I keep waiting to hear is exactly
how you want to specify which policy is used, because that's where it
gets ugly and complicated.  I still really don't like the idea of trying
to apply multiple policies inside of a single query execution.

Thanks,
    Stephen
