Re: [v9.2] Fix Leaky View Problem - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [v9.2] Fix Leaky View Problem
Date
Msg-id CA+TgmoY5=Udi_hpeB3T7F8MV4AEQdqgZgdv+7NV9skKT80nmQw@mail.gmail.com
Whole thread Raw
In response to Re: [v9.2] Fix Leaky View Problem  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: [v9.2] Fix Leaky View Problem
List pgsql-hackers
On Mon, Oct 10, 2011 at 4:28 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> I agreed. We have been on the standpoint that tries to prevent
> leakable functions to reference a portion of join-tree being already
> flatten, however, it has been a tough work.
> It seems to me it is much simple approach that enables to push
> down only non-leaky functions into inside of sub-queries.
>
> An idea is to add a hack on distribute_qual_to_rels() to relocate
> a qualifier into inside of the sub-query, when it references only
> a particular sub-query being come from a security view, and
> when the sub-query satisfies is_simple_subquery(), for example.

If you can make this work, I think it could be a pretty sweet plannner
optimization even apart from the implications for security views.
Consider a query of this form:

A LEFT JOIN B LEFT JOIN C

where B is a view defined as:

B1 JOIN B2 JOIN B3 LEFT JOIN B4 LEFT JOIN B5

Now let's suppose that from_collapse_limit/join_collapse_limit are set
low enough that we decline to fold these subproblems together.  If
there happens to be a qual B.x = 1, where B.x is really B1.x, then the
generated plan sucks, because it will basically lose the ability to
filter B1 early, very possibly on, say, a unique index.  Or at least a
highly selective index.  If we could allow the B.x qual to trickle
down inside of the subquery, we'd get a much better plan.  Of course,
it's still not as good as flattening, because it won't allow us to
consider as many possible join orders - but the whole point of having
from_collapse_limit/join_collapse_limit in the first place is that we
can't consider all the join orders without having planning time and
memory usage balloon wildly out of control.  And in many real-world
cases, I think that this would probably mitigate the effects of
exceeding from_collapse_limit/join_collapse_limit quite a bit.

In order to make it work, though, you'd need to arrange things so that
we distribute quals to rels in the parent query, then let some of them
filter down into the subquery, then distribute quals to rels in the
subquery (possibly adjusting RTE indexes?), then finish planning the
subquery, then finish planning the parent query.  Not sure how
possible/straightforward that is.

It's probably a good idea to deal with this part first, because if you
can't make it work then the whole approach is in trouble.  I'm almost
imagining that we could break this into three independent patches,
like this:

1. Let quals percolate down into subqueries.
2. Add the notion of a security view, which prevents flattening and
disables the optimization of patch #1
3. Add the notion of a leakproof function, which can benefit from the
optimization of #1 even when the view involved is a security view as
introduced in #2

Unlike the way you have it now, I think those patches could be
independently committable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Dumping roles improvements?
Next
From: Florian Pflug
Date:
Subject: Re: Range Types - typo + NULL string constructor