Thread: Sloppy thinking about leakproof properties of opclass co-members

Sloppy thinking about leakproof properties of opclass co-members

From
Tom Lane
Date:
It strikes me that there's a significant gap in the whole "leakproof
function" business, namely that no consideration has been given to
planner-driven transformations of queries.  As an example, if we
have "a = b" and "b = c", the planner may generate and apply "a = c"
instead of one or both of those clauses.  If a, b, c are not all the
same type, "a = c" might involve an operator that's not either of the
original ones.  And it's possible that that operator is not leakproof
where the original ones are.  This could easily result in introducing
non-leakproof operations into a secure subquery after pushdown of a
clause that was marked secure.

Another example is that in attempting to make implication or refutation
proofs involving operator clauses, the planner feels free to apply other
members of the operator's btree opclass (if it's in one).  I've not
bothered to try to create a working exploit, but I'm pretty sure that
this could result in a non-leakproof function being applied during
planning of a supposedly secure subquery.  It might be that that couldn't
leak anything worse than constant values within the query tree, but
perhaps it could leak data values from a protected table's pg_statistic
entries.

ISTM that the most appropriate solution here is to insist that all or none
of the members of an operator class be marked leakproof.  (Possibly we
could restrict that to btree opclasses, but I'm not sure any exception is
needed for other index types.)  I looked for existing violations of this
precept, and unfortunately found a *lot*.  For example, texteq is marked
leakproof but its fellow text comparison operators aren't.  Is that really
sane?

Here's a query to find problematic cases:

select p1.proname,o1.oprname,p2.proname,o2.oprname,a1.amopfamily from
pg_proc p1, pg_operator o1, pg_amop a1,
pg_proc p2, pg_operator o2, pg_amop a2
where
p1.oid = o1.oprcode and p2.oid = o2.oprcode and
o1.oid = a1.amopopr and o2.oid = a2.amopopr and
a1.amopfamily = a2.amopfamily and
p1.proleakproof and not p2.proleakproof;

Oh ... and just to add insult to injury, the underlying opclass support
functions (such as textcmp and hashtext) are generally not marked
leakproof even when the operators they may be executed instead of are.
This is even more silly.
        regards, tom lane



Re: Sloppy thinking about leakproof properties of opclass co-members

From
Robert Haas
Date:
On Wed, Sep 24, 2014 at 5:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It strikes me that there's a significant gap in the whole "leakproof
> function" business, namely that no consideration has been given to
> planner-driven transformations of queries.  As an example, if we
> have "a = b" and "b = c", the planner may generate and apply "a = c"
> instead of one or both of those clauses.  If a, b, c are not all the
> same type, "a = c" might involve an operator that's not either of the
> original ones.  And it's possible that that operator is not leakproof
> where the original ones are.  This could easily result in introducing
> non-leakproof operations into a secure subquery after pushdown of a
> clause that was marked secure.
>
> Another example is that in attempting to make implication or refutation
> proofs involving operator clauses, the planner feels free to apply other
> members of the operator's btree opclass (if it's in one).  I've not
> bothered to try to create a working exploit, but I'm pretty sure that
> this could result in a non-leakproof function being applied during
> planning of a supposedly secure subquery.  It might be that that couldn't
> leak anything worse than constant values within the query tree, but
> perhaps it could leak data values from a protected table's pg_statistic
> entries.
>
> ISTM that the most appropriate solution here is to insist that all or none
> of the members of an operator class be marked leakproof.  (Possibly we
> could restrict that to btree opclasses, but I'm not sure any exception is
> needed for other index types.)  I looked for existing violations of this
> precept, and unfortunately found a *lot*.  For example, texteq is marked
> leakproof but its fellow text comparison operators aren't.  Is that really
> sane?

Not really.  Fortunately, AFAICT, most if not all of these are in the
good direction: there are some things not marked leakproof that can be
so marked.  The reverse direction would be a hard-to-fix security
hole.  I think at some point somebody went through and tried to mark
all of the same-type equality operators as leakproof, and it seems
like that got expanded somewhat without fully rationalizing what we
had in pg_proc... mostly because I think nobody had a clear idea how
to do that.

I think your proposal here is a good one.  Heikki proposed treating
opclass functions as leakproof *in lieu of* adding a flag, but that
didn't seem good because we wanted to allow for the possibility of
cases where that wasn't true, and ensure individual scrutiny of each
case.  Your idea of making sure the flag is set consistently
throughout the opclass (opfamily?) is similar in spirit, but better in
detail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sloppy thinking about leakproof properties of opclass co-members

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Sep 24, 2014 at 5:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > ISTM that the most appropriate solution here is to insist that all or none
> > of the members of an operator class be marked leakproof.  (Possibly we
> > could restrict that to btree opclasses, but I'm not sure any exception is
> > needed for other index types.)  I looked for existing violations of this
> > precept, and unfortunately found a *lot*.  For example, texteq is marked
> > leakproof but its fellow text comparison operators aren't.  Is that really
> > sane?
>
> Not really.  Fortunately, AFAICT, most if not all of these are in the
> good direction: there are some things not marked leakproof that can be
> so marked.  The reverse direction would be a hard-to-fix security
> hole.  I think at some point somebody went through and tried to mark
> all of the same-type equality operators as leakproof, and it seems
> like that got expanded somewhat without fully rationalizing what we
> had in pg_proc... mostly because I think nobody had a clear idea how
> to do that.

We'll need to investigate and see if there are any which *aren't* safe
and probably fix those to be safe rather than trying to deal with this
risk in some other way.  In other words, I hope it's really "all of
these" rather than just most.  In general, I've been hoping that we have
more leakproof functions than not as, while it's non-trivial to write
them and ensure they're correct, it's better for us to take that on than
to ask our users to do so (or have them get some set that someone else
wrote off of a website or even the extension network).  We can't cover
everything, of course, but hopefully we'll cover all reasonable use
cases for types we ship.

> I think your proposal here is a good one.  Heikki proposed treating
> opclass functions as leakproof *in lieu of* adding a flag, but that
> didn't seem good because we wanted to allow for the possibility of
> cases where that wasn't true, and ensure individual scrutiny of each
> case.  Your idea of making sure the flag is set consistently
> throughout the opclass (opfamily?) is similar in spirit, but better in
> detail.

Agreed- a regression test here is definitely needed and have any
exceptions which must exist, after we've determined that they don't
produce an actual security hole (not sure how they wouldn't, but still)
explicitly called out in the regression tests, to avoid individuals
missing this requirement by copying an existing pg_proc example and
thinking they can add a new opclass item (or change an existing one) to
not be leakproof when the rest is.
Thanks,
    Stephen

Re: Sloppy thinking about leakproof properties of opclass co-members

From
Dean Rasheed
Date:
On 26 September 2014 15:48, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Wed, Sep 24, 2014 at 5:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > ISTM that the most appropriate solution here is to insist that all or none
>> > of the members of an operator class be marked leakproof.  (Possibly we
>> > could restrict that to btree opclasses, but I'm not sure any exception is
>> > needed for other index types.)  I looked for existing violations of this
>> > precept, and unfortunately found a *lot*.  For example, texteq is marked
>> > leakproof but its fellow text comparison operators aren't.  Is that really
>> > sane?
>>
>> Not really.  Fortunately, AFAICT, most if not all of these are in the
>> good direction: there are some things not marked leakproof that can be
>> so marked.  The reverse direction would be a hard-to-fix security
>> hole.  I think at some point somebody went through and tried to mark
>> all of the same-type equality operators as leakproof, and it seems
>> like that got expanded somewhat without fully rationalizing what we
>> had in pg_proc... mostly because I think nobody had a clear idea how
>> to do that.
>
> We'll need to investigate and see if there are any which *aren't* safe
> and probably fix those to be safe rather than trying to deal with this
> risk in some other way.  In other words, I hope it's really "all of
> these" rather than just most.  In general, I've been hoping that we have
> more leakproof functions than not as, while it's non-trivial to write
> them and ensure they're correct, it's better for us to take that on than
> to ask our users to do so (or have them get some set that someone else
> wrote off of a website or even the extension network).  We can't cover
> everything, of course, but hopefully we'll cover all reasonable use
> cases for types we ship.
>

Looking at other functions, ISTM that there's an entire class of
functions that can trivially be marked leakproof, and that's no-arg
functions which can't possibly leak. There are quite a few no-arg
builtin functions and none of them are currently marked leakproof.

Rather than (or perhaps as well as) marking all these leakproof,
perhaps we should teach contain_leaky_functions() to automatically
treat any no-arg function as leakproof, so that we allow user-defined
functions too. Taking that one step further, perhaps what it should
really be looking for is Vars in the argument list of a leaky
function, i.e., contain_leaked_vars() rather than
contain_leaky_functions().

Regards,
Dean



Re: Sloppy thinking about leakproof properties of opclass co-members

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Rather than (or perhaps as well as) marking all these leakproof,
> perhaps we should teach contain_leaky_functions() to automatically
> treat any no-arg function as leakproof, so that we allow user-defined
> functions too. Taking that one step further, perhaps what it should
> really be looking for is Vars in the argument list of a leaky
> function, i.e., contain_leaked_vars() rather than
> contain_leaky_functions().

+1, but that's a totally independent question from what I'm on about
at the moment.
        regards, tom lane