Re: GIST/GIN index not used with Row Level Security - Mailing list pgsql-general

From Derek Hans
Subject Re: GIST/GIN index not used with Row Level Security
Date
Msg-id CAGrP7a2kQ5vVUE=46xPU1LLJJ2GuYSREWsTY49WbfPV6R0sM3g@mail.gmail.com
Whole thread Raw
In response to Re: GIST/GIN index not used with Row Level Security  (Stephen Frost <sfrost@snowman.net>)
Responses Re: GIST/GIN index not used with Row Level Security
List pgsql-general
Thanks for the detailed response, super helpful in understanding what's happening, in particular understanding the risk of not marking functions as leakproof. I'll take a look at the underlying code to understand what's involved in getting a function to be leakproof. 

That said, it does seem like it should be possible and reasonable to specify that a user should have access to the table stats so that the query planner works as expected. Maybe it comes down to the fact that RLS is still a work in progress, and I shouldn't be relying on it unless I'm really certain it supports the functionality I need. 

I've updated word_similarity_op(text,text) to be leakproof, and  pg_proc agrees it is. I'm assuming word_similarity_op() is equivalent to <%, though I haven't found explicit confirmation. However, using word_similarity() instead of <% on a 100k row table, without any RLS involved, doesn't make use of the index, while using <% does. Obviously, adding the RLS doesn't make that any better. Any idea what might be the cause?


On Tue, Aug 13, 2019 at 5:39 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Derek Hans (derek.hans@gmail.com) wrote:
> Unfortunately only "alter function" supports "leakproof" - "alter operator"
> does not. Is there a function-equivalent for marking operators as
> leakproof? Is there any documentation for which operators/functions are
> leakproof?

Tom's query downthread provides the complete list.

Note that the list is not completely static- it's entirely possible that
additional functions can be made leak-proof, what's needed is a careful
review of the function code to ensure that it can't leak information
about the data (or, if it does today, a patch which removes that).  If
you have an interest in that then I'd encourage you to dig into the code
and look for possible leaks (Tom's already hinted in the direction you'd
want to go in) and then propose a patch to address those cases and to
mark the function(s) as leakproof.

> In my particular case, RLS is still useful even if operators are leaky as I
> control the application code and therefore can ensure leaky errors are
> handled. If it's possible to disable all checking for "leakproof", that
> would work for me.

There isn't a way to disable the leakproof-checking system.  Certainly
in the general case that wouldn't be acceptable and I'm not entirely
convinced by your argument that such an option should exist, though you
could go through and set all of the functions to be leakproof if you
really wish to.

> > If that's not possible, it sounds like it
> > > effectively blocks the use of GIN/GIST indexes when RLS is in use.
> >
> > There's a whole lot of daylight between "it doesn't pick an indexscan in
> > this one example" and "it effectively blocks the use of GIN/GIST".
>
> True indeed :). Would you have a working example of using a GIN/GIST index
> with RLS? All the attempts I've made have ended in seq scans. In practice,
> I'm looking to implement fuzzy search using trigrams, so % and %> operators
> are what matter to me. ~~ also happens to fail. Should I expect to be able
> to use any of these with RLS, large amounts of data and reasonable
> performance?

Functions that aren't marked leakproof aren't going to be able to be
pushed down.

> Your description of leakproof (and the documentation I've found) makes it
> sound like I'm not just hitting an isolated problem, but a general problem
> with RLS that represents a substantial limitation and is likely worth
> documenting.

There's some documentation regarding leakproof functions here:

https://www.postgresql.org/docs/current/ddl-rowsecurity.html

and here:

https://www.postgresql.org/docs/11/sql-createfunction.html

Of course, patches are welcome to improve on our documentation.

One thing that it sounds like you're not quite appreciating is that in
the general case, verifying that a function is leakproof isn't optional.
Without such a check, any user could create a function and then get PG
to push that function down below the RLS checks and therefore gain
access to the data that they aren't supposed to be able to see.

All that said, there's quite a few functions that *are* marked as
leakproof already and they're quite handy and work well with RLS
already, as I expect you'll see when you go querying pg_proc.

Thanks,

Stephen


--

pgsql-general by date:

Previous
From: "Jehan-Guillaume (ioguix) de Rorthais"
Date:
Subject: Re: Postgres HA - pacemaker RA do not support auto failback
Next
From: Stephen Frost
Date:
Subject: Re: GIST/GIN index not used with Row Level Security