Thread: Row Level Security − leakproof-ness and performance implications

Row Level Security − leakproof-ness and performance implications

From
Pierre Ducroquet
Date:
Hello

In order to increase our security, we have started deploying row-level 
security in order to add another safety net if any issue was to happen in our 
applications.
After a careful monitoring of our databases, we found out that a lot of 
queries started to go south, going extremely slow.
The root of these slowdowns is that a lot of the PostgreSQL functions are not 
marked as leakproof, especially the ones used for operators.
In current git master, the following query returns 258 functions that are used 
by operators returning booleans and not marked leakproof:

SELECT proname FROM pg_proc 
        WHERE exists(select 1 from pg_operator where oprcode::name = proname) 
        AND NOT(proleakproof) 
        AND prorettype = 16;


Among these, if you have for instance a table like:
create table demo_enum (
        username text,
        flag my_enum
);

With an index on (username, flag), the index will only be used on username 
because flag is an enum and the enum_eq function is not marked leakproof.

For simple functions like enum_eq/enum_ne, marking them leakproof is an 
obvious fix (patch attached to this email, including also textin/textout). And 
if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
in functions leakproof would simply be to prevent them from logging errors 
containing values.

For others, like arraycontains, it's much trickier : arraycontains can be 
leakproof only if the comparison function of the inner type is leakproof too. 
This raises some interesting issues, like arraycontains being marked as strict 
parallel safe while there is nothing making it mandatory for the inner type.
In order to optimize queries on a table like:
create table demo_array (username text, my_array int[]);
one would need to mark arraycontains leakproof, thus implying that any 
comparison operator must be leakproof too? Another possibility, not that 
complicated, would be to create specific-types entries in pg_proc for each 
type that has a leakproof comparison operator. Or the complicated path would 
be to have a 'dynamic' leakproof for functions like arraycontains that depend 
on the inner type, but since this is determined at planning, it seems very 
complicated to implement.

A third issue we noticed is the usage of functional indexes. If you create an 
index on, for instance, (username, leaky_function(my_field)), and then search 
on leaky_functon(my_field) = 42, the functional index can be used only if 
leaky_function is marked leakproof, even if it is never going to be executed 
on invalid rows thanks to the index. After a quick look at the source code, it 
also seems complicated to implement since the decision to reject potential 
dangerous calls to leaky_function is done really early in the process, before 
the optimizer starts.


I am willing to spend some time on these issues, but I have no specific 
knowledge of the planner and optimizer, and I fear proper fixes would require 
much digging there. If you think it would be appropriate for functions like 
arraycontains or range_contains to require the 'inner' comparison function to 
be leakproof, or just keep looking at the other functions in pg_proc and 
leakproof the ones that can be, I would be happy to write the corresponding 
patches.


Best regards


Attachment

Re: Row Level Security − leakproof-ness andperformance implications

From
Laurenz Albe
Date:
Pierre Ducroquet wrote:
> In order to increase our security, we have started deploying row-level 
> security in order to add another safety net if any issue was to happen in our 
> applications.
> After a careful monitoring of our databases, we found out that a lot of 
> queries started to go south, going extremely slow.
> The root of these slowdowns is that a lot of the PostgreSQL functions are not 
> marked as leakproof, especially the ones used for operators.
> In current git master, the following query returns 258 functions that are used 
> by operators returning booleans and not marked leakproof:
> 
> SELECT proname FROM pg_proc 
>         WHERE exists(select 1 from pg_operator where oprcode::name = proname) 
>         AND NOT(proleakproof) 
>         AND prorettype = 16;
> 
> 
> Among these, if you have for instance a table like:
> create table demo_enum (
>         username text,
>         flag my_enum
> );
> 
> With an index on (username, flag), the index will only be used on username 
> because flag is an enum and the enum_eq function is not marked leakproof.
> 
> For simple functions like enum_eq/enum_ne, marking them leakproof is an 
> obvious fix (patch attached to this email, including also textin/textout). And

The enum_eq part looks totally safe, but the text functions allocate memory,
so you could create memory pressure, wait for error messages that tell you
the size of the allocation that failed and this way learn about the data.

Is this a paranoid idea?

> if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
> in functions leakproof would simply be to prevent them from logging errors 
> containing values.
> 
> For others, like arraycontains, it's much trickier :[...]

I feel a little out of my depth here, so I won't comment.

> A third issue we noticed is the usage of functional indexes. If you create an 
> index on, for instance, (username, leaky_function(my_field)), and then search 
> on leaky_functon(my_field) = 42, the functional index can be used only if 
> leaky_function is marked leakproof, even if it is never going to be executed 
> on invalid rows thanks to the index. After a quick look at the source code, it 
> also seems complicated to implement since the decision to reject potential 
> dangerous calls to leaky_function is done really early in the process, before 
> the optimizer starts.

If there is a bitmap index scan, the condition will be rechecked, so the
function will be executed.

> I am willing to spend some time on these issues, but I have no specific 
> knowledge of the planner and optimizer, and I fear proper fixes would require 
> much digging there. If you think it would be appropriate for functions like 
> arraycontains or range_contains to require the 'inner' comparison function to 
> be leakproof, or just keep looking at the other functions in pg_proc and 
> leakproof the ones that can be, I would be happy to write the corresponding 
> patches.

Thanks, and I think that every function that can safely be marked leakproof
is a progress!

Yours,
Laurenz Albe



Re: Row Level Security − leakproof-ness and performance implications

From
Pierre Ducroquet
Date:
On Wednesday, February 20, 2019 12:43:50 AM CET Laurenz Albe wrote:
> Pierre Ducroquet wrote:
> > In order to increase our security, we have started deploying row-level
> > security in order to add another safety net if any issue was to happen in
> > our applications.
> > After a careful monitoring of our databases, we found out that a lot of
> > queries started to go south, going extremely slow.
> > The root of these slowdowns is that a lot of the PostgreSQL functions are
> > not marked as leakproof, especially the ones used for operators.
> > In current git master, the following query returns 258 functions that are
> > used by operators returning booleans and not marked leakproof:
> >
> > SELECT proname FROM pg_proc
> >
> >         WHERE exists(select 1 from pg_operator where oprcode::name =
> >         proname)
> >         AND NOT(proleakproof)
> >         AND prorettype = 16;
> >
> > Among these, if you have for instance a table like:
> > create table demo_enum (
> >
> >         username text,
> >         flag my_enum
> >
> > );
> >
> > With an index on (username, flag), the index will only be used on username
> > because flag is an enum and the enum_eq function is not marked leakproof.
> >
> > For simple functions like enum_eq/enum_ne, marking them leakproof is an
> > obvious fix (patch attached to this email, including also textin/textout).
> > And
> The enum_eq part looks totally safe, but the text functions allocate memory,
> so you could create memory pressure, wait for error messages that tell you
> the size of the allocation that failed and this way learn about the data.
>
> Is this a paranoid idea?

This is not paranoid, it depends on your threat model.
In the model we implemented, the biggest threat we consider from an user point
of view is IDOR (Insecure Direct Object Reference): a developer forgetting to
check that its input is sane and matches the other parts of the URL or the
current session user. Exploiting the leak you mentioned in such a situation is
almost impossible, and to be honest the attacker has much easier targets if he
gets to that level…
Maybe leakproof is too simple? Should we be able to specify a 'paranoid-level'
to allow some leaks depending on our threat model?

> > if we had a 'RLS-enabled' context on functions, a way to make a lot of
> > built- in functions leakproof would simply be to prevent them from
> > logging errors containing values.
> >
> > For others, like arraycontains, it's much trickier :[...]
>
> I feel a little out of my depth here, so I won't comment.
>
> > A third issue we noticed is the usage of functional indexes. If you create
> > an index on, for instance, (username, leaky_function(my_field)), and then
> > search on leaky_functon(my_field) = 42, the functional index can be used
> > only if leaky_function is marked leakproof, even if it is never going to
> > be executed on invalid rows thanks to the index. After a quick look at
> > the source code, it also seems complicated to implement since the
> > decision to reject potential dangerous calls to leaky_function is done
> > really early in the process, before the optimizer starts.
>
> If there is a bitmap index scan, the condition will be rechecked, so the
> function will be executed.

That's exactly my point: using a bitmap index scan would be dangerous and thus
should not be allowed, but an index scan works fine. Or the recheck on bitmap
scan must first check the RLS condition before doing its check.

> > I am willing to spend some time on these issues, but I have no specific
> > knowledge of the planner and optimizer, and I fear proper fixes would
> > require much digging there. If you think it would be appropriate for
> > functions like arraycontains or range_contains to require the 'inner'
> > comparison function to be leakproof, or just keep looking at the other
> > functions in pg_proc and leakproof the ones that can be, I would be happy
> > to write the corresponding patches.
>
> Thanks, and I think that every function that can safely be marked leakproof
> is a progress!

Thank you for your comments





Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/19/19 6:43 PM, Laurenz Albe wrote:
> Pierre Ducroquet wrote:
>> if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
>> in functions leakproof would simply be to prevent them from logging errors
>> containing values.
>>
>> For others, like arraycontains, it's much trickier :[...]
>
> I feel a little out of my depth here, so I won't comment.

I had more-or-less the same idea, and even did a PoC patch (not
preventing the log entries but rather redacting the variable data), but
after discussing offlist with some other hackers I got the impression
that it would be a really hard sell.

It does seem to me though that such a feature would be pretty useful.
Something like use a GUC to turn it on, and while on log messages get
redacted. If you really needed to see the details, you could either
duplicate your issue on another installation with the redaction turned
off, or maybe turn it off on production in a controlled manner just long
enough to capture the full error message.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Tom Lane
Date:
Pierre Ducroquet <p.psql@pinaraf.info> writes:
> For simple functions like enum_eq/enum_ne, marking them leakproof is an 
> obvious fix (patch attached to this email, including also textin/textout).

This is not nearly as "obvious" as you think.  See prior discussions,
notably
https://www.postgresql.org/message-id/flat/31042.1546194242%40sss.pgh.pa.us

Up to now we've taken a very strict definition of what leakproofness
means; as Noah stated, if a function can throw errors for some inputs,
it's not considered leakproof, even if those inputs should never be
encountered in practice.  Most of the things we've been willing to
mark leakproof are straight-line code that demonstrably can't throw
any errors at all.

The previous thread seemed to have consensus that the hazards in
text_cmp and friends were narrow enough that nobody had a practical
problem with marking them leakproof --- but we couldn't define an
objective policy statement that would allow making such decisions,
so nothing's been changed as yet.  I think it is important to have
an articulable policy about this, not just a seat-of-the-pants
conclusion about the risk level in a particular function.

            regards, tom lane


Re: Row Level Security − leakproof-ness and performance implications

From
Pierre Ducroquet
Date:
On Wednesday, February 20, 2019 5:24:17 PM CET Tom Lane wrote:
> Pierre Ducroquet <p.psql@pinaraf.info> writes:
> > For simple functions like enum_eq/enum_ne, marking them leakproof is an
> > obvious fix (patch attached to this email, including also textin/textout).
> 
> This is not nearly as "obvious" as you think.  See prior discussions,
> notably
> https://www.postgresql.org/message-id/flat/31042.1546194242%40sss.pgh.pa.us
> 
> Up to now we've taken a very strict definition of what leakproofness
> means; as Noah stated, if a function can throw errors for some inputs,
> it's not considered leakproof, even if those inputs should never be
> encountered in practice.  Most of the things we've been willing to
> mark leakproof are straight-line code that demonstrably can't throw
> any errors at all.
> 
> The previous thread seemed to have consensus that the hazards in
> text_cmp and friends were narrow enough that nobody had a practical
> problem with marking them leakproof --- but we couldn't define an
> objective policy statement that would allow making such decisions,
> so nothing's been changed as yet.  I think it is important to have
> an articulable policy about this, not just a seat-of-the-pants
> conclusion about the risk level in a particular function.
> 
>             regards, tom lane


I undestand these decisions, but it makes RLS quite fragile, with numerous un-
documented side-effects. In order to save difficulties from future users, I 
wrote this patch to the documentation, listing the biggest restrictions I hit 
with RLS so far.

Regards

 Pierre

Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/20/19 11:24 AM, Tom Lane wrote:
> Pierre Ducroquet <p.psql@pinaraf.info> writes:
>> For simple functions like enum_eq/enum_ne, marking them leakproof is an
>> obvious fix (patch attached to this email, including also textin/textout).
>
> This is not nearly as "obvious" as you think.  See prior discussions,
> notably
> https://www.postgresql.org/message-id/flat/31042.1546194242%40sss.pgh.pa.us
>
> Up to now we've taken a very strict definition of what leakproofness
> means; as Noah stated, if a function can throw errors for some inputs,
> it's not considered leakproof, even if those inputs should never be
> encountered in practice.  Most of the things we've been willing to
> mark leakproof are straight-line code that demonstrably can't throw
> any errors at all.
>
> The previous thread seemed to have consensus that the hazards in
> text_cmp and friends were narrow enough that nobody had a practical
> problem with marking them leakproof --- but we couldn't define an
> objective policy statement that would allow making such decisions,
> so nothing's been changed as yet.  I think it is important to have
> an articulable policy about this, not just a seat-of-the-pants
> conclusion about the risk level in a particular function.

What if we provided an option to redact all client messages (leaving
logged messages as-is). Separately we could provide a GUC to force all
functions to be resolved as leakproof. Depending on your requirements,
having both options turned on could be perfectly acceptable.

Patch for discussion attached.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Robert Haas
Date:
On Wed, Feb 27, 2019 at 6:03 PM Joe Conway <mail@joeconway.com> wrote:
> Patch for discussion attached.

So... you're just going to replace ALL error messages of any kind with
"ERROR: missing error text" when this option is enabled?  That sounds
unusable.  I mean if I'm reading it right this would get not only
messages from SQL-callable functions but also things like "deadlock
detected" and "could not read block %u in file %s" and "database is
not accepting commands to avoid wraparound data loss in database with
OID %u".  You can't even shut it off conveniently, because the way
you've designed it it has to be PGC_POSTMASTER to avoid TOCTTOU
vulnerabilities.  Maybe I'm misreading the patch?

I don't think it would be crazy to have a mode where we try to redact
the particular error messages that might leak information, but I think
we'd need to make it only those.  A wild idea might be to let
proleakproof take on three values: yes, no, and maybe.  When 'maybe'
functions are involved, we tell them whether or not the current query
involves any security barriers, and if so they self-censor.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/28/19 9:12 AM, Robert Haas wrote:
> On Wed, Feb 27, 2019 at 6:03 PM Joe Conway <mail@joeconway.com> wrote:
>> Patch for discussion attached.
>
> So... you're just going to replace ALL error messages of any kind with
> "ERROR: missing error text" when this option is enabled?  That sounds
> unusable.  I mean if I'm reading it right this would get not only
> messages from SQL-callable functions but also things like "deadlock
> detected" and "could not read block %u in file %s" and "database is
> not accepting commands to avoid wraparound data loss in database with
> OID %u".  You can't even shut it off conveniently, because the way
> you've designed it it has to be PGC_POSTMASTER to avoid TOCTTOU
> vulnerabilities.  Maybe I'm misreading the patch?

You have it correct.

I completely disagree that is is unusable though. The way I envision
this is that you enable force_leakproof on your development machine
without suppress_client_messages being turned on. Do your debugging there.

On production, both are turned on. You still get full unredacted
messages in your pg log. The client on a prod system does not need these
details. If you *really* need to, you can restart to turn it on for a
short while on prod, but hopefully you have a non prod system where you
reproduce issues for debugging anyway.

I am not married to making this only changeable via restart though --
that's why I posted the patch for discussion. Perhaps a superuserset
would be better so debugging could be done on one session only on the
prod machine.

> I don't think it would be crazy to have a mode where we try to redact
> the particular error messages that might leak information, but I think
> we'd need to make it only those.  A wild idea might be to let
> proleakproof take on three values: yes, no, and maybe.  When 'maybe'
> functions are involved, we tell them whether or not the current query
> involves any security barriers, and if so they self-censor.

Again, I disagree. See above -- you have all you need in the server logs.

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Dean Rasheed
Date:
On Thu, 28 Feb 2019 at 14:13, Robert Haas <robertmhaas@gmail.com> wrote:
> A wild idea might be to let
> proleakproof take on three values: yes, no, and maybe.  When 'maybe'
> functions are involved, we tell them whether or not the current query
> involves any security barriers, and if so they self-censor.
>

Does self-censoring mean that they might still throw an error for some
inputs, but that error won't reveal any information about the input
values? That's not entirely consistent with my understanding of the
definition of leakproof, but maybe there are situations where that
amount of information leakage would be OK. So maybe we could have
"strictly leakproof" functions that never throw errors and "weakly
leakproof" functions (needs a better name) that can throw errors, as
long as those errors don't include data values. Then we could allow
strict and weak security barriers on a per-table basis, depending on
how sensitive the data is in each table (I'm not a fan of using GUCs
to control this).

Regards,
Dean


Re: Row Level Security − leakproof-ness and performance implications

From
Joshua Brindle
Date:
On Thu, Feb 28, 2019 at 9:12 AM Robert Haas <robertmhaas@gmail.com> wrote:

Hi, Robert, it has been a while :)

>
> So... you're just going to replace ALL error messages of any kind with
> "ERROR: missing error text" when this option is enabled?  That sounds
> unusable.  I mean if I'm reading it right this would get not only
> messages from SQL-callable functions but also things like "deadlock
> detected" and "could not read block %u in file %s" and "database is
> not accepting commands to avoid wraparound data loss in database with
> OID %u".  You can't even shut it off conveniently, because the way

This makes complete sense to me. The client side of a client/server
protocol doesn't have any way to fix 'could not read block %u in file
%s', the client doesn't need that kind of detailed information about a
server, and in fact that information could be security sensitive.
Imagine connecting to a webserver with a web browser and getting a
similar message.

When something critical happens the servers logs can be viewed and the
error can be addressed there, on the server.


Re: Row Level Security − leakproof-ness and performance implications

From
Chapman Flack
Date:
On 2/28/19 9:52 AM, Dean Rasheed wrote:

> Does self-censoring mean that they might still throw an error for some
> inputs, but that error won't reveal any information about the input
> values? That's not entirely consistent with my understanding of the
> definition of leakproof

That's the question I was also preparing to ask ... I understood the
definition to exclude even the possibility that some inputs could
produce errors.

> amount of information leakage would be OK. So maybe we could have
> "strictly leakproof" functions that never throw errors and "weakly
> leakproof" functions (needs a better name) that can throw errors, as
> long as those errors don't include data values. Then we could allow
> strict and weak security barriers on a per-table basis

Interesting idea. I wonder if the set { strictly, weakly } would be
better viewed as a user-definable set (a site might define "leakproof
wrt HIPAA", "leakproof wrt FERPA", etc.), and then configure which
combination of leakproof properties must apply where.

OTOH, I'd have to wonder about the feasibility of auditing code for
leakproofness at that kind of granularity.

Regards,
-Chap


Re: Re: Row Level Security − leakproof-ness and performance implications

From
Tom Lane
Date:
Joshua Brindle <joshua.brindle@crunchydata.com> writes:
> On Thu, Feb 28, 2019 at 9:12 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> So... you're just going to replace ALL error messages of any kind with
>> "ERROR: missing error text" when this option is enabled?  That sounds
>> unusable.  I mean if I'm reading it right this would get not only
>> messages from SQL-callable functions but also things like "deadlock
>> detected" and "could not read block %u in file %s" and "database is
>> not accepting commands to avoid wraparound data loss in database with
>> OID %u".  You can't even shut it off conveniently, because the way

> This makes complete sense to me. The client side of a client/server
> protocol doesn't have any way to fix 'could not read block %u in file
> %s', the client doesn't need that kind of detailed information about a
> server, and in fact that information could be security sensitive.

I agree with Robert that this idea is a nonstarter.  Not only is it
a complete disaster from a usability standpoint, but *it does not
fix the problem*.  The mere fact that an error occurred, or didn't,
is already an information leak.  Sure, you can only extract one bit
per query that way, but a slow leak is still a leak.  See the Spectre
vulnerability for a pretty exact parallel.

The direction I think we're going to need to go in is to weaken our
standards for what we'll consider a leakproof function, and/or try
harder to make common WHERE-clause operators leakproof.  The thread
over at
https://www.postgresql.org/message-id/flat/7DF52167-4379-4A1E-A957-90D774EBDF21%40winand.at
raises the question of why we don't expect that *all* indexable
operators are leakproof, at least with respect to the index column
contents.  (Failing to distinguish which of the inputs can be
leaked seems like a pretty fundamental mistake in hindsight.
For instance, some opclasses can directly index regex operators,
and one would not wish to give up the ability for a regex operator
to complain about an invalid pattern.  But it could be leakproof
with respect to the data side.)

            regards, tom lane


Re: Re: Row Level Security − leakproof-ness and performance implications

From
Joshua Brindle
Date:
On Thu, Feb 28, 2019 at 10:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Joshua Brindle <joshua.brindle@crunchydata.com> writes:
> > On Thu, Feb 28, 2019 at 9:12 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >> So... you're just going to replace ALL error messages of any kind with
> >> "ERROR: missing error text" when this option is enabled?  That sounds
> >> unusable.  I mean if I'm reading it right this would get not only
> >> messages from SQL-callable functions but also things like "deadlock
> >> detected" and "could not read block %u in file %s" and "database is
> >> not accepting commands to avoid wraparound data loss in database with
> >> OID %u".  You can't even shut it off conveniently, because the way
>
> > This makes complete sense to me. The client side of a client/server
> > protocol doesn't have any way to fix 'could not read block %u in file
> > %s', the client doesn't need that kind of detailed information about a
> > server, and in fact that information could be security sensitive.
>
> I agree with Robert that this idea is a nonstarter.  Not only is it
> a complete disaster from a usability standpoint, but *it does not
> fix the problem*.  The mere fact that an error occurred, or didn't,
> is already an information leak.  Sure, you can only extract one bit
> per query that way, but a slow leak is still a leak.  See the Spectre
> vulnerability for a pretty exact parallel.

How is leakproof defined WRT Postgres? Generally speaking a 1 bit
error path would be considered a covert channel on most systems and
are relatively slow even compared to e.g., timing channels.

Redacting error information, outside of the global leakproof setting,
seems useful to prevent data leakage to a client on another system,
such as Robert's example above "could not read block %u in file %s".

Although, and Joe may hate me for saying this, I think only the
non-constants should be redacted to keep some level of usability for
regular SQL errors. Maybe system errors like the above should be
removed from client messages in general.

> The direction I think we're going to need to go in is to weaken our
> standards for what we'll consider a leakproof function, and/or try
> harder to make common WHERE-clause operators leakproof.  The thread
> over at
> https://www.postgresql.org/message-id/flat/7DF52167-4379-4A1E-A957-90D774EBDF21%40winand.at
> raises the question of why we don't expect that *all* indexable
> operators are leakproof, at least with respect to the index column
> contents.  (Failing to distinguish which of the inputs can be
> leaked seems like a pretty fundamental mistake in hindsight.
> For instance, some opclasses can directly index regex operators,
> and one would not wish to give up the ability for a regex operator
> to complain about an invalid pattern.  But it could be leakproof
> with respect to the data side.)
>
>                         regards, tom lane


Re: Row Level Security − leakproof-ness and performance implications

From
Chapman Flack
Date:
On 2/28/19 11:03 AM, Joshua Brindle wrote:

> How is leakproof defined WRT Postgres? Generally speaking a 1 bit

From the CREATE FUNCTION reference page:

LEAKPROOF indicates that the function has no side effects. It reveals no
information about its arguments other than by its return value. For
example, a function which *throws an error message for some argument
values but not others*, or which includes the argument values in any
error message, is not leakproof.

(*emphasis* mine.)

Regards,
-Chap


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/28/19 11:03 AM, Joshua Brindle wrote:
> On Thu, Feb 28, 2019 at 10:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Joshua Brindle <joshua.brindle@crunchydata.com> writes:
>> > On Thu, Feb 28, 2019 at 9:12 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> >> So... you're just going to replace ALL error messages of any kind with
>> >> "ERROR: missing error text" when this option is enabled?  That sounds
>> >> unusable.  I mean if I'm reading it right this would get not only
>> >> messages from SQL-callable functions but also things like "deadlock
>> >> detected" and "could not read block %u in file %s" and "database is
>> >> not accepting commands to avoid wraparound data loss in database with
>> >> OID %u".  You can't even shut it off conveniently, because the way
>>
>> > This makes complete sense to me. The client side of a client/server
>> > protocol doesn't have any way to fix 'could not read block %u in file
>> > %s', the client doesn't need that kind of detailed information about a
>> > server, and in fact that information could be security sensitive.
>>
>> I agree with Robert that this idea is a nonstarter.  Not only is it
>> a complete disaster from a usability standpoint, but *it does not
>> fix the problem*.  The mere fact that an error occurred, or didn't,
>> is already an information leak.  Sure, you can only extract one bit
>> per query that way, but a slow leak is still a leak.  See the Spectre
>> vulnerability for a pretty exact parallel.
>
> How is leakproof defined WRT Postgres? Generally speaking a 1 bit
> error path would be considered a covert channel on most systems and
> are relatively slow even compared to e.g., timing channels.

Yes, I am pretty sure there are plenty of very security conscious
environments that would be willing to make this tradeoff in order to get
reliable RLS performance.

> Redacting error information, outside of the global leakproof setting,
> seems useful to prevent data leakage to a client on another system,
> such as Robert's example above "could not read block %u in file %s".

True

> Although, and Joe may hate me for saying this, I think only the
> non-constants should be redacted to keep some level of usability for
> regular SQL errors. Maybe system errors like the above should be
> removed from client messages in general.

I started down this path and it looked fragile. I guess if there is
generally enough support to think this might be viable I could open up
that door again, but I don't want to waste time if the approach is
really a non-starter as stated upthread :-/.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Joshua Brindle
Date:
On Thu, Feb 28, 2019 at 11:14 AM Joe Conway <mail@joeconway.com> wrote:
>
> > Although, and Joe may hate me for saying this, I think only the
> > non-constants should be redacted to keep some level of usability for
> > regular SQL errors. Maybe system errors like the above should be
> > removed from client messages in general.
>
> I started down this path and it looked fragile. I guess if there is
> generally enough support to think this might be viable I could open up
> that door again, but I don't want to waste time if the approach is
> really a non-starter as stated upthread :-/.
>

The only non-starter for Tom was weakening leakproof, right? Can we
keep the suppression, and work on strengthening leakproof as a
separate activity?


Re: Row Level Security − leakproof-ness and performance implications

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 11:14 AM Joe Conway <mail@joeconway.com> wrote:
> > Although, and Joe may hate me for saying this, I think only the
> > non-constants should be redacted to keep some level of usability for
> > regular SQL errors. Maybe system errors like the above should be
> > removed from client messages in general.
>
> I started down this path and it looked fragile. I guess if there is
> generally enough support to think this might be viable I could open up
> that door again, but I don't want to waste time if the approach is
> really a non-starter as stated upthread :-/.

Hmm.  It seems to me that if there's a function that sometimes throws
an error and other times does not, and if that behavior is dependent
on the input, then even redacting the error message down to 'ERROR:
error' does not remove the leak.  So it seems to me that regardless of
what one thinks about the proposal from a usability perspective, it's
probably not correct from a security standpoint.  Information that
couldn't be leaked until present rules would leak with this change,
when the new GUCs were turned on.

Am I wrong?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/28/19 11:37 AM, Robert Haas wrote:
> On Thu, Feb 28, 2019 at 11:14 AM Joe Conway <mail@joeconway.com> wrote:
>> > Although, and Joe may hate me for saying this, I think only the
>> > non-constants should be redacted to keep some level of usability for
>> > regular SQL errors. Maybe system errors like the above should be
>> > removed from client messages in general.
>>
>> I started down this path and it looked fragile. I guess if there is
>> generally enough support to think this might be viable I could open up
>> that door again, but I don't want to waste time if the approach is
>> really a non-starter as stated upthread :-/.
>
> Hmm.  It seems to me that if there's a function that sometimes throws
> an error and other times does not, and if that behavior is dependent
> on the input, then even redacting the error message down to 'ERROR:
> error' does not remove the leak.  So it seems to me that regardless of
> what one thinks about the proposal from a usability perspective, it's
> probably not correct from a security standpoint.  Information that
> couldn't be leaked until present rules would leak with this change,
> when the new GUCs were turned on.
>
> Am I wrong?


No, and Tom stated as much too, but life is all about tradeoffs. Some
people will find this an acceptable compromise. For those that don't
they don't have to use it. IMHO we tend toward too much nannyism too often.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 11:44 AM Joe Conway <mail@joeconway.com> wrote:
> No, and Tom stated as much too, but life is all about tradeoffs. Some
> people will find this an acceptable compromise. For those that don't
> they don't have to use it. IMHO we tend toward too much nannyism too often.

Well, I agree with that, too.

Hmm.  I don't think there's anything preventing you from implementing
this in "userspace," is there?  A logging hook could suppress all
error message text, and you could just mark all functions leakproof
after that, and you'd have this exact behavior in an existing release
with no core code changes, I think.

If you do that, or just stick this patch into your own distro, I would
be interested to hear some experiences from customers (and those who
support them) after some time had gone by.  I find it hard to imagine
delivering customer support in an environment configured this way, but
sometimes my imagination is limited.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/28/19 11:50 AM, Robert Haas wrote:
> On Thu, Feb 28, 2019 at 11:44 AM Joe Conway <mail@joeconway.com> wrote:
>> No, and Tom stated as much too, but life is all about tradeoffs. Some
>> people will find this an acceptable compromise. For those that don't
>> they don't have to use it. IMHO we tend toward too much nannyism too often.
>
> Well, I agree with that, too.
>
> Hmm.  I don't think there's anything preventing you from implementing
> this in "userspace," is there?  A logging hook could suppress all
> error message text, and you could just mark all functions leakproof
> after that, and you'd have this exact behavior in an existing release
> with no core code changes, I think.

I think that would affect the server logs too, no? Worth thinking about
though...

Also manually marking all functions leakproof is far less convenient
than turning off the check as this patch effectively allows. You would
want to keep track of the initial condition and be able to restore it if
needed. Doable but much uglier. Perhaps we could tolerate a hook that
would allow an extension to do this though?

> If you do that, or just stick this patch into your own distro, I would
> be interested to hear some experiences from customers (and those who
> support them) after some time had gone by.  I find it hard to imagine
> delivering customer support in an environment configured this way, but
> sometimes my imagination is limited.

Again, remember that the actual messages are available in the server
logs. The presumption is that the server logs are kept secure, and it is
ok to leak information into them. How the customer does or does not
decide to pass some of that information on to a support group becomes a
problem to deal with on a case by case basis.

Also, as mentioned up-thread, in many cases there is or should be a
non-production instance available to use for reproducing problems to
debug them. Presumably the data on such a system is faked or has already
been cleaned up for a wider audience.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Row Level Security − leakproof-ness and performance implications

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 12:05 PM Joe Conway <mail@joeconway.com> wrote:
> I think that would affect the server logs too, no? Worth thinking about
> though...

Yeah, I suppose so, although there might be a way to work around that.

> Also manually marking all functions leakproof is far less convenient
> than turning off the check as this patch effectively allows. You would
> want to keep track of the initial condition and be able to restore it if
> needed. Doable but much uglier. Perhaps we could tolerate a hook that
> would allow an extension to do this though?

Yeah, possibly.  I guess we'd have to see how ugly that looks, but....

> Again, remember that the actual messages are available in the server
> logs. The presumption is that the server logs are kept secure, and it is
> ok to leak information into them. How the customer does or does not
> decide to pass some of that information on to a support group becomes a
> problem to deal with on a case by case basis.
>
> Also, as mentioned up-thread, in many cases there is or should be a
> non-production instance available to use for reproducing problems to
> debug them. Presumably the data on such a system is faked or has already
> been cleaned up for a wider audience.

Mmmph.  If your customers always have a non-production instance where
problems from production can be easily reproduced, your customers are
not much like our customers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 2/28/19 12:28 PM, Robert Haas wrote:
> Mmmph.  If your customers always have a non-production instance where
> problems from production can be easily reproduced, your customers are
> not much like our customers.

Well I certainly did not mean to imply that this is always the case ;-)

But I think it is fair to tell customers that have these tradeoffs in
front of them that it would be even more wise in the case they decided
to use this capability.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: Row Level Security − leakproof-ness and performance implications

From
Peter Eisentraut
Date:
On 2019-02-21 15:56, Pierre Ducroquet wrote:
> I undestand these decisions, but it makes RLS quite fragile, with numerous un-
> documented side-effects. In order to save difficulties from future users, I 
> wrote this patch to the documentation, listing the biggest restrictions I hit 
> with RLS so far.

This appears to be the patch of record for this commit fest entry.

I agree that it would be useful to document and discuss better which
built-in operators are leak-proof and which are not.  But I don't think
the CREATE POLICY reference page is the place to do it.  Note that the
leak-proofness mechanism was originally introduced for security-barrier
views (an early form of RLS if you will), so someone could also
reasonably expect a discussion there.

I'm not sure of the best place to put it.  Perhaps adding a section to
the Functions and Operators chapter would work.

Also, once you start such a list, there will be an expectation that it's
complete.  So that would need to be ensured.  You only list a few things
you found.  Are there others?  How do we keep this up to date?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Row Level Security − leakproof-ness and performance implications

From
Peter Eisentraut
Date:
On 2019-02-28 00:03, Joe Conway wrote:
> What if we provided an option to redact all client messages (leaving
> logged messages as-is). Separately we could provide a GUC to force all
> functions to be resolved as leakproof. Depending on your requirements,
> having both options turned on could be perfectly acceptable.

There are two commit fest entries for this thread, one in Pierre's name
and one in yours.  Is your entry for the error message redacting
functionality?  I think that approach has been found not to actually
satisfy the leakproofness criteria.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Row Level Security − leakproof-ness and performance implications

From
Joe Conway
Date:
On 3/18/19 3:52 PM, Peter Eisentraut wrote:
> On 2019-02-28 00:03, Joe Conway wrote:
>> What if we provided an option to redact all client messages (leaving
>> logged messages as-is). Separately we could provide a GUC to force all
>> functions to be resolved as leakproof. Depending on your requirements,
>> having both options turned on could be perfectly acceptable.
>
> There are two commit fest entries for this thread, one in Pierre's name
> and one in yours.  Is your entry for the error message redacting
> functionality?  I think that approach has been found not to actually
> satisfy the leakproofness criteria.


It is a matter of opinion with regard to what the criteria actually is,
and when it ought to apply. But in any case the clear consensus was
against me, so I guess I'll assume "my patch was rejected by PostgreSQL
all I got was this tee shirt" (...I know I have one that says something
like that somewhere...) ;-)

I have no idea what the other entry is all about as I have not had the
time to look.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Re: Row Level Security − leakproof-ness and performance implications

From
David Steele
Date:
Hi Pierre,

On 3/18/19 8:13 PM, Joe Conway wrote:
> 
> I have no idea what the other entry is all about as I have not had the
> time to look.

There doesn't seem to be consensus on your patch, either -- I'm planning 
to mark it rejected at the end of the CF unless you have a new patch for 
consideration.

This thread got a bit hijacked and is hard to follow.  If you do submit 
a new patch I recommend creating a new thread.

Regards,
-- 
-David
david@pgmasters.net



Re: Row Level Security − leakproof-ness and performance implications

From
Peter Eisentraut
Date:
On 2019-03-18 20:08, Peter Eisentraut wrote:
> I agree that it would be useful to document and discuss better which
> built-in operators are leak-proof and which are not.  But I don't think
> the CREATE POLICY reference page is the place to do it.  Note that the
> leak-proofness mechanism was originally introduced for security-barrier
> views (an early form of RLS if you will), so someone could also
> reasonably expect a discussion there.

It sounds like this will need significant additional work, so setting as
returned with feedback for now.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services