Row Level Security − leakproof-ness and performance implications - Mailing list pgsql-hackers

From Pierre Ducroquet
Subject Row Level Security − leakproof-ness and performance implications
Date
Msg-id 2811772.0XtDgEdalL@peanuts2
Whole thread Raw
Responses Re: Row Level Security − leakproof-ness andperformance implications
Re: Row Level Security − leakproof-ness and performance implications
List pgsql-hackers
Hello

In order to increase our security, we have started deploying row-level 
security in order to add another safety net if any issue was to happen in our 
applications.
After a careful monitoring of our databases, we found out that a lot of 
queries started to go south, going extremely slow.
The root of these slowdowns is that a lot of the PostgreSQL functions are not 
marked as leakproof, especially the ones used for operators.
In current git master, the following query returns 258 functions that are used 
by operators returning booleans and not marked leakproof:

SELECT proname FROM pg_proc 
        WHERE exists(select 1 from pg_operator where oprcode::name = proname) 
        AND NOT(proleakproof) 
        AND prorettype = 16;


Among these, if you have for instance a table like:
create table demo_enum (
        username text,
        flag my_enum
);

With an index on (username, flag), the index will only be used on username 
because flag is an enum and the enum_eq function is not marked leakproof.

For simple functions like enum_eq/enum_ne, marking them leakproof is an 
obvious fix (patch attached to this email, including also textin/textout). And 
if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
in functions leakproof would simply be to prevent them from logging errors 
containing values.

For others, like arraycontains, it's much trickier : arraycontains can be 
leakproof only if the comparison function of the inner type is leakproof too. 
This raises some interesting issues, like arraycontains being marked as strict 
parallel safe while there is nothing making it mandatory for the inner type.
In order to optimize queries on a table like:
create table demo_array (username text, my_array int[]);
one would need to mark arraycontains leakproof, thus implying that any 
comparison operator must be leakproof too? Another possibility, not that 
complicated, would be to create specific-types entries in pg_proc for each 
type that has a leakproof comparison operator. Or the complicated path would 
be to have a 'dynamic' leakproof for functions like arraycontains that depend 
on the inner type, but since this is determined at planning, it seems very 
complicated to implement.

A third issue we noticed is the usage of functional indexes. If you create an 
index on, for instance, (username, leaky_function(my_field)), and then search 
on leaky_functon(my_field) = 42, the functional index can be used only if 
leaky_function is marked leakproof, even if it is never going to be executed 
on invalid rows thanks to the index. After a quick look at the source code, it 
also seems complicated to implement since the decision to reject potential 
dangerous calls to leaky_function is done really early in the process, before 
the optimizer starts.


I am willing to spend some time on these issues, but I have no specific 
knowledge of the planner and optimizer, and I fear proper fixes would require 
much digging there. If you think it would be appropriate for functions like 
arraycontains or range_contains to require the 'inner' comparison function to 
be leakproof, or just keep looking at the other functions in pg_proc and 
leakproof the ones that can be, I would be happy to write the corresponding 
patches.


Best regards


Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: WAL insert delay settings
Next
From: Tomas Vondra
Date:
Subject: Re: WAL insert delay settings