Row Level Security − leakproof-ness and performance implications - Mailing list pgsql-hackers
From | Pierre Ducroquet |
---|---|
Subject | Row Level Security − leakproof-ness and performance implications |
Date | |
Msg-id | 2811772.0XtDgEdalL@peanuts2 Whole thread Raw |
Responses |
Re: Row Level Security − leakproof-ness andperformance implications
Re: Row Level Security − leakproof-ness and performance implications |
List | pgsql-hackers |
Hello In order to increase our security, we have started deploying row-level security in order to add another safety net if any issue was to happen in our applications. After a careful monitoring of our databases, we found out that a lot of queries started to go south, going extremely slow. The root of these slowdowns is that a lot of the PostgreSQL functions are not marked as leakproof, especially the ones used for operators. In current git master, the following query returns 258 functions that are used by operators returning booleans and not marked leakproof: SELECT proname FROM pg_proc WHERE exists(select 1 from pg_operator where oprcode::name = proname) AND NOT(proleakproof) AND prorettype = 16; Among these, if you have for instance a table like: create table demo_enum ( username text, flag my_enum ); With an index on (username, flag), the index will only be used on username because flag is an enum and the enum_eq function is not marked leakproof. For simple functions like enum_eq/enum_ne, marking them leakproof is an obvious fix (patch attached to this email, including also textin/textout). And if we had a 'RLS-enabled' context on functions, a way to make a lot of built- in functions leakproof would simply be to prevent them from logging errors containing values. For others, like arraycontains, it's much trickier : arraycontains can be leakproof only if the comparison function of the inner type is leakproof too. This raises some interesting issues, like arraycontains being marked as strict parallel safe while there is nothing making it mandatory for the inner type. In order to optimize queries on a table like: create table demo_array (username text, my_array int[]); one would need to mark arraycontains leakproof, thus implying that any comparison operator must be leakproof too? Another possibility, not that complicated, would be to create specific-types entries in pg_proc for each type that has a leakproof comparison operator. Or the complicated path would be to have a 'dynamic' leakproof for functions like arraycontains that depend on the inner type, but since this is determined at planning, it seems very complicated to implement. A third issue we noticed is the usage of functional indexes. If you create an index on, for instance, (username, leaky_function(my_field)), and then search on leaky_functon(my_field) = 42, the functional index can be used only if leaky_function is marked leakproof, even if it is never going to be executed on invalid rows thanks to the index. After a quick look at the source code, it also seems complicated to implement since the decision to reject potential dangerous calls to leaky_function is done really early in the process, before the optimizer starts. I am willing to spend some time on these issues, but I have no specific knowledge of the planner and optimizer, and I fear proper fixes would require much digging there. If you think it would be appropriate for functions like arraycontains or range_contains to require the 'inner' comparison function to be leakproof, or just keep looking at the other functions in pg_proc and leakproof the ones that can be, I would be happy to write the corresponding patches. Best regards
Attachment
pgsql-hackers by date: