sandboxing untrusted code - Mailing list pgsql-hackers

From: Robert Haas
Subject: sandboxing untrusted code
Msg-id: CA+TgmoaHpmz9-7ybB17B2qpDoqsi7=OWigc-3VBctb6B_x8bKA@mail.gmail.com
In response to: Re: Non-superuser subscription owners (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers
On Mon, Feb 27, 2023 at 7:37 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2023-02-27 at 16:13 -0500, Robert Haas wrote:
> > On Mon, Feb 27, 2023 at 1:25 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > > I think you are saying that we should still run Alice's code with
> > > the
> > > privileges of Bob, but somehow make that safe(r) for Bob. Is that
> > > right?
> >
> > Yeah. That's the idea I was floating, at least.
>
> Isn't that a hard problem; maybe impossible?

I want to flesh out the ideas I previously articulated in this area a bit more.

As a refresher, the scenario I'm talking about is any one in which one
user, who I'll call Bob, does something that results in executing code
provided by another user, who I'll call Alice. The most obvious way
that this can happen is if Bob performs some operation that targets a
table owned by Alice. That operation might be DML, like an INSERT or
UPDATE, or it might be a maintenance command that can cause code
execution, like REINDEX, which evaluates index expressions. The code
being executed might run either as Alice or as Bob, depending on how
it's attached to the table, what operation is being performed, and
whether any function or procedure containing it is SECURITY INVOKER
or SECURITY DEFINER. Regardless of the details, our concern is that
Alice's code
might do something that Bob does not like. This is a particularly
lively concern if the code happens to be running with the privileges
of Bob, because then Alice might try to do something like access
objects for which Bob has permissions and Alice does not. But the
problems don't completely go away if the code is being run as Alice,
because even then, Alice could try to manipulate the session state in
some way that will cause Bob to hose himself later on. The existing
SECURITY_RESTRICTED_OPERATION flag defends against some scenarios of
this type, but at present we also rely heavily on Bob being *very*
careful, as Jeff has highlighted rather compellingly.
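To make that concrete, here's a minimal sketch of the simplest
version of the setup I have in mind; all object names are invented
for illustration:

-- As Alice: a table whose default expression executes
-- Alice-provided code whenever a default value is needed.
CREATE FUNCTION alice_default() RETURNS integer AS $$
BEGIN
    -- Arbitrary Alice-provided code runs here. With the default
    -- SECURITY INVOKER, it runs with the caller's permissions.
    RETURN 42;
END;
$$ LANGUAGE plpgsql;

CREATE TABLE alice_table (a integer, b integer DEFAULT alice_default());
GRANT INSERT ON alice_table TO bob;

-- As Bob: this evaluates the default expression, running Alice's
-- code with Bob's permissions.
INSERT INTO alice_table (a) VALUES (1);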

I think we can do better, both in the case where Bob is running code
provided by Alice using his own permissions, and also in the case
where Bob is running code provided by Alice using Alice's permissions.
To that end, I'd like to define a few terms. First, let's define the
provider of a piece of code as either (a) the owner of the function or
procedure that contains it or (b) the owner of the object to which
it's directly attached or (c) the session user, for code directly
entered at top level. For example, if Alice owns a table T1 and
applies a default expression which uses a function provided by
Charlie, and Bob then inserts into T1, then Bob provides the insert
statement, Alice provides the default expression, and Charlie provides
the code inside the function. I assert that in every context where
PostgreSQL evaluates expressions or runs SQL statements, there's a
well-defined provider for the expression or statement, and we can make
the system track it if we want to. Second, I'd like to define trust.
Users trust themselves, and they also trust users who have a superset
of their permissions, a category that most typically just includes
superusers but could include others if role grants are in use. A user
can also declare through some mechanism or other that they trust
another user even if that other user does not have a superset of their
permissions. Such a declaration carries the risk that the trusted user
could hijack the trusting user's permissions; we would document and
disclaim this risk.
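Here's that three-party example in SQL form, just to pin down the
terminology (again, names invented for illustration):

-- As Charlie:
CREATE FUNCTION charlie_func() RETURNS integer
    AS $$ SELECT 42 $$ LANGUAGE sql;

-- As Alice:
CREATE TABLE t1 (a integer, b integer DEFAULT charlie_func());
GRANT INSERT ON t1 TO bob;

-- As Bob. Bob provides the INSERT statement, Alice provides the
-- default expression, and Charlie provides the code inside
-- charlie_func().
INSERT INTO t1 (a) VALUES (1);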

Finally, let's define sandboxing. When code is sandboxed, the set of
operations that it is allowed to perform is restricted. Sandboxing
isn't necessarily all or nothing; there can be different categories of
operations and we can allow some and deny others, if we wish.
Obviously this is quite a bit of work to implement, but I don't think
it's unmanageable. YMMV. To keep things simple for purposes of
discussion, I'm going to just define two levels of sandboxing for the
moment; I think we might want more. If code is fully sandboxed, it can
only do the following things:

1. Compute stuff. There's no restriction on the permissible amount of
compute; if you call untrusted code, nothing prevents it from running
forever.
2. Call other code. This may be done by a function call or a command
such as CALL or DO, all subject to the usual permissions checks but no
further restrictions.
3. Access the current session state, without modifying it. For
example, executing SHOW or current_setting() is fine.
4. Transiently modify the current session state in ways that are
necessarily reversed before returning to the caller. For example, an
EXCEPTION block or a configuration change driven by proconfig is fine.
5. Produce messages at any log level. This includes any kind of ERROR.

Fully sandboxed code can't access or modify data beyond what gets
passed to it, with the exception of the session state mentioned above.
This includes data inside of PostgreSQL, like tables or statistics, as
well as data outside of PostgreSQL, like files that it might try to
read by calling pg_read_file(). If it tries, an ERROR occurs.
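For illustration, here's the flavor of function that would still work
when fully sandboxed, and one that would not. This is just a sketch
of the intended behavior; nothing enforces it today:

-- Pure computation plus read-only access to session state: all of
-- this would be permitted under full sandboxing.
CREATE FUNCTION sandbox_ok(x integer, y integer) RETURNS text AS $$
DECLARE
    sp text := current_setting('search_path');  -- reading state: OK
BEGIN
    RETURN (x + y)::text || ' [' || sp || ']';
EXCEPTION WHEN others THEN
    -- An EXCEPTION block transiently modifies and then restores
    -- session state, which is also OK.
    RETURN 'error';
END;
$$ LANGUAGE plpgsql;

-- By contrast, this would draw an ERROR under full sandboxing,
-- because it reads a table:
CREATE FUNCTION sandbox_not_ok() RETURNS bigint AS $$
    SELECT count(*) FROM pg_class
$$ LANGUAGE sql;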

Partially sandboxed code is much less restricted. Partially sandboxed
code can do almost anything that unsandboxed code can do, but with one
important exception: it can't modify the session state. This means it
can't run commands like CLOSE, DEALLOCATE, DECLARE, DISCARD, EXECUTE,
FETCH, LISTEN, MOVE, PREPARE, or UNLISTEN. Nor can it try to COMMIT or
ROLLBACK the current transaction or set up a SAVEPOINT or ROLLBACK TO
SAVEPOINT. Nor can it use SET or set_config() to change a parameter
value.
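So, for example, each of these would draw an ERROR if attempted by
partially sandboxed code (illustrative only):

SET work_mem = '64MB';                         -- durable SET
SELECT set_config('work_mem', '64MB', false);  -- ditto, is_local = false
PREPARE q AS SELECT 1;                         -- creates session state
LISTEN some_channel;                           -- registers for notifies
DISCARD PLANS;                                 -- clobbers session state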

With those definitions in hand, I think it's possible to propose a
meaningful security model:

Rule #1: If the current user does not trust the provider, the code is
fully sandboxed.
Rule #2: If the session user does not trust the provider either of the
currently-running code or of any other code that's still on the call
stack, the code is partially sandboxed.

Let's take a few examples. First, suppose Alice has a table and it has
some associated code for which the provider is always Alice. That is,
she may have default expressions or index expressions for which she is
necessarily the provider, and she may have triggers, but in this
example she owns the functions or procedures called by those triggers
and is thus the provider for those as well. Now, Bob, who does not
trust Alice, does something to Alice's table. The code might run as
Bob (by default) and then it will be fully sandboxed because of rule
#1. Or there might be a SECURITY DEFINER function or procedure
involved causing the code to run as Alice, in which case the code will
be partially sandboxed because of rule #2. I argue that Bob is pretty
safe here. Alice can't make any durable changes to Bob's session state
no matter what she does, and if she provides code that runs as Bob it
can only do innocuous things like calculating x+y or x || y or running
generate_series() or examining current_role. Yes, it could go into a
loop, but that doesn't compromise Bob's account: he can hit ^C or set
statement_timeout. If she provides code that runs as herself it can
make use of her privileges (but not Bob's) as long as it doesn't try
to touch the session state. So Bob is pretty safe.
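In SQL terms, the two variants of this first example look something
like this sketch (the sandboxing behavior described is, of course,
proposed, not implemented):

-- As Alice. The trigger function defaults to SECURITY INVOKER, so
-- its body runs as Bob and rule #1 fully sandboxes it:
CREATE FUNCTION alice_trig() RETURNS trigger AS $$
BEGIN
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TABLE t (a integer);
CREATE TRIGGER trg BEFORE INSERT ON t
    FOR EACH ROW EXECUTE FUNCTION alice_trig();
GRANT INSERT ON t TO bob;

-- Making the function run as Alice instead means rule #2 applies,
-- so the body runs partially sandboxed:
ALTER FUNCTION alice_trig() SECURITY DEFINER;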

Now, suppose instead that Bob has a table but some code that is
attached to it can call a function that is owned by Alice. In this
case, as long as everything on the call stack is provided by Bob,
there are no restrictions. But as soon as we enter Alice's function,
the code is fully sandboxed unless it arranges to switch to Alice's
permissions using SECURITY DEFINER, in which case it's still partially
sandboxed. Again, it's hard to see how Alice can get any leverage
here.
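A sketch of that second situation, for concreteness:

-- As Alice:
CREATE FUNCTION alice_func(x integer) RETURNS integer
    AS $$ SELECT x + 1 $$ LANGUAGE sql;

-- As Bob:
CREATE TABLE bob_table (a integer, b integer DEFAULT alice_func(0));

-- Still as Bob: the INSERT and the default expression are both
-- provided by Bob, so no restrictions apply until alice_func() is
-- entered; at that point rule #1 fully sandboxes the function body
-- (or rule #2 partially sandboxes it, were it SECURITY DEFINER).
INSERT INTO bob_table (a) VALUES (1);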

Finally, suppose Alice has a table and attaches a trigger to it that
calls a function provided by Charlie. Bob now does something to this
table that results in the execution of this trigger. If the current
user -- which will be either Alice or Bob depending on whether the
function is SECURITY DEFINER -- does not trust Charlie, the code
inside the trigger is going to run fully sandboxed because of rule #1.
But even if the current user does trust Charlie, rule #2 means that
the code inside the trigger is still going to be partially sandboxed
unless Bob trusts BOTH Alice AND Charlie. This seems appropriate,
because in this situation, either Alice or Charlie could be trying to
fool Bob into taking some action he doesn't intend to take by
tinkering with his session.
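Concretely, the three-party case looks like this (sketch):

-- As Charlie:
CREATE FUNCTION charlie_trig() RETURNS trigger AS $$
BEGIN
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- As Alice:
CREATE TABLE t2 (a integer);
CREATE TRIGGER trg2 BEFORE INSERT ON t2
    FOR EACH ROW EXECUTE FUNCTION charlie_trig();
GRANT INSERT ON t2 TO bob;

-- As Bob: Charlie provides the code inside charlie_trig(), but it's
-- Alice's trigger that put it on the call stack, so under rule #2
-- the code stays at least partially sandboxed unless Bob trusts both.
INSERT INTO t2 (a) VALUES (1);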

In general if we have a great big call stack that involves calling a
whole bunch of functions either as SECURITY INVOKER or as SECURITY
DEFINER, changing the session state is blocked unless the session user
trusts the owners of all of those functions. And if we got to any of
those functions by means of code attached directly to tables, like an
index expression or default expression, changing the session state is
blocked unless the session user also trusts the owners of those
tables.

I see a few obvious objections to this line of attack that someone
might raise, and I'd like to address them now. First, somebody might
argue that this is too hard to implement. I don't think so, because a
lot of things can be blocked from common locations. However, it will
be necessary to go through all the functions we ship and add checks in
a bunch of places to individual functions. That's a pain, but it's not
that different from what we've already done with PARALLEL { SAFE |
RESTRICTED | UNSAFE } or LEAKPROOF. Those weren't terribly enjoyable
exercises for me and I made some mistakes categorizing some things,
but the job got done and those mechanisms are accepted infrastructure
now. Second, somebody might argue that full sandboxing is such a
draconian set of restrictions that it will inconvenience users greatly
or that it's pointless to even allow anything to be executed or
something along those lines. I think that argument has some merit, but
I think the restrictions sound worse than they actually are in
context. For instance, have you ever written a default expression for
a column that would fail under full sandboxing? I wouldn't be
surprised if you have, but I also bet it's a fairly small percentage
of cases. I think a lot of things that people want to do as a
practical matter will be totally fine. I can think of exceptions, most
obviously reading from a random-number generator that has a
user-controllable seed, which technically qualifies as tinkering with
the session state. But a lot of things are going to work fine, and the
things that do fall afoul of a mechanism like this probably deserve
some study and examination. If you're writing index expressions that
do anything more than simple calculation, it's probably fine for the
system to raise an eyebrow about that. Even if they do something as
simple as reading from another table, that's not necessarily going to
dump and restore properly, even if it's secure, because the table
ordering dependencies won't be clear to pg_dump.
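To be concrete about that last point, here's an invented example of a
default expression that reads another table. It works today, but it
would fail under full sandboxing, and it's also exactly the kind of
hidden dependency that pg_dump can't see:

CREATE TABLE t_rates (code text PRIMARY KEY, rate numeric);

-- A subquery isn't allowed directly in a DEFAULT, so people hide
-- the table access inside a function, which is precisely why the
-- dependency is invisible to pg_dump.
CREATE FUNCTION default_rate() RETURNS numeric AS $$
    SELECT rate FROM t_rates WHERE code = 'USD'
$$ LANGUAGE sql;

CREATE TABLE t_orders (code text, rate numeric DEFAULT default_rate());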

And that brings me to another point, which is that we might think of
sandboxing some operations, either by default or unconditionally, for
reasons other than trust or the lack of it. There's a lot of things
that you COULD do in an index expression that you really SHOULD NOT
do. As mentioned, even reading a table is pretty sketchy, but should a
function called from an index expression ever be allowed to execute
DDL? Is it reasonable if such a function wants to execute CREATE
TABLE? Even a temporary table is dubious, and a non-temporary table is
really dubious. What if such a function wants to ALTER ROLE ...
SUPERUSER? I think that's bonkers and should almost certainly be
categorically denied. Probably someone is trying to hack something,
and even if they aren't, it's still nuts. So I would argue that in a
context like an index expression, some amount of sandboxing -- not
necessarily corresponding to either of the levels described above --
is probably a good idea, not based on the relationship between
whatever users are involved, but based rather on the context. There's
room for a lot of bikeshedding here and I don't think this kind of
thing is necessarily the top priority, but I think it's worth thinking
about.
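As an invented illustration of the kind of thing I mean, nothing
today stops an index expression from doing this:

CREATE TABLE some_table (col text);

-- Index expressions must use IMMUTABLE functions, but IMMUTABLE is
-- taken on faith; this one lies and performs DDL as a side effect.
CREATE FUNCTION sketchy(x text) RETURNS text AS $$
BEGIN
    CREATE TEMP TABLE IF NOT EXISTS scratch (v text);
    RETURN lower(x);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE INDEX ON some_table (sketchy(col));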

Finally, I'd like to note that partial sandboxing can be viewed as a
strengthening of restrictions that we already have in the form of
SECURITY_RESTRICTED_OPERATION. I can't claim to be an authority on the
evolution of that flag, but I think that up to this point the general
philosophy has been to insert the smallest possible plug in the dike.
When a definite security problem is discovered, somebody tries to
block just enough stuff to make it not demonstrably insecure. However,
I feel that the surface area for things to go wrong is rather large,
and we'd be better off with a more comprehensive series of
restrictions. We likely have some security issues that haven't been
found yet, and even something we wouldn't classify as a security
vulnerability can still be a pitfall for the unwary. I imagine that
SECURITY_RESTRICTED_OPERATION might end up getting subsumed into what
I'm here calling partial sandboxing, but I'm not quite sure about that
because right now this is just a theoretical description of a system,
not something for which I've written any code.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com


