Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
Date
Msg-id CA+TgmoYEP40iBW-A9nPfDp8AhGoekPp3aPDFzTgBUrqmfCwZzQ@mail.gmail.com
Whole thread Raw
In response to Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
List pgsql-hackers
On Fri, Sep 22, 2023 at 4:05 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2023-09-21 at 14:06 -0400, Robert Haas wrote:
> > Also, in a case like this, I don't think it's unreasonable to ask
> > whether, perhaps, Bob just needs to be a little more careful about
> > setting search_path.
>
> That's what this whole thread is about: I wish it was reasonable, but I
> don't think the tools we provide today make it reasonable. You expect
> Bob to do something like:
>
>   CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...
>
> for all functions, not just SECURITY DEFINER functions, is that right?

Yes, I do. I think it's self-evident that a SQL function's behavior is
not guaranteed to be invariant under all possible values of
search_path. If you care about your function behaving the same way all
the time, you have to set the search_path.

TBH, I don't see any reasonable way around that requirement. We can
perhaps provide some safeguards that will make it less likely that you
will get completely hosed if your forget, and we could decide to make
SET search_path or some mostly-equivalent thing the default at the
price of pretty large compatibility break, but you can't have
functions that both resolve object references using the caller's
search path and also reliably do what the author intended.

> > You can, I think, be expected to
> > check that functions you define have SET search_path attached.
>
> We've already established that even postgres hackers are having
> difficulty keeping up with these nuances. Even though the SET clause
> has been there for a long time, our documentation on the subject is
> insufficient and misleading. And on top of that, it's extra typing and
> noise for every schema file. Until we make some changes I don't think
> we can expect users to do as you suggest.

Respectfully, I find this position unreasonable, to the point of
finding it difficult to take seriously. You said in another part of
your email that I didn't quote here that it's a problem that it's a
problem that functions and procedures are created with public execute
access by default -- but you can work around this by using a schema to
which other users don't have access, or by changing the default
permissions for functions on the schema where you are creating them,
or by adjusting permissions on the individual objects. If you don't do
any of that but don't trust the other users on your system then you at
least need to set search_path. If you neither understand how function
permissions work nor understand the importance of controlling
search_path, you cannot expect to have a secure system with multiple,
mutually untrusting users. That's just never going to work, regardless
of what the server behavior is.

I also disagree with the idea that setting the search_path should be
regarded as noise. I think it's quite the opposite. I don't believe
that people want to run their functions under a sanitized search_path
that only includes system schemas. That might work for some people,
but I think most people will define functions that call other
functions that they themselves defined, or access tables that they
themselves created. They will therefore need the search_path to
include the schemas in which they created those objects. There's no
way for the system to magically figure out what the user wants here.
*Perhaps* if the function is defined interactively the then-current
value could be captured, but in a pg_dump for example that won't work,
and the configured value, wherever it came from initially, is going to
have to be recorded so that it can be recreated when the dump is
restored.

Most of the problems that we're dealing with here have analogues in
the world of shell scripts. A sql or plpgsql function is like a shell
script. If it's setuid, i.e. SECURITY DEFINER, you have to worry about
the caller hijacking it by setting PATH or IFS or LD_something. Even
if it isn't, you have to either trust that the caller has set a
reasonable PATH, or set one yourself, else your script isn't always
going to work reliably. Nobody really expects to be able to make a
setuid shell script secure at all -- that typically requires a wrapper
executable -- but it definitely can't be done by someone who doesn't
understand the importance of setting their PATH and has no idea how to
use chmod.

One thing that is quite different between the shell script situation
and what we do inside PostgreSQL is that there's a lot more security
by default. Every user gets a home directory which by default is
accessible only to them, or at the very least writable only by them,
and system directories have tightly-controlled permissions. I think
UNIX had analogues of a lot of the problems we have today 40 years
ago, but they've tightened things up. We've started to move in that
direction by, for example, removing public execute access by default.
If we want to move further in the direction that UNIX has taken, we
should probably get rid of the public schema altogether, and
auto-create per-user schemas with permissions that allow only that
user to access them. But that's only making it easier to not
accidentally have users accessing each other's stuff. The core problem
that, if people do want to access each other's stuff, they either need
to trust each other or really be on point with all the
security-related stuff. That's equally true in the shell script case,
and I think that problem is fundamental. It's just really not possible
for people to call other people's code frequently without everyone
involved either being super-careful about security or just not caring.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Confused about gram.y referencs in Makefile?
Next
From: Tom Lane
Date:
Subject: Re: Why need a lock?