Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } |
Date | |
Msg-id | CA+TgmoZGu46an-Vkrbe2o0i2wOpjUFWX93fnVtQ2j8FT32gDQw@mail.gmail.com Whole thread Raw |
In response to | Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }
|
List | pgsql-hackers |
On Tue, Sep 19, 2023 at 7:56 PM Jeff Davis <pgsql@j-davis.com> wrote: > Good documentation includes some guidance. Sure, it should describe the > system behavior, but without anchoring it to some kind of expected use > case, it can be equally frustrating. Fair. > I don't mean to set some major new standard in the documentation that > should apply to everything; but for the privilege system, even hackers > are having trouble keeping up (myself included). A bit of guidance > toward supported use cases helps a lot. Yeah, this stuff is complicated, and I agree that it's hard even for hackers to keep up with. I don't really have a strong view on the concrete case you mentioned involving password_required. I always worry that if there are three cases and we suggest one of them then the others will be viewed negatively when really they're all equally fine. On the other hand, that can often be addressed by starting the sentence with "For example, you could...." or similar, so perhaps there's no problem here at all. I generally agree with the idea that examples can be useful for clarifying points that may otherwise be too theoretical. > I hope what I'm saying is not useless vitriol. I am offering the best > solutions I see in a bad situation. And I believe I've uncovered some > emergent behaviors that are not well-understood even among prominent > hackers. Yeah, I wasn't really intending to say that you were. I just get nervous about statements like "don't ever do X!" because I find it completely unconvincing. In my experience, when you tell people stuff like that, some of them just go off and do it anyway and, especially in a case like this, chances are very good that nothing bad will ever happen to them, simply because most PostgreSQL installations don't have malicious local users. When you tell them "but that's really bad" they say "why?" and if the documentation doesn't have an answer to the question well then that sucks. > > because it would break a lot of stuff for a lot > > of people, but if we could get that then I think we could maybe > > emerge > > in a better spot once the pain of the compatibility break receded. > > Are there ways we can soften this a bit? I know compatibility GUCs are > not to be added lightly, but perhaps one is justified here? I don't know. I'm skeptical. This behavior is so complicated and hard to get right. Having it GUC-dependent makes it even more confusing than it is already. But I guess it also depends on what the GUC does. Let's say we make a rule that every function or procedure has to have a search_path attached to it as a function property. That is, CREATE FUNCTION .. SEARCH something sets pg_proc.prosearch = 'something'. If you omit the SEARCH clause, one is implicitly supplied for you. If you say SEARCH NULL, then the function is executed with the search_path taken from the GUC; SEARCH 'anything_else' specified a literal search_path to be used. In such a world, I can imagine having a GUC that determines whether the implicitly supplied SEARCH clause is SEARCH ${WHATEVER_THE_SEARCH_PATH_IS_RIGHT_NOW} or SEARCH NULL. Such a GUC only takes effect at CREATE FUNCTION time. However, I cannot imagine having a GUC that causes the SEARCH clauses attached to all functions to be globally ignored at execution time. That seems like a choice we would likely regret bitterly. The first thing is already painful, but the second one is exponentially worse, because in the first world, you have to be careful to get your functions defined correctly, but if you do, you know they'll run OK on any PostgreSQL cluster anywhere, whereas in the second world, there's no way to define a function that behaves the same way on every PostgreSQL instance. Imagine being an extension author, for example. I am a little worried that this kind of design might end up reversing the direction of some security problems that we have now. For instance, right now, if you call a function with a SET search_path clause, you know that it can't make any changes to search_path that survive function exit. You'll get your old search_path back. With this kind of design, it seems like it would be a lot easier to get back to the SQL toplevel and find the search_path surprisingly changed under you. I think we have that problem in some cases already, though. I'm unclear how much worse this makes it. > > Another option is something around sandboxing and/or function trust. > > The idea here is to not care too much about the search_path behavior > > itself, and instead focus on the consequences, namely what code is > > getting executed as which user and perhaps what kinds of operations > > it's performing. > > I'm open to discussing that further, and it certainly may solve some > problems, but it does not seem to solve the fundamental problem with > search_path: that the caller can (intentionally or unintentionally) > cause a function to do unexpected things. Well, I think it's meant to solve that problem. How effectively it does so is a point worth debating. > Sometimes an unexpected thing is not a the kind of thing that would be > caught by a sandbox, e.g. just an unexpected function result. But if > that function is used in a constraint or expression index, that > unexpected result can lead to a violated constraint or a bad index > (that will later cause wrong results). The owner of the table might > reasonably consider that a privilege problem, if the user who causes > the trouble had only INSERT privileges. That's an interesting example. Earlier versions of the function trust proposal proposed to block *any* execution of code belonging to an untrusted party. That could potentially block this attack. However, it would also block a lot of other things. For instance, if Alice tries to insert into Bob's table and Bob's table has a CHECK constraint or an index expression, Alice has to trust Bob or she can't insert anything at all. By trusting Bob just enough to allow him do things like CHECK(LENGTH(foo) < 10) or whatever, Alice can operate on Bob's table without a problem in normal cases, but is still protected if Bob suddenly starts doing something sneaky. I think that's a significant improvement, because a system that is so stringent that it blocks even completely harmless things is likely to get disabled, at which point it protects nobody from anything. However, that analysis presumes that what we're trying to do is protect Alice from Bob, and I think you're raising the question of how we protect Bob from Alice. Suppose Bob has got a trigger function but has failed to control search_path for that function. Alice can set search_path so that Bob's trigger calls some function or operator that she owns instead of the intended call to, say, a system function or operator. Some sufficiently-rigid function trust system could catch this: Bob doesn't trust Alice, and so the fact that his code is trying to call some a function or operator owned by Alice is a red flag. On the basis of the fact that Bob doesn't trust Alice, we should error out to protect Bob. Had the search_path been set in the expected way, Bob would have been trying to call a superuser-owned function, and Bob must trust the superuser, so the operation is permitted. I wouldn't have a problem with a function-trust proposal that incorporated a mode that rigid as a configuration option. I find this a convincing example of how that could be useful. But such a mode has pretty serious downsides, too. It makes it very difficult for one user to interact with another user's objects in any way without triggering security errors. Also, in a case like this, I don't think it's unreasonable to ask whether, perhaps, Bob just needs to be a little more careful about setting search_path. I think that there is a big difference between (a) defining a SQL-language function that is accessible to multiple users and (b) inserting a row into a table you don't own. When you define a function, you know people are potentially going to call it. Asking you, as the function author, to take some care to secure your function against a malicious search_path doesn't seem like an unsupportable burden. After all, you control the definition of that function. The problem with inserting a row into a table you don't own is that all of the objects involved -- the table itself, its indexes, its triggers, its defaults, its constraints -- are owned by somebody else, and that user controls those objects and can change any of them at any time. You can't really be expected to verify that all code reachable as a result of an INSERT into the table is safe enough before every INSERT into that table. You can, I think, be expected to check that functions you define have SET search_path attached. > > Are there any other categories of things we can do? More specific > > kinds of things we can do in each category? I don't really see an > > option other than (1) "change something in the system design so that > > people use search_path wrongly less often" or (2) "make it so that it > > doesn't matter as much if people using the wrong search_path" but > > maybe I'm missing a clever idea. > > Perhaps there are some clever ideas about maintaining compatibility > within the approaches (1) or (2), which might make one of them more > appealing. Indeed. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: