Thread: Re: pgsql: Fix search_path to a safe value during maintenance operations.
[ emerges from hibernation ] On Mon, Jun 19, 2023 at 6:58 PM Jeff Davis <pgsql@j-davis.com> wrote: > On Mon, 2023-06-19 at 16:03 -0400, Robert Haas wrote: > > I'm inclined to think that this is a real security issue and am not > > Can you expand on that a bit? You mean a practical security issue for > the intended use cases? Yeah. I mean, as things stand, it seems like giving someone the MAINTAIN privilege will be sufficient to allow them to escalate to the table owner if there are any expression indexes involved. That seems like a real problem. We shouldn't ship a new feature with a built-in security hole like that. I was pretty outraged when I realized that we'd been shipping releases for years where CREATEROLE let you grab superuser because you could just grant yourself pg_execute_server_program and then go to town. Ideally, that hazard should have been identified and fixed in some way before introducing pg_execute_server_program. I don't know whether the hazard wasn't realized at the time or whether somebody somehow convinced themselves that that was OK, but it clearly isn't. Now we're proposing to ship a brand-new feature with a hole that we definitely already know exists. I can't understand that at all. Should we just go file the CVE against ourselves right now, then? Seriously, what are we doing? If we're not going to fix the feature so that it doesn't break the security model, we should probably just revert it. I don't understand at all the idea of shipping something that we 100% know is broken. > > very sanguine about waiting another year to fix it, but at the same > > time, I'm somewhat worried that the proposed fix might be too narrow > > or wrongly-shaped. I'm not too convinced that we've properly > > understood what all of the problems in this area are. :-( > > Would it be acceptable to document that the MAINTAIN privilege (along > with TRIGGER and, if I understand correctly, REFERENCES) carries > privilege escalation risk for the grantor? That's clearly better than nothing, but also seems like it's pretty clearly the wrong approach. If somebody electrocutes themselves on the toaster in the break room, you don't stick a sign on the side of it that says "this toaster will electrocute you if you try to use it" and then call it good. You either fix or replace the toaster, or at the very least throw it out, or at the VERY least unplug it. I am failing to understand how this situation is any different. -- Robert Haas EDB: http://www.enterprisedb.com
On 2023-06-29 Th 11:19, Robert Haas wrote:
Now we're proposing to ship a brand-new feature with a hole that we definitely already know exists. I can't understand that at all. Should we just go file the CVE against ourselves right now, then? Seriously, what are we doing? If we're not going to fix the feature so that it doesn't break the security model, we should probably just revert it. I don't understand at all the idea of shipping something that we 100% know is broken.
+1
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, Jun 29, 2023 at 11:19:38AM -0400, Robert Haas wrote: > [ emerges from hibernation ] Welcome back. > If we're not going to fix the feature so that it doesn't break the > security model, we should probably just revert it. I don't understand > at all the idea of shipping something that we 100% know is broken. Given Jeff's commit followed the precedent set by the fix for CVE-2018-1058, I'm inclined to think he was on the right track. Perhaps a more targeted fix, such as only changing search_path when the command is not run by the table owner (as suggested upthread [0]) is worth considering. [0] https://postgr.es/m/CAKFQuwaVJkM9u%2BqpOaom2UkPE1sz0BASF-E5amxWPxncUhm4Hw%40mail.gmail.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, 2023-06-29 at 11:19 -0400, Robert Haas wrote: > Yeah. I mean, as things stand, it seems like giving someone the > MAINTAIN privilege will be sufficient to allow them to escalate to > the > table owner if there are any expression indexes involved. That seems > like a real problem. We shouldn't ship a new feature with a built-in > security hole like that. Let's take David's suggestion[1] then, and only restrict the search path for those without owner privileges on the object. That would mean no behavior change unless using the MAINTAIN privilege, which is new, so no breakage. And if someone is using the MAINTAIN privilege, they wouldn't be able to abuse the search_path, so it would close the hole. Patch attached (created a bit quickly, but seems to work). Regards, Jeff Davis [1] https://postgr.es/m/CAKFQuwaVJkM9u%2BqpOaom2UkPE1sz0BASF-E5amxWPxncUhm4Hw%40mail.gmail.com
Attachment
Jeff Davis <pgsql@j-davis.com> writes: > On Thu, 2023-06-29 at 11:19 -0400, Robert Haas wrote: >> We shouldn't ship a new feature with a built-in >> security hole like that. > Let's take David's suggestion[1] then, and only restrict the search > path for those without owner privileges on the object. I think that's a seriously awful kluge. It will mean that things behave differently for the owner than for MAINTAIN grantees, which pretty much destroys the use-case for that privilege, as well as being very confusing and hard to debug. Yes, *if* you're careful about search path cleanliness then you can make it work, but that will be a foot-gun forevermore. (I'm also less than convinced that this is sufficient to remove all security hazards. One pretty obvious question is do we really want superusers to be treated as owners, rather than MAINTAIN grantees, for this purpose.) I'm leaning to Robert's thought that we need to revert this for now, and think harder about how to make it work cleanly and safely. regards, tom lane
On Thu, Jun 29, 2023 at 08:53:56PM -0400, Tom Lane wrote: > I'm leaning to Robert's thought that we need to revert this for now, > and think harder about how to make it work cleanly and safely. Since it sounds like this is headed towards a revert, here's a patch for removing MAINTAIN and pg_maintain. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, 2023-06-29 at 20:53 -0400, Tom Lane wrote: > I think that's a seriously awful kluge. It will mean that things > behave > differently for the owner than for MAINTAIN grantees, which pretty > much > destroys the use-case for that privilege, as well as being very > confusing > and hard to debug. In version 15, try this: CREATE USER foo; CREATE SCHEMA foo AUTHORIZATION foo; CREATE USER bar; CREATE SCHEMA bar AUTHORIZATION bar; \c - foo CREATE FUNCTION foo.mod10(INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$ BEGIN RETURN mod($1,10); END; $$; CREATE TABLE t(i INT); -- units digit must be unique CREATE UNIQUE INDEX t_idx ON t (foo.mod10(i)); INSERT INTO t VALUES(7); -- success INSERT INTO t VALUES(17); -- fails GRANT USAGE ON SCHEMA foo TO bar; GRANT INSERT ON t TO bar; \c - bar CREATE FUNCTION bar.mod(INT, INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$ BEGIN RETURN $1 + 1000000; END; $$; SET search_path = bar, pg_catalog; INSERT INTO foo.t VALUES(7); -- succeeds \c - foo SELECT * FROM t; i --- 7 7 (2 rows) I'm not sure that everyone in this thread realizes just how broken it is to depend on search_path in a functional index at all. And doubly so if it depends on a schema other than pg_catalog in the search_path. Let's also not forget that logical replication always uses search_path=pg_catalog, so if you depend on a different search_path for any function attached to the table (not just functional indexes, also functions inside expressions or trigger functions), then those are already broken in version 15. And if a superuser is executing maintenance commands, there's little reason to think they'll have the same search path as the user that created the table. At some point in the very near future (though I realize that point may come after version 16), we need to lock down the search path in a lot of cases (not just maintenance commands), and I don't see any way around that. Regards, Jeff Davis
On Thu, Jun 29, 2023 at 10:09:21PM -0700, Nathan Bossart wrote: > On Thu, Jun 29, 2023 at 08:53:56PM -0400, Tom Lane wrote: >> I'm leaning to Robert's thought that we need to revert this for now, >> and think harder about how to make it work cleanly and safely. > > Since it sounds like this is headed towards a revert, here's a patch for > removing MAINTAIN and pg_maintain. I will revert this next week unless opinions change before then. I'm currently planning to revert on both master and REL_16_STABLE, but another option would be to keep it on master while we sort out the remaining issues. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Jun 30, 2023 at 11:35:46AM -0700, Nathan Bossart wrote: > On Thu, Jun 29, 2023 at 10:09:21PM -0700, Nathan Bossart wrote: > > On Thu, Jun 29, 2023 at 08:53:56PM -0400, Tom Lane wrote: > >> I'm leaning to Robert's thought that we need to revert this for now, > >> and think harder about how to make it work cleanly and safely. Another dimension of compromise could be to make MAINTAIN affect fewer commands in v16. Un-revert the part of commit 05e1737 affecting just the commands it still affects. For example, limit MAINTAIN and the 05e1737 behavior change to VACUUM, ANALYZE, and REINDEX. Don't worry about VACUUM or ANALYZE failing under commit 05e1737, since they would have been failing under autovacuum since 2018. A problem index expression breaks both autoanalyze and REINDEX, hence the inclusion of REINDEX. The already-failing argument doesn't apply to CLUSTER or REFRESH MATERIALIZED VIEW, so those commands could join MAINTAIN in v17. > > Since it sounds like this is headed towards a revert, here's a patch for > > removing MAINTAIN and pg_maintain. > > I will revert this next week unless opinions change before then. I'm > currently planning to revert on both master and REL_16_STABLE, but another > option would be to keep it on master while we sort out the remaining > issues. Thoughts? From my reading of the objections, I think they're saying that commit 05e1737 arrived too late and that MAINTAIN is unacceptable without commit 05e1737. I think you'd conform to all objections by pushing the revert to v16 and pushing a roll-forward of commit 05e1737 to master.
On Sun, Jul 02, 2023 at 08:57:31PM -0700, Noah Misch wrote: > Another dimension of compromise could be to make MAINTAIN affect fewer > commands in v16. Un-revert the part of commit 05e1737 affecting just the > commands it still affects. For example, limit MAINTAIN and the 05e1737 > behavior change to VACUUM, ANALYZE, and REINDEX. Don't worry about VACUUM or > ANALYZE failing under commit 05e1737, since they would have been failing under > autovacuum since 2018. A problem index expression breaks both autoanalyze and > REINDEX, hence the inclusion of REINDEX. The already-failing argument doesn't > apply to CLUSTER or REFRESH MATERIALIZED VIEW, so those commands could join > MAINTAIN in v17. I'm open to compromise if others are, but I'm skeptical that folks will be okay with too much fancy footwork this late in the game. Anyway, IMO your argument could extend to CLUSTER and REFRESH, too. If we're willing to change behavior under the assumption that autovacuum would've been failing since 2018, then why wouldn't we be willing to change it everywhere? I suppose someone could have been manually vacuuming with a special search_path for 5 years to avoid needing to schema-qualify their index expressions (and would then be surprised that CLUSTER/REFRESH no longer work), but limiting MAINTAIN to VACUUM, etc. would still break their use-case, right? > From my reading of the objections, I think they're saying that commit 05e1737 > arrived too late and that MAINTAIN is unacceptable without commit 05e1737. I > think you'd conform to all objections by pushing the revert to v16 and pushing > a roll-forward of commit 05e1737 to master. Okay, I'll adjust my plans accordingly. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, Jul 03, 2023 at 11:19:14AM -0700, Nathan Bossart wrote: > On Sun, Jul 02, 2023 at 08:57:31PM -0700, Noah Misch wrote: > > Another dimension of compromise could be to make MAINTAIN affect fewer > > commands in v16. Un-revert the part of commit 05e1737 affecting just the > > commands it still affects. For example, limit MAINTAIN and the 05e1737 > > behavior change to VACUUM, ANALYZE, and REINDEX. Don't worry about VACUUM or > > ANALYZE failing under commit 05e1737, since they would have been failing under > > autovacuum since 2018. A problem index expression breaks both autoanalyze and > > REINDEX, hence the inclusion of REINDEX. The already-failing argument doesn't > > apply to CLUSTER or REFRESH MATERIALIZED VIEW, so those commands could join > > MAINTAIN in v17. > > I'm open to compromise if others are, but I'm skeptical that folks will be > okay with too much fancy footwork this late in the game. Got it. > Anyway, IMO your argument could extend to CLUSTER and REFRESH, too. If > we're willing to change behavior under the assumption that autovacuum > would've been failing since 2018, then why wouldn't we be willing to change > it everywhere? I suppose someone could have been manually vacuuming with a > special search_path for 5 years to avoid needing to schema-qualify their > index expressions (and would then be surprised that CLUSTER/REFRESH no > longer work), but limiting MAINTAIN to VACUUM, etc. would still break their > use-case, right? Yes, limiting MAINTAIN to VACUUM would still break a site that has used manual VACUUM to work around associated loss of autovacuum. I'm not sympathetic to a user who neglected to benefit from the last five years of prep time on this issue as it affects VACUUM and ANALYZE. REFRESH runs more than index expressions, e.g. function calls in the targetlist of the materialized view query. Those targetlist expressions haven't been putting ERRORs in the log during autovacuum, so REFRESH hasn't had the sort of advance warning that VACUUM and ANALYZE got.
On Thu, 2023-06-29 at 22:09 -0700, Nathan Bossart wrote: > On Thu, Jun 29, 2023 at 08:53:56PM -0400, Tom Lane wrote: > > I'm leaning to Robert's thought that we need to revert this for > > now, > > and think harder about how to make it work cleanly and safely. > > Since it sounds like this is headed towards a revert, here's a patch > for > removing MAINTAIN and pg_maintain. It was difficult to review standalone, so I tried a quick version myself and ended up with very similar results. The only substantial difference was that I put back: + if (!vacuum_is_relation_owner(relid, classForm, options)) + continue; in get_all_vacuum_rels() whereas your patch left it out -- double-check that we're doing the right thing there. Also remember to bump the catversion. Other than that, it looks good to me. Regards, Jeff Davis
On Thu, Jul 06, 2023 at 12:55:14AM -0700, Jeff Davis wrote: > It was difficult to review standalone, so I tried a quick version > myself and ended up with very similar results. Thanks for taking a look. > The only substantial > difference was that I put back: > > > + if (!vacuum_is_relation_owner(relid, classForm, > options)) > + continue; > > > in get_all_vacuum_rels() whereas your patch left it out -- double-check > that we're doing the right thing there. The privilege check was moved in d46a979, which I think still makes sense, so I left it there. That might be why it looks like I removed it. > Also remember to bump the catversion. Other than that, it looks good to > me. Will do. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Here is a new version of the patch that I think is ready for commit (except it still needs a catversion bump). The only real difference from v1 is in AdjustUpgrade.pm. From my cross-version pg_upgrade testing, I believe we can remove the other "privilege-set discrepancy" rule as well. Since MAINTAIN will no longer exist in v16, we'll also need the following change applied to v17devel: diff --git a/src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm b/src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm index 843f65b448..d435812c06 100644 --- a/src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm +++ b/src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm @@ -274,7 +274,7 @@ sub adjust_old_dumpfile $dump = _mash_view_qualifiers($dump); } - if ($old_version >= 14 && $old_version < 16) + if ($old_version >= 14 && $old_version < 17) { # Fix up some privilege-set discrepancies. $dump =~ On Thu, Jul 06, 2023 at 10:20:04AM -0700, Nathan Bossart wrote: > On Thu, Jul 06, 2023 at 12:55:14AM -0700, Jeff Davis wrote: >> Also remember to bump the catversion. Other than that, it looks good to >> me. > > Will do. Since we are only reverting from v16, the REL_16_STABLE catversion will be bumped ahead of the one on master. AFAICT that is okay, but there is also a chance that someone bumps the catversion on master to the same value. I'm not sure if this is problem is worth worrying about, but I thought I'd raise it just in case. I could bump the catversion on master to the following value to help prevent this scenario, but I'm not wild about adding unnecessary catversion bumps. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, 2023-07-06 at 22:14 -0700, Nathan Bossart wrote: > Since we are only reverting from v16, the REL_16_STABLE catversion > will be > bumped ahead of the one on master. I don't object to you doing it this way, but FWIW, I'd just revert in both branches to avoid this kind of weirdness. Also I'm not quite sure how quickly my search_path fix will be committed. Hopefully soon, because the current state is not great, but it's hard for me to say for sure. Regards, Jeff Davis
On Fri, Jul 07, 2023 at 09:22:22AM -0700, Jeff Davis wrote: > On Thu, 2023-07-06 at 22:14 -0700, Nathan Bossart wrote: >> Since we are only reverting from v16, the REL_16_STABLE catversion >> will be >> bumped ahead of the one on master. > > I don't object to you doing it this way, but FWIW, I'd just revert in > both branches to avoid this kind of weirdness. > > Also I'm not quite sure how quickly my search_path fix will be > committed. Hopefully soon, because the current state is not great, but > it's hard for me to say for sure. Yeah, I guess I should just revert it in both. Given your fix will hopefully be committed soon, I was hoping to avoid reverting and un-reverting in quick succession to prevent affecting git-blame too much. I found an example of a post-beta2 revert on both master and a stable branch where Tom set the catversions to different values (20b6847, e256312). I'll do the same here. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Jul 07, 2023 at 09:57:10AM -0700, Nathan Bossart wrote: > Yeah, I guess I should just revert it in both. Given your fix will > hopefully be committed soon, I was hoping to avoid reverting and > un-reverting in quick succession to prevent affecting git-blame too much. > > I found an example of a post-beta2 revert on both master and a stable > branch where Tom set the catversions to different values (20b6847, > e256312). I'll do the same here. reverted -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Jun 30, 2023 at 3:41 AM Jeff Davis <pgsql@j-davis.com> wrote: > I'm not sure that everyone in this thread realizes just how broken it > is to depend on search_path in a functional index at all. And doubly so > if it depends on a schema other than pg_catalog in the search_path. > > Let's also not forget that logical replication always uses > search_path=pg_catalog, so if you depend on a different search_path for > any function attached to the table (not just functional indexes, also > functions inside expressions or trigger functions), then those are > already broken in version 15. And if a superuser is executing > maintenance commands, there's little reason to think they'll have the > same search path as the user that created the table. > > At some point in the very near future (though I realize that point may > come after version 16), we need to lock down the search path in a lot > of cases (not just maintenance commands), and I don't see any way > around that. I agree. I think there are actually two interrelated problems here. One is that virtually all code needs to run with the originally intended search_path rather than some search_path chosen at another time and maybe by a different user. If not, it's going to break, or compromise security, depending on the situation. The other is that running arbitrary code written by somebody else as yourself is basically instant death, from a security perspective. It's a little hard to imagine a world in which these problems don't exist at all, but it somehow feels like the design of the system pushes you toward doing this stuff incorrectly rather than doing it correctly. For instance, you can imagine a system where when you run CREATE OR REPLACE FUNCTION, the prevailing search_path is captured and automatically included in proconfig. Then the default behavior would be to run functions and procedures with the search_path that was in effect when they were created, rather than what we actually have, where it's the one in effect at execution time as it is currently. It's a little harder to imagine something similar around all the user switching behavior, just because we have so many ways of triggering arbitrary code execution: views, triggers, event triggers, expression indexes, constraints, etc. But you can maybe imagine a system where all code associated with a table is run as the table owner in all cases, regardless of SECURITY INVOKER/DEFINER, which I think would at least close some holes. The difficulty is that it's a bit hard to imagine making these kinds of definitional changes now, because they'd probably be breaking changes for pretty significant numbers of users. But on the other hand, if we don't start thinking about systemic changes here, it feels like we're just playing whack-a-mole. -- Robert Haas EDB: http://www.enterprisedb.com
On 7/31/23 12:53, Robert Haas wrote: > On Fri, Jun 30, 2023 at 3:41 AM Jeff Davis <pgsql@j-davis.com> wrote: >> I'm not sure that everyone in this thread realizes just how broken it >> is to depend on search_path in a functional index at all. And doubly so >> if it depends on a schema other than pg_catalog in the search_path. >> >> Let's also not forget that logical replication always uses >> search_path=pg_catalog, so if you depend on a different search_path for >> any function attached to the table (not just functional indexes, also >> functions inside expressions or trigger functions), then those are >> already broken in version 15. And if a superuser is executing >> maintenance commands, there's little reason to think they'll have the >> same search path as the user that created the table. >> >> At some point in the very near future (though I realize that point may >> come after version 16), we need to lock down the search path in a lot >> of cases (not just maintenance commands), and I don't see any way >> around that. > > I agree. I think there are actually two interrelated problems here. > > One is that virtually all code needs to run with the originally > intended search_path rather than some search_path chosen at another > time and maybe by a different user. If not, it's going to break, or > compromise security, depending on the situation. The other is that > running arbitrary code written by somebody else as yourself is > basically instant death, from a security perspective. I agree too. But the analysis of the issue needs to go one step further. Even if the search_path does not change from the originally intended one, a newly created function can shadow the intended one based on argument coercion rules. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Jul 31, 2023 at 1:18 PM Joe Conway <mail@joeconway.com> wrote: > But the analysis of the issue needs to go one step further. Even if the > search_path does not change from the originally intended one, a newly > created function can shadow the intended one based on argument coercion > rules. Yeah, this is a complicated issue. As the system works today, if you include in your search_path a schema to which some other user can write, you are pretty much agreeing to execute code provided by that user. If that user has strictly greater privileges than you, e.g. they are the super-user, then that's fine, because they can compromise your account anyway, but otherwise, you're probably doomed. Not only can they try to capture references with similarly-named objects, they can also do things like create objects whose names are common mis-spellings of the objects that are supposed to be there and hope you access the wrong one by mistake. Maybe there are other attacks as well, but even if not, I think it's already a pretty hopeless situation. I think the UNIX equivalent would be including a directory in your PATH that is world-writable and hoping your account will stay secure. Not very likely. We have already taken an important step in terms of preventing this attack in commit b073c3ccd06e4cb845e121387a43faa8c68a7b62, which removed PUBLIC CREATE from the public schema. Before that, we were shipping a configuration analogous to a UNIX system where /usr/bin was world-writable -- something which no actual UNIX system has likely done any time in the last 40 years, because it's so clearly insane. We could maybe go a step further by changing the default search_path to not even include public, to further discourage people from using that as a catch-all where everybody can just dump their objects. Right now, anybody can revert that change with a single GRANT statement, and if we removed public from the default search_path as well, people would have one extra step to restore that insecure configuration. I don't know such a change is worthwhile, though. It would still be super-easy for users to create insecure configurations: as soon as user A can write a schema and user B has it in the search_path, B is in a lot of trouble if A turns out to be malicious. One thing we might be able to do to prevent that sort of thing is to have a feature to prevent "accidental" code execution, as in the "function trust" mechanism proposed previously. Say I trust all users who can SET ROLE to me and/or who inherit my privileges. Additionally I can decide to trust users who do neither of those things by some sort of explicit declaration. If I don't trust a user then if I do anything that would cause code supplied by that user to get executed, it just errors out: ERROR: role "rhaas" should not execute arbitrary code provided by role "jconway" HINT: If this should be allowed, use the TRUST command to permit it. Even if we do this, I still think we need the kinds of fixes that I mentioned earlier. An error like this is fine if you're trying to do something to a table owned by another user and they've got a surprise trigger on there or something, but a maintenance command like VACUUM should find a way to work that is both secure and non-astonishing. And we probably also still need to find ways to control search_path in a lot more widely than we do today. Otherwise, even if stuff is technically secure, it may just not work. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, 2023-07-31 at 16:06 -0400, Robert Haas wrote: > if you > include in your search_path a schema to which some other user can > write, you are pretty much agreeing to execute code provided by that > user. Agreed on all counts here. I don't think it's reasonable for us to try to make such a setup secure, and I don't think users have much need for such a setup anyway. > One thing we might be able to do to prevent that sort of thing is to > have a feature to prevent "accidental" code execution, as in the > "function trust" mechanism proposed previously. Say I trust all users > who can SET ROLE to me and/or who inherit my privileges. Additionally > I can decide to trust users who do neither of those things by some > sort of explicit declaration. If I don't trust a user then if I do > anything that would cause code supplied by that user to get executed, > it just errors out: > > ERROR: role "rhaas" should not execute arbitrary code provided by > role "jconway" > HINT: If this should be allowed, use the TRUST command to permit it. +1, though I'm not sure we need an extensive trust mechanism beyond what we already have with the SET ROLE privilege. > And > we probably also still need to find ways to control search_path in a > lot more widely than we do today. Otherwise, even if stuff is > technically secure, it may just not work. +1. Regards, Jeff Davis
On Mon, 2023-07-31 at 12:53 -0400, Robert Haas wrote: > I agree. I think there are actually two interrelated problems here. > > One is that virtually all code needs to run with the originally > intended search_path rather than some search_path chosen at another > time and maybe by a different user. If not, it's going to break, or > compromise security, depending on the situation. The other is that > running arbitrary code written by somebody else as yourself is > basically instant death, from a security perspective. Good framing. The search_path is a particularly nasty problem in our system because it means that users can't even trust the code that they write themselves! A function author has no way to know how their own function will behave under a different search_path. > It's a little hard to imagine a world in which these problems don't > exist at all, but it somehow feels like the design of the system > pushes you toward doing this stuff incorrectly rather than doing it > correctly. For instance, you can imagine a system where when you run > CREATE OR REPLACE FUNCTION, the prevailing search_path is captured > and > automatically included in proconfig. Capturing the environment is not ideal either, in my opinion. It makes it easy to carelessly depend on a schema that others might not have USAGE privileges on, which would then create a runtime problem for other callers. Also, I don't think we could just depend on the raw search_path, we'd need to do some processing for $user, and there are probably a few other annoyances. It's one possibility and we don't have a lot of great options, so I don't want to rule it out though. If nothing else it could be a transition path to something better. > But you can maybe imagine a system where > all code associated with a table is run as the table owner in all > cases, regardless of SECURITY INVOKER/DEFINER, which I think would at > least close some holes. > > The difficulty is that it's a bit hard to imagine making these kinds > of definitional changes now, because they'd probably be breaking > changes for pretty significant numbers of users. I believe we can get close to a good model with minimal breakage. And when we make the problem small enough I believe other solutions will emerge. We will probably have to hedge with some compatibility GUCs. > But on the other > hand, if we don't start thinking about systemic changes here, it > feels > like we're just playing whack-a-mole. Exactly. If we can agree on where we're going then I think we can get there. Regards, Jeff Davis
On Mon, 2023-07-31 at 13:17 -0400, Joe Conway wrote: > But the analysis of the issue needs to go one step further. Even if > the > search_path does not change from the originally intended one, a newly > created function can shadow the intended one based on argument > coercion > rules. There are quite a few issues going down this path: * The set of objects in each schema can change. Argument coercion is a particularly subtle one, but there are other ways that it could find the wrong object. The temp namespace also has some subtle issues. * Schema USAGE privileges may vary over time or from caller to caller, affecting which items in the search path are searched at all. The same goes if theres an object access hook in place. * $user should be resolved to a specific schema (or perhaps not in some cases?) * There are other GUCs and environment that can affect function behavior. Is it worth trying to lock those down? I agree that each of these is some potential problem, but these are much smaller problems than allowing the caller to have complete control over the search_path. Regards, Jeff Davis
On Mon, Jul 31, 2023 at 5:15 PM Jeff Davis <pgsql@j-davis.com> wrote: > > ERROR: role "rhaas" should not execute arbitrary code provided by > > role "jconway" > > HINT: If this should be allowed, use the TRUST command to permit it. > > +1, though I'm not sure we need an extensive trust mechanism beyond > what we already have with the SET ROLE privilege. FWIW, I think it would be a good idea. It might not be absolutely mandatory but I think it would be smart. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Jul 31, 2023 at 6:10 PM Jeff Davis <pgsql@j-davis.com> wrote: > Capturing the environment is not ideal either, in my opinion. It makes > it easy to carelessly depend on a schema that others might not have > USAGE privileges on, which would then create a runtime problem for > other callers. Also, I don't think we could just depend on the raw > search_path, we'd need to do some processing for $user, and there are > probably a few other annoyances. > > It's one possibility and we don't have a lot of great options, so I > don't want to rule it out though. If nothing else it could be a > transition path to something better. Here is my thought about this. Right now, we basically do one of two things. In some cases, we parse statements when they're submitted, and then store the resulting node trees. In such cases, references are fixed: the statements will always refer to the objects to which they referred previously. In functions and procedures, except for the new BEGIN ATOMIC stuff, we just store the statements as a string and they get parsed at execution time. Then, the problem arises of statements being possibly parsed in an environment that differs from the original one. It can differ either by search_path being different so that we look in different schemas, or, as you point out here, if the contents of the schemas themselves have been modified. I think that a lot of people would like it if we moved more in the direction of parsing statements at object definition time. Possibly because EDB deals with a lot of people coming from Oracle, I've heard a lot of complaints about the PG behavior. However, there are some fairly significant problems with that idea. First, it would break some use cases, such as creating a temporary table and then running DML commands on it, or more generally any use case where a function or procedure might need to reference objects that don't exist at time of definition. Second, while we have clear separation of parsing and execution for queries, the same is not true for DDL commands; it's not the case, I believe, that you can parse an arbitrary DDL command such that all object references are resolved, and then later execute it. We'd need to change a bunch of code to get there. Third, we'd have to deal with dependency loops: right now, because functions and procedures don't parse their bodies at definition time, they also don't depend on the objects that they are going to end up accessing, which means that a function or procedure can be restored by pg_dump without worrying about whether those objects exist yet. That would have to change, and that would mean creating dependency loops for pg_dump, which we'd have to then find a way to break. I'm not trying to say that any of these problems are intractable, but I do think changing stuff like this would be quite a bit of work -- and that's assuming the user impact was judged to be acceptable, which I'm not at all sure that it would be. We'd certainly need to provide some workaround for people who want to do stuff like create and use temporary tables inside of a function or procedure. Now, if we don't go in the direction of resolving everything at parse time, then I think capturing search_path is probably the next best thing, or at least the next best thing that I've thought up so far. It doesn't hold constant the meaning of the code to the same degree that parsing at definition time would do, but it gets us closer to that than the status quo. Crucially, if the user is using a secure search_path, then any changes to the meaning of the code that captures that search_path will have to be made by that user or someone with a superset of their privileges, which is a lot less serious than what you get when there's no search_path setting at all, where the *caller* can change the meaning of the called code. That is not, however, to say that this idea is really good enough. To be honest, I think it's a bit of a kludge, and dropping a kludge on top of our entire user base and maybe also breaking a lot of things is not particularly where I want to be. I just don't have an idea I like better at the moment. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pgsql: Fix search_path to a safe value during maintenance operations.
From
"David G. Johnston"
Date:
On Tue, Aug 1, 2023 at 10:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
Now, if we don't go in the direction of resolving everything at parse
time, then I think capturing search_path is probably the next best
thing, or at least the next best thing that I've thought up so far.
I'd much rather strongly encourage the user to write a safe and self-sufficient function definition. Specifically, if there is no search_path attached to the function then the search path that will be in place will be temp + pg_catalog only. Though I wonder whether it would be advantageous to allow a function to have a temporary schema separate from the session-scoped one...
They can use ALTER FUNCTION and the existing "FROM CURRENT" specification to get back to current behavior if desired.
Going further, I could envision an "explicit" mode that both performs a parse-time check for object existence and optionally reports an error if the lookup happened via an inexact match - forcing lots more type casts to occur (we'd probably need to invent something to say "must match the anyelement function signature") but ensuring at parse time you've correctly identified everything you intend to be using. Sure, the meanings of those things could change but the surface is much smaller than plugging a new function that matches earlier in the lookup resolution process.
David J.
On Tue, 2023-08-01 at 11:16 -0700, David G. Johnston wrote: > They can use ALTER FUNCTION and the existing "FROM CURRENT" > specification to get back to current behavior if desired. The current behavior is that the search_path comes from the environment each execution. FROM CURRENT saves the search_path at definition time and uses that each execution. Regards, Jeff Davis
Re: pgsql: Fix search_path to a safe value during maintenance operations.
From
"David G. Johnston"
Date:
On Tue, Aug 1, 2023 at 2:38 PM Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2023-08-01 at 11:16 -0700, David G. Johnston wrote:
> They can use ALTER FUNCTION and the existing "FROM CURRENT"
> specification to get back to current behavior if desired.
The current behavior is that the search_path comes from the environment
each execution. FROM CURRENT saves the search_path at definition time
and uses that each execution.
Right...I apparently misread "create" as "the" in "when CREATE FUNCTION is executed".
The overall point stands, it just requires defining a similar "FROM SESSION" to allow for explicitly specifying the current default (missing) behavior.
David J.
On Tue, 2023-08-01 at 13:41 -0400, Robert Haas wrote: > In functions and procedures, except for the new > BEGIN ATOMIC stuff, we just store the statements as a string and they > get parsed at execution time. ... > I think that a lot of people would like it if we moved more in the > direction of parsing statements at object definition time. Do you mean that we'd introduce some BEGIN ATOMIC version of plpgsql (and other trusted languages)? > However, there are some > fairly significant problems with that idea. To satisfy intended use cases of things like default expressions, CHECK constraints, index expressions, etc., there is no need to call functions that would be subject to the problems you list. One problem is that functions are too general a concept -- we have no idea whether the user is trying to do something simple or trying to do something "interesting". > Now, if we don't go in the direction of resolving everything at parse > time, It would be useful to pursue this approach, but I don't think it will be enough. We still need to solve our search_path problems. > then I think capturing search_path is probably the next best > thing, +0.5. > To be honest, I think it's > a > bit of a kludge, and dropping a kludge on top of our entire user base > and maybe also breaking a lot of things is not particularly where I > want to be. I just don't have an idea I like better at the moment. We will also be fixing things for a lot of users who just haven't run into a problem *yet* (or perhaps did, and just don't know it). Regards, Jeff Davis
On Tue, 2023-08-01 at 14:47 -0700, David G. Johnston wrote: > The overall point stands, it just requires defining a similar "FROM > SESSION" to allow for explicitly specifying the current default > (missing) behavior. That sounds useful as a way to future-proof function definitions that intend to use the session search_path. It seems like we're moving in the direction of search_path defaulting to FROM CURRENT (probably with a long road of compatibility GUCs to minimize breakage...) and everything else defaulting to FROM SESSION (as before)? Regards, Jeff Davis