Thread: Possible major bug in PlPython (plus some other ideas)
Hi everyone, I have noticed a possibly major issues in Plpython that may need to be addressed before 7.2 is released: 1) If Plpython is installed as a trusted language, and from what little I can glean from the documentation, it shouldnot have any filesystem access. However, the default behavior of the restricted execution environment being usedallows read-only filesystem access. Here is the current behavior (from the Python Library Reference): r_open(filename[, mode[, bufsize]]) Method called when open() is called in the restricted environment. The arguments are identical to those of open(),and a file object (or a class instance compatible with file objects) should be returned. RExec's defaultbehavior is allow opening any file for reading, but forbidding any attempt to write a file. See the example belowfor an implementation of a less restrictive r_open(). It is fairly easy to override this method to unconditionally raise an access exception. I have some other suggestions that may not be appropriate for the 7.2 release, but think should be addressed before too long: 2) I'm not sure why the TD dictionary exists. Why not create objects 'new', 'old' or 'event' in the global namespacewhen the interpreter is called in the appropriate contexts? The current way is unwieldy, not very 'Pythonic'(a frequent justification for change in the Python world), and not consistent with other PostgreSQL proceduralbackends. Its possible to keep TD for backward compatibility, so there is no downside. 3) 'old' and 'new' should also provide class-like syntax: e.g. old.foo, new.baz (using getitem) instead of old['foo'], new['baz'] (using getattr) Of course we cannot drop the getattr interface, since many valid column names are not valid python identifiers (Ithink -- I haven't looked at the SQL grammar lately, though I'm guessing that is the case). 4) Plpython does not use the standard Python boolean checks, which is also not very Pythonic and somewhat confusing: e.g. CREATE OR REPLACE FUNCTION py_true() RETURNS bool AS ' return "a" ' LANGUAGE 'plpython'; CREATE OR REPLACE FUNCTION py_false() RETURNS bool AS ' return {} ' LANGUAGE 'plpython'; # select py_true(); ERROR: Bad boolean external representation 'a' select py_false(); ERROR: Bad boolean external representation '{}' These should return: # select py_true(); -- non-empty strings evaluate to true bool ------ t (1 row) select py_false(); -- empty dictionaries evaluate to false bool ------ f (1row) I suggest changing the semantics of boolean return values to use PyObject_IsTrue(PyObject *o) to properly test fortruth values. 5) It should be trivial to create an "untrusted" version of Plpython that bypasses the restricted execution environment. This is worthy of some consideration, since it may be very useful and can be implemented with relativeease. 6) [Very low priority] Its not insane to consider a Plpython procedure that spawns off a Python thread to do backgroundprocessing tasks. This is obviously something that will only be possible in an untrusted version of the interpreter. Also, if the SPI interface is thread-safe, then it may be useful to release the Python interpreter lock around some of the SPI calls. OK, so I've got a laundry list of issues. Only the first issue is a real show-stopper in my mind, though I'd like to see at least 1-4 addressed before 7.2 or 7.2.1, if at all possible. After some discussion, I'll even by happy to implement most/all of these items, though I'd like more collaboration than just submitting patches blindly for consideration. Thanks, -Kevin Jacobs PS: Oh, I'd like to thank everyone working on PostgreSQL for the wonderful job they've done. I'm a _very_ new user andam in the process of porting a _very large_ project from MySQL/MSSQL to PostgresQL, for obvious reasons. I had expectedthe process to be painful and tedious, but instead it has been a real pleasure and I have enjoyed exploring all of the wonderful things you all have given me to play with. -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
* Kevin Jacobs (jacobs@penguin.theopalgroup.com) [011109 08:59]: > > 1) If Plpython is installed as a trusted language, and from what little I > can glean from the documentation, it should not have any filesystem access. > However, the default behavior of the restricted execution environment > being used allows read-only filesystem access. >... > It is fairly easy to override this method to unconditionally raise an > access exception. Agreed. Just to amplify the point below, there is currently no distinction between trusted and untrusted in PLpython (hmm, need a doc change to identify this?). lanpltrusted appears nowhere in the implementation. > 2) I'm not sure why the TD dictionary exists. Why not create objects > 'new', 'old' or 'event' in the global namespace when the interpreter is > called in the appropriate contexts? The current way is unwieldy, not > very 'Pythonic' (a frequent justification for change in the Python > world), and not consistent with other PostgreSQL procedural backends. > Its possible to keep TD for backward compatibility, so there is no > downside. > > 3) 'old' and 'new' should also provide class-like syntax: > > e.g. old.foo, new.baz (using getitem) > > instead of > old['foo'], new['baz'] (using getattr) > > Of course we cannot drop the getattr interface, since many valid column > names are not valid python identifiers (I think -- I haven't looked at > the SQL grammar lately, though I'm guessing that is the case). Agree on both. > 4) Plpython does not use the standard Python boolean checks, which is also > not very Pythonic and somewhat confusing: > ... > I suggest changing the semantics of boolean return values to use > PyObject_IsTrue(PyObject *o) to properly test for truth values. Agree. Is this the only type that needs special treatment? > 5) It should be trivial to create an "untrusted" version of Plpython that > bypasses the restricted execution environment. This is worthy of some > consideration, since it may be very useful and can be implemented with > relative ease. Strongly agree. I'd like to see it for your "7.2.1" proposal. > 6) [Very low priority] Its not insane to consider a Plpython procedure > that spawns off a Python thread to do background processing tasks. > This is obviously something that will only be possible in an untrusted > version of the interpreter. Also, if the SPI interface is thread-safe, > then it may be useful to release the Python interpreter lock around > some of the SPI calls. Three weeks ago, I really wanted a feature like this, but I've slowly been convincing myself that it's a bad idea. Consider the effects of a race condition in user generated threaded python code that results in a deadlock within the backend. Extend that to a backend that's holding a database lock of some kind. The expertise required to diagnose and debug such a condition would be rather substantial - threading, python threading, sql, postgres internals, ... in one person. I solved my design issue with the use of an external process and a NOTIFY, and I'm much happier for it. The threaded python stuff can stay out in another process where the interaction concerns are minimized. I will say that the discrepancy between executing SQL from plpython and from python with the pg driver does cause some cognitive disconnect, because despite working in the same language, I have to use different APIs to the same functionality. I don't have a good proposal to solve it, though. > OK, so I've got a laundry list of issues. Only the first issue is a real > show-stopper in my mind, though I'd like to see at least 1-4 addressed > before 7.2 or 7.2.1, if at all possible. After some discussion, I'll even > by happy to implement most/all of these items, though I'd like more > collaboration than just submitting patches blindly for consideration. I'll happily help. One last issue: 7) Would it be completely impossible to make an extra-global dictionary that was shared between multiple back-ends (place in shared memory)? Anything I'm likely to place in GD, I'm likely to initialize once and want in all backends. -Brad McLean
Kevin Jacobs wrote: > > > > 1) If Plpython is installed as a trusted language, and from what little I > > > can glean from the documentation, it should not have any filesystem access. > > > However, the default behavior of the restricted execution environment > > > being used allows read-only filesystem access. > > > > we have 'read-only filesystem access anyhow' : > > Then I consider this a bug if a non-super-user can do this. It's not that bad - only postgresql superuser can use copy to/from file . ------------- Hannu
Kevin Jacobs wrote: > > Hi everyone, > > I have noticed a possibly major issues in Plpython that may need to be > addressed before 7.2 is released: > > 1) If Plpython is installed as a trusted language, and from what little I > can glean from the documentation, it should not have any filesystem access. > However, the default behavior of the restricted execution environment > being used allows read-only filesystem access. we have 'read-only filesystem access anyhow' : pg72b2=# create table hack(row text); CREATE pg72b2=# copy hack from '/home/pg72b2/data/pg_hba.conf' DELIMITERS '\01'; COPY pg72b2=# select * from hack limit 10; row -------------------------------------------------------------------------------# # PostgreSQL HOST-BASED ACCESS(HBA) CONTROL FILE# # # This file controls:# o which hosts are allowed to connect# o how users are authenticatedon each host# o databases accessible by each host# # It is read on postmaster startup and when the postmasterreceives a SIGHUP. (10 rows) do I can't consider having it in plputhon any bigger security threat. using copy xxx to '/file/' we have even read-write access, we just can't overwrite 0600 files. And you can do only what the postgres user can do. > 2) I'm not sure why the TD dictionary exists. Why not create objects > 'new', 'old' or 'event' in the global namespace when the interpreter is > called in the appropriate contexts? The current way is unwieldy, not > very 'Pythonic' (a frequent justification for change in the Python > world), and not consistent with other PostgreSQL procedural backends. > Its possible to keep TD for backward compatibility, so there is no > downside. > > 3) 'old' and 'new' should also provide class-like syntax: > > e.g. old.foo, new.baz (using getitem) > > instead of > old['foo'], new['baz'] (using getattr) > > Of course we cannot drop the getattr interface, since many valid column > names are not valid python identifiers (I think -- I haven't looked at > the SQL grammar lately, though I'm guessing that is the case). You can have almost anything in an identifier if it is quoted. ----------- Hannu
Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > I have noticed a possibly major issues in Plpython that may need to be > addressed before 7.2 is released: > 1) If Plpython is installed as a trusted language, and from what little I > can glean from the documentation, it should not have any filesystem access. > However, the default behavior of the restricted execution environment > being used allows read-only filesystem access. I agree, this is not good. If it's easy to patch, please submit a patch. What worries me is not so much this particular hole, which is easily plugged now that we know about it, as that it suggests that Python's idea of a restricted environment is considerably less restricted than we would like. Perhaps there are other facilities that need to be turned off as well? The alternative we could consider is to mark plpython as untrusted for 7.2, until someone has time for a more complete review of possible security problems. > I have some other suggestions that may not be appropriate for the 7.2 > release, but think should be addressed before too long: This would all be good stuff to address in 7.3 or further in the future. As far as I'm concerned, all the PL languages except plpgsql are barely out of the "proof of concept" stage; they all need a lot of work from interested people to bring them to the "industrial strength" stage. If you want to be one of those people, step right up! > 6) [Very low priority] Its not insane to consider a Plpython procedure > that spawns off a Python thread to do background processing tasks. > This is obviously something that will only be possible in an untrusted > version of the interpreter. Also, if the SPI interface is thread-safe, > then it may be useful to release the Python interpreter lock around > some of the SPI calls. SPI is not thread-safe; in fact no part of the backend is thread-safe or designed for multithreading at all. This one I'd view with great wariness. regards, tom lane
* Kevin Jacobs (jacobs@penguin.theopalgroup.com) [011109 10:53]: > > On Fri, 9 Nov 2001, Bradley McLean wrote: > > Actually, I'm mostly unaware of how createlang works, but I assumed that > plpython was installed as a trusted backend. Can anyone clarify _exactly_ > what is meant by a 'trusted'? IANA Expert here. createlang (shell script, just read) defines plpython as "TRUSTED", but you can install it either way with CREATE LANGUAGE. Looking at the pl/tcl and pl/perl implementations, plpython.c needs to look at the Form_pg_language->lanpltrusted element to get the flag. Then, plprocedure_compile needs to be modified to set up either exec or r_exec as appropriate. Also looks like PLy_init and it's sub procedures need some rework as well. Note that I'm just another newcomer who has submitted one more patch than you, but has a keen interest in a strong python PL. > To fix this issue, is it better to create a C-API method override to RExec > or should we implement a real plpy.py module that does this (and some of the > rest of my suggestions). I'd prefer the module, since it is _much_ easier > to do, though it does incur some extra overhead. I think we need to continue to work in the C-API, and not at the Python level. It's not *that* much harder, and I don't see a continuing need to work at the python level after this situation is fixed. > Virtually all the other basic types will usually do the right thing. > We could add support for array conversions: > > Python [1,2,3] => Postgres INTEGER[] {1,2,3} > > Can you think of any others? No, and I couldn't before; just asking the design question. > > Three weeks ago, I really wanted a feature like this, but I've slowly > > been convincing myself that it's a bad idea. Consider the effects of > > a race condition in user generated threaded python code that results > > in a deadlock within the backend. > > I'm not completely sold on this either. I'm going to let this idea > percolate for a while and wait to see if I can find a need for it. Sounds like a plan. > Other than not having the same modules available, what is annoying you most? CREATE FUNCTION one() returns int4 AS ' result = plpy.execute("SELECT 1 as one") return result[0]["one"] ' language 'plpython'; vs. from pg import DB def one(): db = DB(dbname="foo") result = db.query("SELECT 1 as one") return result.getresult()[0][0] ----- query vs execute, getresult() in one case and not the other, results by column name vs column index. Again, I'm not sure I have a strong proposal to fix it, since the two APIs run in very different environments, and one is influenced by the python database api, the other by the postgres SPI api. I'm just whining because when they're up in two emacs windows, I'm forever putting the wrong one in the wrong place. > > 7) Would it be completely impossible to make an extra-global > > dictionary that was shared between multiple back-ends (place > > in shared memory)? Anything I'm likely to place in GD, I'm likely > > to initialize once and want in all backends. > > It will be very tricky to do in full generality. Anything that is inserted > into the "shared global" dictionary will have to be copied, re-allocated on > the shared heap, and hidden from the garbage collector. This is fairly easy > to do for a simple types {int,float,string}; much harder for more complex > types. Agreed it would be tricky. But is it useful? Would write-once (constant) semantics help? -Brad
Hannu Krosing <hannu@tm.ee> writes: >> However, the default behavior of the restricted execution environment >> being used allows read-only filesystem access. > we have 'read-only filesystem access anyhow' : > pg72b2=# create table hack(row text); > CREATE > pg72b2=# copy hack from '/home/pg72b2/data/pg_hba.conf' DELIMITERS > '\01'; Only if you're superuser, which is exactly the point of the trusted vs untrusted function restriction. The plpython problem lets non-superusers read any file that the postgres user can read, which is not cool. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > What worries me is not so much this particular hole, which is easily > plugged now that we know about it, as that it suggests that Python's > idea of a restricted environment is considerably less restricted than > we would like. Perhaps there are other facilities that need to be > turned off as well? Could be. FWIW, Zope (www.zope.org) allows for Python scripts, created and managed through the web, that run in a "sandbox" with many of the same restrictions as PG puts on untrusted languages--they actually disallow regex matching so you can't hang the webserver thread with a regex that backtracks forever. Might be worthhhile for the plpython folks to take a look at Zope. > The alternative we could consider is to mark plpython as untrusted for > 7.2, until someone has time for a more complete review of possible > security problems. This sounds like a good idea to me. -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
Re: Possible major bug in PlPython (plus some other ideas)
From
teg@redhat.com (Trond Eivind Glomsrød)
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > Hannu Krosing <hannu@tm.ee> writes: > >> However, the default behavior of the restricted execution environment > >> being used allows read-only filesystem access. > > > we have 'read-only filesystem access anyhow' : > > > pg72b2=# create table hack(row text); > > CREATE > > pg72b2=# copy hack from '/home/pg72b2/data/pg_hba.conf' DELIMITERS > > '\01'; > > Only if you're superuser, which is exactly the point of the trusted > vs untrusted function restriction. The plpython problem lets > non-superusers read any file that the postgres user can read, which > is not cool. If a fix is made, will it be backported to the 7.1 branch so vendors can upgrade their packages if this is necesarry? -- Trond Eivind Glomsrød Red Hat, Inc.
Doug McNaught <doug@wireboard.com> writes: > FWIW, Zope (www.zope.org) allows for Python scripts, created > and managed through the web, that run in a "sandbox" with many of the > same restrictions as PG puts on untrusted languages--they actually ^^^^^^^^ Er, I meant 'trusted' here, of course. -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
On Fri, Nov 09, 2001 at 03:25:04PM -0500, Doug McNaught wrote: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > What worries me is not so much this particular hole, which is easily > > plugged now that we know about it, as that it suggests that Python's > > idea of a restricted environment is considerably less restricted than > > we would like. Perhaps there are other facilities that need to be > > turned off as well? > > Could be. FWIW, Zope (www.zope.org) allows for Python scripts, created > and managed through the web, that run in a "sandbox" with many of the > same restrictions as PG puts on untrusted languages--they actually > disallow regex matching so you can't hang the webserver thread with a > regex that backtracks forever. Might be worthhhile for the plpython > folks to take a look at Zope. And it took _forever_ to convince the Zope folks to put it in, for this very reason. Those who wanted python scripts (through the web interface, as opposed to through the filesystem) had to jump through all the hoops to make it safe enough. Ross
"Ross J. Reedstrom" wrote: > > On Fri, Nov 09, 2001 at 03:25:04PM -0500, Doug McNaught wrote: > > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > > > What worries me is not so much this particular hole, which is easily > > > plugged now that we know about it, as that it suggests that Python's > > > idea of a restricted environment is considerably less restricted than > > > we would like. Perhaps there are other facilities that need to be > > > turned off as well? Perhaps we need some general guidelines for trusted PLs - lists of things to restrict, e.t.c. Perhaps we even need something like CodeBase Principals from Netscape/Javascript ? Or just more fine-grained PRIVILEGEs ? The python way of defining restricted execution is quite flexible in what to allow. In Zope they used to restict even some kinds of loops and list constructors to prevent users from shooting themselves in the foot. They are now relaxing some restrictions, as there are too many unrestricted ways for doing it to warrant making common operations cumbersome. > > Could be. FWIW, Zope (www.zope.org) allows for Python scripts, created > > and managed through the web, that run in a "sandbox" with many of the > > same restrictions as PG puts on untrusted languages--they actually > > disallow regex matching so you can't hang the webserver thread with a > > regex that backtracks forever. Are there any plans to disallow regex matching in postgreSQL as well ??? AFAIK there are simpler ways to hog a DB server as anyone writing SQL queries can tell you ;) A more reliable approach for DB server may be establishing per-user memory/time/cpu quotas and just rolling back queries that exceed them. > > Might be worthhhile for the plpython folks to take a look at Zope. > > And it took _forever_ to convince the Zope folks to put it in, for this > very reason. Those who wanted python scripts (through the web interface, > as opposed to through the filesystem) had to jump through all the hoops > to make it safe enough. ----------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > Are there any plans to disallow regex matching in postgreSQL as well ??? > AFAIK there are simpler ways to hog a DB server as anyone writing > SQL queries can tell you ;) That was my reaction too --- disabling regexes is not appropriate for the Postgres environment. But we do need to prevent access to the filesystem, even if it's just read-only access. regards, tom lane
Trond Eivind Glomsrød writes: > If a fix is made, will it be backported to the 7.1 branch so vendors > can upgrade their packages if this is necesarry? Probably not, since there is no PL/Python in 7.1. -- Peter Eisentraut peter_e@gmx.net
I would like to have a patch for this into 7.2 because it is a security problem. --------------------------------------------------------------------------- > Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > > I have noticed a possibly major issues in Plpython that may need to be > > addressed before 7.2 is released: > > > 1) If Plpython is installed as a trusted language, and from what little I > > can glean from the documentation, it should not have any filesystem access. > > However, the default behavior of the restricted execution environment > > being used allows read-only filesystem access. > > I agree, this is not good. If it's easy to patch, please submit a > patch. > > What worries me is not so much this particular hole, which is easily > plugged now that we know about it, as that it suggests that Python's > idea of a restricted environment is considerably less restricted than > we would like. Perhaps there are other facilities that need to be > turned off as well? > > The alternative we could consider is to mark plpython as untrusted for > 7.2, until someone has time for a more complete review of possible > security problems. > > > I have some other suggestions that may not be appropriate for the 7.2 > > release, but think should be addressed before too long: > > This would all be good stuff to address in 7.3 or further in the future. > As far as I'm concerned, all the PL languages except plpgsql are barely > out of the "proof of concept" stage; they all need a lot of work from > interested people to bring them to the "industrial strength" stage. > If you want to be one of those people, step right up! > > > 6) [Very low priority] Its not insane to consider a Plpython procedure > > that spawns off a Python thread to do background processing tasks. > > This is obviously something that will only be possible in an untrusted > > version of the interpreter. Also, if the SPI interface is thread-safe, > > then it may be useful to release the Python interpreter lock around > > some of the SPI calls. > > SPI is not thread-safe; in fact no part of the backend is thread-safe > or designed for multithreading at all. This one I'd view with great > wariness. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
I'm going to do this today (Kevin, I'm not stepping on your toes, am I?) I have some other problems with the module to be addressed (there are about a million, or at least a dozen ways to get it to return Null pointers to the backend, crash the backend, and cause a general shared mem corruption message from the postmaster). -Brad * Bruce Momjian (pgman@candle.pha.pa.us) [011112 00:41]: > > I would like to have a patch for this into 7.2 because it is a security > problem. > > > --------------------------------------------------------------------------- > > > Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > > > I have noticed a possibly major issues in Plpython that may need to be > > > addressed before 7.2 is released: > > > > > 1) If Plpython is installed as a trusted language, and from what little I > > > can glean from the documentation, it should not have any filesystem access. > > > However, the default behavior of the restricted execution environment > > > being used allows read-only filesystem access. > > > > I agree, this is not good. If it's easy to patch, please submit a > > patch. > > > > What worries me is not so much this particular hole, which is easily > > plugged now that we know about it, as that it suggests that Python's > > idea of a restricted environment is considerably less restricted than > > we would like. Perhaps there are other facilities that need to be > > turned off as well? > > > > The alternative we could consider is to mark plpython as untrusted for > > 7.2, until someone has time for a more complete review of possible > > security problems. > > > > > I have some other suggestions that may not be appropriate for the 7.2 > > > release, but think should be addressed before too long: > > > > This would all be good stuff to address in 7.3 or further in the future. > > As far as I'm concerned, all the PL languages except plpgsql are barely > > out of the "proof of concept" stage; they all need a lot of work from > > interested people to bring them to the "industrial strength" stage. > > If you want to be one of those people, step right up! > > > > > 6) [Very low priority] Its not insane to consider a Plpython procedure > > > that spawns off a Python thread to do background processing tasks. > > > This is obviously something that will only be possible in an untrusted > > > version of the interpreter. Also, if the SPI interface is thread-safe, > > > then it may be useful to release the Python interpreter lock around > > > some of the SPI calls. > > > > SPI is not thread-safe; in fact no part of the backend is thread-safe > > or designed for multithreading at all. This one I'd view with great > > wariness. > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > I had to guess what a "trusted" language should or should not do. For > example, I do not allow Python to report the platform it is running on, > though I do allow it to report the native byte-order and interpreter > version. Do you mean platform type, or host name? "select version()" reports the platform type, so I see no reason to hide that in plpython. Otherwise the changes sound reasonable in brief. I don't know Python well enough to review the change properly, though. D'Arcy or someone, please check it... regards, tom lane
Hi Bradley, Thanks for the response! I'm very relieved to get feedback on my suggestions. On Fri, 9 Nov 2001, Bradley McLean wrote: > * Kevin Jacobs (jacobs@penguin.theopalgroup.com) [011109 08:59]: > > > > 1) If Plpython is installed as a trusted language, and from what little I > > can glean from the documentation, it should not have any filesystem access. > > However, the default behavior of the restricted execution environment > > being used allows read-only filesystem access. > >... > > It is fairly easy to override this method to unconditionally raise an > > access exception. > > Agreed. Just to amplify the point below, there is currently no > distinction between trusted and untrusted in PLpython (hmm, need > a doc change to identify this?). lanpltrusted appears nowhere > in the implementation. Actually, I'm mostly unaware of how createlang works, but I assumed that plpython was installed as a trusted backend. Can anyone clarify _exactly_ what is meant by a 'trusted'? To fix this issue, is it better to create a C-API method override to RExec or should we implement a real plpy.py module that does this (and some of the rest of my suggestions). I'd prefer the module, since it is _much_ easier to do, though it does incur some extra overhead. > > 4) Plpython does not use the standard Python boolean checks, which is also > > not very Pythonic and somewhat confusing: > > ... > > I suggest changing the semantics of boolean return values to use > > PyObject_IsTrue(PyObject *o) to properly test for truth values. > > Agree. Is this the only type that needs special treatment? Virtually all the other basic types will usually do the right thing. We could add support for array conversions: Python [1,2,3] => Postgres INTEGER[] {1,2,3} Can you think of any others? > > 6) [Very low priority] Its not insane to consider a Plpython procedure > > that spawns off a Python thread to do background processing tasks. > > This is obviously something that will only be possible in an untrusted > > version of the interpreter. Also, if the SPI interface is thread-safe, > > then it may be useful to release the Python interpreter lock around > > some of the SPI calls. > > Three weeks ago, I really wanted a feature like this, but I've slowly > been convincing myself that it's a bad idea. Consider the effects of > a race condition in user generated threaded python code that results > in a deadlock within the backend. I'm not completely sold on this either. I'm going to let this idea percolate for a while and wait to see if I can find a need for it. > I will say that the discrepancy between executing SQL from plpython > and from python with the pg driver does cause some cognitive > disconnect, because despite working in the same language, I have > to use different APIs to the same functionality. I don't have a > good proposal to solve it, though. Other than not having the same modules available, what is annoying you most? > One last issue: > > 7) Would it be completely impossible to make an extra-global > dictionary that was shared between multiple back-ends (place > in shared memory)? Anything I'm likely to place in GD, I'm likely > to initialize once and want in all backends. It will be very tricky to do in full generality. Anything that is inserted into the "shared global" dictionary will have to be copied, re-allocated on the shared heap, and hidden from the garbage collector. This is fairly easy to do for a simple types {int,float,string}; much harder for more complex types. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
On 9 Nov 2001, Doug McNaught wrote: > Doug McNaught <doug@wireboard.com> writes: > > > FWIW, Zope (www.zope.org) allows for Python scripts, created > > and managed through the web, that run in a "sandbox" with many of the > > same restrictions as PG puts on untrusted languages--they actually > > Er, I meant 'trusted' here, of course. plpython does this too -- the problem I've found is due to a bad default setting in the "sandbox". -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
On Fri, 9 Nov 2001, Hannu Krosing wrote: > Kevin Jacobs wrote: > > > > > > 1) If Plpython is installed as a trusted language, and from what little I > > > > can glean from the documentation, it should not have any filesystem access. > > > > However, the default behavior of the restricted execution environment > > > > being used allows read-only filesystem access. > > > > > > we have 'read-only filesystem access anyhow' : > > > > Then I consider this a bug if a non-super-user can do this. > > It's not that bad - only postgresql superuser can use copy to/from file Ah -- then it still means we should take read-only filesystem access away from plpython for now. If we want to implemente a trusted mode, then we can add it back in. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
On Fri, 9 Nov 2001, Tom Lane wrote: > Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > > I have noticed a possibly major issues in Plpython that may need to be > > addressed before 7.2 is released: > > > 1) If Plpython is installed as a trusted language, and from what little I > > can glean from the documentation, it should not have any filesystem access. > > However, the default behavior of the restricted execution environment > > being used allows read-only filesystem access. > > I agree, this is not good. If it's easy to patch, please submit a > patch. I'll have something ready by Monday. > What worries me is not so much this particular hole, which is easily > plugged now that we know about it, as that it suggests that Python's > idea of a restricted environment is considerably less restricted than > we would like. Perhaps there are other facilities that need to be > turned off as well? I'm going to do a very careful review of the code. Upfront, I expect that I've found the only major problem. There is already a very good "restricted execution" enviornment in place. The read-only filesystem issue slipped through the cracks because it is the default behavior for the evironment. I'll spend the time to go over any nooks and crannies that bear careful scrutiny. > The alternative we could consider is to mark plpython as untrusted for > 7.2, until someone has time for a more complete review of possible > security problems. If I don't feel that the code is 100% then I'll vote for this option too. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
> > 1) If Plpython is installed as a trusted language, and from what little I > > can glean from the documentation, it should not have any filesystem access. > > However, the default behavior of the restricted execution environment > > being used allows read-only filesystem access. > > we have 'read-only filesystem access anyhow' : Then I consider this a bug if a non-super-user can do this. > using copy xxx to '/file/' we have even read-write access, we just can't > overwrite 0600 files. And you can do only what the postgres user can do. This is an even bigger bug. I didn't think I needed to run PostgreSQL in a chroot jail, but its looking more like that may be needed. Any comments from other developers? Is this really the security model you want? If keep telling me things like this, I'll stop using Postgres! -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
On Mon, 12 Nov 2001, Tom Lane wrote: > Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes: > > I had to guess what a "trusted" language should or should not do. For > > example, I do not allow Python to report the platform it is running on, > > though I do allow it to report the native byte-order and interpreter > > version. > > Do you mean platform type, or host name? "select version()" reports the > platform type, so I see no reason to hide that in plpython. I meant platform type. Since its already available in Postgre, I'll make it available in my next path. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
Kevin, I've been using / reviewing your patch and it looks good. Did you submit it to patches? (Everyone) Would a patch to add trusted language support be accepted for 7.2, or is it too late? -Brad
Bradley McLean <brad@bradm.net> writes: > (Everyone) Would a patch to add trusted language support be accepted > for 7.2, or is it too late? I think the code in there already is the trusted case, no? The addition would be an untrusted mode for plpython. trusted = language handler prevents security violations, so unprivileged users are allowed to define functions in the language (ie, we trust the language itself to prevent security breaches) untrusted = language allows user to access things outside database, so only Postgres superusers are allowed to define functions in the language (ie, we must trust the function author instead of the language) In any case, a second security level in plpython would clearly be a new feature, and so I'd say it's too late to consider it for 7.2. All that we want to do at this point is verify Kevin's proposed patch for the existing security level. But certainly a "plpythonu" addition would be welcome for 7.3. regards, tom lane
On Tue, 13 Nov 2001, Bradley McLean wrote: > Kevin, I've been using / reviewing your patch and it looks good. Great! I'll have a very slightly updated version ready by tomorrow that incorprates some suggestions from Tom Lane. > Did you submit it to patches? I sent it to hackers, though the list manager really hates that I'm not subscribed. > (Everyone) Would a patch to add trusted language support be accepted > for 7.2, or is it too late? I've got a patch about half-done to do this. If this gets the green light, then lets work together on this. -Kevin
Has this all been addressed? Are there any TODO items here? --------------------------------------------------------------------------- > Bradley McLean <brad@bradm.net> writes: > > (Everyone) Would a patch to add trusted language support be accepted > > for 7.2, or is it too late? > > I think the code in there already is the trusted case, no? The addition > would be an untrusted mode for plpython. > > trusted = language handler prevents security violations, so unprivileged > users are allowed to define functions in the language (ie, we trust the > language itself to prevent security breaches) > > untrusted = language allows user to access things outside database, > so only Postgres superusers are allowed to define functions in the > language (ie, we must trust the function author instead of the language) > > In any case, a second security level in plpython would clearly be a new > feature, and so I'd say it's too late to consider it for 7.2. All that > we want to do at this point is verify Kevin's proposed patch for the > existing security level. But certainly a "plpythonu" addition would > be welcome for 7.3. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> On Sat, 17 Nov 2001, Bruce Momjian wrote: > > Has this all been addressed? Are there any TODO items here? > > All of the security related _problems_ that affect the rest of 7.2 have been > solved, to the best of my knowledge. The discussion below pretains to adding > an additional untrusted mode like plperl has. Since this is a new feature, > it is on the TODO list for 7.3. OK, added to TODO: * Add untrusted version of plpython -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Sat, 17 Nov 2001, Bruce Momjian wrote: > Has this all been addressed? Are there any TODO items here? All of the security related _problems_ that affect the rest of 7.2 have been solved, to the best of my knowledge. The discussion below pretains to adding an additional untrusted mode like plperl has. Since this is a new feature, it is on the TODO list for 7.3. Regards, -Kevin Jacobs > > --------------------------------------------------------------------------- > > > Bradley McLean <brad@bradm.net> writes: > > > (Everyone) Would a patch to add trusted language support be accepted > > > for 7.2, or is it too late? > > > > I think the code in there already is the trusted case, no? The addition > > would be an untrusted mode for plpython. > > > > trusted = language handler prevents security violations, so unprivileged > > users are allowed to define functions in the language (ie, we trust the > > language itself to prevent security breaches) > > > > untrusted = language allows user to access things outside database, > > so only Postgres superusers are allowed to define functions in the > > language (ie, we must trust the function author instead of the language) > > > > In any case, a second security level in plpython would clearly be a new > > feature, and so I'd say it's too late to consider it for 7.2. All that > > we want to do at this point is verify Kevin's proposed patch for the > > existing security level. But certainly a "plpythonu" addition would > > be welcome for 7.3. > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > > > > -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com