Thread: Add on_perl_init and proper destruction to plperl [PATCH]
This is the third of the patches to be split out from the former 'plperl feature patch 1'. Changes in this patch: - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) SPI functions are not available when the code is run. - Added normal interpreter destruction behaviour END blocks, if any, are run then objects are destroyed, calling their DESTROY methods, if any. SPI functions will die if called at this time. Tim.
Attachment
Tim Bunce wrote: > - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) > SPI functions are not available when the code is run. > > - Added normal interpreter destruction behaviour > END blocks, if any, are run then objects are > destroyed, calling their DESTROY methods, if any. > SPI functions will die if called at this time. > OK, we've made good progress with the PL/Perl patches, and this one is next on the queue. It should also be noted that as proposed END blocks will not run at all in the postmaster, even if perl is preloaded in the postmaster and the preloaded code sets END handlers. That makes setting them rather safer, ISTM. So, are there still objections to applying this patch? (Note, this is different from the proposal to specify on_trusted_init and on_untrusted_init handlers. The on_perl_init handler would be run on library load, and is mainly for the purpose of preloading perl modules and the like). cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tim Bunce wrote: >> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) >> SPI functions are not available when the code is run. >> >> - Added normal interpreter destruction behaviour >> END blocks, if any, are run then objects are >> destroyed, calling their DESTROY methods, if any. >> SPI functions will die if called at this time. > So, are there still objections to applying this patch? Yes. regards, tom lane
On Tue, Jan 26, 2010 at 23:14, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Tim Bunce wrote: >>> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) >>> SPI functions are not available when the code is run. >>> >>> - Added normal interpreter destruction behaviour >>> END blocks, if any, are run then objects are >>> destroyed, calling their DESTROY methods, if any. >>> SPI functions will die if called at this time. > >> So, are there still objections to applying this patch? > > Yes. FWIW the atexit scares me to. I was thinking a good workaround perhaps would be to provide a function that destroys the interpreter (so that the END blocks get called). Tim would that work OK ? If we are still worried about that hanging we can probably do something hacky with alarm() and/or signals... Maybe a good solid use case will help figure this out? Im assuming the current one is to profile plperl functions and dump a prof file in /tmp/ or some such (which happens at END time). Or did I miss the use case in one of the other threads?
On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote: > On Tue, Jan 26, 2010 at 23:14, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tim Bunce wrote: > >>> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) > >>> SPI functions are not available when the code is run. > >>> > >>> - Added normal interpreter destruction behaviour > >>> END blocks, if any, are run then objects are > >>> destroyed, calling their DESTROY methods, if any. > >>> SPI functions will die if called at this time. > > > >> So, are there still objections to applying this patch? > > > > Yes. > > FWIW the atexit scares me to. In what way, specifically? I understand concerns about interacting with the database, so the patch ensures that any use of spi functions throws an exception. I don't recall any other concrete concerns. Specifically, how is code that starts executing at the end of a session different in risk to code that starts executing before the end of a session? DO $$ while (1) { } $$ language plperl; Tim.
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tim Bunce wrote: >> >>> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) >>> SPI functions are not available when the code is run. >>> >>> - Added normal interpreter destruction behaviour >>> END blocks, if any, are run then objects are >>> destroyed, calling their DESTROY methods, if any. >>> SPI functions will die if called at this time. >>> > > >> So, are there still objections to applying this patch? >> > > Yes. > > > I see I asked the wrong question. Start again. What more should be done to make all or some of it acceptable? cheers andrew
On Wed, Jan 27, 2010 at 01:14:16AM -0500, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Tim Bunce wrote: > >> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) > >> SPI functions are not available when the code is run. > >> > >> - Added normal interpreter destruction behaviour > >> END blocks, if any, are run then objects are > >> destroyed, calling their DESTROY methods, if any. > >> SPI functions will die if called at this time. > > > So, are there still objections to applying this patch? > > Yes. To focus the discussion I've looked back through all the messages from you that relate to this issue so I can summarize and try to address your objections. Some I've split or presented out of order, most relate to earlier (less restricted) versions of the patch before it was split out, and naturally they are lacking some context, so I've included archive URLs. Please forgive and correct me if I misrepresent you or your intent here. Regarding the utility of plperl.on_perl_init and END: http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us The question is not about whether we thinkit's useful; the question is about whether it's safe. I agree. Regarding visibility of changes to plperl.on_perl_init: http://archives.postgresql.org/message-id/28618.1259952660@sss.pgh.pa.us What is to happen if the admin changes thevalue when the system is already up? If a GUC could be defined as PGC_BACKEND and only settable by superuser, perhaps that would be a good fit. [GucContext seems to conflate some things.] Meanwhile the _init name is meant to convey the fact that it's a before-first-use GUC, like temp_buffers. I'm happy to accept whatever you'd recommend by way of PGC_* GUC selection. Documentation can note any caveats associated with combining plperl.on_perl_init with shared_preload_libraries. http://archives.postgresql.org/message-id/4516.1263168347@sss.pgh.pa.us However, I think PGC_SIGHUP would be enoughto address my basic worry, which is that people shouldn't be depending on the ability to set these things withinan individual session. The patch uses PGC_SIGHUP for plperl.on_perl_init. http://archives.postgresql.org/message-id/8950.1259994082@sss.pgh.pa.us > Tom, what's your objection to Shlib load timebeing user-visible? It's not reallydesigned to be user-visible. Let me give you just two examples: * We call a plperl function for the first time in a session, causing plperl.so to be loaded. Later the transactionfails and is rolled back. If loading plperl.so caused some user-visible things to happen, should those berolled back? If so, how do we get perl to play along? If not, how do we get postgres to play along? I believe that's addressed by spi functions being disabled when init code runs. * We call a plperl function for the first time in a session, causing plperl.so to be loaded. This happens in the contextof a superuser calling a non-superuser security definer function, or perhaps vice versa. Whose permissions applyto whatever the on_load code tries to do? (Hint: every answer is wrong.) I think that related to on_*trusted_init not plperl.on_perl_init, and is also addressed by spi functions being disabled when init code runs. That doesn't even begin to cover the problems with allowing any of this to happen inside the postmaster. Recall thatthe postmaster does not have any database access. I believe that's addressed by spi functions being disabled when init code runs. Furthermore, it is a very long established reliability principle around here that the postmaster process should doas little as possible, because every thing that it does creates another opportunity to have a nonrecoverable failure. The postmaster can recover if a child crashes, but the other way round, not so much. I understand that concern. Ultimately, though, that comes down to the judgement of DBAs and the trust placed in them. They can already load arbitrary code via shared_preload_libraries. http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us > I think if we do this the on_perl_init settingshould probably be > PGC_POSTMASTER, which would remove any issue about it changing > underneath us. Yes, if the main intended usage is in combination with preloading perl at postmaster start, it would be pointless toimagine that PGC_SIGHUP is useful anyway. http://archives.postgresql.org/message-id/17793.1260031296@sss.pgh.pa.us Yeah, in the shower this morning I was thinkingthat not loading SPI till after the on_init code runs would alleviate the concerns about transactionality andpermissions --- that would ensure that whatever on_init does affects only the Perl world and not the database world. That's included in the current patch (and also applies to END blocks). However, we're not out of the woods yet. In a trusted interpreter (plperl not plperlu), is the on_init code executedbefore we lock down the interpreter with Safe? I would think it has to be since the main point AFAICS is tolet you preload code via "use". But then what is left of the security guarantees of plperl? I can hardly imagine DBAs wanting to vet a few thousand lines of random Perl code to see if it contains anything that could be subverted. plperl.on_perl_init code, set by the DBA, runs before the Safe compartment is created. Without explicitextra steps the Safe compartment has no access to code loaded by plperl.on_perl_init. The Safe compartment (plperl) could get access to loaded code in one of these ways: 1. by using SQL to call a plperlu function that accesses the code. 2. by the DBA 'sharing' a specific subroutine with the compartment. 3. by the DBA loading a module into the compartment. There's no formal interface for 2. and 3. at the moment, so the only official option is 1. (The final patch in the series includes some building blocks towards an interface for 2 & 3, but doesn't add one.) If you're willing to also confine the feature to plperlu, then maybe the risk level could be decreased from insane tomerely unreasonable. I think you could reasonably describe plperl.on_perl_init as effectively confined to plperlu (because plperl has no access to any new code). http://archives.postgresql.org/message-id/26766.1263149361@sss.pgh.pa.us For the record, I think it's a bad idea torun arbitrary user-defined code in the postmaster, and I think it's a worse idea to run arbitrary user-defined codeat backend shutdown (the END-blocks bit). I do not care in the least what applications you think this might enable--- the negative consequences for overall system stability seem to me to outweigh any possible arguments on thatside. - What happens when the supplied code has errors, For on_perl_init it throws an exception that propagates to the user statement that triggered the initialization of perl. It also ensures that perl is left in a non-initialized state, so any further uses also fail. For END blocks an error triggers an exception that's caught by perl. (As noted above, there's no access to postgres from init or END code.) - takes an unreasonable amount of time to run, Unreasonable is in the eye of the DBA, of course, and they have the discretion to set on_perl_init to fit their needs. For END blocks, I don't see how this issue is any different from "users might do something dumb", like DO 'while(1){}' language plperl; (or plpython , pltcl, or plpgsql for that matter). - does something unsafe, Such as? The code can't do anything more unsafe than is already possible. - depends on the backend not being in an error state already, The code has no access to postgress, whatever the state. - etc. etc? I'd welcome more concrete examples of potential issues. Tim.
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote: >> FWIW the atexit scares me to. > In what way, specifically? It runs too late, and too unpredictably, during the shutdown sequence. (In particular note that shutdown itself might be fired as an atexit callback, a move forced on us by exactly the sort of random user code that you want to add more of. It's not clear whether a Perl-added atexit would fire before or after that.) > I understand concerns about interacting with the database, so the > patch ensures that any use of spi functions throws an exception. That assuages my fears to only a tiny degree. SPI is not the only possible connection between perl code and the rest of the backend. Indeed, AFAICS the major *point* of these additions is to allow people to insert unknown other functionality that is likely to interact with the rest of the backend; a prospect that doesn't make me feel better about it. > Specifically, how is code that starts executing at the end of a session > different in risk to code that starts executing before the end of a session? If it runs before the shutdown sequence starts, we know we have a functioning backend. Once shutdown starts, it's unknown and mostly untested exactly what subsystems will still work and which won't. Injecting arbitrary user-written code into an unspecified point in that sequence is not a recipe for good results. Lastly, an atexit trigger will still fire during FATAL or PANIC aborts, which scares me even more. When the house is already afire, it's not prudent to politely let user-written perl code do whatever it wants before you get the heck out of there. regards, tom lane
Tom Lane wrote: > Indeed, AFAICS the major *point* of these additions is to allow people > to insert unknown other functionality that is likely to interact > with the rest of the backend; a prospect that doesn't make me feel > better about it. > > No. The major use case we've seen for END blocks is to allow a profiler to write its data out. That should have zero interaction with the rest of the backend. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> Indeed, AFAICS the major *point* of these additions is to allow people >> to insert unknown other functionality that is likely to interact >> with the rest of the backend; a prospect that doesn't make me feel >> better about it. > No. The major use case we've seen for END blocks is to allow a profiler > to write its data out. That should have zero interaction with the rest > of the backend. Really? We've found that gprof, for instance, doesn't exactly have "zero interaction with the rest of the backend" --- there's actually a couple of different bits in there to help it along, including a behavioral change during shutdown. I rather doubt that Perl profilers would turn out much different. But in any case, I don't believe for a moment that profiling is the only or even the largest use to which people would try to put this. regards, tom lane
On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote: > >> FWIW the atexit scares me to. > > > In what way, specifically? > > It runs too late, and too unpredictably, during the shutdown sequence. > (In particular note that shutdown itself might be fired as an atexit > callback, a move forced on us by exactly the sort of random user code > that you want to add more of. It's not clear whether a Perl-added > atexit would fire before or after that.) man atexit says "Functions so registered are called in reverse order". Since the plperl atexit is called only when a plperl SP or DO is executed it would fire before any atexit() registered during startup. The timing and predictability shouldn't be a significant concern if the plperl subsystem can't interact with the rest of the backend - which is the intent. > > I understand concerns about interacting with the database, so the > > patch ensures that any use of spi functions throws an exception. > > That assuages my fears to only a tiny degree. SPI is not the only > possible connection between perl code and the rest of the backend. Could you give me some examples of others? > Indeed, AFAICS the major *point* of these additions is to allow people > to insert unknown other functionality that is likely to interact > with the rest of the backend; a prospect that doesn't make me feel > better about it. The major point is *not at all* to allow people to interact with the rest of the backend. I'm specifically trying to limit that. The major point is simply to allow perl code to clean itself up properly. > > Specifically, how is code that starts executing at the end of a session > > different in risk to code that starts executing before the end of a session? > > If it runs before the shutdown sequence starts, we know we have a > functioning backend. Once shutdown starts, it's unknown and mostly > untested exactly what subsystems will still work and which won't. > Injecting arbitrary user-written code into an unspecified point in > that sequence is not a recipe for good results. The plperl subsystem is isolated from, and can't interact with, the rest of the backend during shutdown. Can you give me examples where that's not the case? > Lastly, an atexit trigger will still fire during FATAL or PANIC aborts, > which scares me even more. When the house is already afire, it's > not prudent to politely let user-written perl code do whatever it wants > before you get the heck out of there. Again, that point rests on your underlying concern about interaction between plperl and the rest of the backend. Examples? Is there some way for plperl.c to detect a FATAL or PANIC abort? If so, or if one could be added, then we could skip the END code in those circumstances. I don't really want to add more GUCs, but perhaps controlling END block execution via a plperl.destroy_end=bool (default false) would help address your concerns. Tim.
Tom Lane wrote: > But in any case, I don't believe for a moment that profiling is the only > or even the largest use to which people would try to put this. > > > Well, ISTR there have been requests over the years for event handlers for (among other things) session shutdown, so if you're speculating that people would use this as an end run around our lack of such things you could be right. Maybe providing for such handlers in a more general and at the same time more safe way would be an alternative. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I see I asked the wrong question. Start again. > What more should be done to make all or some of it acceptable? I think a "must" is to get rid of the use of atexit(). Possibly an on_proc_exit callback could be used instead, although I'm not sure how you'd handle the case of code loaded in the postmaster that would like corresponding exit-time code to happen in child processes. (OTOH, it seems likely that it's impossible to make that work correctly anyway. It certainly isn't going to work the same on EXEC_BACKEND platforms as anywhere else, and I don't particularly want to see us documenting that the feature works differently on Windows than elsewhere.) Dropping the ability to make the postmaster run any such code would go a very long way towards fixing the above, as well as assuaging other fears. The other thing that I find entirely unconvincing is Tim's idea that shutting off SPI isolates perl from the rest of the backend. I have no confidence in that, but no real idea of how to do better either :-(. If you think that shutting off SPI is sufficient, you can find counterexamples in the CVS history, for instance where we had to take special measures to prevent Perl from screwing up the locale settings. I'm afraid that on_perl_init is going to vastly expand the opportunities for that kind of unwanted side-effect; and the earlier that it runs, the more likely it's going to be that we can't recover easily. regards, tom lane
fche@redhat.com (Frank Ch. Eigler) writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> Lastly, an atexit trigger will still fire during FATAL or PANIC aborts, >> which scares me even more. When the house is already afire, it's >> not prudent to politely let user-written perl code do whatever it wants >> before you get the heck out of there. > Is there a reason that these panics don't use _exit(3) to bypass > atexit hooks? Well, I don't really want to entirely forbid the use of atexit() --- I'm just concerned about using it to run arbitrary user-written code. There might be more limited purposes for which it's a reasonable choice. regards, tom lane
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote: >> (In particular note that shutdown itself might be fired as an atexit >> callback, a move forced on us by exactly the sort of random user code >> that you want to add more of. It's not clear whether a Perl-added >> atexit would fire before or after that.) > man atexit says "Functions so registered are called in reverse order". > Since the plperl atexit is called only when a plperl SP or DO is > executed it would fire before any atexit() registered during startup. Right, which means that it would occur either before or after on_proc_exit processing, depending on whether we got there through an exit() call or via the normal proc_exit sequence. That's just the kind of instability I don't want to have to debug. > The plperl subsystem is isolated from, and can't interact with, the rest > of the backend during shutdown. This is exactly the claim that I have zero confidence in. Quite frankly, the problem with Perl as an extension language is that Perl was never designed to be a subsystem: it feels free to mess around with the entire state of the process. We've been burnt multiple times by that even with the limited use we make of Perl now, and these proposed additions are going to make it a lot worse IMO. regards, tom lane
Re: Add on_perl_init and proper destruction to plperl [PATCH]
From
fche@redhat.com (Frank Ch. Eigler)
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > [...] > Lastly, an atexit trigger will still fire during FATAL or PANIC aborts, > which scares me even more. When the house is already afire, it's > not prudent to politely let user-written perl code do whatever it wants > before you get the heck out of there. Is there a reason that these panics don't use _exit(3) to bypass atexit hooks? - FChE
On Jan 27, 2010, at 9:08 AM, Tom Lane wrote: > This is exactly the claim that I have zero confidence in. Quite > frankly, the problem with Perl as an extension language is that Perl was > never designed to be a subsystem: it feels free to mess around with the > entire state of the process. We've been burnt multiple times by that > even with the limited use we make of Perl now, and these proposed > additions are going to make it a lot worse IMO. Can you provide an example? Such concerns are impossible to address without concrete examples. Best, David
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 9:08 AM, Tom Lane wrote: >> This is exactly the claim that I have zero confidence in. Quite >> frankly, the problem with Perl as an extension language is that Perl was >> never designed to be a subsystem: it feels free to mess around with the >> entire state of the process. We've been burnt multiple times by that >> even with the limited use we make of Perl now, and these proposed >> additions are going to make it a lot worse IMO. > Can you provide an example? Such concerns are impossible to address without concrete examples. Two examples that I can find in a quick review of our CVS history: perl stomping on the process's setlocale state, and perl stomping on the stdio state (Windows only). regards, tom lane
On Jan 27, 2010, at 10:08 AM, Tom Lane wrote: > Two examples that I can find in a quick review of our CVS history: perl > stomping on the process's setlocale state, and perl stomping on the > stdio state (Windows only). Are there links to those commits? Thanks, David
On Wed, Jan 27, 2010 at 12:08:48PM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote: > >> (In particular note that shutdown itself might be fired as an atexit > >> callback, a move forced on us by exactly the sort of random user code > >> that you want to add more of. It's not clear whether a Perl-added > >> atexit would fire before or after that.) > > > man atexit says "Functions so registered are called in reverse order". > > Since the plperl atexit is called only when a plperl SP or DO is > > executed it would fire before any atexit() registered during startup. > > Right, which means that it would occur either before or after > on_proc_exit processing, depending on whether we got there through > an exit() call or via the normal proc_exit sequence. That's just > the kind of instability I don't want to have to debug. Okay. I could change the callback code to ignore calls if proc_exit_inprogress is false. So an abnormal shutdown via exit() wouldn't involve plperl at all. (Alternatively I could use use on_proc_exit() instead of atexit() to register the callback.) Would that address this call sequence instability issue? > > The plperl subsystem is isolated from, and can't interact with, the > > rest of the backend during shutdown. > > This is exactly the claim that I have zero confidence in. Quite > frankly, the problem with Perl as an extension language is that Perl was > never designed to be a subsystem: it feels free to mess around with the > entire state of the process. We've been burnt multiple times by that > even with the limited use we make of Perl now, and these proposed > additions are going to make it a lot worse IMO. On Wed, Jan 27, 2010 at 09:53:44AM -0800, David E. Wheeler wrote: > Can you provide an example? Such concerns are impossible to address > without concrete examples. On Wed, Jan 27, 2010 at 01:08:56PM -0500, Tom Lane wrote: > Two examples that I can find in a quick review of our CVS history: perl > stomping on the process's setlocale state, and perl stomping on the > stdio state (Windows only). Neither of those relate to the actions of perl source code. To address that, instead of calling perl_destruct() to perform a complete destruction I could just execute END blocks and object destructors. That would avoid executing any system-level actions. Do you have any examples of how a user could write code in a plperl END block that would interact with the rest of the backend? Tim.
On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Tom Lane wrote: > >> Indeed, AFAICS the major *point* of these additions is to allow people > >> to insert unknown other functionality that is likely to interact > >> with the rest of the backend; a prospect that doesn't make me feel > >> better about it. > > > No. The major use case we've seen for END blocks is to allow a profiler > > to write its data out. That should have zero interaction with the rest > > of the backend. > > Really? We've found that gprof, for instance, doesn't exactly have > "zero interaction with the rest of the backend" --- there's actually > a couple of different bits in there to help it along, including a > behavioral change during shutdown. I rather doubt that Perl profilers > would turn out much different. Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero interaction with the rest of the backend. It works in PostgreSQL 8.4, although greatly handicapped by the lack of END blocks. http://search.cpan.org/perldoc?Devel::NYTProf::PgPLPerl > But in any case, I don't believe for a moment that profiling is the only > or even the largest use to which people would try to put this. Can you give any examples? Tim.
On Jan 27, 2010, at 1:27 PM, Tim Bunce wrote: > Okay. I could change the callback code to ignore calls if > proc_exit_inprogress is false. So an abnormal shutdown via exit() > wouldn't involve plperl at all. (Alternatively I could use use > on_proc_exit() instead of atexit() to register the callback.) Given Tom's hesitace about atexit(), perhaps that would be best. > Neither of those relate to the actions of perl source code. > To address that, instead of calling perl_destruct() to perform a > complete destruction I could just execute END blocks and object > destructors. That would avoid executing any system-level actions. Does perl_destruct() execute system-level actions, then? If so, then it seems prudent to either audit such actions or, asyou say, call destructors directly. > Do you have any examples of how a user could write code in a plperl END > block that would interact with the rest of the backend? I appreciate you taking the time to look at ways to address the issues Tom has raised, Tim. Good on you. There is so much benefit to this level of interaction, as shown by the success of mod_perl and other forking environmentsthat support pre-loading code, that I think it'd be extremely valuable to get these features in 9.0. They reallyallow Perl to be a first-class PL in a way that it wasn't before. So if there is no way other than SPI for Perl code to interact with the backend, and system-level actions in Perl itselfare disabled, it seems to me that the major issues are addressed. Am I wrong, Tom? Best, David
On Wed, Jan 27, 2010 at 4:51 PM, David E. Wheeler <david@kineticode.com> wrote: >> Neither of those relate to the actions of perl source code. >> To address that, instead of calling perl_destruct() to perform a >> complete destruction I could just execute END blocks and object >> destructors. That would avoid executing any system-level actions. > > Does perl_destruct() execute system-level actions, then? If so, then it seems prudent to either audit such actions or,as you say, call destructors directly. What exactly do we mean by "system-level actions"? I mean, END blocks can execute arbitrary code.... ...Robert
On Jan 27, 2010, at 1:53 PM, Robert Haas wrote: > What exactly do we mean by "system-level actions"? I mean, END blocks > can execute arbitrary code.... Yeah. In Perl. What part of Perl can access the backend systems without SPI? And that it couldn't do at any other point inruntime? Best, David
On Wed, Jan 27, 2010 at 01:51:47PM -0800, David E. Wheeler wrote: > On Jan 27, 2010, at 1:27 PM, Tim Bunce wrote: > > > Okay. I could change the callback code to ignore calls if > > proc_exit_inprogress is false. So an abnormal shutdown via exit() > > wouldn't involve plperl at all. (Alternatively I could use use > > on_proc_exit() instead of atexit() to register the callback.) > > Given Tom's hesitace about atexit(), perhaps that would be best. I've a draft patch using !proc_exit_inprogress implemented now (appended) and I'll test it tomorrow. That was the simplest to do first. Once I've a reasonable way of testing the exit() and proc_exit() code paths I'll try using on_proc_exit(). > > Neither of those relate to the actions of perl source code. > > To address that, instead of calling perl_destruct() to perform a > > complete destruction I could just execute END blocks and object > > destructors. That would avoid executing any system-level actions. > > Does perl_destruct() execute system-level actions, then? Potentially: Kills threads it knows about (should be none), wait for children (should be none), flushes all open *perl* file handles (should be none for plperl), tears down PerlIO layers, etc. etc. In practice none of that should affect the backend, but it's possible, especially for the Windows port. Since none of that is needed it can be skipped. > If so, then it seems prudent to either audit such actions or, as you > say, call destructors directly. The patch now just calls END blocks and DESTROY methods. > > Do you have any examples of how a user could write code in a plperl END > > block that would interact with the rest of the backend? > > I appreciate you taking the time to look at ways to address the issues > Tom has raised, Tim. Good on you. Thanks David. I appreciate the visible support! Tom and the team set the bar high, rightly, so it's certainly been a challenging introduction to PostgreSQL development! Tim. diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c index 8315d5a..38f2d35 100644 *** a/src/pl/plperl/plperl.c --- b/src/pl/plperl/plperl.c *************** *** 27,32 **** --- 27,33 ---- #include "miscadmin.h" #include "nodes/makefuncs.h" #include "parser/parse_type.h" + #include "storage/ipc.h" #include "utils/builtins.h" #include "utils/fmgroids.h" #include "utils/guc.h" *************** _PG_init(void) *** 281,287 **** static void plperl_fini(void) { ! plperl_ending = true; plperl_destroy_interp(&plperl_trusted_interp); plperl_destroy_interp(&plperl_untrusted_interp); plperl_destroy_interp(&plperl_held_interp); --- 282,297 ---- static void plperl_fini(void) { ! plperl_ending = true; /* disables use of spi_* functions */ ! ! /* ! * Only perform perl cleanup if we're exiting cleanly via proc_exit(). ! * If proc_exit_inprogress is false then exit() was called directly ! * (because we call atexit() very late, so get called early). ! */ ! if (!proc_exit_inprogress) ! return; ! plperl_destroy_interp(&plperl_trusted_interp); plperl_destroy_interp(&plperl_untrusted_interp); plperl_destroy_interp(&plperl_held_interp); *************** plperl_destroy_interp(PerlInterpreter ** *** 595,602 **** { if (interp && *interp) { ! perl_destruct(*interp); ! perl_free(*interp); *interp = NULL; } } --- 605,640 ---- { if (interp && *interp) { ! /* ! * Only a very minimal destruction is performed. ! * Just END blocks and object destructors, no system-level actions. ! * Code code here extracted from perl's perl_destruct(). ! */ ! ! /* Run END blocks */ ! if (PL_exit_flags & PERL_EXIT_DESTRUCT_END) { ! dJMPENV; ! int x = 0; ! ! JMPENV_PUSH(x); ! PERL_UNUSED_VAR(x); ! if (PL_endav && !PL_minus_c) ! call_list(PL_scopestack_ix, PL_endav); ! JMPENV_POP; ! } ! LEAVE; ! FREETMPS; ! ! PL_dirty = TRUE; ! ! /* destroy objects - call DESTROY methods */ ! if (PL_sv_objcount) { ! Perl_sv_clean_objs(aTHX); ! PL_sv_objcount = 0; ! if (PL_defoutgv && !SvREFCNT(PL_defoutgv)) ! PL_defoutgv = NULL; /* may have been freed */ ! } ! *interp = NULL; } }
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 1:53 PM, Robert Haas wrote: >> What exactly do we mean by "system-level actions"? I mean, END blocks >> can execute arbitrary code.... > Yeah. In Perl. What part of Perl can access the backend systems without SPI? And that it couldn't do at any other pointin runtime? You still aren't letting go of the notion that Perl could only affect the rest of the backend via SPI. The point I'm trying to impress on you is that there are any number of other possible pathways, and that Perl's historical assumption that it owns all resources of the process make those pathways a nontrivial hazard. Anything that Perl does to libc state, open file handles, etc etc carries a high risk of breaking the backend. Now it is certainly true that any such hazards can be created just from use of plperlu (we hope only plperlu, and not plperl ...) today, without any use of the proposed additional features. What is bothering me about these features is that their entire reason for existence is to encourage people to use parts of Perl that have time-extended effects on the process state. That means that (a) the probability of problems goes up substantially, and (b) our ability to fix such problems goes down substantially. Right now, the canonical approach to trying to undo anything bad Perl does is to save/restore process state around a plperl call. If we're trying to support usages in which Perl has time-extended effects on process state, that solution goes out the window, and we have to think of some other way to coexist with Perl. (Where, I note, "coexist" means "Perl does what it damn pleases and we have to pick up the pieces" --- we're not likely to get any cooperation on limiting damage from that side. Nobody even suggested that we treat stomping on setlocale state as a Perl bug, for example, rather than a fact of life that we just had to work around however we could.) So the real bottom line here is that I foresee this patch as being destabilizing and requiring us to put large amounts of time into figuring out workarounds for whatever creative things people decide to try to do with Perl. I'd feel better about it if I thought that we could get away with a policy of "if it breaks it's your problem", but I do not think that will fly from a PR standpoint. It hasn't in the past. regards, tom lane
On Jan 27, 2010, at 3:11 PM, Tom Lane wrote: > You still aren't letting go of the notion that Perl could only affect > the rest of the backend via SPI. The point I'm trying to impress on you > is that there are any number of other possible pathways, and that Perl's > historical assumption that it owns all resources of the process make > those pathways a nontrivial hazard. Anything that Perl does to libc > state, open file handles, etc etc carries a high risk of breaking the > backend. As could any other code that executes then, including C libraries installed from pgFoundry and loaded by a DBA. > Now it is certainly true that any such hazards can be created just from > use of plperlu (we hope only plperlu, and not plperl ...) today, > without any use of the proposed additional features. What is bothering > me about these features is that their entire reason for existence is to > encourage people to use parts of Perl that have time-extended effects on > the process state. Well, mainly it's to avoid the overhead of loading the code except at startup. > That means that (a) the probability of problems goes > up substantially, Why? Arbitrary code can already execute at start time. Is Perl special somehow? > and (b) our ability to fix such problems goes down > substantially. Why is it your problem? > Right now, the canonical approach to trying to undo > anything bad Perl does is to save/restore process state around a plperl > call. If we're trying to support usages in which Perl has time-extended > effects on process state, that solution goes out the window, and we have > to think of some other way to coexist with Perl. (Where, I note, > "coexist" means "Perl does what it damn pleases and we have to pick up > the pieces" --- we're not likely to get any cooperation on limiting > damage from that side. Nobody even suggested that we treat stomping on > setlocale state as a Perl bug, for example, rather than a fact of life > that we just had to work around however we could.) How is that different from any other code that gets loaded when the server starts, exactly? Do, however, feel free to report Perl bugs. Just run `perlbug`. > So the real bottom line here is that I foresee this patch as being > destabilizing and requiring us to put large amounts of time into > figuring out workarounds for whatever creative things people decide to > try to do with Perl. I'd feel better about it if I thought that we > could get away with a policy of "if it breaks it's your problem", but > I do not think that will fly from a PR standpoint. It hasn't in the > past. mod_perl has for many years. Provide lots of caveats in the documentation. Point users to it when they write in about a problem. Truth is, the vast majority of Perl modules are pretty well-behaved. I sincerely doubt you'd hear much complaint. Have theApache guys had to take any special steps to protect httpd from mod_perl? Best, David
Tim Bunce <Tim.Bunce@pobox.com> writes: > Okay. I could change the callback code to ignore calls if > proc_exit_inprogress is false. So an abnormal shutdown via exit() > wouldn't involve plperl at all. (Alternatively I could use use > on_proc_exit() instead of atexit() to register the callback.) Use on_proc_exit please. I will continue to object to any attempt to hang arbitrary processing on atexit(). An advantage of on_proc_exit from your end is that it should allow you to not have to try to prevent the END blocks from using SPI, as that would still be perfectly functional when your callback gets called. (Starting a new transaction would be a good idea though, cf Async_UnlistenOnExit.) regards, tom lane
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote: >> Really? We've found that gprof, for instance, doesn't exactly have >> "zero interaction with the rest of the backend" --- there's actually >> a couple of different bits in there to help it along, including a >> behavioral change during shutdown. I rather doubt that Perl profilers >> would turn out much different. > Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero > interaction with the rest of the backend. I don't have to read any further than the place where it says "doesn't work if you call both plperl and plperlu" to realize that that's quite false. Maybe we have different definitions of what a software interaction is... regards, tom lane
On Jan 27, 2010, at 3:33 PM, Tom Lane wrote: > I don't have to read any further than the place where it says "doesn't > work if you call both plperl and plperlu" to realize that that's quite > false. Maybe we have different definitions of what a software > interaction is... I think that dates from when plperl and plperlu couldn't co-exists, which was fixed a few months ago, n'est pas? Best, David
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 3:33 PM, Tom Lane wrote: >> I don't have to read any further than the place where it says "doesn't >> work if you call both plperl and plperlu" to realize that that's quite >> false. Maybe we have different definitions of what a software >> interaction is... > I think that dates from when plperl and plperlu couldn't co-exists, which was fixed a few months ago, n'est pas? No, that was fixed years ago, at least if you have a modern Perl build that supports multiplicity at all. regards, tom lane
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 3:11 PM, Tom Lane wrote: >> ... Anything that Perl does to libc >> state, open file handles, etc etc carries a high risk of breaking the >> backend. > As could any other code that executes then, including C libraries installed from pgFoundry and loaded by a DBA. Absolutely. The difference here is in who is going to be expected to try to deal with any problems. When somebody says "if I do this in plperlu, my database crashes! Postgres sux!" it's not going to help to say "that's a Perl bug", even if an independent observer might agree. It's going to be *our* problem, and I don't see any reason to expect a shred of help from the Perl side. regards, tom lane
On Jan 27, 2010, at 4:10 PM, Tom Lane wrote: > Absolutely. The difference here is in who is going to be expected to > try to deal with any problems. When somebody says "if I do this in > plperlu, my database crashes! Postgres sux!" it's not going to help to > say "that's a Perl bug", even if an independent observer might agree. > It's going to be *our* problem, and I don't see any reason to expect > a shred of help from the Perl side. Is that not the case with plperlu already? Best, David
Tom Lane wrote: > "David E. Wheeler" <david@kineticode.com> writes: > >> On Jan 27, 2010, at 3:33 PM, Tom Lane wrote: >> >>> I don't have to read any further than the place where it says "doesn't >>> work if you call both plperl and plperlu" to realize that that's quite >>> false. Maybe we have different definitions of what a software >>> interaction is... >>> > > >> I think that dates from when plperl and plperlu couldn't co-exists, which was fixed a few months ago, n'est pas? >> > > No, that was fixed years ago, at least if you have a modern Perl build > that supports multiplicity at all. > > To be strictly accurate, what we fixed years ago was that we used to run plperl and plperlu in the same interpreter, and that caused all sorts of information leaks, so we switched to running in two interpreters, which in turn became a problem for perl builds that didn't define multiplicity. The problem here is that NYTprof is apparently not multiplicity safe. I guess the question is what would happen if you tried to load it with both plperl and plperlu. In any case, it's a known and documented issue, so it's not one I'd be terribly worried about. cheers andrew
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 4:10 PM, Tom Lane wrote: >> Absolutely. The difference here is in who is going to be expected to >> try to deal with any problems. When somebody says "if I do this in >> plperlu, my database crashes! Postgres sux!" it's not going to help to >> say "that's a Perl bug", even if an independent observer might agree. >> It's going to be *our* problem, and I don't see any reason to expect >> a shred of help from the Perl side. > Is that not the case with plperlu already? Sure. Which is why I'm resisting expanding our exposure to it. regards, tom lane
On Jan 27, 2010, at 4:15 PM, Tom Lane wrote: >> Is that not the case with plperlu already? > > Sure. Which is why I'm resisting expanding our exposure to it I don't understand how it's expanding core's exposure to it. Best, David
"David E. Wheeler" <david@kineticode.com> writes: > On Jan 27, 2010, at 4:15 PM, Tom Lane wrote: >> Sure. Which is why I'm resisting expanding our exposure to it > I don't understand how it's expanding core's exposure to it. [ shrug...] I see little point in repeating myself yet again. It's obvious that the people who want this are entirely willing to adopt a Pollyanna-ishly optimistic view about its potential to cause serious problems that we may or may not be able to fix. I don't really expect to be able to prevent something along this line from getting committed --- I'm merely hoping to circumscribe it as much as possible and get large WARNING items into the manual's description. regards, tom lane
On Jan 27, 2010, at 4:33 PM, Tom Lane wrote: > [ shrug...] I see little point in repeating myself yet again. > It's obvious that the people who want this are entirely willing > to adopt a Pollyanna-ishly optimistic view about its potential > to cause serious problems that we may or may not be able to fix. Well, no. The problems you raise already exist in plperlu. And I would argue that they're worse there, as the DBA can giveothers permission to create PL/PerlU functions, and those users can do all kinds of crazy shit with them. on_perl_initcan be executed the DBA only. It's scope is far less. This is *safe* than PL/PerlU, while given more capabilityto PL/Perl. > I don't really expect to be able to prevent something along this line > from getting committed --- I'm merely hoping to circumscribe it as much > as possible and get large WARNING items into the manual's description. Oh, absolutely. Your sober attention to security issues is greatly appreciated by us fanboys. Best, David PS: I'm a PostgreSQL fanboy, not a Tom Lane fanboy. ;-P
Tom Lane wrote: > It's obvious that the people who want this are entirely willing > to adopt a Pollyanna-ishly optimistic view about its potential > to cause serious problems that we may or may not be able to fix. > > I don't really expect to be able to prevent something along this line > from getting committed --- I'm merely hoping to circumscribe it as much > as possible and get large WARNING items into the manual's description. > > Well, we seem to have got much closer to what you can live with on the END block issue, although we took a rather roundabout route to get there. Perhaps with a little more work we can achieve something similar on the on_perl_init front (which, in any case, I don't regard as being as important as the END blocks, although Tim might not agree.) cheers andrew
On Wed, Jan 27, 2010 at 06:33:19PM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote: > >> Really? We've found that gprof, for instance, doesn't exactly have > >> "zero interaction with the rest of the backend" --- there's actually > >> a couple of different bits in there to help it along, including a > >> behavioral change during shutdown. I rather doubt that Perl profilers > >> would turn out much different. > > > Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero > > interaction with the rest of the backend. > > I don't have to read any further than the place where it says "doesn't > work if you call both plperl and plperlu" to realize that that's quite > false. NYTProf is not, currently, multiplicity-safe. That's a limitation I intend to fix. > Maybe we have different definitions of what a software interaction is... Doing _anything_ in the backend is an interaction of some kind, e.g., shifting later memory allocations to a different address. But that's not a very practical basis for a definition. From what you said, quoted above, it seemed that your definition of "interaction with the rest of the backend" was more much more direct. The specific example you gave related to the backend code needing to be modified to support the gprof profiler. Clearly that's not the case for NYTProf. We're splitting hairs now. Tim.
On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > Okay. I could change the callback code to ignore calls if > > proc_exit_inprogress is false. So an abnormal shutdown via exit() > > wouldn't involve plperl at all. (Alternatively I could use use > > on_proc_exit() instead of atexit() to register the callback.) > > Use on_proc_exit please. I will continue to object to any attempt > to hang arbitrary processing on atexit(). Ok. > An advantage of on_proc_exit from your end is that it should allow > you to not have to try to prevent the END blocks from using SPI, > as that would still be perfectly functional when your callback > gets called. (Starting a new transaction would be a good idea > though, cf Async_UnlistenOnExit.) I'm surprised that you're suggesting that END block should be allowed to interact with the backend via SPI. It seems to go against what you've said previously about code running at shutdown. I've no use-case for that so I'm happy to leave it disabled. If someone does have a sane use-case, please let me know. It can always be enabled later. Tim.
Tim Bunce wrote: > I've no use-case for that so I'm happy to leave it disabled. If someone > does have a sane use-case, please let me know. It can always be enabled later. > > > As I noted upthread, there have been requests for user level session end handlers before. With SPI enabled as Tom suggests, this would just about buy us that for free. But if you're uncomfortable about it we can take that up at a later date. cheers andrew
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote: >> An advantage of on_proc_exit from your end is that it should allow >> you to not have to try to prevent the END blocks from using SPI, >> as that would still be perfectly functional when your callback >> gets called. (Starting a new transaction would be a good idea >> though, cf Async_UnlistenOnExit.) > I'm surprised that you're suggesting that END block should be allowed to > interact with the backend via SPI. It seems to go against what you've > said previously about code running at shutdown. I think you have completely misunderstood what I'm complaining about. What I'm not happy about is executing operations at a point where they're likely to be ill-defined because the code is in the wrong state. In an early on_proc_exit hook, the system is for all practical purposes still fully functional, and so I don't see a reason for an arbitrary restriction on what the END blocks should be able to do. (Or, to repeat myself in a different way: the no-SPI restriction is utterly useless to guard against my real concerns anyway. I see no point in it either here or elsewhere.) regards, tom lane
On Thu, Jan 28, 2010 at 10:39:33AM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote: > >> An advantage of on_proc_exit from your end is that it should allow > >> you to not have to try to prevent the END blocks from using SPI, > >> as that would still be perfectly functional when your callback > >> gets called. (Starting a new transaction would be a good idea > >> though, cf Async_UnlistenOnExit.) > > > I'm surprised that you're suggesting that END block should be allowed to > > interact with the backend via SPI. It seems to go against what you've > > said previously about code running at shutdown. > > I think you have completely misunderstood what I'm complaining about. > What I'm not happy about is executing operations at a point where > they're likely to be ill-defined because the code is in the wrong state. > In an early on_proc_exit hook, the system is for all practical purposes > still fully functional, and so I don't see a reason for an arbitrary > restriction on what the END blocks should be able to do. Ah, okay. I guess I missed your underlying concerns in: http://archives.postgresql.org/message-id/26766.1263149361@sss.pgh.pa.us For the record, [...] and I think it's a worseidea to run arbitrary user-defined code at backend shutdown (the END-blocks bit). > (Or, to repeat myself in a different way: the no-SPI restriction is > utterly useless to guard against my real concerns anyway. I see no > point in it either here or elsewhere.) I've left it in the updated patch I've just posted. There are two more plperl patches in the current commitfest that I'd like to chaperone through to commit (in some form or other) first. Thanks for your help Tom. Tim.