Thread: Feature patch 1 for plperl [PATCH]
I didn't get any significant feedback from the earlier draft so here's the finished 'feature patch 1' for plperl. (This builds on my earlier plperl refactoring patch, and the follow-on ppport.h patch.) Significant changes from the first draft: - New GUC plperl.on_perl_init='...perl...' for admin use. - New GUC plperl.on_trusted_init='...perl...' for plperl user use. - New GUC plperl.on_untrusted_init='...perl...' for plperlu user use. - END blocks now run at backend exit (fixes bug #5066). - Stored procedure subs are now given names ($name__$oid). - More error checking and reporting. - Warnings no longer have an extra newline in the NOTICE text. - Various minor optimizations like pre-growing data structures. Additional changes from the second draft: - SPI functions aren't available during plperl.on_*_init execution. - Added utility functions: quote_literal, quote_nullable, quote_ident, encode_bytea, decode_bytea, looks_like_number, encode_array_literal, encode_array_constructor. - Enabled plperl to "use"/"require" safely by redirecting the require opcode to code that dies if module not already loaded. - Corresponding changes to the documentation. Additional changes in this version: - Added the missing ', arguments' to docs of spi_exec_prepared(). - Added Util.c to list of files for plperl make clean to delete. I'll add this to the commitfest. Tim.
Attachment
On Jan 8, 2010, at 7:01 AM, Tim Bunce wrote: > I didn't get any significant feedback from the earlier draft so here's > the finished 'feature patch 1' for plperl. (This builds on my earlier > plperl refactoring patch, and the follow-on ppport.h patch.) > > Significant changes from the first draft: > - New GUC plperl.on_perl_init='...perl...' for admin use. > - New GUC plperl.on_trusted_init='...perl...' for plperl user use. > - New GUC plperl.on_untrusted_init='...perl...' for plperlu user use. > - END blocks now run at backend exit (fixes bug #5066). > - Stored procedure subs are now given names ($name__$oid). > - More error checking and reporting. > - Warnings no longer have an extra newline in the NOTICE text. > - Various minor optimizations like pre-growing data structures. > > Additional changes from the second draft: > - SPI functions aren't available during plperl.on_*_init execution. > - Added utility functions: quote_literal, quote_nullable, quote_ident, > encode_bytea, decode_bytea, looks_like_number, > encode_array_literal, encode_array_constructor. > - Enabled plperl to "use"/"require" safely by redirecting the require > opcode to code that dies if module not already loaded. > - Corresponding changes to the documentation. > > Additional changes in this version: > - Added the missing ', arguments' to docs of spi_exec_prepared(). > - Added Util.c to list of files for plperl make clean to delete. > > I'll add this to the commitfest. These changes all sound great to me, Tim, and if I can ever get my PL/Perl install working again, I'd be glad to find sometuits and review it. Best, David
On Fri, Jan 8, 2010 at 10:01 AM, Tim Bunce <Tim.Bunce@pobox.com> wrote: > I didn't get any significant feedback from the earlier draft so here's > the finished 'feature patch 1' for plperl. (This builds on my earlier > plperl refactoring patch, and the follow-on ppport.h patch.) > > Significant changes from the first draft: > - New GUC plperl.on_perl_init='...perl...' for admin use.ef > - New GUC plperl.on_trusted_init='...perl...' for plperl user use. > - New GUC plperl.on_untrusted_init='...perl...' for plperlu user use. I kind of thought Tom said these were a bad idea, and I think I kind of agree. We're not going to support multi-line values for GUCs AFAIK, so this is going to be pretty kludgy. What about making the value a comma-separated list of module names to use, or something? > - END blocks now run at backend exit (fixes bug #5066). > - Stored procedure subs are now given names ($name__$oid). > - More error checking and reporting. > - Warnings no longer have an extra newline in the NOTICE text. > - Various minor optimizations like pre-growing data structures. > > Additional changes from the second draft: > - SPI functions aren't available during plperl.on_*_init execution. > - Added utility functions: quote_literal, quote_nullable, quote_ident, > encode_bytea, decode_bytea, looks_like_number, > encode_array_literal, encode_array_constructor. > - Enabled plperl to "use"/"require" safely by redirecting the require > opcode to code that dies if module not already loaded. > - Corresponding changes to the documentation. > > Additional changes in this version: > - Added the missing ', arguments' to docs of spi_exec_prepared(). > - Added Util.c to list of files for plperl make clean to delete. > > I'll add this to the commitfest. The rest of this all seems pretty nice, though I haven't read the patch yet. ...Robert
On fre, 2010-01-08 at 15:01 +0000, Tim Bunce wrote: > I didn't get any significant feedback from the earlier draft so here's > the finished 'feature patch 1' for plperl. I think it would help if you could split this up into about 6 to 10 single-feature patches.
On Fri, Jan 08, 2010 at 10:36:43PM -0500, Robert Haas wrote: > On Fri, Jan 8, 2010 at 10:01 AM, Tim Bunce <Tim.Bunce@pobox.com> wrote: > > I didn't get any significant feedback from the earlier draft so here's > > the finished 'feature patch 1' for plperl. (This builds on my earlier > > plperl refactoring patch, and the follow-on ppport.h patch.) > > > > Significant changes from the first draft: > > - New GUC plperl.on_perl_init='...perl...' for admin use. > > - New GUC plperl.on_trusted_init='...perl...' for plperl user use. > > - New GUC plperl.on_untrusted_init='...perl...' for plperlu user use. > > I kind of thought Tom said these were a bad idea, and I think I kind > of agree. Tom had some concerns which I believe I've addressed. I'd be happy to hear from Tom if he has any remaining concerns. > We're not going to support multi-line values for GUCs > AFAIK, so this is going to be pretty kludgy. I'm not sure what you mean by "this". Typical use-cases would be:plperl.on_perl_init='use MyStuff;'plperl.on_trusted_init='$some_global=42'; > What about making the value a comma-separated list of module names to > use, or something? That would force people who just want to set some global variable to write a module. That seems overly painful for no significant gain. The fact that multi-line values for GUCs aren't supported will naturally enourage anyone wanting to execute many statements to write a module for them. That sems like a perfectly reasonable balance. > > [...] > > The rest of this all seems pretty nice, though I haven't read the patch yet. Thanks Robert. I look forward to your feedback if you do get a chance to read it. Tim.
Peter Eisentraut wrote: > On fre, 2010-01-08 at 15:01 +0000, Tim Bunce wrote: > >> I didn't get any significant feedback from the earlier draft so here's >> the finished 'feature patch 1' for plperl. >> > > I think it would help if you could split this up into about 6 to 10 > single-feature patches. > I think that's a bit excessive. I'd suggest three patches: * the new utility functions (quote_literal, decode_bytea etc.) These should be fairly uncontroversial, but accountfor a large part of the patch volume. * the code relating to library load, interpreter initialization and termination * the remainder (function naming, better error checking, enabling use/require if a lib is already loaded,cleanup and optimization) We could ask Tim to break up the last, but they are all fairly small items, and I at least wouldn't be bothered by having them combined. And having to handle 6 to 10 patches all hitting the same body of code doesn't sound terrible pleasant either. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Peter Eisentraut wrote: >> I think it would help if you could split this up into about 6 to 10 >> single-feature patches. > ... having to handle 6 to 10 patches all hitting the same body of code > doesn't sound terrible pleasant either. Indeed. That sounds like rather a mess. regards, tom lane
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Fri, Jan 08, 2010 at 10:36:43PM -0500, Robert Haas wrote: >> I kind of thought Tom said these were a bad idea, and I think I kind >> of agree. > Tom had some concerns which I believe I've addressed. You haven't addressed them, you've simply ignored them. For the record, I think it's a bad idea to run arbitrary user-defined code in the postmaster, and I think it's a worse idea to run arbitrary user-defined code at backend shutdown (the END-blocks bit). I do not care in the least what applications you think this might enable --- the negative consequences for overall system stability seem to me to outweigh any possible arguments on that side. What happens when the supplied code has errors, takes an unreasonable amount of time to run, does something unsafe, depends on the backend not being in an error state already, etc etc? I do not have a veto over stuff like this, but if I did, it would not go in. >> We're not going to support multi-line values for GUCs >> AFAIK, so this is going to be pretty kludgy. > I'm not sure what you mean by "this". What he means by "this" is defining GUCs in a way that would make people want to use multi-line values for them. However, that doesn't have anything to do with my worries ... regards, tom lane
On Sun, Jan 10, 2010 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: >> On Fri, Jan 08, 2010 at 10:36:43PM -0500, Robert Haas wrote: >>> I kind of thought Tom said these were a bad idea, and I think I kind >>> of agree. > >> Tom had some concerns which I believe I've addressed. > > You haven't addressed them, you've simply ignored them. That's not *completely* true. http://archives.postgresql.org/pgsql-hackers/2009-12/msg00432.php > For the record, > I think it's a bad idea to run arbitrary user-defined code in the > postmaster, and I think it's a worse idea to run arbitrary user-defined > code at backend shutdown (the END-blocks bit). I do not care in the > least what applications you think this might enable --- the negative > consequences for overall system stability seem to me to outweigh any > possible arguments on that side. What happens when the supplied code > has errors, takes an unreasonable amount of time to run, does something > unsafe, depends on the backend not being in an error state already, etc > etc? Same thing that happens when you put something stupid into shared_preload_libraries - you destabilize your database cluster and the world blows up. > I do not have a veto over stuff like this, but if I did, it would > not go in. I'm not as strongly opposed to this as you are, but I definitely think there will be some people who shoot themselves in the foot with it. I don't think it's necessarily more dangerous than shared_preload_libraries from a theoretical standpoint, but the sheer fact that this is Perl rather than C means more people will try to do it, and some of them will manage to take out the whole database cluster, which will not be awesome. I think this is a real weakness of our one-process-per-connection model. If it were possible to recycle backends for new connections, there would be no need even to consider doing things like this. Yeah, I know you don't want to do that either, just mentioning it... >>> We're not going to support multi-line values for GUCs >>> AFAIK, so this is going to be pretty kludgy. > >> I'm not sure what you mean by "this". > > What he means by "this" is defining GUCs in a way that would make people > want to use multi-line values for them. However, that doesn't have > anything to do with my worries ... Well, you did mention it previously. http://archives.postgresql.org/pgsql-hackers/2009-12/msg00384.php Anyway, I think you've put this pretty well here: the current definition will make people WANT to use multi-line values for this, and we don't support that. I think Tim's example is fairly contrived - setting a global variable here does not seem likely to be useful to very many users, and the ones who do want it will likely want also want multi-line values. What IS likely to be useful is preloading a bunch of perl modules so that backend startup doesn't take an unreasonably long time. It's nicer to write: plperl.on_perl_init='strict,warnings,LDAP,HTML::Parser,Archive::Zip' rather than: plperl.on_perl_init='use strict;use warnings;use LDAP;use HTML::Parser;use Archive::Zip;' I would strongly suggest to Tim that he rip the portions of this patch that are related to this feature out and submit them separately so that we can commit the uncontroversial portions first. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jan 10, 2010 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> What happens when the supplied code >> has errors, takes an unreasonable amount of time to run, does something >> unsafe, depends on the backend not being in an error state already, etc >> etc? > Same thing that happens when you put something stupid into > shared_preload_libraries - you destabilize your database cluster and > the world blows up. shared_preload_libraries is under the sole control of the DBA, though. What distresses me about this is the exposure to ordinary users. In particular, that casual little "atexit" addition appears to mean that *unprivileged* users can break normal functioning of the database, eg by delaying or even preventing shutdown. That's a security hole of the first magnitude. Trying to execute SPI code there could make things even more fun. I also still don't care for the whole concept of on_init code from a theoretical standpoint: as I said before, loading of the plperl shared library should not be a semantically significant event from the user's viewpoint. If these GUCs were PGC_POSTMASTER it wouldn't be so bad, but Tim still seems to want them to be settable inside an individual session before the library is loaded, which makes the loading extremely visible. As an example, if people were using such functionality then the DBA couldn't start preloading plperl for performance without breaking behavior that some of his users might be depending on. regards, tom lane
Robert Haas wrote: > Anyway, I think you've put this pretty well here: the current > definition will make people WANT to use multi-line values for this, > and we don't support that. I think Tim's example is fairly contrived > - setting a global variable here does not seem likely to be useful to > very many users, and the ones who do want it will likely want also > want multi-line values. What IS likely to be useful is preloading a > bunch of perl modules so that backend startup doesn't take an > unreasonably long time. It's nicer to write: > > plperl.on_perl_init='strict,warnings,LDAP,HTML::Parser,Archive::Zip' > > rather than: > > plperl.on_perl_init='use strict;use warnings;use LDAP;use > HTML::Parser;use Archive::Zip;' > I don't know why you would do either of these things. I at least would load one module which would in turn load others. So I'd expect to see something like this: plperl.on_perl_init = 'use lib "/my/app"; use MyApp::Pg;' I think the suggestion that somehow people will want to put a huge list of directives straight into postgresql.conf and that this is a reason not to provide this facility is on the wrong track completely. > I would strongly suggest to Tim that he rip the portions of this patch > that are related to this feature out and submit them separately so > that we can commit the uncontroversial portions first. > > > See my previous email. I suggested that Tim send three patches: one for this controversial stuff, one for the new utility functions for plperl, and one for the remainder. He and I have discussed it and I believe he is agreeable to that. cheers andrew
On Sun, Jan 10, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Jan 10, 2010 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> What happens when the supplied code >>> has errors, takes an unreasonable amount of time to run, does something >>> unsafe, depends on the backend not being in an error state already, etc >>> etc? > >> Same thing that happens when you put something stupid into >> shared_preload_libraries - you destabilize your database cluster and >> the world blows up. > > shared_preload_libraries is under the sole control of the DBA, though. > What distresses me about this is the exposure to ordinary users. > In particular, that casual little "atexit" addition appears to mean > that *unprivileged* users can break normal functioning of the database, > eg by delaying or even preventing shutdown. That's a security hole of > the first magnitude. Trying to execute SPI code there could make things > even more fun. That's really a separate issue from the on_perl_init stuff, but now that you mention it it does sound like a serious problem. Preventing SPI from being executed there is probably feasible, but I don't like the idea that broken code would cause the database to fail to shut down, and that's probably not fixable (unless maybe we detach shared memory before executing this code, or something?). > I also still don't care for the whole concept of on_init code from a > theoretical standpoint: as I said before, loading of the plperl shared > library should not be a semantically significant event from the user's > viewpoint. If these GUCs were PGC_POSTMASTER it wouldn't be so bad, but > Tim still seems to want them to be settable inside an individual session > before the library is loaded, which makes the loading extremely visible. > As an example, if people were using such functionality then the DBA > couldn't start preloading plperl for performance without breaking > behavior that some of his users might be depending on. Hmm. OK, I agree: that's a problem. ...Robert
Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > >> On Sun, Jan 10, 2010 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >>> What happens when the supplied code >>> has errors, takes an unreasonable amount of time to run, does something >>> unsafe, depends on the backend not being in an error state already, etc >>> etc? >>> > > >> Same thing that happens when you put something stupid into >> shared_preload_libraries - you destabilize your database cluster and >> the world blows up. >> > > shared_preload_libraries is under the sole control of the DBA, though. > What distresses me about this is the exposure to ordinary users. > In particular, that casual little "atexit" addition appears to mean > that *unprivileged* users can break normal functioning of the database, > eg by delaying or even preventing shutdown. That's a security hole of > the first magnitude. Trying to execute SPI code there could make things > even more fun. > I suspect that could be inhibited at least. > I also still don't care for the whole concept of on_init code from a > theoretical standpoint: as I said before, loading of the plperl shared > library should not be a semantically significant event from the user's > viewpoint. If these GUCs were PGC_POSTMASTER it wouldn't be so bad, but > Tim still seems to want them to be settable inside an individual session > before the library is loaded, which makes the loading extremely visible. > As an example, if people were using such functionality then the DBA > couldn't start preloading plperl for performance without breaking > behavior that some of his users might be depending on. > > > Well, I think the proposed plperl.on_perl_init setting could be PGC_POSTMASTER, as I previously commented. The other settings are intended to be used on interpreter start, AIUI. Maybe we need to explore the use cases more. If we made plperl.on_perl_init PGC_POSTMASTER and the other two PGC_SUSET would that alloy your concerns? cheers andrew
On Sun, Jan 10, 2010 at 2:58 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > I don't know why you would do either of these things. I at least would load > one module which would in turn load others. So I'd expect to see something > like this: > > plperl.on_perl_init = 'use lib "/my/app"; use MyApp::Pg;' > > I think the suggestion that somehow people will want to put a huge list of > directives straight into postgresql.conf and that this is a reason not to > provide this facility is on the wrong track completely. Hmm. I have to admit I didn't think about "use lib". That does seem like a plausible thing to want to do. >> I would strongly suggest to Tim that he rip the portions of this patch >> that are related to this feature out and submit them separately so >> that we can commit the uncontroversial portions first. > > See my previous email. I suggested that Tim send three patches: one for this > controversial stuff, one for the new utility functions for plperl, and one > for the remainder. He and I have discussed it and I believe he is agreeable > to that. OK, well then just +1 for that. ...Robert
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> As an example, if people were using such functionality then the DBA >> couldn't start preloading plperl for performance without breaking >> behavior that some of his users might be depending on. > If we made plperl.on_perl_init PGC_POSTMASTER and the other two > PGC_SUSET would that alloy your concerns? No, they have to all be PGC_POSTMASTER to answer that concern. Only breaking things for superusers isn't really that big an improvement over breaking them for everybody. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tom Lane wrote: >> >>> As an example, if people were using such functionality then the DBA >>> couldn't start preloading plperl for performance without breaking >>> behavior that some of his users might be depending on. >>> > > >> If we made plperl.on_perl_init PGC_POSTMASTER and the other two >> PGC_SUSET would that alloy your concerns? >> > > No, they have to all be PGC_POSTMASTER to answer that concern. Only > breaking things for superusers isn't really that big an improvement > over breaking them for everybody. > > > Well, I don't know about Tim but I think I could live with that. And when we get some actual experience with using them we'll have a better handle on whether or not it gives us any pain. cheers andrew
On Jan 10, 2010, at 11:17 AM, Robert Haas wrote: > It's nicer to write: > > plperl.on_perl_init='strict,warnings,LDAP,HTML::Parser,Archive::Zip' > > rather than: > > plperl.on_perl_init='use strict;use warnings;use LDAP;use > HTML::Parser;use Archive::Zip;' Well, no, because sometimes I just want to load something and not have functions exported (into whatever namespaces endsup calling this). So I might have something like: plplerl.on_perl_init='use HTML::Entities ();' Other times I might want those functions exported. FWIW, Bricolage has a feature like this, and you can only put stuff on one line. It's been there since 2002 or so. No onehas ever complained about it; I doubt anyone would complain about this, either. Best, David
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> No, they have to all be PGC_POSTMASTER to answer that concern. Only >> breaking things for superusers isn't really that big an improvement >> over breaking them for everybody. > Well, I don't know about Tim but I think I could live with that. And > when we get some actual experience with using them we'll have a better > handle on whether or not it gives us any pain. Upon further review, PGC_POSTMASTER isn't going to work because of this in guc.c: /* * Only allow custom PGC_POSTMASTER variables to be created during shared * library preload; any later than that,we can't ensure that the value * doesn't change after startup. This is a fatal elog if it happens; just * erroringout isn't safe because we don't know what the calling loadable * module might already have hooked into. */ if (context == PGC_POSTMASTER && !process_shared_preload_libraries_in_progress) elog(FATAL, "cannot createPGC_POSTMASTER variables after startup"); We certainly don't want to make it such that plperl *has* to be preloaded, so PGC_POSTMASTER is out. However, I think PGC_SIGHUP would be enough to address my basic worry, which is that people shouldn't be depending on the ability to set these things within an individual session. regards, tom lane
Tim Bunce <Tim.Bunce@pobox.com> writes: > I didn't get any significant feedback from the earlier draft so here's > the finished 'feature patch 1' for plperl. (This builds on my earlier > plperl refactoring patch, and the follow-on ppport.h patch.) Just looking over this patch, I don't think it's nearly robust enough against initialization failures. The original code wasn't very good about that either, but that was (more or less) okay because it was executing predetermined, pretested code that we really don't expect to fail. I think the standard has to be a *lot* higher if we are going to execute user-supplied perl code there. You need to make sure that things are left in a reasonably consistent, safe state if an error occurs. Along the same line, there needs to be more effort put into the errors that can be thrown when one of these failures happen. The current messages don't follow our style guidelines very well, and aren't exposed for translation. regards, tom lane
On Sun, Jan 10, 2010 at 4:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> I would strongly suggest to Tim that he rip the portions of this patch >>> that are related to this feature out and submit them separately so >>> that we can commit the uncontroversial portions first. >> >> See my previous email. I suggested that Tim send three patches: one for this >> controversial stuff, one for the new utility functions for plperl, and one >> for the remainder. He and I have discussed it and I believe he is agreeable >> to that. > > OK, well then just +1 for that. I believe we have agreement on this course of action, so I'm going to mark the current patch as Returned with Feedback. Hopefully Tim will submit separate patches for each of these three areas in the next day or two before start-of-CommitFest - personally, I think they should each get their own thread and their own entry in the CommitFest app, for ease of tracking and reviewing. YMMV, of course. ...Robert
On Tue, Jan 12, 2010 at 07:06:59PM -0500, Robert Haas wrote: > On Sun, Jan 10, 2010 at 4:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >>> I would strongly suggest to Tim that he rip the portions of this patch > >>> that are related to this feature out and submit them separately so > >>> that we can commit the uncontroversial portions first. > >> > >> See my previous email. I suggested that Tim send three patches: one for this > >> controversial stuff, one for the new utility functions for plperl, and one > >> for the remainder. He and I have discussed it and I believe he is agreeable > >> to that. > > > > OK, well then just +1 for that. > > I believe we have agreement on this course of action, so I'm going to > mark the current patch as Returned with Feedback. Hopefully Tim will > submit separate patches for each of these three areas in the next day > or two before start-of-CommitFest That's my plan. Plus, hopefully at least one more for inter-sp calling. > personally, I think they should > each get their own thread and their own entry in the CommitFest app, > for ease of tracking and reviewing. YMMV, of course. Yes, that was also my intent. Tim.