Thread: contrib function naming, and upgrade issues
Note that I'm talking here about the names of the C functions, not the SQL names. The existing hstore has some very dubious choices of function names (for non-static functions) in the C code; functions like each(), delete(), fetchval(), defined(), tconvert(), etc. which all look to me like prime candidates for name collisions and consequent hilarity. The patch I'm working on could include fixes for this; but there's an obvious impact on anyone upgrading from an earlier version... is it worth it? -- Andrew (irc:RhodiumToad)
On Fri, Mar 20, 2009 at 9:57 PM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > Note that I'm talking here about the names of the C functions, not > the SQL names. > > The existing hstore has some very dubious choices of function names > (for non-static functions) in the C code; functions like each(), > delete(), fetchval(), defined(), tconvert(), etc. which all look to me > like prime candidates for name collisions and consequent hilarity. > > The patch I'm working on could include fixes for this; but there's an > obvious impact on anyone upgrading from an earlier version... is it > worth it? Based on that description, +1 from me. That kind of hilarity can be a huge time sink when debugging, and it makes it hard to use grep to find all references to a particular function (or #define, typedef, etc.). ...Robert
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Note that I'm talking here about the names of the C functions, not > the SQL names. > The existing hstore has some very dubious choices of function names > (for non-static functions) in the C code; functions like each(), > delete(), fetchval(), defined(), tconvert(), etc. which all look to me > like prime candidates for name collisions and consequent hilarity. > The patch I'm working on could include fixes for this; but there's an > obvious impact on anyone upgrading from an earlier version... is it > worth it? I agree that this wasn't an amazingly good choice, but I think there's no real risk of name collisions because fmgr only searches for such names within the particular .so. As you say, renaming *will* break existing dumps. I'd be inclined to leave it alone, at least for now. I hope that someone will step up and implement a decent module system for us sometime soon, which might fix the upgrade problem for changes of this sort. regards, tom lane
On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote: > Note that I'm talking here about the names of the C functions, not > the SQL names. > > The existing hstore has some very dubious choices of function names > (for non-static functions) in the C code; functions like each(), > delete(), fetchval(), defined(), tconvert(), etc. which all look to me > like prime candidates for name collisions and consequent hilarity. > > The patch I'm working on could include fixes for this; but there's an > obvious impact on anyone upgrading from an earlier version... is it > worth it? Perhaps you can have two sets of functions, yet just one .so? One with the old naming for compatibility, and a set of dehilarified function names for future use. Two .sql files, giving the user choice. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
>>>>> "Simon" == Simon Riggs <simon@2ndQuadrant.com> writes: > On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote: >> Note that I'm talking here about the names of the C functions, not >> the SQL names. >> >> The existing hstore has some very dubious choices of function names >> (for non-static functions) in the C code; functions like each(), >> delete(), fetchval(), defined(), tconvert(), etc. which all look to me >> like prime candidates for name collisions and consequent hilarity. >> >> The patch I'm working on could include fixes for this; but there's an >> obvious impact on anyone upgrading from an earlier version... is it >> worth it? Simon> Perhaps you can have two sets of functions, yet just one .so? Simon> One with the old naming for compatibility, and a set of Simon> dehilarified function names for future use. Two .sql files, Simon> giving the user choice. Two .sql files would be pointless. Remember we're talking about the C function names, not the SQL names; the only time the user should notice the difference is when restoring an old dump. As I see it there are three options: 1) do nothing; keep the existing C function names. dump/restore from older versions will still work, but new functionality won't be available without messing with the SQL. 2) hard cutover; rename all the dubious C functions. dump/restore from older versions will get lots of errors, for which the workaround will be "install the new hstore.sql into the database before trying to restore". 3) some sort of compatibility hack involving optionally duplicating the names in the C module. -- Andrew.
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> I agree that this wasn't an amazingly good choice, but I think Tom> there's no real risk of name collisions because fmgr only Tom> searches for such names within the particular .so. Oh, if only life were so simple. Consider two modules mod1 (source files mod1a.c and mod1b.c) and mod2 (source files mod2a.c and mod2b.c). mod1a.c: contains sql-callable function foo() which calls an extern function bar() defined in mod1b.c. mod1a.o and mod1b.o are linked to make mod1.so. mod2a.c: contains sql-callable function baz() which calls an extern function bar() defined in mod2b.c. These are linked to make mod2.so. Guess what happens when foo() and baz() are both called from within the same backend.... (Perhaps we should be linking contrib and pgxs modules with -Bsymbolic on those platforms where it matters?) -- Andrew.
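The collision Andrew describes can be sketched in a few lines of C. The names here (foo, bar, mod1a.c, mod1b.c) are the hypothetical ones from his example, not real hstore code, and the two translation units are merged into one file for brevity:

```c
/* Sketch of the mod1 half of the scenario, with mod1a.c and mod1b.c
 * merged into one file.  All names are the hypothetical ones from the
 * example above, not real hstore code. */

/* mod1b.c: helper with external linkage -- its name ends up in
 * mod1.so's dynamic symbol table. */
int
bar(void)
{
    return 1;
}

/* mod1a.c: the SQL-callable entry point.  The call to bar() is
 * resolved by the dynamic linker at load time; if mod2.so was loaded
 * first with RTLD_GLOBAL and also exports a bar(), foo() can silently
 * bind to *that* bar() instead.  Making bar() static, or linking
 * mod1.so with -Bsymbolic, avoids the interposition. */
int
foo(void)
{
    return bar();
}
```

Within a single executable, of course, foo() calls the bar() defined next to it; the interposition hazard only appears once the two modules are separate shared objects loaded into the same process.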
On Sat, Mar 21, 2009 at 01:05:35PM +0000, Andrew Gierth wrote: > (Perhaps we should be linking contrib and pgxs modules with -Bsymbolic > on those platforms where it matters?) Another possibility is to use the visibility attributes such as those provided in GCC. Maybe the version-1 declaration of a function could add the appropriate magic to set the visibility to public and alter PGXS to set the default visibility to hidden. Voila, modules whose only exported symbols are those declared with a version-1 declaration. Perhaps a little too much magic :) Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
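A rough sketch of what Martijn is suggesting. The PG_MODULE_EXPORT macro and the function names below are invented for illustration (the real magic would presumably hang off the version-1 declaration macro), and the interesting effect only shows up when the file is compiled into a .so with -fvisibility=hidden:

```c
/* Sketch of the visibility idea, assuming the module is built with
 * -fvisibility=hidden so nothing is exported by default.  The macro
 * name PG_MODULE_EXPORT is made up for this example. */
#if defined(__GNUC__) && __GNUC__ >= 4
#define PG_MODULE_EXPORT __attribute__((visibility("default")))
#else
#define PG_MODULE_EXPORT
#endif

/* Exported: fmgr can still find this via dlsym(). */
PG_MODULE_EXPORT int
hstore_each(void)
{
    return 42;
}

/* Not marked: with -fvisibility=hidden this stays module-private, so
 * it cannot collide with a same-named function in another .so. */
int
internal_helper(void)
{
    return 7;
}
```

Compiled as an ordinary executable the attributes are inert; the point is that under -fvisibility=hidden only the marked symbol appears in the .so's exported symbol table.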
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> I agree that this wasn't an amazingly good choice, but I think > Tom> there's no real risk of name collisions because fmgr only > Tom> searches for such names within the particular .so. > Oh, if only life were so simple. I think you are missing the point. There are certainly *potential* problems from common function names in different .so's, but that does not translate to evidence of *actual* problems in the Postgres environment. In particular, I believe that we load .so's without adding their symbols to those globally known by the linker --- at least on platforms where that's possible. Not to mention that the universe of other .so's we might load is not all that large. So I think the actual risks posed by contrib/hstore are somewhere between minimal and nonexistent. The past discussions we've had about developing a proper module facility included ways to replace not-quite-compatible C functions. I think that we can afford to let hstore go on as it is for another release or two, in hopes that we'll have something that makes a fix for this transparent to users. The risks don't look to me to be large enough to justify imposing any upgrade pain on users. regards, tom lane
On Saturday 21 March 2009 12:27:27 Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > Tom> I agree that this wasn't an amazingly good choice, but I think > > Tom> there's no real risk of name collisions because fmgr only > > Tom> searches for such names within the particular .so. > > > > Oh, if only life were so simple. > > I think you are missing the point. There are certainly *potential* > problems from common function names in different .so's, but that does not > translate to evidence of *actual* problems in the Postgres environment. > In particular, I believe that we load .so's without adding their symbols > to those globally known by the linker --- at least on platforms where > that's possible. Not to mention that the universe of other .so's we > might load is not all that large. So I think the actual risks posed by > contrib/hstore are somewhere between minimal and nonexistent. > > The past discussions we've had about developing a proper module facility > included ways to replace not-quite-compatible C functions. I think that > we can afford to let hstore go on as it is for another release or two, > in hopes that we'll have something that makes a fix for this transparent > to users. The risks don't look to me to be large enough to justify > imposing any upgrade pain on users. > We've been talking about this magical "proper module facility" for a few releases now... are we still opposed to putting contrib modules in their own schema? People who took my advice and did that for tsearch were mighty happy when 8.2 broke at the C level, and when 8.3 broke all around. Doing that for hstore now would make the transition a little easier in the future as well. -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com
Robert Treat <xzilla@users.sourceforge.net> writes: > We've been talking about this magical "proper module facility" for a few > releases now... are we still opposed to putting contrib modules in their own > schema? I'm hesitant to do that when we don't yet have either a design or a migration plan for the module facility. We might find we'd shot ourselves in the foot, or at least complicated the migration situation unduly. regards, tom lane
On Sat, Mar 21, 2009 at 9:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Treat <xzilla@users.sourceforge.net> writes: >> We've been talking about this magical "proper module facility" for a few >> releases now... are we still opposed to putting contrib modules in their own >> schema? > > I'm hesitant to do that when we don't yet have either a design or a > migration plan for the module facility. We might find we'd shot > ourselves in the foot, or at least complicated the migration situation > unduly. I think there have been a few designs proposed, but I think part of the problem is a lack of agreement on the requirements. "module facility" seems to mean a lot of different things to different people. ...Robert
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> I agree that this wasn't an amazingly good choice, but I think Tom> there's no real risk of name collisions because fmgr only Tom> searches for such names within the particular .so. >> Oh, if only life were so simple. Tom> I think you are missing the point. Nope. Tom> There are certainly *potential* problems from common function Tom> names in different .so's, but that does not translate to Tom> evidence of *actual* problems in the Postgres environment. It is true that I have no reason to believe that anyone has ever encountered any problems due to name collisions between hstore and something else. The only question is how to trade off the potential risks against the known difficulties regarding upgrading; I'm quite happy to accept the conclusion that the risk is not sufficient to justify the upgrade pain, but only if the risk is being correctly assessed. Tom> In particular, I believe that we load .so's without adding their Tom> symbols to those globally known by the linker --- at least on Tom> platforms where that's possible. This is false; in the exact reverse of the above, we explicitly request RTLD_GLOBAL on platforms where it exists. Tom> Not to mention that the universe of other .so's we might load is Tom> not all that large. So I think the actual risks posed by Tom> contrib/hstore are somewhere between minimal and nonexistent. The problem extends not only to other loaded .so's, but also to every library linked into the postmaster itself, every library linked into another loaded .so, and every .so (and associated libs) dynamically loaded by another .so (e.g. modules loaded by PLs). (-Bsymbolic (or equivalent) would negate all of these, as far as I can tell.) Tom> The risks don't look to me to be large enough to justify Tom> imposing any upgrade pain on users. OK. I will maintain binary compatibility in my patch. -- Andrew.
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Robert Treat <xzilla@users.sourceforge.net> writes: >> We've been talking about this magical "proper module facility" for >> a few releases now... are we still opposed to putting contrib >> modules in their own schema? Tom> I'm hesitant to do that when we don't yet have either a design Tom> or a migration plan for the module facility. We might find we'd Tom> shot ourselves in the foot, or at least complicated the Tom> migration situation unduly. I've been thinking about this, and my conclusion is that schemas as they currently exist are the wrong tool for making modules/packages. Partly that's based on the relative inflexibility of the search_path setting; it's hard to modify the search_path without completely replacing it, so knowledge of the "default" search path ends up being propagated to a lot of places. There's a parallel here with operating-system package mechanisms; for the most part, the more usable / successful packaging systems don't rely on putting everything in separate directories, instead they have an out-of-band method for specifying what files belong to what package. We already have a mechanism we could use for this: pg_depend. If an "installed package" was a type of object, the functions, types, operators, or any other kind of object installed by the package could have dependency links to it; that would (a) make it trivial to drop, and (b) let pg_dump check for package dependencies and, for objects depending on a package, emit only a package installation command rather than the object definition. (I distinguish an "installed package" from whatever the package definition might be, since it's possible that a package might want to provide multiple APIs, for example for different versions, and these might be installed simultaneously in different schemas.) -- Andrew.
>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: >> I'm hesitant to do that when we don't yet have either a design or >> a migration plan for the module facility. We might find we'd shot >> ourselves in the foot, or at least complicated the migration >> situation unduly. Robert> I think there have been a few designs proposed, but I think Robert> part of the problem is a lack of agreement on the Robert> requirements. "module facility" seems to mean a lot of Robert> different things to different people. Some ideas: - want to be able to do INSTALL PACKAGE foo; without needing to mess with .sql files. This might default to looking for $libdir/foo.so, or there might be a mechanism to register packages globally or locally. - want to be able to do INSTALL PACKAGE foo VERSION 1; and get the version 1 API rather than whatever the latest is. - want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather than having to edit some .sql file. - want to be able to do DROP PACKAGE foo; - want pg_dump to not output the definitions of any objects that belong to a package, but instead to output an INSTALL PACKAGE foo VERSION n SCHEMA x; -- Andrew.
On Sun, Mar 22, 2009 at 11:54 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > - want to be able to do INSTALL PACKAGE foo; without needing to > mess with .sql files. This might default to looking for > $libdir/foo.so, or there might be a mechanism to register packages > globally or locally. > > - want to be able to do INSTALL PACKAGE foo VERSION 1; and get > the version 1 API rather than whatever the latest is. > > - want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather > than having to edit some .sql file. > > - want to be able to do DROP PACKAGE foo; > > - want pg_dump to not output the definitions of any objects that > belong to a package, but instead to output an INSTALL PACKAGE foo > VERSION n SCHEMA x; I think using PACKAGE is a bad idea as it'll confuse people used to Oracle. MODULE perhaps? -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Andrew Gierth wrote: > I've been thinking about this, and my conclusion is that schemas as > they currently exist are the wrong tool for making modules/packages. This has been discussed at length previously, and we even had an incomplete but substantive patch posted. Did you review that? Some of it appears to be in line with what you're proposing here. If you're interested in this area, perhaps you could pick up where Tom Dunstan left off. See URLs here: http://wiki.postgresql.org/wiki/Todo#Source_Code under "Improve the module installation experience (/contrib, etc)" -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Page wrote: > > I think using PACKAGE is a bad idea as it'll confuse people used to > Oracle. MODULE perhaps? > > Right. We debated this extensively in the past. Module was the consensus name. cheers andrew
Hi, On 22 Mar 2009, at 12:42, Andrew Gierth wrote: > Tom> I'm hesitant to do that when we don't yet have either a design > Tom> or a migration plan for the module facility. We might find we'd > Tom> shot ourselves in the foot, or at least complicated the > Tom> migration situation unduly. > > I've been thinking about this, and my conclusion is that schemas as > they currently exist are the wrong tool for making modules/packages. Agreed. Still, schemas are useful and using them should be encouraged, I think. > Partly that's based on the relative inflexibility of the search_path > setting; it's hard to modify the search_path without completely > replacing it, so knowledge of the "default" search path ends up being > propagated to a lot of places. pg_catalog is implicit in the search_path, what about having user schemas with the implicit capability too? Then you have the problem of ordering more than one implicit schema; the easy solution is to solve that the same way we solve trigger ordering: alphabetically. Now, that could mean ugly user-facing schema names: we already know we need synonyms, don't we? > There's a parallel here with operating-system package mechanisms; for > the most part, the more usable / successful packaging systems don't > rely on putting everything in separate directories, instead they have > an out-of-band method for specifying what files belong to what > package. > > We already have a mechanism we could use for this: pg_depend. If an > "installed package" was a type of object, the functions, types, > operators, or any other kind of object installed by the package could > have dependency links to it; that would (a) make it trivial to drop, > and (b) pg_dump could check for package dependencies and, for objects > depending on a package, emit only a package installation command rather > than the object definition. 
Here's a sketch of what I came up with: http://wiki.postgresql.org/wiki/ExtensionPackaging It still needs some work before being a solid proposal, like for example handling cases where you want to pg_restore a database and insist on *not* caring about some extensions (pgq, londiste, slony things, cron restoring into pre-live systems). Or working out some versioning information and dependencies between modules. What it misses the most is hackers' acceptance of the proposed concepts, though. > (I distinguish an "installed package" from whatever the package > definition might be, since it's possible that a package might want to > provide multiple APIs, for example for different versions, and these > might be installed simultaneously in different schemas.) Version tracking is yet to be designed in the document. -- dim
Hi, Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :) On 22 Mar 2009, at 14:29, Dave Page wrote: >> - want to be able to do INSTALL PACKAGE foo; without needing to >> mess with .sql files. This might default to looking for >> $libdir/foo.so, or there might be a mechanism to register packages >> globally or locally. Part of the proposal. >> - want to be able to do INSTALL PACKAGE foo VERSION 1; and get >> the version 1 API rather than whatever the latest is. To be added to the proposal. >> - want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather >> than having to edit some .sql file. Part of the proposal (install time variables/options/parameters). >> - want to be able to do DROP PACKAGE foo; Part of the proposal. >> - want pg_dump to not output the definitions of any objects that >> belong to a package, but instead to output an INSTALL PACKAGE foo >> VERSION n SCHEMA x; Part of the proposal. > I think using PACKAGE is a bad idea as it'll confuse people used to > Oracle. MODULE perhaps? Using PACKAGE would tie us into supporting Oracle syntax, which nobody actually wants, it seems. Or at least we would have to reserve the keyword for meaning "Oracle-compliant facility". Module, on the other hand, is already used in PostgreSQL to refer to the dynamic lib you get when installing C-coded extensions (.so or .dll); what we miss here is a way to refer to them in pure SQL and have their existence tracked in the catalogs. That's the part Tom Dunstan worked on IIRC. He also worked out some OS-level tools for module handling, but I think I'd prefer to have another notion in between: the extension. The extension would be a new SQL object referring to zero, one or more modules and one or more SQL scripts creating new SQL objects (schemas, tables, views, tablespaces, functions, types, casts, operator classes and families, etc, whatever SQL scripting we support now --- yes, index AMs would be great too). Those would depend (pg_depend) on the extension SQL object. 
I don't think we need to be able to nest a package creation inside the package SQL scripts, but hey, why not. So my vote is for us to talk about modules (.so) and extensions (the packaging and distribution of them). And this term isn't even new in the PostgreSQL glossary ;) Regards, -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > He also worked out some OS level tools for module handling, but I > think I'd prefer to have another notion in between, the extension. > The extension would be a new SQL object referring to zero, one or more > modules and one or more SQL scripts creating new SQL objects (schemas, > tables, views, tablespaces, functions, types, casts, operator classes > and families, etc, whatever SQL scripting we support now --- yes, > index am would be great too). This seems drastically overengineered. What do we need two levels of objects for? regards, tom lane
On 22 Mar 2009, at 22:05, Tom Lane wrote: > This seems drastically overengineered. What do we need two levels of > objects for? We need to be able to refer (pg_depend) to (system level) modules. Any given extension may depend on more than one module. What did I overlook? -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > On 22 Mar 2009, at 22:05, Tom Lane wrote: >> This seems drastically overengineered. What do we need two levels of >> objects for? > We need to be able to refer (pg_depend) to (system level) modules. > Any given extension may depend on more than one module. You really haven't convinced me that this is anything but overcomplication. There might (or might not) be some use-case for being able to declare that module A depends on module B, but that doesn't mean we need a second layer of grouping. regards, tom lane
On Sun, Mar 22, 2009 at 10:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > There might (or might not) be some use-case > for being able to declare that module A depends on module B, For example, earthdistance requires cube, so module dependency is already something that might be useful. But as you said, it doesn't require a second level of grouping, just a way to define dependencies. -- Guillaume
On Sun, Mar 22, 2009 at 7:54 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: >>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: > > >> I'm hesitant to do that when we don't yet have either a design or > >> a migration plan for the module facility. We might find we'd shot > >> ourselves in the foot, or at least complicated the migration > >> situation unduly. > > Robert> I think there have been a few designs proposed, but I think > Robert> part of the problem is a lack of agreement on the > Robert> requirements. "module facility" seems to mean a lot of > Robert> different things to different people. > > Some ideas: > > - want to be able to do INSTALL PACKAGE foo; without needing to > mess with .sql files. This might default to looking for > $libdir/foo.so, or there might be a mechanism to register packages > globally or locally. > > - want to be able to do INSTALL PACKAGE foo VERSION 1; and get > the version 1 API rather than whatever the latest is. > > - want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather > than having to edit some .sql file. > > - want to be able to do DROP PACKAGE foo; > > - want pg_dump to not output the definitions of any objects that > belong to a package, but instead to output an INSTALL PACKAGE foo > VERSION n SCHEMA x; This seems about right to me. I think the key to getting this done is to keep the design as simple as possible and to avoid entanglements with other features that may need to be designed independently and first. I think there's a good argument to be made that package management could benefit from the notion of a variable. For example, you might want to write a SQL script or PL/pgsql procedure where ?{version}, or some equally inscrutable glyph, refers to the version specified in the INSTALL MODULE command. I'm deeply skeptical about this approach. Either variables are useful in PL/pgsql - as I tend to believe - or they aren't - as I'm sure can be argued. 
If they're useful, though, they are probably useful in many contexts other than package management. So I would suggest that either a concerted effort needs to be made to design and implement a useful variable facility (and then we can use it for package management, too) or package management needs to be made to work without variables (and then if we eventually add them in general we can use them for package management, too). On that basis, I'm inclined to suggest that the SCHEMA and VERSION clauses you've proposed here should be dropped for the first version of this, because I think it will be very, very difficult to implement them without variables. We also, I think, need to try very hard to avoid getting sucked into creating a CPAN-like system for installing modules *on the machine*. We need to focus on how the modules get sucked into PostgreSQL once the OS-level packaging system (RPM, deb, whatever), or the system administrator, have gotten the files installed in some suitable place on the local host, and we now want to make PostgreSQL know about and use them. It might be nice to have a system that does the whole thing, soup to nuts, but again, that's something that can be added later and used by only those that want it. So taking into account suggestions elsewhere on this thread, I suggest "INSTALL MODULE foo" and "DROP MODULE foo". It's pretty clear what DROP MODULE foo should do, but the semantics of INSTALL MODULE foo are a bit less clear. I suspect that it's going to boil down to running a SQL script, which will need to somehow get that module installed. To make that work, I think we need "CREATE MODULE foo" and then "CREATE <TABLE|VIEW|FUNCTION|...> ... MODULE foo". So the SQL script will create the module and then create all of the objects and make them depend on the module using the optional "MODULE foo" clause. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > ... I suspect that it's going to boil down to running a > SQL script, which will need to somehow get that module installed. To > make that work, I think we need "CREATE MODULE foo" and then "CREATE > <TABLE|VIEW|FUNCTION|...> ... MODULE foo". So the SQL script will > create the module and then create all of the objects and make them > depend on the module using the optional "MODULE foo" clause. I doubt that we want to decorate every CREATE statement we've got with an optional MODULE clause; to name just one objection, it'd probably be impossible to do so without making MODULE a fully reserved word. What was discussed in the last go-round was some sort of state-dependent assignment of a module context. You could imagine either BEGIN MODULE modname; CREATE this; CREATE that; CREATE the_other; END MODULE; or something along the lines of SET current_module = modname; CREATE this; CREATE that; CREATE the_other; SET current_module = null; which is really more or less the same thing except that it makes the state concrete in the form of an examinable variable. In either case you'd need to define how the state would interact with transactions and errors. regards, tom lane
On Sun, Mar 22, 2009 at 10:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> ... I suspect that it's going to boil down to running a >> SQL script, which will need to somehow get that module installed. To >> make that work, I think we need "CREATE MODULE foo" and then "CREATE >> <TABLE|VIEW|FUNCTION|...> ... MODULE foo". So the SQL script will >> create the module and then create all of the objects and make them >> depend on the module using the optional "MODULE foo" clause. > > I doubt that we want to decorate every CREATE statement we've got with > an optional MODULE clause; to name just one objection, it'd probably > be impossible to do so without making MODULE a fully reserved word. > > What was discussed in the last go-round was some sort of state-dependent > assignment of a module context. You could imagine either > > BEGIN MODULE modname; > > CREATE this; > CREATE that; > CREATE the_other; > > END MODULE; > > or something along the lines of > > SET current_module = modname; > > CREATE this; > CREATE that; > CREATE the_other; > > SET current_module = null; > > which is really more or less the same thing except that it makes the > state concrete in the form of an examinable variable. In either case > you'd need to define how the state would interact with transactions > and errors. I thought about that, but wasn't sure if people would like it, since it seems a little un-SQL-ish. But I'm fine with it, and it has the additional advantage that it avoids the need to recapitulate the module name many times. If there's no semantic problem with making current_module be a GUC, the SET syntax seems very tempting, since it avoids the need to make up something new and different. ...Robert
>>>>> "Alvaro" == Alvaro Herrera <alvherre@commandprompt.com> writes: >> I've been thinking about this, and my conclusion is that schemas >> as they currently exist are the wrong tool for making >> modules/packages. Alvaro> This has been discussed at length previously, and we even had Alvaro> an incomplete but substantive patch posted. Did you review Alvaro> that? Some of it appears to be in line with what you're Alvaro> proposing here. If you're interested in this area, perhaps Alvaro> you could pick up where Tom Dunstan left off. Yes, that's close to what I had in mind. One difference is that I would be inclined to punt more of the installation logic into the module itself. If "INSTALL MODULE foo" worked by calling a specially-declared function in foo.so (if present), it would give the module more flexibility in terms of what to install based on the version number requested, etc.; some helper functions could be provided so that the simpler cases require only a few lines of code. Modules not implemented as .so files would have a bit less flexibility thanks to the fact that we don't have any procedural languages installed by default; how to do versioning for them would require a bit more thought. (Maybe have a defaultmodule.so to do the work for them?) I will consider working on this at some point. -- Andrew.
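A sketch of what "installation logic in the module itself" might look like. Every name here (ModuleInstallInfo, _PG_module_install) is hypothetical, invented by analogy with the existing _PG_init hook, and the "run the CREATE statements" steps are stubbed out:

```c
#include <stddef.h>

/* Hypothetical sketch of "INSTALL MODULE foo" calling a specially
 * declared entry point in foo.so.  None of these names exist in
 * PostgreSQL; they are assumptions for illustration only. */
typedef struct ModuleInstallInfo
{
    int         version;        /* version requested; 0 = latest */
    const char *schema;         /* target schema, or NULL for default */
} ModuleInstallInfo;

/* The backend would look up this fixed symbol name after loading the
 * .so, much as it looks up _PG_init today, and call it with the
 * options given to INSTALL MODULE. */
int
_PG_module_install(const ModuleInstallInfo *info)
{
    switch (info->version)
    {
        case 0:                 /* latest */
        case 2:
            /* run the version-2 CREATE statements (via SPI, say) */
            return 0;
        case 1:
            /* run the version-1 compatibility script instead */
            return 0;
        default:
            return -1;          /* version not provided by this module */
    }
}
```

The helper functions Andrew mentions would sit between this entry point and the actual catalog manipulation, so a simple module's install hook could be a handful of lines dispatching on the version.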
>>>>> "Dimitri" == Dimitri Fontaine <dfontaine@hi-media.com> writes:

 >> Partly that's based on the relative inflexibility of the
 >> search_path setting; it's hard to modify the search_path without
 >> completely replacing it, so knowledge of the "default" search path
 >> ends up being propagated to a lot of places.

 Dimitri> pg_catalog is implicit in the search_path, what about having
 Dimitri> user schemas with the implicit capability too?

 Dimitri> Then you have the problem of ordering more than one implicit
 Dimitri> schemas,

This is a hint that it's really a bad idea. Instead, what I'd suggest is breaking up search_path into multiple variables - maybe pre_search_path, search_path, and post_search_path.

-- 
Andrew.
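[A rough sketch of how the split-path idea might compose. None of pre_search_path or post_search_path exist today; only search_path does — these names are hypothetical, taken from the suggestion above.]

```sql
-- Sketch only: pre_search_path and post_search_path are hypothetical
-- settings from this proposal; search_path is the only one that exists.
SET pre_search_path  = 'pg_catalog';         -- always searched first
SET search_path      = 'myschema, public';   -- the user-controlled portion
SET post_search_path = 'hstore, intarray';   -- extension schemas, searched last

-- Effective lookup order would be:
--   pg_catalog, myschema, public, hstore, intarray
-- so an extension can append its schema without clobbering the
-- user's own setting.
```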
>>>>> "Dimitri" == Dimitri Fontaine <dfontaine@hi-media.com> writes:

 Dimitri> Hi,
 Dimitri> Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :)

Yes, I left a short note on its discussion page a while ago :-)

-- 
Andrew.
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> I doubt that we want to decorate every CREATE statement we've
 Tom> got with an optional MODULE clause; to name just one objection,
 Tom> it'd probably be impossible to do so without making MODULE a
 Tom> fully reserved word.

 Tom> What was discussed in the last go-round was some sort of
 Tom> state-dependent assignment of a module context. You could
 Tom> imagine either
[snip]
 Tom> or something along the lines of

 Tom> SET current_module = modname;

 Tom> CREATE this;
 Tom> CREATE that;
 Tom> CREATE the_other;

 Tom> SET current_module = null;

 Tom> which is really more or less the same thing except that it makes
 Tom> the state concrete in the form of an examinable variable. In
 Tom> either case you'd need to define how the state would interact
 Tom> with transactions and errors.

I like the SET version better. As for transactions and errors, I think that installing a module should be done inside a transaction anyway; and the usual GUC mechanisms should handle it if it was done using SET LOCAL, no?

-- 
Andrew.
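[To make the SET LOCAL point concrete, an install script along those lines might look like this. This is a sketch only: current_module is the hypothetical GUC from the proposal, not an existing setting, and the hstore function shown is just an illustrative object.]

```sql
-- Sketch of a transactional module-install script, assuming the
-- proposed current_module GUC existed.  With SET LOCAL the setting
-- reverts automatically at COMMIT or ROLLBACK, so an error anywhere
-- in the script leaves no half-installed module state behind.
BEGIN;

SET LOCAL current_module = 'hstore';

-- Objects created while current_module is set would get a dependency
-- on the module.
CREATE FUNCTION fetchval(hstore, text) RETURNS text
    AS 'MODULE_PATHNAME', 'hstore_fetchval'
    LANGUAGE C STRICT IMMUTABLE;

CREATE OPERATOR -> (LEFTARG = hstore, RIGHTARG = text,
                    PROCEDURE = fetchval);

COMMIT;  -- current_module reverts here
```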
Why do you need any explicit syntax? If the database is loading an SQL file as a result of a LOAD MODULE command wouldn't it know to set whatever internal state it needs to remember that? -- Greg On 22 Mar 2009, at 23:11, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: >>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > Tom> I doubt that we want to decorate every CREATE statement we've > Tom> got with an optional MODULE clause; to name just one objection, > Tom> it'd probably be impossible to do so without making MODULE a > Tom> fully reserved word. > > Tom> What was discussed in the last go-round was some sort of > Tom> state-dependent assignment of a module context. You could > Tom> imagine either > [snip] > > Tom> or something along the lines of > > Tom> SET current_module = modname; > > Tom> CREATE this; > Tom> CREATE that; > Tom> CREATE the_other; > > Tom> SET current_module = null; > > Tom> which is really more or less the same thing except that it makes > Tom> the state concrete in the form of an examinable variable. In > Tom> either case you'd need to define how the state would interact > Tom> with transactions and errors. > > I like the SET version better. As for transactions and errors, I think > that installing a module should be done inside a transaction anyway; > and the usual GUC mechanisms should handle it if it was done using > SET LOCAL, no? > > -- > Andrew. > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On Sunday 22 March 2009 22:46:20 Tom Lane wrote:
> You really haven't convinced me that this is anything but
> overcomplication.

Thinking about it some more, what could be convincing is that an extension can be made of SQL only, with no module (.so) -- I have a case here.

If a single .sql file can be seen as an extension, I'd want to avoid naming it the same as the .so file itself. Having the term "module" refer either to a single .so (or .dll), or a .so with an accompanying .sql file to install it, or even just the SQL file... would add confusion, methinks.

If there's not enough confusion here to grant separating what we call a module and what we call an extension, then I'll go edit my proposal :)

> There might (or might not) be some use-case
> for being able to declare that module A depends on module B,
> but that doesn't mean we need a second layer of grouping.

Agreed, this reason is not a good one for splitting module and extension.

-- 
dim
On Monday 23 March 2009 04:05:04 Andrew Gierth wrote:
> Dimitri> Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :)
> Yes, I left a short note on its discussion page a while ago :-)

Hehe... I'll answer here, as it seems a more open forum...

Schemas vs extensions (or modules, we'll see): yes, they are orthogonal concepts, but still, extensions should not pollute the public namespace, I (and some others) think. So we're encouraging extension authors to use their own schema for the extension's objects, with the drawback that users then have to remember it and manage it along with their own schemas, which causes search_path issues.

I think your idea of splitting search_path into several components would help a lot here.

-- 
dim
On Sun, Mar 22, 2009 at 11:26 PM, Greg Stark <greg.stark@enterprisedb.com> wrote: > Why do you need any explicit syntax? If the database is loading an SQL file > as a result of a LOAD MODULE command wouldn't it know to set whatever > internal state it needs to remember that? That might not be the only time you ever want to create dependencies on the module object. What if the module wants to create an additional table, view, etc. at some later time, following the load? I'm not sure whether there's a use case for that, but it doesn't seem totally implausible. ...Robert
On Monday 23 March 2009 12:34:31 Robert Haas wrote: > That might not be the only time you ever want to create dependencies > on the module object. What if the module wants to create an > additional table, view, etc. at some later time, following the load? > I'm not sure whether there's a use case for that, but it doesn't seem > totally implausible. Then there's Tom's idea of SET module TO ...; to have the context handy, or a WIP syntax in http://wiki.postgresql.org/wiki/ExtensionPackaging CREATE OR REPLACE EXTENSION foo ... AS $$ $$; So you could REPLACE an existing extension and add whatever you need to. -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > On Sunday 22 March 2009 22:46:20 Tom Lane wrote: >> You really haven't convinced me that this is anything but >> overcomplication. > Thinking about it some more what could be convincing is that an extension > could be made of only SQL, with no module (.so) (I have a case here). > If a single .sql file can be seen as an extension, I'd want to avoid naming it > the same as the .so file itself. Having the term "module" refer either to a > single .so (or .dll), or a .so with an accompanying .sql file to install it, or > even just the SQL file... would add confusion, methinks. I think the way most people are envisioning this is that a module is a set of SQL objects (functions, types, tables, whatever). Whether any of those are C functions in one or more underlying .so files is not really particularly relevant to the module mechanism. It should be possible to have a module that doesn't contain any C code, so the concept of a defining function does not look good to me. A defining SQL script is the way to go. The only way that the underlying .so file(s) become relevant is if you are trying to make this a *packaging* mechanism that can actually deliver and install the set of files required to implement a module. I don't think that's a good idea; not least because systems tend to already have their own packaging mechanisms, and we don't need to invent another. I think "module" should just be a SQL-level concept and not be concerned with how the files it needs arrive where they're needed. regards, tom lane
On Monday 23 March 2009 15:43:04 Tom Lane wrote:
> I think the way most people are envisioning this is that a module is a
> set of SQL objects (functions, types, tables, whatever). Whether any
> of those are C functions in one or more underlying .so files is not
> really particularly relevant to the module mechanism.

Fine, that's what I wanted to call an extension, in order not to change the meaning of module. I'll edit the proposal on the wiki later tonight.

> It should be possible to have a module that doesn't contain any C code,
> so the concept of a defining function does not look good to me. A
> defining SQL script is the way to go.

Agreed here. I added some special SQL syntax to my proposal so that the module author can provide some advanced notions (dependencies, version, etc). I still think that wrapping custom SQL in this special syntax has advantages, namely that it helps solve the module-altering facility and module variable handling. Module variables are needed by e.g. pljava for its classpath setting, which is meant to change depending on the caller, from what I've been told.

ALTER MODULE pljava SET classpath = 'some value here';

Of course, as hinted by Peter, the variables here are not GUCs.

> The only way that the underlying .so file(s) become relevant is if you
> are trying to make this a *packaging* mechanism that can actually
> deliver and install the set of files required to implement a module.

What I'm proposing on the WIP wiki page is source-based packaging built on PGXS (just some glue around it to fetch the right tarball from the command line without bothering, then run make and make install; this can come much later). Binary packaging could then be made to work by packagers, based on this.

What I like about this optional tool is that the -core distribution could then publish extra contribs in a central trusted location, such as http://modules.postgresql.org/. 
Source-based distribution only there, hassle-free, allowing -core to stamp e.g. plproxy as a trusted module for PostgreSQL. Minor-release policy would have to be talked about, of course.

> I don't think that's a good idea; not least because systems tend to
> already have their own packaging mechanisms, and we don't need to invent
> another. I think "module" should just be a SQL-level concept and not be
> concerned with how the files it needs arrive where they're needed.

Well, maybe just complaining at module "creation" time (that's when you run the SQL script possibly containing CREATE OR REPLACE MODULE ... $$ <sql> $$) would be enough as far as .so dependencies are concerned. The error message would of course come from the first CREATE FUNCTION ... LANGUAGE C referring to the nonexistent file, which would trigger a rollback.

Is that roughly what you have in mind?

-- 
Dimitri Fontaine
Architecte DBA PostgreSQL
On Mon, Mar 23, 2009 at 7:46 AM, Dimitri Fontaine <dfontaine@hi-media.com> wrote:
> On Monday 23 March 2009 12:34:31 Robert Haas wrote:
>> That might not be the only time you ever want to create dependencies
>> on the module object. What if the module wants to create an
>> additional table, view, etc. at some later time, following the load?
>> I'm not sure whether there's a use case for that, but it doesn't seem
>> totally implausible.
>
> Then there's Tom's idea of SET module TO ...; to have the context handy, or a
> WIP syntax in http://wiki.postgresql.org/wiki/ExtensionPackaging
>
> CREATE OR REPLACE EXTENSION foo ...
> AS $$
> $$;
>
> So you could REPLACE an existing extension and add whatever you need to.

I think SET module_context = 'whatever' is the right idea. CREATE OR REPLACE MODULE is not going to work. Suppose that when we originally install the extension we do:

CREATE TABLE some_table (id integer not null, foo text not null, primary key (id));

...later when we try to do CREATE OR REPLACE the definition has changed to:

CREATE TABLE some_table (id integer not null, bar text not null, baz text not null, primary key (id));

It may well be that the table has data in it that was inserted after module creation time, and the user may want it preserved with the upgrade, but there's really no way to even begin to guess what the user had in mind here.

The CREATE OR REPLACE idea doesn't have very clean semantics even with functions, which are probably the primary use case for this mechanism. If I replace a module, and the new definition doesn't define one of the functions in the original definition, does that amount to an implicit drop of that function? If the module contains a CREATE FUNCTION command, does using CREATE OR REPLACE on the module effectively turn CREATE FUNCTION into CREATE OR REPLACE FUNCTION? Nobody is going to like these semantics, I think, and it gets far uglier when you start looking at tables, views, etc. 
(It's also worth noting, as an independent point, that I suspect SET module_context = 'whatever' will be easier to implement.) ...Robert
On 23 March 2009, at 20:33, Robert Haas wrote:
> It may well be that the table has data in it that was inserted after
> module creation time, and the user may want it preserved with the
> upgrade, but there's really no way to even begin to guess what the
> user had in mind here.

Exactly, we're not in the business of second-guessing our users. So we have versioning information built into the facility, and we should provide a way to tell which version we're upgrading from, if that's the case. Then module authors would be able to do things depending on the value of OLD.version (to reuse existing notations and concepts) or something.

That means supporting conditionals, and that sounds like it's not in the TODO for the first implementation. But still, I don't see how you manage to give module authors a nice upgrade facility without something like this.

Regards,
-- 
dim
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> I think the way most people are envisioning this is that a
 Tom> module is a set of SQL objects (functions, types, tables,
 Tom> whatever). Whether any of those are C functions in one or more
 Tom> underlying .so files is not really particularly relevant to the
 Tom> module mechanism.

 Tom> It should be possible to have a module that doesn't contain any
 Tom> C code,

Yes.

 Tom> so the concept of a defining function does not look good to me.
 Tom> A defining SQL script is the way to go.

But I disagree with this, for the simple reason that we don't have anything like enough flexibility in the form of conditional DDL or error handling, when working in pure SQL without any procedural help. This is especially true when you start to look at how to handle conflicts, upgrades and versioning.

-- 
Andrew.
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> A defining SQL script is the way to go. > But I disagree with this, for the simple reason that we don't have > anything like enough flexibility in the form of conditional DDL or > error handling, when working in pure SQL without any procedural help. So? You can have the script create, execute, and remove a function, and thereby perform any operation that a function could possibly perform for you. regards, tom lane
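[The create/execute/remove pattern Tom describes can be sketched as follows. This assumes PL/pgSQL is installed in the target database; the function name and the conditional logic inside it are purely illustrative.]

```sql
-- Sketch of a defining script using a throwaway PL/pgSQL function to
-- get the conditional DDL and error handling that plain SQL lacks.
CREATE FUNCTION install_helper() RETURNS void AS $$
BEGIN
    -- Conditional DDL: only create the table if it isn't already there.
    IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = 'module_config') THEN
        CREATE TABLE module_config (key text PRIMARY KEY, value text);
    END IF;
END;
$$ LANGUAGE plpgsql;

SELECT install_helper();         -- execute it

DROP FUNCTION install_helper();  -- and remove it again
```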
Hello,

Attention: MODULE is an ANSI SQL keyword, and modules are a class of object in ANSI SQL.

<SQL-server module definition> ::=
    CREATE MODULE <SQL-server module name>
        [ <SQL-server module character set specification> ]
        [ <SQL-server module schema clause> ]
        [ <SQL-server module path specification> ]
        [ <temporary table declaration>... ]
        <SQL-server module contents>...
    END MODULE

<SQL-server module character set specification> ::=
    NAMES ARE <character set specification>

<SQL-server module schema clause> ::=
    SCHEMA <default schema name>

<default schema name> ::=
    <schema name>

<SQL-server module path specification> ::=
    <path specification>

<SQL-server module contents> ::=
    <SQL-invoked routine> <semicolon>

Regards
Pavel Stehule

2009/3/23 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ... I suspect that it's going to boil down to running a
>> SQL script, which will need to somehow get that module installed. To
>> make that work, I think we need "CREATE MODULE foo" and then "CREATE
>> <TABLE|VIEW|FUNCTION|...> ... MODULE foo". So the SQL script will
>> create the module and then create all of the objects and make them
>> depend on the module using the optional "MODULE foo" clause.
>
> I doubt that we want to decorate every CREATE statement we've got with
> an optional MODULE clause; to name just one objection, it'd probably
> be impossible to do so without making MODULE a fully reserved word.
>
> What was discussed in the last go-round was some sort of state-dependent
> assignment of a module context. You could imagine either
>
> BEGIN MODULE modname;
>
> CREATE this;
> CREATE that;
> CREATE the_other;
>
> END MODULE;
>
> or something along the lines of
>
> SET current_module = modname;
>
> CREATE this;
> CREATE that;
> CREATE the_other;
>
> SET current_module = null;
>
> which is really more or less the same thing except that it makes the
> state concrete in the form of an examinable variable. 
> In either case
> you'd need to define how the state would interact with transactions
> and errors.
>
> regards, tom lane
Robert Haas wrote:
> I think the key to getting this done is
> to keep the design as simple as possible and to avoid entanglements
> with other features that may need to be designed independently and
> first.

I think the key to getting this done is to define project purpose and requirements before doing anything else.