Thread: [PATCH] Magic block for modules
This implements a proposal made last November: http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php Basically, it tries to catch people loading modules which belong to the wrong version, have had certain constants changed, or have architecture mismatches. It's a bit more fine-grained though; it currently catches changes in any of the following:

PG_VERSION_NUM
CATALOG_VERSION_NO
the size of 8 basic C types
BLCKSZ
NAMEDATALEN
HAVE_INT64_TIMESTAMP
INDEX_MAX_KEYS
FUNC_MAX_ARGS
VARHDRSZ
MAXDIM
the compiler used (only brand, not version)

It may be overkill, but better safe than sorry. The only one I'm ambivalent about is the first one. We don't require a recompile between minor version changes, or do we? All it requires is for a module to include the header "pgmagic.h" and to put somewhere in its source:

PG_MODULE_MAGIC

Currently, modules without a magic block are merely logged at LOG level. This needs some discussion though. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
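The mechanism being proposed can be sketched in plain, standalone C. Everything below is an illustrative stand-in for this sketch (the struct layout, the field set, and the `pg_magic_func` name are assumptions, not the actual pgmagic.h contents):

```c
#include <stddef.h>

/* Illustrative stand-ins for the build-time constants being checked;
 * in a real build these come from pg_config.h and pg_config_manual.h. */
#define PG_VERSION_NUM 80200
#define FUNC_MAX_ARGS  100
#define NAMEDATALEN    64

/* The magic block: a static struct stamped with the values the module
 * was compiled against.  The length field catches layout drift too. */
typedef struct {
    int len;            /* sizeof(pg_magic_struct) */
    int version;        /* PG_VERSION_NUM of the compiling tree */
    int funcmaxargs;    /* FUNC_MAX_ARGS */
    int namedatalen;    /* NAMEDATALEN */
} pg_magic_struct;

/* The module-side macro: defines a function, looked up by name at load
 * time, that hands the server a pointer to the block. */
#define PG_MODULE_MAGIC                                     \
    const pg_magic_struct *pg_magic_func(void)              \
    {                                                       \
        static const pg_magic_struct data = {               \
            sizeof(pg_magic_struct),                        \
            PG_VERSION_NUM,                                 \
            FUNC_MAX_ARGS,                                  \
            NAMEDATALEN                                     \
        };                                                  \
        return &data;                                       \
    }

/* What a module author writes, once, anywhere in their source: */
PG_MODULE_MAGIC
```

Since the values are baked in as compile-time constants, a module built against a differently configured tree carries the mismatched numbers with it, and the loader can refuse it before any backend code runs.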
Martijn van Oosterhout <kleptog@svana.org> writes: > This implements a proposal made last November: > http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php Ah, good, I'd been meaning to do this. > changes in any of the following:

> PG_VERSION_NUM
> CATALOG_VERSION_NO
> the size of 8 basic C types
> BLCKSZ
> NAMEDATALEN
> HAVE_INT64_TIMESTAMP
> INDEX_MAX_KEYS
> FUNC_MAX_ARGS
> VARHDRSZ
> MAXDIM
> The compiler used (only brand, not version)

That seems way overkill to me. FUNC_MAX_ARGS is good to check, but most of those other things are noncritical for typical add-on modules. In particular I strongly object to the check on compiler. Some of us do use systems where gcc and vendor compilers are supposed to interoperate ... and aren't all those Windows compilers supposed to, too? AFAIK it's considered the linker's job to prevent loading 32-bit code into a 64-bit executable or vice versa, so I don't think we need to be checking for common assumptions about sizeof(long). > Currently, modules without a magic block are merely logged at LOG > level. This needs some discussion though. I'm pretty sure we had agreed that magic blocks should be required; otherwise this check will accomplish little. regards, tom lane
On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote: > > changes in any of the following:

> > PG_VERSION_NUM
> > CATALOG_VERSION_NO
> > the size of 8 basic C types
> > BLCKSZ
> > NAMEDATALEN
> > HAVE_INT64_TIMESTAMP
> > INDEX_MAX_KEYS
> > FUNC_MAX_ARGS
> > VARHDRSZ
> > MAXDIM
> > The compiler used (only brand, not version)

> That seems way overkill to me. FUNC_MAX_ARGS is good to check, but > most of those other things are noncritical for typical add-on modules. I was trying to find variables that when changed would make some things corrupt. For example, a changed NAMEDATALEN will make any use of the syscache a source of errors. A change in INDEX_MAX_KEYS will break the GiST interface, etc. I wondered about letting module writers select which parts are relevant to them, but that just seems like handing people a footgun. > In particular I strongly object to the check on compiler. Some of us do > use systems where gcc and vendor compilers are supposed to interoperate > ... and aren't all those Windows compilers supposed to, too? AFAIK Maybe that's the case now; it didn't used to be. I seem to remember people having difficulties because they compiled the server with MinGW and the modules with VC++. I'll take it out though, it's not like it costs anything. > it's considered the linker's job to prevent loading 32-bit code into > a 64-bit executable or vice versa, so I don't think we need to be > checking for common assumptions about sizeof(long). I know ELF headers contain some of this info, and Unix in general doesn't try to allow different bit sizes in one binary. Windows used to have (maybe still has) a mechanism to allow 32-bit code to call 16-bit libraries. Do they allow the same for 64-bit libs? > I'm pretty sure we had agreed that magic blocks should be required; > otherwise this check will accomplish little. Sure, I just didn't want to break every module in one weekend.
I was thinking of adding it with LOG level now, send a message on -announce saying that at the beginning of the 8.2 freeze it will be an ERROR. Give people time to react. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
> > it's considered the linker's job to prevent loading 32-bit > code into a > > 64-bit executable or vice versa, so I don't think we need to be > > checking for common assumptions about sizeof(long). > > I know ELF headers contain some of this info, and unix in > general doesn't try to allow different bit sizes in one > binary. Windows used to (maybe still has) a mechanism to > allow 32-bit code to call 16-bit libraries. Do they allow the > same for 64-bit libs? Yes, but it's not something that it does automatically - you have to specifically set it up to call the thunking code. It's not something I think we need to support at all. (Performance is also quite horrible - at least on 16 vs 32, I'd assume the same for 32 vs 64) //Magnus
Martijn van Oosterhout <kleptog@svana.org> writes: > On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote: >> That seems way overkill to me. FUNC_MAX_ARGS is good to check, but >> most of those other things are noncritical for typical add-on modules. > I was trying to find variables that when changed would make some things > corrupt. For example, a changed NAMEDATALEN will make any use of the > syscache a source of errors. A change in INDEX_MAX_KEYS will break the > GiST interface, etc. By that rationale you'd have to record just about every #define in the system headers. And it still wouldn't be bulletproof --- what of custom-modified code with, say, extra fields inserted into some widely used struct? But you're missing the larger point, which is that in many cases this would be breaking stuff without any need at all. The majority of catversion bumps, for instance, are for things that don't affect the typical add-on module. So checking for identical catversion won't accomplish much except to force additional recompile churn on people doing development against CVS HEAD. The original proposal was just to check for major PG version match. I can see checking FUNC_MAX_ARGS too, because that has a very direct impact on the ABI that every external function sees, but I think the cost/benefit ratio rises pretty darn steeply after that. Another problem with an expansive list of stuff-to-check is where does the add-on module find it out from? AFAICS your proposal would make for a large laundry list of random headers that every add-on would now have to #include. If it's not defined by postgres.h or fmgr.h (which are two things that every backend addon is surely including already) then I'm dubious about using it in the magic block. > Sure, I just didn't want to break every module in one weekend. I was > thinking of adding it with LOG level now, send a message on -announce > saying that at the beginning of the 8.2 freeze it will be an ERROR. > Give people time to react. 
I think that will just mean that we'll break every module at the start of 8.2 freeze ;-). Unless we forget to change it to error, which IMHO is way too likely. regards, tom lane
On Mon, May 08, 2006 at 10:32:47AM -0400, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > I was trying to find variables that when changed would make some things > > corrupt. For example, a changed NAMEDATALEN will make any use of the > > syscache a source of errors. A change in INDEX_MAX_KEYS will break the > > GiST interface, etc. > > By that rationale you'd have to record just about every #define in the > system headers. And it still wouldn't be bulletproof --- what of > custom-modified code with, say, extra fields inserted into some widely > used struct? I can see that. That's why I specifically aimed at the ones defined in pg_config_manual.h, i.e., the ones marked "twiddle me". > ... So checking for identical catversion won't > accomplish much except to force additional recompile churn on people > doing development against CVS HEAD. The original proposal was just > to check for major PG version match. Ok, I've taken out CATVERSION and cut PG version to just the major version. I've also dropped the compiler and several others. > Another problem with an expansive list of stuff-to-check is where does > the add-on module find it out from? All these symbols are defined by including c.h only, which is included by postgres.h, so this is not an issue. I obviously didn't include any symbols that a module would need to add special includes for. The only outlier was CATVERSION but we're dropping that test. > I think that will just mean that we'll break every module at the start > of 8.2 freeze ;-). Unless we forget to change it to error, which IMHO > is way too likely. Ok, one week then. Not everyone follows -patches and will be mighty confused when a CVS update suddenly breaks everything. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
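With the trimmed-down field list (major version, FUNC_MAX_ARGS, and the pg_config_manual.h knobs visible via c.h), the server-side check reduces to a field-by-field comparison. A standalone sketch, with illustrative field names and values (the real structure and error reporting live in the patch, not here):

```c
#include <string.h>
#include <stddef.h>

/* Illustrative magic-block layout after trimming: major version plus
 * the "twiddle me" constants.  Names and values are made up for this
 * sketch and do not reflect the actual patch. */
typedef struct {
    int len;
    int version;        /* major version only, e.g. 802 for 8.2 */
    int funcmaxargs;    /* FUNC_MAX_ARGS */
    int namedatalen;    /* NAMEDATALEN */
    int indexmaxkeys;   /* INDEX_MAX_KEYS */
} pg_magic_struct;

/* What the running server was compiled with (illustrative values). */
static const pg_magic_struct server_magic = {
    sizeof(pg_magic_struct), 802, 100, 64, 32
};

/* Returns NULL if the module's block matches the server, otherwise the
 * name of the first mismatched item, suitable for an error message. */
static const char *magic_mismatch(const pg_magic_struct *module)
{
    if (module->len != server_magic.len)                   return "struct length";
    if (module->version != server_magic.version)           return "server major version";
    if (module->funcmaxargs != server_magic.funcmaxargs)   return "FUNC_MAX_ARGS";
    if (module->namedatalen != server_magic.namedatalen)   return "NAMEDATALEN";
    if (module->indexmaxkeys != server_magic.indexmaxkeys) return "INDEX_MAX_KEYS";
    return NULL;
}
```

Checking the struct length first means a server confronted with a block from a differently shaped (older or newer) layout bails out before comparing fields that may not line up.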
On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > This implements a proposal made last November: > > http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php > All it requires is to include the header "pgmagic.h" and to put > somewhere in their source: > > PG_MODULE_MAGIC Could you serve this as a special docstring instead? Eg:

PG_MODULE(foomodule)

is mandatory, there you can do your magic, and optional:

PG_MODULE_DESC("Do foo")
PG_MODULE_AUTHOR("FooMan <baz@foo>")

This provides more motivation for module authors and also creates a (visually) smooth path to provide automatic install, uninstall and registration:

PG_MODULE_INSTALL(inst_sql)
PG_MODULE_UNINSTALL(uninst_sql)

create module foo from '$libdir/foo';
drop module foo;

This seems like a worthwhile direction to move in, especially as it requires a pretty small amount of changes. -- marko
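One way the proposed macros could expand is to plain string constants under well-known symbol names that the loader looks up, much as it already looks up V1 function-info symbols. All symbol names below are hypothetical, made up to illustrate the idea:

```c
#include <string.h>

/* Hypothetical expansion of the proposed macros: each one defines a
 * string constant under a fixed symbol name the loader could dlsym().
 * None of these names exist in PostgreSQL; they are for illustration. */
#define PG_MODULE(name)        const char *pg_module_name = #name;
#define PG_MODULE_DESC(txt)    const char *pg_module_desc = txt;
#define PG_MODULE_AUTHOR(txt)  const char *pg_module_author = txt;

/* What a module author would write, per the proposal: */
PG_MODULE(foomodule)
PG_MODULE_DESC("Do foo")
PG_MODULE_AUTHOR("FooMan <baz@foo>")
```

The attraction is that no special object-file sections are needed: exported symbols work the same way on every shared-library format the backend already supports.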
On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote: > On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > >All it requires is to include the header "pgmagic.h" and to put > >somewhere in their source: > > > >PG_MODULE_MAGIC > > Could you serve this as a special docstring instead? Eg: > > PG_MODULE(foomodule) > > is mandatory, there you can do your magic, and optional: <snip> I like it, but I'm not sure there's enough consensus for that. I've suggested before including install info inside the modules themselves, but there doesn't seem to be much interest in that. Apart from that, there are issues with implementation. The Linux kernel can do it easily because it knows it will be using ELF, and thus can use sections to store this info. PostgreSQL has to support many more binary formats, making things like this tricky (but not impossible). Personally I'd like postgres to move to a system where external modules can easily be installed, uninstalled and upgraded. However, I've not seen the demand yet. Have a nice day -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On 5/31/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote: > > On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > > >All it requires is to include the header "pgmagic.h" and to put > > >somewhere in their source: > > > > > >PG_MODULE_MAGIC > > > > Could you serve this as a special docstring instead? Eg: > > > > PG_MODULE(foomodule) > > > > is mandatory, there you can do your magic, and optional: > > <snip> > > I like it, but I'm not sure there's enough consensus for that. I've > suggested before including install info inside the modules themselves > but there doesn't seem to be much interest in that. I am not suggesting we try to go all the way, just making sure that your current patch fits into that direction. > Apart from that there's issues with implementation. The Linux kernel > can do it easily because it knows it will be using ELF, thus can use > sections to store this info. Postgresql has to support many more types, > making things like this tricky (but not impossible). PostgreSQL already requires symbol-loading functionality for V1 function signatures, so per-module symbols won't be much of a burden. > Personally I'd like postgres to move to a system where external modules > can easily be installed, uninstalled and upgraded. However, I've not > seen the demand yet. Demand happens only when users get used to such niceties on some other databases. Considering that PostgreSQL is extensibility-wise the most advanced database and anything we offer is the world's best, there won't be any demand for years to come. I rather think we should create that demand. Tasks like

- see what modules are installed in a database
- install a module
- remove a module

are rather clunky in the current setup. Making them easier would be a good thing. Of course, it's easy to tell others to do things. I'll try to hack on that area myself also. If not earlier, then maybe at the Summit Code Sprint at least. -- marko
"Marko Kreen" <markokr@gmail.com> writes: >>> Could you serve this as special docstring instead? Eg: >>> PG_MODULE(foomodule) I have no objection to that, and see no real implementation problem with it: we just add a "const char *" field to the magic block. The other stuff seems too blue-sky, and I'm not even sure that it's the right direction to proceed in. Marko seems to be envisioning a future where an extension module is this binary blob with install/deinstall/etc code all hardwired into it. I don't like that a bit. I think the current scheme with separate SQL scripts is a *good* thing, because it makes it a lot easier for users to tweak the SQL definitions, eg, install the functions into a non-default schema. Also, I don't have a problem imagining extension modules that contain no C code, just PL functions --- so the SQL script needs to be considered the primary piece of the module, not the shared library. Is it worth adding a module name to the magic block, or should we just leave well enough alone? It's certainly not something foreseen as part of the purpose of that block. In the absence of some fairly concrete ideas what to do with it, I'm probably going to vote keep-it-simple. regards, tom lane
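Tom's minimal option amounts to one extra field in the magic block. A hedged sketch of how that might look, with a hypothetical macro name and illustrative values (the actual field set and macro are whatever the patch ends up defining):

```c
#include <string.h>
#include <stddef.h>

/* Sketch of the magic block growing a name field.  Old-style blocks
 * could leave it NULL; everything here is illustrative. */
typedef struct {
    int         len;
    int         version;    /* major version, e.g. 802 for 8.2 */
    const char *name;       /* new: module name */
} pg_magic_struct;

/* Hypothetical macro taking the module name as an argument and
 * stringizing it into the block. */
#define PG_MODULE_MAGIC_NAMED(modname)                      \
    const pg_magic_struct *pg_magic_func(void)              \
    {                                                       \
        static const pg_magic_struct data = {               \
            sizeof(pg_magic_struct), 802, #modname          \
        };                                                  \
        return &data;                                       \
    }

PG_MODULE_MAGIC_NAMED(tsearch2)
```

Because the name travels inside the same struct the loader already fetches, no extra symbol lookups are needed; a tool that can read the block gets the name for free.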
On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote: > Is it worth adding a module name to the magic block, or should we just > leave well enough alone? It's certainly not something foreseen as part > of the purpose of that block. In the absence of some fairly concrete > ideas what to do with it, I'm probably going to vote keep-it-simple. I actually considered it while writing the patch but decided against it, given the general tendency against putting extra info into the modules in general... Personally I think it's a good idea, except: where is this info going to be displayed or used? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On 5/31/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Marko Kreen" <markokr@gmail.com> writes: > >>> Could you serve this as special docstring instead? Eg: > >>> PG_MODULE(foomodule) > > I have no objection to that, and see no real implementation problem with > it: we just add a "const char *" field to the magic block. The other > stuff seems too blue-sky, and I'm not even sure that it's the right > direction to proceed in. It was not blue-sky, it was handwaving :) > Marko seems to be envisioning a future where > an extension module is this binary blob with install/deinstall/etc code > all hardwired into it. I don't like that a bit. I think the current > scheme with separate SQL scripts is a *good* thing, because it makes it > a lot easier for users to tweak the SQL definitions, eg, install the > functions into a non-default schema. Also, I don't have a problem > imagining extension modules that contain no C code, just PL functions > --- so the SQL script needs to be considered the primary piece of the > module, not the shared library. I'll later post a list of ideas that we can hopefully agree on and discuss them further. > Is it worth adding a module name to the magic block, or should we just > leave well enough alone? It's certainly not something foreseen as part > of the purpose of that block. In the absence of some fairly concrete > ideas what to do with it, I'm probably going to vote keep-it-simple. Yes, if we want to keep separate SQL for modules then putting stuff into .so is pointless. -- marko
On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote: <snip> > ... The other > stuff seems too blue-sky, and I'm not even sure that it's the right > direction to proceed in. Marko seems to be envisioning a future where > an extension module is this binary blob with install/deinstall/etc code > all hardwired into it. I don't like that a bit. I think the current > scheme with separate SQL scripts is a *good* thing, because it makes it > a lot easier for users to tweak the SQL definitions, eg, install the > functions into a non-default schema. Also, I don't have a problem > imagining extension modules that contain no C code, just PL functions > --- so the SQL script needs to be considered the primary piece of the > module, not the shared library. While you do have a good point about non-binary modules, our module handling needs some help IMHO. For example, the current hack for CREATE LANGUAGE to fix things caused by old pg_dumps. I think that's totally the wrong approach long-term: pg_dump shouldn't be including the CREATE LANGUAGE statement at all, but should be saying something like "INSTALL plpgsql", and pg_restore works out what is needed for that module. The above requires getting a few bits straight:

1. When given the name of an external module, you need to be able to find the SQL commands needed to make it work.
2. You need to be able to tell if something is installed already or not.
3. You need to be able to uninstall it again. Why do we rely on hand-written uninstall scripts when we have a perfectly functional dependency mechanism that can adequately track what was added and remove it again on demand?

With these in place, upgrades across versions of postgres could become a lot easier. People using tsearch2 now would get only "INSTALL tsearch2" in their dumps, and when they upgrade to 8.2 they get the new definitions for tsearch using GIN. No old definitions to confuse people or the database.
(Note: I'm not sure if tsearch would be compatible at the query level, but that's not relevant to the point I'm making). We could get straight into discussions of mechanism, but it would be nice to know if people think the above is a worthwhile idea. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Wednesday 31 May 2006 13:24, Martijn van Oosterhout wrote: > On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote: > > Is it worth adding a module name to the magic block, or should we just > > leave well enough alone? It's certainly not something foreseen as part > > of the purpose of that block. In the absence of some fairly concrete > > ideas what to do with it, I'm probably going to vote keep-it-simple. > > I actually considered it while writing the patch but decided against > given the general tendancy against putting extra info into the modules > in general... > > Personally I think it's a good idea, except: where is this info going > to be displayed or used? > Marko's suggestion on producing a list of installed modules comes to mind, and I suspect tools like pgadmin or ppa will want to be able to show this information. -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
> Marko's suggestion on producing a list of installed modules comes to mind, and > I suspect tools like pgadmin or ppa will want to be able to show this > information. My request for phpPgAdmin is to somehow be able to check if the .so file for a module is present. For instance, I'd like to 'enable slony support' if the slony shared library is present. PPA's slony support automatically executes the .sql files, so all I need to know is if the .so is there. Chris
Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes: > My request for phpPgAdmin is to somehow be able to check if the .so file > for a module is present. > For instance, I'd like to 'enable slony support' if the slony shared > library is present. PPA's slony support automatically executes the .sql > files, so all I need to know is if the .so is there. I really think this is backwards: you should be looking for the .sql files. Every module will have a .sql file, not every one will need a .so file. See followup thread in -hackers where we're trying to hash out design details. regards, tom lane
>> For instance, I'd like to 'enable slony support' if the slony shared >> library is present. PPA's slony support automatically executes the .sql >> files, so all I need to know is if the .so is there. > > I really think this is backwards: you should be looking for the .sql > files. Every module will have a .sql file, not every one will need a > .so file. See followup thread in -hackers where we're trying to hash > out design details. Not in this case. Basically Slony has the concept of installing a node into a server. You can have multiple ones of them - different schemas. So, I'd like to be able to detect that the .so is there, and then offer an "install node" feature where WE execute the SQL on their behalf, with all the complicated string substitutions already done. The trick is that Slony currently requires you to use a command line tool to execute these scripts for you. At the moment, people have to indicate in our config file that Slony is available, and also point us to where the Slony SQL scripts are located. We do the rest. It's not too important, but it's just an idea. Chris
Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes: >> I really think this is backwards: you should be looking for the .sql >> files. Every module will have a .sql file, not every one will need a >> .so file. See followup thread in -hackers where we're trying to hash >> out design details. > Not in this case. > Basically Slony has the concept of installing a node into a server. You > can have multiple ones of them - different schemas. So, I'd like to be > able to detect that the .so is there, and then offer an "install node" > feature where WE execute the SQL on their behalf, with all the > complicated string substitions already done. No, Slony is going to have to adapt to modules, not vice versa. We are *not* designing the module feature on the assumption that every module has some C functions at its core. That would be a shameful restriction of the potential applications. It might be that some way to parameterize the SQL scripts would be handy (the question about which schema to install into comes to mind) ... but that doesn't justify making a .so file the central part of the module concept. But again, this is the wrong list. Please contribute to the "Generalized concept of modules" thread in -hackers. regards, tom lane
On Thursday 01 June 2006 21:38, Christopher Kings-Lynne wrote: > > Marko's suggestion on producing a list of installed modules comes to > > mind, and I suspect tools like pgadmin or ppa will want to be able to > > show this information. > > My request for phpPgAdmin is to somehow be able to check if the .so file > for a module is present. > > For instance, I'd like to 'enable slony support' if the slony shared > library is present. PPA's slony support automatically executes the .sql > files, so all I need to know is if the .so is there. > While I agree with the above (having that for tsearch2 would be nice too), I think we ought to keep in mind the idea of SQL-based modules. Nothing jumps to mind here ppa-wise, but I could see an application looking to see if mysqlcompat was installed before running, if it had a good way to do so. -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL