Modules - Mailing list pgsql-hackers
From | Mattias Kregert |
---|---|
Subject | Modules |
Date | |
Msg-id | 351CF586.2FA40383@algonet.se |
In response to | Re: [HACKERS] Data type removal (dg@illustra.com (David Gould)) |
Responses | Re: [HACKERS] Modules |
List | pgsql-hackers |
David Gould wrote:
>
> To load a module into a kernel all you need to do is read the code in,
> resolve the symbols, and maybe call an initialization routine. This is
> merely a variation on loading a shared object (.so) file into a program.
>
> To add a type and related stuff to a database is really a much harder problem.

I don't agree.

> You need to be able to
>  - add one or more type descriptions              types table
>  - add input and output functions                 types, functions tables
>  - add cast functions                             casts, functions tables
>  - add any datatype specific behavior functions   functions table
>  - add access method operators (maybe)            amops, functions tables
>  - add aggregate operators                        aggregates, functions
>  - add operators                                  operators, functions
>  - provide statistics functions
>  - provide destroy operators
>  - provide .so files for C functions, SQL for sql functions
>    (note this is the part needed for a unix kernel module)
>  - do all the above within a particular schema
>
> You may also need to create and populate data tables, rules, defaults, etc
> required by the implementation of the new type.

All this would be done by the init function in the module you load.
What we need is a set of functions callable by modules, like

  module_register_type(name, descr, func*, textin*, textout*, whatever ...)
  module_register_smgr(name, descr, .....)
  module_register_command(....

Casts would be done by converting to a common format (text) and then
to the desired type. Use textin/textout. No special cast functions would
have to exist. Why doesn't it work this way already??? Would not that
solve all casting problems?

> To unload a type requires undoing all the above. But there is a wrinkle: first
> you have to check if there are any dependencies. That is, if the user has
> created a table with one of the new types, you have to drop that table
> (including column defs, indexes, rules, triggers, defaults etc) before
> you can drop the type. Of course the user may not want to drop their tables
> which brings us to the next problem.

Dependencies are checked by the OS kernel when you try to unload modules.
You cannot unload slhc without first unloading ppp, for example. What's
the difference?

If you have Mod4X running with /dev/dsp opened, then you can't unload the
sound driver, because it is in use, and you cannot unload the a.out module
if you have a non-ELF program running, and you can see the refcount on all
modules and so on... This would not be different in a SQL server.

If you have a cursor open, accessing IP types, then you cannot unload the
IP-types module. Close the cursor, and you can unload the module if you
want to.

You don't have to drop tables containing new types just because you unload
the module. If you want to SELECT from such a table, then the module would
be loaded automagically when it is needed.

> When this gets really hard is when it is time to upgrade an existing database
> to a new version. Suppose you add a new column to a type in the new version.
> How does a user with lots of data in dozens of tables using the old type
> install the new module?
>
> What about restoring a dump from an old version into a system with the new
> version installed?

Suppose you change TIMESTAMP to 64 bits time and 16 bits userid... how do
you solve that problem? You would probably have to make the textin/textout
functions for the type recognize the old format and make the appropriate
conversions. Perhaps add zero userid, or default to postmaster userid?
This would not be any different if TIMESTAMP was in a separate module.

For the internal storage format, every type could have its own way of
recognizing different versions of the data. For example, say you have an
IPv4 module and insert millions of IP addresses, then you upgrade to an
IPv6 module. It would then be able to look at the data and see if it is
an IPv4 or IPv6 address. Of course, you would have problems if you tried
to downgrade and had lots of IPv6 addresses inserted.

MyOwnType could use the first few bits of the data to decide which version
it is, and later releases of the MyOwnType module would be able to
recognize the older formats. This way, types could be upgraded without a
dump-and-load procedure.

> Or how about migrating to a different platform? Can we move data from
> a little endian platform (x86) to a big endian platform (sparc)? Obviously
> the .so files will be different, but what about copying the data out and
> reloading it?

Is this a problem right now? Dump and reload, how can it fail?

> Just to belabor this, it is perfectly reasonable to add a set of types and
> functions that have no 'C' implementation. The 'loadable module' analogy
> misses a lot of the real requirements.

Why would someone want a type without implementation?
Ok, let the module's init function register a type marked as
"non-existent"? Null function?

/* m */
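To make the refcount argument concrete, here is a minimal sketch in C of a module registry that refuses to unload anything still in use. Every name in it (Module, module_load, module_use, module_release, module_unload) is hypothetical and made up for illustration; nothing like this exists in the backend, and a real version would also have to deal with catalogs, locking and on-demand loading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 16

/* Hypothetical registry entry; not an existing PostgreSQL structure. */
typedef struct Module {
    const char *name;        /* must outlive the registry entry */
    const char *depends_on;  /* one required module, or NULL */
    int refcount;            /* open cursors plus dependent modules */
    int loaded;
} Module;

static Module modules[MAX_MODULES];  /* zero-initialized: all slots free */

static Module *module_find(const char *name)
{
    for (int i = 0; i < MAX_MODULES; i++)
        if (modules[i].loaded && strcmp(modules[i].name, name) == 0)
            return &modules[i];
    return NULL;
}

/* Load a module. Returns -1 if it is already loaded, the registry is
 * full, or its dependency is missing. */
int module_load(const char *name, const char *depends_on)
{
    Module *slot = NULL;
    Module *dep = NULL;

    assert(name != NULL);
    if (module_find(name) != NULL)
        return -1;
    for (int i = 0; i < MAX_MODULES && slot == NULL; i++)
        if (!modules[i].loaded)
            slot = &modules[i];
    if (slot == NULL)
        return -1;
    if (depends_on != NULL) {
        dep = module_find(depends_on);
        if (dep == NULL)
            return -1;
        dep->refcount++;  /* the new module pins its dependency */
    }
    slot->name = name;
    slot->depends_on = depends_on;
    slot->refcount = 0;
    slot->loaded = 1;
    return 0;
}

/* Called when something starts using the module, e.g. a cursor opens. */
int module_use(const char *name)
{
    Module *m = module_find(name);
    if (m == NULL)
        return -1;
    m->refcount++;
    return 0;
}

/* Called when that user goes away, e.g. the cursor is closed. */
int module_release(const char *name)
{
    Module *m = module_find(name);
    if (m == NULL || m->refcount == 0)
        return -1;
    m->refcount--;
    return 0;
}

/* Unload only if nothing uses the module; unpin its dependency. */
int module_unload(const char *name)
{
    Module *m = module_find(name);
    if (m == NULL || m->refcount > 0)
        return -1;
    if (m->depends_on != NULL) {
        Module *dep = module_find(m->depends_on);
        if (dep != NULL)
            dep->refcount--;
    }
    m->loaded = 0;
    return 0;
}
```

With this, module_load("slhc", NULL) followed by module_load("ppp", "slhc") makes module_unload("slhc") fail until ppp has been unloaded, and a module with an open cursor (module_use) cannot be unloaded until module_release is called.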
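The "first few bits decide the version" idea could look roughly like the C sketch below. The layout is an assumption chosen only for illustration: the top two bits of the first stored byte carry a format number, 1 for the old IPv4 format and 2 for the newer IPv6 format, and the function names are hypothetical. The newer decoder accepts both formats and returns old IPv4 values as IPv4-mapped IPv6 addresses (::ffff:a.b.c.d).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk format numbers, kept in the top two bits. */
enum { IP_FMT_V4 = 1, IP_FMT_V6 = 2 };

/* Read the format number from the first stored byte. */
static int ip_format(const uint8_t *stored)
{
    assert(stored != NULL);
    return stored[0] >> 6;
}

/* Old format: tag byte + 4 address bytes. Returns bytes written (5). */
size_t ip_encode_v4(const uint8_t addr[4], uint8_t *out)
{
    out[0] = (uint8_t)(IP_FMT_V4 << 6);
    memcpy(out + 1, addr, 4);
    return 5;
}

/* New format: tag byte + 16 address bytes. Returns bytes written (17). */
size_t ip_encode_v6(const uint8_t addr[16], uint8_t *out)
{
    out[0] = (uint8_t)(IP_FMT_V6 << 6);
    memcpy(out + 1, addr, 16);
    return 17;
}

/* Decode either format into a 16-byte address.
 * Returns 0 on success, -1 if the format number is unknown. */
int ip_decode(const uint8_t *stored, uint8_t out[16])
{
    switch (ip_format(stored)) {
    case IP_FMT_V4:
        /* Old value: present it as ::ffff:a.b.c.d */
        memset(out, 0, 10);
        out[10] = 0xff;
        out[11] = 0xff;
        memcpy(out + 12, stored + 1, 4);
        return 0;
    case IP_FMT_V6:
        memcpy(out, stored + 1, 16);
        return 0;
    default:
        return -1;
    }
}
```

An older release whose decoder knows only IP_FMT_V4 would hit the default branch on every IPv6 value, which is exactly the downgrade problem mentioned above.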