Re: [HACKERS] plperl intial pass - Mailing list pgsql-hackers
From | wieck@debis.com (Jan Wieck) |
---|---|
Subject | Re: [HACKERS] plperl intial pass |
Date | |
Msg-id | m118tke-0003kvC@orion.SAPserv.Hamburg.dsh.de |
In response to | Re: [HACKERS] plperl intial pass ("Mark Hollomon" <mhh@nortelnetworks.com>) |
List | pgsql-hackers |
Mark Hollomon wrote:

> > A dynamically loadable Tcl module contains one special
> > function named <libname>_Init() where first character of
> > libname is capitalized. On dynamic load, this function is
> > called with the invoking interpreter as argument. This
> > function then calls Tcl_CreateCommand() etc. to tell Tcl
>                       ^^^^^^^^^^^^^^^^^
>
> And here-in lies the problem. Tcl_CreateCommand is sitting, not
> in the executable, but in the shared-lib with the function call
> handler. dlopen(), by default will not link across shared-libs.
>
>                    postgres
>                 /-----/  \-----\
>                 |              |
>               plperl.so ---> Opcode.so
>                          ^^
>                          This link doesn't happen.

But it does for PL/Tcl - at least under Linux-ELF.
(C = call to, L = location of the function's code segment):

    +-------------------------+
    | postgres                |
    +-------------------------+
      |
      | dynamic load
      |
      v
    +---------------------------+          +---------------------------+
    | pltcl.so                  |--------->| libtcl8.0.so              |
    |                           | auto-    |                           |
    | C Tcl_CreateInterp()      | dynamic  | L Tcl_CreateInterp()      |
    | C Tcl_CreateCommand()     | load     | L Tcl_CreateCommand()     |
    | L static pltcl_SPI_exec() |          | C pltcl_SPI_exec()        |
    +---------------------------+          +---------------------------+

After pltcl.so is loaded, it calls Tcl_CreateInterp() to build a Tcl
interpreter, and then calls Tcl_CreateCommand() to tell that
interpreter the address of one of its hidden (static) functions plus
the name it should have on the script side. The interpreter just
remembers this in its command hash table, and when that keyword
occurs where it expects a command/procedure name, it calls the
function via the function pointer.

There is no -ltcl8.0 switch in the link step of postgres.
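The registration step described above (the interpreter remembers a name together with a function pointer and later calls through that pointer) can be sketched in plain C. This is only an illustration of the mechanism, not Tcl's implementation: every name here (interp, interp_create_command, interp_eval, my_spi_exec, module_init) is hypothetical, and a small linear table stands in for Tcl's command hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from Tcl or PostgreSQL. */
typedef int (*cmd_proc)(int argc, const char **argv);

#define MAX_COMMANDS 16

typedef struct {
    const char *name;   /* command name as seen from the script side */
    cmd_proc    proc;   /* address of the C function implementing it */
} cmd_entry;

typedef struct {
    cmd_entry entries[MAX_COMMANDS];  /* linear table instead of a hash table */
    size_t    count;
} interp;

/* Remember name -> function pointer. Returns 0 on success, -1 if full. */
int interp_create_command(interp *ip, const char *name, cmd_proc proc)
{
    assert(ip != NULL && name != NULL && proc != NULL);
    if (ip->count == MAX_COMMANDS)
        return -1;
    ip->entries[ip->count].name = name;
    ip->entries[ip->count].proc = proc;
    ip->count++;
    return 0;
}

/* Look up argv[0] and call through the stored pointer; -1 if unknown. */
int interp_eval(const interp *ip, int argc, const char **argv)
{
    assert(ip != NULL && argc > 0);
    for (size_t i = 0; i < ip->count; i++)
        if (strcmp(ip->entries[i].name, argv[0]) == 0)
            return ip->entries[i].proc(argc, argv);
    return -1;
}

/* A file-local (static) function: its symbol is not exported, yet the
 * "interpreter" can still reach it through the registered pointer. */
static int my_spi_exec(int argc, const char **argv)
{
    (void) argv;
    return argc - 1;   /* pretend result: number of arguments after the name */
}

/* Module entry point: hands the static function's address to the interpreter. */
int module_init(interp *ip)
{
    return interp_create_command(ip, "spi_exec", my_spi_exec);
}
```

The point of the sketch is that the interpreter side never needs a linker-visible symbol for my_spi_exec; only the code that registers the command has to be able to resolve the registration function itself.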
The fact that pltcl.so needs something out of libtcl8.0.so is told
when linking pltcl.so:

    gcc -shared -o pltcl.so pltcl.o -L/usr/local/lib -ltcl8.0

That results in this:

    [pgsql@hot] ~ > ldd bin/postgres
        libdl.so.1 => /lib/libdl.so.1 (0x4000a000)
        libm.so.5 => /lib/libm.so.5 (0x4000d000)
        libtermcap.so.2 => /usr/lib/libtermcap.so.2 (0x40016000)
        libncurses.so.3.0 => /lib/libncurses.so.3.0 (0x4001a000)
        libc.so.5 => /lib/libc.so.5 (0x4005b000)
    [pgsql@hot] ~ > ldd lib/pltcl.so
        ./lib/pltcl.so => ./lib/pltcl.so (0x4000a000)
        libc.so.5 => /lib/libc.so.5 (0x40010000)
        libtcl8.0.so => /usr/local/lib/libtcl8.0.so (0x400cb000)

As you see, there is no libtcl mentioned in the shared lib
dependencies of the postgres backend. It's the pltcl.so shared
object that remembers this. And if you invoke "ldd -r -d pltcl.so"
it will print a lot of unresolvable symbols, but most of them are
backend symbols (the others are math ones because the above
gcc -shared call is in fact incomplete - but since the backend is
already linked against libm.so it doesn't matter :-).

So if I want to use a dynamically loadable Tcl package "My" from
inside the PL/Tcl interpreter, I would have to call My_Init() from
pltcl.so AND add My.so to the linkage of pltcl.so. Calling My_Init()
gives pltcl.o an unresolved reference to the symbol _My_Init. The
linker finds it in My.so and saves this info in pltcl.so, so the
dynamic loader can (and does) resolve it whenever something loads
pltcl.so.

The important key is to reference at least one symbol in the shared
lib you want to get automatically loaded. You can add as many link
libs with -l as you want. If none of their symbols is needed, the
linker will not save this dependency (because there is none) in the
resulting .so.

I'll give it a try and USE some binary Tcl packages from inside.
Will tell ya soon.

> Getting those two to play together is more than I care to attempt.
> I am researching a fix now to let linux installations use dlopen
> if it is available.
Don't think you need to.

> > This is just the way I would do it for Tcl and I'll surely do
> > it someday. I would like to have a second, unsafe
> > interpreter in the module. That could then modify files or
> > use the frontend library to access a different database on
> > another server. Needless to say that this then would be an
> > untrusted language, available only for db superusers.
> >
>
> Yes, I've been thinking about that as well. It would be nice to have
> permissions based on userid. Maybe the 'suid' stuff that is being
> discussed in another thread will give us a mechanism.

I know, I know - and I know how.

It cannot work for "internal" language functions. But for anything
that goes through some loading (dynloader or PL call handler), the
fmgr looks up pg_proc and puts information into the FmgrInfo struct.
Adding a setuid field to pg_proc and remembering that too wouldn't
be too much work, and the fmgr would then know when it is calling
such a beast.

Fmgr then manages a current user stack which must be reset on a
transaction abort. Anything that needs the current user simply looks
at the toplevel stack entry.

This is then totally transparent for all non-builtin functions and
all non-builtin triggers (I don't know of a builtin one).

Maybe I kept this in mind for far too long. But for a while I have
been thinking about some more complicated changes to the function
call interface that would require touching several dozen source
files (single argument NULL identification, returning tuples and
tuple SETs). SETUID would have been something DONE WHILE AT IT. I
really should do it earlier than the SETs, because they require
subselecting RTEs (which is the third thread now - eh - I better
shut up).

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
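The current user stack idea from the message can be sketched as follows. This is a minimal illustration under stated assumptions, not PostgreSQL code: user_id, user_stack, call_function and the other names are all hypothetical, the setuid flag and owner are passed in directly rather than read from pg_proc, and error handling is reduced to an assertion.

```c
#include <assert.h>
#include <stddef.h>

#define USER_STACK_MAX 32

typedef unsigned int user_id;          /* hypothetical user identifier */

typedef struct {
    user_id ids[USER_STACK_MAX];       /* ids[0] is the session user */
    size_t  depth;                     /* always >= 1 after init */
} user_stack;

typedef int (*func_ptr)(user_stack *s);

void user_stack_init(user_stack *s, user_id session_user)
{
    s->ids[0] = session_user;
    s->depth = 1;
}

/* Anything that needs the current user looks at the top stack entry. */
user_id current_user(const user_stack *s)
{
    return s->ids[s->depth - 1];
}

/* Call fn; if it is marked setuid, run it with its owner on top of the stack. */
int call_function(user_stack *s, func_ptr fn, int is_setuid, user_id owner)
{
    int result;

    if (is_setuid) {
        assert(s->depth < USER_STACK_MAX);
        s->ids[s->depth++] = owner;    /* push the function owner */
    }
    result = fn(s);
    if (is_setuid)
        s->depth--;                    /* pop on normal return */
    return result;
}

/* On transaction abort, drop everything above the session user. */
void user_stack_reset_on_abort(user_stack *s)
{
    s->depth = 1;
}

/* Sample callee that simply reports who it is running as. */
int report_current_user(user_stack *s)
{
    return (int) current_user(s);
}
```

The reset function matters because an abort can unwind past call_function without running its pop, which is why the message says the stack must be reset on a transaction abort.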