Thread: [PATCH] Magic block for modules

[PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
This implements a proposal made last November:

http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

Basically, it tries to catch people loading modules which belong to the
wrong version, have had certain constants changed, or were built for a
different architecture. It's a bit more fine-grained than that, though:
it currently catches changes in any of the following:

PG_VERSION_NUM
CATALOG_VERSION_NO
the size of 8 basic C types
BLCKSZ
NAMEDATALEN
HAVE_INT64_TIMESTAMP
INDEX_MAX_KEYS
FUNC_MAX_ARGS
VARHDRSZ
MAXDIM
The compiler used (only brand, not version)

It may be overkill, but better safe than sorry. The only one I'm
ambivalent about is the first one. We don't require a recompile between
minor version changes, or do we?

All a module needs to do is include the header "pgmagic.h" and put
the following somewhere in its source:

PG_MODULE_MAGIC

Currently, modules without a magic block are merely logged at LOG
level. This needs some discussion though.
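
To make the mechanism concrete, here is a minimal standalone sketch of how such a magic block could work. It is not the code from the patch: the names DemoMagicBlock, DEMO_MODULE_MAGIC, demo_magic_func and demo_magic_matches are hypothetical, and the constant values are illustrative stand-ins for what a real build would take from the server headers.

```c
#include <string.h>

/* Hypothetical stand-ins for constants that a real build would take
 * from the server headers; the values are illustrative only. */
#define DEMO_VERSION_NUM    80200
#define DEMO_FUNC_MAX_ARGS  100
#define DEMO_NAMEDATALEN    64
#define DEMO_BLCKSZ         8192

/* Hypothetical layout; the struct in the actual patch may differ. */
typedef struct DemoMagicBlock
{
    int len;            /* sizeof(DemoMagicBlock), catches layout changes */
    int version_num;
    int func_max_args;
    int namedatalen;
    int blcksz;
} DemoMagicBlock;

/* Expands to a function that hands the module's compile-time values
 * to whoever loads the module. */
#define DEMO_MODULE_MAGIC \
    const DemoMagicBlock *demo_magic_func(void) \
    { \
        static const DemoMagicBlock block = { \
            (int) sizeof(DemoMagicBlock), \
            DEMO_VERSION_NUM, \
            DEMO_FUNC_MAX_ARGS, \
            DEMO_NAMEDATALEN, \
            DEMO_BLCKSZ \
        }; \
        return &block; \
    }

/* The one line a module author would add to the module source. */
DEMO_MODULE_MAGIC

/* Loader side: returns 1 when the module was built with the same values. */
int demo_magic_matches(const DemoMagicBlock *module,
                       const DemoMagicBlock *server)
{
    if (module->len != server->len)
        return 0;
    return memcmp(module, server, sizeof(DemoMagicBlock)) == 0;
}
```

The loader would look up the function by name after loading the library, call it, and compare the result against a block built from its own headers. All fields are plain ints here so that a bytewise comparison is safe.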

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> This implements a proposal made last november:
> http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

Ah, good, I'd been meaning to do this.

> changes in any of the following:

> PG_VERSION_NUM
> CATALOG_VERSION_NO
> the size of 8 basic C types
> BLCKSZ
> NAMEDATALEN
> HAVE_INT64_TIMESTAMP
> INDEX_MAX_KEYS
> FUNC_MAX_ARGS
> VARHDRSZ
> MAXDIM
> The compiler used (only brand, not version)

That seems way overkill to me.  FUNC_MAX_ARGS is good to check, but
most of those other things are noncritical for typical add-on modules.
In particular I strongly object to the check on compiler.  Some of us do
use systems where gcc and vendor compilers are supposed to interoperate
... and aren't all those Windows compilers supposed to, too?  AFAIK
it's considered the linker's job to prevent loading 32-bit code into
a 64-bit executable or vice versa, so I don't think we need to be
checking for common assumptions about sizeof(long).

> Currently, modules without a magic block are merely logged at LOG
> level. This needs some discussion though.

I'm pretty sure we had agreed that magic blocks should be required;
otherwise this check will accomplish little.

            regards, tom lane

Re: [PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote:
> > changes in any of the following:
>
> > PG_VERSION_NUM
> > CATALOG_VERSION_NO
> > the size of 8 basic C types
> > BLCKSZ
> > NAMEDATALEN
> > HAVE_INT64_TIMESTAMP
> > INDEX_MAX_KEYS
> > FUNC_MAX_ARGS
> > VARHDRSZ
> > MAXDIM
> > The compiler used (only brand, not version)
>
> That seems way overkill to me.  FUNC_MAX_ARGS is good to check, but
> most of those other things are noncritical for typical add-on modules.

I was trying to find variables that, when changed, would corrupt
things. For example, a changed NAMEDATALEN will make any use of the
syscache a source of errors. A change in INDEX_MAX_KEYS will break the
GiST interface, etc. I wondered about letting module writers select
which parts are relevant to them, but that just seems like handing
people a footgun.

> In particular I strongly object to the check on compiler.  Some of us do
> use systems where gcc and vendor compilers are supposed to interoperate
> ... and aren't all those Windows compilers supposed to, too?  AFAIK

Maybe that's the case now, but it didn't use to be. I seem to remember
people having difficulties because they compiled the server with MinGW
and the modules with VC++. I'll take it out though, it's not like it
costs anything.

> it's considered the linker's job to prevent loading 32-bit code into
> a 64-bit executable or vice versa, so I don't think we need to be
> checking for common assumptions about sizeof(long).

I know ELF headers contain some of this info, and Unix in general
doesn't try to allow different bit sizes in one binary. Windows used to
have (and maybe still has) a mechanism to allow 32-bit code to call
16-bit libraries. Do they allow the same for 64-bit libs?

> I'm pretty sure we had agreed that magic blocks should be required;
> otherwise this check will accomplish little.

Sure, I just didn't want to break every module in one weekend. I was
thinking of adding it with LOG level now, send a message on -announce
saying that at the beginning of the 8.2 freeze it will be an ERROR.
Give people time to react.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
"Magnus Hagander"
Date:
> > it's considered the linker's job to prevent loading 32-bit code into
> > a 64-bit executable or vice versa, so I don't think we need to be
> > checking for common assumptions about sizeof(long).
>
> I know ELF headers contain some of this info, and unix in
> general doesn't try to allow different bit sizes in one
> binary. Windows used to (maybe still has) a mechanism to
> allow 32-bit code to call 16-bit libraries. Do they allow the
> same for 64-bit libs?

Yes, but it's not something that it does automatically - you have to
specifically set it up to call the thunking code. It's not something I
think we need to support at all. (Performance is also quite horrible -
at least on 16 vs 32; I'd assume the same for 32 vs 64.)


//Magnus

Re: [PATCH] Magic block for modules

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote:
>> That seems way overkill to me.  FUNC_MAX_ARGS is good to check, but
>> most of those other things are noncritical for typical add-on modules.

> I was trying to find variables that when changed would make some things
> corrupt. For example, a changed NAMEDATALEN will make any use of the
> syscache a source of errors. A change in INDEX_MAX_KEYS will break the
> GiST interface, etc.

By that rationale you'd have to record just about every #define in the
system headers.  And it still wouldn't be bulletproof --- what of
custom-modified code with, say, extra fields inserted into some widely
used struct?

But you're missing the larger point, which is that in many cases this
would be breaking stuff without any need at all.  The majority of
catversion bumps, for instance, are for things that don't affect the
typical add-on module.  So checking for identical catversion won't
accomplish much except to force additional recompile churn on people
doing development against CVS HEAD.  The original proposal was just
to check for major PG version match.  I can see checking FUNC_MAX_ARGS
too, because that has a very direct impact on the ABI that every
external function sees, but I think the cost/benefit ratio rises pretty
darn steeply after that.
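
As an illustration of the narrower check argued for above, here is a minimal sketch that compares only the major version and FUNC_MAX_ARGS. The names SlimMagic, major_of and slim_magic_mismatch are hypothetical, and the sketch assumes a five-digit version number such as 80200 for 8.2.0, so that dividing by 100 drops the minor release.

```c
#include <stddef.h>

/* Hypothetical reduced magic block: just the two values discussed above. */
typedef struct SlimMagic
{
    int version_num;    /* e.g. 80203 for 8.2.3 (assumed encoding) */
    int func_max_args;
} SlimMagic;

/* Drop the minor release, so 80200 and 80203 both yield 802. */
static int major_of(int version_num)
{
    return version_num / 100;
}

/* Returns NULL when the module is acceptable, else a short reason. */
const char *slim_magic_mismatch(const SlimMagic *module,
                                const SlimMagic *server)
{
    if (major_of(module->version_num) != major_of(server->version_num))
        return "major version mismatch";
    if (module->func_max_args != server->func_max_args)
        return "FUNC_MAX_ARGS mismatch";
    return NULL;
}
```

With this shape, a module built against one minor release still loads into another minor release of the same major version, and a catversion bump alone never forces a recompile.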

Another problem with an expansive list of stuff-to-check is where does
the add-on module find it out from?  AFAICS your proposal would make for
a large laundry list of random headers that every add-on would now have
to #include.  If it's not defined by postgres.h or fmgr.h (which are two
things that every backend addon is surely including already) then I'm
dubious about using it in the magic block.

> Sure, I just didn't want to break every module in one weekend. I was
> thinking of adding it with LOG level now, send a message on -announce
> saying that at the beginning of the 8.2 freeze it will be an ERROR.
> Give people time to react.

I think that will just mean that we'll break every module at the start
of 8.2 freeze ;-).  Unless we forget to change it to error, which IMHO
is way too likely.

            regards, tom lane

Re: [PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
On Mon, May 08, 2006 at 10:32:47AM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > I was trying to find variables that when changed would make some things
> > corrupt. For example, a changed NAMEDATALEN will make any use of the
> > syscache a source of errors. A change in INDEX_MAX_KEYS will break the
> > GiST interface, etc.
>
> By that rationale you'd have to record just about every #define in the
> system headers.  And it still wouldn't be bulletproof --- what of
> custom-modified code with, say, extra fields inserted into some widely
> used struct?

I can see that. That's why I specifically aimed at the ones defined in
pg_config_manual.h, ie, the ones marked "twiddle me".

> ... So checking for identical catversion won't
> accomplish much except to force additional recompile churn on people
> doing development against CVS HEAD.  The original proposal was just
> to check for major PG version match.

Ok, I've taken out CATVERSION and cut PG version to just the major
version. I've also dropped the compiler and several others.

> Another problem with an expansive list of stuff-to-check is where does
> the add-on module find it out from?

All these symbols are defined by including c.h only, which is included
by postgres.h, so this is not an issue. I obviously didn't include any
symbols that a module would need to add special includes for. The only
outlier was CATVERSION but we're dropping that test.

> I think that will just mean that we'll break every module at the start
> of 8.2 freeze ;-).  Unless we forget to change it to error, which IMHO
> is way too likely.

Ok, one week then. Not everyone follows -patches and will be mighty
confused when a CVS update suddenly breaks everything.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
"Marko Kreen"
Date:
On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> This implements a proposal made last november:
>
> http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

> All it requires is to include the header "pgmagic.h" and to put
> somewhere in their source:
>
> PG_MODULE_MAGIC

Could you serve this as special docstring instead?  Eg:

PG_MODULE(foomodule)

is mandatory, and there you can do your magic; and optionally:

PG_MODULE_DESC("Do foo")
PG_MODULE_AUTHOR("FooMan <baz@foo>")

This provides more motivation for module authors and also creates a
(visually) smooth path to providing automatic install, uninstall and registration:

PG_MODULE_INSTALL(inst_sql)
PG_MODULE_UNINSTALL(uninst_sql)

create module foo from '$libdir/foo';
drop module foo;

This seems like a worthwhile direction to move in, especially
as it requires a pretty small amount of change.

--
marko

Re: [PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote:
> On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> >All it requires is to include the header "pgmagic.h" and to put
> >somewhere in their source:
> >
> >PG_MODULE_MAGIC
>
> Could you serve this as special docstring instead?  Eg:
>
> PG_MODULE(foomodule)
>
> is mandatory, and there you can do your magic; and optionally:

<snip>

I like it, but I'm not sure there's enough consensus for that. I've
suggested before including install info inside the modules themselves
but there doesn't seem to be much interest in that.

Apart from that, there are issues with implementation. The Linux kernel
can do it easily because it knows it will be using ELF, and thus can use
sections to store this info. PostgreSQL has to support many more types,
making things like this tricky (but not impossible).

Personally I'd like postgres to move to a system where external modules
can easily be installed, uninstalled and upgraded. However, I've not
seen the demand yet.

Have a nice day
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
"Marko Kreen"
Date:
On 5/31/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote:
> > On 5/8/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> > >All it requires is to include the header "pgmagic.h" and to put
> > >somewhere in their source:
> > >
> > >PG_MODULE_MAGIC
> >
> > Could you serve this as special docstring instead?  Eg:
> >
> > PG_MODULE(foomodule)
> >
> > is mandatory, and there you can do your magic; and optionally:
>
> <snip>
>
> I like it, but I'm not sure there's enough consensus for that. I've
> suggested before including install info inside the modules themselves
> but there doesn't seem to be much interest in that.

I am not suggesting to try to go all the way, just to make sure that
your current patch fits into that direction.

> Apart from that, there are issues with implementation. The Linux kernel
> can do it easily because it knows it will be using ELF, and thus can use
> sections to store this info. PostgreSQL has to support many more types,
> making things like this tricky (but not impossible).

PostgreSQL already requires symbol loading functionality
for V1 function signatures, so per-module symbols won't be
much of a burden.

> Personally I'd like postgres to move to a system where external modules
> can easily be installed, uninstalled and upgraded. However, I've not
> seen the demand yet.

Demand happens only when users get used to such niceties in some
other database. Considering that PostgreSQL is, extensibility-wise,
the most advanced database and anything we offer is the world's best,
there won't be any demand for years to come.

I rather think we should create that demand.  Tasks like

- see what modules are installed in a database
- install a module
- remove a module

are rather clunky in the current setup. Making them easier would be a good thing.

Of course, it's easy to tell others to do things.  I'll try to hack on that area
myself as well.  If not earlier, then maybe at the Summit Code Sprint at least.

--
marko

Re: [PATCH] Magic block for modules

From
Tom Lane
Date:
"Marko Kreen" <markokr@gmail.com> writes:
>>> Could you serve this as special docstring instead?  Eg:
>>> PG_MODULE(foomodule)

I have no objection to that, and see no real implementation problem with
it: we just add a "const char *" field to the magic block.  The other
stuff seems too blue-sky, and I'm not even sure that it's the right
direction to proceed in.  Marko seems to be envisioning a future where
an extension module is this binary blob with install/deinstall/etc code
all hardwired into it.  I don't like that a bit.  I think the current
scheme with separate SQL scripts is a *good* thing, because it makes it
a lot easier for users to tweak the SQL definitions, eg, install the
functions into a non-default schema.  Also, I don't have a problem
imagining extension modules that contain no C code, just PL functions
--- so the SQL script needs to be considered the primary piece of the
module, not the shared library.
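
For illustration, here is a minimal sketch of what adding a "const char *" name field to a magic block could look like. The names NamedMagic, named_magic_compatible and named_magic_label are hypothetical and the other fields are stand-ins; the point is only that the name is descriptive and has to be left out of the compatibility comparison.

```c
#include <stddef.h>

/* Hypothetical magic block with an added, purely descriptive name. */
typedef struct NamedMagic
{
    int         len;            /* sizeof(NamedMagic) */
    int         major_version;
    int         func_max_args;
    const char *name;           /* module name; not part of the ABI check */
} NamedMagic;

/* Compare field by field: a bytewise compare would also compare the
 * name pointers, which differ between the module and the server. */
int named_magic_compatible(const NamedMagic *module,
                           const NamedMagic *server)
{
    return module->len == server->len
        && module->major_version == server->major_version
        && module->func_max_args == server->func_max_args;
}

/* The name is optional, so readers need a fallback. */
const char *named_magic_label(const NamedMagic *m)
{
    return m->name != NULL ? m->name : "(unnamed)";
}
```

A module that does not set the name still passes the check; anything that lists loaded modules would simply show the fallback label.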

Is it worth adding a module name to the magic block, or should we just
leave well enough alone?  It's certainly not something foreseen as part
of the purpose of that block.  In the absence of some fairly concrete
ideas what to do with it, I'm probably going to vote keep-it-simple.

            regards, tom lane

Re: [PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
> Is it worth adding a module name to the magic block, or should we just
> leave well enough alone?  It's certainly not something foreseen as part
> of the purpose of that block.  In the absence of some fairly concrete
> ideas what to do with it, I'm probably going to vote keep-it-simple.

I actually considered it while writing the patch but decided against
it, given the general tendency against putting extra info into the
modules...

Personally I think it's a good idea, except: where is this info going
to be displayed or used?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
"Marko Kreen"
Date:
On 5/31/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Marko Kreen" <markokr@gmail.com> writes:
> >>> Could you serve this as special docstring instead?  Eg:
> >>> PG_MODULE(foomodule)
>
> I have no objection to that, and see no real implementation problem with
> it: we just add a "const char *" field to the magic block.  The other
> stuff seems too blue-sky, and I'm not even sure that it's the right
> direction to proceed in.

It was not blue-sky, it was handwaving :)

> Marko seems to be envisioning a future where
> an extension module is this binary blob with install/deinstall/etc code
> all hardwired into it.  I don't like that a bit.  I think the current
> scheme with separate SQL scripts is a *good* thing, because it makes it
> a lot easier for users to tweak the SQL definitions, eg, install the
> functions into a non-default schema.  Also, I don't have a problem
> imagining extension modules that contain no C code, just PL functions
> --- so the SQL script needs to be considered the primary piece of the
> module, not the shared library.

I'll later post a list of ideas that we can hopefully agree on
and discuss them further.

> Is it worth adding a module name to the magic block, or should we just
> leave well enough alone?  It's certainly not something foreseen as part
> of the purpose of that block.  In the absence of some fairly concrete
> ideas what to do with it, I'm probably going to vote keep-it-simple.

Yes, if we want to keep separate SQL for modules then
putting stuff into the .so is pointless.

--
marko

Re: [PATCH] Magic block for modules

From
Martijn van Oosterhout
Date:
On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
<snip>
> ...  The other
> stuff seems too blue-sky, and I'm not even sure that it's the right
> direction to proceed in.  Marko seems to be envisioning a future where
> an extension module is this binary blob with install/deinstall/etc code
> all hardwired into it.  I don't like that a bit.  I think the current
> scheme with separate SQL scripts is a *good* thing, because it makes it
> a lot easier for users to tweak the SQL definitions, eg, install the
> functions into a non-default schema.  Also, I don't have a problem
> imagining extension modules that contain no C code, just PL functions
> --- so the SQL script needs to be considered the primary piece of the
> module, not the shared library.

While you do have a good point about non-binary modules, our module
handling needs some help IMHO. Take, for example, the current hack in
CREATE LANGUAGE to fix things caused by old pg_dumps. I think that's
totally the wrong approach long term. I think pg_dump shouldn't be
including the CREATE LANGUAGE statement at all, but should be saying
something like "INSTALL plpgsql", with pg_restore working out what is
needed for that module.

The above requires getting a few bits straight:

1. When given the name of an external module, you need to be able to
find the SQL commands needed to make it work.

2. You need to be able to tell if something is installed already or
not.

3. You need to be able to uninstall it again. Why do we rely on
hand-written uninstall scripts when we have a perfectly functional
dependency mechanism that can adequately track what was added and remove
it again on demand?
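
The three requirements above can be sketched as a small in-memory registry. This is only an illustration of the bookkeeping, not a proposed design: ModuleEntry, find_module, track_object and uninstall_module are hypothetical names, and a real implementation would record dependencies in the system catalogs and issue actual DROP commands.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TRACKED 8

/* Hypothetical registry entry for one module. */
typedef struct ModuleEntry
{
    const char *name;
    const char *install_script;         /* 1: where the SQL lives */
    int         installed;              /* 2: installed already or not */
    const char *objects[MAX_TRACKED];   /* 3: what the install created */
    int         nobjects;
} ModuleEntry;

/* Requirement 1: find a module (and thus its script) by name. */
ModuleEntry *find_module(ModuleEntry *catalog, int n, const char *name)
{
    int i;

    for (i = 0; i < n; i++)
        if (strcmp(catalog[i].name, name) == 0)
            return &catalog[i];
    return NULL;
}

/* Record an object created while the install script runs. */
int track_object(ModuleEntry *m, const char *object)
{
    if (m->nobjects >= MAX_TRACKED)
        return -1;
    m->objects[m->nobjects++] = object;
    m->installed = 1;
    return 0;
}

/* Requirement 3: remove tracked objects in reverse order of creation.
 * Returns how many objects were dropped. */
int uninstall_module(ModuleEntry *m)
{
    int dropped = 0;

    while (m->nobjects > 0)
    {
        m->objects[--m->nobjects] = NULL;   /* a real system would DROP here */
        dropped++;
    }
    m->installed = 0;
    return dropped;
}
```

Because the uninstall step walks the tracked list, no hand-written uninstall script is needed in this model.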

With these in place, upgrades across versions of postgres could become
a lot easier. People using tsearch2 now would get only "INSTALL
tsearch2" in their dumps and when they upgrade to 8.2 they get the new
definitions for tsearch using GIN. No old definitions to confuse people
or the database. (Note: I'm not sure if tsearch would be compatible at
the query level, but that's not relevant to the point I'm making.)

We could get straight into discussions of mechanism, but it would be
nice to know if people think the above is a worthwhile idea.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCH] Magic block for modules

From
Robert Treat
Date:
On Wednesday 31 May 2006 13:24, Martijn van Oosterhout wrote:
> On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
> > Is it worth adding a module name to the magic block, or should we just
> > leave well enough alone?  It's certainly not something foreseen as part
> > of the purpose of that block.  In the absence of some fairly concrete
> > ideas what to do with it, I'm probably going to vote keep-it-simple.
>
> I actually considered it while writing the patch but decided against
> it, given the general tendency against putting extra info into the
> modules...
>
> Personally I think it's a good idea, except: where is this info going
> to be displayed or used?
>

Marko's suggestion on producing a list of installed modules comes to mind, and
I suspect tools like pgadmin or ppa will want to be able to show this
information.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: [PATCH] Magic block for modules

From
Christopher Kings-Lynne
Date:
> Marko's suggestion on producing a list of installed modules comes to mind, and
> I suspect tools like pgadmin or ppa will want to be able to show this
> information.

My request for phpPgAdmin is to somehow be able to check if the .so file
for a module is present.

For instance, I'd like to 'enable slony support' if the slony shared
library is present.  PPA's slony support automatically executes the .sql
files, so all I need to know is if the .so is there.

Chris


Re: [PATCH] Magic block for modules

From
Tom Lane
Date:
Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
> My request for phpPgAdmin is to somehow be able to check if the .so file
> for a module is present.

> For instance, I'd like to 'enable slony support' if the slony shared
> library is present.  PPA's slony support automatically executes the .sql
> files, so all I need to know is if the .so is there.

I really think this is backwards: you should be looking for the .sql
files.  Every module will have a .sql file, not every one will need a
.so file.  See followup thread in -hackers where we're trying to hash
out design details.

            regards, tom lane

Re: [PATCH] Magic block for modules

From
Christopher Kings-Lynne
Date:
>> For instance, I'd like to 'enable slony support' if the slony shared
>> library is present.  PPA's slony support automatically executes the .sql
>> files, so all I need to know is if the .so is there.
>
> I really think this is backwards: you should be looking for the .sql
> files.  Every module will have a .sql file, not every one will need a
> .so file.  See followup thread in -hackers where we're trying to hash
> out design details.

Not in this case.

Basically Slony has the concept of installing a node into a server.  You
can have several of them, in different schemas.  So, I'd like to be
able to detect that the .so is there, and then offer an "install node"
feature where WE execute the SQL on their behalf, with all the
complicated string substitutions already done.

The trick is that Slony currently requires you to use a command line
tool to execute these scripts for you.

At the moment, people have to indicate in our config file that Slony is
available, and also point us to where the Slony SQL scripts are located.
We do the rest.

It's not too important, but it's just an idea.

Chris


Re: [PATCH] Magic block for modules

From
Tom Lane
Date:
Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
>> I really think this is backwards: you should be looking for the .sql
>> files.  Every module will have a .sql file, not every one will need a
>> .so file.  See followup thread in -hackers where we're trying to hash
>> out design details.

> Not in this case.

> Basically Slony has the concept of installing a node into a server.  You
> can have multiple ones of them - different schemas.  So, I'd like to be
> able to detect that the .so is there, and then offer an "install node"
> feature where WE execute the SQL on their behalf, with all the
> complicated string substitutions already done.

No, Slony is going to have to adapt to modules, not vice versa.  We are
*not* designing the module feature on the assumption that every module
has some C functions at its core.  That would be a shameful restriction
of the potential applications.

It might be that some way to parameterize the SQL scripts would be handy
(the question about which schema to install into comes to mind) ... but
that doesn't justify making a .so file the central part of the module
concept.
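
As a rough illustration of what parameterizing an SQL script by schema could mean, here is a minimal sketch that replaces a placeholder token in a script template. The function name substitute_token and the @SCHEMA@ placeholder used in the usage note are hypothetical; nothing here reflects how Slony or any proposed module feature actually does it.

```c
#include <stddef.h>
#include <string.h>

/* Copy tmpl into out, replacing every occurrence of token with value.
 * Returns 0 on success, -1 if out is too small or token is empty. */
int substitute_token(const char *tmpl, const char *token,
                     const char *value, char *out, size_t outsz)
{
    size_t toklen = strlen(token);
    size_t vallen = strlen(value);
    size_t used = 0;

    if (toklen == 0 || outsz == 0)
        return -1;

    while (*tmpl != '\0')
    {
        if (strncmp(tmpl, token, toklen) == 0)
        {
            /* keep one byte free for the terminating NUL */
            if (used + vallen >= outsz)
                return -1;
            memcpy(out + used, value, vallen);
            used += vallen;
            tmpl += toklen;
        }
        else
        {
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

For example, calling it with the template "CREATE SCHEMA @SCHEMA@;", the token "@SCHEMA@" and the value "_cluster1" fills the output buffer with "CREATE SCHEMA _cluster1;". The substitution works on the script text alone, so it applies equally to modules that ship no .so file.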

But again, this is the wrong list.  Please contribute to the
"Generalized concept of modules" thread in -hackers.

            regards, tom lane

Re: [PATCH] Magic block for modules

From
Robert Treat
Date:
On Thursday 01 June 2006 21:38, Christopher Kings-Lynne wrote:
> > Marko's suggestion on producing a list of installed modules comes to
> > mind, and I suspect tools like pgadmin or ppa will want to be able to
> > show this information.
>
> My request for phpPgAdmin is to somehow be able to check if the .so file
> for a module is present.
>
> For instance, I'd like to 'enable slony support' if the slony shared
> library is present.  PPA's slony support automatically executes the .sql
> files, so all I need to know is if the .so is there.
>

While I agree with the above (having that for tsearch2 would be nice too), I
think we ought to keep in mind the idea of SQL-based modules.  Nothing jumps
to mind here PPA-wise, but I could see an application looking to see if
mysqlcompat was installed before running, if it had a good way to do so.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL