Thread: contrib function naming, and upgrade issues

contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

20 March 2009, 23:23:20

Note that I'm talking here about the names of the C functions, not
the SQL names.

The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.

The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?

-- 
Andrew (irc:RhodiumToad)

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

20 March 2009, 23:42:35

On Fri, Mar 20, 2009 at 9:57 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> Note that I'm talking here about the names of the C functions, not
> the SQL names.
>
> The existing hstore has some very dubious choices of function names
> (for non-static functions) in the C code; functions like each(),
> delete(), fetchval(), defined(), tconvert(), etc. which all look to me
> like prime candidates for name collisions and consequent hilarity.
>
> The patch I'm working on could include fixes for this; but there's an
> obvious impact on anyone upgrading from an earlier version... is it
> worth it?

Based on that description, +1 from me.  That kind of hilarity can be a
huge time sink when debugging, and it makes it hard to use grep to
find all references to a particular function (or #define, typedef,
etc.).

...Robert

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

21 March 2009, 01:38:12

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Note that I'm talking here about the names of the C functions, not
> the SQL names.

> The existing hstore has some very dubious choices of function names
> (for non-static functions) in the C code; functions like each(),
> delete(), fetchval(), defined(), tconvert(), etc. which all look to me
> like prime candidates for name collisions and consequent hilarity.

> The patch I'm working on could include fixes for this; but there's an
> obvious impact on anyone upgrading from an earlier version... is it
> worth it?

I agree that this wasn't an amazingly good choice, but I think there's
no real risk of name collisions because fmgr only searches for such names
within the particular .so.  As you say, renaming *will* break existing
dumps.  I'd be inclined to leave it alone, at least for now.  I hope
that someone will step up and implement a decent module system for us
sometime soon, which might fix the upgrade problem for changes of this
sort.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Simon Riggs

Date:

21 March 2009, 06:13:43

On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote:
> Note that I'm talking here about the names of the C functions, not
> the SQL names.
> 
> The existing hstore has some very dubious choices of function names
> (for non-static functions) in the C code; functions like each(),
> delete(), fetchval(), defined(), tconvert(), etc. which all look to me
> like prime candidates for name collisions and consequent hilarity.
> 
> The patch I'm working on could include fixes for this; but there's an
> obvious impact on anyone upgrading from an earlier version... is it
> worth it?

Perhaps you can have two sets of functions, yet just one .so? One with
the old naming for compatibility, and a set of dehilarified function
names for future use. Two .sql files, giving the user choice.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

21 March 2009, 09:25:29

>>>>> "Simon" == Simon Riggs <simon@2ndQuadrant.com> writes:
 > On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote:
 >> Note that I'm talking here about the names of the C functions, not
 >> the SQL names.
 >>
 >> The existing hstore has some very dubious choices of function names
 >> (for non-static functions) in the C code; functions like each(),
 >> delete(), fetchval(), defined(), tconvert(), etc. which all look to me
 >> like prime candidates for name collisions and consequent hilarity.
 >>
 >> The patch I'm working on could include fixes for this; but there's an
 >> obvious impact on anyone upgrading from an earlier version... is it
 >> worth it?

 Simon> Perhaps you can have two sets of functions, yet just one .so?
 Simon> One with the old naming for compatibility, and a set of
 Simon> dehilarified function names for future use. Two .sql files,
 Simon> giving the user choice.

Two .sql files would be pointless. Remember we're talking about the C
function names, not the SQL names; the only time the user should notice
the difference is when restoring an old dump.

As I see it there are three options:

1) do nothing; keep the existing C function names. dump/restore from
older versions will still work, but new functionality won't be
available without messing with the SQL.

2) hard cutover; rename all the dubious C functions. dump/restore from
older versions will get lots of errors, for which the workaround will
be "install the new hstore.sql into the database before trying to
restore".

3) some sort of compatibility hack involving optionally duplicating the
names in the C module.

-- 
Andrew.
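
[Editorial illustration] Option 3 above could be sketched roughly as
follows. This is only a minimal sketch under stated assumptions, not the
hstore code: the KVPair type, the plain-C signatures (real SQL-callable
functions go through fmgr) and the HSTORE_NO_COMPAT_NAMES switch are all
hypothetical. The idea is that the implementation lives under a prefixed
name, while the old name survives as a thin wrapper so that function
definitions in old dumps still resolve.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for hstore data; not the real on-disk format. */
typedef struct
{
    const char *key;
    const char *value;
} KVPair;

/* New, prefixed name: much less likely to collide with other libraries. */
const char *
hstore_fetchval(const KVPair *pairs, size_t npairs, const char *key)
{
    for (size_t i = 0; i < npairs; i++)
        if (strcmp(pairs[i].key, key) == 0)
            return pairs[i].value;
    return NULL;                /* key not present */
}

#ifndef HSTORE_NO_COMPAT_NAMES  /* hypothetical opt-out switch */
/* Old name, kept only so that references from old dumps keep working. */
const char *
fetchval(const KVPair *pairs, size_t npairs, const char *key)
{
    return hstore_fetchval(pairs, npairs, key);
}
#endif
```

Building with -DHSTORE_NO_COMPAT_NAMES would then drop the old symbols
entirely, which corresponds to the "hard cutover" of option 2.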

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

21 March 2009, 10:05:56

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
 Tom> I agree that this wasn't an amazingly good choice, but I think
 Tom> there's no real risk of name collisions because fmgr only
 Tom> searches for such names within the particular .so.

Oh, if only life were so simple.

Consider two modules mod1 (source files mod1a.c and mod1b.c) and mod2
(source files mod2a.c and mod2b.c).

mod1a.c: contains sql-callable function foo() which calls an extern
function bar() defined in mod1b.c. mod1a.o and mod1b.o are linked to
make mod1.so.

mod2a.c: contains sql-callable function baz() which calls an extern
function bar() defined in mod2b.c. These are linked to make mod2.so.

Guess what happens when foo() and baz() are both called from within
the same backend....

(Perhaps we should be linking contrib and pgxs modules with -Bsymbolic
on those platforms where it matters?)

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Martijn van Oosterhout

Date:

21 March 2009, 10:22:35

On Sat, Mar 21, 2009 at 01:05:35PM +0000, Andrew Gierth wrote:
> (Perhaps we should be linking contrib and pgxs modules with -Bsymbolic
> on those platforms where it matters?)

Another possibility is to use the visibility attributes such as those
provided in GCC. Maybe the version1 declaration of a function could add
the appropriate magic to set the visibility to public and alter PGXS to
set the default visibility to hidden. Voila, modules whose only
exported symbols are those declared with a version-1 declaration.

Perhaps a little too much magic :)

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
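
[Editorial illustration] The visibility idea above might look roughly
like this with GCC or Clang. It is a sketch under assumptions: the
MODULE_EXPORT and MODULE_HIDDEN macro names are made up for this example,
and foo() and bar() are the placeholder names from earlier in the thread,
written with plain C signatures rather than version-1 fmgr declarations.

```c
/* Hypothetical macros; only the GCC/Clang attribute itself is real. */
#if defined(__GNUC__)
#define MODULE_EXPORT __attribute__((visibility("default")))
#define MODULE_HIDDEN __attribute__((visibility("hidden")))
#else
#define MODULE_EXPORT
#define MODULE_HIDDEN
#endif

/*
 * Helper that other source files of the same module may call, but that
 * is not placed in the shared library's dynamic symbol table.
 */
MODULE_HIDDEN int
bar(int x)
{
    return x * 2;
}

/* Entry point that stays visible to whoever loads the library. */
MODULE_EXPORT int
foo(int x)
{
    return bar(x) + 1;
}
```

Compiling the whole module with -fvisibility=hidden would make hidden
the default, so that only functions marked MODULE_EXPORT are exported and
the explicit MODULE_HIDDEN marking becomes unnecessary.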

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

21 March 2009, 13:27:31

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> I agree that this wasn't an amazingly good choice, but I think
>  Tom> there's no real risk of name collisions because fmgr only
>  Tom> searches for such names within the particular .so.

> Oh, if only life were so simple.

I think you are missing the point.  There are certainly *potential*
problems from common function names in different .so's, but that does not
translate to evidence of *actual* problems in the Postgres environment.
In particular, I believe that we load .so's without adding their symbols
to those globally known by the linker --- at least on platforms where
that's possible.  Not to mention that the universe of other .so's we
might load is not all that large.  So I think the actual risks posed by
contrib/hstore are somewhere between minimal and nonexistent.

The past discussions we've had about developing a proper module facility
included ways to replace not-quite-compatible C functions.  I think that
we can afford to let hstore go on as it is for another release or two,
in hopes that we'll have something that makes a fix for this transparent
to users.  The risks don't look to me to be large enough to justify
imposing any upgrade pain on users.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Robert Treat

Date:

21 March 2009, 22:34:55

On Saturday 21 March 2009 12:27:27 Tom Lane wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> >  Tom> I agree that this wasn't an amazingly good choice, but I think
> >  Tom> there's no real risk of name collisions because fmgr only
> >  Tom> searches for such names within the particular .so.
> >
> > Oh, if only life were so simple.
>
> I think you are missing the point.  There are certainly *potential*
> problems from common function names in different .so's, but that does not
> translate to evidence of *actual* problems in the Postgres environment.
> In particular, I believe that we load .so's without adding their symbols
> to those globally known by the linker --- at least on platforms where
> that's possible.  Not to mention that the universe of other .so's we
> might load is not all that large.  So I think the actual risks posed by
> contrib/hstore are somewhere between minimal and nonexistent.
>
> The past discussions we've had about developing a proper module facility
> included ways to replace not-quite-compatible C functions.  I think that
> we can afford to let hstore go on as it is for another release or two,
> in hopes that we'll have something that makes a fix for this transparent
> to users.  The risks don't look to me to be large enough to justify
> imposing any upgrade pain on users.
>

We've been talking about this magical "proper module facility" for a few 
releases now... are we still opposed to putting contrib modules in their own
schema? People who took my advice and did that for tsearch were mighty happy 
when 8.2 broke at the C level, and when 8.3 broke all around. Doing that for 
hstore now would make the transition a little easier in the future as well. 

-- 
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

21 March 2009, 22:49:22

Robert Treat <xzilla@users.sourceforge.net> writes:
> We've been talking about this magical "proper module facility" for a few 
> releases now... are we still opposed to putting contrib modules in their own
> schema?

I'm hesitant to do that when we don't yet have either a design or a
migration plan for the module facility.  We might find we'd shot
ourselves in the foot, or at least complicated the migration situation
unduly.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

22 March 2009, 00:56:00

On Sat, Mar 21, 2009 at 9:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Treat <xzilla@users.sourceforge.net> writes:
>> We've been talking about this magical "proper module facility" for a few
>> releases now... are we still opposed to putting contrib modules in their own
>> schema?
>
> I'm hesitant to do that when we don't yet have either a design or a
> migration plan for the module facility.  We might find we'd shot
> ourselves in the foot, or at least complicated the migration situation
> unduly.

I think there have been a few designs proposed, but I think part of
the problem is a lack of agreement on the requirements.  "module
facility" seems to mean a lot of different things to different people.

...Robert

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

22 March 2009, 08:09:34

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
 Tom> I agree that this wasn't an amazingly good choice, but I think
 Tom> there's no real risk of name collisions because fmgr only
 Tom> searches for such names within the particular .so.

 >> Oh, if only life were so simple.

 Tom> I think you are missing the point.

Nope.

 Tom> There are certainly *potential* problems from common function
 Tom> names in different .so's, but that does not translate to
 Tom> evidence of *actual* problems in the Postgres environment.

It is true that I have no reason to believe that anyone has ever
encountered any problems due to name collisions between hstore and
something else. The only question is how to trade off the potential
risks against the known difficulties regarding upgrading; I'm quite
happy to accept the conclusion that the risk is not sufficient to
justify the upgrade pain, but only if the risk is being correctly
assessed.

 Tom> In particular, I believe that we load .so's without adding their
 Tom> symbols to those globally known by the linker --- at least on
 Tom> platforms where that's possible.

This is false; in the exact reverse of the above, we explicitly
request RTLD_GLOBAL on platforms where it exists.

 Tom> Not to mention that the universe of other .so's we might load is
 Tom> not all that large.  So I think the actual risks posed by
 Tom> contrib/hstore are somewhere between minimal and nonexistent.

The problem extends not only to other loaded .so's, but also to every
library linked into the postmaster itself, every library linked into
another loaded .so, and every .so (and associated libs) dynamically
loaded by another .so (e.g. modules loaded by pls).

(-Bsymbolic (or equivalent) would negate all of these, as far as I can
tell.)

 Tom> The risks don't look to me to be large enough to justify
 Tom> imposing any upgrade pain on users.

OK. I will maintain binary compatibility in my patch.

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

22 March 2009, 08:43:00

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
 > Robert Treat <xzilla@users.sourceforge.net> writes:
 >> We've been talking about this magical "proper module facility" for
 >> a few releases now... are we still opposed to putting contrib
 >> modules in their own schema?

 Tom> I'm hesitant to do that when we don't yet have either a design
 Tom> or a migration plan for the module facility.  We might find we'd
 Tom> shot ourselves in the foot, or at least complicated the
 Tom> migration situation unduly.

I've been thinking about this, and my conclusion is that schemas as
they currently exist are the wrong tool for making modules/packages.

Partly that's based on the relative inflexibility of the search_path
setting; it's hard to modify the search_path without completely
replacing it, so knowledge of the "default" search path ends up being
propagated to a lot of places.

There's a parallel here with operating-system package mechanisms; for
the most part, the more usable / successful packaging systems don't
rely on putting everything in separate directories, instead they have
an out-of-band method for specifying what files belong to what package.

We already have a mechanism we could use for this: pg_depend. If an
"installed package" was a type of object, the functions, types,
operators, or any other kind of object installed by the package could
have dependency links to it; that would (a) make it trivial to drop,
and (b) pg_dump could check for package dependencies and, for objects
depending on a package, emit only a package installation command rather
than the object definition.

(I distinguish an "installed package" from whatever the package
definition might be, since it's possible that a package might want to
provide multiple APIs, for example for different versions, and these
might be installed simultaneously in different schemas.)

-- 
Andrew.
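
[Editorial illustration] The pg_depend idea above can be sketched as a
toy model. Everything here is an assumption made for illustration: the
DependRow struct keeps only two of the catalog's columns, ObjId stands in
for an OID, and 0 is used to mean "no package". It shows the two uses
named in the message: finding everything to drop with a package, and
letting a dump tool decide whether an object should be replaced by a
package installation command.

```c
#include <stddef.h>

typedef unsigned int ObjId;     /* hypothetical stand-in for an OID */

/* Toy dependency row: "objid depends on refobjid". */
typedef struct
{
    ObjId objid;
    ObjId refobjid;
} DependRow;

/* Collect the objects that depend on package pkg; returns how many. */
size_t
objects_of_package(const DependRow *rows, size_t nrows, ObjId pkg,
                   ObjId *out, size_t outcap)
{
    size_t n = 0;

    for (size_t i = 0; i < nrows && n < outcap; i++)
        if (rows[i].refobjid == pkg)
            out[n++] = rows[i].objid;
    return n;
}

/*
 * Return the installed package that obj depends on, or 0 if it depends
 * on none, in which case a dump tool would emit its full definition.
 */
ObjId
package_of(const DependRow *rows, size_t nrows, ObjId obj,
           const ObjId *pkgs, size_t npkgs)
{
    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].objid != obj)
            continue;
        for (size_t j = 0; j < npkgs; j++)
            if (rows[i].refobjid == pkgs[j])
                return pkgs[j];
    }
    return 0;
}
```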

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

22 March 2009, 08:54:23

>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes:
 >> I'm hesitant to do that when we don't yet have either a design or
 >> a migration plan for the module facility.  We might find we'd shot
 >> ourselves in the foot, or at least complicated the migration
 >> situation unduly.

 Robert> I think there have been a few designs proposed, but I think
 Robert> part of the problem is a lack of agreement on the
 Robert> requirements.  "module facility" seems to mean a lot of
 Robert> different things to different people.

Some ideas:

 - want to be able to do  INSTALL PACKAGE foo;  without needing to
   mess with .sql files.  This might default to looking for
   $libdir/foo.so, or there might be a mechanism to register packages
   globally or locally.

 - want to be able to do  INSTALL PACKAGE foo VERSION 1;  and get
   the version 1 API rather than whatever the latest is.

 - want to be able to do  INSTALL PACKAGE foo SCHEMA bar;  rather
   than having to edit some .sql file.

 - want to be able to do  DROP PACKAGE foo;

 - want pg_dump to not output the definitions of any objects that
   belong to a package, but instead to output an INSTALL PACKAGE foo
   VERSION n SCHEMA x;

--
Andrew.

Re: contrib function naming, and upgrade issues

From

Dave Page

Date:

22 March 2009, 10:29:51

On Sun, Mar 22, 2009 at 11:54 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:

>  - want to be able to do  INSTALL PACKAGE foo;  without needing to
>   mess with .sql files.  This might default to looking for
>   $libdir/foo.so, or there might be a mechanism to register packages
>   globally or locally.
>
>  - want to be able to do  INSTALL PACKAGE foo VERSION 1;  and get
>   the version 1 API rather than whatever the latest is.
>
>  - want to be able to do  INSTALL PACKAGE foo SCHEMA bar;  rather
>   than having to edit some .sql file.
>
>  - want to be able to do  DROP PACKAGE foo;
>
>  - want pg_dump to not output the definitions of any objects that
>   belong to a package, but instead to output an INSTALL PACKAGE foo
>   VERSION n SCHEMA x;

I think using PACKAGE is a bad idea as it'll confuse people used to
Oracle. MODULE perhaps?


--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com

Re: contrib function naming, and upgrade issues

From

Alvaro Herrera

Date:

22 March 2009, 10:48:16

Andrew Gierth wrote:

> I've been thinking about this, and my conclusion is that schemas as
> they currently exist are the wrong tool for making modules/packages.

This has been discussed at length previously, and we even had an
incomplete but substantive patch posted.  Did you review that?  Some of
it appears to be in line of what you're proposing here.  If you're
interested in this area, perhaps you could pick up where Tom Dunstan
left off.

See URLs here:
http://wiki.postgresql.org/wiki/Todo#Source_Code
under "Improve the module installation experience (/contrib, etc)"

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: contrib function naming, and upgrade issues

From

Andrew Dunstan

Date:

22 March 2009, 11:43:07


Dave Page wrote:
>
> I think using PACKAGE is a bad idea as it'll confuse people used to
> Oracle. MODULE perhaps?
>
>   


Right. We debated this extensively in the past. Module was the consensus 
name.

cheers

andrew

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

22 March 2009, 17:17:36

Hi,

On 22 March 2009 at 12:42, Andrew Gierth wrote:
> Tom> I'm hesitant to do that when we don't yet have either a design
> Tom> or a migration plan for the module facility.  We might find we'd
> Tom> shot ourselves in the foot, or at least complicated the
> Tom> migration situation unduly.
>
> I've been thinking about this, and my conclusion is that schemas as
> they currently exist are the wrong tool for making modules/packages.

Agreed.
Still, schemas are useful and using them should be encouraged, I think.

> Partly that's based on the relative inflexibility of the search_path
> setting; it's hard to modify the search_path without completely
> replacing it, so knowledge of the "default" search path ends up being
> propagated to a lot of places.

pg_catalog is implicit in the search_path, what about having user
schemas with the implicit capability too?

Then you have the problem of ordering more than one implicit schema;
the easy solution is to solve that the same way we solve trigger
ordering: alphabetically. Now, that could mean ugly user-facing
schema names: we already know we need synonyms, don't we?

> There's a parallel here with operating-system package mechanisms; for
> the most part, the more usable / successful packaging systems don't
> rely on putting everything in separate directories, instead they have
> an out-of-band method for specifying what files belong to what
> package.
>
> We already have a mechanism we could use for this: pg_depend. If an
> "installed package" was a type of object, the functions, types,
> operators, or any other kind of object installed by the package could
> have dependency links to it; that would (a) make it trivial to drop,
> and (b) pg_dump could check for package dependencies and, for objects
> depending on a package, emit only a package installation command
> rather
> than the object definition.

Here's a sketch of what I came up with:  http://wiki.postgresql.org/wiki/ExtensionPackaging

It still needs some work before being a solid proposal, for
example handling cases where you want to pg_restore a database and
insist on *not* caring about some extensions (pgq, londiste, slony
things, cron restoring into pre-live systems), or working out
versioning information and dependencies between modules.
What it misses the most is hackers' acceptance of the proposed
concepts, though.

> (I distinguish an "installed package" from whatever the package
> definition might be, since it's possible that a package might want to
> provide multiple APIs, for example for different versions, and these
> might be installed simultaneously in different schemas.)

Version tracking is yet to be designed in the document.
--
dim

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

22 March 2009, 17:27:36

Hi,

Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :)

On 22 March 2009 at 14:29, Dave Page wrote:
>>  - want to be able to do  INSTALL PACKAGE foo;  without needing to
>>   mess with .sql files.  This might default to looking for
>>   $libdir/foo.so, or there might be a mechanism to register packages
>>   globally or locally.

Part of the proposal.

>>  - want to be able to do  INSTALL PACKAGE foo VERSION 1;  and get
>>   the version 1 API rather than whatever the latest is.

To be added to the proposal.

>>  - want to be able to do  INSTALL PACKAGE foo SCHEMA bar;  rather
>>   than having to edit some .sql file.

Part of the proposal (install time variables/options/parameters).

>>  - want to be able to do  DROP PACKAGE foo;

Part of the proposal.

>>  - want pg_dump to not output the definitions of any objects that
>>   belong to a package, but instead to output an INSTALL PACKAGE foo
>>   VERSION n SCHEMA x;

Part of the proposal.

> I think using PACKAGE is a bad idea as it'll confuse people used to
> Oracle. MODULE perhaps?

Using package would tie us into supporting oracle syntax, which nobody
actually wants, it seems. Or at least we have to reserve the keyword
for meaning "oracle compliant facility".

Module on the other hand is already used in PostgreSQL to refer to the
dynamic lib you get when installing C coded extensions (.so or .dll),
what we miss here is a way to refer to them in pure SQL, have their
existence cared about in the catalogs. That's the part Tom Dunstan
worked on IIRC.
He also worked out some OS level tools for module handling, but I
think I'd prefer to have another notion in between, the extension.

The extension would be a new SQL object referring to zero, one or more
modules and one or more SQL scripts creating new SQL objects (schemas,
tables, views, tablespaces, functions, types, casts, operator classes
and families, etc, whatever SQL scripting we support now --- yes,
index am would be great too). Those would depend (pg_depend) on the
package SQL object. I don't think we need to be able to nest a package
creation inside the package SQL scripts, but hey, why not.

So my vote is for us to talk about modules (.so) and extensions (the
packaging and distribution of them). And this term isn't even new in
PostgreSQL glossary ;)

Regards,
--
dim

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

22 March 2009, 18:05:51

Dimitri Fontaine <dfontaine@hi-media.com> writes:
> He also worked out some OS level tools for module handling, but I  
> think I'd prefer to have another notion in between, the extension.

> The extension would be a new SQL object referring to zero, one or more  
> modules and one or more SQL scripts creating new SQL objects (schemas,  
> tables, views, tablespaces, functions, types, casts, operator classes  
> and families, etc, whatever SQL scripting we support now --- yes,  
> index am would be great too).

This seems drastically overengineered.  What do we need two levels of
objects for?
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

22 March 2009, 18:27:20

On 22 March 2009 at 22:05, Tom Lane wrote:
> This seems drastically overengineered.  What do we need two levels of
> objects for?

We need to be able to refer (pg_depend) to (system level) modules.
Any given extension may depend on more than one module.

What did I overlook?
--
dim

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

22 March 2009, 18:47:11

Dimitri Fontaine <dfontaine@hi-media.com> writes:
> On 22 March 2009 at 22:05, Tom Lane wrote:
>> This seems drastically overengineered.  What do we need two levels of
>> objects for?

> We need to be able to refer (pg_depend) to (system level) modules.
> Any given extension may depend on more than one module.

You really haven't convinced me that this is anything but
overcomplication.  There might (or might not) be some use-case
for being able to declare that module A depends on module B,
but that doesn't mean we need a second layer of grouping.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Guillaume Smet

Date:

22 March 2009, 19:15:35

On Sun, Mar 22, 2009 at 10:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There might (or might not) be some use-case
> for being able to declare that module A depends on module B,

Typically, earthdistance requires cube so the module dependency is
already something that might be useful. But as you said, it doesn't
require a second level of grouping, just a way to define dependencies.

-- 
Guillaume

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

22 March 2009, 22:40:39

On Sun, Mar 22, 2009 at 7:54 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes:
>
>  >> I'm hesitant to do that when we don't yet have either a design or
>  >> a migration plan for the module facility.  We might find we'd shot
>  >> ourselves in the foot, or at least complicated the migration
>  >> situation unduly.
>
>  Robert> I think there have been a few designs proposed, but I think
>  Robert> part of the problem is a lack of agreement on the
>  Robert> requirements.  "module facility" seems to mean a lot of
>  Robert> different things to different people.
>
> Some ideas:
>
>  - want to be able to do  INSTALL PACKAGE foo;  without needing to
>   mess with .sql files.  This might default to looking for
>   $libdir/foo.so, or there might be a mechanism to register packages
>   globally or locally.
>
>  - want to be able to do  INSTALL PACKAGE foo VERSION 1;  and get
>   the version 1 API rather than whatever the latest is.
>
>  - want to be able to do  INSTALL PACKAGE foo SCHEMA bar;  rather
>   than having to edit some .sql file.
>
>  - want to be able to do  DROP PACKAGE foo;
>
>  - want pg_dump to not output the definitions of any objects that
>   belong to a package, but instead to output an INSTALL PACKAGE foo
>   VERSION n SCHEMA x;

This seems about right to me.  I think the key to getting this done is
to keep the design as simple as possible and to avoid entanglements
with other features that may need to be designed independently and
first. I think there's a good argument to be made that package
management could benefit from the notion of a variable.  For example,
you might want to write a SQL script or PL/pgsql procedure where
?{version}, or some equally inscrutable glyph, refers to the version
specified in the INSTALL MODULE command.

I'm deeply skeptical about this approach.  Either variables are useful
in PL/pgsql - as I tend to believe - or they aren't - as I'm sure can
be argued.  If they're useful, though, they are probably useful in
many contexts other than package management.  So I would suggest that
either a concerted effort needs to be made to design and implement a
useful variable facility (and then we can use it for package
management, too) or package management needs to be made to work without
variables (and then if we eventually add them in general we can use
them for package management, too).  On that basis, I'm inclined to
suggest that the SCHEMA and VERSION clauses you've proposed here
should be dropped for the first version of this, because I think it
will be very, very difficult to implement them without variables.

We also, I think, need to try very hard to avoid getting sucked into
creating a CPAN-like system for installing modules *on the machine*.
We need to focus on how the modules get sucked into PostgreSQL once
the OS-level packaging system (RPM, deb, whatever), or the system
administrator, have gotten the files installed in some suitable place
on the local host, and we now want to make PostgreSQL know about and
use them.  It might be nice to have a system that does the whole
thing, soup to nuts, but again, that's something that can be added
later and used by only those that want it.

So taking into account suggestions elsewhere on this thread, I suggest
"INSTALL MODULE foo" and "DROP MODULE foo".  It's pretty clear what
DROP MODULE foo should do, but the semantics of INSTALL MODULE foo are
a bit less clear.  I suspect that it's going to boil down to running a
SQL script, which will need to somehow get that module installed.  To
make that work, I think we need "CREATE MODULE foo" and then "CREATE
<TABLE|VIEW|FUNCTION|...> ... MODULE foo".  So the SQL script will
create the module and then create all of the objects and make them
depend on the module using the optional "MODULE foo" clause.

...Robert

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

22 March 2009, 23:25:47

Robert Haas <robertmhaas@gmail.com> writes:
> ...  I suspect that it's going to boil down to running a
> SQL script, which will need to somehow get that module installed.  To
> make that work, I think we need "CREATE MODULE foo" and then "CREATE
> <TABLE|VIEW|FUNCTION|...> ... MODULE foo".  So the SQL script will
> create the module and then create all of the objects and make them
> depend on the module using the optional "MODULE foo" clause.

I doubt that we want to decorate every CREATE statement we've got with
an optional MODULE clause; to name just one objection, it'd probably
be impossible to do so without making MODULE a fully reserved word.

What was discussed in the last go-round was some sort of state-dependent
assignment of a module context.  You could imagine either

       BEGIN MODULE modname;

       CREATE this;
       CREATE that;
       CREATE the_other;

       END MODULE;

or something along the lines of

       SET current_module = modname;

       CREATE this;
       CREATE that;
       CREATE the_other;

       SET current_module = null;

which is really more or less the same thing except that it makes the
state concrete in the form of an examinable variable.  In either case
you'd need to define how the state would interact with transactions
and errors.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

22 March 2009, 23:40:37

On Sun, Mar 22, 2009 at 10:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ...  I suspect that it's going to boil down to running a
>> SQL script, which will need to somehow get that module installed.  To
>> make that work, I think we need "CREATE MODULE foo" and then "CREATE
>> <TABLE|VIEW|FUNCTION|...> ... MODULE foo".  So the SQL script will
>> create the module and then create all of the objects and make them
>> depend on the module using the optional "MODULE foo" clause.
>
> I doubt that we want to decorate every CREATE statement we've got with
> an optional MODULE clause; to name just one objection, it'd probably
> be impossible to do so without making MODULE a fully reserved word.
>
> What was discussed in the last go-round was some sort of state-dependent
> assignment of a module context.  You could imagine either
>
>        BEGIN MODULE modname;
>
>        CREATE this;
>        CREATE that;
>        CREATE the_other;
>
>        END MODULE;
>
> or something along the lines of
>
>        SET current_module = modname;
>
>        CREATE this;
>        CREATE that;
>        CREATE the_other;
>
>        SET current_module = null;
>
> which is really more or less the same thing except that it makes the
> state concrete in the form of an examinable variable.  In either case
> you'd need to define how the state would interact with transactions
> and errors.

I thought about that, but wasn't sure if people would like it, since
it seems a little un-SQL-ish.  But I'm fine with it, and it has the
additional advantage that it avoids the need to recapitulate the
module name many times.  If there's no semantic problem with making
current_module be a GUC, the SET syntax seems very tempting, since it
avoids the need to make up something new and different.

...Robert

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

22 March 2009, 23:57:57

>>>>> "Alvaro" == Alvaro Herrera <alvherre@commandprompt.com> writes:
 >> I've been thinking about this, and my conclusion is that schemas
 >> as they currently exist are the wrong tool for making
 >> modules/packages.

 Alvaro> This has been discussed at length previously, and we even had
 Alvaro> an incomplete but substantive patch posted.  Did you review
 Alvaro> that?  Some of it appears to be in line of what you're
 Alvaro> proposing here.  If you're interested in this area, perhaps
 Alvaro> you could pick up where Tom Dunstan left off.

Yes, that's close to what I had in mind.

One difference is that I would be inclined to punt more of the
installation logic into the module itself. If "INSTALL MODULE foo"
worked by calling a specially-declared function in foo.so (if
present), it would give the module more flexibility in terms of what
to install based on the version number requested, etc.; some helper
functions could be provided so that the simpler cases require only a
few lines of code.

Modules not implemented as .so files would have a bit less flexibility
thanks to the fact that we don't have any procedural languages
installed by default; how to do versioning for them would require a
bit more thought. (Maybe have a defaultmodule.so to do the work for
them?)

I will consider working on this at some point.

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

23 March 2009, 00:04:33

>>>>> "Dimitri" == Dimitri Fontaine <dfontaine@hi-media.com> writes:
 >> Partly that's based on the relative inflexibility of the
 >> search_path setting; it's hard to modify the search_path without
 >> completely replacing it, so knowledge of the "default" search path
 >> ends up being propagated to a lot of places.

 Dimitri> pg_catalog is implicit in the search_path, what about having
 Dimitri> user schemas with the implicit capability too?

 Dimitri> Then you have the problem of ordering more than one implicit
 Dimitri> schemas,

This is a hint that it's really a bad idea.

Instead, what I'd suggest is breaking up search_path into multiple
variables - maybe pre_search_path, search_path, and
post_search_path.
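
For reference, a minimal sketch of the workaround available today, which
shows why the single variable is awkward: the only way to add a schema is
to read the whole setting back and write all of it again. The schema name
my_extension is hypothetical; the sketch assumes that schema exists and
that the current search_path is non-empty.

```sql
-- Append a schema without hard-coding the rest of the path.
-- set_config(name, value, is_local): false keeps the change for the session.
SELECT set_config('search_path',
                  current_setting('search_path') || ', my_extension',
                  false);

-- Inspect the result; the user's original entries are still in front.
SHOW search_path;
```

With separate pre/post variables, an extension could presumably set only
its own piece and leave the user's search_path untouched.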

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

23 March 2009, 00:06:25

>>>>> "Dimitri" == Dimitri Fontaine <dfontaine@hi-media.com> writes:
 Dimitri> Hi,
 Dimitri> Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :)

Yes, I left a short note on its discussion page a while ago :-)

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

23 March 2009, 00:11:20

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
 Tom> I doubt that we want to decorate every CREATE statement we've
 Tom> got with an optional MODULE clause; to name just one objection,
 Tom> it'd probably be impossible to do so without making MODULE a
 Tom> fully reserved word.

 Tom> What was discussed in the last go-round was some sort of
 Tom> state-dependent assignment of a module context.  You could
 Tom> imagine either
 [snip]

 Tom> or something along the lines of

 Tom>     SET current_module = modname;

 Tom>     CREATE this;
 Tom>     CREATE that;
 Tom>     CREATE the_other;

 Tom>     SET current_module = null;

 Tom> which is really more or less the same thing except that it makes
 Tom> the state concrete in the form of an examinable variable.  In
 Tom> either case you'd need to define how the state would interact
 Tom> with transactions and errors.

I like the SET version better. As for transactions and errors, I think
that installing a module should be done inside a transaction anyway;
and the usual GUC mechanisms should handle it if it was done using
SET LOCAL, no?
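
A minimal sketch of the scoping behaviour being relied on here. Since
current_module does not exist, the real search_path GUC stands in for it;
the only point is how SET LOCAL interacts with the end of a transaction.

```sql
-- Stand-in for the hypothetical current_module variable: search_path is
-- a real GUC, so the scoping rules can be observed directly.
SHOW search_path;                    -- session value before the transaction

BEGIN;
SET LOCAL search_path = pg_catalog;  -- in effect only inside this transaction
SHOW search_path;                    -- now reports pg_catalog
-- ... the module's CREATE statements would go here ...
ROLLBACK;                            -- an error followed by ROLLBACK ends up here too

SHOW search_path;                    -- session value again; COMMIT would restore it as well
```

So a failed install would leave neither half-created objects nor a stale
module context behind, provided the whole script runs in one transaction.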

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Greg Stark

Date:

23 March 2009, 00:25:57

Why do you need any explicit syntax? If the database is loading an SQL  
file as a result of a LOAD MODULE command wouldn't it know to set  
whatever internal state it needs to remember that?



-- 
Greg


On 22 Mar 2009, at 23:11, Andrew Gierth <andrew@tao11.riddles.org.uk>  
wrote:

>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> Tom> I doubt that we want to decorate every CREATE statement we've
> Tom> got with an optional MODULE clause; to name just one objection,
> Tom> it'd probably be impossible to do so without making MODULE a
> Tom> fully reserved word.
>
> Tom> What was discussed in the last go-round was some sort of
> Tom> state-dependent assignment of a module context.  You could
> Tom> imagine either
> [snip]
>
> Tom> or something along the lines of
>
> Tom>    SET current_module = modname;
>
> Tom>    CREATE this;
> Tom>    CREATE that;
> Tom>    CREATE the_other;
>
> Tom>    SET current_module = null;
>
> Tom> which is really more or less the same thing except that it makes
> Tom> the state concrete in the form of an examinable variable.  In
> Tom> either case you'd need to define how the state would interact
> Tom> with transactions and errors.
>
> I like the SET version better. As for transactions and errors, I think
> that installing a module should be done inside a transaction anyway;
> and the usual GUC mechanisms should handle it if it was done using
> SET LOCAL, no?
>
> -- 
> Andrew.
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

23 March 2009, 05:45:13

On Sunday 22 March 2009 22:46:20 Tom Lane wrote:
> You really haven't convinced me that this is anything but
> overcomplication.

Thinking about it some more what could be convincing is that an extension
could be made of only SQL, with no module (.so) (I have a case here).

If a single .sql file can be seen as an extension, I'd want to avoid naming it
the same as the .so file itself. Having the term "module" refer either to a
single .so (or .dll), or a .so with an accompanying .sql file to install it, or
even just the SQL file... would add confusion, methinks.

If there's not enough confusion here to warrant separating what we call a module
from what we call an extension, then I'll go edit my proposal :)

> There might (or might not) be some use-case
> for being able to declare that module A depends on module B,
> but that doesn't mean we need a second layer of grouping.

Agreed, this reason is not a good one for splitting module and extension.
--
dim

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

23 March 2009, 05:54:51

On Monday 23 March 2009 04:05:04 Andrew Gierth wrote:
>  Dimitri> Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ?
> Yes, I left a short note on its discussion page a while ago :-)

Hehe... I'll answer here, as it seems to be a more open forum...

Schemas vs Extensions (or modules, we'll see): yes, they are orthogonal
concepts, but still, extensions should not pollute the public namespace, I
(and some others) think.

So we're encouraging extension authors to put the extension's objects in
their own schema, with the drawback that users would have to remember about
it and manage it along with their own schemas, which causes search_path issues.

I think your idea of splitting search_path into several components would help
a lot here.
--
dim

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

23 March 2009, 08:34:39

On Sun, Mar 22, 2009 at 11:26 PM, Greg Stark
<greg.stark@enterprisedb.com> wrote:
> Why do you need any explicit syntax? If the database is loading an SQL file
> as a result of a LOAD MODULE command wouldn't it know to set whatever
> internal state it needs to remember that?

That might not be the only time you ever want to create dependencies
on the module object.  What if the module wants to create an
additional table, view, etc. at some later time, following the load?
I'm not sure whether there's a use case for that, but it doesn't seem
totally implausible.

...Robert

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

23 March 2009, 08:50:18

On Monday 23 March 2009 12:34:31 Robert Haas wrote:
> That might not be the only time you ever want to create dependencies
> on the module object.  What if the module wants to create an
> additional table, view, etc. at some later time, following the load?
> I'm not sure whether there's a use case for that, but it doesn't seem
> totally implausible.

Then there's Tom's idea of SET module TO ...; to have the context handy, or a
WIP syntax in http://wiki.postgresql.org/wiki/ExtensionPackaging
 CREATE OR REPLACE EXTENSION foo ... AS $$ $$;

So you could REPLACE an existing extension and add whatever you need to.
--
dim

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

23 March 2009, 11:43:56

Dimitri Fontaine <dfontaine@hi-media.com> writes:
> On Sunday 22 March 2009 22:46:20 Tom Lane wrote:
>> You really haven't convinced me that this is anything but
>> overcomplication.

> Thinking about it some more what could be convincing is that an extension 
> could be made of only SQL, with no module (.so) (I have a case here).

> If a single .sql file can be seen as an extension, I'd want to avoid naming it 
> the same as the .so file itself. Having the term "module" refer either to a 
> single .so (or .dll), or a .so with an accompanying .sql file to install it, or 
> even just the SQL file... would add confusion, methinks.

I think the way most people are envisioning this is that a module is a
set of SQL objects (functions, types, tables, whatever).  Whether any
of those are C functions in one or more underlying .so files is not
really particularly relevant to the module mechanism.

It should be possible to have a module that doesn't contain any C code,
so the concept of a defining function does not look good to me.  A
defining SQL script is the way to go.

The only way that the underlying .so file(s) become relevant is if you
are trying to make this a *packaging* mechanism that can actually
deliver and install the set of files required to implement a module.
I don't think that's a good idea; not least because systems tend to
already have their own packaging mechanisms, and we don't need to invent
another.  I think "module" should just be a SQL-level concept and not be
concerned with how the files it needs arrive where they're needed.
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

23 March 2009, 14:47:30

On Monday 23 March 2009 15:43:04 Tom Lane wrote:
> I think the way most people are envisioning this is that a module is a
> set of SQL objects (functions, types, tables, whatever).  Whether any
> of those are C functions in one or more underlying .so files is not
> really particularly relevant to the module mechanism.

Fine, that's what I wanted to call an extension in order not to change the
meaning of module. I'll edit the proposal on the wiki later on tonight.

> It should be possible to have a module that doesn't contain any C code,
> so the concept of a defining function does not look good to me.  A
> defining SQL script is the way to go.

Agreed here.
I added some special SQL syntax to my proposal so that the module author
can provide some advanced notions (dependencies, version, etc.).

I still think that wrapping the custom SQL in this special syntax has advantages,
namely that it helps with both altering a module and handling module
variables.

Module variables are needed by e.g. pljava for its classpath setting, which is
meant to change depending on the caller, from what I've been told.
 ALTER MODULE pljava SET classpath = 'some value here';

Of course, as hinted by Peter, the variables here are not GUCs.

> The only way that the underlying .so file(s) become relevant is if you
> are trying to make this a *packaging* mechanism that can actually
> deliver and install the set of files required to implement a module.

What I'm proposing in the WIP wiki page is source-based packaging built on
PGXS (just some glue around it to fetch the right tarball from the command
line without bothering, then run make and make install; this can come much
later).
Binary packaging could then be made to work by packagers, based on this.

What I like about this optional tool is the fact that the -core distribution
could then publish extra contribs in a central trusted location, such as
http://modules.postgresql.org/.
Source based only distribution there, hassle-free, allowing -core to stamp
e.g. plproxy as a trusted module for PostgreSQL. Minor releases policy would
have to be talked about, of course.

> I don't think that's a good idea; not least because systems tend to
> already have their own packaging mechanisms, and we don't need to invent
> another.  I think "module" should just be a SQL-level concept and not be
> concerned with how the files it needs arrive where they're needed.

Well, maybe just complaining at module "creation" time (that's when you run
the SQL script possibly containing CREATE OR REPLACE MODULE ... $$ <sql> $$)
would be enough as far as .so dependency is concerned.
The error message would of course come from the first CREATE FUNCTION ...
LANGUAGE C referring to the nonexistent file, which would trigger a rollback.
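
A sketch of how that would play out, assuming a hypothetical module foo
whose foo.so was never installed on the host. The function name
foo_version is an assumption made up for illustration.

```sql
BEGIN;

-- CREATE FUNCTION ... LANGUAGE C checks the library at creation time, so
-- this statement fails with an error about the missing file if foo.so is
-- not in $libdir.
CREATE FUNCTION foo_version() RETURNS text
    AS '$libdir/foo', 'foo_version'
    LANGUAGE C STRICT;

-- Everything after the failure is rejected until the transaction ends,
-- so none of the module's other objects get created.
CREATE TABLE foo_settings (name text PRIMARY KEY, value text);

COMMIT;  -- reports ROLLBACK, because the transaction was already aborted
```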

Is that roughly what you have in mind?
--
Dimitri Fontaine
Architecte DBA PostgreSQL

Re: contrib function naming, and upgrade issues

From

Robert Haas

Date:

23 March 2009, 16:36:22

On Mon, Mar 23, 2009 at 7:46 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> On Monday 23 March 2009 12:34:31 Robert Haas wrote:
>> That might not be the only time you ever want to create dependencies
>> on the module object.  What if the module wants to create an
>> additional table, view, etc. at some later time, following the load?
>> I'm not sure whether there's a use case for that, but it doesn't seem
>> totally implausible.
>
> Then there's Tom's idea of SET module TO ...; to have the context handy, or a
> WIP syntax in http://wiki.postgresql.org/wiki/ExtensionPackaging
>
>  CREATE OR REPLACE EXTENSION foo ...
>  AS $$
>  $$;
>
> So you could REPLACE an existing extension and add whatever you need to.

I think SET module_context = 'whatever' is the right idea.  CREATE OR
REPLACE MODULE is not going to work.  Suppose that when we originally
install the extension we do:

CREATE TABLE some_table (id integer not null, foo text not null,
primary key (id));

...later when we try to do CREATE OR REPLACE the definition has changed to:

CREATE TABLE some_table (id integer not null, bar text not null, baz
text not null, primary key (id));

It may well be that the table has data in it that was inserted after
module creation time, and the user may want it preserved with the
upgrade, but there's really no way to even begin to guess what the
user had in mind here.
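
To make the ambiguity concrete, here are two hand-written upgrades that
both arrive at the new definition but treat the existing rows very
differently. They are alternatives, not a sequence, and the empty-string
default is only an assumption needed to satisfy NOT NULL on existing rows.

```sql
-- Reading 1: foo was renamed to bar, so its data is kept; baz is new.
BEGIN;
ALTER TABLE some_table RENAME COLUMN foo TO bar;
ALTER TABLE some_table ADD COLUMN baz text NOT NULL DEFAULT '';
ALTER TABLE some_table ALTER COLUMN baz DROP DEFAULT;
COMMIT;

-- Reading 2: foo is obsolete and its data is discarded; bar and baz are new.
BEGIN;
ALTER TABLE some_table DROP COLUMN foo;
ALTER TABLE some_table ADD COLUMN bar text NOT NULL DEFAULT '';
ALTER TABLE some_table ADD COLUMN baz text NOT NULL DEFAULT '';
ALTER TABLE some_table ALTER COLUMN bar DROP DEFAULT;
ALTER TABLE some_table ALTER COLUMN baz DROP DEFAULT;
COMMIT;
```

Nothing in the two CREATE TABLE statements says which of these was meant.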

The CREATE OR REPLACE idea doesn't have very clean semantics even with
functions, which are probably the primary use case for this mechanism.
If I replace a module, and the new definition doesn't define one of
the functions in the original definition, does that amount to an
implicit drop of that function?  If the module contains a CREATE
FUNCTION command, does using CREATE OR REPLACE on the module
effectively turn CREATE FUNCTION into CREATE OR REPLACE FUNCTION?
Nobody is going to like these semantics, I think, and it gets far
uglier when you start looking at tables, views, etc.

(It's also worth noting, as an independent point, that I suspect SET
module_context = 'whatever' will be easier to implement.)

...Robert

Re: contrib function naming, and upgrade issues

From

Dimitri Fontaine

Date:

23 March 2009, 17:13:07

Le 23 mars 09 à 20:33, Robert Haas a écrit :
> It may well be that the table has data in it that was inserted after
> module creation time, and the user may want it preserved with the
> upgrade, but there's really no way to even begin to guess what the
> user had in mind here.

Exactly, we're not in the business of second guessing our users. So we
have versioning information built into the facility, and we should
provide a way to tell from which version we're upgrading if that's the
case.

Then the module authors would be able to do things depending on the
value of the OLD.version (to reuse existing notations and concepts) or
something. That means supporting conditionals, and that sounds like
it's not in the TODO for the first implementation. But still, I don't
see how you manage to give the module authors a nice upgrade facility
without something like this.

Regards,
--
dim

Re: contrib function naming, and upgrade issues

From

Andrew Gierth

Date:

23 March 2009, 17:34:45

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
 Tom> I think the way most people are envisioning this is that a
 Tom> module is a set of SQL objects (functions, types, tables,
 Tom> whatever).  Whether any of those are C functions in one or more
 Tom> underlying .so files is not really particularly relevant to the
 Tom> module mechanism.

 Tom> It should be possible to have a module that doesn't contain any
 Tom> C code,

Yes.

 Tom> so the concept of a defining function does not look good to me.
 Tom> A defining SQL script is the way to go.

But I disagree with this, for the simple reason that we don't have
anything like enough flexibility in the form of conditional DDL or
error handling, when working in pure SQL without any procedural help.
This is especially true when you start to look at how to handle
conflicts, upgrades and versioning.

-- 
Andrew.

Re: contrib function naming, and upgrade issues

From

Tom Lane

Date:

23 March 2009, 20:11:19

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> A defining SQL script is the way to go.

> But I disagree with this, for the simple reason that we don't have
> anything like enough flexibility in the form of conditional DDL or
> error handling, when working in pure SQL without any procedural help.

So?  You can have the script create, execute, and remove a function,
and thereby perform any operation that a function could possibly
perform for you.
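
A minimal sketch of that pattern, assuming PL/pgSQL has already been
installed in the database (it is not there by default). The helper name
module_install_helper and the table it creates are hypothetical.

```sql
BEGIN;

-- Throwaway helper that carries the conditional logic plain SQL lacks.
CREATE FUNCTION module_install_helper() RETURNS void AS $$
BEGIN
    -- Conditional DDL: create the table only if it is not there yet.
    IF NOT EXISTS (SELECT 1
                   FROM pg_catalog.pg_tables
                   WHERE schemaname = 'public'
                     AND tablename = 'some_table') THEN
        CREATE TABLE public.some_table (
            id  integer NOT NULL PRIMARY KEY,
            foo text    NOT NULL
        );
    END IF;
END;
$$ LANGUAGE plpgsql;

SELECT module_install_helper();   -- execute it once
DROP FUNCTION module_install_helper();  -- and remove it again

COMMIT;
```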
        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Pavel Stehule

Date:

24 March 2009, 05:25:10

Hello

Attention: MODULE is an ANSI SQL keyword, and modules are a construct from ANSI SQL.

<SQL-server module definition> ::=
    CREATE MODULE <SQL-server module name>
    [ <SQL-server module character set specification> ]
    [ <SQL-server module schema clause> ]
    [ <SQL-server module path specification> ]
    [ <temporary table declaration>... ]
    <SQL-server module contents>...
    END MODULE

<SQL-server module character set specification> ::=
    NAMES ARE <character set specification>

<SQL-server module schema clause> ::=
    SCHEMA <default schema name>

<default schema name> ::=
    <schema name>

<SQL-server module path specification> ::=
    <path specification>

<SQL-server module contents> ::=
    <SQL-invoked routine> <semicolon>

Regards
Pavel Stehule

2009/3/23 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ...  I suspect that it's going to boil down to running a
>> SQL script, which will need to somehow get that module installed.  To
>> make that work, I think we need "CREATE MODULE foo" and then "CREATE
>> <TABLE|VIEW|FUNCTION|...> ... MODULE foo".  So the SQL script will
>> create the module and then create all of the objects and make them
>> depend on the module using the optional "MODULE foo" clause.
>
> I doubt that we want to decorate every CREATE statement we've got with
> an optional MODULE clause; to name just one objection, it'd probably
> be impossible to do so without making MODULE a fully reserved word.
>
> What was discussed in the last go-round was some sort of state-dependent
> assignment of a module context.  You could imagine either
>
>        BEGIN MODULE modname;
>
>        CREATE this;
>        CREATE that;
>        CREATE the_other;
>
>        END MODULE;
>
> or something along the lines of
>
>        SET current_module = modname;
>
>        CREATE this;
>        CREATE that;
>        CREATE the_other;
>
>        SET current_module = null;
>
> which is really more or less the same thing except that it makes the
> state concrete in the form of an examinable variable.  In either case
> you'd need to define how the state would interact with transactions
> and errors.
>
>                        regards, tom lane

Re: contrib function naming, and upgrade issues

From

Peter Eisentraut

Date:

25 March 2009, 09:13:03

Robert Haas wrote:
> I think the key to getting this done is
> to keep the design as simple as possible and to avoid entanglements
> with other features that may need to be designed independently and
> first.

I think the key to getting this done is to define project purpose and 
requirements before doing anything else.