Thread: [rfc] new CREATE FUNCTION (and more)

[rfc] new CREATE FUNCTION (and more)

From
Marko Kreen
Date:
Situation: 7.1 has a new backend interface for loadable modules. (but it supports old 7.0 interface too)

Now the problem: How do you tell the backend with what interface to use for particular function?  At the moment 7.1-cvs
uses'newC' instead of 'C' in the LANGUAGE part.
 

But this not good:

1) the 'newC' will be quite silly after couple of years, when it  will be the standard.

2) there is another change in the horizon, which would be the  automatic detection of function parameters, or rather
the shared object should provide info about it.
 

3) It has nothing to do with 'C'.  The loadable modules can be  programmed in any language, as long it supports C
calling conventions.
 

4) And IMHO "LANGUAGE 'C'" is a hack, LANGUAGE construct should be  used only for actual definitions.  Now should we
extendone hack  with another hack?
 


Requirement: 7.1 should understand the 7.0 syntax, 7.2 should understand 7.1 and 7.0 syntax.  That means the
dump/restoreshould work between versions.  Whether 7.2 has the 'oldC' handler is another matter, but it should not load
itwith wrong defaults.
 

I propose new command:
 CREATE FUNCTION name( [ftype [, ...] ] ) RETURNS rtypeFROM [ LIBRARY ] obj_file AS link_sym[ WITH [ TYPE = ( 0 | 1 |
...) ]       [[,] ATTRIBUTE = ( attr [, ...] ) ] ]
 
   This mostly like the current "CREATE FUNCTION .. LANGUAGE 'C'".   Main difference is that the TYPE=0 means the old
'C'interface   and TYPE=1 means 'newC' interface.  Default is 1.  (As said,   7.1 supports the old LANGUAGE 'C'
variant,so I think it is   not needed the default to be 0.)
 

 CREATE FUNCTION ... AS defn ... LANGUAGE 'C' ..
   means 7.0 oldC/Informix interface.  No new languages will   come in this way.  (I mean those where the defn is
actually  objname, symbol pair.)      This only is for compatibility.  The ".. LANGUAGE .." should   be only used for
theactual definitions.
 

Alternative:
 newC will be created as:
   CREATE FUNCTION .. LANGUAGE 'C' WITH (pg_params)
 default is old_params, 7.1 pg_dump dumps newC with "(pg_params)". But as I said this is a hack.


Now some future ideas.  I really think that something like that
should come into PostgreSQL eventually.

 LOAD MODULE name FROM [LIBRARY] foomodule.so
   The lib has a struct (e.g.) pg_module_<name>_info which defines   init/remove functions, functions, operators and
types. PostgreSQL   registers module somehow, and when the module gets DROPped then   PostgreSQL calls its remove
funtionsand removes all stuff it has   itself registered.
 
 LOAD FUNCTION name FROM [LIBRARY] foo.so
   This means that in the object file there is defined struct   (e.g.) pg_function_<name>_info. (Probably by help of
macros).
{ I am not sure if the following is needed, better they go through  the LOAD MODULE? }
 LOAD TYPE name FROM [LIBRARY] foo.so
   Module has struct (e.g.) pg_type_<name>_info.
 LOAD OPERATOR name FROM [LIBRARY] foo.so AS obj_name
   Module has struct (e.g.) pg_operator_<obj_name>_info

Random notes:

* why struct not some init funtion? ->* it will be easier to function/module programmer.* avoids duplicate code* makes
possibledifferent interfaces.* main backend can detect incompatible interface
 

* I am not knowledgeable in dump/restore problems.  Someone who is should comment on this what features are else
needed.

* The *.so finding should accept some search paths (LD_LIBRARY_PATH?) (Does it now?)

* In future maybe some currently 'core' parts can be separated into 'core modules' e.g. all geometric stuff.  So they
canbe loaded only as needed.
 

* There was a previous discussion on modules:
 Mark Hollomon's idea:
http://www.postgresql.org/mhonarc/pgsql-hackers/1999-06/msg00959.html
 Jan Wieck objections:
http://www.postgresql.org/mhonarc/pgsql-hackers/1999-06/msg00983.html
 IMHO the objections are not very strong but sure the modules interface needs lot of work.



-- 
marko



Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Marko Kreen <marko@l-t.ee> writes:
>     This mostly like the current "CREATE FUNCTION .. LANGUAGE 'C'".
>     Main difference is that the TYPE=0 means the old 'C' interface
>     and TYPE=1 means 'newC' interface.  Default is 1.

This improves matters how, exactly?  As far as I can see, this just
replaces a readable construct with magic numbers, for a net loss in
readability and no change in functionality.

I don't have any great love for the names 'C' and 'newC' either, but
unless we are willing to break backward-compatibility of function
declarations in 7.1, I think we are stuck with those names or ones
isomorphic to them.

In the long run, it seems that it'd be a good idea to embed function
declaration info straight into a loadable module, per Philip's idea
of a special function or your idea of a table.  However that does not
change the issue of names for function-call conventions in the least,
it merely avoids the problem of keeping a script of SQL declarations
in sync with the library file.  (One brain-dead-simple definition of
the info function or table is that it returns/contains a text string
that's exactly the SQL commands needed to create the function
definitions, except we could allow them to omit the pathname
of the library file.  We can probably do better than that, but as
far as raw functionality goes, that will accomplish everything that
a fancier-looking API would do.)
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Bruce Momjian
Date:
> Marko Kreen <marko@l-t.ee> writes:
> >     This mostly like the current "CREATE FUNCTION .. LANGUAGE 'C'".
> >     Main difference is that the TYPE=0 means the old 'C' interface
> >     and TYPE=1 means 'newC' interface.  Default is 1.
> 
> This improves matters how, exactly?  As far as I can see, this just
> replaces a readable construct with magic numbers, for a net loss in
> readability and no change in functionality.
> 
> I don't have any great love for the names 'C' and 'newC' either, but
> unless we are willing to break backward-compatibility of function
> declarations in 7.1, I think we are stuck with those names or ones
> isomorphic to them.

I am recommending C70 for old functions, and C for current-style
functions.  That way, we can implement C71 if we want for backward
compatibility.  I think making everyone use newC for the current style
is going to be confusing.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [rfc] new CREATE FUNCTION (and more)

From
Marko Kreen
Date:
On Thu, Nov 16, 2000 at 11:20:58AM -0500, Tom Lane wrote:
> Marko Kreen <marko@l-t.ee> writes:
> >     This mostly like the current "CREATE FUNCTION .. LANGUAGE 'C'".
> >     Main difference is that the TYPE=0 means the old 'C' interface
> >     and TYPE=1 means 'newC' interface.  Default is 1.
> 
> This improves matters how, exactly?  As far as I can see, this just
> replaces a readable construct with magic numbers, for a net loss in
> readability and no change in functionality.

Hmm.  I think I have to agree.  The thing is I did all-powerful
CREATE FUNCTION, then I noticed that the module-provided-info
stuff is separate functionality and split them off into LOAD *
functions.  So I did not noticed that the remaining CREATE
FUNCTION has not much point anymore...  :)

> I don't have any great love for the names 'C' and 'newC' either, but
> unless we are willing to break backward-compatibility of function
> declarations in 7.1, I think we are stuck with those names or ones
> isomorphic to them.

Ok.  I only want to note that the "newC" interface  is "good" in
the sense that it probably stays around a long time.  It would
be nice if the name seems reasonable after a couple of years
too. But I better shut up on this issue now.

> In the long run, it seems that it'd be a good idea to embed function
> declaration info straight into a loadable module, per Philip's idea
> of a special function or your idea of a table.  However that does not
> change the issue of names for function-call conventions in the least,

Yes.  

> it merely avoids the problem of keeping a script of SQL declarations
> in sync with the library file.  (One brain-dead-simple definition of
> the info function or table is that it returns/contains a text string
> that's exactly the SQL commands needed to create the function
> definitions, except we could allow them to omit the pathname
> of the library file.  We can probably do better than that, but as
> far as raw functionality goes, that will accomplish everything that
> a fancier-looking API would do.)

Embedded stuff makes the handling less error-prone and
comfortable.  Makefiles too dont bring any new functionality
to the program being compiled... :)

But I think that "LOAD MODULE" starts bringing new functionality
but what exactly I do not know yet...

-- 
marko



Re: [rfc] new CREATE FUNCTION (and more)

From
Nathan Myers
Date:
On Thu, Nov 16, 2000 at 11:20:58AM -0500, Tom Lane wrote:
> I don't have any great love for the names 'C' and 'newC' either, but
> unless we are willing to break backward-compatibility of function
> declarations in 7.1, I think we are stuck with those names or ones
> isomorphic to them.
> 
> In the long run, it seems that it'd be a good idea to embed function
> declaration info straight into a loadable module, per Philip's idea
> of a special function or your idea of a table. 

Until somebody implements Philip's idea, a much simpler approach could 
obviate the whole issue:
- Keep the name 'C' for both old-style and new-style module declarations.- Require that new-style modules define a
distinguishedsymbol, such as   "int __postgresql_call_7_1;".
 

The module loader can look for symbols that start with "__postgresql_call"
and adjust automatically, or report an error.  This 
- Breaks no backward compatibility, - Defines a clear method for handling future changes, to prevent this     problem
fromarising again, - Creates no particular inconvenience for writers of modules, and - Might be very easy to
implement.

Nathan Myers
ncm@zembu.com


Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Nathan Myers <ncm@zembu.com> writes:
>  - Keep the name 'C' for both old-style and new-style module declarations.
>  - Require that new-style modules define a distinguished symbol, such as 
>    "int __postgresql_call_7_1;".

I was thinking along the same lines myself.  I'd want to do it on a
per-function basis, though, rather than assuming that all functions in
a module must use the same interface.

I'd be inclined to define a macro that creates the signal object,
so that you'd write something like

PG_FUNCTION_API_V2(foo);

Datum
foo(PG_FUNCTION_ARGS)
{...
}

to create a dynamically loadable new-style function.

Comments?
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Philip Warner
Date:
At 12:59 16/11/00 -0800, Nathan Myers wrote:
>
> - Keep the name 'C' for both old-style and new-style module declarations.
> - Require that new-style modules define a distinguished symbol, such as 
>   "int __postgresql_call_7_1;".
>
>The module loader can look for symbols that start with "__postgresql_call"
>and adjust automatically, or report an error.  This 

Cute idea. *If* people like the idea of an info function of some kind then
all we have to do is agree on it's name (not even the parameters, I think),
then we can get the 7.1 function manager to look for it. This is definitely
the simplest implementation of a brain-dead info function!


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: [rfc] new CREATE FUNCTION (and more)

From
Philip Warner
Date:
At 20:06 16/11/00 -0500, Tom Lane wrote:
>Nathan Myers <ncm@zembu.com> writes:
>>  - Keep the name 'C' for both old-style and new-style module declarations.
>>  - Require that new-style modules define a distinguished symbol, such as 
>>    "int __postgresql_call_7_1;".
>
>I was thinking along the same lines myself.  I'd want to do it on a
>per-function basis, though, rather than assuming that all functions in
>a module must use the same interface.
>
>I'd be inclined to define a macro that creates the signal object,
>so that you'd write something like
>
>PG_FUNCTION_API_V2(foo);

This sounds perfect. Would you generate an 'info' function to return a list
of entry points, or just use dummy object names? The info function has the
advantage that it can return version information as well, and a clutter of
dummy entry points might look a little messy.

I had been thinking along the lines of a generic extensible interface using
a known function, but by using macros you can even hide that layer, making
whatever we do now completely compatible. 

What's also cute, is if the underlying method is designed carefully you may
be able to get one library working with multiple interfaces, especially if
the stuff that comes with PG_FUNCTION_ARGS can also provide version
information.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: [rfc] new CREATE FUNCTION (and more)

From
Marko Kreen
Date:
On Thu, Nov 16, 2000 at 08:06:30PM -0500, Tom Lane wrote:
> Nathan Myers <ncm@zembu.com> writes:
> >  - Keep the name 'C' for both old-style and new-style module declarations.
> >  - Require that new-style modules define a distinguished symbol, such as 
> >    "int __postgresql_call_7_1;".
> 
> I was thinking along the same lines myself.  I'd want to do it on a
> per-function basis, though, rather than assuming that all functions in
> a module must use the same interface.
> 
> I'd be inclined to define a macro that creates the signal object,
> so that you'd write something like
> 
> PG_FUNCTION_API_V2(foo);
> 
> Datum
> foo(PG_FUNCTION_ARGS)
> {
>     ...
> }
> 
> to create a dynamically loadable new-style function.
> 
> Comments?

I like it :)

e.g.
struct pg_function_info_header {    int api_ver;};

and 
PG_FUNCTION_TAG(foo);

expands to
struct pg_function_info_header __pg_function_foo_info = { 0 };

so when we sometimes get around to add more fields to it
we increase the api_ver.  For more info also the macros will
be different.  This _TAG means "no info is given it is only
tagged as newC".

Comments?

-- 
marko



Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
>> I'd be inclined to define a macro that creates the signal object,
>> so that you'd write something like
>> 
>> PG_FUNCTION_API_V2(foo);

> This sounds perfect. Would you generate an 'info' function to return a list
> of entry points, or just use dummy object names? The info function has the
> advantage that it can return version information as well, and a clutter of
> dummy entry points might look a little messy.

What I was thinking was that the macro would expand either to
int pg_api_foo = 2;

or
int pg_api_foo(void) { return 2; }

The former would be more compact, presumably, but the latter would
probably be more portable --- we already have to have the ability to
find and call functions in a dynamic-link library, whereas I'm not so
sure about the ability to find and read values of global variables.

In either case, the system would be able to extract an integer version
value associated with each function defined by the library.  (If we
don't find the version-defining symbol, we assume old-style C API.)
Meaning of values other than "2" reserved for future definition.

I like this way better than a central info function for the whole
library, because you'd write the version declaration right with the
function itself.  A central info function would be more of a pain to
maintain, I think.
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Philip Warner
Date:
At 21:08 16/11/00 -0500, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>>> I'd be inclined to define a macro that creates the signal object,
>>> so that you'd write something like
>>> 
>>> PG_FUNCTION_API_V2(foo);
...
>
>What I was thinking was that the macro would expand either to
>
>    int pg_api_foo = 2;
>
>or
>
>    int pg_api_foo(void) { return 2; }
>

For possible future compatibility, can you also do something like:
   PG_FUNCTION_API_V2;   PG_FUNCTION_V2(foo);   PG_FUNCTION_V2(bar);   ...

Where

PG_FUNCTION_API_V2 expands to:
   int pg_fmgr_api_version(void) { return 2; }

And PG_FUNCTION_V2(foo) either does nothing or expands to:
   int pg_fmgr_api2_version_foo(void) { return 2; }

The first call will tell PG that (because it is version 2), it should
expect the next set of entry points. Since we will not be allowing mixed
versions in this version of the API (I think), we could really have the
subsequent macros do nothing.

This way we make it more independant of future API versions by not
requiring a specific special entry point for each function. Then can do
things like use the same entry point for multiple functions, possibly act
as stubs pointing to other libraries (by loading & returning another
library entry point) etc etc. 


>I like this way better than a central info function for the whole
>library, because you'd write the version declaration right with the
>function itself.  A central info function would be more of a pain to
>maintain, I think.

In the plans for PG_FUNCTION_API_V3(?), I actually have the info function
returning a struct with values for 'iscacheable', 'isstrict', and the
actual entry point to use. This reduces duplication since the programmer is
the best person to know these attributes. But using macros is fine:

PG_FUNCTION_API_V3 expand to:
   typedef struct {bool iscacheable, bool isstrict, ptr entrypoint }
pg_fmgr_info;   int pg_fmgr_api_version(void) { return 3; }

and
   PG_FUNCTION_V3(foo, false, true, foo_entry_point)

expand to:
   void pg_fmgr_api_version_foo(fmgr_info *i)       { i->iscacheable=false;         i->isstrict=true;
i->entrypoint=foo_entry_point;}
 

will work as well.

Perhaps in PG_FUNCTION_API_V4 we can implement some kind of interface for
listing supported entry points for module loading...







----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> For possible future compatibility, can you also do something like:
>     PG_FUNCTION_API_V2;
>     PG_FUNCTION_V2(foo);
>     PG_FUNCTION_V2(bar);
>     ...

> Where
> PG_FUNCTION_API_V2 expands to:
>     int pg_fmgr_api_version(void) { return 2; }
> And PG_FUNCTION_V2(foo) either does nothing or expands to:
>     int pg_fmgr_api2_version_foo(void) { return 2; }

I'm not following the point here.  Why two different macros?  It doesn't
look to me like the first one does anything.  The per-routine macro
calls should be capable of doing everything that needs to be done.

Per your comments and Marko's about future extension, it seems that
a single-word result might start to get a little cramped before long.
I like Marko's design:
struct pg_function_info_header {    int api_ver;};

The api_ver field is sufficient for now, but for values > 2 there
might be additional fields defined.

We can either have this struct be an initialized global variable,
or have a called function that returns a pointer to it, depending on
the question of which way seems easier to implement/more portable.
The macro can hide the details of how it's done.

> The first call will tell PG that (because it is version 2), it should
> expect the next set of entry points. Since we will not be allowing mixed
> versions in this version of the API (I think),

Yes, we will, because there is a case in the regression tests that
will break anything that doesn't cope with mixed versions ;-).
I deliberately left some of the routines in regress.c old-style ...

> This way we make it more independant of future API versions by not
> requiring a specific special entry point for each function. Then can do
> things like use the same entry point for multiple functions, possibly act
> as stubs pointing to other libraries (by loading & returning another
> library entry point) etc etc. 

Hmm.  This stub idea might be a sufficient reason to say that we want to
do a function call rather than look for a global variable.  However,
I am unpersuaded by the idea that a one-liner function per useful entry
point is an intolerable amount of overhead.  Let's keep it simple here.

>     PG_FUNCTION_V3(foo, false, true, foo_entry_point)
> expand to:
>     void pg_fmgr_api_version_foo(fmgr_info *i) 
>        { i->iscacheable=false; 
i-> isstrict=true;
i-> entrypoint=foo_entry_point; }

I prefer something like
const inforec * pg_api_foo(void){    static inforec foo_info = { ... };    return &foo_info;}

since this avoids prejudging anything.  (In your example, how does
the version function *know* how big the record it's been handed is?
Loading a version-N library into a Postgres version < N might bomb
hard because the info function scribbles on fields that aren't there.
Handing back a pointer to something that the main code then treats
as read-only seems much safer.)  The above implementation with a
preset static inforec is of course only one way it could be done
without breaking the ABI for the info function...

> Perhaps in PG_FUNCTION_API_V4 we can implement some kind of interface for
> listing supported entry points for module loading...

I think that should be seen as a separate feature, rather than
mixing it up with support information about any individual function.
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Philip Warner
Date:
At 22:10 16/11/00 -0500, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> For possible future compatibility, can you also do something like:
>>     PG_FUNCTION_API_V2;
>>     PG_FUNCTION_V2(foo);
>>     PG_FUNCTION_V2(bar);
>>     ...
>
>> Where
>> PG_FUNCTION_API_V2 expands to:
>>     int pg_fmgr_api_version(void) { return 2; }
>> And PG_FUNCTION_V2(foo) either does nothing or expands to:
>>     int pg_fmgr_api2_version_foo(void) { return 2; }
>
>I'm not following the point here.  Why two different macros?  It doesn't
>look to me like the first one does anything.  The per-routine macro
>calls should be capable of doing everything that needs to be done.

I think the PG_FUNCTION_API_V2 macros is very important because it will
tell the function manager which 'protocol' to use. 

In my view, the individual stub entry points for each function are part of
the protocol and should not be assumed. Returning a struct would be ideal
since it allows more flexibility in the furture.

So long as the version is always in the first bytes of the struct, we are
covered for compatibility.


>> The first call will tell PG that (because it is version 2), it should
>> expect the next set of entry points. Since we will not be allowing mixed
>> versions in this version of the API (I think),
>
>Yes, we will, because there is a case in the regression tests that
>will break anything that doesn't cope with mixed versions ;-).
>I deliberately left some of the routines in regress.c old-style ...

I'd still argue for a PG_FUNCTION_API_V2 macro for the reasons above. What
the fmgrs needs to do is:

- call pg_fmgr_api_version() to get the protocol version
- when it wants to call a function 'foo' see if there is a 'pg_api_foo'
entry point, and if so, use the new interface, o/wise use the old one. No
need to even call it.

Future versions will call pg_fmgr_api_version() and possibly pass
appropriate structs to info-functions or whatever.


>
>Hmm.  This stub idea might be a sufficient reason to say that we want to
>do a function call rather than look for a global variable.

I agree.


>I am unpersuaded by the idea that a one-liner function per useful entry
>point is an intolerable amount of overhead.  Let's keep it simple here.

Wasn't worried about the overhead at all, just the offense to my aesthetics
8-).


>>     PG_FUNCTION_V3(foo, false, true, foo_entry_point)
>> expand to:
>>     void pg_fmgr_api_version_foo(fmgr_info *i) 
>>        { i->iscacheable=false; 
>i-> isstrict=true;
>i-> entrypoint=foo_entry_point; }
>
>I prefer something like
>
>    const inforec * pg_api_foo(void)
>    {
>        static inforec foo_info = { ... };
>        return &foo_info;
>    }
>
>since this avoids prejudging anything.  (In your example, how does
>the version function *know* how big the record it's been handed is?

Becuase the function manager called pg_fmgr_api_version and made sure it
passed the right structure.


>Loading a version-N library into a Postgres version < N might bomb
>hard because the info function scribbles on fields that aren't there.

If it calls pg_fmgr_api_version and can't get an acceptable version, it
would bomb nicely. Maybe in future versions we could even support some kind
of protocol negotiation, but that seems less than useful at this stage.


>Handing back a pointer to something that the main code then treats
>as read-only seems much safer.)

This is fine too; but in all cases they have to be sure that they agree on
what is being handed back. So long at the blocks always have the version
number on the header it is fine.

>> Perhaps in PG_FUNCTION_API_V4 we can implement some kind of interface for
>> listing supported entry points for module loading...
>
>I think that should be seen as a separate feature, rather than
>mixing it up with support information about any individual function.

Fine.


My only real issue with all of this is that we need to separate the
protocol selection from the the data exchange.



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> So long as the version is always in the first bytes of the struct, we are
> covered for compatibility.

Right ...

> I'd still argue for a PG_FUNCTION_API_V2 macro for the reasons above. What
> the fmgrs needs to do is:

> - call pg_fmgr_api_version() to get the protocol version
> - when it wants to call a function 'foo' see if there is a 'pg_api_foo'
> entry point, and if so, use the new interface, o/wise use the old one. No
> need to even call it.

This strikes me as completely backwards, because it prejudges an
assumption that protocol decisions can be made on a library-wide basis.
I see no need for a library-wide protocol definition.  What I want to
do is call 'pg_api_foo' (if it exists) to find out all about the
function 'foo', without any restriction on whether 'foo' is like 'bar'
that happens to have been linked into the same shlib.

The test to see if 'pg_api_foo' exists is going to be the expensive
part of this anyway.  Once you've done that, you may as well call it...

> My only real issue with all of this is that we need to separate the
> protocol selection from the the data exchange.

Negotiating a protocol to negotiate protocol strikes me as considerable
overkill.  It should be plenty sufficient to say that a parameterless
function with a determinable name will hand back a struct whose first
word identifies the contents of the struct.  Why do we need another
layer on top of that?  Especially if it's a layer that makes the
unsupported assumption that all functions in a given shlib are similar?
That way reduces flexibility, rather than increasing it.
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Nathan Myers <ncm@zembu.com> writes:
> Why declare a function instead of a static struct?

Well, two reasons.  One is that a function provides wiggle room if we
later decide that the library needs to do some computation before
handing back the function info struct.  For example, Philip suggested
that the info function might be just a stub that causes an additional
library to be loaded.  I'm not sure we'll ever really *need* such
flexibility in the future, but when it costs hardly anything to leave
the option open, why not?

The second reason is that if it's a function call, we only have one
primitive lookup operation that we expect the dynamic loader to be able
to support: find a function in a shlib by name.  We have that already in
order to be able to call the real function.  If it's a global variable,
then we need a second primitive lookup operation: find a global variable
in a shlib by name.  Given the way that dynamic-link shared libraries
work, this is *not* necessarily the same as the first operation (code
and data are handled much differently in a shared library!) and might
not even be available on all platforms.  At the very least it'd imply a
round of per-platform development and portability testing that I doubt
we can afford if we want to shoehorn this feature into the 7.1 schedule.

In short, using a variable looks like more work for less functionality,
and so the choice seems pretty clear-cut to me: use a function.

> Users are allowed to have functions that start 
> with "pg" already, and that's quite a reasonable prefix for 
> functions meant to be called by Postgres.  Therefore, I suggest 
> a prefix "_pg" instead of "pg".  Thus,

>   const struct _pg_user_function _pg_user_function_foo = { 2, };

The exact details of the name prefix need to be settled regardless
of whether the name is attached to a variable or a function.  I was
thinking of pg_finfo_foo for a function named foo.  We want to keep
the prefix reasonably short, so as to reduce the risk of duplicate-
symbol conflicts if the platform's linker truncates names to some
fixed length (I'm sure there are still some that do :-().  Using
a leading underscore (_pg_info_foo) might help but I worry about
creating conflicts with system names if we do that.
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Not at all. The call is, as you point out, defining the protocl for
> enquiry. Nothing more. With 7.1, the process above is sufficient. There is
> no need to call *in this version* because pg_fmgr_api_version returns
> enough information when combined with the existence of a well-defined entry
> point. Future versions would need to call the entry point, I would expect.

Well, I was planning to go ahead and call the entry point anyway, just
so that it could tell me "it's an old-style function" if it wanted to.
Not sure that'll ever happen in practice, but that case ought to work
IMHO.
        regards, tom lane


Re: [rfc] new CREATE FUNCTION (and more)

From
Philip Warner
Date:
At 23:07 16/11/00 -0500, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>>
>> - call pg_fmgr_api_version() to get the protocol version
>> - when it wants to call a function 'foo' see if there is a 'pg_api_foo'
>> entry point, and if so, use the new interface, o/wise use the old one. No
>> need to even call it.
>
>This strikes me as completely backwards, because it prejudges an
>assumption that protocol decisions can be made on a library-wide basis.
...
>unsupported assumption that all functions in a given shlib are similar?
>That way reduces flexibility, rather than increasing it.

Not at all. The call is, as you point out, defining the protocl for
enquiry. Nothing more. With 7.1, the process above is sufficient. There is
no need to call *in this version* because pg_fmgr_api_version returns
enough information when combined with the existence of a well-defined entry
point. Future versions would need to call the entry point, I would expect.

If we really wanted to we could let it also return a struct, which could
indicate if all, some or none of the functions in the module have info
functions.

I guess it's not a big issue, and and as you say 'negtiating to negotiate'
is probably overkill. Also, if we really need it in a future version we can
add it easily enough - it's would be just one more 'look for entry point'
call in the function manager.

I'll be very happy to see newC replaced with this...


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/