Thread: Last call for comments: fmgr rewrite [LONG]

From: Tom Lane
I attach the current draft of my document explaining the upcoming
function manager interface changes.  This has been modified from the
previously circulated version on the basis of comments from Jan and
others.  There is also a preliminary fmgr.h showing actual code for
the proposed interface macros.

Current implementation status: the core fmgr routines have been
rewritten and tested, also the executor's function-call code and the
function handler routines for the three PL languages.  I haven't yet
changed over any pg_proc entries to new-style except the PL handlers.
However, passing nulls into and out of SQL, plpgsql, and plperl
functions works properly now, and it would work in pltcl if pltcl had a
convention for distinguishing nulls from empty strings (Jan, do you want
to do something about that?).  I still need to change trigger handling
so we can get rid of the CurrentTriggerData global variable, and then
can start working on updating individual function routines to the new
conventions.  I will probably not start on that work until after we make
the 7.1 branch and I can commit what I have.
        regards, tom lane


Proposal for function-manager redesign            21-May-2000
--------------------------------------

We know that the existing mechanism for calling Postgres functions needs
to be redesigned.  It has portability problems because it makes
assumptions about parameter passing that violate ANSI C; it fails to
handle NULL arguments and results cleanly; and "function handlers" that
support a class of functions (such as fmgr_pl) can only be done via a
really ugly, non-reentrant kluge.  (Global variable set during every
function call, forsooth.)  Here is a proposal for fixing these problems.

In the past, the major objections to redoing the function-manager
interface have been (a) it'll be quite tedious to implement, since every
built-in function and everyplace that calls such functions will need to
be touched; (b) such wide-ranging changes will be difficult to make in
parallel with other development work; (c) it will break existing
user-written loadable modules that define "C language" functions.  While
I have no solution to the "tedium" aspect, I believe I see an answer to
the other problems: by use of function handlers, we can support both old
and new interfaces in parallel for both callers and callees, at some
small efficiency cost for the old styles.  That way, most of the changes
can be done on an incremental file-by-file basis --- we won't need a
"big bang" where everything changes at once.  Support for callees
written in the old style can be left in place indefinitely, to provide
backward compatibility for user-written C functions.

Note that neither the old function manager nor the redesign is intended
to handle functions that accept or return sets.  Those sorts of functions
need to be handled by special querytree structures.


Changes in pg_proc (system data about a function)
-------------------------------------------------

A new column "proisstrict" will be added to the system pg_proc table.
This is a boolean value which will be TRUE if the function is "strict",
that is it always returns NULL when any of its inputs are NULL.  The
function manager will check this field and skip calling the function when
it's TRUE and there are NULL inputs.  This allows us to remove explicit
NULL-value tests from many functions that currently need them.  A function
that is not marked "strict" is responsible for checking whether its inputs
are NULL or not.  Most builtin functions will be marked "strict".

An optional WITH parameter will be added to CREATE FUNCTION to allow
specification of whether user-defined functions are strict or not.  I am
inclined to make the default be "not strict", since that seems to be the
more useful case for functions expressed in SQL or a PL language, but
am open to arguments for the other choice.


The new function-manager interface
----------------------------------

The core of the new design is revised data structures for representing
the result of a function lookup and for representing the parameters
passed to a specific function invocation.  (We want to keep function
lookup separate from function call, since many parts of the system apply
the same function over and over; the lookup overhead should be paid once
per query, not once per tuple.)


When a function is looked up in pg_proc, the result is represented as

typedef struct
{
    PGFunction  fn_addr;    /* pointer to function or handler to be called */
    Oid         fn_oid;     /* OID of function (NOT of handler, if any) */
    short       fn_nargs;   /* 0..FUNC_MAX_ARGS, or -1 if variable arg count */
    bool        fn_strict;  /* function is "strict" (NULL in => NULL out) */
    void       *fn_extra;   /* extra space for use by handler */
} FmgrInfo;

For an ordinary built-in function, fn_addr is just the address of the C
routine that implements the function.  Otherwise it is the address of a
handler for the class of functions that includes the target function.
The handler can use the function OID and perhaps also the fn_extra slot
to find the specific code to execute.  (fn_oid = InvalidOid can be used
to denote a not-yet-initialized FmgrInfo struct.  fn_extra will always
be NULL when an FmgrInfo is first filled by the function lookup code, but
a function handler could set it to avoid making repeated lookups of its
own when the same FmgrInfo is used repeatedly during a query.)  fn_nargs
is the number of arguments expected by the function, and fn_strict is
its strictness flag.

FmgrInfo already exists in the current code, but has fewer fields.  This
change should be transparent at the source-code level.
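
To make the "look up once, call many times" point concrete, here is a minimal sketch. It is not the real fmgr code: Datum is reduced to an integer type, FunctionCallInfoData is trimmed to the fields used here, and toy_fmgr_info, toy_increment, toy_sum_incremented, and toy_catalog_lookups are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef unsigned int Oid;

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

typedef struct
{
    PGFunction  fn_addr;
    Oid         fn_oid;
    short       fn_nargs;
    bool        fn_strict;
    void       *fn_extra;
} FmgrInfo;

typedef struct FunctionCallInfoData  /* trimmed to the fields used here */
{
    FmgrInfo   *flinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

/* Hypothetical one-argument function: adds one to its integer argument. */
static Datum
toy_increment(FunctionCallInfo fcinfo)
{
    return (Datum) ((int) fcinfo->arg[0] + 1);
}

static int toy_catalog_lookups = 0; /* counts simulated pg_proc probes */

/* Toy replacement for fmgr_info(): pretends to consult pg_proc. */
static void
toy_fmgr_info(Oid functionId, FmgrInfo *finfo)
{
    toy_catalog_lookups++;
    finfo->fn_addr = toy_increment;
    finfo->fn_oid = functionId;
    finfo->fn_nargs = 1;
    finfo->fn_strict = true;
    finfo->fn_extra = NULL;         /* always NULL after a fresh lookup */
}

/* Apply an already-looked-up function to each input value ("tuple"). */
static int
toy_sum_incremented(FmgrInfo *finfo, const int *values, int n)
{
    int sum = 0;

    assert(finfo->fn_addr != NULL);
    for (int i = 0; i < n; i++)
    {
        FunctionCallInfoData fcinfo = {0};

        fcinfo.flinfo = finfo;
        fcinfo.nargs = 1;
        fcinfo.arg[0] = (Datum) values[i];
        sum += (int) finfo->fn_addr(&fcinfo);
    }
    return sum;
}
```

A caller would run toy_fmgr_info once and then hand the same FmgrInfo to toy_sum_incremented for every batch of values, so the lookup counter stays at one no matter how many values are processed.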


During a call of a function, the following data structure is created
and passed to the function:

typedef struct
{
    FmgrInfo   *flinfo;         /* ptr to lookup info used for this call */
    Node       *context;        /* pass info about context of call */
    Node       *resultinfo;     /* pass or return extra info about result */
    bool        isnull;         /* function must set true if result is NULL */
    short       nargs;          /* # arguments actually passed */
    Datum       arg[FUNC_MAX_ARGS];     /* Arguments passed to function */
    bool        argnull[FUNC_MAX_ARGS]; /* T if arg[i] is actually NULL */
} FunctionCallInfoData;
typedef FunctionCallInfoData* FunctionCallInfo;

flinfo points to the lookup info used to make the call.  Ordinary functions
will probably ignore this field, but function class handlers will need it
to find out the OID of the specific function being called.

context is NULL for an "ordinary" function call, but may point to additional
info when the function is called in certain contexts.  (For example, the
trigger manager will pass information about the current trigger event here.)
If context is used, it should point to some subtype of Node; the particular
kind of context can then be indicated by the node type field.  (A callee
should always check the node type before assuming it knows what kind of
context is being passed.)  fmgr itself puts no other restrictions on the use
of this field.
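
The "check the node type first" rule can be sketched as follows. This is only an illustration under simplifying assumptions: the Node and NodeTag definitions are reduced to the bare minimum, and ToyTriggerData, T_ToyTriggerData, T_ToyOtherContext, and toy_trigger_event are hypothetical names, not anything the proposal defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */

/* Simplified Node: every subtype begins with a type tag. */
typedef enum
{
    T_Invalid = 0,
    T_ToyTriggerData,               /* hypothetical tag */
    T_ToyOtherContext               /* hypothetical tag */
} NodeTag;

typedef struct Node
{
    NodeTag     type;
} Node;

/* Hypothetical context node that a trigger manager might pass. */
typedef struct ToyTriggerData
{
    NodeTag     type;               /* must be first, so a Node * can point here */
    int         event;
} ToyTriggerData;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    Node       *context;
    bool        isnull;
} FunctionCallInfoData;
typedef FunctionCallInfoData *FunctionCallInfo;

/*
 * Returns the trigger event code, or NULL when the function was not
 * called with the kind of context it knows how to interpret.
 */
static Datum
toy_trigger_event(FunctionCallInfo fcinfo)
{
    assert(fcinfo != NULL);

    if (fcinfo->context == NULL ||
        fcinfo->context->type != T_ToyTriggerData)
    {
        fcinfo->isnull = true;
        return (Datum) 0;
    }
    return (Datum) ((ToyTriggerData *) fcinfo->context)->event;
}
```

Whether a mismatched context should produce NULL or an error is a design choice for the callee; the sketch returns NULL only to keep the example short.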

resultinfo is NULL when calling any function from which a simple Datum
result is expected.  It may point to some subtype of Node if the function
returns more than a Datum.  Like the context field, resultinfo is a hook
for expansion; fmgr itself doesn't constrain the use of the field.

nargs, arg[], and argnull[] hold the arguments being passed to the function.
Notice that all the arguments passed to a function (as well as its result
value) will now uniformly be of type Datum.  As discussed below, callers
and callees should apply the standard Datum-to-and-from-whatever macros
to convert to the actual argument types of a particular function.  The
value in arg[i] is unspecified when argnull[i] is true.

It is generally the responsibility of the caller to ensure that the
number of arguments passed matches what the callee is expecting; except
for callees that take a variable number of arguments, the callee will
typically ignore the nargs field and just grab values from arg[].

The isnull field will be initialized to "false" before the call.  On
return from the function, isnull is the null flag for the function result:
if it is true the function's result is NULL, regardless of the actual
function return value.  Note that simple "strict" functions can ignore
both isnull and argnull[], since they won't even get called when there
are any TRUE values in argnull[].
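
A minimal sketch of the caller-side protocol described above: isnull is cleared before the call, and a strict function is skipped entirely when any argument is NULL. The types are simplified stand-ins, and toy_invoke, toy_double, and toy_calls are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef unsigned int Oid;

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

typedef struct
{
    PGFunction  fn_addr;
    Oid         fn_oid;
    short       fn_nargs;
    bool        fn_strict;
    void       *fn_extra;
} FmgrInfo;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    FmgrInfo   *flinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

static int toy_calls = 0;           /* how often the callee really ran */

/* Hypothetical strict function: it never looks at argnull[] or isnull. */
static Datum
toy_double(FunctionCallInfo fcinfo)
{
    toy_calls++;
    return (Datum) ((int) fcinfo->arg[0] * 2);
}

/* Invoke a function, honoring strictness; fcinfo->isnull reports NULL. */
static Datum
toy_invoke(FunctionCallInfo fcinfo)
{
    assert(fcinfo->nargs <= FUNC_MAX_ARGS);

    fcinfo->isnull = false;         /* initialized before the call */
    if (fcinfo->flinfo->fn_strict)
    {
        for (int i = 0; i < fcinfo->nargs; i++)
        {
            if (fcinfo->argnull[i])
            {
                fcinfo->isnull = true;  /* NULL in => NULL out, no call */
                return (Datum) 0;
            }
        }
    }
    return fcinfo->flinfo->fn_addr(fcinfo);
}
```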

FunctionCallInfo replaces FmgrValues plus a bunch of ad-hoc parameter
conventions, global variables (fmgr_pl_finfo and CurrentTriggerData at
least), and other uglinesses.


Callees, whether they be individual functions or function handlers,
shall always have this signature:

Datum function (FunctionCallInfo fcinfo);

which is represented by the typedef

typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

The function is responsible for setting fcinfo->isnull appropriately
as well as returning a result represented as a Datum.  Note that since
all callees will now have exactly the same signature, and will be called
through a function pointer declared with exactly that signature, we
should have no portability or optimization problems.


Function coding conventions
---------------------------

As an example, int4 addition goes from old-style

int32
int4pl(int32 arg1, int32 arg2)
{
    return arg1 + arg2;
}

to new-style

Datum
int4pl(FunctionCallInfo fcinfo)
{
    /* we assume the function is marked "strict", so we can ignore
     * NULL-value handling */

    return Int32GetDatum(DatumGetInt32(fcinfo->arg[0]) +
                         DatumGetInt32(fcinfo->arg[1]));
}

This is, of course, much uglier than the old-style code, but we can
improve matters with some well-chosen macros for the boilerplate parts.
I propose below macros that would make the code look like

Datum
int4pl(PG_FUNCTION_ARGS)
{
    int32   arg1 = PG_GETARG_INT32(0);
    int32   arg2 = PG_GETARG_INT32(1);

    PG_RETURN_INT32( arg1 + arg2 );
}

This is still more code than before, but it's fairly readable, and it's
also amenable to machine processing --- for example, we could probably
write a script that scans code like this and extracts argument and result
type info for comparison to the pg_proc table.
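
For readers who want to compile the macro-based version in isolation, the following sketch supplies just enough scaffolding to do so. The Datum typedef, the trimmed FunctionCallInfoData, and the two Datum conversion macros are simplified stand-ins, not the real definitions; the PG_ macros match the ones proposed in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef int32_t int32;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;
typedef FunctionCallInfoData *FunctionCallInfo;

/* Simplified conversion macros; the real ones live in c.h. */
#define DatumGetInt32(X)    ((int32) (X))
#define Int32GetDatum(X)    ((Datum) (X))

/* The boilerplate macros as proposed. */
#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo
#define PG_GETARG_INT32(n)  DatumGetInt32(fcinfo->arg[n])
#define PG_RETURN_INT32(x)  return Int32GetDatum(x)

Datum
int4pl(PG_FUNCTION_ARGS)
{
    int32   arg1 = PG_GETARG_INT32(0);
    int32   arg2 = PG_GETARG_INT32(1);

    PG_RETURN_INT32(arg1 + arg2);
}

/* Hypothetical convenience wrapper that fills the call struct by hand. */
static int32
toy_add(int32 a, int32 b)
{
    FunctionCallInfoData fcinfo = {0};

    fcinfo.nargs = 2;
    fcinfo.arg[0] = Int32GetDatum(a);
    fcinfo.arg[1] = Int32GetDatum(b);
    return DatumGetInt32(int4pl(&fcinfo));
}
```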

For the standard data types float4, float8, and int8, these macros should
hide the indirection and space allocation involved, so that the function's
code is not explicitly aware that these types are pass-by-reference.  This
will offer a considerable gain in readability, and it also opens up the
opportunity to make these types be pass-by-value on machines where it's
feasible to do so.  (For example, on an Alpha it's pretty silly to make int8
be pass-by-ref, since Datum is going to be 64 bits anyway.  float4 could
become pass-by-value on all machines...)

Here are the proposed macros and coding conventions:

The definition of an fmgr-callable function will always look like

Datum
function_name(PG_FUNCTION_ARGS)
{
    ...
}

"PG_FUNCTION_ARGS" just expands to "FunctionCallInfo fcinfo".  The main
reason for using this macro is to make it easy for scripts to spot function
definitions.  However, if we ever decide to change the calling convention
again, it might come in handy to have this macro in place.

A nonstrict function is responsible for checking whether each individual
argument is null or not, which it can do with PG_ARGISNULL(n) (which is
just "fcinfo->argnull[n]").  It should avoid trying to fetch the value
of any argument that is null.

Both strict and nonstrict functions can return NULL, if needed, with
    PG_RETURN_NULL();
which expands to
    { fcinfo->isnull = true; return (Datum) 0; }
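
As an illustration of a nonstrict function, here is a sketch of a hypothetical two-argument function, toy_first_nonnull, that returns its first non-NULL int4 argument and returns NULL only when both inputs are NULL. The scaffolding types are simplified stand-ins, and PG_RETURN_NULL is written in the do/while form used in the draft fmgr.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef int32_t int32;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;
typedef FunctionCallInfoData *FunctionCallInfo;

#define DatumGetInt32(X)    ((int32) (X))
#define Int32GetDatum(X)    ((Datum) (X))

#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo
#define PG_ARGISNULL(n)     (fcinfo->argnull[n])
#define PG_GETARG_INT32(n)  DatumGetInt32(fcinfo->arg[n])
#define PG_RETURN_INT32(x)  return Int32GetDatum(x)
#define PG_RETURN_NULL()  \
    do { fcinfo->isnull = true; return (Datum) 0; } while (0)

/* Hypothetical nonstrict function: first non-NULL argument wins. */
Datum
toy_first_nonnull(PG_FUNCTION_ARGS)
{
    assert(fcinfo->nargs == 2);

    /* Test each argument before fetching it; never GETARG a NULL. */
    if (!PG_ARGISNULL(0))
        PG_RETURN_INT32(PG_GETARG_INT32(0));
    if (!PG_ARGISNULL(1))
        PG_RETURN_INT32(PG_GETARG_INT32(1));

    PG_RETURN_NULL();
}
```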

Argument values are ordinarily fetched using code like
    int32    name = PG_GETARG_INT32(number);

For float4, float8, and int8, the PG_GETARG macros will hide the
pass-by-reference nature of the data types; for example PG_GETARG_FLOAT4
expands to
    (* (float32) DatumGetPointer(fcinfo->arg[number]))
and would typically be called like this:
    float4  arg = PG_GETARG_FLOAT4(0);
Note that "float4" and "float8" are the recommended typedefs to use, not
"float32data" and "float64data", and the macros are named accordingly.
But 64-bit ints should be declared as "int64".

Non-null values are returned with a PG_RETURN_XXX macro of the appropriate
type.  For example, PG_RETURN_INT32 expands to
    return Int32GetDatum(x)
and PG_RETURN_FLOAT8 expands to
    { float8 *retval = palloc(sizeof(float8));
      *retval = (x);
      return PointerGetDatum(retval); }
which again hides the pass-by-reference nature of the datatype.
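
The following sketch shows how a float8 function would read under these macros. It is only an approximation: toy_palloc is a hypothetical stand-in built on malloc (the real palloc allocates in a memory context), the PG_GETARG_FLOAT8 definition is written with a plain float8 pointer cast, and the other types are simplified as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef double float8;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;
typedef FunctionCallInfoData *FunctionCallInfo;

#define DatumGetPointer(X)  ((void *) (X))
#define PointerGetDatum(X)  ((Datum) (X))

/* Hypothetical stand-in for palloc(): malloc that never returns NULL. */
static void *
toy_palloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL)
        abort();
    return p;
}

#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo
#define PG_GETARG_FLOAT8(n) (* (float8 *) DatumGetPointer(fcinfo->arg[n]))
#define PG_RETURN_FLOAT8(x)  \
    do { float8 *retval_ = (float8 *) toy_palloc(sizeof(float8)); \
         *retval_ = (x); \
         return PointerGetDatum(retval_); } while (0)

/* The function body never mentions pointers or allocation. */
Datum
float8pl(PG_FUNCTION_ARGS)
{
    float8  arg1 = PG_GETARG_FLOAT8(0);
    float8  arg2 = PG_GETARG_FLOAT8(1);

    assert(fcinfo->nargs == 2);
    PG_RETURN_FLOAT8(arg1 + arg2);
}
```

If float8 later became pass-by-value on some platform, only the two macro definitions would need to change; float8pl itself would stay as written.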

fmgr.h will provide PG_GETARG and PG_RETURN macros for all the basic data
types.  Modules or header files that define specialized SQL datatypes
(eg, timestamp) should define appropriate macros for those types, so that
functions manipulating the types can be coded in the standard style.

For non-primitive data types (particularly variable-length types) it
probably won't be very practical to hide the pass-by-reference nature of
the data type, so the PG_GETARG and PG_RETURN macros for those types
probably won't do more than DatumGetPointer/PointerGetDatum plus the
appropriate typecast.  Functions returning such types will need to
palloc() their result space explicitly.  I recommend naming the GETARG
and RETURN macros for such types to end in "_P", as a reminder that they
produce or take a pointer.  For example, PG_GETARG_TEXT_P yields "text *".
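
A sketch of what a function on a variable-length type might look like with "_P" macros. Everything here is simplified for illustration: the text struct is a toy layout (a length word that includes the header, followed by the bytes), toy_palloc stands in for palloc, PG_GETARG_TEXT_P does no de-TOASTing, and toy_upper and toy_make_text are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef int32_t int32;

/* Toy varlena layout: total length (header included), then the data. */
typedef struct
{
    int32       vl_len;
    char        vl_dat[];
} text;
#define VARHDRSZ ((int32) sizeof(int32))

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;
typedef FunctionCallInfoData *FunctionCallInfo;

#define DatumGetPointer(X)  ((void *) (X))
#define PointerGetDatum(X)  ((Datum) (X))
#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo
#define PG_GETARG_TEXT_P(n) ((text *) DatumGetPointer(fcinfo->arg[n]))
#define PG_RETURN_TEXT_P(x) return PointerGetDatum(x)

/* Hypothetical stand-in for palloc(): malloc that never returns NULL. */
static void *
toy_palloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL)
        abort();
    return p;
}

/* Helper for building test inputs from a C string. */
static text *
toy_make_text(const char *s)
{
    size_t  len = strlen(s);
    text   *t = toy_palloc((size_t) VARHDRSZ + len);

    t->vl_len = (int32) ((size_t) VARHDRSZ + len);
    memcpy(t->vl_dat, s, len);
    return t;
}

/* Hypothetical function: upper-case copy; result space is allocated here. */
Datum
toy_upper(PG_FUNCTION_ARGS)
{
    text   *in = PG_GETARG_TEXT_P(0);
    int32   datalen = in->vl_len - VARHDRSZ;
    text   *out = (text *) toy_palloc((size_t) in->vl_len);

    assert(datalen >= 0);
    out->vl_len = in->vl_len;
    for (int32 i = 0; i < datalen; i++)
        out->vl_dat[i] = (char) toupper((unsigned char) in->vl_dat[i]);

    PG_RETURN_TEXT_P(out);
}
```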

For TOAST-able data types, the PG_GETARG macro will deliver a de-TOASTed
data value.  There might be a few cases where the still-toasted value is
wanted, but I am having a hard time coming up with examples.  For the
moment I'd say that any such code could use a lower-level macro that is
just ((struct varlena *) DatumGetPointer(fcinfo->arg[n])).

Note: the above examples assume that arguments will be counted starting at
zero.  We could have the ARG macros subtract one from the argument number,
so that arguments are counted starting at one.  I'm not sure if that would be
more or less confusing.  Does anyone have a strong feeling either way about
it?

When a function needs to access fcinfo->flinfo or one of the other auxiliary
fields of FunctionCallInfo, it should just do it.  I doubt that providing
syntactic-sugar macros for these cases is useful.


Call-site coding conventions
----------------------------

There are many places in the system that call either a specific function
(for example, the parser invokes "textin" by name in places) or a
particular group of functions that have a common argument list (for
example, the optimizer invokes selectivity estimation functions with
a fixed argument list).  These places will need to change, but we should
try to avoid making them significantly uglier than before.

Places that invoke an arbitrary function with an arbitrary argument list
can simply be changed to fill a FunctionCallInfoData structure directly;
that'll be no worse and possibly cleaner than what they do now.

When invoking a specific built-in function by name, we have generally
just written something like
    result = textin ( ... args ... )
which will not work after textin() is converted to the new call style.
I suggest that code like this be converted to use "helper" functions
that will create and fill in a FunctionCallInfoData struct.  For
example, if textin is being called with one argument, it'd look
something like
    result = DirectFunctionCall1(textin, PointerGetDatum(argument));
These helper routines will have declarations like
    Datum DirectFunctionCall2(PGFunction func, Datum arg1, Datum arg2);
Note it will be the caller's responsibility to convert to and from
Datum; appropriate conversion macros should be used.

The DirectFunctionCallN routines will not bother to fill in
fcinfo->flinfo (indeed cannot, since they have no idea about an OID for
the target function); they will just set it NULL.  This is unlikely to
bother any built-in function that could be called this way.  Note also
that this style of coding cannot pass a NULL input value nor cope with
a NULL result (it couldn't before, either!).  We can make the helper
routines elog an error if they see that the function returns a NULL.
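
One plausible shape for such a helper is sketched below; it is an assumption about the implementation, not the committed code. The helper is named toy_DirectFunctionCall2 to make clear it is illustrative, fprintf plus abort stands in for elog(ERROR), and the scaffolding types are simplified stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef int32_t int32;

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    void       *flinfo;             /* would be FmgrInfo * in the real struct */
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

#define DatumGetInt32(X)    ((int32) (X))
#define Int32GetDatum(X)    ((Datum) (X))

/* A new-style callee to exercise the helper with. */
static Datum
int4pl(FunctionCallInfo fcinfo)
{
    return Int32GetDatum(DatumGetInt32(fcinfo->arg[0]) +
                         DatumGetInt32(fcinfo->arg[1]));
}

/* Sketch of a two-argument direct-call helper. */
static Datum
toy_DirectFunctionCall2(PGFunction func, Datum arg1, Datum arg2)
{
    FunctionCallInfoData fcinfo = {0};  /* isnull and argnull[] start false */
    Datum       result;

    assert(func != NULL);
    fcinfo.flinfo = NULL;           /* no OID is known for a direct call */
    fcinfo.nargs = 2;
    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;

    result = func(&fcinfo);

    if (fcinfo.isnull)
    {
        /* stands in for elog(ERROR, ...) */
        fprintf(stderr, "function called directly returned NULL\n");
        abort();
    }
    return result;
}
```

A call site then reads much like the old direct call, for example DatumGetInt32(toy_DirectFunctionCall2(int4pl, Int32GetDatum(2), Int32GetDatum(3))).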

(Note: direct calls like this will have to be changed at the same time
that their called routines are changed to the new style.  But that will
still be a lot less of a constraint than a "big bang" conversion.)

When invoking a function that has a known argument signature, we have
usually written either
    result = fmgr(targetfuncOid, ... args ... );
or
    result = fmgr_ptr(FmgrInfo *finfo, ... args ... );
depending on whether an FmgrInfo lookup has been done yet or not.
This kind of code can be recast using helper routines, in the same
style as above:
    result = OidFunctionCall1(funcOid, PointerGetDatum(argument));
    result = FunctionCall2(funcCallInfo,
                           PointerGetDatum(argument),
                           Int32GetDatum(argument));
Again, this style of coding does not allow for expressing NULL inputs
or receiving a NULL result.

As with the callee-side situation, I propose adding argument conversion
macros that hide the pass-by-reference nature of int8, float4, and
float8, with an eye to making those types relatively painless to convert
to pass-by-value.  For the value-to-pointer direction a little bit of
a trick is needed: these macros will take the address of their argument,
meaning that the argument must be a variable not an expression or a
compiler error will result.  So it's not *completely* transparent,
but the notational ugliness is minimal.
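
The address-of trick can be seen in a few lines. The macro definitions below follow the ones in the draft fmgr.h, while Datum, Pointer, and the toy_half function are simplified or hypothetical. Note that a Datum built this way points at the caller's variable, so it is only usable while that variable is still in scope.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef char *Pointer;
typedef double float8;

#define DatumGetPointer(X)  ((Pointer) (X))
#define PointerGetDatum(X)  ((Datum) (X))

/* Conversion macros in the style of the draft fmgr.h. */
#define DatumGetFloat8(x)   (* ((float8 *) DatumGetPointer(x)))
#define Float8GetDatum(x)   PointerGetDatum((Pointer) &(x))

/* Hypothetical caller that needs to pass value/2 as a float8 Datum. */
static float8
toy_half(float8 value)
{
    float8  halved = value / 2.0;   /* the value must live in a variable */
    Datum   d = Float8GetDatum(halved);

    /*
     * Float8GetDatum(value / 2.0) would be rejected by the compiler,
     * because & cannot be applied to an expression result.
     */
    assert(DatumGetPointer(d) == (Pointer) &halved);
    return DatumGetFloat8(d);
}
```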

The existing helper functions fmgr(), fmgr_c(), etc will be left in
place until all uses of them are gone.  Of course their internals will
have to change in the first step of implementation, but they can
continue to support the same external appearance.


Notes about function handlers
-----------------------------

Handlers for classes of functions should find life much easier and
cleaner in this design.  The OID of the called function is directly
reachable from the passed parameters; we don't need the global variable
fmgr_pl_finfo anymore.  Also, by modifying fcinfo->flinfo->fn_extra,
the handler can cache lookup info to avoid repeat lookups when the same
function is invoked many times.  (fn_extra can only be used as a hint,
since callers are not required to re-use an FmgrInfo struct.
But in performance-critical paths they normally will do so.)
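
A minimal sketch of a handler that treats fn_extra as a cache hint. All names prefixed with toy_ or Toy are hypothetical; the "compiled function" is reduced to a single multiplier chosen by OID, and the handler keeps its data in static storage so that it outlives any particular FmgrInfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16

typedef uintptr_t Datum;            /* simplified stand-in for the real Datum */
typedef unsigned int Oid;

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

typedef struct
{
    PGFunction  fn_addr;
    Oid         fn_oid;
    short       fn_nargs;
    bool        fn_strict;
    void       *fn_extra;
} FmgrInfo;

typedef struct FunctionCallInfoData /* trimmed to the fields used here */
{
    FmgrInfo   *flinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

/* Hypothetical per-function data a PL handler might prepare. */
typedef struct
{
    int         multiplier;
} ToyCompiledFunc;

static int toy_expensive_lookups = 0;

/* Pretend to locate and "compile" the function body for this OID. */
static ToyCompiledFunc *
toy_compile(Oid fn_oid)
{
    static ToyCompiledFunc cache[4];    /* long-lived, owned by the handler */
    ToyCompiledFunc *f = &cache[fn_oid % 4];

    toy_expensive_lookups++;
    f->multiplier = (int) fn_oid;
    return f;
}

/* Handler for a whole class of functions, selected by flinfo->fn_oid. */
static Datum
toy_pl_handler(FunctionCallInfo fcinfo)
{
    FmgrInfo   *flinfo = fcinfo->flinfo;
    ToyCompiledFunc *f;

    assert(flinfo != NULL);
    f = (ToyCompiledFunc *) flinfo->fn_extra;
    if (f == NULL)
    {
        /* No hint available: do the full lookup and remember it. */
        f = toy_compile(flinfo->fn_oid);
        flinfo->fn_extra = f;
    }
    return (Datum) ((int) fcinfo->arg[0] * f->multiplier);
}
```

Repeated calls through the same FmgrInfo pay for toy_compile once; a caller that builds a fresh FmgrInfo each time still gets the right answer, just without the shortcut.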

Issue: in what context should a handler allocate memory that it intends
to use for fn_extra data?  The current palloc context when the handler
is actually called might be considerably shorter-lived than the FmgrInfo
struct, which would lead to dangling-pointer problems at the next use
of the FmgrInfo.  Perhaps FmgrInfo should also store a memory context
identifier that the handler could use to allocate space of the right
lifespan.  (Having fmgr_info initialize this to CurrentMemoryContext
should work in nearly all cases, though a few places might have to
set it differently.)  At the moment I have not done this, since the
existing PL handlers only need to set fn_extra to point at long-lived
structures (data in their own caches) and don't really care which
context the FmgrInfo is in anyway.

Are there any other things needed by the call handlers for PL/pgsql and
other languages?

During the conversion process, support for old-style builtin functions
and old-style user-written C functions will be provided by appropriate
function handlers.  For example, the handler for old-style builtins
looks roughly like fmgr_c() used to.


System table updates
--------------------

In the initial phase, two new entries will be added to pg_language
for language types "newinternal" and "newC", corresponding to
builtin and dynamically-loaded functions having the new calling
convention.

There will also be a change to pg_proc to add the new "proisstrict"
column.

Then pg_proc entries will be changed from language code "internal" to
"newinternal" piecemeal, as the associated routines are rewritten.
(This will imply several rounds of forced initdbs as the contents of
pg_proc change, but I think we can live with that.)

The old language names "internal" and "C" will continue to refer to
functions with the old calling convention.  We should deprecate
old-style functions because of their portability problems, but the
support for them will only be one small function handler routine,
so we can leave them in place for as long as necessary.

The expected calling convention for PL call handlers will need to change
all-at-once, but fortunately there are not very many of them to fix.

/*-------------------------------------------------------------------------
 *
 * fmgr.h
 *    Definitions for the Postgres function manager and function-call
 *    interface.
 *
 * This file must be included by all Postgres modules that either define
 * or call fmgr-callable functions.
 *
 *
 * Portions Copyright (c) 1996-2000, PostgreSQL, Inc
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * $Id: fmgr.h,v 1.12 2000/01/26 05:58:38 momjian Exp $
 *
 *-------------------------------------------------------------------------
 */
#ifndef FMGR_H
#define FMGR_H


/*
 * All functions that can be called directly by fmgr must have this signature.
 * (Other functions can be called by using a handler that does have this
 * signature.)
 */

typedef struct FunctionCallInfoData    *FunctionCallInfo;

typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

/*
 * This struct holds the system-catalog information that must be looked up
 * before a function can be called through fmgr.  If the same function is
 * to be called multiple times, the lookup need be done only once and the
 * info struct saved for re-use.
 */
typedef struct
{
    PGFunction  fn_addr;    /* pointer to function or handler to be called */
    Oid         fn_oid;     /* OID of function (NOT of handler, if any) */
    short       fn_nargs;   /* 0..FUNC_MAX_ARGS, or -1 if variable arg count */
    bool        fn_strict;  /* function is "strict" (NULL in => NULL out) */
    void       *fn_extra;   /* extra space for use by handler */
} FmgrInfo;

/*
 * This struct is the data actually passed to an fmgr-called function.
 */
typedef struct FunctionCallInfoData
{
    FmgrInfo   *flinfo;         /* ptr to lookup info used for this call */
    struct Node *context;       /* pass info about context of call */
    struct Node *resultinfo;    /* pass or return extra info about result */
    bool        isnull;         /* function must set true if result is NULL */
    short       nargs;          /* # arguments actually passed */
    Datum       arg[FUNC_MAX_ARGS];     /* Arguments passed to function */
    bool        argnull[FUNC_MAX_ARGS]; /* T if arg[i] is actually NULL */
} FunctionCallInfoData;

/*
 * This routine fills a FmgrInfo struct, given the OID
 * of the function to be called.
 */
extern void fmgr_info(Oid functionId, FmgrInfo *finfo);

/*
 * This macro invokes a function given a filled-in FunctionCallInfoData
 * struct.  The macro result is the returned Datum --- but note that
 * caller must still check fcinfo->isnull!  Also, if function is strict,
 * it is caller's responsibility to verify that no null arguments are present
 * before calling.
 */
#define FunctionCallInvoke(fcinfo)  ((* (fcinfo)->flinfo->fn_addr) (fcinfo))


/*-------------------------------------------------------------------------
 *      Support macros to ease writing fmgr-compatible functions
 *
 * A C-coded fmgr-compatible function should be declared as
 *
 *      Datum
 *      function_name(PG_FUNCTION_ARGS)
 *      {
 *          ...
 *      }
 *
 * It should access its arguments using appropriate PG_GETARG_xxx macros
 * and should return its result using PG_RETURN_xxx.
 *
 *-------------------------------------------------------------------------
 */

/* Standard parameter list for fmgr-compatible functions */
#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo

/* If function is not marked "proisstrict" in pg_proc, it must check for
 * null arguments using this macro.  Do not try to GETARG a null argument!
 */
#define PG_ARGISNULL(n)  (fcinfo->argnull[n])

/* Macros for fetching arguments of standard types */

#define PG_GETARG_INT32(n)   DatumGetInt32(fcinfo->arg[n])
#define PG_GETARG_INT16(n)   DatumGetInt16(fcinfo->arg[n])
#define PG_GETARG_CHAR(n)    DatumGetChar(fcinfo->arg[n])
#define PG_GETARG_BOOL(n)    DatumGetBool(fcinfo->arg[n])
#define PG_GETARG_OID(n)     DatumGetObjectId(fcinfo->arg[n])
#define PG_GETARG_POINTER(n) DatumGetPointer(fcinfo->arg[n])
/* these macros hide the pass-by-reference-ness of the datatype: */
#define PG_GETARG_FLOAT4(n)  (* DatumGetFloat32(fcinfo->arg[n]))
#define PG_GETARG_FLOAT8(n)  (* DatumGetFloat64(fcinfo->arg[n]))
#define PG_GETARG_INT64(n)   (* (int64 *) PG_GETARG_POINTER(n))
/* use this if you want the raw, possibly-toasted input datum: */
#define PG_GETARG_RAW_VARLENA_P(n)  ((struct varlena *) PG_GETARG_POINTER(n))
/* use this if you want the input datum de-toasted: */
#define PG_GETARG_VARLENA_P(n)  \
    (VARATT_IS_EXTENDED(PG_GETARG_RAW_VARLENA_P(n)) ?  \
     (struct varlena *) heap_tuple_untoast_attr((varattrib *) PG_GETARG_RAW_VARLENA_P(n)) :  \
     PG_GETARG_RAW_VARLENA_P(n))

/* GETARG macros for varlena types will typically look like this: */
#define PG_GETARG_TEXT_P(n) ((text *) PG_GETARG_VARLENA_P(n))

/* To return a NULL do this: */
#define PG_RETURN_NULL()  \
    do { fcinfo->isnull = true; return (Datum) 0; } while (0)

/* Macros for returning results of standard types */

#define PG_RETURN_INT32(x)   return Int32GetDatum(x)
#define PG_RETURN_INT16(x)   return Int16GetDatum(x)
#define PG_RETURN_CHAR(x)    return CharGetDatum(x)
#define PG_RETURN_BOOL(x)    return BoolGetDatum(x)
#define PG_RETURN_OID(x)     return ObjectIdGetDatum(x)
#define PG_RETURN_POINTER(x) return PointerGetDatum(x)
/* these macros hide the pass-by-reference-ness of the datatype: */
#define PG_RETURN_FLOAT4(x)  \
    do { float4 *retval_ = (float4 *) palloc(sizeof(float4)); \
         *retval_ = (x); \
         return PointerGetDatum(retval_); } while (0)
#define PG_RETURN_FLOAT8(x)  \
    do { float8 *retval_ = (float8 *) palloc(sizeof(float8)); \
         *retval_ = (x); \
         return PointerGetDatum(retval_); } while (0)
#define PG_RETURN_INT64(x)  \
    do { int64 *retval_ = (int64 *) palloc(sizeof(int64)); \
         *retval_ = (x); \
         return PointerGetDatum(retval_); } while (0)
/* RETURN macros for other pass-by-ref types will typically look like this: */
#define PG_RETURN_TEXT_P(x)  PG_RETURN_POINTER(x)


/*-------------------------------------------------------------------------
 *      Support routines and macros for callers of fmgr-compatible functions
 *-------------------------------------------------------------------------
 */

/* These are for invocation of a specifically named function with a
 * directly-computed parameter list.  Note that neither arguments nor result
 * are allowed to be NULL.
 */
extern Datum DirectFunctionCall1(PGFunction func, Datum arg1);
extern Datum DirectFunctionCall2(PGFunction func, Datum arg1, Datum arg2);
extern Datum DirectFunctionCall3(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3);
extern Datum DirectFunctionCall4(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4);
extern Datum DirectFunctionCall5(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4, Datum arg5);
extern Datum DirectFunctionCall6(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4, Datum arg5,
                                 Datum arg6);
extern Datum DirectFunctionCall7(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4, Datum arg5,
                                 Datum arg6, Datum arg7);
extern Datum DirectFunctionCall8(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4, Datum arg5,
                                 Datum arg6, Datum arg7, Datum arg8);
extern Datum DirectFunctionCall9(PGFunction func, Datum arg1, Datum arg2,
                                 Datum arg3, Datum arg4, Datum arg5,
                                 Datum arg6, Datum arg7, Datum arg8,
                                 Datum arg9);

/* These are for invocation of a previously-looked-up function with a
 * directly-computed parameter list.  Note that neither arguments nor result
 * are allowed to be NULL.
 */
extern Datum FunctionCall1(FmgrInfo *flinfo, Datum arg1);
extern Datum FunctionCall2(FmgrInfo *flinfo, Datum arg1, Datum arg2);
extern Datum FunctionCall3(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3);
extern Datum FunctionCall4(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4);
extern Datum FunctionCall5(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4, Datum arg5);
extern Datum FunctionCall6(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4, Datum arg5,
                           Datum arg6);
extern Datum FunctionCall7(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4, Datum arg5,
                           Datum arg6, Datum arg7);
extern Datum FunctionCall8(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4, Datum arg5,
                           Datum arg6, Datum arg7, Datum arg8);
extern Datum FunctionCall9(FmgrInfo *flinfo, Datum arg1, Datum arg2,
                           Datum arg3, Datum arg4, Datum arg5,
                           Datum arg6, Datum arg7, Datum arg8,
                           Datum arg9);

/* These are for invocation of a function identified by OID with a
 * directly-computed parameter list.  Note that neither arguments nor result
 * are allowed to be NULL.  These are essentially FunctionLookup() followed
 * by FunctionCallN().  If the same function is to be invoked repeatedly,
 * do the FunctionLookup() once and then use FunctionCallN().
 */
extern Datum OidFunctionCall1(Oid functionId, Datum arg1);
extern Datum OidFunctionCall2(Oid functionId, Datum arg1, Datum arg2);
extern Datum OidFunctionCall3(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3);
extern Datum OidFunctionCall4(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4);
extern Datum OidFunctionCall5(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4, Datum arg5);
extern Datum OidFunctionCall6(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4, Datum arg5,
                              Datum arg6);
extern Datum OidFunctionCall7(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4, Datum arg5,
                              Datum arg6, Datum arg7);
extern Datum OidFunctionCall8(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4, Datum arg5,
                              Datum arg6, Datum arg7, Datum arg8);
extern Datum OidFunctionCall9(Oid functionId, Datum arg1, Datum arg2,
                              Datum arg3, Datum arg4, Datum arg5,
                              Datum arg6, Datum arg7, Datum arg8,
                              Datum arg9);

/* The parameters and results of FunctionCallN() and friends should be
 * converted to and from Datum using the XXXGetDatum and DatumGetXXX
 * macros of c.h, plus these additional macros (perhaps these should be
 * moved to c.h?).  These macros exist to hide the pass-by-reference
 * nature of a few of our basic datatypes, with the thought that these
 * types might someday become pass-by-value.  Pass-by-reference is not
 * completely hidden, because you can only hand a variable of the right
 * type to these XXXGetDatum macros; no constants or expressions!
 */
#define DatumGetFloat4(x)  (* ((float4 *) DatumGetPointer(x)))
#define DatumGetFloat8(x)  (* ((float8 *) DatumGetPointer(x)))
#define DatumGetInt64(x)   (* ((int64 *) DatumGetPointer(x)))
#define Float4GetDatum(x)  PointerGetDatum((Pointer) &(x))
#define Float8GetDatum(x)  PointerGetDatum((Pointer) &(x))
#define Int64GetDatum(x)   PointerGetDatum((Pointer) &(x))


/*
 * Routines in fmgr.c
 */
extern Oid fmgr_internal_language(const char *proname);

/*
 * Routines in dfmgr.c
 */
extern PGFunction fmgr_dynamic(Oid functionId);
extern PGFunction load_external_function(char *filename, char *funcname);
extern void load_file(char *filename);


/*-------------------------------------------------------------------------
 *
 * !!! OLD INTERFACE !!!
 *
 * All the definitions below here are associated with the old fmgr API.
 * They will go away as soon as we have converted all call points to use
 * the new API.  Note that old-style callee functions do not depend on
 * these definitions, so we don't need to have converted all of them before
 * dropping the old API ... just all the old-style call points.
 *
 *-------------------------------------------------------------------------
 */

/* ptr to func returning (char *) */
#if defined(__mc68000__) && defined(__ELF__)
/* The m68k SVR4 ABI defines that pointers are returned in %a0 instead of
 * %d0. So if a function pointer is declared to return a pointer, the
 * compiler may look only into %a0, but if the called function was declared
 * to return an integer type, it puts its value only into %d0. So the
 * caller doesn't pick up the correct return value. The solution is to
 * declare the function pointer to return int, so the compiler picks up the
 * return value from %d0. (Functions returning pointers put their value
 * *additionally* into %d0 for compatibility.) The price is that there are
 * some warnings about int->pointer conversions...
 */
typedef int32 ((*func_ptr) ());
#else
typedef char *((*func_ptr) ());
#endif

typedef struct
{
    char *data[FUNC_MAX_ARGS];
} FmgrValues;

/*
 * defined in fmgr.c
 */
extern char *fmgr(Oid procedureId, ... );
extern char *fmgr_faddr_link(char *arg0, ...);

/*
 *    Macros for calling through the result of fmgr_info.
 */

/* We don't make this static so fmgr_faddr() macros can access it */
extern FmgrInfo        *fmgr_pl_finfo;

#define fmgr_faddr(finfo) (fmgr_pl_finfo = (finfo), (func_ptr) fmgr_faddr_link)

#define    FMGR_PTR2(FINFO, ARG1, ARG2)  ((*(fmgr_faddr(FINFO))) (ARG1, ARG2))

/*
 *    Flags for the builtin oprrest selectivity routines.
 *  XXX These do not belong here ... put 'em in some planner/optimizer header.
 */
#define    SEL_CONSTANT     1        /* operator's non-var arg is a constant */
#define    SEL_RIGHT    2            /* operator's non-var arg is on the right */

#endif    /* FMGR_H */

Re: Last call for comments: fmgr rewrite [LONG]

From
Chris Bitmead
Date:
Tom Lane wrote:

> typedef struct
> {
>     FmgrInfo   *flinfo;         /* ptr to lookup info used for this call */
>     Node       *context;        /* pass info about context of call */
>     Node       *resultinfo;     /* pass or return extra info about result */
>     bool        isnull;         /* function must set true if result is NULL */
>     short       nargs;          /* # arguments actually passed */
>     Datum       arg[FUNC_MAX_ARGS];  /* Arguments passed to function */
>     bool        argnull[FUNC_MAX_ARGS];  /* T if arg[i] is actually NULL */
> } FunctionCallInfoData;

Just wondering what the implications of FUNC_MAX_ARGS is, and whether
something like...

struct FuncArg
{
    Datum arg;
    bool argnull;
};

typedef struct
{
    FmgrInfo   *flinfo;         /* ptr to lookup info used for this call */
    Node       *context;        /* pass info about context of call */
    Node       *resultinfo;     /* pass or return extra info about result */
    bool        isnull;         /* function must set true if result is NULL */
    short       nargs;          /* # arguments actually passed */
    struct FuncArg args[];
} FunctionCallInfoData;

might remove an arbitrary argument limit?

> int32
> int4pl(int32 arg1, int32 arg2)
> {
>     return arg1 + arg2;
> }
> to new-style
> Datum
> int4pl(FunctionCallInfo fcinfo)
> {
>     /* we assume the function is marked "strict", so we can ignore
>      * NULL-value handling */
> 
>     return Int32GetDatum(DatumGetInt32(fcinfo->arg[0]) +
>                          DatumGetInt32(fcinfo->arg[1]));
> }


Wondering if some stub code generator might be appropriate so that
functions can continue to look as readable as before?


Re: Last call for comments: fmgr rewrite [LONG]

From
Bruce Momjian
Date:
> I attach the current draft of my document explaining the upcoming
> function manager interface changes.  This has been modified from the
> previously circulated version on the basis of comments from Jan and
> others.  There is also a preliminary fmgr.h showing actual code for
> the proposed interface macros.

Frankly, everything is very quiet.  I have no problem branching the CVS
tree and getting started soon, if people want that.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Just wondering what the implications of FUNC_MAX_ARGS is, and whether
> something like...

> struct FuncArg 
> {
>    Datum arg;
>    bool argnull;
> };

I did consider that but it's probably not worth near-doubling the size
of the struct (think about how that will pack, especially if Datum
becomes 8 bytes).  The average callee will probably not be looking at
the argnull array at all, so it won't have a dependency on the offset to
argnull in the first place.  Furthermore FUNC_MAX_ARGS is not going to
vanish in the foreseeable future; we have fixed-size arrays in places
like pg_proc and there's just not enough reason to go to the pain of
making those variable-size.  So the only possible win would be to make
dynamically loaded functions binary-compatible across installations with
varying FUNC_MAX_ARGS values ... and since that'd matter only if they
looked at argnull *and* not at any other structure that depends on
FUNC_MAX_ARGS, it's probably not worth it.
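
To put rough numbers on the packing argument, here is a minimal sketch,
not code from the draft.  It assumes Datum is one pointer-sized word
(modelled with uintptr_t), that bool is the C99 type, and that
FUNC_MAX_ARGS is 16 as quoted in this thread; the struct names
ParallelArgs and InterleavedArgs are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16        /* value quoted in this thread */

typedef uintptr_t Datum;        /* assumption: one pointer-sized word */

/* Layout used by the proposal: two parallel arrays. */
struct ParallelArgs
{
    Datum   arg[FUNC_MAX_ARGS];
    bool    argnull[FUNC_MAX_ARGS];
};

/* Suggested alternative: value and null flag kept together. */
struct FuncArg
{
    Datum   arg;
    bool    argnull;
};

struct InterleavedArgs
{
    struct FuncArg args[FUNC_MAX_ARGS];
};

/* Extra bytes the interleaved layout needs, mostly alignment padding. */
static size_t
interleaving_overhead(void)
{
    return sizeof(struct InterleavedArgs) - sizeof(struct ParallelArgs);
}
```

On a typical 64-bit ABI with an 8-byte Datum the parallel layout should
come to 144 bytes and the interleaved one to 256, because each one-byte
flag is padded out to Datum alignment; with a 4-byte Datum the figures
would be 80 and 128.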

> Wondering if some stub code generator might be appropriate so that
> functions can continue to look as readable as before?

Er, did you read to the end of the proposal?
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Frankly, everything is very quiet.  I have no problem branching the CVS
> tree and getting started soon, if people want that.

Yeah, it seems like we could do a 7.0.1 and make the 7.1 CVS branch
sooner than the end of the month.  Maybe sometime this week?
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Chris Bitmead
Date:
Tom Lane wrote:
> 
> Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> > Just wondering what the implications of FUNC_MAX_ARGS is, and whether
> > something like...
> 
> > struct FuncArg
> > {
> >    Datum arg;
> >    bool argnull;
> > };
> 
> I did consider that but it's probably not worth near-doubling the size
> of the struct (think about how that will pack, especially if Datum
> becomes 8 bytes). 

But FUNC_MAX_ARGS is currently 16. 98% of functions are probably 1 or 2
arguments. So your way you always use 144 bytes. With my proposal most
will use 16 or 32 bytes because of the variable struct size and you
won't have an arbitrary limit of 16 args.

> Furthermore FUNC_MAX_ARGS is not going to
> vanish in the foreseeable future; we have fixed-size arrays in places
> like pg_proc and there's just not enough reason to go to the pain of
> making those variable-size.

Well if anybody ever wanted to do it, not having to re-write every
function in the system would be a nice win. Maybe there are other wins
we don't see yet in not having a fixed limit?

> So the only possible win would be to make
> dynamically loaded functions binary-compatible across installations with
> varying FUNC_MAX_ARGS values ... and since that'd matter only if they
> looked at argnull *and* not at any other structure that depends on
> FUNC_MAX_ARGS, it's probably not worth it.

Hmm. Looks like a possible future win to me. Anybody who has a library
of functions might not have to recompile.

> > Wondering if some stub code generator might be appropriate so that
> > functions can continue to look as readable as before?
> 
> Er, did you read to the end of the proposal?

Yep. Did I miss your point?


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Tom Lane wrote:
>> I did consider that but it's probably not worth near-doubling the size
>> of the struct (think about how that will pack, especially if Datum
>> becomes 8 bytes). 

> But FUNC_MAX_ARGS is currently 16. 98% of functions are probably 1 or 2
> arguments. So your way you always use 144 bytes. With my proposal most
> will use 16 or 32 bytes because of the variable struct size and you
> won't have an arbitrary limit of 16 args.

No, because we aren't ever going to be dynamically allocating these
things; they'll be local variables in the calling function.  Typical
code looks like this:

static Datum
ExecMakeFunctionResult(Node *node, List *arguments, ExprContext *econtext,
                       bool *isNull, bool *isDone)
{
    FunctionCallInfoData    fcinfo;
    Datum                   result;

    MemSet(&fcinfo, 0, sizeof(fcinfo));

    /* ... fill non-defaulted fields of fcinfo here ... */

    result = FunctionCallInvoke(&fcinfo);
    *isNull = fcinfo.isnull;
    return result;
}

To take advantage of a variable-length struct we'd need to do a palloc,
which is pointless and slow.  The only reason I care about the size of
the struct at all is that I don't want that MemSet() to take longer
than it has to.  (While I don't absolutely have to zero the whole
struct, it's simple and clean to do that, and it ensures that unused
fields will have a predictable value.)
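
For anyone who wants to try the calling pattern outside the backend,
here is a self-contained sketch under stated assumptions.  The types are
cut-down stand-ins modelled on the draft header (only the fields used
here), MemSet is replaced by plain memset, and call_two_args is a
hypothetical helper name, not part of the proposed API.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FUNC_MAX_ARGS 16                /* value quoted in this thread */

typedef uintptr_t Datum;                /* stand-in for the real Datum */

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

/* Stand-in lookup info: only the function address is modelled. */
typedef struct FmgrInfo
{
    PGFunction  fn_addr;
} FmgrInfo;

/* Stand-in call record, reduced to the fields this sketch touches. */
typedef struct FunctionCallInfoData
{
    FmgrInfo   *flinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

#define Int32GetDatum(X)  ((Datum) (X))
#define DatumGetInt32(X)  ((int32_t) (X))
#define FunctionCallInvoke(fcinfo)  ((*(fcinfo)->flinfo->fn_addr) (fcinfo))

/* New-style callee; assumed strict, so it never looks at argnull. */
static Datum
int4pl(FunctionCallInfo fcinfo)
{
    return Int32GetDatum(DatumGetInt32(fcinfo->arg[0]) +
                         DatumGetInt32(fcinfo->arg[1]));
}

/* Hypothetical caller: the call record is an ordinary local variable. */
static Datum
call_two_args(FmgrInfo *flinfo, Datum a, Datum b, bool *isNull)
{
    FunctionCallInfoData fcinfo;        /* on the stack, no palloc */
    Datum       result;

    memset(&fcinfo, 0, sizeof(fcinfo)); /* unused fields become 0/NULL */
    fcinfo.flinfo = flinfo;
    fcinfo.nargs = 2;
    fcinfo.arg[0] = a;
    fcinfo.arg[1] = b;

    result = FunctionCallInvoke(&fcinfo);
    *isNull = fcinfo.isnull;
    return result;
}
```

The cost that matters in this pattern is the memset over the whole
struct on every call, which is why the struct size is worth watching
even though the storage itself is free.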

Bottom line is that there *will* be a FUNC_MAX_ARGS limit.  The only
question is whether there's any point in making the binary-level API
for called functions be independent of the exact value of FUNC_MAX_ARGS.
I kinda doubt it.  There are a lot of other things that are more likely
to vary across installations than FUNC_MAX_ARGS; I don't see this as
being the limiting factor for portability.

> Well if anybody ever wanted to do it, not having to re-write every
> function in the system would be a nice win.

We already did the legwork of not having to rewrite anything.  It's
only a config.h twiddle and recompile.  I think that's plenty close
enough...

>>>> Wondering if some stub code generator might be appropriate so that
>>>> functions can continue to look as readable as before?
>> 
>> Er, did you read to the end of the proposal?

> Yep. Did I miss your point?

Possibly, or else I'm missing yours.  What would a stub code generator
do for us that the proposed GETARG and RETURN macros won't do?
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Chris Bitmead
Date:
Tom Lane wrote:

> No, because we aren't ever going to be dynamically allocating these
> things; they'll be local variables in the calling function. 

Fair enough then. Although that being the case, I don't see the big deal
about using a few more bytes of stack space which costs absolutely
nothing, even though the binary compatibility is a small but still real
advantage.

> >>>> Wondering if some stub code generator might be appropriate so that
> >>>> functions can continue to look as readable as before?
> >>
> >> Er, did you read to the end of the proposal?
> 
> > Yep. Did I miss your point?
> 
> Possibly, or else I'm missing yours.  What would a stub code generator
> do for us that the proposed GETARG and RETURN macros won't do?

Only that it might be slightly cleaner code, but you're probably right.
I just have experience doing this sort of thing and know that manually
grabbing each argument can be painful with hundreds of functions.


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
>> Possibly, or else I'm missing yours.  What would a stub code generator
>> do for us that the proposed GETARG and RETURN macros won't do?

> Only that it might be slightly cleaner code, but you're probably right.
> I just have experience doing this sort of thing and know that manually
> grabbing each argument can be painful with hundreds of functions.

The conversion is going to be a major pain in the rear, no doubt about
that :-(.  I suspect it may take us more than one release cycle to get
rid of all the old-style functions in the distribution, and we perhaps
will never be able to drop support for old-style dynamically loaded
functions.

OTOH, I also have experience with code preprocessors and they're no fun
either in an open-source environment.  You gotta port the preprocessor
to everywhere you intend to run, make it robust against a variety of
coding styles, etc etc.  Don't really want to go there.

On the third hand, you've got the germ of an idea: maybe a really
quick-and-dirty script would be worth writing to do some of the basic
conversion editing.  It wouldn't have to be bulletproof because we
would go over the results by hand anyway, but it could help...
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Bruce Momjian
Date:
> Tom Lane wrote:
> 
> > No, because we aren't ever going to be dynamically allocating these
> > things; they'll be local variables in the calling function. 
> 
> Fair enough then. Although that being the case, I don't see the big deal
> about using a few more bytes of stack space which costs absolutely
> nothing, even though the binary compatibility is a small but still real
> advantage.

I like Tom's clean design better.  Flexibility for little payback
usually just messes up clarity of the code.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Last call for comments: fmgr rewrite [LONG]

From
Chris Bitmead
Date:
Tom Lane wrote:

> OTOH, I also have experience with code preprocessors and they're no fun
> either in an open-source environment.  You gotta port the preprocessor
> to everywhere you intend to run, make it robust against a variety of
> coding styles, etc etc.  Don't really want to go there.

I was thinking of something more along the lines of a Corba idl code
generator, only simpler. Maybe as simple as a file like:

int4plus: INT4, INT4
int4minus: INT4, INT4
etc...

that gets generated into some stubs that call the real code...

Datum
int4pl_stub(PG_FUNCTION_ARGS)
{
    int32   arg1 = PG_GETARG_INT32(0);
    int32   arg2 = PG_GETARG_INT32(1);

    PG_RETURN_INT32(int4pl(arg1, arg2));
}


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> I was thinking of something more along the lines of a Corba idl code
> generator, only simpler. Maybe as simple as a file like:

> int4plus: INT4, INT4
> int4minus: INT4, INT4
> etc...

> that gets generated into some stubs that call the real code...

> Datum
> int4pl_stub(PG_FUNCTION_ARGS)
> {
>     int32   arg1 = PG_GETARG_INT32(0);
>     int32   arg2 = PG_GETARG_INT32(1);

>     PG_RETURN_INT32(int4pl(arg1, arg2));
> }

OK ... but I don't think we want to leave a useless extra level of
function call in the code forever.  What I'm starting to visualize
is a simple editing script that adds the above decoration to an existing
function definition, and then you go back and do any necessary cleanup
by hand.  There is a lot of cruft that we should be able to rip out of
the existing code (checks for NULL arguments that are no longer needed
if the function is declared strict, manipulation of pass-by-ref args
and results for float4/float8/int8 datatypes, etc etc) so a hand
editing pass will surely be needed.  But maybe we could mechanize
creation of the basic GETARG/RETURN decorations...
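
As a rough illustration of what a mechanically decorated function could
look like once the decoration is applied in place rather than as a
separate stub, here is a sketch.  The macro definitions are simplified
stand-ins written for this example (the draft fmgr.h is the authority),
and the struct is reduced to the fields the example needs.

```c
#include <stdbool.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16            /* value quoted in this thread */

typedef uintptr_t Datum;            /* stand-in for the real Datum */
typedef int32_t int32;

/* Stand-in call record, reduced to what this sketch needs. */
typedef struct FunctionCallInfoData
{
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

typedef FunctionCallInfoData *FunctionCallInfo;

/* Simplified stand-ins for the proposed interface macros. */
#define DatumGetInt32(X)    ((int32) (X))
#define Int32GetDatum(X)    ((Datum) (X))
#define PG_FUNCTION_ARGS    FunctionCallInfo fcinfo
#define PG_GETARG_INT32(n)  DatumGetInt32(fcinfo->arg[n])
#define PG_RETURN_INT32(x)  return Int32GetDatum(x)

/*
 * Old body "return arg1 + arg2;" with the decoration applied in place:
 * the arguments become locals fetched by GETARG, the return becomes a
 * RETURN macro, and no extra function call level remains.
 */
Datum
int4pl(PG_FUNCTION_ARGS)
{
    int32       arg1 = PG_GETARG_INT32(0);
    int32       arg2 = PG_GETARG_INT32(1);

    PG_RETURN_INT32(arg1 + arg2);
}
```

With these stand-ins the RETURN macro supplies the return keyword
itself, so it is written as a statement of its own.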
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Chris Bitmead
Date:
Bruce Momjian wrote:
> 
> > Tom Lane wrote:
> >
> > > No, because we aren't ever going to be dynamically allocating these
> > > things; they'll be local variables in the calling function.
> >
> > Fair enough then. Although that being the case, I don't see the big deal
> > about using a few more bytes of stack space which costs absolutely
> > nothing, even though the binary compatibility is a small but still real
> > advantage.
> 
> I like Tom's clean design better.  Flexibility for little payback
> usually just messes up clarity of the code.

I tend to think grouping data that belongs together as by definition
"clean". Whenever I'm tempted to have concurrent arrays like this I
always pull back because it seems to lead to major pain later. For
example, I can see situations where I'd like to pass an argument around
together with its is-null information...


struct FuncArg
{
    Datum arg;
    bool argnull;
};

typedef struct
{
    struct FuncArg args[];
} FunctionCallInfoData;

Datum someFunc(FunctionCallInfo fcinfo)
{
    return INT32(foo(fcinfo->args[0]) +
                 bar(fcinfo->args[1], fcinfo->args[2]));
}

int foo(struct FuncArg a)
{
    if ((a.argnull && INT32(a.arg) > 0) ||
        (!a.argnull && INT32(a.arg) <= 0))
        return 3;
    else
        return 4;
}

int bar(struct FuncArg a, struct FuncArg b)
{
    if (a.argnull || !b.argnull)
        return 0;
    else
        return INT32(a.arg) ~ INT32(b.arg);
}


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Whenever I'm tempted to have concurrent arrays like this I always pull
> back because it seems to lead to major pain later. For example, I can
> see situations where I'd like to pass an argument around together with
> its is-null information...

That's not an unreasonable point ... although most of the existing code
that needs to do that seems to need additional values as well (the
datum's type OID, length, pass-by-ref flag are commonly needed).
Something close to the Const node type is what you tend to end up with.
The fmgr interface is (and should be, IMHO) optimized for the case where
the called code knows exactly what it's supposed to get and doesn't need
the overhead info.  In particular, the vast majority of C-coded
functions in the backend should be marked 'strict' in pg_proc, and will
then not need to bother with argnull at all...
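
A sketch of how that division of labour might look, under stated
assumptions: the fn_strict flag and the helper invoke_respecting_strict
are names chosen for this example, and the types are cut-down stand-ins
rather than the draft definitions.  The caller side checks the null
flags once, so the strict callee can read arg[] unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FUNC_MAX_ARGS 16                /* value quoted in this thread */

typedef uintptr_t Datum;                /* stand-in for the real Datum */

typedef struct FunctionCallInfoData *FunctionCallInfo;
typedef Datum (*PGFunction) (FunctionCallInfo fcinfo);

typedef struct FmgrInfo
{
    PGFunction  fn_addr;
    bool        fn_strict;      /* assumed flag: NULL in => NULL out */
} FmgrInfo;

typedef struct FunctionCallInfoData
{
    FmgrInfo   *flinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

#define Int32GetDatum(X)  ((Datum) (X))
#define DatumGetInt32(X)  ((int32_t) (X))

static int  callee_calls = 0;   /* instrumentation for this sketch only */

/* Strict callee: never consults argnull. */
static Datum
int4pl(FunctionCallInfo fcinfo)
{
    callee_calls++;
    return Int32GetDatum(DatumGetInt32(fcinfo->arg[0]) +
                         DatumGetInt32(fcinfo->arg[1]));
}

/* Hypothetical caller-side wrapper that enforces strictness. */
static Datum
invoke_respecting_strict(FunctionCallInfo fcinfo)
{
    if (fcinfo->flinfo->fn_strict)
    {
        for (int i = 0; i < fcinfo->nargs; i++)
        {
            if (fcinfo->argnull[i])
            {
                fcinfo->isnull = true;  /* result is NULL ... */
                return (Datum) 0;       /* ... and callee is skipped */
            }
        }
    }
    return (*fcinfo->flinfo->fn_addr) (fcinfo);
}
```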
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Hannu Krosing
Date:
Tom Lane wrote:
> 
> ------------------------------------------------------------------------------
> Proposal for function-manager redesign                  21-May-2000
> --------------------------------------
> 
> 
> Note that neither the old function manager nor the redesign are intended
> to handle functions that accept or return sets.  Those sorts of functions
> need to be handled by special querytree structures.

Does the redesign allow functions that accept/return tuples ?

On my first reading at least I did not notice it.

-----------
Hannu


Re: Last call for comments: fmgr rewrite [LONG]

From
Hannu Krosing
Date:
Tom Lane wrote:
> 
> Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> > Whenever I'm tempted to have concurrent arrays like this I always pull
> > back because it seems to lead to major pain later. For example, I can
> > see situations where I'd like to pass an argument around together with
> > its is-null information...
> 
> That's not an unreasonable point ... although most of the existing code
> that needs to do that seems to need additional values as well (the
> datum's type OID, length, pass-by-ref flag are commonly needed).
> Something close to the Const node type is what you tend to end up with.
> The fmgr interface is (and should be, IMHO) optimized for the case where
> the called code knows exactly what it's supposed to get and doesn't need
> the overhead info.

It may be true for C functions, but functions in higher level languages 
often like to be able to operate on several types of arguments (or at least 
to operate on both NULL and NOT NULL args)

> In particular, the vast majority of C-coded functions in the backend
> should be marked 'strict' in pg_proc, and will then not need to bother
> with argnull at all...

But the main aim of fmgr redesign is imho _not_ to make existing functions 
work better but to enable a clean way for designing new functions/languages.

I'm probably wrong, but to me it seems that the current proposal solves only 
the problem with NULLs, and leaves untouched the other problem of arbitrary 
restrictions on number of arguments (unless argcount > MAX is meant to be 
passed using VARIABLE i.e. -1)

------------------------
Hannu


Re: Last call for comments: fmgr rewrite [LONG]

From
Tom Lane
Date:
JanWieck@t-online.de (Jan Wieck) writes:
>     I'm  not  totally  sure  what  you  mean  with the ugly, non-
>     reentrant   kluge.    I    assume    it's    this    annoying
>     setjmp()/longjmp() juggling - isn't it?

No, I was unhappy with the global variables like fmgr_pl_finfo and
CurrentTriggerData.  As you say, error handling in the PL managers
is pretty ugly, but I don't see a way around that --- and at least
the ugliness is localized ;-)

>     A new querytree structure cannot gain  it,  if  the  function
>     manager  cannot  handle  it.  At  least we need to define how
>     tuple sets as arguments and results should be handled in  the
>     future,  and  define  the  fmgr  interface  according to that
>     already.

At the moment I'm satisfied to have a trapdoor that allows extension of
the fmgr interface --- that's what the context and resultinfo fields are
intended for.  In my mind this is a limited redesign of one specific API
for limited objectives.  If we try to turn the project into "fix
everything anyone could possibly want for functions" then nothing will
get done at all...

>> resultinfo is NULL when calling any function from which a simple Datum
>> result is expected.  It may point to some subtype of Node if the function
>> returns more than a Datum.  Like the context field, resultinfo is a hook
>> for expansion; fmgr itself doesn't constrain the use of the field.

>     Good  place  to  put  in  a  tuple descriptor for [SET] tuple
>     return types.  But the same type  of  information  should  be
>     there per argument.

The context field could be used to pass additional information about
arguments, too.  Actually, the way things are currently coded, it
wouldn't be hard to throw in more extension pointers like context
and resultinfo, so long as they are defined to default to NULL for
simple calls of functions accepting and returning Datums.  As I was
remarking to Chris, I have some concern about not bloating the struct,
but a pointer or two more or less won't hurt.
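
To show the shape of such an extension hook, here is a sketch only.
The node tag T_ArgSummary, the ArgSummary struct and the function
sum_int4_args are invented for this example and are not part of the
proposal; the other types are cut-down stand-ins.  A callee that knows
about the extension checks that resultinfo is non-NULL and carries the
tag it expects, while a plain caller that leaves the field NULL sees
ordinary Datum-in, Datum-out behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 16            /* value quoted in this thread */

typedef uintptr_t Datum;            /* stand-in for the real Datum */

typedef enum NodeTag
{
    T_Invalid = 0,
    T_ArgSummary                    /* hypothetical tag */
} NodeTag;

typedef struct Node
{
    NodeTag     type;
} Node;

/* Read the tag through a pointer to the first member. */
#define nodeTag(p)  (*(const NodeTag *) (p))

/* Hypothetical Node subtype a caller may pass via resultinfo. */
typedef struct ArgSummary
{
    NodeTag     type;
    int         nargs_seen;
} ArgSummary;

typedef struct FunctionCallInfoData
{
    Node       *context;            /* NULL for simple calls */
    Node       *resultinfo;         /* NULL for simple calls */
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

typedef FunctionCallInfoData *FunctionCallInfo;

/* Sums however many int4 arguments were passed, as told by nargs. */
static Datum
sum_int4_args(FunctionCallInfo fcinfo)
{
    int32_t     sum = 0;

    for (int i = 0; i < fcinfo->nargs; i++)
        sum += (int32_t) fcinfo->arg[i];

    /* Fill in the extra result only if the caller asked for it. */
    if (fcinfo->resultinfo != NULL &&
        nodeTag(fcinfo->resultinfo) == T_ArgSummary)
        ((ArgSummary *) fcinfo->resultinfo)->nargs_seen = fcinfo->nargs;

    return (Datum) sum;
}
```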

>     At  this  point I'd like to add another relkind we might want
>     to have.  This relkind  just  describes  a  tuple  structure,
>     without having a heap or rules. Only to define a complex type
>     to be used in function declarations.

Could be a good idea.  In the original Postgres code it seems the only
way to define a tuple type is to create a table with that structure
--- but maybe you have no intention of using the table, and only want
the type...

>> It is generally the responsibility of the caller to ensure that the
>> number of arguments passed matches what the callee is expecting; except
>> for callees that take a variable number of arguments, the callee will
>> typically ignore the nargs field and just grab values from arg[].

>     If you already think about calling  the  same  function  with
>     variable number of arguments, where are the argtypes?

Not fmgr's problem --- it doesn't know a thing about the argument or
result types.  I'm not sure that the variable-arguments business will
ever really get implemented; I just wanted to be sure that these data
structures could represent it if we do want to implement it.

>> For TOAST-able data types, the PG_GETARG macro will deliver a de-TOASTed
>> data value.  There might be a few cases where the still-toasted value is
>> wanted, but I am having a hard time coming up with examples.

>     length()   and   octetlength()  are  good  candidates.

OK, so it will be possible to get at the still-toasted value.

>     For the two PL handlers I wrote that's enough.  They  always
>     store  their  own  private  information  in their own private
>     memory. Having some place there which is initialized to NULL,
>     where  they  can  leave  a  pointer to avoid a lookup at each
>     invocation is perfect.

Yes, I've already changed them to do this ;-).

>> In the initial phase, two new entries will be added to pg_language
>> for language types "newinternal" and "newC", corresponding to
>> builtin and dynamically-loaded functions having the new calling
>> convention.

>     I would prefer "internal_ext" and "C_ext".

Someone else suggested renaming the old language types to "oldXXX"
and giving the new ones pride of place with the basic names "internal"
and "C".  For the internal functions we could do this if we like.
For dynamically loaded functions we will break existing code (or at
least the CREATE FUNCTION scripts for it) if we don't stick with "C"
as the name for the old-style interface.  Is that worth the long-term
niceness of a simple name for the new-style interface?  I went for
compatibility but I won't defend it very hard.  Comments anyone?

>     What I'm missing (don't know  which  of  these  are  standard
>     compliant):

>         Extending  the  system  catalog to give arguments a name.

>         Extending  the  system  catalog to provide default values
>         for arguments.

>         Extending call semantics so  functions  can  have  INPUT,
>         OUTPUT and INOUT arguments.

None of these are fmgr's problem AFAICS, nor do I see a reason to
add them to the current work proposal.  They look like a future
project to me...
        regards, tom lane


Re: Last call for comments: fmgr rewrite [LONG]

From
Peter Eisentraut
Date:
I just got my hands on the real SQL99 stuff, dated September 1999, and it
contains a function creation syntax that is strikingly similar to ours,
which would make it a shame not to at least try to play along. Below is a
heavily reduced BNF which should give you some idea -- note in particular
the NULL call conventions. Download your copy at
<ftp://jerry.ece.umassd.edu/isowg3/x3h2/Standards/>.

        <schema function> ::=
             CREATE FUNCTION <schema qualified name>
               <SQL parameter declaration list>
               RETURNS <data type>
               [ <routine characteristics>... ]
               [ <dispatch clause> ]
               <routine body>

        <dispatch clause> ::= STATIC DISPATCH        /* no idea */

        <SQL parameter declaration list> ::=
             <left paren>
               [ <SQL parameter declaration> [ { <comma> <SQL parameter declaration> }... ] ]
             <right paren>

        <SQL parameter declaration> ::=
               [ <parameter mode> ] [ <SQL parameter name> ]
               <parameter type>
               [ RESULT ]

        <parameter mode> ::= IN | OUT | INOUT    /* default is IN */

        <routine body> ::=
               <SQL routine body>
             | <external body reference>

        <SQL routine body> ::= <SQL procedure statement>
             /* which means a particular subset of SQL statements */

        <external body reference> ::=
             EXTERNAL [ NAME <external routine name> ]
             [ <parameter style clause> ]
             [ <external security clause> ]

        <routine characteristic> ::=
               LANGUAGE { ADA | C | COBOL | FORTRAN | MUMPS | PASCAL | PLI | SQL }
             | PARAMETER STYLE { SQL | GENERAL }
             | SPECIFIC <specific name>    /* apparently to disambiguate overloaded functions */
             | { DETERMINISTIC | NOT DETERMINISTIC }
             | { NO SQL | CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA }
             | { RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT }
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
             | <transform group specification>
             | <dynamic result sets characteristic>
 


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden