Thread: Function parameter names

Function parameter names

From
Dennis Bjorklund
Date:
I'm in the middle of implementing function parameter names. In pg_proc 
the types of the parameters are currently stored as an oidvector, but 
how should I store the parameter names?

Would it be a good idea to create a namevector in the same way as
oidvector? Would a normal array like name[] be better?

What is the reason for the oidvector to be it's own type instead of a 
oid[]?

-- 
/Dennis



Re: Function parameter names

From
Peter Eisentraut
Date:
Dennis Bjorklund writes:

> I'm in the middle of implementing function parameter names.

How will that work in arbitrary procedural languages?

> Would it be a good idea to create a namevector in the same way as
> oidvector? Would a normal array like name[] be better?
>
> What is the reason for the oidvector to be it's own type instead of a
> oid[]?

An oidvector is fixed length, so you can access it via a C struct, like
pgprocval->proargtypes[2].  With a normal array, things get much more
complicated, and that cost would be fairly large if you want to resolve
the argument types of function calls.  (This is the reason that the
maximum number of function parameters is limited to some arbitrary
number.)

So these two reasons make a "namevector" impractical: First, it would
probably not be in the performance critical path.  Second, it would use up
a fixed length of NAMEDATALEN * FUNC_MAX_ARGS (currently 1024 bytes) in
every pg_proc row.  In this case, a regular name[] would be more suitable.
Just be sure to put it after all the fixed-length fields.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Function parameter names

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Dennis Bjorklund writes:
>> I'm in the middle of implementing function parameter names.

> So these two reasons make a "namevector" impractical: First, it would
> probably not be in the performance critical path.  Second, it would use up
> a fixed length of NAMEDATALEN * FUNC_MAX_ARGS (currently 1024 bytes) in
> every pg_proc row.  In this case, a regular name[] would be more suitable.
> Just be sure to put it after all the fixed-length fields.

Actually I'd suggest text[], as there is no good reason to pad the
array entries to a fixed length.
        regards, tom lane


Re: Function parameter names

From
Dennis Bjorklund
Date:
On Sun, 23 Nov 2003, Peter Eisentraut wrote:

> How will that work in arbitrary procedural languages?

It is passed along to the handler of the language, and if the language
wants, it can insert these into its environment before execution. I plan
to look at the languages pgsql and sql, but any other language can of 
course use it once it's implemented.

It will still work if you do not give argument names of course.

> > What is the reason for the oidvector to be it's own type instead of a
> > oid[]?
> 
> An oidvector is fixed length, so you can access it via a C struct, like
> pgprocval->proargtypes[2].  With a normal array, things get much more
> complicated, and that cost would be fairly large if you want to resolve
> the argument types of function calls.

And the cost will be fairly large for named parameters as well then. On 
the other hand, if one omits the parameter names one would get almost the 
same speed as before (an extra test is needed to see if we have any 
parameter names that needs to be setup before the language handler is 
called).

> Second, it would use up a fixed length of NAMEDATALEN * FUNC_MAX_ARGS
> (currently 1024 bytes) in every pg_proc row.

Yea, this I of course knew and was not happy about.

> In this case, a regular name[] would be more suitable.
> Just be sure to put it after all the fixed-length fields.

I'll take a look at name[] then as the first try.

-- 
/Dennis



Re: Function parameter names

From
Dennis Bjorklund
Date:
On Sun, 23 Nov 2003, Tom Lane wrote:

> Actually I'd suggest text[], as there is no good reason to pad the
> array entries to a fixed length.

The only reason against is that all other identifiers have this arbitrary
limit. But sure, text[] would work just as well and without any
restriction.

Is the reason to use name at all in pg "just" about speed, or is there
some other reason?

-- 
/Dennis



Re: Function parameter names

From
Tom Lane
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:
> Is the reason to use name at all in pg "just" about speed, or is there
> some other reason?

Peter already explained it: we like fixed-width fields in system
catalogs so that we can overlay C struct definitions onto the tuples.
This is a fairly significant notational advantage in the backend code,
whether or not you care about speed.
        regards, tom lane


Re: Function parameter names

From
Tom Lane
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:
> And the cost will be fairly large for named parameters as well then. On 
> the other hand, if one omits the parameter names one would get almost the 
> same speed as before (an extra test is needed to see if we have any 
> parameter names that needs to be setup before the language handler is 
> called).

I see absolutely no need for the call path to do anything with this
stuff before it reaches the language handler.  All the handlers fetch
the pg_proc tuple anyway, and are perfectly capable of extracting the
parameter names for themselves if they want to.
        regards, tom lane


Re: Function parameter names

From
Dennis Bjorklund
Date:
On Sun, 23 Nov 2003, Tom Lane wrote:

> I see absolutely no need for the call path to do anything with this
> stuff before it reaches the language handler.  All the handlers fetch
> the pg_proc tuple anyway

Even better then!

I've not gotten so far that i've looked much at that code.  What I have
implemented so far is the parser and the extra nodes needed in the Syntax
tree, and I've added a field to pg_proc of currently the wrong type. The
next step for me is to find out how the arrays work inside the backend. 

It's all new code for me so it takes a little more time then what it would 
otherwise. On the bright side is that the code in pg is fairly easy to 
read (but there is a lot).

-- 
/Dennis



Re: Function parameter names

From
Dennis Bjorklund
Date:
On Sun, 23 Nov 2003, Tom Lane wrote:

> Actually I'd suggest text[], as there is no good reason to pad the
> array entries to a fixed length.

I've implemented this part now and it stores the paremeter names in the 
pg_proc table as a text[] field.

However, in the parser I use IDENT to get the parameter names and already
in the lexer the IDENT tokens are truncated to length NAMEDATALEN.

So I've got 3 options:
1) Leave it as is now where the system table allows any length   but the parser only lets you insert "short"
identifiers.
2) Change the type to name[]
3) Change the parser to accept identifiers of any length and add   the length check in a later phase for the
identifiersthat need   to be shorter.
 

Any opinions or should I just make a choice myself?

-- 
/Dennis



Re: Function parameter names

From
Tom Lane
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:
> However, in the parser I use IDENT to get the parameter names and already
> in the lexer the IDENT tokens are truncated to length NAMEDATALEN.

Right.  What's the problem?
        regards, tom lane


Re: Function parameter names

From
Dennis Bjorklund
Date:
On Tue, 25 Nov 2003, Tom Lane wrote:

> Dennis Bjorklund <db@zigo.dhs.org> writes:
> > However, in the parser I use IDENT to get the parameter names and already
> > in the lexer the IDENT tokens are truncated to length NAMEDATALEN.
> 
> Right.  What's the problem?

It's strange to allow identifiers to be of any length in the system table 
when there is no way to create it using normal syntax. The parser accepts 
this kind of input:

CREATE FUNCTION foo (x int) RETURNS int AS ...

and the identifier x (as all identifiers) can not be too long. Still, one 
can create the function and update the system table by hand to change x to 
a longer name. Doesn't that sound ugly to you?

It's not a technical problem, but a matter of style. Everything works as 
it is now, but works is not always enough.

-- 
/Dennis



Re: Function parameter names

From
Tom Lane
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:
> and the identifier x (as all identifiers) can not be too long. Still, one 
> can create the function and update the system table by hand to change x to 
> a longer name. Doesn't that sound ugly to you?

It has always been, and likely always will be, possible to use manual
updating of the system catalogs to arrive at states that you could not
get into otherwise, and which might or might not work "correctly", for
whatever value of "correctly" you think is correct.  This doesn't
particularly bother me, since we have always told people that manual
updates are unsupported and are strictly for people who know exactly
what they're doing.

If it really bugs you, possibly the column could be declared as
"varchar(NAMEDATALEN-1)[]" rather than "text[]", but I think the amount
of effort needed to make that happen within the .bki file would be well
out of proportion to the usefulness.  (Actually, it'd still not be 100%
right, since varchar(N) counts characters not bytes ...)
        regards, tom lane


Re: Function parameter names

From
Neil Conway
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:
> It's strange to allow identifiers to be of any length in the system
> table when there is no way to create it using normal syntax.

I agree with Tom -- that doesn't seem strange to me at all.

-Neil