Thread: Proposal: OUT parameters for plpgsql
Awhile back I wrote: > Basically what I am thinking is that we have all the infrastructure > today to solve the OUT-parameter problem, it's just not wrapped up in > an easy-to-use package. Here is a more fully fleshed-out proposal. The basic goal that I'm aiming at is to be able to use OUT-parameter notation in plpgsql as an easy way of returning multiple values from a function. There is a lot of stuff on the table that this doesn't touch, such as procedures executing outside transaction control, parameter matching by name instead of position, default values for parameters, etc. I don't want to do anything that will interfere with adding features like those later, and so this proposal is just minimal decoration on the existing ability to create functions that return RECORD. The generic thing: you can write OUT and INOUT parameters in CREATE FUNCTION. If you do, you can omit the RETURNS clause, in which case the result type of the function is implicitly RECORD; or you can specify RECORD explicitly; or you can specify SETOF RECORD; or, if there's just one OUT/INOUT param, you can specify that param's type or SETOF it. (When there is just one, should the default be to return that type rather than a one-column RECORD?) Note that the result type is RECORD; we won't explicitly create a named composite type for such functions. (We could, perhaps, but I think it'd clutter the catalogs more than be useful.) It might be interesting however to allow explicit specification of RETURNS existing-composite-type with a matching set of OUT parameters. Calling such a function from SQL: you write just the values for the IN and INOUT parameters, and the result is a record of the OUT and INOUT parameters. So typical call style would be SELECT * FROM foo(1,2,'xyzzy'); Unlike with an ordinary RECORD-returning function, you do not specify an AS list, since the result column names and types are already known. (We'll have to invent a column name in the case of an OUT parameter that wasn't given a name in CREATE FUNCTION, but this seems like no big deal.) Within plpgsql, OUT parameters are variables initialized to NULL, INOUT parameters are variables initialized with the passed-in values. You write just RETURN with no argument (or RETURN NEXT in the SETOF case) to pass back the current values of these variables. pg_proc changes: add two new columns to pg_proc, paralleling proargnames; call them something like proargmodes and proallargtypes. proargmodes is an array of char values that indicate the IN, OUT, or INOUT mode; proallargtypes is an array of parameter datatype OIDs. In a function declared without any OUT or INOUT parameters, both of these are set to NULL and the behavior is the same as before. When there are any OUT or INOUT parameters, the existing proargtypes array includes only the IN and INOUT parameters (so that it can continue to have the existing semantics). proallargtypes includes all the parameter types, and proargmodes shows what's what. proargnames matches up with proallargtypes, not proargtypes, in this case. Note the implication that only the IN (and INOUT) parameters determine questions like whether a function conflicts with another function --- that is, pg_proc's primary key is still proargtypes not proallargtypes. Thus for example you couldn't create bothfoo(a IN int, x OUT float)foo(a IN int, y OUT float, z OUT text) since these would have conflicting input-arguments signatures. It might be better to go ahead and fill the new pg_proc columns all the time, even for all-IN-parameters cases. This seems a bit space-wasteful but it might be easier for clients such as pg_dump to deal with. A possible future extension is to add some sort of direct procedure call syntax in plpgsql. Right now, you'd need to write something likeselect into x,y,z from foo(a,b,c); to call a function with IN parameters a,b,c and get back OUT parameters x,y,z. It would obviously be nicer to writefoo(a,b,c,x,y,z); However, I'm inclined to wait on this until someone does parameter-matching-by-name, for fear of creating problems for that feature. (If people think this couldn't interfere, maybe I'll go ahead and do it.) I'm not planning to invent new calling syntax at the SQL level, either, since that's probably best reserved for procedures-outside- transactions. I will probably fix SQL-language functions to be able to work with OUT parameters, but at least for now, the other PL languages will just reject them. Anyone who wants to improve pltcl,plperl, etc to handle the feature is welcome to. Comments? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > The generic thing: you can write OUT and INOUT parameters in CREATE > FUNCTION. If you do, you can omit the RETURNS clause, in which case the > result type of the function is implicitly RECORD; or you can specify > RECORD explicitly; or you can specify SETOF RECORD; or, if there's just > one OUT/INOUT param, you can specify that param's type or SETOF it. > (When there is just one, should the default be to return that type > rather than a one-column RECORD?) Do other databases not allow you to specify a return value in addition to the OUT and INOUT parameters? In other languages that allow multiple return values using pass by reference you still get one distinguished return value through the regular syntax. Things like int foo(int in, int *inout) I would have expected the return value to be an extra column added to the record. Otherwise things get weird if you have an existing plpgsql function used in plpgsql code that returns a value and want to add an OUT parameter. All of the sudden you have to convert over the whole call syntax entirely and move the existing return value into another OUT parameter? -- greg
On Mon, Mar 21, 2005 at 02:25:47PM -0500, Tom Lane wrote: > A possible future extension is to add some sort of direct procedure call > syntax in plpgsql. Right now, you'd need to write something like > select into x,y,z from foo(a,b,c); > to call a function with IN parameters a,b,c and get back OUT parameters > x,y,z. It would obviously be nicer to write > foo(a,b,c,x,y,z); > However, I'm inclined to wait on this until someone does > parameter-matching-by-name, for fear of creating problems for that > feature. (If people think this couldn't interfere, maybe I'll go ahead > and do it.) I'm not planning to invent new calling syntax at the SQL > level, either, since that's probably best reserved for procedures-outside- > transactions. ISTM that we'd want to support calling by position as well as calling by parameter name, which means there will need to be a different calling convention for the two modes. foo(a, b, c) is the only logical way to call by position, and I don't see how it could interfer with calling by name, which would need to be something like foo(param_a=>a, param_b=>b, param_c=>c). In the meantime, I think being able to call foo(a, b, c) is much nicer than SELECT INTO b, c foo(a, b). Not only is it more compact and readable, it will make porting to plpgsql easier. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
Greg Stark <gsstark@mit.edu> writes: > Do other databases not allow you to specify a return value in addition to the > OUT and INOUT parameters? Oracle discourages OUT parameters in a function, so PL/SQL at least shouldn't be a big source of cases where that's a problem. > I would have expected the return value to be an extra column added to the > record. I'd prefer not to do that, because having a "return type" that's different from the true return type of the function (ie the RECORD) is going to cause untold amounts of confusion. regards, tom lane
On Tue, 21 Mar 2005, Greg Stark wrote: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > The generic thing: you can write OUT and INOUT parameters in CREATE > > FUNCTION. If you do, you can omit the RETURNS clause, in which case the > > result type of the function is implicitly RECORD; or you can specify > > RECORD explicitly; or you can specify SETOF RECORD; or, if there's just > > one OUT/INOUT param, you can specify that param's type or SETOF it. > > (When there is just one, should the default be to return that type > > rather than a one-column RECORD?) > > Do other databases not allow you to specify a return value in addition to the > OUT and INOUT parameters? ANSI SQL allows at most one OUT parameter for a function (which can be used instead of having the function return a value via the usual means). AFAICT DB2 and Oracle reserve parameter functionality for procedures. Gavin
Gavin Sherry <swm@linuxworld.com.au> writes: >> Tom Lane <tgl@sss.pgh.pa.us> writes: >>> (When there is just one, should the default be to return that type >>> rather than a one-column RECORD?) > ANSI SQL allows at most one OUT parameter for a function (which can be > used instead of having the function return a value via the usual means). OK, so that answers my question above: a single OUT parameter should be equated to an ordinary return value, not a RECORD, so as to emulate this aspect of the spec. regards, tom lane
>>ANSI SQL allows at most one OUT parameter for a function (which can be >>used instead of having the function return a value via the usual means). > > OK, so that answers my question above: a single OUT parameter should be > equated to an ordinary return value, not a RECORD, so as to emulate this > aspect of the spec. Shouldn't you have the spec in one hand while designing this feature? :D Chris
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > Shouldn't you have the spec in one hand while designing this feature? :D Actually, plpgsql generally pays more attention to Oracle than the spec ;-) The truth though is that I'd missed that aspect of SQL99, and now that I have read it I don't care to emulate it very closely. They seem to be mixing up the simple business of input vs output parameters vs function results with functions that are really methods of object classes: An SQL-invoked function is an SQL-invoked routine whose invocation returns a value. Every parameter of anSQL-invoked function is an input parameter, one of which may be designated as the result SQL parameter. Theformat of an SQL-invoked function is specified by <SQL-invoked function> (see Subclause 11.49, "<SQL-invoked routine>"). An SQL-invoked function can be a type-preserving function; a type-preserving function is an SQL-invokedfunction that has a result SQL parameter. The result data type of a type- preserving function issome subtype of the data type of its result SQL parameter. The first part of that looks OK until you realize that they specified the result parameter to be effectively INOUT not OUT ... and then it goes downhill from there. If we ever decide that we want to touch that stuff, we will still have room to, because the spec actually thinks that a "result SQL parameter" is one that has RESULT attached to its declaration; it's not simply a matter of having a single OUT parameter. I don't think that this really says what Gavin thought it did. regards, tom lane
On Tue, 22 Mar 2005, Christopher Kings-Lynne wrote: > >>ANSI SQL allows at most one OUT parameter for a function (which can be > >>used instead of having the function return a value via the usual means). > > > > OK, so that answers my question above: a single OUT parameter should be > > equated to an ordinary return value, not a RECORD, so as to emulate this > > aspect of the spec. > > Shouldn't you have the spec in one hand while designing this feature? :D We're miles away from the spec already. For example, the spec requires that we pass the name of some variable construct, whether it be a host variable, a transition variable (ie, NEW.foo in the case of a row level trigger on a table with an attribute foo) or one of a few other less useful things, for OUT parameters. I think that not specifying OUT parameters when invoking a function is a little of confusing but it gives us a lot: its much easier to develop since we don't have to add a stack of infrastructure for host variables, for one. Gavin
Gavin Sherry <swm@linuxworld.com.au> writes: > I think that not specifying OUT parameters when invoking a function is a > little of confusing but it gives us a lot: its much easier to develop > since we don't have to add a stack of infrastructure for host variables, > for one. Also, it doesn't stop us from adding that infrastructure later. The call syntax SELECT * FROM foo(...) will still do what it has been doing for awhile, but you could imagine inventing other call syntaxes that accommodate host variables more directly. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > > I would have expected the return value to be an extra column added to the > > record. > > I'd prefer not to do that, because having a "return type" that's > different from the true return type of the function (ie the RECORD) > is going to cause untold amounts of confusion. Yes, I can see that angle. I was just thinking that since the whole point of this exercise was to achieve some compatibility with a specific interface that your hands were going to be tied. But that other point about other systems only allowing IN or INOUT on procedures where normal return values aren't allowed at all seems to resolve that issue. -- greg
On Mon, 21 Mar 2005, Tom Lane wrote: > Awhile back I wrote: > > Basically what I am thinking is that we have all the infrastructure > > today to solve the OUT-parameter problem, it's just not wrapped up in > > an easy-to-use package. > > > Note that the result type is RECORD; we won't explicitly create a named > composite type for such functions. (We could, perhaps, but I think it'd > clutter the catalogs more than be useful.) It might be interesting > however to allow explicit specification of RETURNS existing-composite-type > with a matching set of OUT parameters. > > Calling such a function from SQL: you write just the values for the IN and > INOUT parameters, and the result is a record of the OUT and INOUT parameters. > So typical call style would be > SELECT * FROM foo(1,2,'xyzzy'); > Unlike with an ordinary RECORD-returning function, you do not specify > an AS list, since the result column names and types are already known. > (We'll have to invent a column name in the case of an OUT parameter that > wasn't given a name in CREATE FUNCTION, but this seems like no big deal.) > I am not sure so this syntax is readable. I'm sure, so this solution is possible and usefull, but you mix SRF style of calling and normal style. For anonymous out record (not OUT parameters) is better Firebird syntax CREATE FUNCTION fce (...) RETURNS (c1 integer, c2 integer) AS BEGIN c1 := 10; c2 := 20; RETURN; END; SELECT * FROM fce (...); c1 | c2 ------- 10 | 20 There is on first view clear which calling style I have to use. This is very similar you proposal - one difference - all OUT params are separeted into return's list. Or clasic SP CREATE FUNCTION fce (IN a integer, OUT b integer) RETURNS bool AS BEGIN b := a; RETURN 't'; END; When I use OUT params I have to have DECLARE command for variables DECLARE b integer; SELECT fce(10, b); fce --- t SELECT b; b -- 10 This is (I think) more standard behavior. Regards Pavel Stehule
Pavel Stehule <stehule@kix.fsv.cvut.cz> writes: > On Mon, 21 Mar 2005, Tom Lane wrote: >> So typical call style would be >> SELECT * FROM foo(1,2,'xyzzy'); > I am not sure so this syntax is readable. I'm sure, so this solution is > possible and usefull, but you mix SRF style of calling and normal style. Well, what I am after here is basically to try to fix the function definition style to be more standard. The SELECT FROM calling style is admittedly not standard, but it's what we already support and will have to continue to support indefinitely. We can go back and work on that end of things in the future; I don't think this proposal forecloses anything as far as other call syntaxes go. > For anonymous out record (not OUT parameters) is better Firebird syntax > > CREATE FUNCTION fce (...) RETURNS (c1 integer, c2 integer) AS That might be more readable, but it's not standard and not flexible (no INOUT parameters), so I don't really see the advantage. > When I use OUT params I have to have DECLARE command for variables > DECLARE b integer; That's basically what I want to avoid, for the time being at least. Variables in straight SQL don't make any sense to me: variables go with conventional, imperative programming languages and SQL really isn't one. A variable for an OUT result should live in some program that is calling SQL, which means it's a feature for client-side code or a feature that exists inside a PL. The advantage of what I am proposing is basically that we can create functions that return OUT parameters without having to buy into inventing SQL variables. We can always do that later if we decide we want to. regards, tom lane