Thread: Multiple "selects" returned from a single stored procedure

Multiple "selects" returned from a single stored procedure

From
Shachar Shemesh
Date:
Hi list,

I'm in the process of porting an existing MS-SQL database to PostgreSQL.
The application uses OLE DB (and that's why I'm writing the OLE DB for
Postgresql). One of the requirements is that we will introduce as little
changes to the application. It has to be able to work with both
databases, as well as Access (which is not really a database).

Now the question:
MS-SQL has the capacity for both out variables from stored procedures,
as well as running several "selects" inside the procedures, and then
giving the results for all selects to the caller. Fortunetly for me,
that specific application doesn't run more than one select per stored
procedure.

The way I handled out variables so far was to have the function return a
compound type, with the variables as rows. With embedded selects,
however, this will no longer work.

I guess what I would like to suggest is for the thus far unused "select"
command in PLPGSQL to be used, in some way, to return values outside the
scope of the strict "returns" context. I guess out variables will also
be nice, but that's besides the point.

If anyone has any ideas on how to both modify called parameters, and
return a rowset, please let me know. Best I came up with so far was to
create a temporary table for the out vars or the selects. I can then rig
the OLE DB to make it look as if the function returned that.
            Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/






Re: Multiple "selects" returned from a single stored procedure

From
Joe Conway
Date:
Shachar Shemesh wrote:
> MS-SQL has the capacity for both out variables from stored procedures,
> as well as running several "selects" inside the procedures, and then
> giving the results for all selects to the caller. Fortunetly for me,
> that specific application doesn't run more than one select per stored
> procedure.
> 
> The way I handled out variables so far was to have the function return a
> compound type, with the variables as rows. With embedded selects,
> however, this will no longer work.
> 
> I guess what I would like to suggest is for the thus far unused "select"
> command in PLPGSQL to be used, in some way, to return values outside the
> scope of the strict "returns" context. I guess out variables will also
> be nice, but that's besides the point.

This has come up before (search the archives). I think the answer is to 
implement actual stored procedures (as opposed to functions, which is 
what we now have). A stored procedure call, per SQL99/2003 would look 
something like:  call sp_my_stored_proc();
(which in MSSQL looks like "exec sp_my_stored_proc()")

The difference between this and an SRF is that the stored procedure 
cannot be used in a FROM clause, and therefore cannot be joined with 
other data or filtered with WHERE criteria. But that fact also means 
that we should be able to deal with projecting multiple heterogenous 
result sets, and the structure of the sets does not need to be known in 
advance.

> If anyone has any ideas on how to both modify called parameters, and
> return a rowset, please let me know. Best I came up with so far was to
> create a temporary table for the out vars or the selects. I can then rig
> the OLE DB to make it look as if the function returned that.

I wonder if you could write an SRF that returns setof refcursor, and 
then expand the cursors one-by-one in the OLE DB layer. See:
http://www.postgresql.org/docs/7.4/interactive/plpgsql-cursors.html
(37.8.3.3. Returning Cursors)

HTH,

Joe


Re: Multiple "selects" returned from a single stored procedure

From
Shachar Shemesh
Date:
Joe Conway wrote:

> Shachar Shemesh wrote:
>
>> I guess what I would like to suggest is for the thus far unused "select"
>> command in PLPGSQL to be used, in some way, to return values outside the
>> scope of the strict "returns" context. I guess out variables will also
>> be nice, but that's besides the point.
>
>
> This has come up before (search the archives).

I did. Found people who asked about it, but didn't find an actual answer.

> I think the answer is to implement actual stored procedures (as 
> opposed to functions, which is what we now have). A stored procedure 
> call, per SQL99/2003 would look something like:
>   call sp_my_stored_proc();
> (which in MSSQL looks like "exec sp_my_stored_proc()")

The current docs say, at least from within pgplsql, that "call foo" 
translates to "select * from foo". psql doesn't seem to carry a "call" 
command at all. From PgOleDb, I just do "select * from foo".

> The difference between this and an SRF is that the stored procedure 
> cannot be used in a FROM clause, and therefore cannot be joined with 
> other data or filtered with WHERE criteria.

I don't see that as a problem.

> But that fact also means that we should be able to deal with 
> projecting multiple heterogenous result sets, and the structure of the 
> sets does not need to be known in advance.

That would require some way of actually returning the results, wouldn't it?

>> return a rowset, please let me know. Best I came up with so far was to
>> create a temporary table for the out vars or the selects. I can then rig
>> the OLE DB to make it look as if the function returned that.
>
> If anyone has any ideas on how to both modify called parameters, and
>
> I wonder if you could write an SRF that returns setof refcursor, and 
> then expand the cursors one-by-one in the OLE DB layer.

Yes, that seems to be the direction to go. THANKS!

I'll see how easy it will be to implement in OLE DB, but returning setof 
refcursor certainly allows me to return several rowsets with different 
column info.

I have several ways I can take this solution, and I would like your opinion:
1. Simply assume that if a command returned a single column of 
refcursors, that it meant to return several rowsets, and implement the 
corresponding OLE DB interface.
2. Require that the cursors be named a certain way, according to their 
intended usage. If the command returned cursors named "<unnamed portal 
9>", to just treat it as is, while if it returned a cursor named 
"MultiResult1", treat it as above?

The advantage of 2 is that it allows me to simulate out variables. If 
the cursor is called "outputvars", I direct it to the output variables 
interface.
The disadvantage is that I'm not sure what to do if only some of the 
rows in the result are named MultiResult.

Whatever method I'll use, I may have to start a transaction for the 
purpose of the command, if we were not already in one. Otherwise, the 
ref-cursors are just useless strings.

> HTH,
>
> Joe
>
Thanks,
Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting
http://www.lingnu.com/



Re: Multiple "selects" returned from a single stored procedure

From
Joe Conway
Date:
Shachar Shemesh wrote:
>> I think the answer is to implement actual stored procedures (as 
>> opposed to functions, which is what we now have). A stored procedure 
>> call, per SQL99/2003 would look something like:
>>   call sp_my_stored_proc();
>> (which in MSSQL looks like "exec sp_my_stored_proc()")
> 
> The current docs say, at least from within pgplsql, that "call foo" 
> translates to "select * from foo". psql doesn't seem to carry a "call" 
> command at all. From PgOleDb, I just do "select * from foo".

I think you're misinterpreting the docs. The CALL keyword has not been 
implemented in Postgres AFAIK. Where precisely did you see that?

>> The difference between this and an SRF is that the stored procedure 
>> cannot be used in a FROM clause, and therefore cannot be joined with 
>> other data or filtered with WHERE criteria.
> 
> I don't see that as a problem.

It is THE problem as far as implementing what you're looking for. An SRF 
will never be able to deal with multiple results sets because the number 
and types of the returned attributes must be known/resolved when the 
query is parsed. On the other hand, a stored procedure would not have 
this restriction if it, *by design could not* participate in a normal 
SELECT.

>> But that fact also means that we should be able to deal with 
>> projecting multiple heterogenous result sets, and the structure of the 
>> sets does not need to be known in advance.
> 
> That would require some way of actually returning the results, wouldn't it?

Sure. I should think you'd allow a normal SELECT statement in your 
stored procedure, exactly as you would do in MSSQL. The result tuples 
could be formed and projected in similar fashion to EXPLAIN or SHOW ALL. 
See, for example, ShowAllGUCConfig() in guc.c.

> I have several ways I can take this solution, and I would like your 
> opinion:
> 1. Simply assume that if a command returned a single column of 
> refcursors, that it meant to return several rowsets, and implement the 
> corresponding OLE DB interface.

This seems cleanest, except you might want a configurable option to turn 
it off, in case someone really wants the refcursor results.

> 2. Require that the cursors be named a certain way, according to their 
> intended usage. If the command returned cursors named "<unnamed portal 
> 9>", to just treat it as is, while if it returned a cursor named 
> "MultiResult1", treat it as above?
> 
> The advantage of 2 is that it allows me to simulate out variables. If 
> the cursor is called "outputvars", I direct it to the output variables 
> interface.
> The disadvantage is that I'm not sure what to do if only some of the 
> rows in the result are named MultiResult.

This one seems a bit grotty to me. But the only other kludge I can think 
of to similate output variables would require some backend hacking, or 
at least a user defined C function.

In case you're interested, here is the idea. Implement functions 
necessary to create, change, and remove session local variables. If the 
function uses named parameters (recently implemented for functions) 
*and* one or more session local variable of the same names are found, 
assume the value of the variables are your outputvar results.

> Whatever method I'll use, I may have to start a transaction for the 
> purpose of the command, if we were not already in one. Otherwise, the 
> ref-cursors are just useless strings.

Cursors can now outlive transactions (DECLARE ... WITH HOLD), but there 
is the downside (recently discussed on one of the lists) that when the 
transaction is ended, the cursor is copied to a tuplestore. As long as 
the tuplestore fits within sort_mem (work_mem in 7.5+), it will be held 
entirely within memory. If not, it will spill to disk.

Joe


Re: Multiple "selects" returned from a single stored procedure

From
Shachar Shemesh
Date:
Joe Conway wrote:

> I think you're misinterpreting the docs. The CALL keyword has not been 
> implemented in Postgres AFAIK. Where precisely did you see that?

Can't find it any more. I'm pretty sure I read it somewhere. Doesn't matter.

>> I don't see that as a problem.
>
>
> It is THE problem as far as implementing what you're looking for. An 
> SRF will never be able to deal with multiple results sets because the 
> number and types of the returned attributes must be known/resolved 
> when the query is parsed. On the other hand, a stored procedure would 
> not have this restriction if it, *by design could not* participate in 
> a normal SELECT.

What I meant is that I don't see not being able to use stored procedures 
as a FROM source as a problem.

> Sure. I should think you'd allow a normal SELECT statement in your 
> stored procedure, exactly as you would do in MSSQL. The result tuples 
> could be formed and projected in similar fashion to EXPLAIN or SHOW 
> ALL. See, for example, ShowAllGUCConfig() in guc.c.

I'm more into client-side hacking than server-side hacking. I'm 
currently using libpq for client-server communications, so bumping the 
protocol version a notch higher won't affect me much, but it's a point 
to consider.

>
>> I have several ways I can take this solution, and I would like your 
>> opinion:
>> 1. Simply assume that if a command returned a single column of 
>> refcursors, that it meant to return several rowsets, and implement 
>> the corresponding OLE DB interface.
>
>
> This seems cleanest, except you might want a configurable option to 
> turn it off, in case someone really wants the refcursor results.

As soon as I figure out how to pass custom-attributes to OLE DB. Docs 
seem hazy on this. Marek, who originally wrote OLE DB, did manage to 
figure it out. Unfotunetly, I had to dump his code entirely for 
unrelated reasons. I'll be glad if he reintroduced it into this 
iteration (hint hint).

In any case, as OLE DB uses a binary interface to pass data around, I 
need to know of all data types you would want to participate in any 
activity. This means that refcursors will not be supported on their own 
unless I figure out how to return them standalone, which I don't see 
happening.

>
>> 2. Require that the cursors be named a certain way, according to 
>> their intended usage. If the command returned cursors named "<unnamed 
>> portal 9>", to just treat it as is, while if it returned a cursor 
>> named "MultiResult1", treat it as above?
>>
>> The advantage of 2 is that it allows me to simulate out variables. If 
>> the cursor is called "outputvars", I direct it to the output 
>> variables interface.
>> The disadvantage is that I'm not sure what to do if only some of the 
>> rows in the result are named MultiResult.
>
>
> This one seems a bit grotty to me. But the only other kludge I can 
> think of to similate output variables would require some backend 
> hacking, or at least a user defined C function.

I can also assume all refcursors are rowsets EXCEPT if one is named 
"outputvars". Then again, I'm doing this work in the context of the 
needs of a particular client, and he already said it was ok with him to 
pull the output vars from one of the rowsets.

>
> In case you're interested, here is the idea. Implement functions 
> necessary to create, change, and remove session local variables. If 
> the function uses named parameters (recently implemented for functions) 

There's one of the rubs. I'm basing my development on 7.4 backend. This 
has to go into production ASAP, and I can't rely on 7.5 being out by the 
time I'm done.

Also, I can't know (from the client) what the input vars are called 
unless I perform another query. The interface that does this extra query 
is highly recommended against by the OLE DB docs, for that reason.

Interestingly enough, my specific client does this extra query. However, 
I don't believe in writing general code based on a single entity's 
needs. The design must be one capable of expanding, even if the 
implementation is sparse (which it is, at the moment).

> *and* one or more session local variable of the same names are found, 
> assume the value of the variables are your outputvar results.

My backend knowledge is not good enough to understand this in depth, but 
I'll keep this message in case the need arises for later re-reading.

>> Whatever method I'll use, I may have to start a transaction for the 
>> purpose of the command, if we were not already in one. Otherwise, the 
>> ref-cursors are just useless strings.
>
>
> Cursors can now outlive transactions (DECLARE ... WITH HOLD), but 
> there is the downside (recently discussed on one of the lists) that 
> when the transaction is ended, the cursor is copied to a tuplestore. 
> As long as the tuplestore fits within sort_mem (work_mem in 7.5+), it 
> will be held entirely within memory. If not, it will spill to disk.

No, I think that's an overhead I can't afford. Let's not forget that I'm 
a driver, and I'm expected to be relatively transparent, performance 
wise. At the moment, I'm not using cursors (lazy fetching), so libpq is 
allocating quite a bit of memory on the client side. Reserving such an 
amount on the server side seems a little too much.

Just out of curiosity, how do I manually destroy the cursor when it's no 
longer needed? Just do "close refcursor"?

Also, does this copy take place when the table is changed, or as soon as 
the transaction ends? If the former, it may not matter.

In any case, my original concern was that I'll be changing semantics by 
opening another transaction. Upon retrospect, however, that's precisely 
what PostgreSQL is doing if one is not active. Assuming the OLE DB user 
uses the Transaction interface, and does not send a "begin" command, I'm 
capable of knowing whether I'm inside a transaction or not.

> Joe
         Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting
http://www.lingnu.com/



Re: Multiple "selects" returned from a single stored procedure

From
Joe Conway
Date:
Shachar Shemesh wrote:
> Just out of curiosity, how do I manually destroy the cursor when it's no 
> longer needed? Just do "close refcursor"?

Yup:

http://www.postgresql.org/docs/current/static/sql-close.html

> Also, does this copy take place when the table is changed, or as soon as 
> the transaction ends? If the former, it may not matter.

IIRC it is the latter.

Joe