Thread: Dynamic/polymorphic record/composite return types for C user-defined-functions

Dynamic/polymorphic record/composite return types for C user-defined-functions

From
Stephen Scheck
Date:
Hi,

I'm trying to write some user-defined functions in C which operate on a large object (so their common first argument will be an OID referencing an object in the pg_largeobject catalog table created with lo_create()) and return either a single row or a set depending on the function. Depending on the contents of the BLOB, some of the functions have a need to return polymorphic number column(s) as part of their result row (i.e. it could be an integer, real or double depending on the input BLOB).

I've tried various approaches for this but none of them quite work the way I need and I'm wondering if I'm missing a fundamental bit of understanding of Postgres' type system or it simply doesn't support what I want to do. Here's what I've tried:

1) Declare the function as RETURNS my_type (or RETURNS SETOF my_type), my_type having been defined with CREATE TYPE my_type AS ... with the column defined as a specific number type (integer, real, double precision, etc.). This works as I want, but only allows supporting the specific number type declared (since function signature polymorphism can only differentiate by input types, and any* types are not allowed in CREATE TYPE definitions).

2) Declare the function as RETURNS an anonymous row type, via OUT parameters or RETURNS TABLE. Declare the polymorphic number column as "anynonarray". The problem here is without a polymorphic IN parameter, the OUT type cannot be resolved. I worked around this by adding DEFAULT NULL::integer IN parameter which satisfies CREATE FUNCTION and calls but doesn't propagate the correct type through the FunctionCallInfo (I can't recall the exact error message but it didn't work).

3) Declare the function as RETURNS an anonymous row type, via OUT parameters or RETURNS TABLE. Declare the polymorphic number column as "any", which doesn't enforce correspondence between IN and OUT parameters. Doesn't work - when I call the function I get this: "ERROR: cannot display a value of type any". I don't think this would work even if the column isn't in the select-list (i.e. just used as a join or filter condition) since if I do an explicit cast, I get this error message: 'cannot cast type "any" to integer'.

As an aside, does this imply "any" as an OUT parameter has no use?

4) Declare the function as RETURNS RECORD or RETURNS SETOF RECORD. Use CreateTemplateTupleDesc()/BlessTupleDesc() to dynamically create a tuple description on the fly and return it. Depending on call context, I get different error messges:

SELECT * FROM info(<lo_oid>);
ERROR: a column definition list is required for functions returning "record"

-- or

SELECT (info(lo_oid_column)).* FROM test_table;
ERROR: record type has not been registered

I'm out of ideas. Isn't this the kind of dynamic behavior for which CreateTemplateTupleDesc()/BlessTupleDesc() is intended?

Any suggestions appreciated.

Cheers,
-Steve

Stephen Scheck <singularsyntax@gmail.com> writes:
> I'm trying to write some user-defined functions in C which operate on a
> large object (so their common first argument will be an OID referencing an
> object in the pg_largeobject catalog table created with lo_create()) and
> return either a single row or a set depending on the function. Depending on
> the contents of the BLOB, some of the functions have a need to return
> polymorphic number column(s) as part of their result row (i.e. it could be
> an integer, real or double depending on the input BLOB).

> I've tried various approaches for this but none of them quite work the way
> I need and I'm wondering if I'm missing a fundamental bit of understanding
> of Postgres' type system or it simply doesn't support what I want to do.

It doesn't.  Type analysis happens at parse time, not at run time, so
you cannot expect a query variable's data type to be determined by the
contents of some data value not seen until runtime.

The only way I can see to get this to work is a hack similar to common
usage of dblink: you declare the function as returning RECORD or SETOF
RECORD, and then the calling query has to specify an AS clause that
shows what column type(s) it's expecting to get back on this particular
call.  That works, sorta, for dblink usages where you're writing
    SELECT ... FROM dblink('some particular SQL command') AS ...
and so you know what you're expecting to get from the remote SQL
command.  But it's certainly ugly, and you haven't said enough about
your use-case to tell if this is workable for you or not.

If you're only worried about numbers, is it really so critical to
preserve the datatype?  You could coerce 'em all to numeric to dodge the
problem, albeit at some loss of efficiency.

Another thought here is that if you don't try to expand a record value,
the need for the parser to know its column types goes away; that is,
if you just write

    SELECT function_returning_record(...) FROM ...

and not

    SELECT (function_returning_record(...)).* FROM ...

I think that the run-time-blessed-record-type hack will work okay.
Of course that greatly limits what you can do with the result in SQL,
but if you just need to ship it to a client it might be all right.

            regards, tom lane


Hi,

We have solved that problem on the way to function always returns text, but text was actually formated json... We have used plv8 before 9.2 to actually execute dynamic SQL and return result...
However, I think some kind of dynamic record type would be very usefull... (Maybe just record but without need to say AS...)

Though have no idea how hard it would be to implement something like that... And how really it would be usefull widely...

We are avoid much as possible client code - and trying to make it tiny as possible....

All business logic is inside DB... In case we havent found json solution...

We would need to write "client code" related to business logic to actually be able to say, in this concrete case SELECT * FROM function(parameters) AS (expected result)

Or as Stephen described he would need to read BLOB outside DB and there apply the logic - instead of in DB...

Kind Regards,

Misa

On Tuesday, April 2, 2013, Tom Lane wrote:
Stephen Scheck <singularsyntax@gmail.com> writes:
> I'm trying to write some user-defined functions in C which operate on a
> large object (so their common first argument will be an OID referencing an
> object in the pg_largeobject catalog table created with lo_create()) and
> return either a single row or a set depending on the function. Depending on
> the contents of the BLOB, some of the functions have a need to return
> polymorphic number column(s) as part of their result row (i.e. it could be
> an integer, real or double depending on the input BLOB).

> I've tried various approaches for this but none of them quite work the way
> I need and I'm wondering if I'm missing a fundamental bit of understanding
> of Postgres' type system or it simply doesn't support what I want to do.

It doesn't.  Type analysis happens at parse time, not at run time, so
you cannot expect a query variable's data type to be determined by the
contents of some data value not seen until runtime.

The only way I can see to get this to work is a hack similar to common
usage of dblink: you declare the function as returning RECORD or SETOF
RECORD, and then the calling query has to specify an AS clause that
shows what column type(s) it's expecting to get back on this particular
call.  That works, sorta, for dblink usages where you're writing
        SELECT ... FROM dblink('some particular SQL command') AS ...
and so you know what you're expecting to get from the remote SQL
command.  But it's certainly ugly, and you haven't said enough about
your use-case to tell if this is workable for you or not.

If you're only worried about numbers, is it really so critical to
preserve the datatype?  You could coerce 'em all to numeric to dodge the
problem, albeit at some loss of efficiency.

Another thought here is that if you don't try to expand a record value,
the need for the parser to know its column types goes away; that is,
if you just write

        SELECT function_returning_record(...) FROM ...

and not

        SELECT (function_returning_record(...)).* FROM ...

I think that the run-time-blessed-record-type hack will work okay.
Of course that greatly limits what you can do with the result in SQL,
but if you just need to ship it to a client it might be all right.

                        regards, tom lane


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Tue, Apr 2, 2013 at 12:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Scheck <singularsyntax@gmail.com> writes:
>> I'm trying to write some user-defined functions in C which operate on a
>> large object (so their common first argument will be an OID referencing an
>> object in the pg_largeobject catalog table created with lo_create()) and
>> return either a single row or a set depending on the function. Depending on
>> the contents of the BLOB, some of the functions have a need to return
>> polymorphic number column(s) as part of their result row (i.e. it could be
>> an integer, real or double depending on the input BLOB).
>
>> I've tried various approaches for this but none of them quite work the way
>> I need and I'm wondering if I'm missing a fundamental bit of understanding
>> of Postgres' type system or it simply doesn't support what I want to do.
>
> It doesn't.  Type analysis happens at parse time, not at run time, so
> you cannot expect a query variable's data type to be determined by the
> contents of some data value not seen until runtime.
>
> The only way I can see to get this to work is a hack similar to common
> usage of dblink: you declare the function as returning RECORD or SETOF
> RECORD, and then the calling query has to specify an AS clause that
> shows what column type(s) it's expecting to get back on this particular
> call.  That works, sorta, for dblink usages where you're writing
>         SELECT ... FROM dblink('some particular SQL command') AS ...
> and so you know what you're expecting to get from the remote SQL
> command.  But it's certainly ugly, and you haven't said enough about
> your use-case to tell if this is workable for you or not.

There is a another way: the 'hstore populate_record hack'. Downside
(vs returning RECORD) is that you need a defined type -- a table or a
composite type.  But it's terse and easy to abstract since you're not
providing the full column list.

http://www.postgresql.org/docs/devel/static/hstore.html#HSTORE-FUNC-TABLE

merlin