Thread: table / query as a prameter for PL/pgSQL function

table / query as a prameter for PL/pgSQL function

From

Ondrej Ivanič

Date:

08 August 2011, 00:24:37

Hi,

It is possible to pass query result (or cursor?) as function
parameter? I need a function which emits zero or more rows per input
row (map function from map&reduce paradigm). Function returns record
(or array): (value1, value2, value3)
I've tried the following:

1) create or replace function test (r record) returns setof record as $$ ...
Doesn't work: PL/pgSQL functions cannot accept type record

2) pass query as text parameter and open no scroll cursor inside the function
It works but it's ugly.

3) hardcode the query inside function
Similar to (2) and looks better but I need several functions with
different queries inside:
...
for r in (query) loop
    ...
end loop;
...

4) use function in "select" clause:
select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
In this case I wasn't able figure out how to access record members
returned by the function:

select ?, ?, ?, count(*) from (
   select my_map_func(col1, col2, col3, col4) as map_func_result from ...
) as map
group by 1, 2, 3

The '?' should be something like map.map_func_result.value1 (both
map.value1 and map_func_result.value1 doesn't not work). If function
returns array then I can access value1 by using map_func_result[1]

Is there a better way how to solve this? I'm kind of satisfied with 4
(maybe 3) but it is little bit cumbersome

Thanks,
--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: table / query as a prameter for PL/pgSQL function

From

David Johnston

Date:

08 August 2011, 00:50:12

On Aug 7, 2011, at 23:24, Ondrej Ivanič <ondrej.ivanic@gmail.com> wrote:

Hi,

It is possible to pass query result (or cursor?) as function
parameter? I need a function which emits zero or more rows per input
row (map function from map&reduce paradigm). Function returns record
(or array): (value1, value2, value3)
I've tried the following:

1) create or replace function test (r record) returns setof record as $$ ...
Doesn't work: PL/pgSQL functions cannot accept type record

From the docs you can try using the "refcursor" data type though I have never done so myself.

4) use function in "select" clause:
select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
In this case I wasn't able figure out how to access record members
returned by the function:

select ?, ?, ?, count(*) from (
select my_map_func(col1, col2, col3, col4) as map_func_result from ...
) as map
group by 1, 2, 3

The '?' should be something like map.map_func_result.value1 (both
map.value1 and map_func_result.value1 doesn't not work). If function
returns array then I can access value1 by using map_func_result[1]

Try " (map.map_func_result).value1 " - the parenthesis around the table alias and column are necessary.

Is there a better way how to solve this? I'm kind of satisfied with 4
(maybe 3) but it is little bit cumbersome

Thanks,
--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: table / query as a prameter for PL/pgSQL function

From

Merlin Moncure

Date:

08 August 2011, 14:23:47

2011/8/7 Ondrej Ivanič <ondrej.ivanic@gmail.com>:
> Hi,
>
> It is possible to pass query result (or cursor?) as function
> parameter? I need a function which emits zero or more rows per input
> row (map function from map&reduce paradigm). Function returns record
> (or array): (value1, value2, value3)
> I've tried the following:
>
> 1) create or replace function test (r record) returns setof record as $$ ...
> Doesn't work: PL/pgSQL functions cannot accept type record
>
> 2) pass query as text parameter and open no scroll cursor inside the function
> It works but it's ugly.
>
> 3) hardcode the query inside function
> Similar to (2) and looks better but I need several functions with
> different queries inside:
> ...
> for r in (query) loop
>    ...
> end loop;
> ...
>
> 4) use function in "select" clause:
> select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
> In this case I wasn't able figure out how to access record members
> returned by the function:
>
> select ?, ?, ?, count(*) from (
>   select my_map_func(col1, col2, col3, col4) as map_func_result from ...
> ) as map
> group by 1, 2, 3
>
> The '?' should be something like map.map_func_result.value1 (both
> map.value1 and map_func_result.value1 doesn't not work). If function
> returns array then I can access value1 by using map_func_result[1]
>
> Is there a better way how to solve this? I'm kind of satisfied with 4
> (maybe 3) but it is little bit cumbersome

You have a few of different methods for passing sets between functions.
1) refcursor as David noted.  reasonably fast. however, I find the
'FETCH' mechanic a little inflexible.
2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
headache because a non temp table can get thrashed pretty hard a and a
'TEMP' can cause severe function plan invalidation issues if you're
not careful
3) arrays of composites -- the most flexible and very fast for *small*
amounts of records (say less than 10,000):

#3 is my favorite method unless the data being passed is very large.
Here is an example of it in use:

CREATE TYPE foo_t as (a int, b text);
CREATE FUNCTION get_foos() RETURNS SETOF foo_t AS
$$
BEGIN
  RETURN QUERY SELECT 1, 'abc' UNION ALL SELECT 2, 'def';
END;
$$ LANGUAGE PLPGSQL;

CREATE FUNCTION do_foos(_foos foo_t[]) returns VOID AS
$$
DECLARE
  f foo_t;
BEGIN
  FOR f in SELECT * FROM UNNEST(_foos)
  LOOP
    RAISE NOTICE '% %', f.a, f.b;
  END LOOP;
END;
$$ LANGUAGE PLPGSQL;

postgres=# SELECT do_foos(ARRAY(SELECT (a,b)::foo_t FROM get_foos()));
NOTICE:  1 abc
NOTICE:  2 def
 do_foos
---------

(1 row)

Also, if you are deploying vs 9.1, be sure to check out Pavel's
for-in-array which is better method to do the unnest() which expands
the array.

merlin

Re: table / query as a prameter for PL/pgSQL function

From

Ondrej Ivanič

Date:

09 August 2011, 01:48:20

Hi,

2011/8/9 Merlin Moncure <mmoncure@gmail.com>:
> You have a few of different methods for passing sets between functions.

I do not want to pass data between functions. The ideal solution
should look like this:
select * from my_map_func(<select query>)

> 1) refcursor as David noted.  reasonably fast. however, I find the
> 'FETCH' mechanic a little inflexible.

I've came across this but manual example wasn't (isn't) clear to me:
CREATE TABLE test (col text);
INSERT INTO test VALUES ('123');

CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
END;
' LANGUAGE plpgsql;

BEGIN;
SELECT reffunc('funccursor');
FETCH ALL IN funccursor;
COMMIT;

What is the "funccursor"?

Anyway, I will try to use something like this
CREATE FUNCTION my_map_func(refcursor) returns setof text[] as $$
...
$$ LANGUAGE 'plpgsql';

BEGIN;
DECLARE my_cursor NO SCROLL CURSOR FOR <query>;
SELECT * FROM my_map_func(my_cursor);
COMMIT;

I'll keep you posted.

> 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
> headache because a non temp table can get thrashed pretty hard a and a
> 'TEMP' can cause severe function plan invalidation issues if you're
> not careful

I'm not familiar with this issue (function plan invalidation issues).
Could you please provide more details/links about it?

> 3) arrays of composites -- the most flexible and very fast for *small*
> amounts of records (say less than 10,000):

My data set is huge: between 1 and 5 mil rows and avg row size is 100
- 400 bytes

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: table / query as a prameter for PL/pgSQL function

From

Merlin Moncure

Date:

09 August 2011, 11:01:20

2011/8/8 Ondrej Ivanič <ondrej.ivanic@gmail.com>:
> Hi,
>
> 2011/8/9 Merlin Moncure <mmoncure@gmail.com>:
>> You have a few of different methods for passing sets between functions.
>
> I do not want to pass data between functions. The ideal solution
> should look like this:
> select * from my_map_func(<select query>)

well, the method still applies: you'd just do:
select * from my_map_func(array(<select query that grabs foo_t type>))

...but, it sounds like that method is not appropriate -- see below.

>> 1) refcursor as David noted.  reasonably fast. however, I find the
>> 'FETCH' mechanic a little inflexible.
>
> I've came across this but manual example wasn't (isn't) clear to me:
> CREATE TABLE test (col text);
> INSERT INTO test VALUES ('123');
>
> CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
> BEGIN
>    OPEN $1 FOR SELECT col FROM test;
>    RETURN $1;
> END;
> ' LANGUAGE plpgsql;
>
> BEGIN;
> SELECT reffunc('funccursor');
> FETCH ALL IN funccursor;
> COMMIT;
>
> What is the "funccursor"?

funccursor is the name -- just a string.  refcursors can be named with
a variable string and later fetched as an identifier -- they are kinda
unique in that respect.

>> 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
>> headache because a non temp table can get thrashed pretty hard a and a
>> 'TEMP' can cause severe function plan invalidation issues if you're
>> not careful
>
> I'm not familiar with this issue (function plan invalidation issues).
> Could you please provide more details/links about it?

well, in your particular case it's probably not so much of an issue.
plpgsql, when a function is executed for the first time in a session,
'compiles' the source code into a plan that is kept around until it
invalidates.  one of the things that causes a plan to invalidate is a
table getting dropped that is inside the plan -- temp tables are
notorious for doing that (in older postgres we'd get annoying OID
errors).  if your application is even partially cpu bound, and you
have a lot of plpgsql flying around,  that can add up in a surprising
hurry.  temp tables also write to the system catalogs, so if your
function calls are numerous, short, and sweet, array passing is the
way to go because it's a completely in-memory structure that can be
used like a set (via unnest) without those issues.  for 'big' data
though, it's not good.

>> 3) arrays of composites -- the most flexible and very fast for *small*
>> amounts of records (say less than 10,000):
>
> My data set is huge: between 1 and 5 mil rows and avg row size is 100
> - 400 bytes

your best bet is probably a cursor IMO.

merlin