Thread: table / query as a prameter for PL/pgSQL function
Hi, It is possible to pass query result (or cursor?) as function parameter? I need a function which emits zero or more rows per input row (map function from map&reduce paradigm). Function returns record (or array): (value1, value2, value3) I've tried the following: 1) create or replace function test (r record) returns setof record as $$ ... Doesn't work: PL/pgSQL functions cannot accept type record 2) pass query as text parameter and open no scroll cursor inside the function It works but it's ugly. 3) hardcode the query inside function Similar to (2) and looks better but I need several functions with different queries inside: ... for r in (query) loop ... end loop; ... 4) use function in "select" clause: select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query In this case I wasn't able figure out how to access record members returned by the function: select ?, ?, ?, count(*) from ( select my_map_func(col1, col2, col3, col4) as map_func_result from ... ) as map group by 1, 2, 3 The '?' should be something like map.map_func_result.value1 (both map.value1 and map_func_result.value1 doesn't not work). If function returns array then I can access value1 by using map_func_result[1] Is there a better way how to solve this? I'm kind of satisfied with 4 (maybe 3) but it is little bit cumbersome Thanks, -- Ondrej Ivanic (ondrej.ivanic@gmail.com)
Hi,
It is possible to pass query result (or cursor?) as function
parameter? I need a function which emits zero or more rows per input
row (map function from map&reduce paradigm). Function returns record
(or array): (value1, value2, value3)
I've tried the following:
1) create or replace function test (r record) returns setof record as $$ ...
Doesn't work: PL/pgSQL functions cannot accept type record
From the docs you can try using the "refcursor" data type though I have never done so myself.
4) use function in "select" clause:
select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
In this case I wasn't able figure out how to access record members
returned by the function:
select ?, ?, ?, count(*) from (
select my_map_func(col1, col2, col3, col4) as map_func_result from ...
) as map
group by 1, 2, 3
The '?' should be something like map.map_func_result.value1 (both
map.value1 and map_func_result.value1 doesn't not work). If function
returns array then I can access value1 by using map_func_result[1]
Try " (map.map_func_result).value1 " - the parenthesis around the table alias and column are necessary.
Is there a better way how to solve this? I'm kind of satisfied with 4
(maybe 3) but it is little bit cumbersome
Thanks,
--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
2011/8/7 Ondrej Ivanič <ondrej.ivanic@gmail.com>: > Hi, > > It is possible to pass query result (or cursor?) as function > parameter? I need a function which emits zero or more rows per input > row (map function from map&reduce paradigm). Function returns record > (or array): (value1, value2, value3) > I've tried the following: > > 1) create or replace function test (r record) returns setof record as $$ ... > Doesn't work: PL/pgSQL functions cannot accept type record > > 2) pass query as text parameter and open no scroll cursor inside the function > It works but it's ugly. > > 3) hardcode the query inside function > Similar to (2) and looks better but I need several functions with > different queries inside: > ... > for r in (query) loop > ... > end loop; > ... > > 4) use function in "select" clause: > select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query > In this case I wasn't able figure out how to access record members > returned by the function: > > select ?, ?, ?, count(*) from ( > select my_map_func(col1, col2, col3, col4) as map_func_result from ... > ) as map > group by 1, 2, 3 > > The '?' should be something like map.map_func_result.value1 (both > map.value1 and map_func_result.value1 doesn't not work). If function > returns array then I can access value1 by using map_func_result[1] > > Is there a better way how to solve this? I'm kind of satisfied with 4 > (maybe 3) but it is little bit cumbersome You have a few of different methods for passing sets between functions. 1) refcursor as David noted. reasonably fast. however, I find the 'FETCH' mechanic a little inflexible. 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a headache because a non temp table can get thrashed pretty hard a and a 'TEMP' can cause severe function plan invalidation issues if you're not careful 3) arrays of composites -- the most flexible and very fast for *small* amounts of records (say less than 10,000): #3 is my favorite method unless the data being passed is very large. Here is an example of it in use: CREATE TYPE foo_t as (a int, b text); CREATE FUNCTION get_foos() RETURNS SETOF foo_t AS $$ BEGIN RETURN QUERY SELECT 1, 'abc' UNION ALL SELECT 2, 'def'; END; $$ LANGUAGE PLPGSQL; CREATE FUNCTION do_foos(_foos foo_t[]) returns VOID AS $$ DECLARE f foo_t; BEGIN FOR f in SELECT * FROM UNNEST(_foos) LOOP RAISE NOTICE '% %', f.a, f.b; END LOOP; END; $$ LANGUAGE PLPGSQL; postgres=# SELECT do_foos(ARRAY(SELECT (a,b)::foo_t FROM get_foos())); NOTICE: 1 abc NOTICE: 2 def do_foos --------- (1 row) Also, if you are deploying vs 9.1, be sure to check out Pavel's for-in-array which is better method to do the unnest() which expands the array. merlin
Hi, 2011/8/9 Merlin Moncure <mmoncure@gmail.com>: > You have a few of different methods for passing sets between functions. I do not want to pass data between functions. The ideal solution should look like this: select * from my_map_func(<select query>) > 1) refcursor as David noted. reasonably fast. however, I find the > 'FETCH' mechanic a little inflexible. I've came across this but manual example wasn't (isn't) clear to me: CREATE TABLE test (col text); INSERT INTO test VALUES ('123'); CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS ' BEGIN OPEN $1 FOR SELECT col FROM test; RETURN $1; END; ' LANGUAGE plpgsql; BEGIN; SELECT reffunc('funccursor'); FETCH ALL IN funccursor; COMMIT; What is the "funccursor"? Anyway, I will try to use something like this CREATE FUNCTION my_map_func(refcursor) returns setof text[] as $$ ... $$ LANGUAGE 'plpgsql'; BEGIN; DECLARE my_cursor NO SCROLL CURSOR FOR <query>; SELECT * FROM my_map_func(my_cursor); COMMIT; I'll keep you posted. > 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a > headache because a non temp table can get thrashed pretty hard a and a > 'TEMP' can cause severe function plan invalidation issues if you're > not careful I'm not familiar with this issue (function plan invalidation issues). Could you please provide more details/links about it? > 3) arrays of composites -- the most flexible and very fast for *small* > amounts of records (say less than 10,000): My data set is huge: between 1 and 5 mil rows and avg row size is 100 - 400 bytes -- Ondrej Ivanic (ondrej.ivanic@gmail.com)
2011/8/8 Ondrej Ivanič <ondrej.ivanic@gmail.com>: > Hi, > > 2011/8/9 Merlin Moncure <mmoncure@gmail.com>: >> You have a few of different methods for passing sets between functions. > > I do not want to pass data between functions. The ideal solution > should look like this: > select * from my_map_func(<select query>) well, the method still applies: you'd just do: select * from my_map_func(array(<select query that grabs foo_t type>)) ...but, it sounds like that method is not appropriate -- see below. >> 1) refcursor as David noted. reasonably fast. however, I find the >> 'FETCH' mechanic a little inflexible. > > I've came across this but manual example wasn't (isn't) clear to me: > CREATE TABLE test (col text); > INSERT INTO test VALUES ('123'); > > CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS ' > BEGIN > OPEN $1 FOR SELECT col FROM test; > RETURN $1; > END; > ' LANGUAGE plpgsql; > > BEGIN; > SELECT reffunc('funccursor'); > FETCH ALL IN funccursor; > COMMIT; > > What is the "funccursor"? funccursor is the name -- just a string. refcursors can be named with a variable string and later fetched as an identifier -- they are kinda unique in that respect. >> 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a >> headache because a non temp table can get thrashed pretty hard a and a >> 'TEMP' can cause severe function plan invalidation issues if you're >> not careful > > I'm not familiar with this issue (function plan invalidation issues). > Could you please provide more details/links about it? well, in your particular case it's probably not so much of an issue. plpgsql, when a function is executed for the first time in a session, 'compiles' the source code into a plan that is kept around until it invalidates. one of the things that causes a plan to invalidate is a table getting dropped that is inside the plan -- temp tables are notorious for doing that (in older postgres we'd get annoying OID errors). if your application is even partially cpu bound, and you have a lot of plpgsql flying around, that can add up in a surprising hurry. temp tables also write to the system catalogs, so if your function calls are numerous, short, and sweet, array passing is the way to go because it's a completely in-memory structure that can be used like a set (via unnest) without those issues. for 'big' data though, it's not good. >> 3) arrays of composites -- the most flexible and very fast for *small* >> amounts of records (say less than 10,000): > > My data set is huge: between 1 and 5 mil rows and avg row size is 100 > - 400 bytes your best bet is probably a cursor IMO. merlin