Re: table / query as a prameter for PL/pgSQL function - Mailing list pgsql-general

From Merlin Moncure
Subject Re: table / query as a prameter for PL/pgSQL function
Date
Msg-id CAHyXU0wCFCtKGLHS6pU4ynE+dOMDO5Psofbn4AwXuo9XNvuYQg@mail.gmail.com
Whole thread Raw
In response to Re: table / query as a prameter for PL/pgSQL function  (Ondrej Ivanič <ondrej.ivanic@gmail.com>)
List pgsql-general
2011/8/8 Ondrej Ivanič <ondrej.ivanic@gmail.com>:
> Hi,
>
> 2011/8/9 Merlin Moncure <mmoncure@gmail.com>:
>> You have a few of different methods for passing sets between functions.
>
> I do not want to pass data between functions. The ideal solution
> should look like this:
> select * from my_map_func(<select query>)

well, the method still applies: you'd just do:
select * from my_map_func(array(<select query that grabs foo_t type>))

...but, it sounds like that method is not appropriate -- see below.

>> 1) refcursor as David noted.  reasonably fast. however, I find the
>> 'FETCH' mechanic a little inflexible.
>
> I've came across this but manual example wasn't (isn't) clear to me:
> CREATE TABLE test (col text);
> INSERT INTO test VALUES ('123');
>
> CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
> BEGIN
>    OPEN $1 FOR SELECT col FROM test;
>    RETURN $1;
> END;
> ' LANGUAGE plpgsql;
>
> BEGIN;
> SELECT reffunc('funccursor');
> FETCH ALL IN funccursor;
> COMMIT;
>
> What is the "funccursor"?

funccursor is the name -- just a string.  refcursors can be named with
a variable string and later fetched as an identifier -- they are kinda
unique in that respect.

>> 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
>> headache because a non temp table can get thrashed pretty hard a and a
>> 'TEMP' can cause severe function plan invalidation issues if you're
>> not careful
>
> I'm not familiar with this issue (function plan invalidation issues).
> Could you please provide more details/links about it?

well, in your particular case it's probably not so much of an issue.
plpgsql, when a function is executed for the first time in a session,
'compiles' the source code into a plan that is kept around until it
invalidates.  one of the things that causes a plan to invalidate is a
table getting dropped that is inside the plan -- temp tables are
notorious for doing that (in older postgres we'd get annoying OID
errors).  if your application is even partially cpu bound, and you
have a lot of plpgsql flying around,  that can add up in a surprising
hurry.  temp tables also write to the system catalogs, so if your
function calls are numerous, short, and sweet, array passing is the
way to go because it's a completely in-memory structure that can be
used like a set (via unnest) without those issues.  for 'big' data
though, it's not good.

>> 3) arrays of composites -- the most flexible and very fast for *small*
>> amounts of records (say less than 10,000):
>
> My data set is huge: between 1 and 5 mil rows and avg row size is 100
> - 400 bytes

your best bet is probably a cursor IMO.

merlin

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Pgadmin goes missing in Ubuntu
Next
From: c k
Date:
Subject: postgresql server crash on windows 7 when using plpython