Thread: How to write a function that manipulates a set of results

How to write a function that manipulates a set of results

From
Ashley Moran
Date:
Hi

I'm passing this on second hand so forgive me if I'm a bit vague...

My boss is trying to write a function in PL/pgsql that manipulates a
small, temporary set of data.  It fetches about 20 rows from a table
of approx 4 million, then it needs to do further calculations to
narrow them down further.  Previously we have only written database
code for SQL Server, and there we would use a table variable.
Apparently temporary tables are not equivalent and not suitable for
this.

The closest I've come from reading the docs is functions that return
SETOF.  However I can't see any way to query the result set like a
table.  Can anyone offer any hints/links?

Thanks
Ashley

Re: How to write a function that manipulates a set of results

From
Andreas Kretschmer
Date:
Ashley Moran <work@ashleymoran.me.uk> schrieb:
> The closest I've come from reading the docs is functions that return
> SETOF.  However I can't see any way to query the result set like a  table.

Right.

> Can anyone offer any hints/links?

select * from <insert_your_srf-function_here>.

For instance, if your SRF-function is called foo(), do:

select * from foo();


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."    (unknow)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: How to write a function that manipulates a set of results

From
Ashley Moran
Date:
On Mar 14, 2007, at 6:02 pm, Andreas Kretschmer wrote:
> select * from <insert_your_srf-function_here>.
>
> For instance, if your SRF-function is called foo(), do:
>
> select * from foo();


Andreas,

It's more complicated than that.  What we need to do is something
along the lines of:

results = SELECT * FROM foo();
DELETE FROM results WHERE (some condition involving results);
some_value = SELECT value FROM results WHERE (etc);

and so on...

All of which is easy with table variable, but I can't see how to
translate it to PL/pgsql.  Is there any way to manipulate result sets
in a set-based manner like this?

Ashley




Re: How to write a function that manipulates a set of results

From
Andreas Kretschmer
Date:
Ashley Moran <work@ashleymoran.me.uk> schrieb:

>
> On Mar 14, 2007, at 6:02 pm, Andreas Kretschmer wrote:
> >select * from <insert_your_srf-function_here>.
> >For instance, if your SRF-function is called foo(), do:
> >select * from foo();
>
>
> Andreas,
>
> It's more complicated than that.  What we need to do is something  along
> the lines of:
>
> results = SELECT * FROM foo();

Try something like:

create results as SELECT * FROM foo();

Than you can do:

> DELETE FROM results WHERE (some condition involving results);
> some_value = SELECT value FROM results WHERE (etc);

Try: create table some_value as SELECT value FROM results WHERE (etc);

>
> and so on...

Yes, and so on...

>
> All of which is easy with table variable, but I can't see how to  translate
> it to PL/pgsql.  Is there any way to manipulate result sets  in a set-based
> manner like this?

You can do a lot in pl/pgsql. For some things, for instance dynamically
created SQLs, you need EXECUTE. But you can do a lot with pl/pgsql.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."    (unknow)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: How to write a function that manipulates a set of results

From
Tom Lane
Date:
Ashley Moran <work@ashleymoran.me.uk> writes:
> ... All of which is easy with table variable, but I can't see how to
> translate it to PL/pgsql.  Is there any way to manipulate result sets
> in a set-based manner like this?

Sure: use a table.  What was the objection to temp tables exactly?

            regards, tom lane

Re: How to write a function that manipulates a set of results

From
Tony Caduto
Date:
Ashley Moran wrote:
> It fetches about 20 rows from a table of approx 4 million, then it
> needs to do further calculations to narrow them down further.
> Previously we have only written database code for SQL Server, and
> there we would use a table variable.  Apparently temporary tables are
> not equivalent and not suitable for this.
>
>
Seems like a temp table with a insert into would work for you.

insert into mytemptable
(field1,field2)
select field1,field2 from sometable where field1 = 5;

Then you can query the new temp table anyway you would like while you
are still in the function.

--
Tony Caduto
AM Software Design
http://www.amsoftwaredesign.com
Home of PG Lightning Admin for Postgresql
Your best bet for Postgresql Administration


Re: How to write a function that manipulates a set of results

From
Ashley Moran
Date:
On Mar 14, 2007, at 7:25 pm, Tom Lane wrote:
> Sure: use a table.  What was the objection to temp tables exactly?


My boss is under the impression that calling the procedure twice on
the same connection creates a name conflict for the temporary table.

Also I'm unsure how the use of temporary tables will affect the
running of the query.  This function will be called up to 400 times
by a wrapping function that needs to complete quickly (relatively -
the calculations need to return in a few seconds).  I haven't had
chance to read much about PG temp tables yet, so I don't know how the
implementation works.  Are small tables stored in RAM, or does
creating a temporary table always force a disk write?

Sorry my sheer ignorance here.  I have already written a working,
fully tested - albeit unoptimised - implementation of this algorithm
in Ruby, accessible as a basic (HTTP GET) web service.  However, a
developer here believes that the overhead of an HTTP connection over
our gigabit LAN will add too much to the (> 1 second) running time of
the calculations.  This means that for political reasons we are
forced to rewrite it as a stored procedure that we can call directly
from application code.  The project is urgent, the aforementioned
developer is now on holiday, and so my non-developer boss is
scrabbling to learn PL/pgsql and I am forced to bombard the list with
inane questions.  Thanks for your patience :)

Ashley




Re: How to write a function that manipulates a set of results

From
"Joshua D. Drake"
Date:
> Sorry my sheer ignorance here.  I have already written a working, fully
> tested - albeit unoptimised - implementation of this algorithm in Ruby,
> accessible as a basic (HTTP GET) web service.  However, a developer here
> believes that the overhead of an HTTP connection over our gigabit LAN
> will add too much to the (> 1 second) running time of the calculations.
> This means that for political reasons we are forced to rewrite it as a
> stored procedure that we can call directly from application code.  The
> project is urgent, the aforementioned developer is now on holiday, and
> so my non-developer boss is scrabbling to learn PL/pgsql and I am forced
> to bombard the list with inane questions.  Thanks for your patience :)

Use plperl :) (seriously). Or plruby, there is one.

Joshua D. Drake

>
> Ashley
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: How to write a function that manipulates a set of results

From
Alvaro Herrera
Date:
Ashley Moran wrote:
>
> On Mar 14, 2007, at 7:25 pm, Tom Lane wrote:
> >Sure: use a table.  What was the objection to temp tables exactly?
>
> My boss is under the impression that calling the procedure twice on
> the same connection creates a name conflict for the temporary table.
>
> Also I'm unsure how the use of temporary tables will affect the
> running of the query.  This function will be called up to 400 times
> by a wrapping function that needs to complete quickly (relatively -
> the calculations need to return in a few seconds).  I haven't had
> chance to read much about PG temp tables yet, so I don't know how the
> implementation works.  Are small tables stored in RAM, or does
> creating a temporary table always force a disk write?

Temp tables stay in RAM until they are bigger than temp_buffers.  If you
need them to be big and quick, maybe it would be appropriate to use
indexes (note these count towards temp_buffers), ANALYZE, etc.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: How to write a function that manipulates a set of results

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Ashley Moran wrote:
>> Also I'm unsure how the use of temporary tables will affect the
>> running of the query.  This function will be called up to 400 times
>> by a wrapping function that needs to complete quickly (relatively -
>> the calculations need to return in a few seconds).  I haven't had
>> chance to read much about PG temp tables yet, so I don't know how the
>> implementation works.  Are small tables stored in RAM, or does
>> creating a temporary table always force a disk write?

> Temp tables stay in RAM until they are bigger than temp_buffers.  If you
> need them to be big and quick, maybe it would be appropriate to use
> indexes (note these count towards temp_buffers), ANALYZE, etc.

You do need to realize that creation of a temp table involves making
entries in the system catalogs.  If you can set it up so that you reuse
the same temp table(s) for the life of a connection, you'll save a lot
of thrashing and need for catalog vacuuming (the ON COMMIT DELETE ROWS
option for temp tables might help here).  Other than that gotcha, they
should be pretty efficient.

            regards, tom lane

Re: How to write a function that manipulates a set of results

From
Ashley Moran
Date:
On 15 Mar 2007, at 00:21, Tom Lane wrote:

>> Temp tables stay in RAM until they are bigger than temp_buffers.
>> If you
>> need them to be big and quick, maybe it would be appropriate to use
>> indexes (note these count towards temp_buffers), ANALYZE, etc.
>
> You do need to realize that creation of a temp table involves making
> entries in the system catalogs.  If you can set it up so that you
> reuse
> the same temp table(s) for the life of a connection, you'll save a lot
> of thrashing and need for catalog vacuuming (the ON COMMIT DELETE ROWS
> option for temp tables might help here).  Other than that gotcha, they
> should be pretty efficient.

Thanks for these tips, they are extremely useful

Re: How to write a function that manipulates a set of results

From
Stefan Berglund
Date:
On Wed, 14 Mar 2007 18:50:27 +0000, work@ashleymoran.me.uk (Ashley
Moran) wrote:
 in <B5B55C8F-8C52-48A5-B8D9-8B071681299D@ashleymoran.me.uk>

>It's more complicated than that.  What we need to do is something
>along the lines of:
>
>results = SELECT * FROM foo();
>DELETE FROM results WHERE (some condition involving results);
>some_value = SELECT value FROM results WHERE (etc);
>
>and so on...
>
>All of which is easy with table variable, but I can't see how to
>translate it to PL/pgsql.  Is there any way to manipulate result sets
>in a set-based manner like this?

A table returning function or SRF can be used in joins with other tables
or subqueries.  In fact, you can use it in either of two formats:

If the SRF returns a native data type then you can use just the function
name.  Consider the function foo() which returns INTEGER.

CREATE OR REPLACE FUNCTION foo () RETURNS SETOF INTEGER AS

SELECT *
FROM
  foo() F INNER JOIN
  some_table T ON F=T.id;

If the SRF returns a composite type then you can use the function name
qualified by any of the members of the list of types.

SELECT *
FROM
  foo() F INNER JOIN
  some_table T ON F.num=T.id;

---
Stefan Berglund