Thread: Stored Procedure Performance

Stored Procedure Performance

From

"Simon Dale"

Date:

11 April 2006, 04:19:55

Hi,

I’m trying to evaluate PostgreSQL as a database that will have to store a high volume of data and access that data frequently. One of the features on our wish list is to be able to use stored procedures to access the data and I was wondering if it is usual for stored procedures to perform slower on PostgreSQL than raw SQL?

A simple example of this can be shown with the following commands:

First I created a test table:

CREATE TABLE test (

id int8,

name varchar(128),

description varchar(500),

constraint “pk_test” primary key (id)

);

Then the function I want to test:

CREATE OR REPLACE FUNCTION readTest() RETURNS SETOF test AS

DECLARE

row test%ROWTYPE;

BEGIN

FOR row IN SELECT * FROM test LOOP

RETURN NEXT row;

END LOOP;

RETURN;

END;

$$ LANGUAGE plpgsql;

Firstly, I ran EXPLAIN on the raw SQL to see how long that takes to access the database the results are as follows:

EXPLAIN ANALYZE SELECT * FROM test;

Seq Scan on test (cost=0.00..10.90 rows=90 width=798) (actual time=0.003..0.003 rows=0 loops=1)

Total runtime: 0.074 ms

(2 rows)

Secondly, I ran EXPLAIN on the function created above and the results are as follows:

EXPLAIN ANALYZE SELECT * FROM readTest();

Function Scan on readtest (cost=0.00..12.50 rows=1000 width=798) (actual time=0.870..0.870 rows=0 loops=1)

Total runtime: 0.910 ms

(2 rows)

I know that the function is planned the first time it is executed so I ran the same command again to remove that processing from the timings and the results are as follows:

EXPLAIN ANALYZE SELECT * FROM readTest();

Function Scan on readtest (cost=0.00..12.50 rows=1000 width=798) (actual time=0.166..0.166 rows=0 loops=1)

Total runtime: 0.217 ms

(2 rows)

Event with the planning removed, the function still performs significantly slower than the raw SQL. Is that normal or am I doing something wrong with the creation or calling of the function?

Thanks for your help,

Simon

Visit our Website at www.rm.com

This message is confidential. You should not copy it or disclose its contents to anyone. You may use and apply the information for the intended purpose only. Internet communications are not secure; therefore, RM does not accept legal responsibility for the contents of this message. Any views or opinions presented are those of the author only and not of RM. If this email has come to you in error, please delete it, along with any attachments. Please note that RM may intercept incoming and outgoing email communications.

Freedom of Information Act 2000
This email and any attachments may contain confidential information belonging to RM. Where the email and any attachments do contain information of a confidential nature, including without limitation information relating to trade secrets, special terms or prices these shall be deemed for the purpose of the Freedom of Information Act 2000 as information provided in confidence by RM and the disclosure of which would be prejudicial to RM's commercial interests.

This email has been scanned for viruses by Trend ScanMail.

Re: Stored Procedure Performance

From

"hubert depesz lubaczewski"

Date:

11 April 2006, 04:58:39

On 4/11/06, Simon Dale <sdale@rm.com> wrote:

I'm trying to evaluate PostgreSQL as a database that will have to store a high volume of data and access that data frequently. One of the features on our wish list is to be able to use stored procedures to access the data and I was wondering if it is usual for stored procedures to perform slower on PostgreSQL than raw SQL?

worry but your benchmark is completelly flawed.
1st. the tables are empty. will you ever run the real code on empty tables?
2nd. do you really need a stored procedure for such a simple query?

testing something that's far from real usage will not give you any good.
return next will of course show up as slower than standard select. the thing is - will the relative slowness of return next matter to you when you will put more logic in the procedure?

depesz

Re: Stored Procedure Performance

From

"Rajesh Kumar Mallah"

Date:

11 April 2006, 05:23:08

On 4/11/06, Simon Dale <sdale@rm.com> wrote:
>
>
>
> Hi,
>
>
>
> I'm trying to evaluate PostgreSQL as a database that will have to store a
> high volume of data and access that data frequently. One of the features on
> our wish list is to be able to use stored procedures to access the data and
> I was wondering if it is usual for stored procedures to perform slower on
> PostgreSQL than raw SQL?

No.

RETURN NEXT keeps accumulating the data before returning.
I am not sure if any optimisations have been done to that effect.

In general functions are *NOT* slower than RAW SQL.

Regds
mallah.

Re: Stored Procedure Performance

From

Richard Huxton

Date:

11 April 2006, 07:04:35

Rajesh Kumar Mallah wrote:
> On 4/11/06, Simon Dale <sdale@rm.com> wrote:
>>
>> I'm trying to evaluate PostgreSQL as a database that will have to store a
>> high volume of data and access that data frequently. One of the features on
>> our wish list is to be able to use stored procedures to access the data and
>> I was wondering if it is usual for stored procedures to perform slower on
>> PostgreSQL than raw SQL?
>
> No.
>
> RETURN NEXT keeps accumulating the data before returning.
> I am not sure if any optimisations have been done to that effect.
>
> In general functions are *NOT* slower than RAW SQL.

Actually, in cases where there is a simple way to state the query in raw
SQL then I'd expect that a procedural solution IS slower. After all,
you're adding another layer of processing.

Of course, you normally wouldn't write a procedural solution to a simple
query.

Added to this is the difference that plpgsql is planned once whereas raw
sql will be planned on each query. This means you save planning costs
with the plpgsql but have the chance to get better plans with the raw sql.
--
   Richard Huxton
   Archonet Ltd

Re: Stored Procedure Performance

From

"Merlin Moncure"

Date:

11 April 2006, 10:50:13

On 4/11/06, Simon Dale <sdale@rm.com> wrote:
> I'm trying to evaluate PostgreSQL as a database that will have to store a
> high volume of data and access that data frequently. One of the features on
> our wish list is to be able to use stored procedures to access the data and
> I was wondering if it is usual for stored procedures to perform slower on
> PostgreSQL than raw SQL?

pl/pgsql procedures are a very thin layer over the query engine.
Generally, they run about the same speed as SQL but you are not making
apples to apples comparison.  One of the few but annoying limitations
of pl/pgsql procedures is that you can't return a select directly from
the query engine but have to go through the return/return next
paradigm which will be slower than raw query for obvious reasons.

You can however return a refcursor and you may want to look at them in
situations where you want to return arbitrary sets outside the query
engine or between pl/pgsql functions.  An example of using refcurors
in that way is on my blog at
http://people.planetpostgresql.org/merlin/index.php?/archives/2-Dealing-With-Recursive-Sets-With-PLPGSQL.html

Generally, in my opinion if you want to really unlock the power of
postgresql you have to master pl/pgsql.  Go for it...it will work and
work well.

merlin

merlin

Re: Stored Procedure Performance

From

Christopher Browne

Date:

11 April 2006, 10:57:11

sdale@rm.com ("Simon Dale") wrote:
> <p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
> font-family:Arial'>Event with the planning removed, the function still performs
> significantly slower than the raw SQL. Is that normal or am I doing something wrong
> with the creation or calling of the
> function?<o:p></o:p></span></font></p>

I'd expect this, yes.

You're doing something via "stored procedure logic" that would be done
more directly via straight SQL; of course it won't be faster.

In effect, pl/pgsql involves (planning once) then running each line of
logic.  In effect, you replaced one query (select * from some table)
into 90 queries.  Yup, there's extra cost there.

There's not some "magic" by which stored procedures provide results
faster as a natural "matter of course;" the performance benefits
generally fall out of two improvements:

 1.  You eliminate client-to-server round trips.

    A stored proc that runs 8 queries saves you 8 round trips over
    submitting the 8 queries directly.  Saving you latency time.

 2.  You can eliminate the marshalling and transmission of unnecessary
     data.

   A stored proc that runs 8 queries, and only returns summarized
   results that all come from the last table queried will eliminate
   the need to marshall and transmit (possibly over a slow link) the
   data for the 7 preceding queries.

The case that you tried can benefit from neither of those effects;
your stored procedure eliminates NO round trips, and NO
marshalling/transmission.
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://linuxdatabases.info/info/rdbms.html
Rules of  the Evil Overlord  #228.  "If the  hero claims he  wishes to
confess  in public  or to  me  personally, I  will remind  him that  a
notarized deposition will serve just as well."
<http://www.eviloverlord.com/>

Re: Stored Procedure Performance

From

"H.J. Sanders"

Date:

11 April 2006, 11:02:39

Hello

At my little machine (pentium 4, 2.8 Ghz, 256 Mb RAM, Suse linux 9)
I can process about 100000 records a minute using the next setup:

begin work

begin for
    processing
    if 10.000 records processed:
        commit work
        begin work
    end if
end for

commit work (!)

Regards

Henk Sanders




> -----Oorspronkelijk bericht-----
> Van: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org]Namens Merlin Moncure
> Verzonden: dinsdag 11 april 2006 15:50
> Aan: Simon Dale
> CC: pgsql-performance@postgresql.org
> Onderwerp: Re: [PERFORM] Stored Procedure Performance
>
>
> On 4/11/06, Simon Dale <sdale@rm.com> wrote:
> > I'm trying to evaluate PostgreSQL as a database that will have to store a
> > high volume of data and access that data frequently. One of the features on
> > our wish list is to be able to use stored procedures to access the data and
> > I was wondering if it is usual for stored procedures to perform slower on
> > PostgreSQL than raw SQL?
>
> pl/pgsql procedures are a very thin layer over the query engine.
> Generally, they run about the same speed as SQL but you are not making
> apples to apples comparison.  One of the few but annoying limitations
> of pl/pgsql procedures is that you can't return a select directly from
> the query engine but have to go through the return/return next
> paradigm which will be slower than raw query for obvious reasons.
>
> You can however return a refcursor and you may want to look at them in
> situations where you want to return arbitrary sets outside the query
> engine or between pl/pgsql functions.  An example of using refcurors
> in that way is on my blog at
> http://people.planetpostgresql.org/merlin/index.php?/archives/2-Dealing-With-Recursive-Sets-With-PLPGSQL.html
>
> Generally, in my opinion if you want to really unlock the power of
> postgresql you have to master pl/pgsql.  Go for it...it will work and
> work well.
>
> merlin
>
> merlin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

Re: Stored Procedure Performance

From

Alvaro Herrera

Date:

11 April 2006, 11:09:04

Merlin Moncure wrote:
> On 4/11/06, Simon Dale <sdale@rm.com> wrote:
> > I'm trying to evaluate PostgreSQL as a database that will have to store a
> > high volume of data and access that data frequently. One of the features on
> > our wish list is to be able to use stored procedures to access the data and
> > I was wondering if it is usual for stored procedures to perform slower on
> > PostgreSQL than raw SQL?
>
> pl/pgsql procedures are a very thin layer over the query engine.
> Generally, they run about the same speed as SQL but you are not making
> apples to apples comparison.  One of the few but annoying limitations
> of pl/pgsql procedures is that you can't return a select directly from
> the query engine but have to go through the return/return next
> paradigm which will be slower than raw query for obvious reasons.

There's one problem that hasn't been mentioned.  For the optimizer a
PL/pgSQL function (really, a function in any language except SQL) is a
black box.  If you have a complex join of two or three functions, and
they don't return 1000 rows, it's very likely that the optimizer is
going to get it wrong.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Stored Procedure Performance

From

"Merlin Moncure"

Date:

11 April 2006, 18:14:20

On 4/11/06, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Merlin Moncure wrote:
> > pl/pgsql procedures are a very thin layer over the query engine.
> > Generally, they run about the same speed as SQL but you are not making
> > apples to apples comparison.  One of the few but annoying limitations
> > of pl/pgsql procedures is that you can't return a select directly from
> > the query engine but have to go through the return/return next
> > paradigm which will be slower than raw query for obvious reasons.
>
> There's one problem that hasn't been mentioned.  For the optimizer a
> PL/pgSQL function (really, a function in any language except SQL) is a
> black box.  If you have a complex join of two or three functions, and
> they don't return 1000 rows, it's very likely that the optimizer is
> going to get it wrong.

This doesn't bother me that much. Those cases usually have a high
overlap with views.You just have to plan on the function being fully
materialized before it is inovled further.  What drives me crazy is I
have to do 'select * from plpgsql_srf()' but I am allowed to do the
much friendlier and more versatile 'select sql_srf()', even if they do
more or less the same thing.

On the flip side, what drives me crazy about sql functions is that all
tables have to be in the search path for the validator.  Since I
frequently use the trick of having multiple schemas with one set of
functions this is annoying.

Merlin