Thread: Stored Procedure Performance
Hi,
I’m trying to evaluate PostgreSQL as a database that will have to store a high volume of data and access that data frequently. One of the features on our wish list is to be able to use stored procedures to access the data and I was wondering if it is usual for stored procedures to perform slower on PostgreSQL than raw SQL?
A simple example of this can be shown with the following commands:
First I created a test table:
CREATE TABLE test (
id int8,
name varchar(128),
description varchar(500),
constraint “pk_test” primary key (id)
);
Then the function I want to test:
CREATE OR REPLACE FUNCTION readTest() RETURNS SETOF test AS
$$
DECLARE
row test%ROWTYPE;
BEGIN
FOR row IN SELECT * FROM test LOOP
RETURN NEXT row;
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
Firstly, I ran EXPLAIN on the raw SQL to see how long that takes to access the database the results are as follows:
EXPLAIN ANALYZE SELECT * FROM test;
Seq Scan on test (cost=0.00..10.90 rows=90 width=798) (actual time=0.003..0.003 rows=0 loops=1)
Total runtime: 0.074 ms
(2 rows)
Secondly, I ran EXPLAIN on the function created above and the results are as follows:
EXPLAIN ANALYZE SELECT * FROM readTest();
Function Scan on readtest (cost=0.00..12.50 rows=1000 width=798) (actual time=0.870..0.870 rows=0 loops=1)
Total runtime: 0.910 ms
(2 rows)
I know that the function is planned the first time it is executed so I ran the same command again to remove that processing from the timings and the results are as follows:
EXPLAIN ANALYZE SELECT * FROM readTest();
Function Scan on readtest (cost=0.00..12.50 rows=1000 width=798) (actual time=0.166..0.166 rows=0 loops=1)
Total runtime: 0.217 ms
(2 rows)
Event with the planning removed, the function still performs significantly slower than the raw SQL. Is that normal or am I doing something wrong with the creation or calling of the function?
Thanks for your help,
Simon
Visit our Website at www.rm.com
This message is confidential. You should not copy it or disclose its contents to anyone. You may use and apply the information for the intended purpose only. Internet communications are not secure; therefore, RM does not accept legal responsibility for the contents of this message. Any views or opinions presented are those of the author only and not of RM. If this email has come to you in error, please delete it, along with any attachments. Please note that RM may intercept incoming and outgoing email communications.
Freedom of Information Act 2000
This email and any attachments may contain confidential information belonging to RM. Where the email and any attachments do contain information of a confidential nature, including without limitation information relating to trade secrets, special terms or prices these shall be deemed for the purpose of the Freedom of Information Act 2000 as information provided in confidence by RM and the disclosure of which would be prejudicial to RM's commercial interests.
This email has been scanned for viruses by Trend ScanMail.
I'm trying to evaluate PostgreSQL as a database that will have to store a high volume of data and access that data frequently. One of the features on our wish list is to be able to use stored procedures to access the data and I was wondering if it is usual for stored procedures to perform slower on PostgreSQL than raw SQL?
worry but your benchmark is completelly flawed.
1st. the tables are empty. will you ever run the real code on empty tables?
2nd. do you really need a stored procedure for such a simple query?
testing something that's far from real usage will not give you any good.
return next will of course show up as slower than standard select. the thing is - will the relative slowness of return next matter to you when you will put more logic in the procedure?
depesz
On 4/11/06, Simon Dale <sdale@rm.com> wrote: > > > > Hi, > > > > I'm trying to evaluate PostgreSQL as a database that will have to store a > high volume of data and access that data frequently. One of the features on > our wish list is to be able to use stored procedures to access the data and > I was wondering if it is usual for stored procedures to perform slower on > PostgreSQL than raw SQL? No. RETURN NEXT keeps accumulating the data before returning. I am not sure if any optimisations have been done to that effect. In general functions are *NOT* slower than RAW SQL. Regds mallah.
Rajesh Kumar Mallah wrote: > On 4/11/06, Simon Dale <sdale@rm.com> wrote: >> >> I'm trying to evaluate PostgreSQL as a database that will have to store a >> high volume of data and access that data frequently. One of the features on >> our wish list is to be able to use stored procedures to access the data and >> I was wondering if it is usual for stored procedures to perform slower on >> PostgreSQL than raw SQL? > > No. > > RETURN NEXT keeps accumulating the data before returning. > I am not sure if any optimisations have been done to that effect. > > In general functions are *NOT* slower than RAW SQL. Actually, in cases where there is a simple way to state the query in raw SQL then I'd expect that a procedural solution IS slower. After all, you're adding another layer of processing. Of course, you normally wouldn't write a procedural solution to a simple query. Added to this is the difference that plpgsql is planned once whereas raw sql will be planned on each query. This means you save planning costs with the plpgsql but have the chance to get better plans with the raw sql. -- Richard Huxton Archonet Ltd
On 4/11/06, Simon Dale <sdale@rm.com> wrote: > I'm trying to evaluate PostgreSQL as a database that will have to store a > high volume of data and access that data frequently. One of the features on > our wish list is to be able to use stored procedures to access the data and > I was wondering if it is usual for stored procedures to perform slower on > PostgreSQL than raw SQL? pl/pgsql procedures are a very thin layer over the query engine. Generally, they run about the same speed as SQL but you are not making apples to apples comparison. One of the few but annoying limitations of pl/pgsql procedures is that you can't return a select directly from the query engine but have to go through the return/return next paradigm which will be slower than raw query for obvious reasons. You can however return a refcursor and you may want to look at them in situations where you want to return arbitrary sets outside the query engine or between pl/pgsql functions. An example of using refcurors in that way is on my blog at http://people.planetpostgresql.org/merlin/index.php?/archives/2-Dealing-With-Recursive-Sets-With-PLPGSQL.html Generally, in my opinion if you want to really unlock the power of postgresql you have to master pl/pgsql. Go for it...it will work and work well. merlin merlin
sdale@rm.com ("Simon Dale") wrote: > <p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; > font-family:Arial'>Event with the planning removed, the function still performs > significantly slower than the raw SQL. Is that normal or am I doing something wrong > with the creation or calling of the > function?<o:p></o:p></span></font></p> I'd expect this, yes. You're doing something via "stored procedure logic" that would be done more directly via straight SQL; of course it won't be faster. In effect, pl/pgsql involves (planning once) then running each line of logic. In effect, you replaced one query (select * from some table) into 90 queries. Yup, there's extra cost there. There's not some "magic" by which stored procedures provide results faster as a natural "matter of course;" the performance benefits generally fall out of two improvements: 1. You eliminate client-to-server round trips. A stored proc that runs 8 queries saves you 8 round trips over submitting the 8 queries directly. Saving you latency time. 2. You can eliminate the marshalling and transmission of unnecessary data. A stored proc that runs 8 queries, and only returns summarized results that all come from the last table queried will eliminate the need to marshall and transmit (possibly over a slow link) the data for the 7 preceding queries. The case that you tried can benefit from neither of those effects; your stored procedure eliminates NO round trips, and NO marshalling/transmission. -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://linuxdatabases.info/info/rdbms.html Rules of the Evil Overlord #228. "If the hero claims he wishes to confess in public or to me personally, I will remind him that a notarized deposition will serve just as well." <http://www.eviloverlord.com/>
Hello At my little machine (pentium 4, 2.8 Ghz, 256 Mb RAM, Suse linux 9) I can process about 100000 records a minute using the next setup: begin work begin for processing if 10.000 records processed: commit work begin work end if end for commit work (!) Regards Henk Sanders > -----Oorspronkelijk bericht----- > Van: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org]Namens Merlin Moncure > Verzonden: dinsdag 11 april 2006 15:50 > Aan: Simon Dale > CC: pgsql-performance@postgresql.org > Onderwerp: Re: [PERFORM] Stored Procedure Performance > > > On 4/11/06, Simon Dale <sdale@rm.com> wrote: > > I'm trying to evaluate PostgreSQL as a database that will have to store a > > high volume of data and access that data frequently. One of the features on > > our wish list is to be able to use stored procedures to access the data and > > I was wondering if it is usual for stored procedures to perform slower on > > PostgreSQL than raw SQL? > > pl/pgsql procedures are a very thin layer over the query engine. > Generally, they run about the same speed as SQL but you are not making > apples to apples comparison. One of the few but annoying limitations > of pl/pgsql procedures is that you can't return a select directly from > the query engine but have to go through the return/return next > paradigm which will be slower than raw query for obvious reasons. > > You can however return a refcursor and you may want to look at them in > situations where you want to return arbitrary sets outside the query > engine or between pl/pgsql functions. An example of using refcurors > in that way is on my blog at > http://people.planetpostgresql.org/merlin/index.php?/archives/2-Dealing-With-Recursive-Sets-With-PLPGSQL.html > > Generally, in my opinion if you want to really unlock the power of > postgresql you have to master pl/pgsql. Go for it...it will work and > work well. > > merlin > > merlin > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
Merlin Moncure wrote: > On 4/11/06, Simon Dale <sdale@rm.com> wrote: > > I'm trying to evaluate PostgreSQL as a database that will have to store a > > high volume of data and access that data frequently. One of the features on > > our wish list is to be able to use stored procedures to access the data and > > I was wondering if it is usual for stored procedures to perform slower on > > PostgreSQL than raw SQL? > > pl/pgsql procedures are a very thin layer over the query engine. > Generally, they run about the same speed as SQL but you are not making > apples to apples comparison. One of the few but annoying limitations > of pl/pgsql procedures is that you can't return a select directly from > the query engine but have to go through the return/return next > paradigm which will be slower than raw query for obvious reasons. There's one problem that hasn't been mentioned. For the optimizer a PL/pgSQL function (really, a function in any language except SQL) is a black box. If you have a complex join of two or three functions, and they don't return 1000 rows, it's very likely that the optimizer is going to get it wrong. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On 4/11/06, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Merlin Moncure wrote: > > pl/pgsql procedures are a very thin layer over the query engine. > > Generally, they run about the same speed as SQL but you are not making > > apples to apples comparison. One of the few but annoying limitations > > of pl/pgsql procedures is that you can't return a select directly from > > the query engine but have to go through the return/return next > > paradigm which will be slower than raw query for obvious reasons. > > There's one problem that hasn't been mentioned. For the optimizer a > PL/pgSQL function (really, a function in any language except SQL) is a > black box. If you have a complex join of two or three functions, and > they don't return 1000 rows, it's very likely that the optimizer is > going to get it wrong. This doesn't bother me that much. Those cases usually have a high overlap with views.You just have to plan on the function being fully materialized before it is inovled further. What drives me crazy is I have to do 'select * from plpgsql_srf()' but I am allowed to do the much friendlier and more versatile 'select sql_srf()', even if they do more or less the same thing. On the flip side, what drives me crazy about sql functions is that all tables have to be in the search path for the validator. Since I frequently use the trick of having multiple schemas with one set of functions this is annoying. Merlin