Thread: stored procedures and dynamic queries

stored procedures and dynamic queries

From
Ivan Sergio Borgonovo
Date:
Any general rule about dynamically generated queries in stored
procedures vs. performance?

I was going to write stuff as simple as

create or replace function EditQty(int, int, int, varchar(10))
 returns boolean as
$$
declare
    _uid alias for $1;
    _aid alias for $2;
    _qty alias for $3;
    _table alias for $4;
    _modified timestamp;
begin
    _table := 'shop_commerce_basket' || _table;
    _modified := now();
    update _table
    set qty=_qty,
        modified=_modified
    where uid=_uid and aid=_aid;
    if not found then
        insert into _table (uid, aid, qty)
        values(_uid, _aid, _qty);
    end if;
    return true;
end;
$$ language plpgsql;

Is it going to perform worse than with a static table name?

Where can I find some clue about the effects of similar decisions?

thx

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: stored procedures and dynamic queries

From
Richard Huxton
Date:
Ivan Sergio Borgonovo wrote:
> Any general rule about dynamically generated queries in stored
> procedures vs. performance?

It's the same decision as with any prepared-plan vs. plan-each-time
trade-off.

A query built using EXECUTE will have to be planned each time. That
costs you something but means the plan will have all the information it
needs.

A pre-planned query saves planning time on the second, third, etc. runs,
but the plan won't change with the query parameters.

So:
A query that's going to be executed a million times in a loop with the
same plan each time implies you want a pre-planned query.

A query executed once, or where changing input parameters would benefit
from changing plans implies you want to re-plan each time.
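For illustration (table and column names are hypothetical), the two styles might look like this in plpgsql:

```sql
-- Pre-planned: plpgsql prepares this statement once per session
-- and reuses the cached plan on subsequent calls.
create or replace function get_qty_static(_uid int) returns int as $$
declare
    _qty int;
begin
    select qty into _qty from basket where uid = _uid;
    return _qty;
end;
$$ language plpgsql;

-- Plan-each-time: EXECUTE builds and plans the query on every call,
-- so the planner sees the actual table name and parameter value.
create or replace function get_qty_dynamic(_uid int, _tbl text) returns int as $$
declare
    _qty int;
begin
    execute 'select qty from ' || quote_ident(_tbl)
         || ' where uid = ' || quote_literal(_uid::text)
       into _qty;
    return _qty;
end;
$$ language plpgsql;
```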


If you don't have a good reason to think it matters one way or the
other, then it probably doesn't. If it's not worth testing then it's not
worth optimising either.


Of course, in your example the query wouldn't work at all - you'd need
to use the EXECUTE command in plpgsql.
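For reference, here is a sketch of how the original function might be rewritten with EXECUTE (same assumed table layout; note that EXECUTE does not set FOUND, so GET DIAGNOSTICS is used to check the row count):

```sql
create or replace function EditQty(int, int, int, varchar(10))
 returns boolean as $$
declare
    _uid alias for $1;
    _aid alias for $2;
    _qty alias for $3;
    _table text := 'shop_commerce_basket' || $4;
    _rows int;
begin
    execute 'update ' || quote_ident(_table)
         || ' set qty = ' || quote_literal(_qty::text)
         || ', modified = now()'
         || ' where uid = ' || quote_literal(_uid::text)
         || ' and aid = ' || quote_literal(_aid::text);
    -- EXECUTE does not set FOUND; read the row count explicitly.
    get diagnostics _rows = row_count;
    if _rows = 0 then
        execute 'insert into ' || quote_ident(_table)
             || ' (uid, aid, qty) values ('
             || quote_literal(_uid::text) || ', '
             || quote_literal(_aid::text) || ', '
             || quote_literal(_qty::text) || ')';
    end if;
    return true;
end;
$$ language plpgsql;
```

quote_ident() and quote_literal() guard against SQL injection when splicing identifiers and values into a dynamic string.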

--
   Richard Huxton
   Archonet Ltd

Re: stored procedures and dynamic queries

From
Ivan Sergio Borgonovo
Date:
On Mon, 03 Dec 2007 19:06:29 +0000
Richard Huxton <dev@archonet.com> wrote:

> Ivan Sergio Borgonovo wrote:
> > Any general rule about dynamically generated queries in stored
> > procedures vs. performance?
>
> It's the same decision as with any prepared-plan vs.
> plan-each-time trade-off.

Should I guess this trade-off by consulting haruspices, or is it
possible to gain a little bit of knowledge, or is it something based
on heuristics and experience?

If the second, any good reading?
How should I take into account parameters like:
- is it an insert/update or a select query?
- is it performed on a large table?
- is it performed frequently? frequently with the same parameters?
frequently with different parameters?
- is the table growing?
- are there frequent deletes?
etc...

I'm just guessing things that may impact.

> Of course, in your example the query wouldn't work at all - you'd
> need to use the EXECUTE command in plpgsql.

OK, I just adapted it from a more complicated one and forgot to fix it.

thx

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: stored procedures and dynamic queries

From
Richard Huxton
Date:
Ivan Sergio Borgonovo wrote:
> On Mon, 03 Dec 2007 19:06:29 +0000
> Richard Huxton <dev@archonet.com> wrote:
>
>> Ivan Sergio Borgonovo wrote:
>>> Any general rule about dynamically generated queries in stored
>>> procedures vs. performance?
>> It's the same decision as with any prepared-plan vs.
>> plan-each-time trade-off.
>
> Should I guess this trade-off by consulting haruspices, or is it
> possible to gain a little bit of knowledge, or is it something based
> on heuristics and experience?

Unless it's an obvious decision (millions of small identical queries vs.
occasional large complex ones) then you'll have to test. That's going to
be true of any decision like this on any system.

> If the second, any good reading?
> How should I take into account parameters like:
> - is it an insert/update or a select query?
> - is it performed on a large table?
> - is it performed frequently? frequently with the same parameters?
> frequently with different parameters?
> - is the table growing?
> - are there frequent deletes?

Read the manuals, particularly the sections on MVCC, the planner, and
EXPLAIN. Read up on the WAL and checkpoints.
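For example, EXPLAIN shows the difference between a plan prepared
before the parameter value is known and one planned for a specific
value (table and column names here are made up for illustration):

```sql
-- Plan chosen for a prepared statement, without value statistics:
prepare q(int) as select * from orders where customer_id = $1;
explain execute q(42);

-- Plan chosen when the planner can see the actual value:
explain analyze select * from orders where customer_id = 42;
```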

You might want to skim through the archives on the performance list. You
might find some of the community-related resources useful too. Some of
the following might be a little out-of-date, so check:

http://www.westnet.com/~gsmith/content/postgresql/
http://www.powerpostgresql.com/PerfList
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

HTH
--
   Richard Huxton
   Archonet Ltd

Re: stored procedures and dynamic queries

From
Ivan Sergio Borgonovo
Date:
On Tue, 04 Dec 2007 08:14:56 +0000
Richard Huxton <dev@archonet.com> wrote:

> Unless it's an obvious decision (millions of small identical
> queries vs. occasional large complex ones) then you'll have to
> test. That's going to be true of any decision like this on any
> system.

:(

I'm trying to grasp a general idea from the viewpoint of a developer
rather than a sysadmin. At this moment I'm not interested in
optimisation; I'm interested in understanding the trade-offs of
certain decisions in the face of a cleaner interface.

Most of the documents available are from a sysadmin point of view.
That makes me think that unless I write terrible SQL it won't make a
big difference, and the first place I'll have to look if the
application needs to run faster is the pg config.

This part (for posterity) looks like the most interesting for
developers:
http://www.gtsm.com/oscon2003/toc.html
starting from "Functions".

Still I can't understand some things; I'll come back.

thanks for the right pointers.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: stored procedures and dynamic queries

From
Richard Huxton
Date:
Ivan Sergio Borgonovo wrote:
> On Tue, 04 Dec 2007 08:14:56 +0000
> Richard Huxton <dev@archonet.com> wrote:
>
>> Unless it's an obvious decision (millions of small identical
>> queries vs. occasional large complex ones) then you'll have to
>> test. That's going to be true of any decision like this on any
>> system.
>
> :(
>
> I'm trying to grasp a general idea from the viewpoint of a developer
> rather than a sysadmin. At this moment I'm not interested in
> optimisation; I'm interested in understanding the trade-offs of
> certain decisions in the face of a cleaner interface.

Always go for the cleaner design. If it turns out that isn't fast
enough, *then* start worrying about having a bad but faster design.

> Most of the documents available are from a sysadmin point of view.
> That makes me think that unless I write terrible SQL it won't make a
> big difference, and the first place I'll have to look if the
> application needs to run faster is the pg config.

The whole point of a RDBMS is so that you don't have to worry about
this. If you have to start tweaking the fine details of these things,
then that's a point where the RDBMS has reached its limits. In a perfect
world you wouldn't need to configure PG either, but it's not that clever
I'm afraid.

Keep your database design clean, likewise with your queries, consider
whether you can cache certain results and get everything working first.

Then look for where the bottlenecks are: do you have any unexpectedly
long-running queries? (See www.pgfoundry.org for some tools to help
with log analysis.)

> This part (for posterity) looks as the most interesting for
> developers:
> http://www.gtsm.com/oscon2003/toc.html
> Starting from Functions

Note that this is quite old now, so some performance-related assumptions
will be wrong for current versions of PG.

> Still I can't understand some things, I'll come back.


--
   Richard Huxton
   Archonet Ltd

Re: stored procedures and dynamic queries

From
Ted Byers
Date:
--- Richard Huxton <dev@archonet.com> wrote:
> Ivan Sergio Borgonovo wrote:
> > On Tue, 04 Dec 2007 08:14:56 +0000
> > Richard Huxton <dev@archonet.com> wrote:
> >
> >> Unless it's an obvious decision (millions of small identical
> >> queries vs. occasional large complex ones) then you'll have to
> >> test. That's going to be true of any decision like this on any
> >> system.
> >
> > :(
> >
> > I'm trying to grasp a general idea from the viewpoint of a developer
> > rather than a sysadmin. At this moment I'm not interested in
> > optimisation, I'm interested in understanding the trade-offs of
> > certain decisions in the face of a cleaner interface.
>
> Always go for the cleaner design. If it turns out that isn't fast
> enough, *then* start worrying about having a bad but faster design.
>
I don't know about "clean", but the mantra here is
"First you make it provably correct, then you make it
fast."

I am a fan of making things as simple as practicable,
BUT NO SIMPLER. I don't know how that meshes with
"clean", but the general idea is right.

If you look at my code, my C++ and Java code is infinitely better
than my SQL, so with either of those I end up making fewer variants,
and my starting point is generally much closer to my end point; that
is just a function of experience.  With my SQL code, I generally find
myself producing a variety of scripts to support a given task.  And
these include exploration of just about everything SQL has to offer.
I TRY everything, from joins to correlated subqueries to stored
procedures to views, and beyond.  And I am still trying to develop an
intuition as to which options will serve best in a given situation,
analogous to the design patterns I routinely use in my C++ and Java
code.

Your reaction to Richard's advice to test seems to imply you want a
more direct approach to "THE" answer.  That is chasing something that
doesn't exist!  I tell you, as a developer (and NOT a sysadmin), there
is no substitute for experience, and THAT can ONLY be obtained through
trial and error.  I haven't seen any books about design patterns in
SQL (while there are plenty in C++, Java, and other languages), so
there is no shortcut.

In short, the general idea most developers I know work
with is "code and test, and then code again and test
again, until you've tried the variety of options that
exist."

There is no substitute for testing yourself.  I have
seen repeated advice to replace correlated subqueries
by left joins, claiming that ubiquitous experience is
that the joins are faster, and yet, in my tests, in
most cases there was little difference in speed while
in others the correlated subqueries were faster.  So
it appears the advice was based on experience with
dated software and the RDBMS in question had
subsequently dramatically improved how it handled
correlated subqueries.  And my use of EXPLAIN
confirmed what I was finding through testing.
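As a concrete example of the kind of pair worth testing yourself
(schema entirely hypothetical), here are "customers with no orders"
written both ways:

```sql
-- Correlated subquery form:
select c.id
from customers c
where not exists (select 1 from orders o where o.customer_id = c.id);

-- Left-join (anti-join) form often recommended as a replacement:
select c.id
from customers c
left join orders o on o.customer_id = c.id
where o.customer_id is null;
```

Running EXPLAIN ANALYZE on both against your own data is the only way
to know which the planner handles better in your version.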

> > Most of the documents available are from a sysadmin point of view.
> > That makes me think that unless I write terrible SQL it won't make a
> > big difference, and the first place I'll have to look if the
> > application needs to run faster is the pg config.
>
> The whole point of a RDBMS is so that you don't have to worry about
> this. If you have to start tweaking the fine details of these things,
> then that's a point where the RDBMS has reached its limits. In a
> perfect world you wouldn't need to configure PG either, but it's not
> that clever I'm afraid.
>
I am not sure I buy this, if I properly understand it.
 Trust me, I have written some really bad but simple
queries that took hours to complete a task that was
completed in less than a minute with smarter code.
And yet the "bad" code I'd written was similar in
nature to examples used in some texts to explain ideas
in SQL.  The point is, until you get extensive
experience in SQL programming and optimization, you
won't know what is bad code until you test it.

Personally, I rely on the sysadmin to administer the
RDBMS properly, to ensure it is configured
appropriately for our application, AND I ask his or
her advice and input on how I design and implement my
SQL code, as well as for input on distributed
application architecture.  You can't do it all.  On my
development machine, I just use whatever the default
configuration is, so I have it up and running in a
flash and can focus on my development.  I'll change
that configuration ONLY if the sysadmin tells me there
is a problem with the default.  My advice, therefore,
is to forget about configuration issues and focus on
getting your SQL right, and then fast, and let your
sysadmin advise, and possibly help, with changes to
your configuration should he or she feel it needs to
be modified to better represent how your application
will behave once in production.

> Keep your database design clean, likewise with your queries, consider
> whether you can cache certain results and get everything working first.
>
Richard, could you possibly clarify what you mean by a "clean
design"?  Is it different from what I normally do with regard to
ensuring all the requisite data is available, properly normalized,
with a suitable suite of indices, keys, &c., and as simple as
practicable?  (That is, over-simplification is avoided.)  I also tend
to ensure that all user access to the data is through either a stored
procedure or a read-only view (perhaps with a little paranoia thrown
in ;).
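A minimal sketch of that access pattern (role and object names
hypothetical):

```sql
-- The application role sees only a read-only view, never the base table.
create view basket_view as
    select uid, aid, qty from shop_commerce_basket;
revoke all on shop_commerce_basket from webuser;
grant select on basket_view to webuser;

-- Writes go through a stored procedure owned by a privileged role,
-- e.g. a SECURITY DEFINER function like the EditQty example above.
```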

Ted

Re: stored procedures and dynamic queries

From
Ivan Sergio Borgonovo
Date:
On Tue, 04 Dec 2007 13:54:15 +0000
Richard Huxton <dev@archonet.com> wrote:

> Always go for the cleaner design. If it turns out that isn't fast
> enough, *then* start worrying about having a bad but faster design.

mmm yeah, right. I expressed myself badly.
What I mean is that I first have to know the boundaries in order to
know what a good design is.
I'm ready to refactor... I'd just like to avoid refactoring out of
ignorance of common knowledge about good practice.

BTW, still a good read for devs:

http://www.gtsm.com/oscon2004/

> > Most of the documents available are from a sysadmin point of view.
> > That makes me think that unless I write terrible SQL it won't
> > make a big difference, and the first place I'll have to look if
> > the application needs to run faster is the pg config.

> The whole point of a RDBMS is so that you don't have to worry about
> this. If you have to start tweaking the fine details of these

This will definitely be the last resort. These days you can't wear
as many hats as before.

> Keep your database design clean, likewise with your queries,
> consider whether you can cache certain results and get everything
> working first.

In the end... if you don't look too much at the details, everything
will reach a defined deterministic state after all ;)

> Note that this is quite old now, so some performance-related
> assumptions will be wrong for current versions of PG.

I noticed. Maybe this will be part of some other question later.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: stored procedures and dynamic queries

From
"Obe, Regina"
Date:
As a side note, there is actually a book on design patterns in SQL,
although I personally haven't read it. From the reviews I recall
reading, I think it's mostly based on Oracle features.  It might still
be a good read as far as PostgreSQL is concerned, except for the
sections on graphs and recursive trees, since Oracle has special
syntactic sugar for that kind of thing that is unique to Oracle.

http://www.rampant-books.com/book_2006_1_sql_coding_styles.htm

Hope that helps,
Regina



-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Ted Byers
Sent: Tuesday, December 04, 2007 9:59 AM
To: Richard Huxton; Ivan Sergio Borgonovo
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] stored procedures and dynamic queries
