Thread: can stored procedures with computational sql queries improve API performance?

Hello.

I have a straight forward question, but I am just trying to analyze the 
specifics.

So I have a set of queries depending on each other in a sequence to 
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on 
certain conditions it categorizes the amounts into types (aka Debit 
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input 
parameters. these queries never change.

So will I get any performance benefit by having them in a stored 
procedure rather than sending the queries from my Python based API?

Regards.




On Jul 9, 2024, at 17:58, Krishnakant Mane <kkproghub@gmail.com> wrote:
>
> Hello.
>
> I have a straight forward question, but I am just trying to analyze the specifics.
>
> So I have a set of queries depending on each other in a sequence to compute some results for generating financial
report.
>
> It involves summing up some amounts from tuns or of rows and also on certain conditions it categorizes the amounts
intotypes (aka Debit Balance, Credit balance etc). 
>
> There are at least 6 queries in this sequence and apart from 4 input parameters. these queries never change.
>
> So will I get any performance benefit by having them in a stored procedure rather than sending the queries from my
Pythonbased API? 

Almost certainly.

Doing it all in a stored procedure or likely even better a single query will remove all of the latency involved in
goingback and forth between your app and the database. 

Insofar as the queries you are running separately access similar data, a single query will be able to do that work
once.

There are other potential benefits (a smaller number of queries reduces planning time, for example).


On 7/10/24 06:44, Guyren Howe wrote:
> On Jul 9, 2024, at 17:58, Krishnakant Mane <kkproghub@gmail.com> wrote:
>> Hello.
>>
>> I have a straight forward question, but I am just trying to analyze the specifics.
>>
>> So I have a set of queries depending on each other in a sequence to compute some results for generating financial
report.
>>
>> It involves summing up some amounts from tuns or of rows and also on certain conditions it categorizes the amounts
intotypes (aka Debit Balance, Credit balance etc).
 
>>
>> There are at least 6 queries in this sequence and apart from 4 input parameters. these queries never change.
>>
>> So will I get any performance benefit by having them in a stored procedure rather than sending the queries from my
Pythonbased API?
 
> Almost certainly.
>
> Doing it all in a stored procedure or likely even better a single query will remove all of the latency involved in
goingback and forth between your app and the database.
 
>
> Insofar as the queries you are running separately access similar data, a single query will be able to do that work
once.
>
> There are other potential benefits (a smaller number of queries reduces planning time, for example).


Basically there are if else conditions and it's not just the queries but 
the conditional sequence in which they execute.

So one single query won't do the job.

But Thank you for confirming my understanding.

I believe that the fact that stored procedures are compiled also makes 
them perform faster, is that correct?

Regards.




On 7/10/24 06:44, Guyren Howe wrote:
>> On Jul 9, 2024, at 17:58, Krishnakant Mane <kkproghub@gmail.com> wrote:
>>> Hello.
>>>
>>> I have a straight forward question, but I am just trying to analyze the specifics.
>>>
>>> So I have a set of queries depending on each other in a sequence to compute some results for generating financial
report.
>>>
>>> It involves summing up some amounts from tuns or of rows and also on certain conditions it categorizes the amounts
intotypes (aka Debit Balance, Credit balance etc). 
>>>
>>> There are at least 6 queries in this sequence and apart from 4 input parameters. these queries never change.
>>>
>>> So will I get any performance benefit by having them in a stored procedure rather than sending the queries from my
Pythonbased API? 
>> Almost certainly.
>>
>> Doing it all in a stored procedure or likely even better a single query will remove all of the latency involved in
goingback and forth between your app and the database. 
>>
>> Insofar as the queries you are running separately access similar data, a single query will be able to do that work
once.
>>
>> There are other potential benefits (a smaller number of queries reduces planning time, for example).
>
>
> Basically there are if else conditions and it's not just the queries but the conditional sequence in which they
execute.
>
> So one single query won't do the job.

You might be surprised what you can do in one query. Feel free to share.





> On Jul 9, 2024, at 7:21 PM, Krishnakant Mane <kkproghub@gmail.com> wrote:
>
> 
>> On 7/10/24 06:44, Guyren Howe wrote:
>>> On Jul 9, 2024, at 17:58, Krishnakant Mane <kkproghub@gmail.com> wrote:
>>> Hello.
>>>
>>> I have a straight forward question, but I am just trying to analyze the specifics.
>>>
>>> So I have a set of queries depending on each other in a sequence to compute some results for generating financial
report.
>>>
>>> It involves summing up some amounts from tuns or of rows and also on certain conditions it categorizes the amounts
intotypes (aka Debit Balance, Credit balance etc). 
>>>
>>> There are at least 6 queries in this sequence and apart from 4 input parameters. these queries never change.
>>>
>>> So will I get any performance benefit by having them in a stored procedure rather than sending the queries from my
Pythonbased API? 
>> Almost certainly.
>>
>> Doing it all in a stored procedure or likely even better a single query will remove all of the latency involved in
goingback and forth between your app and the database. 
>>
>> Insofar as the queries you are running separately access similar data, a single query will be able to do that work
once.
>>
>> There are other potential benefits (a smaller number of queries reduces planning time, for example).
>
>
> Basically there are if else conditions and it's not just the queries but the conditional sequence in which they
execute.
>
> So one single query won't do the job.
>
Are you processing the results of each of the queries in your python code before sending the next query?  If so, i
don'tthink you will see much improvement per query  


> But Thank you for confirming my understanding.
>
> I believe that the fact that stored procedures are compiled also makes them perform faster, is that correct?
>
If the SP is fired in a loop or very frequently ( not monthly), yes
> Regards.
>
>
>



On 2024-07-10 06:28:46 +0530, Krishnakant Mane wrote:
> I have a straight forward question, but I am just trying to analyze the
> specifics.
>
> So I have a set of queries depending on each other in a sequence to compute
> some results for generating financial report.

I am assuming that you aren't creating hundreds of financial reports per
second. So you care about performance because each report takes
significant time (seconds, maybe even minutes). Right?


> It involves summing up some amounts from tuns or of rows and also on certain
> conditions it categorizes the amounts into types (aka Debit Balance, Credit
> balance etc).
>
> There are at least 6 queries in this sequence and apart from 4 input
> parameters. these queries never change.
>
> So will I get any performance benefit by having them in a stored procedure
> rather than sending the queries from my Python based API?

For just 6 queries I doubt that. You will save one round trip per query,
but that should only be a few milliseconds unless your database is on
the other side of the planet.

You might also get some performance improvement if your queries are
returning a significant amount of data which is only needed for
constructing further queries but doesn't enter the final report. In this
case keeping it in the database might be quite a bit faster than
transferring it back and forth between the database and the client.
OTOH, temporary tables or CTEs might be sufficient for that.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment
Hi Krishnkant,

On Wed, Jul 10, 2024 at 2:58 AM Krishnakant Mane <kkproghub@gmail.com> wrote:
Hello.

I have a straight forward question, but I am just trying to analyze the
specifics.

So I have a set of queries depending on each other in a sequence to
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on
certain conditions it categorizes the amounts into types (aka Debit
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input
parameters. these queries never change.

So will I get any performance benefit by having them in a stored
procedure rather than sending the queries from my Python based API?

Regards.

Functions and procedures have a significant potential to improve performance but there are a few things to watch. Return of experience after having written a few hundreds for a project. The list below is by no mean not exhaustive.

1) Warning: Compatibility with other database engines
If your model needs to run on other technologies (Oracle, MySQL, MS-SQL, etc.), the increase in maintenance efforts may become prohibitive

2) Warning: deadlocks
Make sure to design the procedures in such a way that you don't mutually depend on other parallel invocations

3) WARNING: Don't overload the server
Try to remain conservative in terms of computation in your server. Try to stick to selects, updates, deletes, joins, and simple arithmetics and strings manipulations. You can do a lot more but it may quickly affect the overall performance.

The reasons why I would recommend to use them:

A) Reduction of round trips
Even though it may be a few ms at a time, it can add up and become significant. This is where you gain performance.

B) Implicit transactions
A function will not free locks until it returns. This means that if your queries depend on cells, or modify cells, the behavior will be coherent, reducing the risk of race conditions. If other invocations depend on the same data, the locks will take care of sequencing execution to maintain integrity. In other words, you can safely invoke functions in parallel and let the engine do the scheduling when necessary.

C) API evolution
As long as you have only 1 or 2 applications running against the database, it may not be an issue. If you have more and your model needs to evolve, you may get to a situation where updating them all at the same time can become a challenge, especially if you depend on external providers. By using procedures and functions, you can abstract the model and maintain a standard interface to the application.

Note: * I DON'T RECOMMEND IT * but in some cases it can be handy to have the same function name with different sets of parameters (for instance to present a wrapper with default parameters, and other instances with a finer, more complete control). It can happen if you don't clean up timely older versions of the API when you upgrade your model - and it can become a nightmare.

Last recommendation: activate the logs and review regularly the performance of your functions. You may identify occurrences that run very fast and others not so. It can help you identify potential conflicts or model optimizations.

Hope it helps
--
Olivier Gautherot

 
On Tue, Jul 9, 2024 at 8:58 PM Krishnakant Mane <kkproghub@gmail.com> wrote:
Hello.

I have a straight forward question, but I am just trying to analyze the
specifics.

So I have a set of queries depending on each other in a sequence to
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on
certain conditions it categorizes the amounts into types (aka Debit
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input
parameters. these queries never change.

So will I get any performance benefit by having them in a stored
procedure rather than sending the queries from my Python based API?

One problem is that the query planner reverts to a generic query plan if you execute the same query over and over in a loop in the SP.

That bit us once.  A big SP that had been running "normally" for months suddenly went from about 20 minutes to six hours.  The solution (given by someone on this list a couple of years ago) was to add "set plan_cache_mode = force_custom_plan;" above the call.

That way, the query plan was updated every time.  Performance dropped to about 8 minutes IIRC.

Re: can stored procedures with computational sql queries improve API performance?

From
Juan Rodrigo Alejandro Burgos Mella
Date:

Great tip!!! Thx

El mié, 10 de jul de 2024, 16:17, Ron Johnson <ronljohnsonjr@gmail.com> escribió:
On Tue, Jul 9, 2024 at 8:58 PM Krishnakant Mane <kkproghub@gmail.com> wrote:
Hello.

I have a straight forward question, but I am just trying to analyze the
specifics.

So I have a set of queries depending on each other in a sequence to
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on
certain conditions it categorizes the amounts into types (aka Debit
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input
parameters. these queries never change.

So will I get any performance benefit by having them in a stored
procedure rather than sending the queries from my Python based API?

One problem is that the query planner reverts to a generic query plan if you execute the same query over and over in a loop in the SP.

That bit us once.  A big SP that had been running "normally" for months suddenly went from about 20 minutes to six hours.  The solution (given by someone on this list a couple of years ago) was to add "set plan_cache_mode = force_custom_plan;" above the call.

That way, the query plan was updated every time.  Performance dropped to about 8 minutes IIRC.