Re: can stored procedures with computational sql queries improve API performance? - Mailing list pgsql-general

From Olivier Gautherot
Subject Re: can stored procedures with computational sql queries improve API performance?
Date
Msg-id CAJ7S9TVhTrrjX4y=tOPOJp1dqGJDYeX0H_GMP_fiVuRRd2e7ww@mail.gmail.com
Whole thread Raw
In response to can stored procedures with computational sql queries improve API performance?  (Krishnakant Mane <kkproghub@gmail.com>)
List pgsql-general
Hi Krishnkant,

On Wed, Jul 10, 2024 at 2:58 AM Krishnakant Mane <kkproghub@gmail.com> wrote:
Hello.

I have a straight forward question, but I am just trying to analyze the
specifics.

So I have a set of queries depending on each other in a sequence to
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on
certain conditions it categorizes the amounts into types (aka Debit
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input
parameters. these queries never change.

So will I get any performance benefit by having them in a stored
procedure rather than sending the queries from my Python based API?

Regards.

Functions and procedures have a significant potential to improve performance but there are a few things to watch. Return of experience after having written a few hundreds for a project. The list below is by no mean not exhaustive.

1) Warning: Compatibility with other database engines
If your model needs to run on other technologies (Oracle, MySQL, MS-SQL, etc.), the increase in maintenance efforts may become prohibitive

2) Warning: deadlocks
Make sure to design the procedures in such a way that you don't mutually depend on other parallel invocations

3) WARNING: Don't overload the server
Try to remain conservative in terms of computation in your server. Try to stick to selects, updates, deletes, joins, and simple arithmetics and strings manipulations. You can do a lot more but it may quickly affect the overall performance.

The reasons why I would recommend to use them:

A) Reduction of round trips
Even though it may be a few ms at a time, it can add up and become significant. This is where you gain performance.

B) Implicit transactions
A function will not free locks until it returns. This means that if your queries depend on cells, or modify cells, the behavior will be coherent, reducing the risk of race conditions. If other invocations depend on the same data, the locks will take care of sequencing execution to maintain integrity. In other words, you can safely invoke functions in parallel and let the engine do the scheduling when necessary.

C) API evolution
As long as you have only 1 or 2 applications running against the database, it may not be an issue. If you have more and your model needs to evolve, you may get to a situation where updating them all at the same time can become a challenge, especially if you depend on external providers. By using procedures and functions, you can abstract the model and maintain a standard interface to the application.

Note: * I DON'T RECOMMEND IT * but in some cases it can be handy to have the same function name with different sets of parameters (for instance to present a wrapper with default parameters, and other instances with a finer, more complete control). It can happen if you don't clean up timely older versions of the API when you upgrade your model - and it can become a nightmare.

Last recommendation: activate the logs and review regularly the performance of your functions. You may identify occurrences that run very fast and others not so. It can help you identify potential conflicts or model optimizations.

Hope it helps
--
Olivier Gautherot

 

pgsql-general by date:

Previous
From: Rich Shepard
Date:
Subject: Re: Finding error in long input file
Next
From: sud
Date:
Subject: Dropping column from big table