C on Client versus C on Server - Mailing list pgsql-performance

From Eliot Gable
Subject C on Client versus C on Server
Date
Msg-id BANLkTi=2EpfSK94LK=2hWSVxBCjV3hfacQ@mail.gmail.com
Whole thread Raw
Responses Re: C on Client versus C on Server  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-performance
I have a PL/PGSQL stored procedure on my server which is about 3,500 lines long. To further improve performance after more than year of optimizing the PL/PGSQL, and also to hide the logic so it is not as easy to copy, I am starting to re-write the stored procedure in C. The initial design is such that a C++ client would query the PL/PGSQL stored procedure using one transaction and the result of the transaction would be several open cursors it could use to pull down the data from the server. This data would be re-written to XML and passed back to the original requestor for use. I am now also going to replace the XML conversion process and re-write the C++ logic as a module for the system that initiates the XML request. So, the layout is going from:

User Client ----> XML ----> C++ Server / PG Client -----> SQL -----> PL/PGSQL Stored Proc on PGSQL Server

To:

User Client / C Module / PG Client -----> SQL ----> C Stored Proc on PGSQL Server

This will improve overall efficiency considerably. However, I am now wonder about the follow....

The PL/PGSQL procedure has many large SELECT statements which pull various data from various tables. Then, between those SELECT statements, it runs logic to determine what other SELECT statements it should execute. Once it figures this all out and finishes running all the select statements, it joins some of those results together and returns 3 cursors which represent all the data the User Client needs in order to proceed with processing its workload. Every SELECT statement that the procedure ends up executing has its data stored in those three cursors, and all of it is required by the User Client to process its workload. Given that this data must be transferred across the link from the Postgres server to the client, would it be more efficient to keep all of the contents of the PL/PGSQL stored procedure, including the logic about which SELECT statements to run, on the Postgres server, or would it be just as fast to have that logic contained in the C module for the User Client and have that C module make multiple requests to the Postgres server?

The stored procedure also does write some records to three tables on the server which represent a summary of everything that it computed and returned to the User Client for billing purposes. My initial intuition on this is that keeping it all in the stored procedure on the server is going to be faster for two main reasons:

1) Each select statement from the User Client C Module would be a separate transaction which would drastically increase transaction overhead for the whole set of requests.
2) Writing the billing data at the end would mean that I not only have to pull all the data down to the User Client, I must also push the data back up to the server for writing the billing records.

So, am I looking at this the right way, or am I missing something?

Thanks in advance for any responses.

--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: table contraints checks only happen in planner phase
Next
From: Heikki Linnakangas
Date:
Subject: Re: C on Client versus C on Server