Re: PERSISTANT PREPARE (another point of view) - Mailing list pgsql-sql
From:           Milan Oparnica
Subject:        Re: PERSISTANT PREPARE (another point of view)
Msg-id:         g6cilk$2chv$1@news.hub.org
In response to: Re: PERSISTANT PREPARE (another point of view) (Craig Ringer <craig@postnewspapers.com.au>)
Responses:      Re: PERSISTANT PREPARE (another point of view)
List:           pgsql-sql
Craig Ringer wrote:
> Milan Oparnica wrote:
>> I found this link from IBM DB2 developers showing why PERSISTENT
>> PREPARE is a good idea and how it could be implemented.
>
> [snip]
>
>> NONE OF POPULAR SQL DBMS (Oracle, MS SQL, MySQL, Postgre, INTERBASE,
>> FIREBIRD) HAVE THIS FEATURE.
>>
>> WHY ?
>
> I suspect that people tend to use SQL or PL/PgSQL stored procedures
> instead. I'm not 100% sure SQL functions cache their query plans, but I
> know PL/PgSQL does.
>
> Exactly what is gained by the use of persistent prepare over the use of
> a stored procedure?
>

It's easier to write and call prepared statements than SPs. When
writing, you just declare the parameters and the SQL body of the
statement. When calling from the application or report engine, all you
have to do is "EXEC <statement name> (<parameters>)".

In the case of SPs written as SETOF CustomDataType, calling is pretty
easy ("SELECT * FROM <proc> (<params>)"), but it's hard to maintain the
CustomDataType when table structures change.

In the case of SPs written using CURSORS, the calling convention is not
simple:

  BEGIN;
  SELECT <proc>('cursor_name', <input parameter>);
  FETCH ALL FROM cursor_name;
  END;

This usually requires more complicated ways of defining source
recordsets in report designers. I'm also not sure how well cursors use
the tables' indexes. I've run some tests on several million records,
with and without indexes, and found no significant difference in
performance. Plus, for now, CURSORS remain READ-ONLY in PG.
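To make the comparison concrete, here is a rough sketch of the two SP
styles (the table, type and function names - orders, order_summary,
orders_for_customer, orders_cur - are made up just for illustration):

  -- Style 1: a function returning SETOF a composite type.  The type
  -- has to be kept in sync with the underlying tables by hand.
  CREATE TYPE order_summary AS (order_id integer, order_date date, total numeric);

  CREATE OR REPLACE FUNCTION orders_for_customer(integer)
  RETURNS SETOF order_summary AS $$
      SELECT order_id, order_date, total
      FROM orders
      WHERE customer_id = $1;
  $$ LANGUAGE sql;

  SELECT * FROM orders_for_customer(42);   -- easy to call

  -- Style 2: a function returning a refcursor.  Callers must wrap the
  -- call in a transaction and FETCH from the named cursor.
  CREATE OR REPLACE FUNCTION orders_cur(c refcursor, p_customer integer)
  RETURNS refcursor AS $$
  BEGIN
      OPEN c FOR
          SELECT order_id, order_date, total
          FROM orders
          WHERE customer_id = p_customer;
      RETURN c;
  END;
  $$ LANGUAGE plpgsql;

  BEGIN;
  SELECT orders_cur('my_cursor', 42);
  FETCH ALL FROM my_cursor;
  COMMIT;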
There is one thing more. Prepared statements could be used as a
table-type (or view-type) datasource. That means we could run prepared
statements over other prepared statements accepting the same
parameters. This makes it possible to create complex "low-level" BASE
queries and dozens of reporting queries that rely on them. Something
like subquery structures:

  SELECT A.Fld1, B.Fld2
  FROM (SELECT Fld1 FROM Table1 WHERE xxx) A
  INNER JOIN Table2 B ON A.Fld1 = B.Fld2
  WHERE yyy

> What would the interface to the feature be through database access
> drivers like JDBC? Explicit PREPARE GLOBAL or similar, then invocation
> with EXECUTE ?
>

They could remain the same. If possible, when calling EXECUTE <xxx>, PG
could try the local then the global (or the reverse) PREPARED STATEMENT
pool, and run the statement that was found first. This is greatly
simplified; it would require much more detailed planning if this option
is to be built into PG.

> How would users using increasingly common layers like Hibernate/JPA use it?
>

I don't know, I never used Hibernate/JPA. Somebody else could speak to
that. Making a PERSISTENT PREPARE statement available in PG doesn't
mean that everybody must use it.

> I'm also curious about how you'd address the possible need for periodic
> re-planning as the data changes, though AFAIK SQL functions suffer from
> the same limitation there.
>

Perhaps there could be several ways. It could be implemented within the
VACUUM procedure, which could re-plan all global prepared statements
according to the new circumstances. There should also be a manual
re-planning command for a specific statement, maybe even with the
option of passing parameters to it so the optimization could be even
more efficient.

> I guess I personally just don't understand what the point of the
> persistent prepare feature you describe is. However, this post that you
> linked to:
>
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00867.php
>
> actually describes a query plan cache, rather than persistent prepare.
> The post assumes the app will explicitly manage the cache, which I'm not
> sure is a good idea, but I can see the point of a plan cache. There
> might be some heuristics Pg could use to decide what to cache and to
> evict (planner time cost vs memory use, frequency of use, etc) so the
> app doesn't have to know or care about the plan cache. However, I'm not
> too sure how you'd match an incoming query to a cached plan, and
> determine that the plan was still valid, with enough speed to really
> benefit from the plan cache. Then again, I don't know much about Pg's
> innards, so that doesn't mean much.
>
> Tom Lane responded to that post to point out some of the complexities:
>
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00868.php
>
> --
> Craig Ringer
>

I think persistent prepare has something to do with globally caching
plans. Preparing a statement actually prepares a plan for its
execution, giving us the possibility to execute it. With a persistent
prepare statement, these plans would somehow have to go from local to
global. I agree they shouldn't be managed by the app itself. The app
would only issue requests for the creation, modification, execution and
removal of such an object.

This post is about a point of view. I know we can use stored procedures
and views and other good features of PG to manipulate all the data we
need. What I'm aiming at is: it can be done in a simpler and more
efficient way with a PERSISTENT PREPARE implementation. Let me point
out the key benefits:

- A huge number of database developers could use this technique without
getting involved with CURSORS and CUSTOM DATA TYPE SETS. This means a
lot of hard-coded SQL commands built into the apps themselves could
easily be transferred to the database layer. It also means thousands of
Microsoft Jet based applications could be migrated to PG. Why? Because
databases are getting larger as projects advance and companies grow,
and PG could offer an easier way to upsize than their own Microsoft SQL
engine does. It also means a lot of new and less experienced database
programmers would turn their focus to PG. It's much easier to
understand PREPARED statements than SP constructions returning rowsets.

- EXEC(UTE) is a well-known SQL command understood and implemented by
almost every database engine in much the same way. It's also natively
supported by report designing tools and by ADO and ODBC drivers. It
means that the front-end app doesn't even have to notice whether the
statement is locally or globally prepared. This means that some modules
of the system could create global statements and other modules would
simply use them, assuming that they are in the DB as part of the
"setup" process.

Lots of functionality is implemented in PG, and not all of it is used
by developers. My opinion is that an implementation of PERSISTENT or
GLOBAL PREPARED STATEMENT would be of much use.
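Just to put the idea side by side with what PG does today, here is a
rough sketch, reusing the made-up orders example from above. The first
part is ordinary PostgreSQL; the GLOBAL part is purely hypothetical
syntax, shown commented out, since nothing like it exists in PG:

  -- Today: session-local prepared statement (gone when the connection
  -- closes; every new session has to PREPARE it again).
  PREPARE orders_report (integer) AS
      SELECT order_id, order_date, total
      FROM orders
      WHERE customer_id = $1;

  EXECUTE orders_report (42);
  DEALLOCATE orders_report;

  -- Hypothetical persistent prepare as discussed in this thread
  -- (NOT valid PostgreSQL syntax): the statement definition would be
  -- stored in the database and any session could EXECUTE it by name.
  --
  -- PREPARE GLOBAL orders_report (integer) AS
  --     SELECT order_id, order_date, total
  --     FROM orders
  --     WHERE customer_id = $1;
  --
  -- EXECUTE orders_report (42);    -- from any connection, at any time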