Thread: Re: [PATCHES] prepareable statements
nconway@klamath.dyndns.org (Neil Conway) writes: > The attached patch implements per-backend prepareable statements. Finally some feedback: > The syntax is: > PREPARE name_of_stmt(param_types) FROM <some query>; > EXECUTE name_of_stmt [INTO relation] [USING args]; > DEALLOCATE [PREPARE] name_of_stmt; > I don't really like the 'FROM' keyword in PREPARE (I was planning to > use 'AS'), but that's what SQL92 specifies. Actually not. SQL92 defines this command as <prepare statement> ::= PREPARE <SQL statement name> FROM <SQL statement variable> <SQL statement variable> ::= <simple value specification> where <simple value specification> ::= <parameter name> | <embedded variable name> (the normal <literal> case for <simple value specification> is disallowed). So what they are really truly defining here is an embedded-SQL operation in which the statement-to-prepare comes from some kind of string variable in the client program. (SQL99 makes this even clearer by moving PREPARE into Part 5, Host Language Bindings.) AFAICT, the syntax we are setting up with actual SQL following the PREPARE keyword is *not* valid SQL92 nor SQL99. It would be a good idea to look and see whether any other DBMSes implement syntax that is directly comparable to the feature we want. (Oracle manuals handy, anyone?) Assuming we do not find any comparable syntax to steal, my inclination would be to go back to your original syntax and use "AS" as the delimiter. That way we're not creating problems for ourselves if we ever want to implement the truly spec-compliant syntax (in ecpg, say). > The syntax is largely SQL92 compliant, but not totally. I'm not sure how > the SQL spec expects parameters to be set up in PREPARE, but I doubt > it's the same way I used. I can't see any hint of specifying parameter types in SQL's PREPARE at all. So we're on our own there, unless we can take some guidance from other systems. > And the SQL92 spec for EXECUTE is functionally > similar, but uses a different syntax (EXECUTE ... USING INTO <rel>, I > think). It's not really similar at all. Again, the assumed context is an embedded SQL program, and the real targets of INTO are supposed to be host-program variable names. (plpgsql's use of SELECT INTO is a lot more similar to the spec than our main grammar's use of it.) While I won't strongly object to implementing EXECUTE INTO as you've shown it, I think a good case could be made for leaving it out, on the grounds that our form of SELECT INTO is a mistake and a compatibility problem, and we shouldn't propagate it further. Any opinions out there? In general, this is only vaguely similar to what SQL92 contemplates, and you're probably better off not getting too close to their syntax... Moving on to coding issues of varying significance: > The patch stores queries in a hash table in TopMemoryContext. Fine with me. No reason to change to a linked list. (But see note below.) > Also, I'm not entirely sure my approach to memory management is > correct. Each entry in the hash table stores its data in its > own MemoryContext, which is deleted when the statement is > DEALLOCATE'd. When actually running the prepared statement > through the executor, CurrentMemoryContext is used. Let me know > if there's a better way to do this. I think it's all right. On entry to ExecuteQuery, current context should be TransactionCommandContext, which is a perfectly fine place for constructing the querytree-to-execute. You do need to copy the querytree as you're doing because of our lamentable tendency to scribble on querytrees in the executor. * In PrepareQuery: plan_list must be same len as query list (indeed you have an Assert for that later); this code will blow it if a UTILITY_CMD is produced by the rewriter. (Can happen: consider a NOTIFY produced by a rule.) Insert a NULL into the plan list to keep the lists in step. * In StoreQuery, the MemoryContextSwitchTo(TopMemoryContext) should be unnecessary. The hashtable code stuffs its stuff into its own context. You aren't actually storing anything into TopMemoryContext, only into children thereof. * DeallocateQuery is not prepared for uninitialized hashtable. * RunQuery should NOT do BeginCommand; that was done by postgres.c. * Sending output only for last query is wrong; this makes incorrect assumptions about what the rewriter will produce. AFAIK there is no good reason you should not execute all queries with the passed-in dest; that's what postgres.c does. * Is it really appropriate to be doing Show_executor_stats stuff here? I think only postgres.c should do that. * This is certainly not legal C: + if (Show_executor_stats) + ResetUsage(); + + QueryDesc *qdesc = CreateQueryDesc(query, plan, dest, NULL); + EState *state = CreateExecutorState(); You must be using a C++ compiler. * The couple of pfrees at the bottom of ExecuteQuery are kinda silly considering how much else got allocated and not freed there. * transformPrepareStmt is not doing the right thing with extras_before and extras_after. Since you only allow an OptimizableStmt in the syntax, probably these will always remain NIL, but I'd suggest throwing in a test and elog. * What if the stored query is replaced between the time that transformExecuteStmt runs and the time the EXECUTE stmt is actually executed? All your careful checking of the parameters could be totally wrong --- and ExecuteQuery contains absolutely no defenses against a mismatch. One answer is to store the expected parameter typelist (array) in the ExecuteStmt node during transformExecuteStmt, and then verify that this matches after you look up the statement in ExecuteQuery. * transformExecuteStmt must disallow subselects and aggregate functions in the parameter expressions, since you aren't prepared to generate query plans for them. Compare the processing of default or check-constraint expressions. BTW, you might as well do the fix_opids call at transform time not runtime, too. * In gram.y: put the added keywords in the appropriate keyword-list production (hopefully the unreserved one). * Syntax for prepare_type_list is not good; it allows ( , int ) Probably best to push the () case into prepare_type_clause. * typeidToString is bogus. Use format_type_be instead. * Why does QueryData contain a context field? * prepare.h should contain a standard header comment. * You missed copyfuncs/equalfuncs support for the three added node types. regards, tom lane
On Sat, Jul 20, 2002 at 10:00:01PM -0400, Tom Lane wrote: > AFAICT, the syntax we are setting up with actual SQL following the > PREPARE keyword is *not* valid SQL92 nor SQL99. It would be a good > idea to look and see whether any other DBMSes implement syntax that > is directly comparable to the feature we want. (Oracle manuals handy, > anyone?) I couldn't find anything on the subject in the Oracle docs -- they have PREPARE for use in embedded SQL, but I couldn't see a reference to PREPARE for usage in regular SQL. Does anyone else know of an Oracle equivalent? > Assuming we do not find any comparable syntax to steal, my inclination > would be to go back to your original syntax and use "AS" as the > delimiter. That way we're not creating problems for ourselves if we > ever want to implement the truly spec-compliant syntax (in ecpg, say). Ok, sounds good to me. > * This is certainly not legal C: > > + if (Show_executor_stats) > + ResetUsage(); > + > + QueryDesc *qdesc = CreateQueryDesc(query, plan, dest, NULL); > + EState *state = CreateExecutorState(); > > You must be using a C++ compiler. Well, it's legal C99 I believe. I'm using gcc 3.1 with the default CFLAGS, not a C++ compiler -- I guess it's a GNU extension... In any case, I've fixed this. > * What if the stored query is replaced between the time that > transformExecuteStmt runs and the time the EXECUTE stmt is actually > executed? Good point ... perhaps the easiest solution would be to remove DEALLOCATE. Since the backend's prepared statements are flushed when the backend dies, there is little need for deleting prepared statements earlier than that. Users who need to prevent name clashes for plan names can easily achieve that without using DEALLOCATE. Regarding the syntax for EXECUTE, it occurs to me that it could be made to be more similar to the PREPARE syntax -- i.e. PREPARE foo(text, int) AS ...; EXECUTE foo('a', 1); (rather than EXECUTE USING -- the effect being that prepared statements now look more like function calls on a syntactical level, which I think is okay.) Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
Neil Conway wrote: >On Sat, Jul 20, 2002 at 10:00:01PM -0400, Tom Lane wrote: > > >>AFAICT, the syntax we are setting up with actual SQL following the >>PREPARE keyword is *not* valid SQL92 nor SQL99. It would be a good >>idea to look and see whether any other DBMSes implement syntax that >>is directly comparable to the feature we want. (Oracle manuals handy, >>anyone?) >> >> > >I couldn't find anything on the subject in the Oracle docs -- they have >PREPARE for use in embedded SQL, but I couldn't see a reference to >PREPARE for usage in regular SQL. Does anyone else know of an Oracle >equivalent? > > Oracle doesn't have this functionality exposed at the SQL level. In Oracle the implementation is at the protocol level (i.e. sqlnet). Therefore the SQL syntax is the same when using preparedstatements or when not using them. The client implementation of the sqlnet protocol decides to use prepared statements or not. As of Oracle 8, I think pretty much all of the Oracle clients use prepared statements for all the sql statements. The sqlnet protocol exposes 'open', 'prepare', 'describe', 'bind', 'fetch' and 'close'. None of theseare exposed out into the SQL syntax. thanks, --Barry
nconway@klamath.dyndns.org (Neil Conway) writes: > Regarding the syntax for EXECUTE, it occurs to me that it could be made > to be more similar to the PREPARE syntax -- i.e. > PREPARE foo(text, int) AS ...; > EXECUTE foo('a', 1); > (rather than EXECUTE USING -- the effect being that prepared statements > now look more like function calls on a syntactical level, which I think > is okay.) Hmm, maybe *too* much like a function call. Is there any risk of a conflict with syntax that we might want to use to invoke stored procedures? If not, this is fine with me. regards, tom lane
On Tue, 2002-07-23 at 11:34, Tom Lane wrote: > nconway@klamath.dyndns.org (Neil Conway) writes: > > Regarding the syntax for EXECUTE, it occurs to me that it could be made > > to be more similar to the PREPARE syntax -- i.e. > > > PREPARE foo(text, int) AS ...; > > > EXECUTE foo('a', 1); > > > (rather than EXECUTE USING -- the effect being that prepared statements > > now look more like function calls on a syntactical level, which I think > > is okay.) > > Hmm, maybe *too* much like a function call. Is there any risk of a > conflict with syntax that we might want to use to invoke stored > procedures? If not, this is fine with me. Stored procedures would use PERFORM would they not? I like the function syntax. It looks and acts like a temporary 'sql' function.
Rod Taylor wrote: > > On Tue, 2002-07-23 at 11:34, Tom Lane wrote: > > nconway@klamath.dyndns.org (Neil Conway) writes: > > > Regarding the syntax for EXECUTE, it occurs to me that it could be made > > > to be more similar to the PREPARE syntax -- i.e. > > > > > PREPARE foo(text, int) AS ...; > > > > > EXECUTE foo('a', 1); > > > > > (rather than EXECUTE USING -- the effect being that prepared statements > > > now look more like function calls on a syntactical level, which I think > > > is okay.) > > > > Hmm, maybe *too* much like a function call. Is there any risk of a > > conflict with syntax that we might want to use to invoke stored > > procedures? If not, this is fine with me. > > Stored procedures would use PERFORM would they not? > > I like the function syntax. It looks and acts like a temporary 'sql' > function. FWIW, Oracle uses EXECUTE to execute stored procedures. It is not apart of the SQL language, but a SQL*Plus command: EXECUTE my_procedure(); The Oracle call interface defines a function to call stored procedures: OCIStmtExecute(); Likewise, the privilege necessary to execute a stored procedure is 'EXECUTE' as in: GRANT EXECUTE ON my_procedure TO mascarm; Again, FWIW. Mike Mascari mascarm@mascari.com
Mike Mascari wrote: > FWIW, Oracle uses EXECUTE to execute stored procedures. It is not apart > of the SQL language, but a SQL*Plus command: > > EXECUTE my_procedure(); > Also with Transact SQL (i.e. MSSQL and Sybase) Syntax Execute a stored procedure: [[EXEC[UTE]]{ [@return_status =] {procedure_name [;number] | @procedure_name_var}[[@parameter =] {value | @variable[OUTPUT] | [DEFAULT]] [,...n] [WITH RECOMPILE] However, as Peter E. has pointed out, SQL99 uses the keyword CALL: 15.1 <call statement> Function Invoke an SQL-invoked routine. Format <call statement> ::= CALL <routine invocation> FWIW, Joe
To expand on the Oracle implementation, the EXECUTE command in SQL*Plus results in an anonymous pl/sql block (as opposed to a named procedure). being sent over the wire such as the following: begin my_procedure(); end; As mentioned in the previous post, the EXECUTE command is only a SQL*Plus keyword (well, Server Manager too but that was killed in 9i). Mike Mascari wrote: > Rod Taylor wrote: > >>On Tue, 2002-07-23 at 11:34, Tom Lane wrote: >> >>>nconway@klamath.dyndns.org (Neil Conway) writes: >>> >>>>Regarding the syntax for EXECUTE, it occurs to me that it could be made >>>>to be more similar to the PREPARE syntax -- i.e. >>> >>>>PREPARE foo(text, int) AS ...; >>> >>>>EXECUTE foo('a', 1); >>> >>>>(rather than EXECUTE USING -- the effect being that prepared statements >>>>now look more like function calls on a syntactical level, which I think >>>>is okay.) >>> >>>Hmm, maybe *too* much like a function call. Is there any risk of a >>>conflict with syntax that we might want to use to invoke stored >>>procedures? If not, this is fine with me. >> >>Stored procedures would use PERFORM would they not? >> >>I like the function syntax. It looks and acts like a temporary 'sql' >>function. > > > FWIW, Oracle uses EXECUTE to execute stored procedures. It is not apart > of the SQL language, but a SQL*Plus command: > > EXECUTE my_procedure(); > > The Oracle call interface defines a function to call stored procedures: > > OCIStmtExecute(); > > Likewise, the privilege necessary to execute a stored procedure is > 'EXECUTE' as in: > > GRANT EXECUTE ON my_procedure TO mascarm; > > Again, FWIW. > > Mike Mascari > mascarm@mascari.com > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
On Sat, Jul 20, 2002 at 10:00:01PM -0400, Tom Lane wrote: > * In gram.y: put the added keywords in the appropriate keyword-list > production (hopefully the unreserved one). I think the patch already does this, doesn't it? If not, what else needs to be modified? > * Syntax for prepare_type_list is not good; it allows > ( , int ) Erm, I don't see that it does. The syntax is: prep_type_list: Typename { $$ = makeList1($1); } | prep_type_list ',' Typename { $$ = lappend($1, $3); } ; (i.e. there's no ' /* EMPTY */ ' case) > * Why does QueryData contain a context field? Because the context in which the query data is stored needs to be remembered so that it can be deleted by DeallocateQuery(). If DEALLOCATE goes away, this should also be removed. I've attached a revised patch, which includes most of Tom's suggestions, with the exception of the three mentioned above. The syntax is now: PREPARE q1(int, float, text) AS ...; EXECUTE q1(5, 10.0, 'foo'); DEALLOCATE q1; I'll post an updated patch to -patches tomorrow that gets rid of DEALLOCATE. I also need to check if there is a need for executor_stats. Finally, should the syntax for EXECUTE INTO be: EXECUTE q1(...) INTO foo; or EXECUTE INTO foo q1(...); The current patch uses the former, which I personally prefer, but I'm not adamant about it. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
Attachment
I've two queries - 1. emrxdbs=# explain select * from patient A where exists (select NULL from patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and B.lastname=A.lastname and B.firstname=A.firstname group by B.mrn, B.dob, B.sex, B.lastname, B.firstname having A.patseq < max(B.patseq)) limit 10; NOTICE: QUERY PLAN: Limit (cost=0.00..121.50 rows=10 width=141) -> Seq Scan on patient a (cost=0.00..6955296.53 rows=572430 width=141) SubPlan -> Aggregate (cost=6.03..6.05 rows=1 width=42) -> Group (cost=6.03..6.05 rows=1 width=42) -> Sort (cost=6.03..6.03 rows=1 width=42) -> Index Scan using patient_name_idxon patient b (cost=0.00..6.02 rows=1 width=42) 2. emrxdbs=# explain select * from patient A where exists (select NULL from patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and B.lastname=A.lastname and B.firstname=A.firstname and B.mrn='3471585' group by B.mrn, B.dob, B.sex, B.lastname, B.firstname having A.patseq < max(B.patseq)) limit 10; NOTICE: QUERY PLAN: Limit (cost=0.00..121.45 rows=10 width=141) -> Seq Scan on patient a (cost=0.00..6951997.59 rows=572430 width=141) SubPlan -> Aggregate (cost=6.03..6.05 rows=1 width=42) -> Group (cost=6.03..6.04 rows=1 width=42) -> Sort (cost=6.03..6.03 rows=1 width=42) -> Index Scan using patient_mrnfac_idxon patient b (cost=0.00..6.02 rows=1 width=42) The first query results come back fairly quick, the 2nd one just sits there forever. It looks similar in the two query plans. Let me know. thanks. johnl
On Thu, 2002-07-25 at 15:55, John Liu wrote: > I've two queries - > > 1. emrxdbs=# explain select * from patient A where exists (select NULL from > patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and > B.lastname=A.lastname and B.firstname=A.firstname group by B.mrn, B.dob, > B.sex, B.lastname, B.firstname having A.patseq < max(B.patseq)) limit 10; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..121.50 rows=10 width=141) > -> Seq Scan on patient a (cost=0.00..6955296.53 rows=572430 width=141) > SubPlan > -> Aggregate (cost=6.03..6.05 rows=1 width=42) > -> Group (cost=6.03..6.05 rows=1 width=42) > -> Sort (cost=6.03..6.03 rows=1 width=42) > -> Index Scan using patient_name_idx on patient > b (cost=0.00..6.02 rows=1 width=42) > > 2. emrxdbs=# explain select * from patient A where exists (select NULL from > patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and > B.lastname=A.lastname and B.firstname=A.firstname and B.mrn='3471585' group > by B.mrn, B.dob, B.sex, B.lastname, B.firstname having A.patseq < > max(B.patseq)) limit 10; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..121.45 rows=10 width=141) > -> Seq Scan on patient a (cost=0.00..6951997.59 rows=572430 width=141) > SubPlan > -> Aggregate (cost=6.03..6.05 rows=1 width=42) > -> Group (cost=6.03..6.04 rows=1 width=42) > -> Sort (cost=6.03..6.03 rows=1 width=42) > -> Index Scan using patient_mrnfac_idx on > patient b (cost=0.00..6.02 rows=1 width=42) > > The first query results come back fairly quick, the 2nd one just sits there > forever. > It looks similar in the two query plans. It seems that using patient_mrnfac_idx instead of patient_name_idx is not a good choice in your case ;( try moving the B.mrn='3471585' from FROM to HAVING and hope that this makes the DB use the same plan as for the first query select * from patient A where exists ( select NULL from patient B where B.mrn=A.mrn and B.dob=A.dob andB.sex=A.sex and B.lastname=A.lastname and B.firstname=A.firstname group by B.mrn, B.dob, B.sex, B.lastname,B.firstname having A.patseq < max(B.patseq) and B.mrn='3471585') limit 10; ----------- Hannu
Neil Conway writes: > Regarding the syntax for EXECUTE, it occurs to me that it could be made > to be more similar to the PREPARE syntax -- i.e. > > PREPARE foo(text, int) AS ...; > > EXECUTE foo('a', 1); > > (rather than EXECUTE USING -- the effect being that prepared statements > now look more like function calls on a syntactical level, which I think > is okay.) I'm not sure I like that. It seems too confusing. Why not keep it as the standard says? (After all, it is the PREPARE part that we're adjusting, not EXECUTE.) -- Peter Eisentraut peter_e@gmx.net
On Thu, Jul 25, 2002 at 10:54:04PM +0200, Peter Eisentraut wrote: > I'm not sure I like that. It seems too confusing. Why not keep > it as the standard says? (After all, it is the PREPARE part that > we're adjusting, not EXECUTE.) I think it's both, isn't it? My understanding of Tom's post is that the features described by SQL92 are somewhat similar to the patch, but not directly related. On the other hand, if other people also find it confusing, that would be a good justification for changing it. Personally, I think it's pretty clear, but I'm not adamant about it. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
Neil Conway writes: > On Thu, Jul 25, 2002 at 10:54:04PM +0200, Peter Eisentraut wrote: > > I'm not sure I like that. It seems too confusing. Why not keep > > it as the standard says? (After all, it is the PREPARE part that > > we're adjusting, not EXECUTE.) > > I think it's both, isn't it? My understanding of Tom's post is that the > features described by SQL92 are somewhat similar to the patch, but not > directly related. What I was trying to say is this: There is one "prepared statement" facility in the standards that allows you to prepare a statement defined in a host variable, whereas you are proposing one that specifies the statement explicitly. However, both of these are variants of the same concept, so the EXECUTE command doesn't need to be different. -- Peter Eisentraut peter_e@gmx.net