Thread: Re: [GENERAL] Count(*) throws error
On Wed, 2007-07-11 at 17:42 -0400, Tom Lane wrote: > "Jasbinder Singh Bali" <jsbali@gmail.com> writes: > > I'm using the following statement in my plpgsql function > > > SELECT INTO no_rows COUNT(*) FROM tbl_concurrent; > > > I have decalred no_rows int4 and initialized it to zero > > > Running the function throws the following error: > > > ERROR: syntax error at or near "(" at character 13 > > QUERY: SELECT $1 (*) FROM tbl_concurrent > > I'll bet a nickel you have a local variable named "count" in that > function, and plpgsql is blindly trying to substitute its value into > the SQL query. The replacement of "COUNT" by " $1 " in the query > text is the tip-off. I came across a boat load of these the other day. Seems fairly naff that we substitute variables blindly. Seems like we could be slightly more friendly without too much bother: at least only substitute after the VALUES clause in INSERT. We really shouldn't substitute "var = var" to "$n = $n" either; am I right in thinking the latter would happen silently and cause potential error? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > Seems like we could be slightly more friendly without too much bother: > at least only substitute after the VALUES clause in INSERT. Surely you jest. regards, tom lane
"Simon Riggs" <simon@2ndquadrant.com> writes: > Seems like we could be slightly more friendly without too much bother: Actually, rather than get into that sort of AI-complete project, it strikes me that the most useful response is user education. There ought to be a section in the plpgsql documentation explaining exactly how the substitution works and showing a couple examples of the pitfalls. I'll try to draft something up. regards, tom lane
On Wed, 2007-07-11 at 18:13 -0400, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > Seems like we could be slightly more friendly without too much bother: > > at least only substitute after the VALUES clause in INSERT. > > Surely you jest. No. There are a places where parameters clearly aren't allowed, so making the substitutions in those places can easily be prevented. The remainder of the problem is as hard as you think, but getting half way there seems very easy. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > "Simon Riggs" <simon@2ndquadrant.com> writes: >> Seems like we could be slightly more friendly without too much bother: > > Actually, rather than get into that sort of AI-complete project, Is it really AI-complete? ISTM the *only* place where a parameter is allowed is where the parser inserts a ColumnRef node? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Wed, 2007-07-11 at 18:13 -0400, Tom Lane wrote: >> "Simon Riggs" <simon@2ndquadrant.com> writes: >>> Seems like we could be slightly more friendly without too much bother: >>> at least only substitute after the VALUES clause in INSERT. >> >> Surely you jest. > No. There are a places where parameters clearly aren't allowed, so > making the substitutions in those places can easily be prevented. The > remainder of the problem is as hard as you think, but getting half way > there seems very easy. It's not nearly as easy as you think. Even for the limited case of the column list for an INSERT, consider DECLARE i int; j int;...INSERT INTO mytable (arraycol[i]) VALUES (j); Considering that the current plpgsql parser doesn't have any knowledge *at all* of the syntactic structure of individual SQL commands, even teaching it to recognize an INSERT column list correctly would be a huge addition of code. The payback for that seems darn small considering that (1) that's only a small portion of the trouble cases and (2) it is very easy for users to recognize a failure in this context, whereas many of the other trouble cases aren't mechanically recognizable at all. Ultimately we have to do a better job of teaching users to recognize this issue for themselves; which is why I think it's basically a documentation problem. The other problem with trying to inject a small amount of smarts is that it complicates explaining the system. I think the docs have to explain *exactly* how the substitution works (which they fail to do right now, but I intend to make 'em do so today), and putting in any special cases will make that explanation longer and more confusing. regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> Actually, rather than get into that sort of AI-complete project, > Is it really AI-complete? ISTM the *only* place where a parameter is allowed > is where the parser inserts a ColumnRef node? The first problem is that we have to make the decision far upstream of that. The second is that whether we *can* insert a parameter symbol is not the same as whether we *should*. Even if we made that work, we'd only be fixing the cases that currently result in trivial-for-users-to-recognize syntactic errors. The deeper semantic errors arising from substituting a parameter where the user meant to refer to a similarly-named column can only be dealt with through user education. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > "Simon Riggs" <simon@2ndquadrant.com> writes: >> On Wed, 2007-07-11 at 18:13 -0400, Tom Lane wrote: >>> "Simon Riggs" <simon@2ndquadrant.com> writes: >>>> Seems like we could be slightly more friendly without too much bother: >>>> at least only substitute after the VALUES clause in INSERT. >>> >>> Surely you jest. > >> No. There are a places where parameters clearly aren't allowed, so >> making the substitutions in those places can easily be prevented. The >> remainder of the problem is as hard as you think, but getting half way >> there seems very easy. > > It's not nearly as easy as you think. Even for the limited case of > the column list for an INSERT, consider > > DECLARE i int; j int; > ... > INSERT INTO mytable (arraycol[i]) VALUES (j); huh, i hadn't actually seen that before -- it doesn't seem very useful. But I've certainly seen and used "UPDATE SET arraycol[i] = val". > Considering that the current plpgsql parser doesn't have any knowledge > *at all* of the syntactic structure of individual SQL commands, even > teaching it to recognize an INSERT column list correctly would be a huge > addition of code. It does eventually call the SQL parser though doesn't it? I was thinking the way to do it would be to delay substituting the formal parameters to later, after the parsing. So instead of substituting them as the tokens are lexed, instead suck in the tokens, run the parser -- which we currently do anyways just to check the syntax -- then walk the tree looking for ColumnRefs where the name matches a variable name. Then keep around that parse tree instead of just the series of lex tokens to later call analyze on and execute. > The other problem with trying to inject a small amount of smarts is that > it complicates explaining the system. Well that's quite true, but then that's the basic difficulty with any part of plpgsql. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > So instead of substituting them as the tokens are lexed, instead suck in the > tokens, run the parser -- which we currently do anyways just to check the > syntax -- then walk the tree looking for ColumnRefs where the name matches a > variable name. Then keep around that parse tree instead of just the series of > lex tokens to later call analyze on and execute. Hmm. That sounds cool but I think it actually has a pretty substantial disadvantage --- there is then no easily user-readable representation of the query that shows *which* occurrences got substituted. With the textual replacement method you have a string you can look at, though it's true that we don't always show it to the user if we are not aware there's a problem. Again, I'm trying to look at the big picture of both syntactic and semantic errors. If we solve only the syntactic end of it I think we'd actually be worse off, because then users would be even more lost when they hit a semantic error (unwanted substitution). regards, tom lane
Tom Lane wrote: > Again, I'm trying to look at the big picture of both syntactic and > semantic errors. If we solve only the syntactic end of it I think we'd > actually be worse off, because then users would be even more lost when > they hit a semantic error (unwanted substitution). The only real solution is to have some clear syntactic sugar denoting where variables are to be interpolated: SELECT * FROM foo WHERE (bar + 1) = bar -- Which is the var? SELECT * FROM fooWHERE (bar + 1) = $bar -- OK SELECT * FROM foo WHERE (bar + 1) = {bar} -- OK It's not clear to plpgsql because it's *not clear*. In any other namespace-conflict situation I can think of it's always inner-definition-is-visible. This would of course solve the problem, but only by preventing you from substituting in variablesthat conflict with columns. Unless you generate a warning at function compile-time that doesn't seem much better. -- Richard Huxton Archonet Ltd