Re: ECPG gets embedded quotes wrong - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: ECPG gets embedded quotes wrong |
Date | |
Msg-id | 691295.1603240515@sss.pgh.pa.us |
In response to | ECPG gets embedded quotes wrong (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: ECPG gets embedded quotes wrong |
List | pgsql-hackers |
I wrote:
> It looks to me like a sufficient fix is just to keep these quote
> sequences as-is within a converted string, so that the attached
> appears to fix it.

Poking at this further, I noticed that there's a semi-related bug that this patch changes the behavior for, without fixing it exactly. That has to do with use of a string literal as "execstring" in ECPG's PREPARE ... FROM and EXECUTE IMMEDIATE commands. Right now, it appears that there is simply no way to write a double quote as part of the SQL command in this context. The EXECUTE IMMEDIATE docs say that such a literal is a "C string", so one would figure that \" (backslash-double quote) is the way, but that just produces syntax errors. The reason is that ECPG's lexer is in SQL mode at this point so it thinks the double-quoted string is a SQL quoted identifier, in which backslash isn't special so the double quote terminates the identifier. Ooops.

Knowing this, you might try writing two double quotes, but that doesn't work either, because the <xd>{xddouble} lexer rule converts that to one double quote, and you end up with an unterminated literal in the translated C code rather than in the ECPG input.

My patch above modifies this to the extent that two double quotes come out as two double quotes in the translated C code, but that just results in nothing at all, since the C compiler sees adjacent string literals, which the C standard commands it to concatenate. Then you probably get a mysterious syntax error from the backend because it thinks your intended-to-be SQL quoted identifier isn't quoted. However, this is the behavior a C programmer would expect for adjacent double quotes in a literal, so maybe people wouldn't see it as mysterious.

Anyway, what to do?

1. Nothing, except document that you can't put a double quote into the C string literal in these commands.

2. Make two-double-quotes work to produce a data double quote, which I think could be done fairly easily with some post-processing in the execstring production. However, this doesn't have much to recommend it other than being easily implementable. C programmers would not think it's natural, and the fact that backslash sequences other than \" would work as a C programmer expects doesn't help.

3. Find a way to lex the literal per C rules, as the EXECUTE IMMEDIATE docs clearly imply we should. (The PREPARE docs are silent on the point AFAICS.) Unfortunately, this seems darn near impossible unless we want to make IMMEDIATE (more) reserved. Since it's currently unreserved, the grammar can't tell which flavor of EXEC SQL EXECUTE ... it's dealing with until it looks ahead past the name-or-IMMEDIATE token, so that it must lex the literal (if any) too soon. I tried putting in a mid-rule action to switch the lexer back to C mode but failed because of that ambiguity. Maybe we could make it work with a bunch of refactoring, but it would be ugly and subtle code, in both the grammar and lexer.

On the whole I'm inclined to go with #1. There's a reason why nobody has complained about this in twenty years, which is that the syntaxes with a string literal are completely useless. There's no point in writing EXEC SQL EXECUTE IMMEDIATE "SQL-statement" when you can just write EXEC SQL SQL-statement, and similarly for PREPARE. (The other variant that takes the string from a C variable is useful, but that one doesn't have any weird quoting problem.) So I can't see expending the effort for #3, and I don't feel like adding and documenting the wart of #2.

Thoughts?

regards, tom lane
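
For readers who want to see the two quoting behaviors discussed above in isolation, here is a minimal C sketch. It is not ecpg code and not actual ecpg output: the SQL text, the variable adjacent_literals, and the helper double_quotes_to_escapes() are all made up for illustration. The first part shows the standard C rule that adjacent string literals are concatenated, which is why doubled double quotes vanish from the translated string. The second part is one hypothetical way the post-processing of option 2 might look, assuming it receives the literal body with the doubled quotes preserved and must emit a C literal body.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustration only: if the translated C code contains doubled double
 * quotes, the compiler sees three adjacent literals ("select ", "foo",
 * " from t") and concatenates them, so the quotes disappear.
 */
static const char *adjacent_literals = "select ""foo"" from t";

/*
 * Hypothetical post-processing in the spirit of option 2 (an assumption,
 * not the actual ecpg implementation): copy src to dst, rewriting each
 * pair of double quotes as backslash-double-quote so that the emitted C
 * literal carries one data double quote.
 *
 * Returns 0 on success, -1 if dst is too small.
 */
static int
double_quotes_to_escapes(const char *src, char *dst, size_t dstlen)
{
    size_t      j = 0;

    if (dstlen == 0)
        return -1;

    for (size_t i = 0; src[i] != '\0'; i++)
    {
        /* keep room for up to two output chars plus the terminating NUL */
        if (j + 2 >= dstlen)
            return -1;

        if (src[i] == '"' && src[i + 1] == '"')
        {
            dst[j++] = '\\';
            dst[j++] = '"';
            i++;                /* skip the second quote of the pair */
        }
        else
            dst[j++] = src[i];
    }
    dst[j] = '\0';
    return 0;
}
```

Comparing adjacent_literals against "select foo from t" with strcmp() shows the quotes are gone, while passing the text select ""foo"" from t through the helper yields select \"foo\" from t, which a C compiler would then read as a single literal containing the quoted identifier.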