"David G. Johnston" <david.g.johnston@gmail.com> writes: > IOW, as long as the output string matches: ^"(?:"{2})*"$ I do not see how > it is possible for format to lay in a value at %I that is any more > insecure than the current behavior. If the input string already matches > that pattern then it could be output as-is without any additional risk and > with the positive benefit of making this case work as expected. The broken > case then exists when someone actually intends to name their identifier > <"something"> which then correctly becomes <"""something"""> on output.
But that's exactly the problem: you just broke a case that used to work. format('%I') is not supposed to guess at what the user intends; it is supposed to produce a string that, after being passed through identifier parsing (dequoting or downcasing), will match the input. It is not format's business to break that contract just because the input has already got some double quotes in it.
An example of where this might be important is if you're trying to construct a query with arbitrary column headers in the output. You can do format('... AS %I ...', ..., column_label, ...) and be confident that the label will be exactly what you've got in column_label. This proposed change would break that for labels that happen to already have double-quotes --- but who are we to say that that can't have been what you wanted?
I agree with Tom that we shouldn't key off of contents in the string to determine whether or not to quote. Introducing the behave I describe in an intuitive way would require some kind of type-specific handling in format(). I'm not sure what the cost of this is to the project, but David makes the very reasonable point that imposing the burden of choosing between `%s` and `%I` opens up the possibility of confusing vulnerabilities.