Álvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2025-Apr-06, Tom Lane wrote:
>> If we can cite the SQL standard then it's an entirely defensible
>> restriction.
> We can. It says (in 5.2 <token> and <separator>)
> <regular identifier> ::= <identifier body>
> <identifier body> ::= <identifier start> [ <identifier part>... ]
> <identifier part> ::= <identifier start> | <identifier extend>
> <identifier start> ::= !! See the Syntax Rules.
> <identifier extend> ::= !! See the Syntax Rules.
Hmm, but that's about non-quoted identifiers, so of course their
character set is pretty restricted. What's of concern here is
what's allowed in double-quoted identifiers. AFAICS there's
not much restriction: it can be any <nondoublequote character>,
and SR 7 says
    7) A <nondoublequote character> is any character of the source
       language character set other than a <double quote>.
       NOTE 115 — “source language character set” is defined in
       Subclause 4.10.1, “Host languages”, in ISO/IEC 9075-1.
The referenced bit of 9075-1 is pretty vague too:
    No matter what binding style is chosen, SQL-statements are written
    in an implementation-defined character set, known as the source
    language character set. The source language character set is not
    required to be the same as the character set of any character
    string appearing in SQL-data.
So I'm not really seeing anything there that justifies forbidding any
characters. However, I still think that if we're going to forbid CR
or LF, we might as well go the whole way and forbid all the ASCII
control characters; none of them are any saner to use in identifiers
than those two. (I'd be for banning &nbsp; and similar as well, on
the same usability grounds as banning tabs, except that putting an
encoding dependency into this rule will not end well.)
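To illustrate what "going the whole way" could look like, here is a
minimal sketch in C. The function name is hypothetical and not an
existing PostgreSQL routine. It assumes the name is a NUL-terminated
string in an encoding where bytes below 0x80 always mean ASCII, and it
treats DEL (0x7F) as a control character alongside 0x01-0x1F. Bytes
with the high bit set are never rejected, so the check carries no
encoding dependency.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): return true if the
 * NUL-terminated name contains any ASCII control character,
 * i.e. a byte in 0x01-0x1F or the byte 0x7F (DEL).
 *
 * Bytes >= 0x80 are deliberately ignored, so multibyte characters
 * pass through without consulting the encoding.
 */
static bool
name_has_ascii_control_char(const char *name)
{
    const unsigned char *p;

    for (p = (const unsigned char *) name; *p != '\0'; p++)
    {
        if (*p < 0x20 || *p == 0x7F)
            return true;
    }
    return false;
}
```

A caller would reject a proposed name when this returns true. CR, LF
and tab are all covered by the single range test, which is the point of
not special-casing the first two.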
regards, tom lane