Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE - Mailing list pgsql-general

From Bryn Llewellyn
Subject Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
Date
Msg-id 9AC088C9-CB59-4846-8715-0A9554F7FDE6@yugabyte.com
Whole thread Raw
In response to Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
List pgsql-general

So I can see an argument for the existing behavior. It doesn't seem worth changing in any case. And I don't really see the documentation being improved by covering this corner case in detail when the current behavior is at least intuitive.

quote_ident is a good bit more conservative than the core lexer about what is an "identifier character" --- it considers all non-ASCII characters as requiring quoting, too.

For typical uses of quote_ident, I think this is good future-proofing: it makes it very much less likely that something quote_ident decides not to quote would be rejected by some future PG version (not to mention non-PG SQL databases). So I'm not really in a hurry to change the code. Maybe we should tweak the docs a bit.

Is the use of *any* non-Latin character in the name of a database object a corner case? I appreciate that some outfits require Latin object names because they don't want to make it hard to have a global development staff. But not all outfits follow such a rule.

It's a pity that the doc reference that I started with and the second one that David mentioned:


don't x-reference each other. I had no reason to suspect that what I read in the dedicated account of quote_ident was something of a sketch—and that it (loosely speaking) has to be interpreted in the light of the second one. I appreciate that the PG doc aims for maximally terse, full-on DRY accounts. But neither of the two accounts gives a hint that quote_ident is less permissive than are PG's SQL and PL/pgSQL parsers themselves. This would have fixed it for me:

«
[quote_ident] Returns its input argument string suitably rendered to be used as an identifier in an SQL statement string. Rendering is not always needed and in some cases the return value is identical to the input value. The rules for when rendering is needed, and what form it takes, are explained in [x-ref]. Here are four examples:

  select quote_ident('eye'); → eye
  select quote_ident($$ab"cd'ef$$); → "ab""cd'ef"
  select quote_ident('øye'); → "øye"
  select quote_ident('我的桌子'); → "我的桌子"

The third and fourth examples show that quote_ident is much stricter than the rules explained in [x-ref] demand. They explain that for example, the SQL identifier for this role name:

  redaktør

is that bare name—with no rendering. Quote_ident's unnecessary strictness means that some tempting uses for the function are not viable. A test like this:

  if 'proposed_name' <> quote_ident('proposed_name') then ...

will give a false negative for large classes of proposed names.
»

About:

good future-proofing: it makes it very much less likely that something quote_ident decides not to quote would be rejected by some future PG version

Does this imply a risk that a future PG version will go against the SQL standard and reject any non-latin name that is free of all punctuation characters, when used in the role of a SQL identifier, unless it's double quoted?

After all, I'd guess that a great deal of SQL text and PL/pgSQL source text is typed by hand—and you don't usually see double quotes delineating every SQL identifier when this isn't needed. (And you almost never see double quotes delineating a SQL identifier where this rendering is necessary.)

pgsql-general by date:

Previous
From: Christophe Pettus
Date:
Subject: Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
Next
From: Bryn Llewellyn
Date:
Subject: Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE